[PATCH libdrm] libdrm: Fix issue about differrent domainID but same BDF

2019-02-13 Thread Emily Deng
For multiple GPUs which has the same BDF, but has different domain ID,
the drmOpenByBusid will return the wrong fd when startx.

The reproduce sequence as below:
1. Call drmOpenByBusid to open Card0, then will return the right fd0, and the
fd0 is master privilege;
2. Call drmOpenByBusid to open Card1. In function drmOpenByBusid, it will
open Card0 first, this time, the fd1 for opening Card0 is not master
privilege, and will call drmSetInterfaceVersion to identify the
domain ID feature, as the fd1 is not master privilege, then 
drmSetInterfaceVersion
will fail, and then won't compare domain ID, then return the wrong fd for Card1.

Solution:
First loop search the best match fd about drm 1.4.

Signed-off-by: Emily Deng 
---
 xf86drm.c | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/xf86drm.c b/xf86drm.c
index 336d64d..b60e029 100644
--- a/xf86drm.c
+++ b/xf86drm.c
@@ -584,11 +584,34 @@ static int drmOpenByBusid(const char *busid, int type)
 if (base < 0)
 return -1;
 
+/* We need to try for 1.4 first for proper PCI domain support */
 drmMsg("drmOpenByBusid: Searching for BusID %s\n", busid);
 for (i = base; i < base + DRM_MAX_MINOR; i++) {
 fd = drmOpenMinor(i, 1, type);
 drmMsg("drmOpenByBusid: drmOpenMinor returns %d\n", fd);
 if (fd >= 0) {
+sv.drm_di_major = 1;
+sv.drm_di_minor = 4;
+sv.drm_dd_major = -1;/* Don't care */
+sv.drm_dd_minor = -1;/* Don't care */
+if (!drmSetInterfaceVersion(fd, )) {
+buf = drmGetBusid(fd);
+drmMsg("drmOpenByBusid: drmGetBusid reports %s\n", buf);
+if (buf && drmMatchBusID(buf, busid, 1)) {
+drmFreeBusid(buf);
+return fd;
+}
+if (buf)
+drmFreeBusid(buf);
+}
+close(fd);
+}
+}
+
+   for (i = base; i < base + DRM_MAX_MINOR; i++) {
+fd = drmOpenMinor(i, 1, type);
+drmMsg("drmOpenByBusid: drmOpenMinor returns %d\n", fd);
+if (fd >= 0) {
 /* We need to try for 1.4 first for proper PCI domain support
  * and if that fails, we know the kernel is busted
  */
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

RE: [PATCH] drm/amdgpu: Error handling issues about CHECKED_RETURN

2019-02-13 Thread Zhou, David(ChunMing)


> -Original Message-
> From: Bo YU 
> Sent: Thursday, February 14, 2019 12:46 PM
> To: Deucher, Alexander ; Koenig, Christian
> ; Zhou, David(ChunMing)
> ; airl...@linux.ie; dan...@ffwll.ch; Zhu, Rex
> ; Grodzovsky, Andrey
> ; dri-de...@lists.freedesktop.org; linux-
> ker...@vger.kernel.org
> Cc: Bo Yu ; amd-gfx@lists.freedesktop.org
> Subject: [PATCH] drm/amdgpu: Error handling issues about
> CHECKED_RETURN
> 
> From: Bo Yu 
> 
> Calling "amdgpu_ring_test_helper" without checking return value

We could need to continue to ring test even there is one ring test failed.

-David

> 
> Signed-off-by: Bo Yu 
> ---
>  drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
> b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
> index 57cb3a51bda7..48465a61516b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
> @@ -4728,7 +4728,9 @@ static int gfx_v8_0_cp_test_all_rings(struct
> amdgpu_device *adev)
> 
>   for (i = 0; i < adev->gfx.num_compute_rings; i++) {
>   ring = >gfx.compute_ring[i];
> - amdgpu_ring_test_helper(ring);
> + r = amdgpu_ring_test_helper(ring);
> + if (r)
> + return r;
>   }
> 
>   return 0;
> --
> 2.11.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [Intel-gfx] [PATCH 1/3] drm/i915: Move dsc rate params compute into drm

2019-02-13 Thread kbuild test robot via amd-gfx
Hi David,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[also build test WARNING on v5.0-rc4 next-20190213]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/David-Francis/Make-DRM-DSC-helpers-more-generally-usable/20190214-052541
reproduce: make htmldocs

All warnings (new ones prefixed by >>):

   net/mac80211/sta_info.h:590: warning: Function parameter or member 
'tx_stats.last_rate' not described in 'sta_info'
   net/mac80211/sta_info.h:590: warning: Function parameter or member 
'tx_stats.msdu' not described in 'sta_info'
   kernel/rcu/tree.c:711: warning: Excess function parameter 'irq' description 
in 'rcu_nmi_exit'
   include/linux/dma-buf.h:304: warning: Function parameter or member 
'cb_excl.cb' not described in 'dma_buf'
   include/linux/dma-buf.h:304: warning: Function parameter or member 
'cb_excl.poll' not described in 'dma_buf'
   include/linux/dma-buf.h:304: warning: Function parameter or member 
'cb_excl.active' not described in 'dma_buf'
   include/linux/dma-buf.h:304: warning: Function parameter or member 
'cb_shared.cb' not described in 'dma_buf'
   include/linux/dma-buf.h:304: warning: Function parameter or member 
'cb_shared.poll' not described in 'dma_buf'
   include/linux/dma-buf.h:304: warning: Function parameter or member 
'cb_shared.active' not described in 'dma_buf'
   include/linux/dma-fence-array.h:54: warning: Function parameter or member 
'work' not described in 'dma_fence_array'
   include/linux/firmware/intel/stratix10-svc-client.h:1: warning: no 
structured comments found
   include/linux/gpio/driver.h:371: warning: Function parameter or member 
'init_valid_mask' not described in 'gpio_chip'
   include/linux/iio/hw-consumer.h:1: warning: no structured comments found
   include/linux/input/sparse-keymap.h:46: warning: Function parameter or 
member 'sw' not described in 'key_entry'
   drivers/mtd/nand/raw/nand_base.c:420: warning: Function parameter or member 
'chip' not described in 'nand_fill_oob'
   drivers/mtd/nand/raw/nand_bbt.c:173: warning: Function parameter or member 
'this' not described in 'read_bbt'
   drivers/mtd/nand/raw/nand_bbt.c:173: warning: Excess function parameter 
'chip' description in 'read_bbt'
   include/linux/regulator/machine.h:199: warning: Function parameter or member 
'max_uV_step' not described in 'regulation_constraints'
   include/linux/regulator/driver.h:228: warning: Function parameter or member 
'resume' not described in 'regulator_ops'
   arch/s390/include/asm/cio.h:245: warning: Function parameter or member 
'esw.esw0' not described in 'irb'
   arch/s390/include/asm/cio.h:245: warning: Function parameter or member 
'esw.esw1' not described in 'irb'
   arch/s390/include/asm/cio.h:245: warning: Function parameter or member 
'esw.esw2' not described in 'irb'
   arch/s390/include/asm/cio.h:245: warning: Function parameter or member 
'esw.esw3' not described in 'irb'
   arch/s390/include/asm/cio.h:245: warning: Function parameter or member 
'esw.eadm' not described in 'irb'
   drivers/slimbus/stream.c:1: warning: no structured comments found
   include/linux/spi/spi.h:180: warning: Function parameter or member 
'driver_override' not described in 'spi_device'
   drivers/target/target_core_device.c:1: warning: no structured comments found
   drivers/usb/typec/bus.c:1: warning: no structured comments found
   drivers/usb/typec/class.c:1: warning: no structured comments found
   include/linux/w1.h:281: warning: Function parameter or member 
'of_match_table' not described in 'w1_family'
   fs/direct-io.c:257: warning: Excess function parameter 'offset' description 
in 'dio_complete'
   fs/file_table.c:1: warning: no structured comments found
   fs/libfs.c:477: warning: Excess function parameter 'available' description 
in 'simple_write_end'
   fs/posix_acl.c:646: warning: Function parameter or member 'inode' not 
described in 'posix_acl_update_mode'
   fs/posix_acl.c:646: warning: Function parameter or member 'mode_p' not 
described in 'posix_acl_update_mode'
   fs/posix_acl.c:646: warning: Function parameter or member 'acl' not 
described in 'posix_acl_update_mode'
   drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c:294: warning: Excess function 
parameter 'mm' description in 'amdgpu_mn_invalidate_range_start_hsa'
   drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c:294: warning: Excess function 
parameter 'start' description in 'amdgpu_mn_invalidate_range_start_hsa'
   drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c:294: warning: Excess function 
parameter 'end' description in 'amdgpu_mn_invalidate_range_start_hsa'
   drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c:343: warning: Excess function 
parameter 'mm' description in 'amdgpu_mn_invalidate_range_end'
   drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c:343: warning: Excess function 
parameter 'start' description in 'amdgpu_mn_invalidate_range_end'
   drivers/gpu/drm/amd/

Re: "ring gfx timeout" with Vega 64 on mesa 19.0.0-rc2 and kernel 5.0.0-rc6 (GPU reset still not works)

2019-02-13 Thread Grodzovsky, Andrey
Looks like you are still running this without the latest hang fix since i see 
the deadlock again, but actually what i forgot to ask you is to load amdgpu 
with vm_fault_stop=2 to freeze the ASIC once VM_FAULT is encountered - sorry 
about that. So please retest with amdgpu.vm_fault_stop=2 parameter in GRUB 
loader.

Andrey

On 2/13/19 3:08 PM, Mikhail Gavrilov wrote:
On Wed, 13 Feb 2019 at 23:40, Grodzovsky, Andrey 
mailto:andrey.grodzov...@amd.com>> wrote:
>
> Regarding the original VM_FAULT we can try to debug that a bit to - enable 
> this from trace-cmd
>
> sudo trace-cmd start -e dma_fence -e gpu_scheduler -e amdgpu -v -e 
> "amdgpu:amdgpu_mm_rreg" -e "amdgpu:amdgpu_mm_wreg" -e "amdgpu:amdgpu_iv"
>
> and when the hang happens
>
> as root
> cd /sys/kernel/debug/tracing && cat trace > event_dump
>
> + as usual would be nice to have the relevant wave dump and registers from 
> UMR + dmesg.
>
> Andrey
[https://ssl.gstatic.com/docs/doclist/images/icon_10_generic_list.png] 
gfx.tar.xz


Just in case, I duplicated all the files on the  file sharing service Mega:
https://mega.nz/#F!pgYCjYrS!NkeTFIja_qwmxqLoSEUyzA


--
Best Regards,
Mike Gavrilov.
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH 3/3] drm/dsc: Change infoframe_pack to payload_pack

2019-02-13 Thread Manasi Navare via amd-gfx
On Wed, Feb 13, 2019 at 09:45:36AM -0500, David Francis wrote:
> The function drm_dsc_pps_infoframe_pack only
> packed the payload portion of the infoframe.
> Change the input struct to the PPS payload
> to clarify the function's purpose and allow
> for drivers with their own handling of sdp.
> (e.g. drivers with their own struct for
> all SDP transactions)
>

I think if we are just sending pps_payload as an argument to this function
to pack payload, it also makes sense to just send pps_header as an input
to drm_dsc_dp_pps_header_init() to follow the consistency there.
So with that the caller will have to call the header_init first , initialize the
sdp header and then call the _payload_pack to pack the payload bytes.
And then send the entire infoframe to the sink.

Could you please also make that change in this patch?

Regards
Manasi
 
> Signed-off-by: David Francis 
> ---
>  drivers/gpu/drm/drm_dsc.c | 86 +++
>  drivers/gpu/drm/i915/intel_vdsc.c |  2 +-
>  include/drm/drm_dsc.h |  2 +-
>  3 files changed, 45 insertions(+), 45 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_dsc.c b/drivers/gpu/drm/drm_dsc.c
> index 9e675dd39a44..4ada4d4f59ac 100644
> --- a/drivers/gpu/drm/drm_dsc.c
> +++ b/drivers/gpu/drm/drm_dsc.c
> @@ -38,42 +38,42 @@ void drm_dsc_dp_pps_header_init(struct 
> drm_dsc_pps_infoframe *pps_sdp)
>  EXPORT_SYMBOL(drm_dsc_dp_pps_header_init);
>  
>  /**
> - * drm_dsc_pps_infoframe_pack() - Populates the DSC PPS infoframe
> + * drm_dsc_pps_payload_pack() - Populates the DSC PPS payload
>   * using the DSC configuration parameters in the order expected
>   * by the DSC Display Sink device. For the DSC, the sink device
>   * expects the PPS payload in the big endian format for the fields
>   * that span more than 1 byte.
>   *
> - * @pps_sdp:
> - * Secondary data packet for DSC Picture Parameter Set
> + * @pps_payload:
> + * DSC Picture Parameter Set
>   * @dsc_cfg:
>   * DSC Configuration data filled by driver
>   */
> -void drm_dsc_pps_infoframe_pack(struct drm_dsc_pps_infoframe *pps_sdp,
> +void drm_dsc_pps_payload_pack(struct drm_dsc_picture_parameter_set 
> *pps_payload,
>   const struct drm_dsc_config *dsc_cfg)
>  {
>   int i;
>  
>   /* Protect against someone accidently changing struct size */
> - BUILD_BUG_ON(sizeof(pps_sdp->pps_payload) !=
> + BUILD_BUG_ON(sizeof(*pps_payload) !=
>DP_SDP_PPS_HEADER_PAYLOAD_BYTES_MINUS_1 + 1);
>  
> - memset(_sdp->pps_payload, 0, sizeof(pps_sdp->pps_payload));
> + memset(pps_payload, 0, sizeof(*pps_payload));
>  
>   /* PPS 0 */
> - pps_sdp->pps_payload.dsc_version =
> + pps_payload->dsc_version =
>   dsc_cfg->dsc_version_minor |
>   dsc_cfg->dsc_version_major << DSC_PPS_VERSION_MAJOR_SHIFT;
>  
>   /* PPS 1, 2 is 0 */
>  
>   /* PPS 3 */
> - pps_sdp->pps_payload.pps_3 =
> + pps_payload->pps_3 =
>   dsc_cfg->line_buf_depth |
>   dsc_cfg->bits_per_component << DSC_PPS_BPC_SHIFT;
>  
>   /* PPS 4 */
> - pps_sdp->pps_payload.pps_4 =
> + pps_payload->pps_4 =
>   ((dsc_cfg->bits_per_pixel & DSC_PPS_BPP_HIGH_MASK) >>
>DSC_PPS_MSB_SHIFT) |
>   dsc_cfg->vbr_enable << DSC_PPS_VBR_EN_SHIFT |
> @@ -82,7 +82,7 @@ void drm_dsc_pps_infoframe_pack(struct 
> drm_dsc_pps_infoframe *pps_sdp,
>   dsc_cfg->block_pred_enable << DSC_PPS_BLOCK_PRED_EN_SHIFT;
>  
>   /* PPS 5 */
> - pps_sdp->pps_payload.bits_per_pixel_low =
> + pps_payload->bits_per_pixel_low =
>   (dsc_cfg->bits_per_pixel & DSC_PPS_LSB_MASK);
>  
>   /*
> @@ -93,103 +93,103 @@ void drm_dsc_pps_infoframe_pack(struct 
> drm_dsc_pps_infoframe *pps_sdp,
>*/
>  
>   /* PPS 6, 7 */
> - pps_sdp->pps_payload.pic_height = cpu_to_be16(dsc_cfg->pic_height);
> + pps_payload->pic_height = cpu_to_be16(dsc_cfg->pic_height);
>  
>   /* PPS 8, 9 */
> - pps_sdp->pps_payload.pic_width = cpu_to_be16(dsc_cfg->pic_width);
> + pps_payload->pic_width = cpu_to_be16(dsc_cfg->pic_width);
>  
>   /* PPS 10, 11 */
> - pps_sdp->pps_payload.slice_height = cpu_to_be16(dsc_cfg->slice_height);
> + pps_payload->slice_height = cpu_to_be16(dsc_cfg->slice_height);
>  
>   /* PPS 12, 13 */
> - pps_sdp->pps_payload.slice_width = cpu_to_be16(dsc_cfg->slice_width);
> + pps_payload->slice_width = cpu_to_be16(dsc_cfg->slice_width);
>  
>   /* PPS 14, 15 */
> - pps_sdp->pps_payload.chunk_size = 
> cpu_to_be16(dsc_cfg->slice_chunk_size);
> + pps_payload->chunk_size = cpu_to_be16(dsc_cfg->slice_chunk_size);
>  
>   /* PPS 16 */
> - pps_sdp->pps_payload.initial_xmit_delay_high =
> + pps_payload->initial_xmit_delay_high =
>   ((dsc_cfg->initial_xmit_delay &
> DSC_PPS_INIT_XMIT_DELAY_HIGH_MASK) >>
>DSC_PPS_MSB_SHIFT);
>  
>   /* PPS 17 */
> - 

Re: [PATCH v2 1/4] drm/sched: Fix entities with 0 rqs.

2019-02-13 Thread Alex Deucher via amd-gfx
On Wed, Jan 30, 2019 at 5:43 AM Christian König
 wrote:
>
> Am 30.01.19 um 02:53 schrieb Bas Nieuwenhuizen:
> > Some blocks in amdgpu can have 0 rqs.
> >
> > Job creation already fails with -ENOENT when entity->rq is NULL,
> > so jobs cannot be pushed. Without a rq there is no scheduler to
> > pop jobs, and rq selection already does the right thing with a
> > list of length 0.
> >
> > So the operations we need to fix are:
> >- Creation, do not set rq to rq_list[0] if the list can have length 0.
> >- Do not flush any jobs when there is no rq.
> >- On entity destruction handle the rq = NULL case.
> >- on set_priority, do not try to change the rq if it is NULL.
> >
> > Signed-off-by: Bas Nieuwenhuizen 
>
> One minor comment on patch #2, apart from that the series is
> Reviewed-by: Christian König .
>
> I'm going to make the change on #2 and pick them up for inclusion in
> amd-staging-drm-next.

Hi Christian,

I haven't seen these land yet.  Just want to make sure they don't fall
through the cracks.

Alex

>
> Thanks for the help,
> Christian.
>
> > ---
> >   drivers/gpu/drm/scheduler/sched_entity.c | 39 
> >   1 file changed, 26 insertions(+), 13 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
> > b/drivers/gpu/drm/scheduler/sched_entity.c
> > index 4463d3826ecb..8e31b6628d09 100644
> > --- a/drivers/gpu/drm/scheduler/sched_entity.c
> > +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> > @@ -52,12 +52,12 @@ int drm_sched_entity_init(struct drm_sched_entity 
> > *entity,
> >   {
> >   int i;
> >
> > - if (!(entity && rq_list && num_rq_list > 0 && rq_list[0]))
> > + if (!(entity && rq_list && (num_rq_list == 0 || rq_list[0])))
> >   return -EINVAL;
> >
> >   memset(entity, 0, sizeof(struct drm_sched_entity));
> >   INIT_LIST_HEAD(>list);
> > - entity->rq = rq_list[0];
> > + entity->rq = NULL;
> >   entity->guilty = guilty;
> >   entity->num_rq_list = num_rq_list;
> >   entity->rq_list = kcalloc(num_rq_list, sizeof(struct drm_sched_rq *),
> > @@ -67,6 +67,10 @@ int drm_sched_entity_init(struct drm_sched_entity 
> > *entity,
> >
> >   for (i = 0; i < num_rq_list; ++i)
> >   entity->rq_list[i] = rq_list[i];
> > +
> > + if (num_rq_list)
> > + entity->rq = rq_list[0];
> > +
> >   entity->last_scheduled = NULL;
> >
> >   spin_lock_init(>rq_lock);
> > @@ -165,6 +169,9 @@ long drm_sched_entity_flush(struct drm_sched_entity 
> > *entity, long timeout)
> >   struct task_struct *last_user;
> >   long ret = timeout;
> >
> > + if (!entity->rq)
> > + return 0;
> > +
> >   sched = entity->rq->sched;
> >   /**
> >* The client will not queue more IBs during this fini, consume 
> > existing
> > @@ -264,20 +271,24 @@ static void drm_sched_entity_kill_jobs(struct 
> > drm_sched_entity *entity)
> >*/
> >   void drm_sched_entity_fini(struct drm_sched_entity *entity)
> >   {
> > - struct drm_gpu_scheduler *sched;
> > + struct drm_gpu_scheduler *sched = NULL;
> >
> > - sched = entity->rq->sched;
> > - drm_sched_rq_remove_entity(entity->rq, entity);
> > + if (entity->rq) {
> > + sched = entity->rq->sched;
> > + drm_sched_rq_remove_entity(entity->rq, entity);
> > + }
> >
> >   /* Consumption of existing IBs wasn't completed. Forcefully
> >* remove them here.
> >*/
> >   if (spsc_queue_peek(>job_queue)) {
> > - /* Park the kernel for a moment to make sure it isn't 
> > processing
> > -  * our enity.
> > -  */
> > - kthread_park(sched->thread);
> > - kthread_unpark(sched->thread);
> > + if (sched) {
> > + /* Park the kernel for a moment to make sure it isn't 
> > processing
> > +  * our enity.
> > +  */
> > + kthread_park(sched->thread);
> > + kthread_unpark(sched->thread);
> > + }
> >   if (entity->dependency) {
> >   dma_fence_remove_callback(entity->dependency,
> > >cb);
> > @@ -362,9 +373,11 @@ void drm_sched_entity_set_priority(struct 
> > drm_sched_entity *entity,
> >   for (i = 0; i < entity->num_rq_list; ++i)
> >   drm_sched_entity_set_rq_priority(>rq_list[i], 
> > priority);
> >
> > - drm_sched_rq_remove_entity(entity->rq, entity);
> > - drm_sched_entity_set_rq_priority(>rq, priority);
> > - drm_sched_rq_add_entity(entity->rq, entity);
> > + if (entity->rq) {
> > + drm_sched_rq_remove_entity(entity->rq, entity);
> > + drm_sched_entity_set_rq_priority(>rq, priority);
> > + drm_sched_rq_add_entity(entity->rq, entity);
> > + }
> >
> >   spin_unlock(>rq_lock);
> >   }
>
> ___
> amd-gfx 

Re: [PATCH 2/3] drm: Add basic helper to allow precise pageflip timestamps in vrr.

2019-02-13 Thread Daniel Vetter
On Wed, Feb 13, 2019 at 07:10:00PM +0100, Mario Kleiner wrote:
> On Wed, Feb 13, 2019 at 5:03 PM Daniel Vetter  wrote:
> >
> > On Wed, Feb 13, 2019 at 4:46 PM Kazlauskas, Nicholas
> >  wrote:
> > >
> > > On 2/13/19 10:14 AM, Daniel Vetter wrote:
> > > > On Wed, Feb 13, 2019 at 3:33 PM Kazlauskas, Nicholas
> > > >  wrote:
> > > >>
> > > >> On 2/13/19 4:50 AM, Daniel Vetter wrote:
> > > >>> On Tue, Feb 12, 2019 at 10:32:31PM +0100, Mario Kleiner wrote:
> > >  On Mon, Feb 11, 2019 at 6:04 PM Daniel Vetter  
> > >  wrote:
> > > >
> > > > On Mon, Feb 11, 2019 at 4:01 PM Kazlauskas, Nicholas
> > > >  wrote:
> > > >>
> > > >> On 2/11/19 3:35 AM, Daniel Vetter wrote:
> > > >>> On Mon, Feb 11, 2019 at 04:22:24AM +0100, Mario Kleiner wrote:
> > >  The pageflip completion timestamps transmitted to userspace
> > >  via pageflip completion events are supposed to describe the
> > >  time at which the first pixel of the new post-pageflip scanout
> > >  buffer leaves the video output of the gpu. This time is
> > >  identical to end of vblank, when active scanout starts.
> > > 
> > >  For a crtc in standard fixed refresh rate, the end of vblank
> > >  is identical to the vblank timestamps calculated by
> > >  drm_update_vblank_count() at each vblank interrupt, or each
> > >  vblank dis-/enable. Therefore pageflip events just carry
> > >  that vblank timestamp as their pageflip timestamp.
> > > 
> > >  For a crtc switched to variable refresh rate mode (vrr), the
> > >  pageflip completion timestamps are identical to the vblank
> > >  timestamps iff the pageflip was executed early in vblank,
> > >  before the minimum vblank duration elapsed. In this case
> > >  the time of display onset is identical to when the crtc
> > >  is running in fixed refresh rate.
> > > 
> > >  However, if a pageflip completes later in the vblank, inside
> > >  the "extended front porch" in vrr mode, then the vblank will
> > >  terminate at a fixed (back porch) duration after flip, so
> > >  the display onset time is delayed correspondingly. In this
> > >  case the vblank timestamp computed at vblank irq time would
> > >  be too early, and we need a way to calculate an estimated
> > >  pageflip timestamp that will be later than the vblank timestamp.
> > > 
> > >  How a driver determines such a "late flip" timestamp is hw
> > >  and driver specific, but this patch adds a new helper function
> > >  that allows the driver to propose such an alternate "late flip"
> > >  timestamp for use in pageflip events:
> > > 
> > >  drm_crtc_set_vrr_pageflip_timestamp(crtc, flip_timestamp);
> > > 
> > >  When sending out pageflip events, we now compare that proposed
> > >  flip_timestamp against the vblank timestamp of the current
> > >  vblank of flip completion and choose to send out the greater/
> > >  later timestamp as flip completion timestamp.
> > > 
> > >  The most simple way for a kms driver to supply a suitable
> > >  flip_timestamp in vrr mode would be to simply take a timestamp
> > >  at start of the pageflip completion handler, e.g., pageflip
> > >  irq handler: flip_timestamp = ktime_get(); and then set that
> > >  as proposed "late" alternative timestamp via ...
> > >  drm_crtc_set_vrr_pageflip_timestamp(crtc, flip_timestamp);
> > > 
> > >  More clever approaches could try to add some corrective offset
> > >  for fixed back porch duration, or ideally use hardware features
> > >  like hw timestamps to calculate the exact end time of vblank.
> > > 
> > >  Signed-off-by: Mario Kleiner 
> > >  Cc: Nicholas Kazlauskas 
> > >  Cc: Harry Wentland 
> > >  Cc: Alex Deucher 
> > > >>>
> > > >>> Uh, this looks like a pretty bad hack. Can't we fix amdgpu to 
> > > >>> only give us
> > > >>> the right timestampe, once? With this I guess if you do a vblank 
> > > >>> query in
> > > >>> between the wrong and the right vblank you'll get the bogus 
> > > >>> value. Not
> > > >>> really great for userspace.
> > > >>> -Daniel
> > > >>
> > > >> I think we calculate the timestamp and send the vblank event both 
> > > >> within
> > > >> the pageflip IRQ handler so calculating the right pageflip 
> > > >> timestamp
> > > >> once could probably be done. I'm not sure if it's easier than 
> > > >> proposing
> > > >> a later flip time with an API like this though.
> > > >>
> > > >> The actual scanout time should be known from the page-flip handler 
> > > >> so
> > > >> the semantics for VRR on/off remain the same. This is because the
> > 

Re: [PATCH] drm/amd/display: Fix reference counting for struct dc_sink.

2019-02-13 Thread Alex Deucher via amd-gfx
Add amd-gfx and some DC people.

Alex

On Sun, Feb 10, 2019 at 5:13 AM  wrote:
>
> From: Mathias Fröhlich 
>
> Reference counting in amdgpu_dm_connector for amdgpu_dm_connector::dc_sink
> and amdgpu_dm_connector::dc_em_sink as well as in dc_link::local_sink seems
> to be out of shape. Thus make reference counting consistent for these
> members and just plain increment the reference count when the variable
> gets assigned and decrement when the pointer is set to zero or replaced.
> Also simplify reference counting in selected function sopes to be sure the
> reference is released in any case. In some cases add NULL pointer check
> before dereferencing.
> At a hand full of places a comment is placed to stat that the reference
> increment happened already somewhere else.
>
> This actually fixes the following kernel bug on my system when enabling
> display core in amdgpu. There are some more similar bug reports around,
> so it probably helps at more places.
>
>kernel BUG at mm/slub.c:294!
>invalid opcode:  [#1] SMP PTI
>CPU: 9 PID: 1180 Comm: Xorg Not tainted 5.0.0-rc1+ #2
>Hardware name: Supermicro X10DAi/X10DAI, BIOS 3.0a 02/05/2018
>RIP: 0010:__slab_free+0x1e2/0x3d0
>Code: 8b 54 24 30 48 89 4c 24 28 e8 da fb ff ff 4c 8b 54 24 28 85 c0 0f 85 
> 67 fe ff ff 48 8d 65 d8 5b 41 5c 41 5d 41 5e 41 5f 5d c3 <0f> 0b 49 3b 5c 24 
> 28 75 ab 48 8b 44 24 30 49 89 4c 24 28 49 89 44
>RSP: 0018:b0978589fa90 EFLAGS: 00010246
>RAX: 92f12806c400 RBX: 80200019 RCX: 92f12806c400
>RDX: 92f12806c400 RSI: dd6421a01a00 RDI: 92ed2f406e80
>RBP: b0978589fb40 R08: 0001 R09: c0ee4748
>R10: 92f12806c400 R11: 0001 R12: dd6421a01a00
>R13: 92f12806c400 R14: 92ed2f406e80 R15: dd6421a01a20
>FS:  7f4170be0ac0() GS:92ed2fb4() 
> knlGS:
>CS:  0010 DS:  ES:  CR0: 80050033
>CR2: 562818aaa000 CR3: 00045745a002 CR4: 003606e0
>DR0:  DR1:  DR2: 
>DR3:  DR6: fffe0ff0 DR7: 0400
>Call Trace:
> ? drm_dbg+0x87/0x90 [drm]
> dc_stream_release+0x28/0x50 [amdgpu]
> amdgpu_dm_connector_mode_valid+0xb4/0x1f0 [amdgpu]
> drm_helper_probe_single_connector_modes+0x492/0x6b0 [drm_kms_helper]
> drm_mode_getconnector+0x457/0x490 [drm]
> ? drm_connector_property_set_ioctl+0x60/0x60 [drm]
> drm_ioctl_kernel+0xa9/0xf0 [drm]
> drm_ioctl+0x201/0x3a0 [drm]
> ? drm_connector_property_set_ioctl+0x60/0x60 [drm]
> amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
> do_vfs_ioctl+0xa4/0x630
> ? __sys_recvmsg+0x83/0xa0
> ksys_ioctl+0x60/0x90
> __x64_sys_ioctl+0x16/0x20
> do_syscall_64+0x5b/0x160
> entry_SYSCALL_64_after_hwframe+0x44/0xa9
>RIP: 0033:0x7f417110809b
>Code: 0f 1e fa 48 8b 05 ed bd 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff 
> ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 
> 73 01 c3 48 8b 0d bd bd 0c 00 f7 d8 64 89 01 48
>RSP: 002b:7ffdd8d1c268 EFLAGS: 0246 ORIG_RAX: 0010
>RAX: ffda RBX: 562818a8ebc0 RCX: 7f417110809b
>RDX: 7ffdd8d1c2a0 RSI: c05064a7 RDI: 0012
>RBP: 7ffdd8d1c2a0 R08: 562819012280 R09: 0007
>R10:  R11: 0246 R12: c05064a7
>R13: 0012 R14: 0012 R15: 7ffdd8d1c2a0
>Modules linked in: nfsv4 dns_resolver nfs lockd grace fscache fuse vfat 
> fat amdgpu intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp 
> kvm_intel kvm irqbypass crct10dif_pclmul chash gpu_sched crc32_pclmul 
> snd_hda_codec_realtek ghash_clmulni_intel amd_iommu_v2 iTCO_wdt 
> iTCO_vendor_support ttm snd_hda_codec_generic snd_hda_codec_hdmi 
> ledtrig_audio snd_hda_intel drm_kms_helper snd_hda_codec intel_cstate 
> snd_hda_core drm snd_hwdep snd_seq snd_seq_device intel_uncore snd_pcm 
> intel_rapl_perf snd_timer snd soundcore ioatdma pcspkr intel_wmi_thunderbolt 
> mxm_wmi i2c_i801 lpc_ich pcc_cpufreq auth_rpcgss sunrpc igb crc32c_intel 
> i2c_algo_bit dca wmi hid_cherry analog gameport joydev
>
> This patch is based on agd5f/drm-next-5.1-wip. This patch does not require
> all of that, but agd5f/drm-next-5.1-wip contains at least one more dc_sink
> counting fix that I could spot.
>
> Signed-off-by: Mathias Fröhlich 
> ---
>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 43 +++
>  .../display/amdgpu_dm/amdgpu_dm_mst_types.c   |  1 +
>  drivers/gpu/drm/amd/display/dc/core/dc_link.c |  1 +
>  3 files changed, 37 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index 3a6f595f295e..20fa01bff685 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ 

[pull] amdgpu, sched drm-fixes-5.0

2019-02-13 Thread Alex Deucher via amd-gfx
Hi Dave, Daniel,

A few small fixes for 5.0.

amdgpu:
- Vega20 psp fix
- Add vrr range to debugfs for freesync debugging

sched:
- Scheduler race fix

The following changes since commit 78eb1ca47589f0cd9db2ceb28b60434e8d512131:

  Merge branch 'vmwgfx-fixes-5.0-2' of 
git://people.freedesktop.org/~thomash/linux into drm-fixes (2019-02-07 10:36:47 
+1000)

are available in the Git repository at:

  git://people.freedesktop.org/~agd5f/linux drm-fixes-5.0

for you to fetch changes up to 1d69511e49b0107c0a60ff5ef488f5a2512a50ae:

  drm/amdgpu/psp11: TA firmware is optional (v3) (2019-02-13 09:44:05 -0500)


Alex Deucher (1):
  drm/amdgpu/psp11: TA firmware is optional (v3)

Eric Anholt (1):
  drm/sched: Always trace the dependencies we wait on, to fix a race.

Nicholas Kazlauskas (1):
  drm/amd/display: Expose connector VRR range via debugfs

 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c|  9 +--
 drivers/gpu/drm/amd/amdgpu/psp_v11_0.c | 28 --
 .../drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c  | 22 -
 drivers/gpu/drm/scheduler/sched_entity.c   |  7 ++
 4 files changed, 46 insertions(+), 20 deletions(-)
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH] drm/amdgpu/psp11: TA firmware is optional (v3)

2019-02-13 Thread James Zhu
This patch is
Reviewed-by: James Zhu
Tested-by: James Zhu

On 2019-02-12 10:29 p.m., Zhang, Hawking wrote:
> Reviewed-by: Hawking Zhang 
>
> Regards,
> Hawking
> -Original Message-
> From: amd-gfx  On Behalf Of Alex 
> Deucher via amd-gfx
> Sent: 2019年2月13日 5:11
> To: amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander 
> Subject: [PATCH] drm/amdgpu/psp11: TA firmware is optional (v3)
>
> Don't warn or fail if it's missing.
>
> v2: handle xgmi case more gracefully.
> v3: handle older kernels properly
>
> Signed-off-by: Alex Deucher 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c |  9 ++--  
> drivers/gpu/drm/amd/amdgpu/psp_v11_0.c  | 28 ++---
>   2 files changed, 23 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> index 8fab0d637ee5..3a9b48b227ac 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> @@ -90,8 +90,10 @@ static int psp_sw_fini(void *handle)
>   adev->psp.sos_fw = NULL;
>   release_firmware(adev->psp.asd_fw);
>   adev->psp.asd_fw = NULL;
> - release_firmware(adev->psp.ta_fw);
> - adev->psp.ta_fw = NULL;
> + if (adev->psp.ta_fw) {
> + release_firmware(adev->psp.ta_fw);
> + adev->psp.ta_fw = NULL;
> + }
>   return 0;
>   }
>   
> @@ -435,6 +437,9 @@ static int psp_xgmi_initialize(struct psp_context *psp)
>   struct ta_xgmi_shared_memory *xgmi_cmd;
>   int ret;
>   
> + if (!psp->adev->psp.ta_fw)
> + return -ENOENT;
> +
>   if (!psp->xgmi_context.initialized) {
>   ret = psp_xgmi_init_shared_buf(psp);
>   if (ret)
> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c 
> b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
> index 0c6e7f9b143f..189fcb004579 100644
> --- a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
> @@ -152,18 +152,22 @@ static int psp_v11_0_init_microcode(struct psp_context 
> *psp)
>   
>   snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_ta.bin", chip_name);
>   err = request_firmware(>psp.ta_fw, fw_name, adev->dev);
> - if (err)
> - goto out2;
> -
> - err = amdgpu_ucode_validate(adev->psp.ta_fw);
> - if (err)
> - goto out2;
> -
> - ta_hdr = (const struct ta_firmware_header_v1_0 *)adev->psp.ta_fw->data;
> - adev->psp.ta_xgmi_ucode_version = 
> le32_to_cpu(ta_hdr->ta_xgmi_ucode_version);
> - adev->psp.ta_xgmi_ucode_size = le32_to_cpu(ta_hdr->ta_xgmi_size_bytes);
> - adev->psp.ta_xgmi_start_addr = (uint8_t *)ta_hdr +
> - le32_to_cpu(ta_hdr->header.ucode_array_offset_bytes);
> + if (err) {
> + release_firmware(adev->psp.ta_fw);
> + adev->psp.ta_fw = NULL;
> + dev_info(adev->dev,
> +  "psp v11.0: Failed to load firmware \"%s\"\n", 
> fw_name);
> + } else {
> + err = amdgpu_ucode_validate(adev->psp.ta_fw);
> + if (err)
> + goto out2;
> +
> + ta_hdr = (const struct ta_firmware_header_v1_0 
> *)adev->psp.ta_fw->data;
> + adev->psp.ta_xgmi_ucode_version = 
> le32_to_cpu(ta_hdr->ta_xgmi_ucode_version);
> + adev->psp.ta_xgmi_ucode_size = 
> le32_to_cpu(ta_hdr->ta_xgmi_size_bytes);
> + adev->psp.ta_xgmi_start_addr = (uint8_t *)ta_hdr +
> + le32_to_cpu(ta_hdr->header.ucode_array_offset_bytes);
> + }
>   
>   return 0;
>   
> --
> 2.20.1
>
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH] drm/amdgpu/powerplay: declare firmware for CI cards

2019-02-13 Thread James Zhu
This patch is
Reviewed-by: James Zhu 

Tested-by: James Zhu 

On 2019-02-13 3:10 p.m., Alex Deucher via amd-gfx wrote:
> Missing firmware declaration caused firmware requirement to
> not be noted by the module and may cause firmware to not
> be available in initrd.
>
> Fixes: bc4b539e385088 "drm/amdgpu: remove old CI DPM implementation"
> Signed-off-by: Alex Deucher 
> ---
>   drivers/gpu/drm/amd/powerplay/smumgr/smumgr.c | 4 
>   1 file changed, 4 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/smumgr.c 
> b/drivers/gpu/drm/amd/powerplay/smumgr/smumgr.c
> index a6edd5df33b0..4240aeec9000 100644
> --- a/drivers/gpu/drm/amd/powerplay/smumgr/smumgr.c
> +++ b/drivers/gpu/drm/amd/powerplay/smumgr/smumgr.c
> @@ -29,6 +29,10 @@
>   #include 
>   #include "smumgr.h"
>   
> +MODULE_FIRMWARE("amdgpu/bonaire_smc.bin");
> +MODULE_FIRMWARE("amdgpu/bonaire_k_smc.bin");
> +MODULE_FIRMWARE("amdgpu/hawaii_smc.bin");
> +MODULE_FIRMWARE("amdgpu/hawaii_k_smc.bin");
>   MODULE_FIRMWARE("amdgpu/topaz_smc.bin");
>   MODULE_FIRMWARE("amdgpu/topaz_k_smc.bin");
>   MODULE_FIRMWARE("amdgpu/tonga_smc.bin");
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH] drm/amdgpu/powerplay: declare firmware for CI cards

2019-02-13 Thread Alex Deucher via amd-gfx
Missing firmware declaration caused firmware requirement to
not be noted by the module and may cause firmware to not
be available in initrd.

Fixes: bc4b539e385088 "drm/amdgpu: remove old CI DPM implementation"
Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/powerplay/smumgr/smumgr.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/smumgr.c 
b/drivers/gpu/drm/amd/powerplay/smumgr/smumgr.c
index a6edd5df33b0..4240aeec9000 100644
--- a/drivers/gpu/drm/amd/powerplay/smumgr/smumgr.c
+++ b/drivers/gpu/drm/amd/powerplay/smumgr/smumgr.c
@@ -29,6 +29,10 @@
 #include 
 #include "smumgr.h"
 
+MODULE_FIRMWARE("amdgpu/bonaire_smc.bin");
+MODULE_FIRMWARE("amdgpu/bonaire_k_smc.bin");
+MODULE_FIRMWARE("amdgpu/hawaii_smc.bin");
+MODULE_FIRMWARE("amdgpu/hawaii_k_smc.bin");
 MODULE_FIRMWARE("amdgpu/topaz_smc.bin");
 MODULE_FIRMWARE("amdgpu/topaz_k_smc.bin");
 MODULE_FIRMWARE("amdgpu/tonga_smc.bin");
-- 
2.20.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 18/35] drm/amd/display: Move enum gamut_remap_select to hw_shared.h

2019-02-13 Thread sunpeng.li
From: Eric Bernstein 

This enum definition is shared, so move it to a shared location.

Signed-off-by: Eric Bernstein 
Reviewed-by: Tony Cheng 
Acked-by: Leo Li 
---
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp.c| 7 ---
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp_cm.c | 7 ---
 drivers/gpu/drm/amd/display/dc/inc/hw/hw_shared.h   | 6 ++
 3 files changed, 6 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp.c 
b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp.c
index cd1ebe5..f91e4b4 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp.c
@@ -91,13 +91,6 @@ enum dscl_mode_sel {
DSCL_MODE_DSCL_BYPASS = 6
 };
 
-enum gamut_remap_select {
-   GAMUT_REMAP_BYPASS = 0,
-   GAMUT_REMAP_COEFF,
-   GAMUT_REMAP_COMA_COEFF,
-   GAMUT_REMAP_COMB_COEFF
-};
-
 void dpp_read_state(struct dpp *dpp_base,
struct dcn_dpp_state *s)
 {
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp_cm.c 
b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp_cm.c
index 41f0f4c..882bcc5 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp_cm.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp_cm.c
@@ -88,13 +88,6 @@ enum dscl_mode_sel {
DSCL_MODE_DSCL_BYPASS = 6
 };
 
-enum gamut_remap_select {
-   GAMUT_REMAP_BYPASS = 0,
-   GAMUT_REMAP_COEFF,
-   GAMUT_REMAP_COMA_COEFF,
-   GAMUT_REMAP_COMB_COEFF
-};
-
 static const struct dpp_input_csc_matrix dpp_input_csc_matrix[] = {
{COLOR_SPACE_SRGB,
{0x2000, 0, 0, 0, 0, 0x2000, 0, 0, 0, 0, 0x2000, 0} },
diff --git a/drivers/gpu/drm/amd/display/dc/inc/hw/hw_shared.h 
b/drivers/gpu/drm/amd/display/dc/inc/hw/hw_shared.h
index da85537..4c8e2c6 100644
--- a/drivers/gpu/drm/amd/display/dc/inc/hw/hw_shared.h
+++ b/drivers/gpu/drm/amd/display/dc/inc/hw/hw_shared.h
@@ -146,6 +146,12 @@ struct out_csc_color_matrix {
uint16_t regval[12];
 };
 
+enum gamut_remap_select {
+   GAMUT_REMAP_BYPASS = 0,
+   GAMUT_REMAP_COEFF,
+   GAMUT_REMAP_COMA_COEFF,
+   GAMUT_REMAP_COMB_COEFF
+};
 
 enum opp_regamma {
OPP_REGAMMA_BYPASS = 0,
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 03/35] drm/amd/display: remove screen flashes on seamless boot

2019-02-13 Thread sunpeng.li
From: Anthony Koo 

[Why]
We want boot to desktop to be seamless

[How]
During init pipes, avoid touching the pipes where GOP has already
enabled the HW to the state we want.

Signed-off-by: Anthony Koo 
Reviewed-by: Aric Cyr 
Acked-by: Leo Li 
---
 .../amd/display/dc/dce110/dce110_hw_sequencer.c| 10 +++-
 .../drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c  | 30 +-
 drivers/gpu/drm/amd/display/include/dal_asic_id.h  |  3 +++
 3 files changed, 41 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c 
b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
index e1b285e..453ff07 100644
--- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
@@ -1521,6 +1521,14 @@ void dce110_enable_accelerated_mode(struct dc *dc, 
struct dc_state *context)
struct dc_link *edp_link = get_link_for_edp(dc);
bool can_edp_fast_boot_optimize = false;
bool apply_edp_fast_boot_optimization = false;
+   bool can_apply_seamless_boot = false;
+
+   for (i = 0; i < context->stream_count; i++) {
+   if (context->streams[i]->apply_seamless_boot_optimization) {
+   can_apply_seamless_boot = true;
+   break;
+   }
+   }
 
if (edp_link) {
/* this seems to cause blank screens on DCE8 */
@@ -1549,7 +1557,7 @@ void dce110_enable_accelerated_mode(struct dc *dc, struct 
dc_state *context)
}
}
 
-   if (!apply_edp_fast_boot_optimization) {
+   if (!apply_edp_fast_boot_optimization && !can_apply_seamless_boot) {
if (edp_link_to_turnoff) {
/*turn off backlight before DP_blank and encoder 
powered down*/
dc->hwss.edp_backlight_control(edp_link_to_turnoff, 
false);
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c 
b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
index 7f95808..d42fade 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
@@ -959,9 +959,25 @@ static void dcn10_disable_plane(struct dc *dc, struct 
pipe_ctx *pipe_ctx)
 static void dcn10_init_pipes(struct dc *dc, struct dc_state *context)
 {
int i;
+   bool can_apply_seamless_boot = false;
+
+   for (i = 0; i < context->stream_count; i++) {
+   if (context->streams[i]->apply_seamless_boot_optimization) {
+   can_apply_seamless_boot = true;
+   break;
+   }
+   }
 
for (i = 0; i < dc->res_pool->pipe_count; i++) {
struct timing_generator *tg = 
dc->res_pool->timing_generators[i];
+   struct pipe_ctx *pipe_ctx = >res_ctx.pipe_ctx[i];
+
+   /* There is assumption that pipe_ctx is not mapping irregularly
+* to non-preferred front end. If pipe_ctx->stream is not NULL,
+* we will use the pipe, so don't disable
+*/
+   if (pipe_ctx->stream != NULL)
+   continue;
 
if (tg->funcs->is_tg_enabled(tg))
tg->funcs->lock(tg);
@@ -975,7 +991,9 @@ static void dcn10_init_pipes(struct dc *dc, struct dc_state 
*context)
}
}
 
-   dc->res_pool->mpc->funcs->mpc_init(dc->res_pool->mpc);
+   /* Cannot reset the MPC mux if seamless boot */
+   if (!can_apply_seamless_boot)
+   dc->res_pool->mpc->funcs->mpc_init(dc->res_pool->mpc);
 
for (i = 0; i < dc->res_pool->pipe_count; i++) {
struct timing_generator *tg = 
dc->res_pool->timing_generators[i];
@@ -983,6 +1001,16 @@ static void dcn10_init_pipes(struct dc *dc, struct 
dc_state *context)
struct dpp *dpp = dc->res_pool->dpps[i];
struct pipe_ctx *pipe_ctx = >res_ctx.pipe_ctx[i];
 
+   // W/A for issue with dc_post_update_surfaces_to_stream
+   hubp->power_gated = true;
+
+   /* There is assumption that pipe_ctx is not mapping irregularly
+* to non-preferred front end. If pipe_ctx->stream is not NULL,
+* we will use the pipe, so don't disable
+*/
+   if (pipe_ctx->stream != NULL)
+   continue;
+
dpp->funcs->dpp_reset(dpp);
 
pipe_ctx->stream_res.tg = tg;
diff --git a/drivers/gpu/drm/amd/display/include/dal_asic_id.h 
b/drivers/gpu/drm/amd/display/include/dal_asic_id.h
index 4f501dd..34d6fdc 100644
--- a/drivers/gpu/drm/amd/display/include/dal_asic_id.h
+++ b/drivers/gpu/drm/amd/display/include/dal_asic_id.h
@@ -131,6 +131,7 @@
 #define INTERNAL_REV_RAVEN_A0 0x00/* First spin of Raven */
 #define RAVEN_A0 0x01
 #define RAVEN_B0 0x21
+#define PICASSO_A0 0x41
 #if 

[PATCH 05/35] drm/amd/display: Ungate stream before programming registers

2019-02-13 Thread sunpeng.li
From: Gary Kattan 

[Why]
Certain tests fail after a fresh reboot. This is caused by writing to
registers prior to ungating the stream we're trying to program.

[How]
Make sure the stream is ungated before writing to its registers.
This also enables power-gating plane resources before init_hw
initializes them.
Additionally, this does some refactoring to move gating/ungating
from enable/disable_plane functions to where stream resources are
enabled/disabled.

Signed-off-by: Gary Kattan 
Reviewed-by: Dmytro Laktyushkin 
Acked-by: Leo Li 
---
 drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c | 6 ++
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c   | 8 ++--
 drivers/gpu/drm/amd/display/dc/inc/hw_sequencer.h   | 4 
 3 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c 
b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
index 453ff07..42ee0a6 100644
--- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
@@ -1300,6 +1300,10 @@ static enum dc_status apply_single_controller_ctx_to_hw(
struct drr_params params = {0};
unsigned int event_triggers = 0;
 
+   if (dc->hwss.disable_stream_gating) {
+   dc->hwss.disable_stream_gating(dc, pipe_ctx);
+   }
+
if (pipe_ctx->stream_res.audio != NULL) {
struct audio_output audio_output;
 
@@ -2684,6 +2688,8 @@ static const struct hw_sequencer_funcs dce110_funcs = {
.set_static_screen_control = set_static_screen_control,
.reset_hw_ctx_wrap = dce110_reset_hw_ctx_wrap,
.enable_stream_timing = dce110_enable_stream_timing,
+   .disable_stream_gating = NULL,
+   .enable_stream_gating = NULL,
.setup_stereo = NULL,
.set_avmute = dce110_set_avmute,
.wait_for_mpcc_disconnect = dce110_wait_for_mpcc_disconnect,
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c 
b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
index d42fade..d4dde1d 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
@@ -1165,11 +1165,13 @@ static void reset_hw_ctx_wrap(
struct clock_source *old_clk = 
pipe_ctx_old->clock_source;
 
reset_back_end_for_pipe(dc, pipe_ctx_old, 
dc->current_state);
+   if (dc->hwss.enable_stream_gating) {
+   dc->hwss.enable_stream_gating(dc, pipe_ctx);
+   }
if (old_clk)
old_clk->funcs->cs_power_down(old_clk);
}
}
-
 }
 
 static bool patch_address_for_sbs_tb_stereo(
@@ -2786,7 +2788,9 @@ static const struct hw_sequencer_funcs dcn10_funcs = {
.edp_wait_for_hpd_ready = hwss_edp_wait_for_hpd_ready,
.set_cursor_position = dcn10_set_cursor_position,
.set_cursor_attribute = dcn10_set_cursor_attribute,
-   .set_cursor_sdr_white_level = dcn10_set_cursor_sdr_white_level
+   .set_cursor_sdr_white_level = dcn10_set_cursor_sdr_white_level,
+   .disable_stream_gating = NULL,
+   .enable_stream_gating = NULL
 };
 
 
diff --git a/drivers/gpu/drm/amd/display/dc/inc/hw_sequencer.h 
b/drivers/gpu/drm/amd/display/dc/inc/hw_sequencer.h
index 341b481..fc03320 100644
--- a/drivers/gpu/drm/amd/display/dc/inc/hw_sequencer.h
+++ b/drivers/gpu/drm/amd/display/dc/inc/hw_sequencer.h
@@ -68,6 +68,10 @@ struct stream_resource;
 
 struct hw_sequencer_funcs {
 
+   void (*disable_stream_gating)(struct dc *dc, struct pipe_ctx *pipe_ctx);
+
+   void (*enable_stream_gating)(struct dc *dc, struct pipe_ctx *pipe_ctx);
+
void (*init_hw)(struct dc *dc);
 
void (*init_pipes)(struct dc *dc, struct dc_state *context);
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 06/35] drm/amd/display: Raise dispclk value for dce11

2019-02-13 Thread sunpeng.li
From: Roman Li 

[Why]
The visual corruption due to low display clock value.
Observed on Carrizo 4K@60Hz.

[How]
There was earlier patch for dce_update_clocks:
Adding +15% workaround also to to dce11_update_clocks

Signed-off-by: Roman Li 
Reviewed-by: Nicholas Kazlauskas 
Acked-by: Leo Li 
---
 drivers/gpu/drm/amd/display/dc/dce/dce_clk_mgr.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_clk_mgr.c 
b/drivers/gpu/drm/amd/display/dc/dce/dce_clk_mgr.c
index bbe0517..6e142c2 100644
--- a/drivers/gpu/drm/amd/display/dc/dce/dce_clk_mgr.c
+++ b/drivers/gpu/drm/amd/display/dc/dce/dce_clk_mgr.c
@@ -696,6 +696,11 @@ static void dce11_update_clocks(struct clk_mgr *clk_mgr,
 {
struct dce_clk_mgr *clk_mgr_dce = TO_DCE_CLK_MGR(clk_mgr);
struct dm_pp_power_level_change_request level_change_req;
+   int patched_disp_clk = context->bw.dce.dispclk_khz;
+
+   /*TODO: W/A for dal3 linux, investigate why this works */
+   if (!clk_mgr_dce->dfs_bypass_active)
+   patched_disp_clk = patched_disp_clk * 115 / 100;
 
level_change_req.power_level = dce_get_required_clocks_state(clk_mgr, 
context);
/* get max clock state from PPLIB */
@@ -705,9 +710,9 @@ static void dce11_update_clocks(struct clk_mgr *clk_mgr,
clk_mgr_dce->cur_min_clks_state = 
level_change_req.power_level;
}
 
-   if (should_set_clock(safe_to_lower, context->bw.dce.dispclk_khz, 
clk_mgr->clks.dispclk_khz)) {
-   context->bw.dce.dispclk_khz = dce_set_clock(clk_mgr, 
context->bw.dce.dispclk_khz);
-   clk_mgr->clks.dispclk_khz = context->bw.dce.dispclk_khz;
+   if (should_set_clock(safe_to_lower, patched_disp_clk, 
clk_mgr->clks.dispclk_khz)) {
+   context->bw.dce.dispclk_khz = dce_set_clock(clk_mgr, 
patched_disp_clk);
+   clk_mgr->clks.dispclk_khz = patched_disp_clk;
}
dce11_pplib_apply_display_requirements(clk_mgr->ctx->dc, context);
 }
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 13/35] drm/amd/display: Refactor for setup periodic interrupt.

2019-02-13 Thread sunpeng.li
From: Yongqiang Sun 

[Why]
Current periodic interrupt start point calc in optc
is not clear.

[How]
1. DM convert delta time to lines number and dc will calculate the
   start position as per lines number and interrupt type.
2. hwss calculates the start point as per line offset.
3. optc programs vertical interrupts register as per start point
   and interrupt source.

Signed-off-by: Yongqiang Sun 
Reviewed-by: Tony Cheng 
Acked-by: Leo Li 
---
 drivers/gpu/drm/amd/display/dc/core/dc.c   |  12 +-
 drivers/gpu/drm/amd/display/dc/dc_stream.h |  24 +++-
 .../amd/display/dc/dce110/dce110_hw_sequencer.c|   6 +-
 .../drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c  | 145 -
 .../drm/amd/display/dc/dcn10/dcn10_hw_sequencer.h  |   2 +
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_optc.c  | 133 +++
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_optc.h  |  13 +-
 .../drm/amd/display/dc/inc/hw/timing_generator.h   |  23 ++--
 drivers/gpu/drm/amd/display/dc/inc/hw_sequencer.h  |   8 ++
 9 files changed, 215 insertions(+), 151 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index 8879cd4..c68fbd5 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -1626,13 +1626,13 @@ static void commit_planes_do_stream_update(struct dc 
*dc,
stream_update->adjust->v_total_min,
stream_update->adjust->v_total_max);
 
-   if (stream_update->periodic_vsync_config && 
pipe_ctx->stream_res.tg->funcs->program_vline_interrupt)
-   
pipe_ctx->stream_res.tg->funcs->program_vline_interrupt(
-   pipe_ctx->stream_res.tg, 
_ctx->stream->timing, VLINE0, >periodic_vsync_config);
+   if (stream_update->periodic_interrupt0 &&
+   dc->hwss.setup_periodic_interrupt)
+   dc->hwss.setup_periodic_interrupt(pipe_ctx, 
VLINE0);
 
-   if (stream_update->enhanced_sync_config && 
pipe_ctx->stream_res.tg->funcs->program_vline_interrupt)
-   
pipe_ctx->stream_res.tg->funcs->program_vline_interrupt(
-   pipe_ctx->stream_res.tg, 
_ctx->stream->timing, VLINE1, >enhanced_sync_config);
+   if (stream_update->periodic_interrupt1 &&
+   dc->hwss.setup_periodic_interrupt)
+   dc->hwss.setup_periodic_interrupt(pipe_ctx, 
VLINE1);
 
if ((stream_update->hdr_static_metadata && 
!stream->use_dynamic_meta) ||
stream_update->vrr_infopacket ||
diff --git a/drivers/gpu/drm/amd/display/dc/dc_stream.h 
b/drivers/gpu/drm/amd/display/dc/dc_stream.h
index a798694..5657cb3 100644
--- a/drivers/gpu/drm/amd/display/dc/dc_stream.h
+++ b/drivers/gpu/drm/amd/display/dc/dc_stream.h
@@ -51,9 +51,19 @@ struct freesync_context {
bool dummy;
 };
 
-union vline_config {
-   unsigned int line_number;
-   unsigned long long delta_in_ns;
+enum vertical_interrupt_ref_point {
+   START_V_UPDATE = 0,
+   START_V_SYNC,
+   INVALID_POINT
+
+   //For now, only v_update interrupt is used.
+   //START_V_BLANK,
+   //START_V_ACTIVE
+};
+
+struct periodic_interrupt_config {
+   enum vertical_interrupt_ref_point ref_point;
+   int lines_offset;
 };
 
 
@@ -106,8 +116,8 @@ struct dc_stream_state {
/* DMCU info */
unsigned int abm_level;
 
-   union vline_config periodic_vsync_config;
-   union vline_config enhanced_sync_config;
+   struct periodic_interrupt_config periodic_interrupt0;
+   struct periodic_interrupt_config periodic_interrupt1;
 
/* from core_stream struct */
struct dc_context *ctx;
@@ -158,8 +168,8 @@ struct dc_stream_update {
struct dc_info_packet *hdr_static_metadata;
unsigned int *abm_level;
 
-   union vline_config *periodic_vsync_config;
-   union vline_config *enhanced_sync_config;
+   struct periodic_interrupt_config *periodic_interrupt0;
+   struct periodic_interrupt_config *periodic_interrupt1;
 
struct dc_crtc_timing_adjust *adjust;
struct dc_info_packet *vrr_infopacket;
diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c 
b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
index 42ee0a6..5e4db37 100644
--- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
@@ -1333,10 +1333,8 @@ static enum dc_status apply_single_controller_ctx_to_hw(
if (!pipe_ctx->stream->apply_seamless_boot_optimization)
dc->hwss.enable_stream_timing(pipe_ctx, context, dc);
 
-   if 

[PATCH 23/35] drm/amd/display: Allow for plane-less resource reservation

2019-02-13 Thread sunpeng.li
From: Dmytro Laktyushkin 

This change changes dc add plane logic to allow plane-less resource
reservation (pipe split).

If a free pipe_ctx (no plane_state attached) is the head pipe, and is
found with a bottom pipe attached, assign the plane to add on the bottom
pipe.

In addition, prepend dcn10 to dcn10-specific reset_back_end_for_pipe
and reset_hw_ctx_wrap

Signed-off-by: Dmytro Laktyushkin 
Reviewed-by: Charlene Liu 
Acked-by: Leo Li 
---
 drivers/gpu/drm/amd/display/dc/core/dc_resource.c |  3 +++
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c | 11 +--
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
index 349ab80..0c3e866 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
@@ -1214,6 +1214,9 @@ bool dc_add_plane_to_context(
free_pipe->clock_source = tail_pipe->clock_source;
free_pipe->top_pipe = tail_pipe;
tail_pipe->bottom_pipe = free_pipe;
+   } else if (free_pipe->bottom_pipe && 
free_pipe->bottom_pipe->plane_state == NULL) {
+   ASSERT(free_pipe->bottom_pipe->stream_res.opp != 
free_pipe->stream_res.opp);
+   free_pipe->bottom_pipe->plane_state = plane_state;
}
 
/* assign new surfaces*/
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c 
b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
index 4ed8e3d..ddd4f4c 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
@@ -732,7 +732,7 @@ static enum dc_status dcn10_enable_stream_timing(
return DC_OK;
 }
 
-static void reset_back_end_for_pipe(
+static void dcn10_reset_back_end_for_pipe(
struct dc *dc,
struct pipe_ctx *pipe_ctx,
struct dc_state *context)
@@ -1173,7 +1173,7 @@ static void dcn10_init_hw(struct dc *dc)
memset(>res_pool->clk_mgr->clks, 0, 
sizeof(dc->res_pool->clk_mgr->clks));
 }
 
-static void reset_hw_ctx_wrap(
+static void dcn10_reset_hw_ctx_wrap(
struct dc *dc,
struct dc_state *context)
 {
@@ -1195,10 +1195,9 @@ static void reset_hw_ctx_wrap(
pipe_need_reprogram(pipe_ctx_old, pipe_ctx)) {
struct clock_source *old_clk = 
pipe_ctx_old->clock_source;
 
-   reset_back_end_for_pipe(dc, pipe_ctx_old, 
dc->current_state);
-   if (dc->hwss.enable_stream_gating) {
+   dcn10_reset_back_end_for_pipe(dc, pipe_ctx_old, 
dc->current_state);
+   if (dc->hwss.enable_stream_gating)
dc->hwss.enable_stream_gating(dc, pipe_ctx);
-   }
if (old_clk)
old_clk->funcs->cs_power_down(old_clk);
}
@@ -2944,7 +2943,7 @@ static const struct hw_sequencer_funcs dcn10_funcs = {
.pipe_control_lock = dcn10_pipe_control_lock,
.prepare_bandwidth = dcn10_prepare_bandwidth,
.optimize_bandwidth = dcn10_optimize_bandwidth,
-   .reset_hw_ctx_wrap = reset_hw_ctx_wrap,
+   .reset_hw_ctx_wrap = dcn10_reset_hw_ctx_wrap,
.enable_stream_timing = dcn10_enable_stream_timing,
.set_drr = set_drr,
.get_position = get_position,
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 10/35] drm/amd/display: Fix update type mismatches in atomic check

2019-02-13 Thread sunpeng.li
From: Nicholas Kazlauskas 

[Why]
Whenever a stream or plane is added or removed from the context the
pointer will change from old to new. We set lock and validation
needed in these cases. But not all of these cases match update_type
from dm_determine_update_type_for_commit - an example being overlay
plane updates.

There are warnings for a few of these cases that should be fixed.

[How]
We can closer align to DC (and lock_and_validation_needed) by
comparing stream and plane pointers.

Since the old stream/old plane state is never freed until sometime
after the commit tail work finishes we are guaranteed to never get
back the same block of memory when we remove and create a stream or
plane state in the same commit.

Signed-off-by: Nicholas Kazlauskas 
Reviewed-by: Leo Li 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 16 ++--
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 62d280e..a7c8583 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -5889,14 +5889,13 @@ dm_determine_update_type_for_commit(struct dc *dc,
old_dm_crtc_state = to_dm_crtc_state(old_crtc_state);
num_plane = 0;
 
-   if (!new_dm_crtc_state->stream) {
-   if (!new_dm_crtc_state->stream && 
old_dm_crtc_state->stream) {
-   update_type = UPDATE_TYPE_FULL;
-   goto cleanup;
-   }
+   if (new_dm_crtc_state->stream != old_dm_crtc_state->stream) {
+   update_type = UPDATE_TYPE_FULL;
+   goto cleanup;
+   }
 
+   if (!new_dm_crtc_state->stream)
continue;
-   }
 
for_each_oldnew_plane_in_state(state, plane, old_plane_state, 
new_plane_state, j) {
new_plane_crtc = new_plane_state->crtc;
@@ -5907,6 +5906,11 @@ dm_determine_update_type_for_commit(struct dc *dc,
if (plane->type == DRM_PLANE_TYPE_CURSOR)
continue;
 
+   if (new_dm_plane_state->dc_state != 
old_dm_plane_state->dc_state) {
+   update_type = UPDATE_TYPE_FULL;
+   goto cleanup;
+   }
+
if (!state->allow_modeset)
continue;
 
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 27/35] drm/amd/display: Set flip pending for pipe split

2019-02-13 Thread sunpeng.li
From: Wesley Chalmers 

[WHY]
When doing split pipe, if one pipe is pending on flip, the entire
plane's status should be flip pending, otherwise corruption can occur
when OS writes to a surface prematurely.

[HOW]
Clear the flip pending bit before checking pipes, then OR the flip
pending bits from all pipes together to create the flip pending status
of the entire plane.

Signed-off-by: Wesley Chalmers 
Reviewed-by: Jun Lei 
Acked-by: Eryk Brol 
Acked-by: Leo Li 
---
 drivers/gpu/drm/amd/display/dc/core/dc_surface.c  | 13 +
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c |  2 +-
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_surface.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_surface.c
index ee6bd50..a5e86f9 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_surface.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_surface.c
@@ -119,6 +119,19 @@ const struct dc_plane_status *dc_plane_get_status(
if (core_dc->current_state == NULL)
return NULL;
 
+   /* Find the current plane state and set its pending bit to false */
+   for (i = 0; i < core_dc->res_pool->pipe_count; i++) {
+   struct pipe_ctx *pipe_ctx =
+   _dc->current_state->res_ctx.pipe_ctx[i];
+
+   if (pipe_ctx->plane_state != plane_state)
+   continue;
+
+   pipe_ctx->plane_state->status.is_flip_pending = false;
+
+   break;
+   }
+
for (i = 0; i < core_dc->res_pool->pipe_count; i++) {
struct pipe_ctx *pipe_ctx =
_dc->current_state->res_ctx.pipe_ctx[i];
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c 
b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
index 9840a1d..1194dc5 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
@@ -2689,7 +2689,7 @@ static void dcn10_update_pending_status(struct pipe_ctx 
*pipe_ctx)
flip_pending = pipe_ctx->plane_res.hubp->funcs->hubp_is_flip_pending(
pipe_ctx->plane_res.hubp);
 
-   plane_state->status.is_flip_pending = flip_pending;
+   plane_state->status.is_flip_pending = 
plane_state->status.is_flip_pending || flip_pending;
 
if (!flip_pending)
plane_state->status.current_address = 
plane_state->status.requested_address;
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 21/35] drm/amd/display: Add DCN_VM aperture registers

2019-02-13 Thread sunpeng.li
From: Eryk Brol 

[Why]
For later use by the DC VM implementation

Signed-off-by: Eryk Brol 
Reviewed-by: Jun Lei 
Acked-by: Leo Li 
---
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubp.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubp.h 
b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubp.h
index a6d6dfe..3268ab0 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubp.h
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubp.h
@@ -595,6 +595,9 @@
type AGP_BASE;\
type AGP_BOT;\
type AGP_TOP;\
+   type DCN_VM_SYSTEM_APERTURE_DEFAULT_SYSTEM;\
+   type DCN_VM_SYSTEM_APERTURE_DEFAULT_ADDR_MSB;\
+   type DCN_VM_SYSTEM_APERTURE_DEFAULT_ADDR_LSB;\
/* todo:  get these from GVM instead of reading registers ourselves */\
type PAGE_DIRECTORY_ENTRY_HI32;\
type PAGE_DIRECTORY_ENTRY_LO32;\
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 35/35] drm/amd/display: Fix issue with link_active state not correct for MST

2019-02-13 Thread sunpeng.li
From: Anthony Koo 

[Why]
For MST, link not disabled until all streams disabled

[How]
Add check for stream_count before setting link_active = false for MST

Signed-off-by: Anthony Koo 
Reviewed-by: Wenjing Liu 
Acked-by: Leo Li 
---
 drivers/gpu/drm/amd/display/dc/core/dc_link.c | 15 +++
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
index 7f5a947..66b862b 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
@@ -2037,6 +2037,9 @@ static enum dc_status enable_link(
break;
}
 
+   if (status == DC_OK)
+   pipe_ctx->stream->link->link_status.link_active = true;
+
return status;
 }
 
@@ -2060,6 +2063,14 @@ static void disable_link(struct dc_link *link, enum 
signal_type signal)
dp_disable_link_phy_mst(link, signal);
} else
link->link_enc->funcs->disable_output(link->link_enc, signal);
+
+   if (signal == SIGNAL_TYPE_DISPLAY_PORT_MST) {
+   /* MST disable link only when no stream use the link */
+   if (link->mst_stream_alloc_table.stream_count <= 0)
+   link->link_status.link_active = false;
+   } else {
+   link->link_status.link_active = false;
+   }
 }
 
 static bool dp_active_dongle_validate_timing(
@@ -2623,8 +2634,6 @@ void core_link_enable_stream(
}
}
 
-   stream->link->link_status.link_active = true;
-
core_dc->hwss.enable_audio_stream(pipe_ctx);
 
/* turn off otg test pattern if enable */
@@ -2659,8 +2668,6 @@ void core_link_disable_stream(struct pipe_ctx *pipe_ctx, 
int option)
core_dc->hwss.disable_stream(pipe_ctx, option);
 
disable_link(pipe_ctx->stream->link, pipe_ctx->stream->signal);
-
-   pipe_ctx->stream->link->link_status.link_active = false;
 }
 
 void core_link_set_avmute(struct pipe_ctx *pipe_ctx, bool enable)
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 19/35] drm/amd/display: Remove redundant 'else' statement in dcn1_update_clocks

2019-02-13 Thread sunpeng.li
From: Fatemeh Darbehani 

[Why]
DM has impelemented new pp_smu interface. 'Else' is not longer needed.

Signed-off-by: Fatemeh Darbehani 
Reviewed-by: Eric Yang 
Acked-by: Leo Li 
Acked-by: Yongqiang Sun 
---
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_clk_mgr.c | 8 
 1 file changed, 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_clk_mgr.c 
b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_clk_mgr.c
index a1014e3..3b91505 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_clk_mgr.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_clk_mgr.c
@@ -246,10 +246,6 @@ static void dcn1_update_clocks(struct clk_mgr *clk_mgr,
pp_smu->set_hard_min_fclk_by_freq(_smu->pp_smu, 
smu_req.hard_min_fclk_mhz);
pp_smu->set_hard_min_dcfclk_by_freq(_smu->pp_smu, 
smu_req.hard_min_dcefclk_mhz);
pp_smu->set_min_deep_sleep_dcfclk(_smu->pp_smu, 
smu_req.min_deep_sleep_dcefclk_mhz);
-   } else {
-   if (pp_smu->set_display_requirement)
-   
pp_smu->set_display_requirement(_smu->pp_smu, _req);
-   dcn1_pplib_apply_display_requirements(dc, context);
}
}
 
@@ -272,10 +268,6 @@ static void dcn1_update_clocks(struct clk_mgr *clk_mgr,
pp_smu->set_hard_min_fclk_by_freq(_smu->pp_smu, 
smu_req.hard_min_fclk_mhz);
pp_smu->set_hard_min_dcfclk_by_freq(_smu->pp_smu, 
smu_req.hard_min_dcefclk_mhz);
pp_smu->set_min_deep_sleep_dcfclk(_smu->pp_smu, 
smu_req.min_deep_sleep_dcefclk_mhz);
-   } else {
-   if (pp_smu->set_display_requirement)
-   
pp_smu->set_display_requirement(_smu->pp_smu, _req);
-   dcn1_pplib_apply_display_requirements(dc, context);
}
}
 
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 09/35] drm/amd/display: Don't expose support for DRM_FORMAT_RGB888

2019-02-13 Thread sunpeng.li
From: Nicholas Kazlauskas 

[Why]
This format isn't supported in DC and some IGT tests fail since we
expose support for it.

[How]
Remove it.

Signed-off-by: Nicholas Kazlauskas 
Reviewed-by: Harry Wentland 
Reviewed-by: Leo Li 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 4c51922..62d280e 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -3877,7 +3877,6 @@ static const struct drm_plane_helper_funcs 
dm_plane_helper_funcs = {
  * check will succeed, and let DC implement proper check
  */
 static const uint32_t rgb_formats[] = {
-   DRM_FORMAT_RGB888,
DRM_FORMAT_XRGB,
DRM_FORMAT_ARGB,
DRM_FORMAT_RGBA,
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 20/35] drm/amd/display: make seamless boot work generically

2019-02-13 Thread sunpeng.li
From: Anthony Koo 

[Why]
Seamless boot code not working on all ASICs because of
some underflow issues caused by some uninitialized HW
state.

[How]
Keep some logical and power gating init code in hw_init.
Move some per pipe init code to enable accelerated mode

Signed-off-by: Anthony Koo 
Reviewed-by: Aric Cyr 
Acked-by: Leo Li 
---
 .../amd/display/dc/dce110/dce110_hw_sequencer.c|  3 ++
 .../drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c  | 35 +++---
 2 files changed, 27 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c 
b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
index 5c7fb92..21a6218 100644
--- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
@@ -1551,6 +1551,9 @@ void dce110_enable_accelerated_mode(struct dc *dc, struct 
dc_state *context)
}
}
 
+   if (dc->hwss.init_pipes)
+   dc->hwss.init_pipes(dc, context);
+
if (edp_link) {
/* this seems to cause blank screens on DCE8 */
if ((dc->ctx->dce_version == DCE_VERSION_8_0) ||
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c 
b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
index d1a8f1c..15c1a94 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
@@ -889,22 +889,23 @@ void hwss1_plane_atomic_disconnect(struct dc *dc, struct 
pipe_ctx *pipe_ctx)
dcn10_verify_allow_pstate_change_high(dc);
 }
 
-static void plane_atomic_power_down(struct dc *dc, struct pipe_ctx *pipe_ctx)
+static void plane_atomic_power_down(struct dc *dc,
+   struct dpp *dpp,
+   struct hubp *hubp)
 {
struct dce_hwseq *hws = dc->hwseq;
-   struct dpp *dpp = pipe_ctx->plane_res.dpp;
DC_LOGGER_INIT(dc->ctx->logger);
 
if (REG(DC_IP_REQUEST_CNTL)) {
REG_SET(DC_IP_REQUEST_CNTL, 0,
IP_REQUEST_EN, 1);
dpp_pg_control(hws, dpp->inst, false);
-   hubp_pg_control(hws, pipe_ctx->plane_res.hubp->inst, false);
+   hubp_pg_control(hws, hubp->inst, false);
dpp->funcs->dpp_reset(dpp);
REG_SET(DC_IP_REQUEST_CNTL, 0,
IP_REQUEST_EN, 0);
DC_LOG_DEBUG(
-   "Power gated front end %d\n", 
pipe_ctx->pipe_idx);
+   "Power gated front end %d\n", hubp->inst);
}
 }
 
@@ -931,7 +932,9 @@ static void plane_atomic_disable(struct dc *dc, struct 
pipe_ctx *pipe_ctx)
hubp->power_gated = true;
dc->optimized_required = false; /* We're powering off, no need to 
optimize */
 
-   plane_atomic_power_down(dc, pipe_ctx);
+   plane_atomic_power_down(dc,
+   pipe_ctx->plane_res.dpp,
+   pipe_ctx->plane_res.hubp);
 
pipe_ctx->stream = NULL;
memset(_ctx->stream_res, 0, sizeof(pipe_ctx->stream_res));
@@ -1001,9 +1004,6 @@ static void dcn10_init_pipes(struct dc *dc, struct 
dc_state *context)
struct dpp *dpp = dc->res_pool->dpps[i];
struct pipe_ctx *pipe_ctx = >res_ctx.pipe_ctx[i];
 
-   // W/A for issue with dc_post_update_surfaces_to_stream
-   hubp->power_gated = true;
-
/* There is assumption that pipe_ctx is not mapping irregularly
 * to non-preferred front end. If pipe_ctx->stream is not NULL,
 * we will use the pipe, so don't disable
@@ -1108,6 +1108,22 @@ static void dcn10_init_hw(struct dc *dc)
link->link_status.link_active = true;
}
 
+   /* If taking control over from VBIOS, we may want to optimize our first
+* mode set, so we need to skip powering down pipes until we know which
+* pipes we want to use.
+* Otherwise, if taking control is not possible, we need to power
+* everything down.
+*/
+   if (dcb->funcs->is_accelerated_mode(dcb)) {
+   for (i = 0; i < dc->res_pool->pipe_count; i++) {
+   struct hubp *hubp = dc->res_pool->hubps[i];
+   struct dpp *dpp = dc->res_pool->dpps[i];
+
+   dc->res_pool->opps[i]->mpc_tree_params.opp_id = 
dc->res_pool->opps[i]->inst;
+   plane_atomic_power_down(dc, dpp, hubp);
+   }
+   }
+
for (i = 0; i < dc->res_pool->audio_count; i++) {
struct audio *audio = dc->res_pool->audios[i];
 
@@ -1137,9 +1153,6 @@ static void dcn10_init_hw(struct dc *dc)
enable_power_gating_plane(dc->hwseq, true);
 
memset(>res_pool->clk_mgr->clks, 0, 
sizeof(dc->res_pool->clk_mgr->clks));
-
-   if (dc->hwss.init_pipes)
-   

[PATCH 33/35] drm/amd/display: Add ability to override bounding box in DC construct

2019-02-13 Thread sunpeng.li
From: Jun Lei 

Add a dc_bounding_box_overrides struct to define bb overrides. It is
loaded in during DC init.

Signed-off-by: Jun Lei 
Reviewed-by: Tony Cheng 
Acked-by: Leo Li 
---
 drivers/gpu/drm/amd/display/dc/core/dc.c |  2 ++
 drivers/gpu/drm/amd/display/dc/dc.h  | 10 ++
 2 files changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index 1bfd9ba..5dfc2e3 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -621,6 +621,8 @@ static bool construct(struct dc *dc,
 #endif
 
enum dce_version dc_version = DCE_VERSION_UNKNOWN;
+   memcpy(>bb_overrides, _params->bb_overrides, 
sizeof(dc->bb_overrides));
+
dc_dceip = kzalloc(sizeof(*dc_dceip), GFP_KERNEL);
if (!dc_dceip) {
dm_error("%s: failed to create dceip\n", __func__);
diff --git a/drivers/gpu/drm/amd/display/dc/dc.h 
b/drivers/gpu/drm/amd/display/dc/dc.h
index e98e19c..ed11b3c5 100644
--- a/drivers/gpu/drm/amd/display/dc/dc.h
+++ b/drivers/gpu/drm/amd/display/dc/dc.h
@@ -268,6 +268,14 @@ struct dc_debug_data {
uint32_t auxErrorCount;
 };
 
+struct dc_bounding_box_overrides {
+   int sr_exit_time_ns;
+   int sr_enter_plus_exit_time_ns;
+   int urgent_latency_ns;
+   int percent_of_ideal_drambw;
+   int dram_clock_change_latency_ns;
+};
+
 struct dc_state;
 struct resource_pool;
 struct dce_hwseq;
@@ -277,6 +285,7 @@ struct dc {
struct dc_cap_funcs cap_funcs;
struct dc_config config;
struct dc_debug_options debug;
+   struct dc_bounding_box_overrides bb_overrides;
struct dc_context *ctx;
 
uint8_t link_count;
@@ -330,6 +339,7 @@ struct dc_init_data {
struct hw_asic_id asic_id;
void *driver; /* ctx */
struct cgs_device *cgs_device;
+   struct dc_bounding_box_overrides bb_overrides;
 
int num_virtual_links;
/*
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 25/35] drm/amd/display: Reset planes that were disabled in init_pipes

2019-02-13 Thread sunpeng.li
From: Nicholas Kazlauskas 

[Why]
Seamless boot tries to reuse planes that were enabled for the first
commit applied.

In the case where Raven is booting with two monitors connected and the
first commit contains two streams the screen corruption would occur
because the second stream was trying to re-use a tg and plane that
weren't previously enabled.

The state on the first commit looks something like the following:

TG0: enabled=1
TG1: enabled=0
TG2: enabled=0
TG3: enabled=0

New state: pipe=0, stream=0,plane=0,   new_tg=0
New state: pipe=1, stream=1,plane=1,   new_tg=1
New state: pipe=2, stream=NULL, plane=NULL,new_tg=NULL
New state: pipe=3, stream=NULL, plane=NULL,new_tg=NULL

Only one plane/tg is setup before we enter accelerated mode so
we really want to disabling everything but that first plane.

[How]

Check if the stream is not NULL and if the tg is enabled before
deciding whether to skip the plane disable.

Also ensure we're also disabling on the current state's pipe_ctx so
we don't overwrite the fields in the new pending state.

Signed-off-by: Nicholas Kazlauskas 
Reviewed-by: Anthony Koo 
Acked-by: Harry Wentland 
Acked-by: Leo Li 
---
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c 
b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
index ddd4f4c..9840a1d 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
@@ -1026,9 +1026,14 @@ static void dcn10_init_pipes(struct dc *dc, struct 
dc_state *context)
 * to non-preferred front end. If pipe_ctx->stream is not NULL,
 * we will use the pipe, so don't disable
 */
-   if (pipe_ctx->stream != NULL)
+   if (pipe_ctx->stream != NULL &&
+   pipe_ctx->stream_res.tg->funcs->is_tg_enabled(
+   pipe_ctx->stream_res.tg))
continue;
 
+   /* Disable on the current state so the new one isn't cleared. */
+   pipe_ctx = >current_state->res_ctx.pipe_ctx[i];
+
dpp->funcs->dpp_reset(dpp);
 
pipe_ctx->stream_res.tg = tg;
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 04/35] drm/amd/display: Increase precision for backlight curve

2019-02-13 Thread sunpeng.li
From: Anthony Koo 

[Why]
We are currently losing precision when we convert from
16 bit --> 8 bit --> 16 bit.

[How]
We shouldn't down convert unnecessarily and lose precision.
Keep values at 16 bit and use directly.

Signed-off-by: Anthony Koo 
Reviewed-by: Aric Cyr 
Acked-by: Leo Li 
---
 .../drm/amd/display/modules/power/power_helpers.c  | 23 --
 1 file changed, 4 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/modules/power/power_helpers.c 
b/drivers/gpu/drm/amd/display/modules/power/power_helpers.c
index 3ba87b0..038b882 100644
--- a/drivers/gpu/drm/amd/display/modules/power/power_helpers.c
+++ b/drivers/gpu/drm/amd/display/modules/power/power_helpers.c
@@ -165,18 +165,11 @@ struct iram_table_v_2_2 {
 };
 #pragma pack(pop)
 
-static uint16_t backlight_8_to_16(unsigned int backlight_8bit)
-{
-   return (uint16_t)(backlight_8bit * 0x101);
-}
-
 static void fill_backlight_transform_table(struct dmcu_iram_parameters params,
struct iram_table_v_2 *table)
 {
unsigned int i;
unsigned int num_entries = NUM_BL_CURVE_SEGS;
-   unsigned int query_input_8bit;
-   unsigned int query_output_8bit;
unsigned int lut_index;
 
table->backlight_thresholds[0] = 0;
@@ -194,16 +187,13 @@ static void fill_backlight_transform_table(struct 
dmcu_iram_parameters params,
 * format U4.10.
 */
for (i = 1; i+1 < num_entries; i++) {
-   query_input_8bit = DIV_ROUNDUP((i * 256), num_entries);
-
lut_index = (params.backlight_lut_array_size - 1) * i / 
(num_entries - 1);
ASSERT(lut_index < params.backlight_lut_array_size);
-   query_output_8bit = params.backlight_lut_array[lut_index] >> 8;
 
table->backlight_thresholds[i] =
-   backlight_8_to_16(query_input_8bit);
+   cpu_to_be16(DIV_ROUNDUP((i * 65536), num_entries));
table->backlight_offsets[i] =
-   backlight_8_to_16(query_output_8bit);
+   cpu_to_be16(params.backlight_lut_array[lut_index]);
}
 }
 
@@ -212,8 +202,6 @@ static void fill_backlight_transform_table_v_2_2(struct 
dmcu_iram_parameters par
 {
unsigned int i;
unsigned int num_entries = NUM_BL_CURVE_SEGS;
-   unsigned int query_input_8bit;
-   unsigned int query_output_8bit;
unsigned int lut_index;
 
table->backlight_thresholds[0] = 0;
@@ -231,16 +219,13 @@ static void fill_backlight_transform_table_v_2_2(struct 
dmcu_iram_parameters par
 * format U4.10.
 */
for (i = 1; i+1 < num_entries; i++) {
-   query_input_8bit = DIV_ROUNDUP((i * 256), num_entries);
-
lut_index = (params.backlight_lut_array_size - 1) * i / 
(num_entries - 1);
ASSERT(lut_index < params.backlight_lut_array_size);
-   query_output_8bit = params.backlight_lut_array[lut_index] >> 8;
 
table->backlight_thresholds[i] =
-   backlight_8_to_16(query_input_8bit);
+   cpu_to_be16(DIV_ROUNDUP((i * 65536), num_entries));
table->backlight_offsets[i] =
-   backlight_8_to_16(query_output_8bit);
+   cpu_to_be16(params.backlight_lut_array[lut_index]);
}
 }
 
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 15/35] drm/amd/display: Fix negative cursor pos programming

2019-02-13 Thread sunpeng.li
From: Nicholas Kazlauskas 

[Why]
If the cursor pos passed from DM is less than the plane_state->dst_rect
top left corner then the unsigned cursor pos wraps around to a large
positive number since cursor pos is a u32.

There was an attempt to guard against this in hubp1_cursor_set_position
by checking the src_x_offset and src_y_offset and offseting the
cursor hotspot within hubp1_cursor_set_position.

However, the cursor position itself is still being programmed
incorrectly as a large value.

This manifests itself visually as the cursor disappearing or containing
strange artifacts near the middle of the screen on raven.

[How]
Don't subtract the destination rect top left corner from the pos but
add it to the hotspot instead. This happens before the pos gets
passed into hubp1_cursor_set_position.

This achieves the same result but avoids the subtraction wrap around.
With this fix the original cursor programming logic can be used again.

Signed-off-by: Nicholas Kazlauskas 
Reviewed-by: Charlene Liu 
Acked-by: Leo Li 
Acked-by: Murton Liu 
---
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubp.c  | 23 ++
 .../drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c  |  4 ++--
 2 files changed, 4 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubp.c 
b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubp.c
index 6838294..0ba68d4 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubp.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubp.c
@@ -1150,28 +1150,9 @@ void hubp1_cursor_set_position(
REG_UPDATE(CURSOR_CONTROL,
CURSOR_ENABLE, cur_en);
 
-   //account for cases where we see negative offset relative to overlay 
plane
-   if (src_x_offset < 0 && src_y_offset < 0) {
-   REG_SET_2(CURSOR_POSITION, 0,
-   CURSOR_X_POSITION, 0,
-   CURSOR_Y_POSITION, 0);
-   x_hotspot -= src_x_offset;
-   y_hotspot -= src_y_offset;
-   } else if (src_x_offset < 0) {
-   REG_SET_2(CURSOR_POSITION, 0,
-   CURSOR_X_POSITION, 0,
-   CURSOR_Y_POSITION, pos->y);
-   x_hotspot -= src_x_offset;
-   } else if (src_y_offset < 0) {
-   REG_SET_2(CURSOR_POSITION, 0,
+   REG_SET_2(CURSOR_POSITION, 0,
CURSOR_X_POSITION, pos->x,
-   CURSOR_Y_POSITION, 0);
-   y_hotspot -= src_y_offset;
-   } else {
-   REG_SET_2(CURSOR_POSITION, 0,
-   CURSOR_X_POSITION, pos->x,
-   CURSOR_Y_POSITION, pos->y);
-   }
+   CURSOR_Y_POSITION, pos->y);
 
REG_SET_2(CURSOR_HOT_SPOT, 0,
CURSOR_HOT_SPOT_X, x_hotspot,
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c 
b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
index 8ba895c..d1a8f1c 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
@@ -2693,8 +2693,8 @@ static void dcn10_set_cursor_position(struct pipe_ctx 
*pipe_ctx)
.mirror = pipe_ctx->plane_state->horizontal_mirror
};
 
-   pos_cpy.x -= pipe_ctx->plane_state->dst_rect.x;
-   pos_cpy.y -= pipe_ctx->plane_state->dst_rect.y;
+   pos_cpy.x_hotspot += pipe_ctx->plane_state->dst_rect.x;
+   pos_cpy.y_hotspot += pipe_ctx->plane_state->dst_rect.y;
 
if (pipe_ctx->plane_state->address.type
== PLN_ADDR_TYPE_VIDEO_PROGRESSIVE)
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 31/35] drm/amd/display: optionally optimize edp link rate based on timing

2019-02-13 Thread sunpeng.li
From: Josip Pavic 

[Why]
eDP v1.4 allows panels to report link rates other than RBR/HBR/HBR2, that
may be more optimal for the panel's timing. Power can be saved by using
a link rate closer to the required bandwidth of the panel's timing.

[How]
Scan the table of reported link rates from the panel, and select the
minimum link rate that satisfies the bandwidth requirements of the panel's
timing. Include a flag to make the feature optional.

Signed-off-by: Josip Pavic 
Reviewed-by: Harry Wentland 
Acked-by: Anthony Koo 
Acked-by: Leo Li 
---
 drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c | 195 ---
 drivers/gpu/drm/amd/display/dc/dc.h  |   6 +-
 drivers/gpu/drm/amd/display/dc/dc_dp_types.h |   2 +
 3 files changed, 140 insertions(+), 63 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
index 09d3012..8ad79df 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
@@ -93,12 +93,10 @@ static void dpcd_set_link_settings(
struct dc_link *link,
const struct link_training_settings *lt_settings)
 {
-   uint8_t rate = (uint8_t)
-   (lt_settings->link_settings.link_rate);
+   uint8_t rate;
 
union down_spread_ctrl downspread = { {0} };
union lane_count_set lane_count_set = { {0} };
-   uint8_t link_set_buffer[2];
 
downspread.raw = (uint8_t)
(lt_settings->link_settings.link_spread);
@@ -111,29 +109,42 @@ static void dpcd_set_link_settings(
lane_count_set.bits.POST_LT_ADJ_REQ_GRANTED =
link->dpcd_caps.max_ln_count.bits.POST_LT_ADJ_REQ_SUPPORTED;
 
-   link_set_buffer[0] = rate;
-   link_set_buffer[1] = lane_count_set.raw;
-
-   core_link_write_dpcd(link, DP_LINK_BW_SET,
-   link_set_buffer, 2);
core_link_write_dpcd(link, DP_DOWNSPREAD_CTRL,
, sizeof(downspread));
 
+   core_link_write_dpcd(link, DP_LANE_COUNT_SET,
+   _count_set.raw, 1);
+
if (link->dpcd_caps.dpcd_rev.raw >= DPCD_REV_14 &&
-   (link->dpcd_caps.link_rate_set >= 1 &&
-   link->dpcd_caps.link_rate_set <= 8)) {
+   lt_settings->link_settings.use_link_rate_set == true) {
+   rate = 0;
+   core_link_write_dpcd(link, DP_LINK_BW_SET, , 1);
core_link_write_dpcd(link, DP_LINK_RATE_SET,
-   >dpcd_caps.link_rate_set, 1);
+   _settings->link_settings.link_rate_set, 1);
+   } else {
+   rate = (uint8_t) (lt_settings->link_settings.link_rate);
+   core_link_write_dpcd(link, DP_LINK_BW_SET, , 1);
}
 
-   DC_LOG_HW_LINK_TRAINING("%s\n %x rate = %x\n %x lane = %x\n %x spread = 
%x\n",
-   __func__,
-   DP_LINK_BW_SET,
-   lt_settings->link_settings.link_rate,
-   DP_LANE_COUNT_SET,
-   lt_settings->link_settings.lane_count,
-   DP_DOWNSPREAD_CTRL,
-   lt_settings->link_settings.link_spread);
+   if (rate) {
+   DC_LOG_HW_LINK_TRAINING("%s\n %x rate = %x\n %x lane = %x\n %x 
spread = %x\n",
+   __func__,
+   DP_LINK_BW_SET,
+   lt_settings->link_settings.link_rate,
+   DP_LANE_COUNT_SET,
+   lt_settings->link_settings.lane_count,
+   DP_DOWNSPREAD_CTRL,
+   lt_settings->link_settings.link_spread);
+   } else {
+   DC_LOG_HW_LINK_TRAINING("%s\n %x rate set = %x\n %x lane = %x\n 
%x spread = %x\n",
+   __func__,
+   DP_LINK_RATE_SET,
+   lt_settings->link_settings.link_rate_set,
+   DP_LANE_COUNT_SET,
+   lt_settings->link_settings.lane_count,
+   DP_DOWNSPREAD_CTRL,
+   lt_settings->link_settings.link_spread);
+   }
 
 }
 
@@ -952,6 +963,8 @@ enum link_training_result dc_link_dp_perform_link_training(
 
lt_settings.link_settings.link_rate = link_setting->link_rate;
lt_settings.link_settings.lane_count = link_setting->lane_count;
+   lt_settings.link_settings.use_link_rate_set = 
link_setting->use_link_rate_set;
+   lt_settings.link_settings.link_rate_set = link_setting->link_rate_set;
 
/*@todo[vdevulap] move SS to LS, should not be handled by displaypath*/
 
@@ -1075,7 +1088,7 @@ static struct dc_link_settings get_max_link_cap(struct 
dc_link *link)
 {
/* Set Default link settings */
struct dc_link_settings max_link_cap = {LANE_COUNT_FOUR, LINK_RATE_HIGH,
-   LINK_SPREAD_05_DOWNSPREAD_30KHZ};
+   LINK_SPREAD_05_DOWNSPREAD_30KHZ, false, 0};
 
/* Higher link settings based on feature supported */
if 

[PATCH 29/35] drm/amd/display: Add p_state_change_support flag to dc_clocks

2019-02-13 Thread sunpeng.li
From: Jun Lei 

Will be used to signify if P-state change is supported.

Signed-off-by: Jun Lei 
Reviewed-by: Eric Yang 
Acked-by: Leo Li 
---
 drivers/gpu/drm/amd/display/dc/dc.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/display/dc/dc.h 
b/drivers/gpu/drm/amd/display/dc/dc.h
index 9adb801..a4d3da8 100644
--- a/drivers/gpu/drm/amd/display/dc/dc.h
+++ b/drivers/gpu/drm/amd/display/dc/dc.h
@@ -203,6 +203,7 @@ struct dc_clocks {
int fclk_khz;
int phyclk_khz;
int dramclk_khz;
+   bool p_state_change_support;
 };
 
 struct dc_debug_options {
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 26/35] drm/amd/display: Fix exception from AUX acquire failure

2019-02-13 Thread sunpeng.li
From: Anthony Koo 

[Why]
AUX arbitration occurs between SW and FW components.
When AUX acquire fails, it causes engine->ddc to be NULL,
which leads to an exception when we try to release the AUX
engine.

[How]
When AUX engine acquire fails, it should return from the
function without trying to continue the operation.
The upper level will determine if it wants to retry.
i.e. dce_aux_transfer_with_retries will be used and retry.

Signed-off-by: Anthony Koo 
Reviewed-by: Aric Cyr 
Acked-by: Leo Li 
---
 drivers/gpu/drm/amd/display/dc/dce/dce_aux.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_aux.c 
b/drivers/gpu/drm/amd/display/dc/dce/dce_aux.c
index 4febf4e..2f50be3 100644
--- a/drivers/gpu/drm/amd/display/dc/dce/dce_aux.c
+++ b/drivers/gpu/drm/amd/display/dc/dce/dce_aux.c
@@ -374,7 +374,6 @@ static bool acquire(
struct dce_aux *engine,
struct ddc *ddc)
 {
-
enum gpio_result result;
 
if (!is_engine_available(engine))
@@ -455,7 +454,8 @@ int dce_aux_transfer(struct ddc_service *ddc,
memset(_rep, 0, sizeof(aux_rep));
 
aux_engine = ddc->ctx->dc->res_pool->engines[ddc_pin->pin_data->en];
-   acquire(aux_engine, ddc_pin);
+   if (!acquire(aux_engine, ddc_pin))
+   return -1;
 
if (payload->i2c_over_aux)
aux_req.type = AUX_TRANSACTION_TYPE_I2C;
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 30/35] drm/amd/display: set clocks to 0 on suspend on dce80

2019-02-13 Thread sunpeng.li
From: Bhawanpreet Lakha 

[Why]
When a dce80 asic was suspended, the clocks were not set to 0.
Upon resume, the new clock was compared to the existing clock,
they were found to be the same, and so the clock was not set.
This resulted in a blackscreen.

[How]
In atomic commit, check to see if there are any active pipes.
If no, set clocks to 0

Signed-off-by: Bhawanpreet Lakha 
Reviewed-by: Nicholas Kazlauskas 
Acked-by: Leo Li 
---
 drivers/gpu/drm/amd/display/dc/dce80/dce80_resource.c | 19 ---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dce80/dce80_resource.c 
b/drivers/gpu/drm/amd/display/dc/dce80/dce80_resource.c
index 2eca81b..c109ace 100644
--- a/drivers/gpu/drm/amd/display/dc/dce80/dce80_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dce80/dce80_resource.c
@@ -792,9 +792,22 @@ bool dce80_validate_bandwidth(
struct dc *dc,
struct dc_state *context)
 {
-   /* TODO implement when needed but for now hardcode max value*/
-   context->bw.dce.dispclk_khz = 681000;
-   context->bw.dce.yclk_khz = 25 * MEMORY_TYPE_MULTIPLIER_CZ;
+   int i;
+   bool at_least_one_pipe = false;
+
+   for (i = 0; i < dc->res_pool->pipe_count; i++) {
+   if (context->res_ctx.pipe_ctx[i].stream)
+   at_least_one_pipe = true;
+   }
+
+   if (at_least_one_pipe) {
+   /* TODO implement when needed but for now hardcode max value*/
+   context->bw.dce.dispclk_khz = 681000;
+   context->bw.dce.yclk_khz = 25 * MEMORY_TYPE_MULTIPLIER_CZ;
+   } else {
+   context->bw.dce.dispclk_khz = 0;
+   context->bw.dce.yclk_khz = 0;
+   }
 
return true;
 }
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 28/35] drm/amd/display: Clean up wait on vblank event

2019-02-13 Thread sunpeng.li
From: David Francis 

[Why]
The wait_for_vblank boolean in commit_tail was passed by reference
into each stream commit, and if that commit was an asynchronous
flip, it would disable vblank waits on all subsequent flips.

This made the behaviour depend on crtc order in a non-intuitive way,
although since the asynchronous pageflip flag is only used by the
legacy IOCTLs at the moment it is never an issue

[How]
Find wait_for_vblank before doing any stream commits

Signed-off-by: David Francis 
Reviewed-by: Nicholas Kazlauskas 
Acked-by: Leo Li 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 18 +++---
 1 file changed, 7 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 8cd6a82..fc39cd0 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -4719,7 +4719,7 @@ static void amdgpu_dm_commit_planes(struct 
drm_atomic_state *state,
struct drm_device *dev,
struct amdgpu_display_manager *dm,
struct drm_crtc *pcrtc,
-   bool *wait_for_vblank)
+   bool wait_for_vblank)
 {
uint32_t i, r;
uint64_t timestamp_ns;
@@ -4786,14 +4786,6 @@ static void amdgpu_dm_commit_planes(struct 
drm_atomic_state *state,
 
if (pflip_needed) {
/*
-* Assume even ONE crtc with immediate flip means
-* entire can't wait for VBLANK
-* TODO Check if it's correct
-*/
-   if (new_pcrtc_state->pageflip_flags & 
DRM_MODE_PAGE_FLIP_ASYNC)
-   *wait_for_vblank = false;
-
-   /*
 * TODO This might fail and hence better not used, wait
 * explicitly on fences instead
 * and in general should be called for
@@ -4888,7 +4880,7 @@ static void amdgpu_dm_commit_planes(struct 
drm_atomic_state *state,
 * hopefully eliminating dc_*_update structs in their entirety.
 */
if (flip_count) {
-   target = (uint32_t)drm_crtc_vblank_count(pcrtc) + 
*wait_for_vblank;
+   target = (uint32_t)drm_crtc_vblank_count(pcrtc) + 
wait_for_vblank;
/* Prepare wait for target vblank early - before the 
fence-waits */
target_vblank = target - (uint32_t)drm_crtc_vblank_count(pcrtc) 
+
amdgpu_get_vblank_counter_kms(pcrtc->dev, 
acrtc_attach->crtc_id);
@@ -5266,13 +5258,17 @@ static void amdgpu_dm_atomic_commit_tail(struct 
drm_atomic_state *state)
 #endif
}
 
+   for_each_new_crtc_in_state(state, crtc, new_crtc_state, j)
+   if (new_crtc_state->pageflip_flags & DRM_MODE_PAGE_FLIP_ASYNC)
+   wait_for_vblank = false;
+
/* update planes when needed per crtc*/
for_each_new_crtc_in_state(state, crtc, new_crtc_state, j) {
dm_new_crtc_state = to_dm_crtc_state(new_crtc_state);
 
if (dm_new_crtc_state->stream)
amdgpu_dm_commit_planes(state, dc_state, dev,
-   dm, crtc, _for_vblank);
+   dm, crtc, wait_for_vblank);
}
 
 
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 32/35] drm/amd/display: Make stream commits call into DC only once

2019-02-13 Thread sunpeng.li
From: David Francis 

[Why]
dc_commit_updates_for_stream is called twice per stream: once
with the flip data and once will all other data. This causes
problems when these DC calls have different numbers of planes

For example, a commit with a pageflip on plane A and a
non-pageflip change on plane B will first call
into DC with just plane A, causing plane B to be
disabled. Then it will call into DC with both planes,
re-enabling plane B

[How]
Merge flip and full into a single bundle

Apart from the single DC call, the logic should not be
changed by this patch

Signed-off-by: David Francis 
Reviewed-by: Nicholas Kazlauskas 
Acked-by: Leo Li 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 129 +-
 1 file changed, 54 insertions(+), 75 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index fc39cd0..7ffa587 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -4731,30 +4731,25 @@ static void amdgpu_dm_commit_planes(struct 
drm_atomic_state *state,
struct dm_crtc_state *acrtc_state = to_dm_crtc_state(new_pcrtc_state);
struct dm_crtc_state *dm_old_crtc_state =
to_dm_crtc_state(drm_atomic_get_old_crtc_state(state, 
pcrtc));
-   int flip_count = 0, planes_count = 0, vpos, hpos;
+   int planes_count = 0, vpos, hpos;
unsigned long flags;
struct amdgpu_bo *abo;
uint64_t tiling_flags, dcc_address;
uint32_t target, target_vblank;
-
-   struct {
-   struct dc_surface_update surface_updates[MAX_SURFACES];
-   struct dc_flip_addrs flip_addrs[MAX_SURFACES];
-   struct dc_stream_update stream_update;
-   } *flip;
+   bool pflip_present = false;
 
struct {
struct dc_surface_update surface_updates[MAX_SURFACES];
struct dc_plane_info plane_infos[MAX_SURFACES];
struct dc_scaling_info scaling_infos[MAX_SURFACES];
+   struct dc_flip_addrs flip_addrs[MAX_SURFACES];
struct dc_stream_update stream_update;
-   } *full;
+   } *bundle;
 
-   flip = kzalloc(sizeof(*flip), GFP_KERNEL);
-   full = kzalloc(sizeof(*full), GFP_KERNEL);
+   bundle = kzalloc(sizeof(*bundle), GFP_KERNEL);
 
-   if (!flip || !full) {
-   dm_error("Failed to allocate update bundles\n");
+   if (!bundle) {
+   dm_error("Failed to allocate update bundle\n");
goto cleanup;
}
 
@@ -4764,7 +4759,7 @@ static void amdgpu_dm_commit_planes(struct 
drm_atomic_state *state,
struct drm_crtc_state *new_crtc_state;
struct drm_framebuffer *fb = new_plane_state->fb;
struct amdgpu_framebuffer *afb = to_amdgpu_framebuffer(fb);
-   bool pflip_needed;
+   bool framebuffer_changed;
struct dc_plane_state *dc_plane;
struct dm_plane_state *dm_new_plane_state = 
to_dm_plane_state(new_plane_state);
 
@@ -4779,12 +4774,14 @@ static void amdgpu_dm_commit_planes(struct 
drm_atomic_state *state,
if (!new_crtc_state->active)
continue;
 
-   pflip_needed = old_plane_state->fb &&
+   dc_plane = dm_new_plane_state->dc_state;
+
+   framebuffer_changed = old_plane_state->fb &&
old_plane_state->fb != new_plane_state->fb;
 
-   dc_plane = dm_new_plane_state->dc_state;
+   pflip_present = pflip_present || framebuffer_changed;
 
-   if (pflip_needed) {
+   if (framebuffer_changed) {
/*
 * TODO This might fail and hence better not used, wait
 * explicitly on fences instead
@@ -4806,22 +4803,22 @@ static void amdgpu_dm_commit_planes(struct 
drm_atomic_state *state,
 
amdgpu_bo_unreserve(abo);
 
-   flip->flip_addrs[flip_count].address.grph.addr.low_part 
= lower_32_bits(afb->address);
-   
flip->flip_addrs[flip_count].address.grph.addr.high_part = 
upper_32_bits(afb->address);
+   
bundle->flip_addrs[planes_count].address.grph.addr.low_part = 
lower_32_bits(afb->address);
+   
bundle->flip_addrs[planes_count].address.grph.addr.high_part = 
upper_32_bits(afb->address);
 
dcc_address = get_dcc_address(afb->address, 
tiling_flags);
-   
flip->flip_addrs[flip_count].address.grph.meta_addr.low_part = 
lower_32_bits(dcc_address);
-   
flip->flip_addrs[flip_count].address.grph.meta_addr.high_part = 
upper_32_bits(dcc_address);
+   
bundle->flip_addrs[planes_count].address.grph.meta_addr.low_part = 
lower_32_bits(dcc_address);
+   

[PATCH 34/35] drm/amd/display: 3.2.19

2019-02-13 Thread sunpeng.li
From: Mark McGarrity 

Signed-off-by: Mark McGarrity 
Reviewed-by: Tony Cheng 
Acked-by: Leo Li 
---
 drivers/gpu/drm/amd/display/dc/dc.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dc.h 
b/drivers/gpu/drm/amd/display/dc/dc.h
index ed11b3c5..ebd4073 100644
--- a/drivers/gpu/drm/amd/display/dc/dc.h
+++ b/drivers/gpu/drm/amd/display/dc/dc.h
@@ -39,7 +39,7 @@
 #include "inc/hw/dmcu.h"
 #include "dml/display_mode_lib.h"
 
-#define DC_VER "3.2.18"
+#define DC_VER "3.2.19"
 
 #define MAX_SURFACES 3
 #define MAX_STREAMS 6
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 16/35] drm/amd/display: PPLIB Hookup

2019-02-13 Thread sunpeng.li
From: Jun Lei 

[Why]
Make dml and integration with pplib clearer.

[How]
Change the way the dml formula is initialized to make its values more
clear. Restructure DC interface with pplib into rv_funcs.
Cap clocks received from pplib.

Signed-off-by: Jun Lei 
Signed-off-by: Eryk Brol 
Reviewed-by: Dmytro Laktyushkin 
Acked-by: Leo Li 
---
 .../drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c   | 20 ++--
 drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c   |  2 +-
 .../amd/display/dc/dce110/dce110_hw_sequencer.c| 37 --
 .../gpu/drm/amd/display/dc/dcn10/dcn10_clk_mgr.c   |  2 +-
 .../gpu/drm/amd/display/dc/dcn10/dcn10_resource.c  |  6 ++--
 drivers/gpu/drm/amd/display/dc/dm_pp_smu.h |  2 ++
 drivers/gpu/drm/amd/display/dc/dm_services.h   |  4 +--
 drivers/gpu/drm/amd/display/dc/dm_services_types.h |  2 +-
 .../gpu/drm/amd/display/dc/dml/display_mode_lib.c  | 24 ++
 .../gpu/drm/amd/display/dc/dml/display_mode_lib.h  |  5 +++
 drivers/gpu/drm/amd/display/dc/inc/core_types.h|  2 +-
 11 files changed, 78 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c
index e8e9eeb..4ba979e 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c
@@ -611,17 +611,17 @@ void pp_rv_set_hard_min_fclk_by_freq(struct pp_smu *pp, 
int mhz)
pp_funcs->set_hard_min_fclk_by_freq(pp_handle, mhz);
 }
 
-void dm_pp_get_funcs_rv(
+void dm_pp_get_funcs(
struct dc_context *ctx,
-   struct pp_smu_funcs_rv *funcs)
+   struct pp_smu_funcs *funcs)
 {
-   funcs->pp_smu.dm = ctx;
-   funcs->set_display_requirement = pp_rv_set_display_requirement;
-   funcs->set_wm_ranges = pp_rv_set_wm_ranges;
-   funcs->set_pme_wa_enable = pp_rv_set_pme_wa_enable;
-   funcs->set_display_count = pp_rv_set_active_display_count;
-   funcs->set_min_deep_sleep_dcfclk = pp_rv_set_min_deep_sleep_dcfclk;
-   funcs->set_hard_min_dcfclk_by_freq = pp_rv_set_hard_min_dcefclk_by_freq;
-   funcs->set_hard_min_fclk_by_freq = pp_rv_set_hard_min_fclk_by_freq;
+   funcs->rv_funcs.pp_smu.dm = ctx;
+   funcs->rv_funcs.set_display_requirement = pp_rv_set_display_requirement;
+   funcs->rv_funcs.set_wm_ranges = pp_rv_set_wm_ranges;
+   funcs->rv_funcs.set_pme_wa_enable = pp_rv_set_pme_wa_enable;
+   funcs->rv_funcs.set_display_count = pp_rv_set_active_display_count;
+   funcs->rv_funcs.set_min_deep_sleep_dcfclk = 
pp_rv_set_min_deep_sleep_dcfclk;
+   funcs->rv_funcs.set_hard_min_dcfclk_by_freq = 
pp_rv_set_hard_min_dcefclk_by_freq;
+   funcs->rv_funcs.set_hard_min_fclk_by_freq = 
pp_rv_set_hard_min_fclk_by_freq;
 }
 
diff --git a/drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c 
b/drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c
index 12d1842..2a807b9 100644
--- a/drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c
+++ b/drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c
@@ -1391,7 +1391,7 @@ void dcn_bw_update_from_pplib(struct dc *dc)
 
 void dcn_bw_notify_pplib_of_wm_ranges(struct dc *dc)
 {
-   struct pp_smu_funcs_rv *pp = dc->res_pool->pp_smu;
+   struct pp_smu_funcs_rv *pp = >res_pool->pp_smu->rv_funcs;
struct pp_smu_wm_range_sets ranges = {0};
int min_fclk_khz, min_dcfclk_khz, socclk_khz;
const int overdrive = 500; /* 5 GHz to cover Overdrive */
diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c 
b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
index 5e4db37..5c7fb92 100644
--- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
@@ -935,13 +935,31 @@ void hwss_edp_backlight_control(
edp_receiver_ready_T9(link);
 }
 
+// Static helper function which calls the correct function
+// based on pp_smu version
+static void set_pme_wa_enable_by_version(struct dc *dc)
+{
+   struct pp_smu_funcs *pp_smu = NULL;
+
+   if (dc->res_pool->pp_smu)
+   pp_smu = dc->res_pool->pp_smu;
+
+   if (pp_smu) {
+   if (pp_smu->ctx.ver == PP_SMU_VER_RV && 
pp_smu->rv_funcs.set_pme_wa_enable)
+   pp_smu->rv_funcs.set_pme_wa_enable(&(pp_smu->ctx));
+   }
+}
+
 void dce110_enable_audio_stream(struct pipe_ctx *pipe_ctx)
 {
-   struct dc *core_dc = pipe_ctx->stream->ctx->dc;
/* notify audio driver for audio modes of monitor */
-   struct pp_smu_funcs_rv *pp_smu = core_dc->res_pool->pp_smu;
+   struct dc *core_dc = pipe_ctx->stream->ctx->dc;
+   struct pp_smu_funcs *pp_smu = NULL;
unsigned int i, num_audio = 1;
 
+   if (core_dc->res_pool->pp_smu)
+   pp_smu = core_dc->res_pool->pp_smu;
+
if (pipe_ctx->stream_res.audio) {
for (i = 0; i < MAX_PIPES; i++) {

[PATCH 08/35] drm/amd/display: Fix wrong z-order when updating overlay planes

2019-02-13 Thread sunpeng.li
From: Nicholas Kazlauskas 

[Why]
If a commit updates an overlay plane via the legacy plane IOCTL
then the only plane in the state will be the overlay plane.

Overlay planes need to be added first to the DC context, but in the
scenario above the plane will be added last. This will result in wrong
z-order during rendering.

[How]
If any non-cursor plane has been updated then the rest of the
non-cursor planes should be added to the CRTC state.

The cursor plane doesn't need to be included for stream updates and
locking it will cause performance issues. It should be ignored.

DC requires that the surface count passed during stream updates
be the number of surfaces currently on the stream to enable fast
updates. This previously wasn't the case without this patch, so this
also allows this optimization to occur.

Signed-off-by: Nicholas Kazlauskas 
Reviewed-by: Leo Li 
Acked-by: Tony Cheng 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 36 +++
 1 file changed, 36 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 653bee1..4c51922 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -6046,6 +6046,42 @@ static int amdgpu_dm_atomic_check(struct drm_device *dev,
goto fail;
}
 
+   /*
+* Add all primary and overlay planes on the CRTC to the state
+* whenever a plane is enabled to maintain correct z-ordering
+* and to enable fast surface updates.
+*/
+   drm_for_each_crtc(crtc, dev) {
+   bool modified = false;
+
+   for_each_oldnew_plane_in_state(state, plane, old_plane_state, 
new_plane_state, i) {
+   if (plane->type == DRM_PLANE_TYPE_CURSOR)
+   continue;
+
+   if (new_plane_state->crtc == crtc ||
+   old_plane_state->crtc == crtc) {
+   modified = true;
+   break;
+   }
+   }
+
+   if (!modified)
+   continue;
+
+   drm_for_each_plane_mask(plane, state->dev, 
crtc->state->plane_mask) {
+   if (plane->type == DRM_PLANE_TYPE_CURSOR)
+   continue;
+
+   new_plane_state =
+   drm_atomic_get_plane_state(state, plane);
+
+   if (IS_ERR(new_plane_state)) {
+   ret = PTR_ERR(new_plane_state);
+   goto fail;
+   }
+   }
+   }
+
/* Remove exiting planes if they are modified */
for_each_oldnew_plane_in_state_reverse(state, plane, old_plane_state, 
new_plane_state, i) {
ret = dm_update_plane_state(dc, state, plane,
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 14/35] drm/amd/display: Add disable triple buffering DC debug option

2019-02-13 Thread sunpeng.li
From: Charlene Liu 

Added a "disable_tri_buf" DC debug option. When set to 1  feature will
be off.

Signed-off-by: Charlene Liu 
Reviewed-by: Dmytro Laktyushkin 
Acked-by: Leo Li 
---
 drivers/gpu/drm/amd/display/dc/dc.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/display/dc/dc.h 
b/drivers/gpu/drm/amd/display/dc/dc.h
index 1a7fd6a..1b8eaf5 100644
--- a/drivers/gpu/drm/amd/display/dc/dc.h
+++ b/drivers/gpu/drm/amd/display/dc/dc.h
@@ -257,6 +257,7 @@ struct dc_debug_options {
bool skip_detection_link_training;
unsigned int force_odm_combine; //bit vector based on otg inst
unsigned int force_fclk_khz;
+   bool disable_tri_buf;
 };
 
 struct dc_debug_data {
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 22/35] drm/amd/display: dcn add check surface in_use

2019-02-13 Thread sunpeng.li
From: Charlene Liu 

Driver need to  poll the SURFACE_INUSE register to determine when to
start the new task and write data to the checked surface.

Implement the wait functions, and add the necessary hubbub registers.

Signed-off-by: Charlene Liu 
Reviewed-by: Dmytro Laktyushkin 
Acked-by: Leo Li 
---
 drivers/gpu/drm/amd/display/dc/core/dc.c   |  3 ++
 .../gpu/drm/amd/display/dc/dcn10/dcn10_hubbub.c| 46 ++
 .../gpu/drm/amd/display/dc/dcn10/dcn10_hubbub.h| 25 ++--
 .../drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c  | 22 ++-
 drivers/gpu/drm/amd/display/dc/inc/hw/dchubbub.h   |  3 ++
 drivers/gpu/drm/amd/display/dc/inc/hw_sequencer.h  |  2 +
 6 files changed, 97 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index c68fbd5..1bfd9ba 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -1726,6 +1726,9 @@ static void commit_planes_for_stream(struct dc *dc,
 
if (!pipe_ctx->plane_state)
continue;
+   /*make sure hw finished surface update*/
+   if (dc->hwss.wait_surface_safe_to_update)
+   dc->hwss.wait_surface_safe_to_update(dc, 
pipe_ctx);
 
/* Full fe update*/
if (update_type == UPDATE_TYPE_FAST)
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubbub.c 
b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubbub.c
index e161ad8..9c6217b 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubbub.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubbub.c
@@ -642,6 +642,50 @@ void hubbub1_soft_reset(struct hubbub *hubbub, bool reset)
DCHUBBUB_GLOBAL_SOFT_RESET, reset_en);
 }
 
+static bool hubbub1_is_surf_still_in_update(struct hubbub *hubbub, uint32_t 
hbup_inst)
+{
+   struct dcn10_hubbub *hubbub1 = TO_DCN10_HUBBUB(hubbub);
+   uint32_t still_used_by_dcn = 0;
+
+   switch (hbup_inst) {
+   case 0:
+   REG_GET(SURFACE_CHECK0_ADDRESS_MSB,
+   CHECKER0_SURFACE_INUSE,
+   _used_by_dcn);
+   break;
+   case 1:
+   REG_GET(SURFACE_CHECK1_ADDRESS_MSB,
+   CHECKER1_SURFACE_INUSE,
+   _used_by_dcn);
+   break;
+   case 2:
+   REG_GET(SURFACE_CHECK2_ADDRESS_MSB,
+   CHECKER2_SURFACE_INUSE,
+   _used_by_dcn);
+   break;
+   case 3:
+   REG_GET(SURFACE_CHECK3_ADDRESS_MSB,
+   CHECKER3_SURFACE_INUSE,
+   _used_by_dcn);
+   break;
+   default:
+   break;
+   }
+   return (still_used_by_dcn == 1);
+}
+
+void hubbub1_wait_for_safe_surf_update(struct hubbub *hubbub, uint32_t 
hbup_inst)
+{
+   uint32_t still_used_by_dcn = 0, count = 0;
+
+   do {
+   still_used_by_dcn = hubbub1_is_surf_still_in_update(hubbub, 
hbup_inst);
+   udelay(1);
+   count++;
+   } while (still_used_by_dcn == 1 && count < 100);
+   ASSERT(count < 100);
+}
+
 static bool hubbub1_dcc_support_swizzle(
enum swizzle_mode_values swizzle,
unsigned int bytes_per_element,
@@ -860,12 +904,14 @@ static bool hubbub1_get_dcc_compression_cap(struct hubbub 
*hubbub,
return true;
 }
 
+
 static const struct hubbub_funcs hubbub1_funcs = {
.update_dchub = hubbub1_update_dchub,
.dcc_support_swizzle = hubbub1_dcc_support_swizzle,
.dcc_support_pixel_format = hubbub1_dcc_support_pixel_format,
.get_dcc_compression_cap = hubbub1_get_dcc_compression_cap,
.wm_read_state = hubbub1_wm_read_state,
+   .wait_for_surf_safe_update = hubbub1_wait_for_safe_surf_update,
 };
 
 void hubbub1_construct(struct hubbub *hubbub,
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubbub.h 
b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubbub.h
index 9cd4a51..f352e7a 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubbub.h
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubbub.h
@@ -52,7 +52,11 @@
SR(DCHUBBUB_GLOBAL_TIMER_CNTL), \
SR(DCHUBBUB_TEST_DEBUG_INDEX), \
SR(DCHUBBUB_TEST_DEBUG_DATA),\
-   SR(DCHUBBUB_SOFT_RESET)
+   SR(DCHUBBUB_SOFT_RESET),\
+   SR(SURFACE_CHECK0_ADDRESS_MSB),\
+   SR(SURFACE_CHECK1_ADDRESS_MSB),\
+   SR(SURFACE_CHECK2_ADDRESS_MSB),\
+   SR(SURFACE_CHECK3_ADDRESS_MSB)
 
 #define HUBBUB_SR_WATERMARK_REG_LIST()\
SR(DCHUBBUB_ARB_ALLOW_SR_ENTER_WATERMARK_A),\
@@ -116,6 +120,10 @@ struct dcn_hubbub_registers {
uint32_t DCN_VM_AGP_BOT;
uint32_t DCN_VM_AGP_TOP;
uint32_t DCN_VM_AGP_BASE;
+   uint32_t SURFACE_CHECK0_ADDRESS_MSB;
+   uint32_t 

[PATCH 11/35] drm/amd/display: Do cursor updates after stream updates

2019-02-13 Thread sunpeng.li
From: Nicholas Kazlauskas 

[Why]
Cursor updates used to happen after vblank/flip/stream updates before
the stream update refactor. They now happen before stream updates
which means that they're not going to be synced with fb changes
and that they're going to programmed for pipes that we're disabling
within the same commit.

[How]
Move them after stream updates.

Signed-off-by: Nicholas Kazlauskas 
Reviewed-by: David Francis 
Acked-by: Leo Li 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index a7c8583..8cd6a82 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -4768,10 +4768,9 @@ static void amdgpu_dm_commit_planes(struct 
drm_atomic_state *state,
struct dc_plane_state *dc_plane;
struct dm_plane_state *dm_new_plane_state = 
to_dm_plane_state(new_plane_state);
 
-   if (plane->type == DRM_PLANE_TYPE_CURSOR) {
-   handle_cursor_update(plane, old_plane_state);
+   /* Cursor plane is handled after stream updates */
+   if (plane->type == DRM_PLANE_TYPE_CURSOR)
continue;
-   }
 
if (!fb || !crtc || pcrtc != crtc)
continue;
@@ -4964,6 +4963,10 @@ static void amdgpu_dm_commit_planes(struct 
drm_atomic_state *state,
mutex_unlock(>dc_lock);
}
 
+   for_each_oldnew_plane_in_state(state, plane, old_plane_state, 
new_plane_state, i)
+   if (plane->type == DRM_PLANE_TYPE_CURSOR)
+   handle_cursor_update(plane, old_plane_state);
+
 cleanup:
kfree(flip);
kfree(full);
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 12/35] drm/amd/display: Clear stream->mode_changed after commit

2019-02-13 Thread sunpeng.li
From: Nicholas Kazlauskas 

[Why]
The stream->mode_changed flag can persist in the following sequence
of atomic commits:

Commit 1:
Enable CRTC0 (mode_changed = true), Enable CRTC1 (mode_changed = true)

Commit 2:
Disable CRTC1 (mode_changed = false)

In this sequence we want to keep the exiting CRTC0 but it's not in the
atomic state for the commit since it hasn't been modified. In this case
the stream->mode_changed flag persists as true and we don't re-program
the planes for the existing stream.

[How]
The flag needs to be cleared and it makes the most sense to do it within
DC after the state has been committed. Nothing following dc_commit_state
should think that the stream's mode has changed.

Signed-off-by: Nicholas Kazlauskas 
Reviewed-by: Leo Li 
Acked-by: Tony Cheng 
---
 drivers/gpu/drm/amd/display/dc/core/dc.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index 52f8384..8879cd4 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -1138,6 +1138,9 @@ static enum dc_status dc_commit_state_no_check(struct dc 
*dc, struct dc_state *c
/* pplib is notified if disp_num changed */
dc->hwss.optimize_bandwidth(dc, context);
 
+   for (i = 0; i < context->stream_count; i++)
+   context->streams[i]->mode_changed = false;
+
dc_release_state(dc->current_state);
 
dc->current_state = context;
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 24/35] drm/amd/display: fix optimize_bandwidth func pointer for dce80

2019-02-13 Thread sunpeng.li
From: Bhawanpreet Lakha 

[Why]
optimize_bandwidth was using dce100_prepare_bandwidth this is incorrect

[How]
change it to dce100_optimize_bandwidth

Signed-off-by: Bhawanpreet Lakha 
Reviewed-by: Charlene Liu 
Acked-by: Leo Li 
---
 drivers/gpu/drm/amd/display/dc/dce100/dce100_hw_sequencer.h | 4 
 drivers/gpu/drm/amd/display/dc/dce80/dce80_hw_sequencer.c   | 2 +-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dce100/dce100_hw_sequencer.h 
b/drivers/gpu/drm/amd/display/dc/dce100/dce100_hw_sequencer.h
index acd4185..a6b80fd 100644
--- a/drivers/gpu/drm/amd/display/dc/dce100/dce100_hw_sequencer.h
+++ b/drivers/gpu/drm/amd/display/dc/dce100/dce100_hw_sequencer.h
@@ -37,6 +37,10 @@ void dce100_prepare_bandwidth(
struct dc *dc,
struct dc_state *context);
 
+void dce100_optimize_bandwidth(
+   struct dc *dc,
+   struct dc_state *context);
+
 bool dce100_enable_display_power_gating(struct dc *dc, uint8_t controller_id,
struct dc_bios *dcb,
enum pipe_gating_control power_gating);
diff --git a/drivers/gpu/drm/amd/display/dc/dce80/dce80_hw_sequencer.c 
b/drivers/gpu/drm/amd/display/dc/dce80/dce80_hw_sequencer.c
index a60a90e..c454317 100644
--- a/drivers/gpu/drm/amd/display/dc/dce80/dce80_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dce80/dce80_hw_sequencer.c
@@ -77,6 +77,6 @@ void dce80_hw_sequencer_construct(struct dc *dc)
dc->hwss.enable_display_power_gating = 
dce100_enable_display_power_gating;
dc->hwss.pipe_control_lock = dce_pipe_control_lock;
dc->hwss.prepare_bandwidth = dce100_prepare_bandwidth;
-   dc->hwss.optimize_bandwidth = dce100_prepare_bandwidth;
+   dc->hwss.optimize_bandwidth = dce100_optimize_bandwidth;
 }
 
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 17/35] drm/amd/display: 3.2.18

2019-02-13 Thread sunpeng.li
From: mmcgarri 

Signed-off-by: mmcgarri 
Reviewed-by: Tony Cheng 
Acked-by: Leo Li 
---
 drivers/gpu/drm/amd/display/dc/dc.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dc.h 
b/drivers/gpu/drm/amd/display/dc/dc.h
index 1b8eaf5..9adb801 100644
--- a/drivers/gpu/drm/amd/display/dc/dc.h
+++ b/drivers/gpu/drm/amd/display/dc/dc.h
@@ -39,7 +39,7 @@
 #include "inc/hw/dmcu.h"
 #include "dml/display_mode_lib.h"
 
-#define DC_VER "3.2.17"
+#define DC_VER "3.2.18"
 
 #define MAX_SURFACES 3
 #define MAX_STREAMS 6
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 07/35] drm/amd/display: send pipe set command to dmcu when backlight is set

2019-02-13 Thread sunpeng.li
From: Josip Pavic 

[Why]
Previously, a change removed code that would send a pipe set command
to dmcu each time the backlight was set, as it was thought to be
superfluous. However, it is possible for the backlight to be set
before a valid pipe has been set, which causes DMCU to hang after a
DPMS restore on some systems.

[How]
Send a pipe set command to DMCU prior to setting the backlight.

Fixes: 4d3cb100431c ("drm/amd/display: send pipe set command to dmcu when 
backlight is set")
Signed-off-by: Josip Pavic 
Reviewed-by: Anthony Koo 
Acked-by: Leo Li 
---
 drivers/gpu/drm/amd/display/dc/dce/dce_abm.c | 45 ++--
 1 file changed, 23 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_abm.c 
b/drivers/gpu/drm/amd/display/dc/dce/dce_abm.c
index a740bc3..da96229 100644
--- a/drivers/gpu/drm/amd/display/dc/dce/dce_abm.c
+++ b/drivers/gpu/drm/amd/display/dc/dce/dce_abm.c
@@ -53,6 +53,27 @@
 
 #define MCP_DISABLE_ABM_IMMEDIATELY 255
 
+static bool dce_abm_set_pipe(struct abm *abm, uint32_t controller_id)
+{
+   struct dce_abm *abm_dce = TO_DCE_ABM(abm);
+   uint32_t rampingBoundary = 0x;
+
+   REG_WAIT(MASTER_COMM_CNTL_REG, MASTER_COMM_INTERRUPT, 0,
+   1, 8);
+
+   /* set ramping boundary */
+   REG_WRITE(MASTER_COMM_DATA_REG1, rampingBoundary);
+
+   /* setDMCUParam_Pipe */
+   REG_UPDATE_2(MASTER_COMM_CMD_REG,
+   MASTER_COMM_CMD_REG_BYTE0, MCP_ABM_PIPE_SET,
+   MASTER_COMM_CMD_REG_BYTE1, controller_id);
+
+   /* notifyDMCUMsg */
+   REG_UPDATE(MASTER_COMM_CNTL_REG, MASTER_COMM_INTERRUPT, 1);
+
+   return true;
+}
 
 static unsigned int calculate_16_bit_backlight_from_pwm(struct dce_abm 
*abm_dce)
 {
@@ -184,6 +205,8 @@ static void dmcu_set_backlight_level(
// Take MSB of fractional part since backlight is not max
backlight_8_bit = (backlight_pwm_u16_16 >> 8) & 0xFF;
 
+   dce_abm_set_pipe(_dce->base, controller_id);
+
/* waitDMCUReadyForCmd */
REG_WAIT(MASTER_COMM_CNTL_REG, MASTER_COMM_INTERRUPT,
0, 1, 8);
@@ -293,28 +316,6 @@ static bool dce_abm_set_level(struct abm *abm, uint32_t 
level)
return true;
 }
 
-static bool dce_abm_set_pipe(struct abm *abm, uint32_t controller_id)
-{
-   struct dce_abm *abm_dce = TO_DCE_ABM(abm);
-   uint32_t rampingBoundary = 0x;
-
-   REG_WAIT(MASTER_COMM_CNTL_REG, MASTER_COMM_INTERRUPT, 0,
-   1, 8);
-
-   /* set ramping boundary */
-   REG_WRITE(MASTER_COMM_DATA_REG1, rampingBoundary);
-
-   /* setDMCUParam_Pipe */
-   REG_UPDATE_2(MASTER_COMM_CMD_REG,
-   MASTER_COMM_CMD_REG_BYTE0, MCP_ABM_PIPE_SET,
-   MASTER_COMM_CMD_REG_BYTE1, controller_id);
-
-   /* notifyDMCUMsg */
-   REG_UPDATE(MASTER_COMM_CNTL_REG, MASTER_COMM_INTERRUPT, 1);
-
-   return true;
-}
-
 static bool dce_abm_immediate_disable(struct abm *abm)
 {
struct dce_abm *abm_dce = TO_DCE_ABM(abm);
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 00/35] DC Patches Feb 13, 2019

2019-02-13 Thread sunpeng.li
From: Leo Li 

Summary of change:
* Fix S3 resume black screen on DCE8
* Fix dissapearing cursor on Raven sytems
* Cleanup DM plane commit logic
* Fixes for multiplane commits
* Fixes for seamless boot

Anthony Koo (5):
  drm/amd/display: remove screen flashes on seamless boot
  drm/amd/display: Increase precision for backlight curve
  drm/amd/display: make seamless boot work generically
  drm/amd/display: Fix exception from AUX acquire failure
  drm/amd/display: Fix issue with link_active state not correct for MST

Bhawanpreet Lakha (2):
  drm/amd/display: fix optimize_bandwidth func pointer for dce80
  drm/amd/display: set clocks to 0 on suspend on dce80

Charlene Liu (2):
  drm/amd/display: Add disable triple buffering DC debug option
  drm/amd/display: dcn add check surface in_use

David Francis (2):
  drm/amd/display: Clean up wait on vblank event
  drm/amd/display: Make stream commits call into DC only once

Dmytro Laktyushkin (1):
  drm/amd/display: Allow for plane-less resource reservation

Eric Bernstein (1):
  drm/amd/display: Move enum gamut_remap_select to hw_shared.h

Eryk Brol (1):
  drm/amd/display: Add DCN_VM aperture registers

Fatemeh Darbehani (1):
  drm/amd/display: Remove redundant 'else' statement in
dcn1_update_clocks

Gary Kattan (1):
  drm/amd/display: Ungate stream before programming registers

Josip Pavic (3):
  drm/amd/display: send pipe set command to dmcu when stream unblanks
  drm/amd/display: send pipe set command to dmcu when backlight is set
  drm/amd/display: optionally optimize edp link rate based on timing

Jun Lei (3):
  drm/amd/display: PPLIB Hookup
  drm/amd/display: Add p_state_change_support flag to dc_clocks
  drm/amd/display: Add ability to override bounding box in DC construct

Leo (Hanghong) Ma (1):
  drm/amd/display: Fix MST reboot/poweroff sequence

Mark McGarrity (1):
  drm/amd/display: 3.2.19

Nicholas Kazlauskas (7):
  drm/amd/display: Fix wrong z-order when updating overlay planes
  drm/amd/display: Don't expose support for DRM_FORMAT_RGB888
  drm/amd/display: Fix update type mismatches in atomic check
  drm/amd/display: Do cursor updates after stream updates
  drm/amd/display: Clear stream->mode_changed after commit
  drm/amd/display: Fix negative cursor pos programming
  drm/amd/display: Reset planes that were disabled in init_pipes

Roman Li (1):
  drm/amd/display: Raise dispclk value for dce11

Wesley Chalmers (1):
  drm/amd/display: Set flip pending for pipe split

Yongqiang Sun (1):
  drm/amd/display: Refactor for setup periodic interrupt.

mmcgarri (1):
  drm/amd/display: 3.2.18

 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c  | 214 +
 .../drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c   |  20 +-
 drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c   |   2 +-
 drivers/gpu/drm/amd/display/dc/core/dc.c   |  20 +-
 drivers/gpu/drm/amd/display/dc/core/dc_link.c  |  15 +-
 drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c   | 195 +++-
 drivers/gpu/drm/amd/display/dc/core/dc_resource.c  |   3 +
 drivers/gpu/drm/amd/display/dc/core/dc_surface.c   |  13 ++
 drivers/gpu/drm/amd/display/dc/dc.h|  20 +-
 drivers/gpu/drm/amd/display/dc/dc_dp_types.h   |   2 +
 drivers/gpu/drm/amd/display/dc/dc_stream.h |  24 +-
 drivers/gpu/drm/amd/display/dc/dce/dce_abm.c   |  45 ++--
 drivers/gpu/drm/amd/display/dc/dce/dce_aux.c   |   4 +-
 drivers/gpu/drm/amd/display/dc/dce/dce_clk_mgr.c   |  11 +-
 .../amd/display/dc/dce100/dce100_hw_sequencer.h|   4 +
 .../amd/display/dc/dce110/dce110_hw_sequencer.c|  62 +++--
 .../drm/amd/display/dc/dce80/dce80_hw_sequencer.c  |   2 +-
 .../gpu/drm/amd/display/dc/dce80/dce80_resource.c  |  19 +-
 .../gpu/drm/amd/display/dc/dcn10/dcn10_clk_mgr.c   |  10 +-
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp.c   |   7 -
 .../gpu/drm/amd/display/dc/dcn10/dcn10_dpp_cm.c|   7 -
 .../gpu/drm/amd/display/dc/dcn10/dcn10_hubbub.c|  46 
 .../gpu/drm/amd/display/dc/dcn10/dcn10_hubbub.h|  25 +-
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubp.c  |  23 +-
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubp.h  |   3 +
 .../drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c  | 252 +++--
 .../drm/amd/display/dc/dcn10/dcn10_hw_sequencer.h  |   2 +
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_optc.c  | 133 ++-
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_optc.h  |  13 +-
 .../gpu/drm/amd/display/dc/dcn10/dcn10_resource.c  |   6 +-
 drivers/gpu/drm/amd/display/dc/dm_pp_smu.h |   2 +
 drivers/gpu/drm/amd/display/dc/dm_services.h   |   4 +-
 drivers/gpu/drm/amd/display/dc/dm_services_types.h |   2 +-
 .../gpu/drm/amd/display/dc/dml/display_mode_lib.c  |  24 ++
 .../gpu/drm/amd/display/dc/dml/display_mode_lib.h  |   5 +
 drivers/gpu/drm/amd/display/dc/inc/core_types.h|   2 +-
 drivers/gpu/drm/amd/display/dc/inc/hw/abm.h|   1 +
 drivers/gpu/drm/amd/display/dc/inc/hw/dchubbub.h   |   3 +
 

[PATCH 02/35] drm/amd/display: send pipe set command to dmcu when stream unblanks

2019-02-13 Thread sunpeng.li
From: Josip Pavic 

[Why]
When stream is blanked, pipe set command is sent to dmcu to notify it
that the abm pipe is disabled. When stream is unblanked, no notification is
made to dmcu that the abm pipe has been enabled, resulting in abm not
being enabled in the firmware.

[How]
When stream is unblanked, send a pipe set command to dmcu.

Signed-off-by: Josip Pavic 
Reviewed-by: Anthony Koo 
Acked-by: Leo Li 
---
 drivers/gpu/drm/amd/display/dc/dce/dce_abm.c   | 32 --
 .../drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c  |  4 ++-
 drivers/gpu/drm/amd/display/dc/inc/hw/abm.h|  1 +
 3 files changed, 21 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_abm.c 
b/drivers/gpu/drm/amd/display/dc/dce/dce_abm.c
index 01e56f1..a740bc3 100644
--- a/drivers/gpu/drm/amd/display/dc/dce/dce_abm.c
+++ b/drivers/gpu/drm/amd/display/dc/dce/dce_abm.c
@@ -175,7 +175,6 @@ static void dmcu_set_backlight_level(
uint32_t controller_id)
 {
unsigned int backlight_8_bit = 0;
-   uint32_t rampingBoundary = 0x;
uint32_t s2;
 
if (backlight_pwm_u16_16 & 0x1)
@@ -185,17 +184,6 @@ static void dmcu_set_backlight_level(
// Take MSB of fractional part since backlight is not max
backlight_8_bit = (backlight_pwm_u16_16 >> 8) & 0xFF;
 
-   /* set ramping boundary */
-   REG_WRITE(MASTER_COMM_DATA_REG1, rampingBoundary);
-
-   /* setDMCUParam_Pipe */
-   REG_UPDATE_2(MASTER_COMM_CMD_REG,
-   MASTER_COMM_CMD_REG_BYTE0, MCP_ABM_PIPE_SET,
-   MASTER_COMM_CMD_REG_BYTE1, controller_id);
-
-   /* notifyDMCUMsg */
-   REG_UPDATE(MASTER_COMM_CNTL_REG, MASTER_COMM_INTERRUPT, 1);
-
/* waitDMCUReadyForCmd */
REG_WAIT(MASTER_COMM_CNTL_REG, MASTER_COMM_INTERRUPT,
0, 1, 8);
@@ -305,21 +293,34 @@ static bool dce_abm_set_level(struct abm *abm, uint32_t 
level)
return true;
 }
 
-static bool dce_abm_immediate_disable(struct abm *abm)
+static bool dce_abm_set_pipe(struct abm *abm, uint32_t controller_id)
 {
struct dce_abm *abm_dce = TO_DCE_ABM(abm);
+   uint32_t rampingBoundary = 0x;
 
REG_WAIT(MASTER_COMM_CNTL_REG, MASTER_COMM_INTERRUPT, 0,
1, 8);
 
-   /* setDMCUParam_ABMLevel */
+   /* set ramping boundary */
+   REG_WRITE(MASTER_COMM_DATA_REG1, rampingBoundary);
+
+   /* setDMCUParam_Pipe */
REG_UPDATE_2(MASTER_COMM_CMD_REG,
MASTER_COMM_CMD_REG_BYTE0, MCP_ABM_PIPE_SET,
-   MASTER_COMM_CMD_REG_BYTE1, MCP_DISABLE_ABM_IMMEDIATELY);
+   MASTER_COMM_CMD_REG_BYTE1, controller_id);
 
/* notifyDMCUMsg */
REG_UPDATE(MASTER_COMM_CNTL_REG, MASTER_COMM_INTERRUPT, 1);
 
+   return true;
+}
+
+static bool dce_abm_immediate_disable(struct abm *abm)
+{
+   struct dce_abm *abm_dce = TO_DCE_ABM(abm);
+
+   dce_abm_set_pipe(abm, MCP_DISABLE_ABM_IMMEDIATELY);
+
abm->stored_backlight_registers.BL_PWM_CNTL =
REG_READ(BL_PWM_CNTL);
abm->stored_backlight_registers.BL_PWM_CNTL2 =
@@ -419,6 +420,7 @@ static const struct abm_funcs dce_funcs = {
.abm_init = dce_abm_init,
.set_abm_level = dce_abm_set_level,
.init_backlight = dce_abm_init_backlight,
+   .set_pipe = dce_abm_set_pipe,
.set_backlight_level_pwm = dce_abm_set_backlight_level_pwm,
.get_current_backlight = dce_abm_get_current_backlight,
.get_target_backlight = dce_abm_get_target_backlight,
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c 
b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
index 117d9d8..7f95808 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
@@ -2162,8 +2162,10 @@ static void dcn10_blank_pixel_data(
if (!blank) {
if (stream_res->tg->funcs->set_blank)
stream_res->tg->funcs->set_blank(stream_res->tg, blank);
-   if (stream_res->abm)
+   if (stream_res->abm) {
+   stream_res->abm->funcs->set_pipe(stream_res->abm, 
stream_res->tg->inst + 1);
stream_res->abm->funcs->set_abm_level(stream_res->abm, 
stream->abm_level);
+   }
} else if (blank) {
if (stream_res->abm)

stream_res->abm->funcs->set_abm_immediate_disable(stream_res->abm);
diff --git a/drivers/gpu/drm/amd/display/dc/inc/hw/abm.h 
b/drivers/gpu/drm/amd/display/dc/inc/hw/abm.h
index abc961c..86dc39a 100644
--- a/drivers/gpu/drm/amd/display/dc/inc/hw/abm.h
+++ b/drivers/gpu/drm/amd/display/dc/inc/hw/abm.h
@@ -46,6 +46,7 @@ struct abm_funcs {
void (*abm_init)(struct abm *abm);
bool (*set_abm_level)(struct abm *abm, unsigned int abm_level);
bool 

Re: [PATCH] drm/amd/display: Fix deadlock with display during hanged ring recovery.

2019-02-13 Thread Kazlauskas, Nicholas
On 2/13/19 2:21 PM, Grodzovsky, Andrey wrote:
> 
> On 2/13/19 2:16 PM, Kazlauskas, Nicholas wrote:
>> On 2/13/19 2:10 PM, Grodzovsky, Andrey wrote:
>>> On 2/13/19 2:00 PM, Kazlauskas, Nicholas wrote:
 On 2/13/19 1:58 PM, Andrey Grodzovsky wrote:
> When ring hang happens amdgpu_dm_commit_planes during flip is holding
> the BO reserved and then stack waiting for fences to signal in
> reservation_object_wait_timeout_rcu (which won't signal because there
> was a hnag). Then when we try to shutdown display block during reset
> recovery from drm_atomic_helper_suspend we also try to reserve the BO
> from dm_plane_helper_cleanup_fb ending in deadlock.
> Also remove useless WARN_ON
>
> Signed-off-by: Andrey Grodzovsky 
> ---
>   drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 19 
> +--
>   1 file changed, 13 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index acc4ff8..f8dec36 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -4802,14 +4802,21 @@ static void amdgpu_dm_commit_planes(struct 
> drm_atomic_state *state,
>*/
>   abo = gem_to_amdgpu_bo(fb->obj[0]);
>   r = amdgpu_bo_reserve(abo, true);
> - if (unlikely(r != 0)) {
> + if (unlikely(r != 0))
>   DRM_ERROR("failed to reserve buffer 
> before flip\n");
> - WARN_ON(1);
> - }
>   
> - /* Wait for all fences on this FB */
> - 
> WARN_ON(reservation_object_wait_timeout_rcu(abo->tbo.resv, true, false,
> - 
> MAX_SCHEDULE_TIMEOUT) < 0);
> + /*
> +  * Wait for all fences on this FB. Do limited wait to 
> avoid
> +  * deadlock during GPU reset when this fence will not 
> signal
> +  * but we hold reservation lock for the BO.
> +  */
> + r = reservation_object_wait_timeout_rcu(abo->tbo.resv,
> + true, false,
> + 
> msecs_to_jiffies(5000));
> + if (unlikely(r == 0))
> + DRM_ERROR("Waiting for fences timed out.");
> +
> +
>   
>   amdgpu_bo_get_tiling_flags(abo, _flags);
>   
>
 Is it safe that we're just continuing like this? It's probably better to
 just unreserve the buffer and continue to the next plane, no?

 Nicholas Kazlauskas
>>> As far as I see it should be safe as you are simply flipping to a buffer
>>> for which rendering hasn't finished (or stack actually in this case) so
>>> you might see visual corruption but that the least of your problems if
>>> after 5s the BO still not finalized for presentation, the system is
>>> already probably in very bad shape. Also, in case we do want to  do
>>> error handling we should also take care of  amdgpu_bo_reserve failure
>>> just before that.
>>>
>>> Andrey
>>>
>>>
>> Yeah, I guess this whole blocks needs to be cleaned up in that case.
>> This is a good first step at least. Technically
>> reservation_object_wait_timeout_rcu will return < 0 when it's been
>> interrupted too as an error code but I guess that will just be silently
>> ignored here.
>>
>> If you want you can change the condition to:
>>
>> if (unlikely(r >= 0))
>> DRM_ERROR("Waiting for FB fence failed: id=%d res=%d\n",
>> plane->id, r);
> 
> 
> Note that reservation_object_wait_timeout_rcu has a flag 'bool intr: if
> true, do interruptible wait', we set it to false since the code in case
> of flip runs form the kernel worker thread and not from IOCTL meaning we
> are not in user mode context and hence are not going to recieve user
> signals (cannot be interrupted). So the only values we can recieve here
> are either 0 for time out or val > 0 for wait that completed before time
> out value.
> 
> Andrey

Oh, right.

This patch is good as-is then.

Reviewed-by: Nicholas Kazlauskas 

> 
>>
>> But with or without that change this patch is:
>>
>> Reviewed-by: Nicholas Kazlauskas 
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> 

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH] drm/amd/display: Fix deadlock with display during hanged ring recovery.

2019-02-13 Thread Grodzovsky, Andrey

On 2/13/19 2:16 PM, Kazlauskas, Nicholas wrote:
> On 2/13/19 2:10 PM, Grodzovsky, Andrey wrote:
>> On 2/13/19 2:00 PM, Kazlauskas, Nicholas wrote:
>>> On 2/13/19 1:58 PM, Andrey Grodzovsky wrote:
 When ring hang happens amdgpu_dm_commit_planes during flip is holding
 the BO reserved and then stack waiting for fences to signal in
 reservation_object_wait_timeout_rcu (which won't signal because there
 was a hnag). Then when we try to shutdown display block during reset
 recovery from drm_atomic_helper_suspend we also try to reserve the BO
 from dm_plane_helper_cleanup_fb ending in deadlock.
 Also remove useless WARN_ON

 Signed-off-by: Andrey Grodzovsky 
 ---
  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 19 
 +--
  1 file changed, 13 insertions(+), 6 deletions(-)

 diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
 b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
 index acc4ff8..f8dec36 100644
 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
 +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
 @@ -4802,14 +4802,21 @@ static void amdgpu_dm_commit_planes(struct 
 drm_atomic_state *state,
 */
abo = gem_to_amdgpu_bo(fb->obj[0]);
r = amdgpu_bo_reserve(abo, true);
 -  if (unlikely(r != 0)) {
 +  if (unlikely(r != 0))
DRM_ERROR("failed to reserve buffer 
 before flip\n");
 -  WARN_ON(1);
 -  }
  
 -  /* Wait for all fences on this FB */
 -  
 WARN_ON(reservation_object_wait_timeout_rcu(abo->tbo.resv, true, false,
 -  
 MAX_SCHEDULE_TIMEOUT) < 0);
 +  /*
 +   * Wait for all fences on this FB. Do limited wait to 
 avoid
 +   * deadlock during GPU reset when this fence will not 
 signal
 +   * but we hold reservation lock for the BO.
 +   */
 +  r = reservation_object_wait_timeout_rcu(abo->tbo.resv,
 +  true, false,
 +  
 msecs_to_jiffies(5000));
 +  if (unlikely(r == 0))
 +  DRM_ERROR("Waiting for fences timed out.");
 +
 +
  
amdgpu_bo_get_tiling_flags(abo, _flags);
  

>>> Is it safe that we're just continuing like this? It's probably better to
>>> just unreserve the buffer and continue to the next plane, no?
>>>
>>> Nicholas Kazlauskas
>> As far as I see it should be safe as you are simply flipping to a buffer
>> for which rendering hasn't finished (or stack actually in this case) so
>> you might see visual corruption but that the least of your problems if
>> after 5s the BO still not finalized for presentation, the system is
>> already probably in very bad shape. Also, in case we do want to  do
>> error handling we should also take care of  amdgpu_bo_reserve failure
>> just before that.
>>
>> Andrey
>>
>>
> Yeah, I guess this whole blocks needs to be cleaned up in that case.
> This is a good first step at least. Technically
> reservation_object_wait_timeout_rcu will return < 0 when it's been
> interrupted too as an error code but I guess that will just be silently
> ignored here.
>
> If you want you can change the condition to:
>
> if (unlikely(r >= 0))
>DRM_ERROR("Waiting for FB fence failed: id=%d res=%d\n",
> plane->id, r);


Note that reservation_object_wait_timeout_rcu has a flag 'bool intr: if 
true, do interruptible wait', we set it to false since the code in case 
of flip runs form the kernel worker thread and not from IOCTL meaning we 
are not in user mode context and hence are not going to recieve user 
signals (cannot be interrupted). So the only values we can recieve here 
are either 0 for time out or val > 0 for wait that completed before time 
out value.

Andrey

>
> But with or without that change this patch is:
>
> Reviewed-by: Nicholas Kazlauskas 
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH] drm/amd/display: Fix deadlock with display during hanged ring recovery.

2019-02-13 Thread Kazlauskas, Nicholas
On 2/13/19 2:10 PM, Grodzovsky, Andrey wrote:
> 
> On 2/13/19 2:00 PM, Kazlauskas, Nicholas wrote:
>> On 2/13/19 1:58 PM, Andrey Grodzovsky wrote:
>>> When ring hang happens amdgpu_dm_commit_planes during flip is holding
>>> the BO reserved and then stack waiting for fences to signal in
>>> reservation_object_wait_timeout_rcu (which won't signal because there
>>> was a hnag). Then when we try to shutdown display block during reset
>>> recovery from drm_atomic_helper_suspend we also try to reserve the BO
>>> from dm_plane_helper_cleanup_fb ending in deadlock.
>>> Also remove useless WARN_ON
>>>
>>> Signed-off-by: Andrey Grodzovsky 
>>> ---
>>> drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 19 
>>> +--
>>> 1 file changed, 13 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
>>> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>>> index acc4ff8..f8dec36 100644
>>> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>>> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>>> @@ -4802,14 +4802,21 @@ static void amdgpu_dm_commit_planes(struct 
>>> drm_atomic_state *state,
>>>  */
>>> abo = gem_to_amdgpu_bo(fb->obj[0]);
>>> r = amdgpu_bo_reserve(abo, true);
>>> -   if (unlikely(r != 0)) {
>>> +   if (unlikely(r != 0))
>>> DRM_ERROR("failed to reserve buffer 
>>> before flip\n");
>>> -   WARN_ON(1);
>>> -   }
>>> 
>>> -   /* Wait for all fences on this FB */
>>> -   
>>> WARN_ON(reservation_object_wait_timeout_rcu(abo->tbo.resv, true, false,
>>> -   
>>> MAX_SCHEDULE_TIMEOUT) < 0);
>>> +   /*
>>> +* Wait for all fences on this FB. Do limited wait to 
>>> avoid
>>> +* deadlock during GPU reset when this fence will not 
>>> signal
>>> +* but we hold reservation lock for the BO.
>>> +*/
>>> +   r = reservation_object_wait_timeout_rcu(abo->tbo.resv,
>>> +   true, false,
>>> +   
>>> msecs_to_jiffies(5000));
>>> +   if (unlikely(r == 0))
>>> +   DRM_ERROR("Waiting for fences timed out.");
>>> +
>>> +
>>> 
>>> amdgpu_bo_get_tiling_flags(abo, _flags);
>>> 
>>>
>> Is it safe that we're just continuing like this? It's probably better to
>> just unreserve the buffer and continue to the next plane, no?
>>
>> Nicholas Kazlauskas
> 
> As far as I see it should be safe as you are simply flipping to a buffer
> for which rendering hasn't finished (or stack actually in this case) so
> you might see visual corruption but that the least of your problems if
> after 5s the BO still not finalized for presentation, the system is
> already probably in very bad shape. Also, in case we do want to  do
> error handling we should also take care of  amdgpu_bo_reserve failure
> just before that.
> 
> Andrey
> 
> 

Yeah, I guess this whole blocks needs to be cleaned up in that case. 
This is a good first step at least. Technically 
reservation_object_wait_timeout_rcu will return < 0 when it's been 
interrupted too as an error code but I guess that will just be silently 
ignored here.

If you want you can change the condition to:

if (unlikely(r >= 0))
  DRM_ERROR("Waiting for FB fence failed: id=%d res=%d\n", 
plane->id, r);

But with or without that change this patch is:

Reviewed-by: Nicholas Kazlauskas 
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH] drm/amd/display: Fix deadlock with display during hanged ring recovery.

2019-02-13 Thread Grodzovsky, Andrey

On 2/13/19 2:00 PM, Kazlauskas, Nicholas wrote:
> On 2/13/19 1:58 PM, Andrey Grodzovsky wrote:
>> When ring hang happens amdgpu_dm_commit_planes during flip is holding
>> the BO reserved and then stack waiting for fences to signal in
>> reservation_object_wait_timeout_rcu (which won't signal because there
>> was a hnag). Then when we try to shutdown display block during reset
>> recovery from drm_atomic_helper_suspend we also try to reserve the BO
>> from dm_plane_helper_cleanup_fb ending in deadlock.
>> Also remove useless WARN_ON
>>
>> Signed-off-by: Andrey Grodzovsky 
>> ---
>>drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 19 +--
>>1 file changed, 13 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
>> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>> index acc4ff8..f8dec36 100644
>> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>> @@ -4802,14 +4802,21 @@ static void amdgpu_dm_commit_planes(struct 
>> drm_atomic_state *state,
>>   */
>>  abo = gem_to_amdgpu_bo(fb->obj[0]);
>>  r = amdgpu_bo_reserve(abo, true);
>> -if (unlikely(r != 0)) {
>> +if (unlikely(r != 0))
>>  DRM_ERROR("failed to reserve buffer before 
>> flip\n");
>> -WARN_ON(1);
>> -}
>>
>> -/* Wait for all fences on this FB */
>> -
>> WARN_ON(reservation_object_wait_timeout_rcu(abo->tbo.resv, true, false,
>> -
>> MAX_SCHEDULE_TIMEOUT) < 0);
>> +/*
>> + * Wait for all fences on this FB. Do limited wait to 
>> avoid
>> + * deadlock during GPU reset when this fence will not 
>> signal
>> + * but we hold reservation lock for the BO.
>> + */
>> +r = reservation_object_wait_timeout_rcu(abo->tbo.resv,
>> +true, false,
>> +
>> msecs_to_jiffies(5000));
>> +if (unlikely(r == 0))
>> +DRM_ERROR("Waiting for fences timed out.");
>> +
>> +
>>
>>  amdgpu_bo_get_tiling_flags(abo, _flags);
>>
>>
> Is it safe that we're just continuing like this? It's probably better to
> just unreserve the buffer and continue to the next plane, no?
>
> Nicholas Kazlauskas

As far as I see it should be safe as you are simply flipping to a buffer 
for which rendering hasn't finished (or stack actually in this case) so 
you might see visual corruption but that the least of your problems if 
after 5s the BO still not finalized for presentation, the system is 
already probably in very bad shape. Also, in case we do want to  do 
error handling we should also take care of  amdgpu_bo_reserve failure 
just before that.

Andrey


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH] drm/amd/display: Fix deadlock with display during hanged ring recovery.

2019-02-13 Thread Kazlauskas, Nicholas
On 2/13/19 1:58 PM, Andrey Grodzovsky wrote:
> When ring hang happens amdgpu_dm_commit_planes during flip is holding
> the BO reserved and then stack waiting for fences to signal in
> reservation_object_wait_timeout_rcu (which won't signal because there
> was a hnag). Then when we try to shutdown display block during reset
> recovery from drm_atomic_helper_suspend we also try to reserve the BO
> from dm_plane_helper_cleanup_fb ending in deadlock.
> Also remove useless WARN_ON
> 
> Signed-off-by: Andrey Grodzovsky 
> ---
>   drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 19 +--
>   1 file changed, 13 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index acc4ff8..f8dec36 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -4802,14 +4802,21 @@ static void amdgpu_dm_commit_planes(struct 
> drm_atomic_state *state,
>*/
>   abo = gem_to_amdgpu_bo(fb->obj[0]);
>   r = amdgpu_bo_reserve(abo, true);
> - if (unlikely(r != 0)) {
> + if (unlikely(r != 0))
>   DRM_ERROR("failed to reserve buffer before 
> flip\n");
> - WARN_ON(1);
> - }
>   
> - /* Wait for all fences on this FB */
> - 
> WARN_ON(reservation_object_wait_timeout_rcu(abo->tbo.resv, true, false,
> - 
> MAX_SCHEDULE_TIMEOUT) < 0);
> + /*
> +  * Wait for all fences on this FB. Do limited wait to 
> avoid
> +  * deadlock during GPU reset when this fence will not 
> signal
> +  * but we hold reservation lock for the BO.
> +  */
> + r = reservation_object_wait_timeout_rcu(abo->tbo.resv,
> + true, false,
> + 
> msecs_to_jiffies(5000));
> + if (unlikely(r == 0))
> + DRM_ERROR("Waiting for fences timed out.");
> +
> +
>   
>   amdgpu_bo_get_tiling_flags(abo, _flags);
>   
> 

Is it safe that we're just continuing like this? It's probably better to 
just unreserve the buffer and continue to the next plane, no?

Nicholas Kazlauskas
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH] drm/amd/display: Fix deadlock with display during hanged ring recovery.

2019-02-13 Thread Andrey Grodzovsky
When ring hang happens amdgpu_dm_commit_planes during flip is holding
the BO reserved and then stack waiting for fences to signal in
reservation_object_wait_timeout_rcu (which won't signal because there
was a hnag). Then when we try to shutdown display block during reset
recovery from drm_atomic_helper_suspend we also try to reserve the BO
from dm_plane_helper_cleanup_fb ending in deadlock.
Also remove useless WARN_ON

Signed-off-by: Andrey Grodzovsky 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 19 +--
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index acc4ff8..f8dec36 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -4802,14 +4802,21 @@ static void amdgpu_dm_commit_planes(struct 
drm_atomic_state *state,
 */
abo = gem_to_amdgpu_bo(fb->obj[0]);
r = amdgpu_bo_reserve(abo, true);
-   if (unlikely(r != 0)) {
+   if (unlikely(r != 0))
DRM_ERROR("failed to reserve buffer before 
flip\n");
-   WARN_ON(1);
-   }
 
-   /* Wait for all fences on this FB */
-   
WARN_ON(reservation_object_wait_timeout_rcu(abo->tbo.resv, true, false,
-   
MAX_SCHEDULE_TIMEOUT) < 0);
+   /*
+* Wait for all fences on this FB. Do limited wait to 
avoid
+* deadlock during GPU reset when this fence will not 
signal
+* but we hold reservation lock for the BO.
+*/
+   r = reservation_object_wait_timeout_rcu(abo->tbo.resv,
+   true, false,
+   
msecs_to_jiffies(5000));
+   if (unlikely(r == 0))
+   DRM_ERROR("Waiting for fences timed out.");
+
+
 
amdgpu_bo_get_tiling_flags(abo, _flags);
 
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH 2/3] drm/dsc: Add native 420 and 422 support to compute_rc_params

2019-02-13 Thread Manasi Navare via amd-gfx
On Wed, Feb 13, 2019 at 09:45:35AM -0500, David Francis wrote:
> Native 420 and 422 transfer modes are new in DSC1.2
> 
> In these modes, each two pixels of a slice are treated as one
> pixel, so the slice width is half as large (round down) for
> the purposes of calucating the groups per line and chunk size
> in bytes
> 
> In native 422 mode, each pixel has four components, so the
> mux component of a group is larger by one additional mux word
> and one additional component
> 
> Now that there is native 422 support, the configuration option
> previously called enable422 is renamed to simple_422 to avoid
> confusion
> 
> Signed-off-by: David Francis 

This looks good and verified that the DSC 1.2 spec actually renames it
as simple_422.

Reviewed-by: Manasi Navare 

Manasi

> ---
>  drivers/gpu/drm/drm_dsc.c | 31 +++
>  drivers/gpu/drm/i915/intel_vdsc.c |  4 ++--
>  include/drm/drm_dsc.h |  4 ++--
>  3 files changed, 27 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_dsc.c b/drivers/gpu/drm/drm_dsc.c
> index 4b0e3c9c3ff8..9e675dd39a44 100644
> --- a/drivers/gpu/drm/drm_dsc.c
> +++ b/drivers/gpu/drm/drm_dsc.c
> @@ -77,7 +77,7 @@ void drm_dsc_pps_infoframe_pack(struct 
> drm_dsc_pps_infoframe *pps_sdp,
>   ((dsc_cfg->bits_per_pixel & DSC_PPS_BPP_HIGH_MASK) >>
>DSC_PPS_MSB_SHIFT) |
>   dsc_cfg->vbr_enable << DSC_PPS_VBR_EN_SHIFT |
> - dsc_cfg->enable422 << DSC_PPS_SIMPLE422_SHIFT |
> + dsc_cfg->simple_422 << DSC_PPS_SIMPLE422_SHIFT |
>   dsc_cfg->convert_rgb << DSC_PPS_CONVERT_RGB_SHIFT |
>   dsc_cfg->block_pred_enable << DSC_PPS_BLOCK_PRED_EN_SHIFT;
>  
> @@ -246,19 +246,34 @@ int drm_dsc_compute_rc_parameters(struct drm_dsc_config 
> *vdsc_cfg)
>   unsigned long final_scale = 0;
>   unsigned long rbs_min = 0;
>  
> - /* Number of groups used to code each line of a slice */
> - groups_per_line = DIV_ROUND_UP(vdsc_cfg->slice_width,
> -DSC_RC_PIXELS_PER_GROUP);
> + if (vdsc_cfg->native_420 || vdsc_cfg->native_422) {
> + /* Number of groups used to code each line of a slice */
> + groups_per_line = DIV_ROUND_UP(vdsc_cfg->slice_width / 2,
> +DSC_RC_PIXELS_PER_GROUP);
>  
> - /* chunksize in Bytes */
> - vdsc_cfg->slice_chunk_size = DIV_ROUND_UP(vdsc_cfg->slice_width *
> -   vdsc_cfg->bits_per_pixel,
> -   (8 * 16));
> + /* chunksize in Bytes */
> + vdsc_cfg->slice_chunk_size = DIV_ROUND_UP(vdsc_cfg->slice_width 
> / 2 *
> +   
> vdsc_cfg->bits_per_pixel,
> +   (8 * 16));
> + } else {
> + /* Number of groups used to code each line of a slice */
> + groups_per_line = DIV_ROUND_UP(vdsc_cfg->slice_width,
> +DSC_RC_PIXELS_PER_GROUP);
> +
> + /* chunksize in Bytes */
> + vdsc_cfg->slice_chunk_size = DIV_ROUND_UP(vdsc_cfg->slice_width 
> *
> +   
> vdsc_cfg->bits_per_pixel,
> +   (8 * 16));
> + }
>  
>   if (vdsc_cfg->convert_rgb)
>   num_extra_mux_bits = 3 * (vdsc_cfg->mux_word_size +
> (4 * vdsc_cfg->bits_per_component + 4)
> - 2);
> + else if (vdsc_cfg->native_422)
> + num_extra_mux_bits = 4 * vdsc_cfg->mux_word_size +
> + (4 * vdsc_cfg->bits_per_component + 4) +
> + 3 * (4 * vdsc_cfg->bits_per_component) - 2;
>   else
>   num_extra_mux_bits = 3 * vdsc_cfg->mux_word_size +
>   (4 * vdsc_cfg->bits_per_component + 4) +
> diff --git a/drivers/gpu/drm/i915/intel_vdsc.c 
> b/drivers/gpu/drm/i915/intel_vdsc.c
> index c76cec8bfb74..7702c5c8b3f2 100644
> --- a/drivers/gpu/drm/i915/intel_vdsc.c
> +++ b/drivers/gpu/drm/i915/intel_vdsc.c
> @@ -369,7 +369,7 @@ int intel_dp_compute_dsc_params(struct intel_dp *intel_dp,
>   DSC_1_1_MAX_LINEBUF_DEPTH_BITS : line_buf_depth;
>  
>   /* Gen 11 does not support YCbCr */
> - vdsc_cfg->enable422 = false;
> + vdsc_cfg->simple_422 = false;
>   /* Gen 11 does not support VBR */
>   vdsc_cfg->vbr_enable = false;
>   vdsc_cfg->block_pred_enable =
> @@ -496,7 +496,7 @@ static void intel_configure_pps_for_dsc_encoder(struct 
> intel_encoder *encoder,
>   pps_val |= DSC_BLOCK_PREDICTION;
>   if (vdsc_cfg->convert_rgb)
>   pps_val |= DSC_COLOR_SPACE_CONVERSION;
> - if (vdsc_cfg->enable422)
> + if (vdsc_cfg->simple_422)
>   

Re: [PATCH 2/3] drm: Add basic helper to allow precise pageflip timestamps in vrr.

2019-02-13 Thread Kazlauskas, Nicholas
On 2/13/19 1:10 PM, Mario Kleiner wrote:
> On Wed, Feb 13, 2019 at 5:03 PM Daniel Vetter  wrote:
>>
>> On Wed, Feb 13, 2019 at 4:46 PM Kazlauskas, Nicholas
>>  wrote:
>>>
>>> On 2/13/19 10:14 AM, Daniel Vetter wrote:
 On Wed, Feb 13, 2019 at 3:33 PM Kazlauskas, Nicholas
  wrote:
>
> On 2/13/19 4:50 AM, Daniel Vetter wrote:
>> On Tue, Feb 12, 2019 at 10:32:31PM +0100, Mario Kleiner wrote:
>>> On Mon, Feb 11, 2019 at 6:04 PM Daniel Vetter  wrote:

 On Mon, Feb 11, 2019 at 4:01 PM Kazlauskas, Nicholas
  wrote:
>
> On 2/11/19 3:35 AM, Daniel Vetter wrote:
>> On Mon, Feb 11, 2019 at 04:22:24AM +0100, Mario Kleiner wrote:
>>> The pageflip completion timestamps transmitted to userspace
>>> via pageflip completion events are supposed to describe the
>>> time at which the first pixel of the new post-pageflip scanout
>>> buffer leaves the video output of the gpu. This time is
>>> identical to end of vblank, when active scanout starts.
>>>
>>> For a crtc in standard fixed refresh rate, the end of vblank
>>> is identical to the vblank timestamps calculated by
>>> drm_update_vblank_count() at each vblank interrupt, or each
>>> vblank dis-/enable. Therefore pageflip events just carry
>>> that vblank timestamp as their pageflip timestamp.
>>>
>>> For a crtc switched to variable refresh rate mode (vrr), the
>>> pageflip completion timestamps are identical to the vblank
>>> timestamps iff the pageflip was executed early in vblank,
>>> before the minimum vblank duration elapsed. In this case
>>> the time of display onset is identical to when the crtc
>>> is running in fixed refresh rate.
>>>
>>> However, if a pageflip completes later in the vblank, inside
>>> the "extended front porch" in vrr mode, then the vblank will
>>> terminate at a fixed (back porch) duration after flip, so
>>> the display onset time is delayed correspondingly. In this
>>> case the vblank timestamp computed at vblank irq time would
>>> be too early, and we need a way to calculate an estimated
>>> pageflip timestamp that will be later than the vblank timestamp.
>>>
>>> How a driver determines such a "late flip" timestamp is hw
>>> and driver specific, but this patch adds a new helper function
>>> that allows the driver to propose such an alternate "late flip"
>>> timestamp for use in pageflip events:
>>>
>>> drm_crtc_set_vrr_pageflip_timestamp(crtc, flip_timestamp);
>>>
>>> When sending out pageflip events, we now compare that proposed
>>> flip_timestamp against the vblank timestamp of the current
>>> vblank of flip completion and choose to send out the greater/
>>> later timestamp as flip completion timestamp.
>>>
>>> The most simple way for a kms driver to supply a suitable
>>> flip_timestamp in vrr mode would be to simply take a timestamp
>>> at start of the pageflip completion handler, e.g., pageflip
>>> irq handler: flip_timestamp = ktime_get(); and then set that
>>> as proposed "late" alternative timestamp via ...
>>> drm_crtc_set_vrr_pageflip_timestamp(crtc, flip_timestamp);
>>>
>>> More clever approaches could try to add some corrective offset
>>> for fixed back porch duration, or ideally use hardware features
>>> like hw timestamps to calculate the exact end time of vblank.
>>>
>>> Signed-off-by: Mario Kleiner 
>>> Cc: Nicholas Kazlauskas 
>>> Cc: Harry Wentland 
>>> Cc: Alex Deucher 
>>
>> Uh, this looks like a pretty bad hack. Can't we fix amdgpu to only 
>> give us
>> the right timestampe, once? With this I guess if you do a vblank 
>> query in
>> between the wrong and the right vblank you'll get the bogus value. 
>> Not
>> really great for userspace.
>> -Daniel
>
> I think we calculate the timestamp and send the vblank event both 
> within
> the pageflip IRQ handler so calculating the right pageflip timestamp
> once could probably be done. I'm not sure if it's easier than 
> proposing
> a later flip time with an API like this though.
>
> The actual scanout time should be known from the page-flip handler so
> the semantics for VRR on/off remain the same. This is because the
> page-flip triggers entering the back porch if we're in the extended
> front porch.
>
> But scanout time from vblank events for something like
> DRM_IOCTL_WAIT_VBLANK are going to be wrong in most cases and are only
> treated as estimates. If we're in the regular front porch 

Re: [PATCH 1/3] drm/i915: Move dsc rate params compute into drm

2019-02-13 Thread Manasi Navare via amd-gfx
On Wed, Feb 13, 2019 at 09:45:34AM -0500, David Francis wrote:
> The function intel_compute_rc_parameters is part of the dsc spec
> and is not driver-specific. Other drm drivers might like to use
> it.  The function is not changed; just moved and renamed.
>

Yes this sounds fair since its DSC spec related and can move to drm_dsc.c.
As a part of this series or later you should also consider moving the
rc_parameters struct for input bpc/output BPP combinations to DRM since that
is also purely spec related.

With this change and compute_rc_params function in DRM, please add appropriate
description of the function as part of kernel documentation.

With the documentation change, you have my r-b.

Regards
Manasi
 
> Signed-off-by: David Francis 
> ---
>  drivers/gpu/drm/drm_dsc.c | 133 ++
>  drivers/gpu/drm/i915/intel_vdsc.c | 125 +---
>  include/drm/drm_dsc.h |   1 +
>  3 files changed, 135 insertions(+), 124 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_dsc.c b/drivers/gpu/drm/drm_dsc.c
> index bc2b23adb072..4b0e3c9c3ff8 100644
> --- a/drivers/gpu/drm/drm_dsc.c
> +++ b/drivers/gpu/drm/drm_dsc.c
> @@ -11,6 +11,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  
> @@ -226,3 +227,135 @@ void drm_dsc_pps_infoframe_pack(struct 
> drm_dsc_pps_infoframe *pps_sdp,
>   /* PPS 94 - 127 are O */
>  }
>  EXPORT_SYMBOL(drm_dsc_pps_infoframe_pack);
> +
> +/**
> + * drm_dsc_compute_rc_parameters() - Write rate control
> + * parameters to the dsc configuration. Some configuration
> + * fields must be present beforehand.
> + *
> + * @dsc_cfg:
> + * DSC Configuration data partially filled by driver
> + */
> +int drm_dsc_compute_rc_parameters(struct drm_dsc_config *vdsc_cfg)
> +{
> + unsigned long groups_per_line = 0;
> + unsigned long groups_total = 0;
> + unsigned long num_extra_mux_bits = 0;
> + unsigned long slice_bits = 0;
> + unsigned long hrd_delay = 0;
> + unsigned long final_scale = 0;
> + unsigned long rbs_min = 0;
> +
> + /* Number of groups used to code each line of a slice */
> + groups_per_line = DIV_ROUND_UP(vdsc_cfg->slice_width,
> +DSC_RC_PIXELS_PER_GROUP);
> +
> + /* chunksize in Bytes */
> + vdsc_cfg->slice_chunk_size = DIV_ROUND_UP(vdsc_cfg->slice_width *
> +   vdsc_cfg->bits_per_pixel,
> +   (8 * 16));
> +
> + if (vdsc_cfg->convert_rgb)
> + num_extra_mux_bits = 3 * (vdsc_cfg->mux_word_size +
> +   (4 * vdsc_cfg->bits_per_component + 4)
> +   - 2);
> + else
> + num_extra_mux_bits = 3 * vdsc_cfg->mux_word_size +
> + (4 * vdsc_cfg->bits_per_component + 4) +
> + 2 * (4 * vdsc_cfg->bits_per_component) - 2;
> + /* Number of bits in one Slice */
> + slice_bits = 8 * vdsc_cfg->slice_chunk_size * vdsc_cfg->slice_height;
> +
> + while ((num_extra_mux_bits > 0) &&
> +((slice_bits - num_extra_mux_bits) % vdsc_cfg->mux_word_size))
> + num_extra_mux_bits--;
> +
> + if (groups_per_line < vdsc_cfg->initial_scale_value - 8)
> + vdsc_cfg->initial_scale_value = groups_per_line + 8;
> +
> + /* scale_decrement_interval calculation according to DSC spec 1.11 */
> + if (vdsc_cfg->initial_scale_value > 8)
> + vdsc_cfg->scale_decrement_interval = groups_per_line /
> + (vdsc_cfg->initial_scale_value - 8);
> + else
> + vdsc_cfg->scale_decrement_interval = 
> DSC_SCALE_DECREMENT_INTERVAL_MAX;
> +
> + vdsc_cfg->final_offset = vdsc_cfg->rc_model_size -
> + (vdsc_cfg->initial_xmit_delay *
> +  vdsc_cfg->bits_per_pixel + 8) / 16 + num_extra_mux_bits;
> +
> + if (vdsc_cfg->final_offset >= vdsc_cfg->rc_model_size) {
> + DRM_DEBUG_KMS("FinalOfs < RcModelSze for this 
> InitialXmitDelay\n");
> + return -ERANGE;
> + }
> +
> + final_scale = (vdsc_cfg->rc_model_size * 8) /
> + (vdsc_cfg->rc_model_size - vdsc_cfg->final_offset);
> + if (vdsc_cfg->slice_height > 1)
> + /*
> +  * NflBpgOffset is 16 bit value with 11 fractional bits
> +  * hence we multiply by 2^11 for preserving the
> +  * fractional part
> +  */
> + vdsc_cfg->nfl_bpg_offset = 
> DIV_ROUND_UP((vdsc_cfg->first_line_bpg_offset << 11),
> + (vdsc_cfg->slice_height 
> - 1));
> + else
> + vdsc_cfg->nfl_bpg_offset = 0;
> +
> + /* 2^16 - 1 */
> + if (vdsc_cfg->nfl_bpg_offset > 65535) {
> + DRM_DEBUG_KMS("NflBpgOffset is too large for this slice 
> height\n");
> + return -ERANGE;
> + }
> +
> + /* Number of 

[PATCH 2/4] drm/amdgpu: Add last_non_cp in amdgpu_doorbell_index

2019-02-13 Thread Zhao, Yong
Change-Id: Icc9167771ad9539d8e31b40058e3b22be825a585
Signed-off-by: Yong Zhao 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h | 6 ++
 drivers/gpu/drm/amd/amdgpu/vega10_reg_init.c | 3 +++
 drivers/gpu/drm/amd/amdgpu/vega20_reg_init.c | 3 +++
 3 files changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h
index 43546500ec26..1ccc10741ad8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h
@@ -69,6 +69,7 @@ struct amdgpu_doorbell_index {
uint32_t vce_ring6_7;
} uvd_vce;
};
+   uint32_t last_non_cp;
uint32_t max_assignment;
/* Per engine SDMA doorbell size in dword */
uint32_t sdma_doorbell_range;
@@ -139,6 +140,9 @@ typedef enum _AMDGPU_VEGA20_DOORBELL_ASSIGNMENT
AMDGPU_VEGA20_DOORBELL64_VCE_RING2_3 = 0x18D,
AMDGPU_VEGA20_DOORBELL64_VCE_RING4_5 = 0x18E,
AMDGPU_VEGA20_DOORBELL64_VCE_RING6_7 = 0x18F,
+
+   AMDGPU_VEGA20_DOORBELL64_LAST_NON_CP = 
AMDGPU_VEGA20_DOORBELL64_VCE_RING6_7,
+
AMDGPU_VEGA20_DOORBELL_MAX_ASSIGNMENT= 0x18F,
AMDGPU_VEGA20_DOORBELL_INVALID   = 0x
 } AMDGPU_VEGA20_DOORBELL_ASSIGNMENT;
@@ -214,6 +218,8 @@ typedef enum _AMDGPU_DOORBELL64_ASSIGNMENT
AMDGPU_DOORBELL64_VCE_RING4_5 = 0xFE,
AMDGPU_DOORBELL64_VCE_RING6_7 = 0xFF,
 
+   AMDGPU_DOORBELL64_LAST_NON_CP = 
AMDGPU_DOORBELL64_VCE_RING6_7,
+
AMDGPU_DOORBELL64_MAX_ASSIGNMENT  = 0xFF,
AMDGPU_DOORBELL64_INVALID = 0x
 } AMDGPU_DOORBELL64_ASSIGNMENT;
diff --git a/drivers/gpu/drm/amd/amdgpu/vega10_reg_init.c 
b/drivers/gpu/drm/amd/amdgpu/vega10_reg_init.c
index 62f49c895314..ffe0e0593207 100644
--- a/drivers/gpu/drm/amd/amdgpu/vega10_reg_init.c
+++ b/drivers/gpu/drm/amd/amdgpu/vega10_reg_init.c
@@ -79,6 +79,9 @@ void vega10_doorbell_index_init(struct amdgpu_device *adev)
adev->doorbell_index.uvd_vce.vce_ring2_3 = 
AMDGPU_DOORBELL64_VCE_RING2_3;
adev->doorbell_index.uvd_vce.vce_ring4_5 = 
AMDGPU_DOORBELL64_VCE_RING4_5;
adev->doorbell_index.uvd_vce.vce_ring6_7 = 
AMDGPU_DOORBELL64_VCE_RING6_7;
+
+   adev->doorbell_index.last_non_cp = AMDGPU_DOORBELL64_LAST_NON_CP;
+
/* In unit of dword doorbell */
adev->doorbell_index.max_assignment = AMDGPU_DOORBELL64_MAX_ASSIGNMENT 
<< 1;
adev->doorbell_index.sdma_doorbell_range = 4;
diff --git a/drivers/gpu/drm/amd/amdgpu/vega20_reg_init.c 
b/drivers/gpu/drm/amd/amdgpu/vega20_reg_init.c
index 1271e1702ad4..700ff8aec999 100644
--- a/drivers/gpu/drm/amd/amdgpu/vega20_reg_init.c
+++ b/drivers/gpu/drm/amd/amdgpu/vega20_reg_init.c
@@ -83,6 +83,9 @@ void vega20_doorbell_index_init(struct amdgpu_device *adev)
adev->doorbell_index.uvd_vce.vce_ring2_3 = 
AMDGPU_VEGA20_DOORBELL64_VCE_RING2_3;
adev->doorbell_index.uvd_vce.vce_ring4_5 = 
AMDGPU_VEGA20_DOORBELL64_VCE_RING4_5;
adev->doorbell_index.uvd_vce.vce_ring6_7 = 
AMDGPU_VEGA20_DOORBELL64_VCE_RING6_7;
+
+   adev->doorbell_index.last_non_cp = AMDGPU_VEGA20_DOORBELL64_LAST_NON_CP;
+
adev->doorbell_index.max_assignment = 
AMDGPU_VEGA20_DOORBELL_MAX_ASSIGNMENT << 1;
adev->doorbell_index.sdma_doorbell_range = 20;
 }
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 4/4] drm/amdkfd: Optimize out sdma doorbell array in kgd2kfd_shared_resources

2019-02-13 Thread Zhao, Yong
We can directly calculate sdma doorbell indexes in the process doorbell
pages through the doorbell_index structure in amdgpu_device, so no need
to cache them in kgd2kfd_shared_resources any more. This alleviates the
adaptation needs when new SDMA configurations are introduced.

Change-Id: Ic657799856ed0256f36b01e502ef0cab263b1f49
Signed-off-by: Yong Zhao 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c| 41 +--
 .../drm/amd/amdkfd/kfd_device_queue_manager.c | 16 +---
 .../gpu/drm/amd/include/kgd_kfd_interface.h   |  4 +-
 3 files changed, 23 insertions(+), 38 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index a8a166fff1e3..88f6f0ae38a6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -131,7 +131,7 @@ static void amdgpu_doorbell_get_kfd_info(struct 
amdgpu_device *adev,
 
 void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
 {
-   int i, n;
+   int i;
int last_valid_bit;
 
if (adev->kfd.dev) {
@@ -142,7 +142,9 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
.gpuvm_size = min(adev->vm_manager.max_pfn
  << AMDGPU_GPU_PAGE_SHIFT,
  AMDGPU_GMC_HOLE_START),
-   .drm_render_minor = adev->ddev->render->index
+   .drm_render_minor = adev->ddev->render->index,
+   .sdma_doorbell_idx = adev->doorbell_index.sdma_engine,
+
};
 
/* this is going to have a few of the MSBs set that we need to
@@ -172,31 +174,6 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
_resources.doorbell_aperture_size,
_resources.doorbell_start_offset);
 
-   if (adev->asic_type < CHIP_VEGA10) {
-   kgd2kfd_device_init(adev->kfd.dev, _resources);
-   return;
-   }
-
-   n = (adev->asic_type < CHIP_VEGA20) ? 2 : 8;
-
-   for (i = 0; i < n; i += 2) {
-   /* On SOC15 the BIF is involved in routing
-* doorbells using the low 12 bits of the
-* address. Communicate the assignments to
-* KFD. KFD uses two doorbell pages per
-* process in case of 64-bit doorbells so we
-* can use each doorbell assignment twice.
-*/
-   gpu_resources.sdma_doorbell[0][i] =
-   adev->doorbell_index.sdma_engine[0] + (i >> 1);
-   gpu_resources.sdma_doorbell[0][i+1] =
-   adev->doorbell_index.sdma_engine[0] + 0x200 + 
(i >> 1);
-   gpu_resources.sdma_doorbell[1][i] =
-   adev->doorbell_index.sdma_engine[1] + (i >> 1);
-   gpu_resources.sdma_doorbell[1][i+1] =
-   adev->doorbell_index.sdma_engine[1] + 0x200 + 
(i >> 1);
-   }
-
/* Since SOC15, BIF starts to statically use the
 * lower 12 bits of doorbell addresses for routing
 * based on settings in registers like
@@ -205,10 +182,12 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
 * 12 bits of its address has to be outside the range
 * set for SDMA, VCN, and IH blocks.
 */
-   gpu_resources.non_cp_doorbells_start =
-   adev->doorbell_index.sdma_engine[0];
-   gpu_resources.non_cp_doorbells_end =
-   adev->doorbell_index.last_non_cp;
+   if (adev->asic_type >= CHIP_VEGA10) {
+   gpu_resources.non_cp_doorbells_start =
+   adev->doorbell_index.sdma_engine[0];
+   gpu_resources.non_cp_doorbells_end =
+   adev->doorbell_index.last_non_cp;
+   }
 
kgd2kfd_device_init(adev->kfd.dev, _resources);
}
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 8372556b52eb..c6c9530e704e 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -134,12 +134,18 @@ static int allocate_doorbell(struct qcm_process_device 
*qpd, struct queue *q)
 */
q->doorbell_id = q->properties.queue_id;
} else if (q->properties.type == KFD_QUEUE_TYPE_SDMA) {
-   /* For SDMA queues on SOC15, use static doorbell
-* assignments based on the engine and queue.
+   /* For SDMA queues on SOC15 

[PATCH 1/4] drm/amdkfd: Move a constant definition around

2019-02-13 Thread Zhao, Yong
The similar definitions should be consecutive.

Change-Id: I936cf076363e641c60e0704d8405ae9493718e18
Signed-off-by: Yong Zhao 
---
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index 12b66330fc6d..e5ebcca7f031 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -97,17 +97,18 @@
 #define KFD_CWSR_TBA_TMA_SIZE (PAGE_SIZE * 2)
 #define KFD_CWSR_TMA_OFFSET PAGE_SIZE
 
+#define KFD_MAX_NUM_OF_QUEUES_PER_DEVICE   \
+   (KFD_MAX_NUM_OF_PROCESSES * \
+   KFD_MAX_NUM_OF_QUEUES_PER_PROCESS)
+
+#define KFD_KERNEL_QUEUE_SIZE 2048
+
 /*
  * Kernel module parameter to specify maximum number of supported queues per
  * device
  */
 extern int max_num_of_queues_per_device;
 
-#define KFD_MAX_NUM_OF_QUEUES_PER_DEVICE   \
-   (KFD_MAX_NUM_OF_PROCESSES * \
-   KFD_MAX_NUM_OF_QUEUES_PER_PROCESS)
-
-#define KFD_KERNEL_QUEUE_SIZE 2048
 
 /* Kernel module parameter to specify the scheduling policy */
 extern int sched_policy;
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 3/4] drm/amdkfd: Fix bugs regarding CP queue doorbell mask on SOC15

2019-02-13 Thread Zhao, Yong
Reserved doorbells for SDMA IH and VCN were not properly masked out
when allocating doorbells for CP user queues. This patch fixed that.

Change-Id: I670adfc3fd7725d2ed0bd9665cb7f69f8b9023c2
Signed-off-by: Yong Zhao 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c  | 16 
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h   | 11 +++
 drivers/gpu/drm/amd/amdkfd/kfd_process.c| 14 +-
 drivers/gpu/drm/amd/include/kgd_kfd_interface.h | 15 ++-
 4 files changed, 38 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index e957e42c539a..a8a166fff1e3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -196,11 +196,19 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
gpu_resources.sdma_doorbell[1][i+1] =
adev->doorbell_index.sdma_engine[1] + 0x200 + 
(i >> 1);
}
-   /* Doorbells 0x0e0-0ff and 0x2e0-2ff are reserved for
-* SDMA, IH and VCN. So don't use them for the CP.
+
+   /* Since SOC15, BIF starts to statically use the
+* lower 12 bits of doorbell addresses for routing
+* based on settings in registers like
+* SDMA0_DOORBELL_RANGE etc..
+* In order to route a doorbell to CP engine, the lower
+* 12 bits of its address has to be outside the range
+* set for SDMA, VCN, and IH blocks.
 */
-   gpu_resources.reserved_doorbell_mask = 0x1e0;
-   gpu_resources.reserved_doorbell_val  = 0x0e0;
+   gpu_resources.non_cp_doorbells_start =
+   adev->doorbell_index.sdma_engine[0];
+   gpu_resources.non_cp_doorbells_end =
+   adev->doorbell_index.last_non_cp;
 
kgd2kfd_device_init(adev->kfd.dev, _resources);
}
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index e5ebcca7f031..03c6d6dc 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -103,6 +103,17 @@
 
 #define KFD_KERNEL_QUEUE_SIZE 2048
 
+/*
+ * 512 = 0x200
+ * The doorbell index distance between SDMA RLC (2*i) and (2*i+1) in the
+ * same SDMA engine on SOC15, which has 8-byte doorbells for SDMA.
+ * 512 8-byte doorbell distance (i.e. one page away) ensures that SDMA RLC
+ * (2*i+1) doorbells (in terms of the lower 12 bit address) lie exactly in
+ * the OFFSET and SIZE set in registers like BIF_SDMA0_DOORBELL_RANGE.
+ */
+#define KFD_QUEUE_DOORBELL_MIRROR_OFFSET 512
+
+
 /*
  * Kernel module parameter to specify maximum number of supported queues per
  * device
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index 80b36e860a0a..4bdae78bab8e 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -607,13 +607,17 @@ static int init_doorbell_bitmap(struct qcm_process_device 
*qpd,
if (!qpd->doorbell_bitmap)
return -ENOMEM;
 
-   /* Mask out any reserved doorbells */
-   for (i = 0; i < KFD_MAX_NUM_OF_QUEUES_PER_PROCESS; i++)
-   if ((dev->shared_resources.reserved_doorbell_mask & i) ==
-   dev->shared_resources.reserved_doorbell_val) {
+   /* Mask out doorbells reserved for SDMA, IH, and VCN on SOC15. */
+   for (i = 0; i < KFD_MAX_NUM_OF_QUEUES_PER_PROCESS / 2; i++) {
+   if (i >= dev->shared_resources.non_cp_doorbells_start
+   && i <= dev->shared_resources.non_cp_doorbells_end) {
set_bit(i, qpd->doorbell_bitmap);
-   pr_debug("reserved doorbell 0x%03x\n", i);
+   set_bit(i + KFD_QUEUE_DOORBELL_MIRROR_OFFSET,
+   qpd->doorbell_bitmap);
+   pr_debug("reserved doorbell 0x%03x and 0x%03x\n", i,
+   i + KFD_QUEUE_DOORBELL_MIRROR_OFFSET);
}
+   }
 
return 0;
 }
diff --git a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h 
b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
index 83d960110d23..0b6b34f4e5a1 100644
--- a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
+++ b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
@@ -140,17 +140,14 @@ struct kgd2kfd_shared_resources {
/* Doorbell assignments (SOC15 and later chips only). Only
 * specific doorbells are routed to each SDMA engine. Others
 * are routed to IH and VCN. They are not usable by the CP.
-*
-* Any doorbell number D that satisfies the following condition
-* is reserved: (D & reserved_doorbell_mask) == reserved_doorbell_val
-*
-* KFD currently uses 1024 (= 

Re: [PATCH 2/3] drm: Add basic helper to allow precise pageflip timestamps in vrr.

2019-02-13 Thread Mario Kleiner via amd-gfx
On Wed, Feb 13, 2019 at 5:03 PM Daniel Vetter  wrote:
>
> On Wed, Feb 13, 2019 at 4:46 PM Kazlauskas, Nicholas
>  wrote:
> >
> > On 2/13/19 10:14 AM, Daniel Vetter wrote:
> > > On Wed, Feb 13, 2019 at 3:33 PM Kazlauskas, Nicholas
> > >  wrote:
> > >>
> > >> On 2/13/19 4:50 AM, Daniel Vetter wrote:
> > >>> On Tue, Feb 12, 2019 at 10:32:31PM +0100, Mario Kleiner wrote:
> >  On Mon, Feb 11, 2019 at 6:04 PM Daniel Vetter  wrote:
> > >
> > > On Mon, Feb 11, 2019 at 4:01 PM Kazlauskas, Nicholas
> > >  wrote:
> > >>
> > >> On 2/11/19 3:35 AM, Daniel Vetter wrote:
> > >>> On Mon, Feb 11, 2019 at 04:22:24AM +0100, Mario Kleiner wrote:
> >  The pageflip completion timestamps transmitted to userspace
> >  via pageflip completion events are supposed to describe the
> >  time at which the first pixel of the new post-pageflip scanout
> >  buffer leaves the video output of the gpu. This time is
> >  identical to end of vblank, when active scanout starts.
> > 
> >  For a crtc in standard fixed refresh rate, the end of vblank
> >  is identical to the vblank timestamps calculated by
> >  drm_update_vblank_count() at each vblank interrupt, or each
> >  vblank dis-/enable. Therefore pageflip events just carry
> >  that vblank timestamp as their pageflip timestamp.
> > 
> >  For a crtc switched to variable refresh rate mode (vrr), the
> >  pageflip completion timestamps are identical to the vblank
> >  timestamps iff the pageflip was executed early in vblank,
> >  before the minimum vblank duration elapsed. In this case
> >  the time of display onset is identical to when the crtc
> >  is running in fixed refresh rate.
> > 
> >  However, if a pageflip completes later in the vblank, inside
> >  the "extended front porch" in vrr mode, then the vblank will
> >  terminate at a fixed (back porch) duration after flip, so
> >  the display onset time is delayed correspondingly. In this
> >  case the vblank timestamp computed at vblank irq time would
> >  be too early, and we need a way to calculate an estimated
> >  pageflip timestamp that will be later than the vblank timestamp.
> > 
> >  How a driver determines such a "late flip" timestamp is hw
> >  and driver specific, but this patch adds a new helper function
> >  that allows the driver to propose such an alternate "late flip"
> >  timestamp for use in pageflip events:
> > 
> >  drm_crtc_set_vrr_pageflip_timestamp(crtc, flip_timestamp);
> > 
> >  When sending out pageflip events, we now compare that proposed
> >  flip_timestamp against the vblank timestamp of the current
> >  vblank of flip completion and choose to send out the greater/
> >  later timestamp as flip completion timestamp.
> > 
> >  The most simple way for a kms driver to supply a suitable
> >  flip_timestamp in vrr mode would be to simply take a timestamp
> >  at start of the pageflip completion handler, e.g., pageflip
> >  irq handler: flip_timestamp = ktime_get(); and then set that
> >  as proposed "late" alternative timestamp via ...
> >  drm_crtc_set_vrr_pageflip_timestamp(crtc, flip_timestamp);
> > 
> >  More clever approaches could try to add some corrective offset
> >  for fixed back porch duration, or ideally use hardware features
> >  like hw timestamps to calculate the exact end time of vblank.
> > 
> >  Signed-off-by: Mario Kleiner 
> >  Cc: Nicholas Kazlauskas 
> >  Cc: Harry Wentland 
> >  Cc: Alex Deucher 
> > >>>
> > >>> Uh, this looks like a pretty bad hack. Can't we fix amdgpu to only 
> > >>> give us
> > >>> the right timestampe, once? With this I guess if you do a vblank 
> > >>> query in
> > >>> between the wrong and the right vblank you'll get the bogus value. 
> > >>> Not
> > >>> really great for userspace.
> > >>> -Daniel
> > >>
> > >> I think we calculate the timestamp and send the vblank event both 
> > >> within
> > >> the pageflip IRQ handler so calculating the right pageflip timestamp
> > >> once could probably be done. I'm not sure if it's easier than 
> > >> proposing
> > >> a later flip time with an API like this though.
> > >>
> > >> The actual scanout time should be known from the page-flip handler so
> > >> the semantics for VRR on/off remain the same. This is because the
> > >> page-flip triggers entering the back porch if we're in the extended
> > >> front porch.
> > >>
> > >> But scanout time from vblank events for something like
> > >> DRM_IOCTL_WAIT_VBLANK are going to be wrong in most cases and are 
> > >> only
> > >> 

Re: [PATCH 2/2] drm/amdgpu: Delete user queue doorbell variables

2019-02-13 Thread Zhao, Yong
Pushed. Thanks.

Yong

On 2019-02-08 5:09 p.m., Kuehling, Felix wrote:
> The series is Reviewed-by: Felix Kuehling 
>
> On 2019-02-07 5:23 p.m., Zhao, Yong wrote:
>> They are no longer used, so delete them to avoid confusion.
>>
>> Change-Id: I3cf23fe7110ff88f53c0c279b2b4ec8d1a53b87c
>> Signed-off-by: Yong Zhao 
>> ---
>>drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h | 8 
>>drivers/gpu/drm/amd/amdgpu/vega10_reg_init.c | 2 --
>>drivers/gpu/drm/amd/amdgpu/vega20_reg_init.c | 2 --
>>3 files changed, 12 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h
>> index 4de431f7f380..4c877e57ba97 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h
>> @@ -48,8 +48,6 @@ struct amdgpu_doorbell_index {
>>  uint32_t mec_ring5;
>>  uint32_t mec_ring6;
>>  uint32_t mec_ring7;
>> -uint32_t userqueue_start;
>> -uint32_t userqueue_end;
>>  uint32_t gfx_ring0;
>>  uint32_t sdma_engine[8];
>>  uint32_t ih;
>> @@ -112,8 +110,6 @@ typedef enum _AMDGPU_VEGA20_DOORBELL_ASSIGNMENT
>>  AMDGPU_VEGA20_DOORBELL_MEC_RING5   = 0x008,
>>  AMDGPU_VEGA20_DOORBELL_MEC_RING6   = 0x009,
>>  AMDGPU_VEGA20_DOORBELL_MEC_RING7   = 0x00A,
>> -AMDGPU_VEGA20_DOORBELL_USERQUEUE_START = 0x00B,
>> -AMDGPU_VEGA20_DOORBELL_USERQUEUE_END   = 0x08A,
>>  AMDGPU_VEGA20_DOORBELL_GFX_RING0   = 0x08B,
>>  /* SDMA:256~335*/
>>  AMDGPU_VEGA20_DOORBELL_sDMA_ENGINE0= 0x100,
>> @@ -178,10 +174,6 @@ typedef enum _AMDGPU_DOORBELL64_ASSIGNMENT
>>  AMDGPU_DOORBELL64_MEC_RING6   = 0x09,
>>  AMDGPU_DOORBELL64_MEC_RING7   = 0x0a,
>>
>> -/* User queue doorbell range (128 doorbells) */
>> -AMDGPU_DOORBELL64_USERQUEUE_START = 0x0b,
>> -AMDGPU_DOORBELL64_USERQUEUE_END   = 0x8a,
>> -
>>  /* Graphics engine */
>>  AMDGPU_DOORBELL64_GFX_RING0   = 0x8b,
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/vega10_reg_init.c 
>> b/drivers/gpu/drm/amd/amdgpu/vega10_reg_init.c
>> index fa0433199215..ffe0e0593207 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/vega10_reg_init.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/vega10_reg_init.c
>> @@ -67,8 +67,6 @@ void vega10_doorbell_index_init(struct amdgpu_device *adev)
>>  adev->doorbell_index.mec_ring5 = AMDGPU_DOORBELL64_MEC_RING5;
>>  adev->doorbell_index.mec_ring6 = AMDGPU_DOORBELL64_MEC_RING6;
>>  adev->doorbell_index.mec_ring7 = AMDGPU_DOORBELL64_MEC_RING7;
>> -adev->doorbell_index.userqueue_start = 
>> AMDGPU_DOORBELL64_USERQUEUE_START;
>> -adev->doorbell_index.userqueue_end = AMDGPU_DOORBELL64_USERQUEUE_END;
>>  adev->doorbell_index.gfx_ring0 = AMDGPU_DOORBELL64_GFX_RING0;
>>  adev->doorbell_index.sdma_engine[0] = AMDGPU_DOORBELL64_sDMA_ENGINE0;
>>  adev->doorbell_index.sdma_engine[1] = AMDGPU_DOORBELL64_sDMA_ENGINE1;
>> diff --git a/drivers/gpu/drm/amd/amdgpu/vega20_reg_init.c 
>> b/drivers/gpu/drm/amd/amdgpu/vega20_reg_init.c
>> index b1052caaff5e..700ff8aec999 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/vega20_reg_init.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/vega20_reg_init.c
>> @@ -65,8 +65,6 @@ void vega20_doorbell_index_init(struct amdgpu_device *adev)
>>  adev->doorbell_index.mec_ring5 = AMDGPU_VEGA20_DOORBELL_MEC_RING5;
>>  adev->doorbell_index.mec_ring6 = AMDGPU_VEGA20_DOORBELL_MEC_RING6;
>>  adev->doorbell_index.mec_ring7 = AMDGPU_VEGA20_DOORBELL_MEC_RING7;
>> -adev->doorbell_index.userqueue_start = 
>> AMDGPU_VEGA20_DOORBELL_USERQUEUE_START;
>> -adev->doorbell_index.userqueue_end = 
>> AMDGPU_VEGA20_DOORBELL_USERQUEUE_END;
>>  adev->doorbell_index.gfx_ring0 = AMDGPU_VEGA20_DOORBELL_GFX_RING0;
>>  adev->doorbell_index.sdma_engine[0] = 
>> AMDGPU_VEGA20_DOORBELL_sDMA_ENGINE0;
>>  adev->doorbell_index.sdma_engine[1] = 
>> AMDGPU_VEGA20_DOORBELL_sDMA_ENGINE1;
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH 2/3] drm: Add basic helper to allow precise pageflip timestamps in vrr.

2019-02-13 Thread Daniel Vetter
On Wed, Feb 13, 2019 at 4:46 PM Kazlauskas, Nicholas
 wrote:
>
> On 2/13/19 10:14 AM, Daniel Vetter wrote:
> > On Wed, Feb 13, 2019 at 3:33 PM Kazlauskas, Nicholas
> >  wrote:
> >>
> >> On 2/13/19 4:50 AM, Daniel Vetter wrote:
> >>> On Tue, Feb 12, 2019 at 10:32:31PM +0100, Mario Kleiner wrote:
>  On Mon, Feb 11, 2019 at 6:04 PM Daniel Vetter  wrote:
> >
> > On Mon, Feb 11, 2019 at 4:01 PM Kazlauskas, Nicholas
> >  wrote:
> >>
> >> On 2/11/19 3:35 AM, Daniel Vetter wrote:
> >>> On Mon, Feb 11, 2019 at 04:22:24AM +0100, Mario Kleiner wrote:
>  The pageflip completion timestamps transmitted to userspace
>  via pageflip completion events are supposed to describe the
>  time at which the first pixel of the new post-pageflip scanout
>  buffer leaves the video output of the gpu. This time is
>  identical to end of vblank, when active scanout starts.
> 
>  For a crtc in standard fixed refresh rate, the end of vblank
>  is identical to the vblank timestamps calculated by
>  drm_update_vblank_count() at each vblank interrupt, or each
>  vblank dis-/enable. Therefore pageflip events just carry
>  that vblank timestamp as their pageflip timestamp.
> 
>  For a crtc switched to variable refresh rate mode (vrr), the
>  pageflip completion timestamps are identical to the vblank
>  timestamps iff the pageflip was executed early in vblank,
>  before the minimum vblank duration elapsed. In this case
>  the time of display onset is identical to when the crtc
>  is running in fixed refresh rate.
> 
>  However, if a pageflip completes later in the vblank, inside
>  the "extended front porch" in vrr mode, then the vblank will
>  terminate at a fixed (back porch) duration after flip, so
>  the display onset time is delayed correspondingly. In this
>  case the vblank timestamp computed at vblank irq time would
>  be too early, and we need a way to calculate an estimated
>  pageflip timestamp that will be later than the vblank timestamp.
> 
>  How a driver determines such a "late flip" timestamp is hw
>  and driver specific, but this patch adds a new helper function
>  that allows the driver to propose such an alternate "late flip"
>  timestamp for use in pageflip events:
> 
>  drm_crtc_set_vrr_pageflip_timestamp(crtc, flip_timestamp);
> 
>  When sending out pageflip events, we now compare that proposed
>  flip_timestamp against the vblank timestamp of the current
>  vblank of flip completion and choose to send out the greater/
>  later timestamp as flip completion timestamp.
> 
>  The most simple way for a kms driver to supply a suitable
>  flip_timestamp in vrr mode would be to simply take a timestamp
>  at start of the pageflip completion handler, e.g., pageflip
>  irq handler: flip_timestamp = ktime_get(); and then set that
>  as proposed "late" alternative timestamp via ...
>  drm_crtc_set_vrr_pageflip_timestamp(crtc, flip_timestamp);
> 
>  More clever approaches could try to add some corrective offset
>  for fixed back porch duration, or ideally use hardware features
>  like hw timestamps to calculate the exact end time of vblank.
> 
>  Signed-off-by: Mario Kleiner 
>  Cc: Nicholas Kazlauskas 
>  Cc: Harry Wentland 
>  Cc: Alex Deucher 
> >>>
> >>> Uh, this looks like a pretty bad hack. Can't we fix amdgpu to only 
> >>> give us
> >>> the right timestampe, once? With this I guess if you do a vblank 
> >>> query in
> >>> between the wrong and the right vblank you'll get the bogus value. Not
> >>> really great for userspace.
> >>> -Daniel
> >>
> >> I think we calculate the timestamp and send the vblank event both 
> >> within
> >> the pageflip IRQ handler so calculating the right pageflip timestamp
> >> once could probably be done. I'm not sure if it's easier than proposing
> >> a later flip time with an API like this though.
> >>
> >> The actual scanout time should be known from the page-flip handler so
> >> the semantics for VRR on/off remain the same. This is because the
> >> page-flip triggers entering the back porch if we're in the extended
> >> front porch.
> >>
> >> But scanout time from vblank events for something like
> >> DRM_IOCTL_WAIT_VBLANK are going to be wrong in most cases and are only
> >> treated as estimates. If we're in the regular front porch then the
> >> timing to scanout is based on the fixed duration front porch for the
> >> current mode. If we're in the extended back porch then it's technically
> >> driver defined but the most reasonable guess is 

Re: [PATCH 2/3] drm: Add basic helper to allow precise pageflip timestamps in vrr.

2019-02-13 Thread Kazlauskas, Nicholas
On 2/13/19 10:14 AM, Daniel Vetter wrote:
> On Wed, Feb 13, 2019 at 3:33 PM Kazlauskas, Nicholas
>  wrote:
>>
>> On 2/13/19 4:50 AM, Daniel Vetter wrote:
>>> On Tue, Feb 12, 2019 at 10:32:31PM +0100, Mario Kleiner wrote:
 On Mon, Feb 11, 2019 at 6:04 PM Daniel Vetter  wrote:
>
> On Mon, Feb 11, 2019 at 4:01 PM Kazlauskas, Nicholas
>  wrote:
>>
>> On 2/11/19 3:35 AM, Daniel Vetter wrote:
>>> On Mon, Feb 11, 2019 at 04:22:24AM +0100, Mario Kleiner wrote:
 The pageflip completion timestamps transmitted to userspace
 via pageflip completion events are supposed to describe the
 time at which the first pixel of the new post-pageflip scanout
 buffer leaves the video output of the gpu. This time is
 identical to end of vblank, when active scanout starts.

 For a crtc in standard fixed refresh rate, the end of vblank
 is identical to the vblank timestamps calculated by
 drm_update_vblank_count() at each vblank interrupt, or each
 vblank dis-/enable. Therefore pageflip events just carry
 that vblank timestamp as their pageflip timestamp.

 For a crtc switched to variable refresh rate mode (vrr), the
 pageflip completion timestamps are identical to the vblank
 timestamps iff the pageflip was executed early in vblank,
 before the minimum vblank duration elapsed. In this case
 the time of display onset is identical to when the crtc
 is running in fixed refresh rate.

 However, if a pageflip completes later in the vblank, inside
 the "extended front porch" in vrr mode, then the vblank will
 terminate at a fixed (back porch) duration after flip, so
 the display onset time is delayed correspondingly. In this
 case the vblank timestamp computed at vblank irq time would
 be too early, and we need a way to calculate an estimated
 pageflip timestamp that will be later than the vblank timestamp.

 How a driver determines such a "late flip" timestamp is hw
 and driver specific, but this patch adds a new helper function
 that allows the driver to propose such an alternate "late flip"
 timestamp for use in pageflip events:

 drm_crtc_set_vrr_pageflip_timestamp(crtc, flip_timestamp);

 When sending out pageflip events, we now compare that proposed
 flip_timestamp against the vblank timestamp of the current
 vblank of flip completion and choose to send out the greater/
 later timestamp as flip completion timestamp.

 The most simple way for a kms driver to supply a suitable
 flip_timestamp in vrr mode would be to simply take a timestamp
 at start of the pageflip completion handler, e.g., pageflip
 irq handler: flip_timestamp = ktime_get(); and then set that
 as proposed "late" alternative timestamp via ...
 drm_crtc_set_vrr_pageflip_timestamp(crtc, flip_timestamp);

 More clever approaches could try to add some corrective offset
 for fixed back porch duration, or ideally use hardware features
 like hw timestamps to calculate the exact end time of vblank.

 Signed-off-by: Mario Kleiner 
 Cc: Nicholas Kazlauskas 
 Cc: Harry Wentland 
 Cc: Alex Deucher 
>>>
>>> Uh, this looks like a pretty bad hack. Can't we fix amdgpu to only give 
>>> us
>>> the right timestampe, once? With this I guess if you do a vblank query 
>>> in
>>> between the wrong and the right vblank you'll get the bogus value. Not
>>> really great for userspace.
>>> -Daniel
>>
>> I think we calculate the timestamp and send the vblank event both within
>> the pageflip IRQ handler so calculating the right pageflip timestamp
>> once could probably be done. I'm not sure if it's easier than proposing
>> a later flip time with an API like this though.
>>
>> The actual scanout time should be known from the page-flip handler so
>> the semantics for VRR on/off remain the same. This is because the
>> page-flip triggers entering the back porch if we're in the extended
>> front porch.
>>
>> But scanout time from vblank events for something like
>> DRM_IOCTL_WAIT_VBLANK are going to be wrong in most cases and are only
>> treated as estimates. If we're in the regular front porch then the
>> timing to scanout is based on the fixed duration front porch for the
>> current mode. If we're in the extended back porch then it's technically
>> driver defined but the most reasonable guess is to assume that the front
>> porch is going to end at any moment, so just return the length of the
>> back porch for getting the scanout time.
>>
>> Proposing the late timestamp shouldn't affect vblank event in the
>> 

Re: [PATCH 3/3] drm/dsc: Change infoframe_pack to payload_pack

2019-02-13 Thread Wentland, Harry
On 2019-02-13 9:45 a.m., David Francis wrote:
> The function drm_dsc_pps_infoframe_pack only
> packed the payload portion of the infoframe.
> Change the input struct to the PPS payload
> to clarify the function's purpose and allow
> for drivers with their own handling of sdp.
> (e.g. drivers with their own struct for
> all SDP transactions)
> 
> Signed-off-by: David Francis 

Reviewed-by: Harry Wentland 

Again, ideally we'd want an AB from i915 guys as well.

Harry

> ---
>  drivers/gpu/drm/drm_dsc.c | 86 +++
>  drivers/gpu/drm/i915/intel_vdsc.c |  2 +-
>  include/drm/drm_dsc.h |  2 +-
>  3 files changed, 45 insertions(+), 45 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_dsc.c b/drivers/gpu/drm/drm_dsc.c
> index 9e675dd39a44..4ada4d4f59ac 100644
> --- a/drivers/gpu/drm/drm_dsc.c
> +++ b/drivers/gpu/drm/drm_dsc.c
> @@ -38,42 +38,42 @@ void drm_dsc_dp_pps_header_init(struct 
> drm_dsc_pps_infoframe *pps_sdp)
>  EXPORT_SYMBOL(drm_dsc_dp_pps_header_init);
>  
>  /**
> - * drm_dsc_pps_infoframe_pack() - Populates the DSC PPS infoframe
> + * drm_dsc_pps_payload_pack() - Populates the DSC PPS payload
>   * using the DSC configuration parameters in the order expected
>   * by the DSC Display Sink device. For the DSC, the sink device
>   * expects the PPS payload in the big endian format for the fields
>   * that span more than 1 byte.
>   *
> - * @pps_sdp:
> - * Secondary data packet for DSC Picture Parameter Set
> + * @pps_payload:
> + * DSC Picture Parameter Set
>   * @dsc_cfg:
>   * DSC Configuration data filled by driver
>   */
> -void drm_dsc_pps_infoframe_pack(struct drm_dsc_pps_infoframe *pps_sdp,
> +void drm_dsc_pps_payload_pack(struct drm_dsc_picture_parameter_set 
> *pps_payload,
>   const struct drm_dsc_config *dsc_cfg)
>  {
>   int i;
>  
>   /* Protect against someone accidently changing struct size */
> - BUILD_BUG_ON(sizeof(pps_sdp->pps_payload) !=
> + BUILD_BUG_ON(sizeof(*pps_payload) !=
>DP_SDP_PPS_HEADER_PAYLOAD_BYTES_MINUS_1 + 1);
>  
> - memset(_sdp->pps_payload, 0, sizeof(pps_sdp->pps_payload));
> + memset(pps_payload, 0, sizeof(*pps_payload));
>  
>   /* PPS 0 */
> - pps_sdp->pps_payload.dsc_version =
> + pps_payload->dsc_version =
>   dsc_cfg->dsc_version_minor |
>   dsc_cfg->dsc_version_major << DSC_PPS_VERSION_MAJOR_SHIFT;
>  
>   /* PPS 1, 2 is 0 */
>  
>   /* PPS 3 */
> - pps_sdp->pps_payload.pps_3 =
> + pps_payload->pps_3 =
>   dsc_cfg->line_buf_depth |
>   dsc_cfg->bits_per_component << DSC_PPS_BPC_SHIFT;
>  
>   /* PPS 4 */
> - pps_sdp->pps_payload.pps_4 =
> + pps_payload->pps_4 =
>   ((dsc_cfg->bits_per_pixel & DSC_PPS_BPP_HIGH_MASK) >>
>DSC_PPS_MSB_SHIFT) |
>   dsc_cfg->vbr_enable << DSC_PPS_VBR_EN_SHIFT |
> @@ -82,7 +82,7 @@ void drm_dsc_pps_infoframe_pack(struct 
> drm_dsc_pps_infoframe *pps_sdp,
>   dsc_cfg->block_pred_enable << DSC_PPS_BLOCK_PRED_EN_SHIFT;
>  
>   /* PPS 5 */
> - pps_sdp->pps_payload.bits_per_pixel_low =
> + pps_payload->bits_per_pixel_low =
>   (dsc_cfg->bits_per_pixel & DSC_PPS_LSB_MASK);
>  
>   /*
> @@ -93,103 +93,103 @@ void drm_dsc_pps_infoframe_pack(struct 
> drm_dsc_pps_infoframe *pps_sdp,
>*/
>  
>   /* PPS 6, 7 */
> - pps_sdp->pps_payload.pic_height = cpu_to_be16(dsc_cfg->pic_height);
> + pps_payload->pic_height = cpu_to_be16(dsc_cfg->pic_height);
>  
>   /* PPS 8, 9 */
> - pps_sdp->pps_payload.pic_width = cpu_to_be16(dsc_cfg->pic_width);
> + pps_payload->pic_width = cpu_to_be16(dsc_cfg->pic_width);
>  
>   /* PPS 10, 11 */
> - pps_sdp->pps_payload.slice_height = cpu_to_be16(dsc_cfg->slice_height);
> + pps_payload->slice_height = cpu_to_be16(dsc_cfg->slice_height);
>  
>   /* PPS 12, 13 */
> - pps_sdp->pps_payload.slice_width = cpu_to_be16(dsc_cfg->slice_width);
> + pps_payload->slice_width = cpu_to_be16(dsc_cfg->slice_width);
>  
>   /* PPS 14, 15 */
> - pps_sdp->pps_payload.chunk_size = 
> cpu_to_be16(dsc_cfg->slice_chunk_size);
> + pps_payload->chunk_size = cpu_to_be16(dsc_cfg->slice_chunk_size);
>  
>   /* PPS 16 */
> - pps_sdp->pps_payload.initial_xmit_delay_high =
> + pps_payload->initial_xmit_delay_high =
>   ((dsc_cfg->initial_xmit_delay &
> DSC_PPS_INIT_XMIT_DELAY_HIGH_MASK) >>
>DSC_PPS_MSB_SHIFT);
>  
>   /* PPS 17 */
> - pps_sdp->pps_payload.initial_xmit_delay_low =
> + pps_payload->initial_xmit_delay_low =
>   (dsc_cfg->initial_xmit_delay & DSC_PPS_LSB_MASK);
>  
>   /* PPS 18, 19 */
> - pps_sdp->pps_payload.initial_dec_delay =
> + pps_payload->initial_dec_delay =
>   cpu_to_be16(dsc_cfg->initial_dec_delay);
>  
>   /* PPS 20 is 0 */
>  
>   /* PPS 21 */
> - 

Re: [PATCH 2/3] drm/dsc: Add native 420 and 422 support to compute_rc_params

2019-02-13 Thread Wentland, Harry
On 2019-02-13 9:45 a.m., David Francis wrote:
> Native 420 and 422 transfer modes are new in DSC1.2
> 
> In these modes, each two pixels of a slice are treated as one
> pixel, so the slice width is half as large (round down) for
> the purposes of calucating the groups per line and chunk size
> in bytes
> 
> In native 422 mode, each pixel has four components, so the
> mux component of a group is larger by one additional mux word
> and one additional component
> 
> Now that there is native 422 support, the configuration option
> previously called enable422 is renamed to simple_422 to avoid
> confusion
> 
> Signed-off-by: David Francis 

Reviewed-by: Harry Wentland 

Harry

> ---
>  drivers/gpu/drm/drm_dsc.c | 31 +++
>  drivers/gpu/drm/i915/intel_vdsc.c |  4 ++--
>  include/drm/drm_dsc.h |  4 ++--
>  3 files changed, 27 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_dsc.c b/drivers/gpu/drm/drm_dsc.c
> index 4b0e3c9c3ff8..9e675dd39a44 100644
> --- a/drivers/gpu/drm/drm_dsc.c
> +++ b/drivers/gpu/drm/drm_dsc.c
> @@ -77,7 +77,7 @@ void drm_dsc_pps_infoframe_pack(struct 
> drm_dsc_pps_infoframe *pps_sdp,
>   ((dsc_cfg->bits_per_pixel & DSC_PPS_BPP_HIGH_MASK) >>
>DSC_PPS_MSB_SHIFT) |
>   dsc_cfg->vbr_enable << DSC_PPS_VBR_EN_SHIFT |
> - dsc_cfg->enable422 << DSC_PPS_SIMPLE422_SHIFT |
> + dsc_cfg->simple_422 << DSC_PPS_SIMPLE422_SHIFT |
>   dsc_cfg->convert_rgb << DSC_PPS_CONVERT_RGB_SHIFT |
>   dsc_cfg->block_pred_enable << DSC_PPS_BLOCK_PRED_EN_SHIFT;
>  
> @@ -246,19 +246,34 @@ int drm_dsc_compute_rc_parameters(struct drm_dsc_config 
> *vdsc_cfg)
>   unsigned long final_scale = 0;
>   unsigned long rbs_min = 0;
>  
> - /* Number of groups used to code each line of a slice */
> - groups_per_line = DIV_ROUND_UP(vdsc_cfg->slice_width,
> -DSC_RC_PIXELS_PER_GROUP);
> + if (vdsc_cfg->native_420 || vdsc_cfg->native_422) {
> + /* Number of groups used to code each line of a slice */
> + groups_per_line = DIV_ROUND_UP(vdsc_cfg->slice_width / 2,
> +DSC_RC_PIXELS_PER_GROUP);
>  
> - /* chunksize in Bytes */
> - vdsc_cfg->slice_chunk_size = DIV_ROUND_UP(vdsc_cfg->slice_width *
> -   vdsc_cfg->bits_per_pixel,
> -   (8 * 16));
> + /* chunksize in Bytes */
> + vdsc_cfg->slice_chunk_size = DIV_ROUND_UP(vdsc_cfg->slice_width 
> / 2 *
> +   
> vdsc_cfg->bits_per_pixel,
> +   (8 * 16));
> + } else {
> + /* Number of groups used to code each line of a slice */
> + groups_per_line = DIV_ROUND_UP(vdsc_cfg->slice_width,
> +DSC_RC_PIXELS_PER_GROUP);
> +
> + /* chunksize in Bytes */
> + vdsc_cfg->slice_chunk_size = DIV_ROUND_UP(vdsc_cfg->slice_width 
> *
> +   
> vdsc_cfg->bits_per_pixel,
> +   (8 * 16));
> + }
>  
>   if (vdsc_cfg->convert_rgb)
>   num_extra_mux_bits = 3 * (vdsc_cfg->mux_word_size +
> (4 * vdsc_cfg->bits_per_component + 4)
> - 2);
> + else if (vdsc_cfg->native_422)
> + num_extra_mux_bits = 4 * vdsc_cfg->mux_word_size +
> + (4 * vdsc_cfg->bits_per_component + 4) +
> + 3 * (4 * vdsc_cfg->bits_per_component) - 2;
>   else
>   num_extra_mux_bits = 3 * vdsc_cfg->mux_word_size +
>   (4 * vdsc_cfg->bits_per_component + 4) +
> diff --git a/drivers/gpu/drm/i915/intel_vdsc.c 
> b/drivers/gpu/drm/i915/intel_vdsc.c
> index c76cec8bfb74..7702c5c8b3f2 100644
> --- a/drivers/gpu/drm/i915/intel_vdsc.c
> +++ b/drivers/gpu/drm/i915/intel_vdsc.c
> @@ -369,7 +369,7 @@ int intel_dp_compute_dsc_params(struct intel_dp *intel_dp,
>   DSC_1_1_MAX_LINEBUF_DEPTH_BITS : line_buf_depth;
>  
>   /* Gen 11 does not support YCbCr */
> - vdsc_cfg->enable422 = false;
> + vdsc_cfg->simple_422 = false;
>   /* Gen 11 does not support VBR */
>   vdsc_cfg->vbr_enable = false;
>   vdsc_cfg->block_pred_enable =
> @@ -496,7 +496,7 @@ static void intel_configure_pps_for_dsc_encoder(struct 
> intel_encoder *encoder,
>   pps_val |= DSC_BLOCK_PREDICTION;
>   if (vdsc_cfg->convert_rgb)
>   pps_val |= DSC_COLOR_SPACE_CONVERSION;
> - if (vdsc_cfg->enable422)
> + if (vdsc_cfg->simple_422)
>   pps_val |= DSC_422_ENABLE;
>   if (vdsc_cfg->vbr_enable)
>   pps_val |= DSC_VBR_ENABLE;
> 

Re: [PATCH 1/3] drm/i915: Move dsc rate params compute into drm

2019-02-13 Thread Wentland, Harry
On 2019-02-13 9:45 a.m., David Francis wrote:
> The function intel_compute_rc_parameters is part of the dsc spec
> and is not driver-specific. Other drm drivers might like to use
> it.  The function is not changed; just moved and renamed.
> 
> Signed-off-by: David Francis 

Reviewed-by: Harry Wentland 

This one also needs an RB or AB from i915 guys.

Harry

> ---
>  drivers/gpu/drm/drm_dsc.c | 133 ++
>  drivers/gpu/drm/i915/intel_vdsc.c | 125 +---
>  include/drm/drm_dsc.h |   1 +
>  3 files changed, 135 insertions(+), 124 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_dsc.c b/drivers/gpu/drm/drm_dsc.c
> index bc2b23adb072..4b0e3c9c3ff8 100644
> --- a/drivers/gpu/drm/drm_dsc.c
> +++ b/drivers/gpu/drm/drm_dsc.c
> @@ -11,6 +11,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  
> @@ -226,3 +227,135 @@ void drm_dsc_pps_infoframe_pack(struct 
> drm_dsc_pps_infoframe *pps_sdp,
>   /* PPS 94 - 127 are O */
>  }
>  EXPORT_SYMBOL(drm_dsc_pps_infoframe_pack);
> +
> +/**
> + * drm_dsc_compute_rc_parameters() - Write rate control
> + * parameters to the dsc configuration. Some configuration
> + * fields must be present beforehand.
> + *
> + * @dsc_cfg:
> + * DSC Configuration data partially filled by driver
> + */
> +int drm_dsc_compute_rc_parameters(struct drm_dsc_config *vdsc_cfg)
> +{
> + unsigned long groups_per_line = 0;
> + unsigned long groups_total = 0;
> + unsigned long num_extra_mux_bits = 0;
> + unsigned long slice_bits = 0;
> + unsigned long hrd_delay = 0;
> + unsigned long final_scale = 0;
> + unsigned long rbs_min = 0;
> +
> + /* Number of groups used to code each line of a slice */
> + groups_per_line = DIV_ROUND_UP(vdsc_cfg->slice_width,
> +DSC_RC_PIXELS_PER_GROUP);
> +
> + /* chunksize in Bytes */
> + vdsc_cfg->slice_chunk_size = DIV_ROUND_UP(vdsc_cfg->slice_width *
> +   vdsc_cfg->bits_per_pixel,
> +   (8 * 16));
> +
> + if (vdsc_cfg->convert_rgb)
> + num_extra_mux_bits = 3 * (vdsc_cfg->mux_word_size +
> +   (4 * vdsc_cfg->bits_per_component + 4)
> +   - 2);
> + else
> + num_extra_mux_bits = 3 * vdsc_cfg->mux_word_size +
> + (4 * vdsc_cfg->bits_per_component + 4) +
> + 2 * (4 * vdsc_cfg->bits_per_component) - 2;
> + /* Number of bits in one Slice */
> + slice_bits = 8 * vdsc_cfg->slice_chunk_size * vdsc_cfg->slice_height;
> +
> + while ((num_extra_mux_bits > 0) &&
> +((slice_bits - num_extra_mux_bits) % vdsc_cfg->mux_word_size))
> + num_extra_mux_bits--;
> +
> + if (groups_per_line < vdsc_cfg->initial_scale_value - 8)
> + vdsc_cfg->initial_scale_value = groups_per_line + 8;
> +
> + /* scale_decrement_interval calculation according to DSC spec 1.11 */
> + if (vdsc_cfg->initial_scale_value > 8)
> + vdsc_cfg->scale_decrement_interval = groups_per_line /
> + (vdsc_cfg->initial_scale_value - 8);
> + else
> + vdsc_cfg->scale_decrement_interval = 
> DSC_SCALE_DECREMENT_INTERVAL_MAX;
> +
> + vdsc_cfg->final_offset = vdsc_cfg->rc_model_size -
> + (vdsc_cfg->initial_xmit_delay *
> +  vdsc_cfg->bits_per_pixel + 8) / 16 + num_extra_mux_bits;
> +
> + if (vdsc_cfg->final_offset >= vdsc_cfg->rc_model_size) {
> + DRM_DEBUG_KMS("FinalOfs < RcModelSze for this 
> InitialXmitDelay\n");
> + return -ERANGE;
> + }
> +
> + final_scale = (vdsc_cfg->rc_model_size * 8) /
> + (vdsc_cfg->rc_model_size - vdsc_cfg->final_offset);
> + if (vdsc_cfg->slice_height > 1)
> + /*
> +  * NflBpgOffset is 16 bit value with 11 fractional bits
> +  * hence we multiply by 2^11 for preserving the
> +  * fractional part
> +  */
> + vdsc_cfg->nfl_bpg_offset = 
> DIV_ROUND_UP((vdsc_cfg->first_line_bpg_offset << 11),
> + (vdsc_cfg->slice_height 
> - 1));
> + else
> + vdsc_cfg->nfl_bpg_offset = 0;
> +
> + /* 2^16 - 1 */
> + if (vdsc_cfg->nfl_bpg_offset > 65535) {
> + DRM_DEBUG_KMS("NflBpgOffset is too large for this slice 
> height\n");
> + return -ERANGE;
> + }
> +
> + /* Number of groups used to code the entire slice */
> + groups_total = groups_per_line * vdsc_cfg->slice_height;
> +
> + /* slice_bpg_offset is 16 bit value with 11 fractional bits */
> + vdsc_cfg->slice_bpg_offset = DIV_ROUND_UP(((vdsc_cfg->rc_model_size -
> + vdsc_cfg->initial_offset +
> + 

Re: "ring gfx timeout" with Vega 64 on mesa 19.0.0-rc2 and kernel 5.0.0-rc6 (GPU reset still not works)

2019-02-13 Thread Grodzovsky, Andrey
OK, just apply the following to your amdgpu_dm_do_flip function and see 
if GPU reset does proceed after you experience the hang.

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index d59bafc..586301f 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -4809,7 +4809,7 @@ static void amdgpu_dm_do_flip(struct 
drm_atomic_state *state,

     /* Wait for all fences on this FB */
WARN_ON(reservation_object_wait_timeout_rcu(abo->tbo.resv, true, false,
- MAX_SCHEDULE_TIMEOUT) < 0);
+   msecs_to_jiffies(5000)) < 0);

     amdgpu_bo_get_tiling_flags(abo, _flags);

Andrey

On 2/13/19 8:59 AM, Mikhail Gavrilov wrote:
> On Wed, 13 Feb 2019 at 00:44, Grodzovsky, Andrey
>  wrote:
>> Sorry, for your kernel this particular set of prints should go in 
>> amdgpu_dm_do_flip
>>
>
> Kernel logs became very weird after yesterday patch.
> Too many messages even without reproducing  the issue which cause
> "ring gfx timeout".
>
> --
> Best Regards,
> Mike Gavrilov.
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH 2/3] drm: Add basic helper to allow precise pageflip timestamps in vrr.

2019-02-13 Thread Daniel Vetter
On Wed, Feb 13, 2019 at 3:33 PM Kazlauskas, Nicholas
 wrote:
>
> On 2/13/19 4:50 AM, Daniel Vetter wrote:
> > On Tue, Feb 12, 2019 at 10:32:31PM +0100, Mario Kleiner wrote:
> >> On Mon, Feb 11, 2019 at 6:04 PM Daniel Vetter  wrote:
> >>>
> >>> On Mon, Feb 11, 2019 at 4:01 PM Kazlauskas, Nicholas
> >>>  wrote:
> 
>  On 2/11/19 3:35 AM, Daniel Vetter wrote:
> > On Mon, Feb 11, 2019 at 04:22:24AM +0100, Mario Kleiner wrote:
> >> The pageflip completion timestamps transmitted to userspace
> >> via pageflip completion events are supposed to describe the
> >> time at which the first pixel of the new post-pageflip scanout
> >> buffer leaves the video output of the gpu. This time is
> >> identical to end of vblank, when active scanout starts.
> >>
> >> For a crtc in standard fixed refresh rate, the end of vblank
> >> is identical to the vblank timestamps calculated by
> >> drm_update_vblank_count() at each vblank interrupt, or each
> >> vblank dis-/enable. Therefore pageflip events just carry
> >> that vblank timestamp as their pageflip timestamp.
> >>
> >> For a crtc switched to variable refresh rate mode (vrr), the
> >> pageflip completion timestamps are identical to the vblank
> >> timestamps iff the pageflip was executed early in vblank,
> >> before the minimum vblank duration elapsed. In this case
> >> the time of display onset is identical to when the crtc
> >> is running in fixed refresh rate.
> >>
> >> However, if a pageflip completes later in the vblank, inside
> >> the "extended front porch" in vrr mode, then the vblank will
> >> terminate at a fixed (back porch) duration after flip, so
> >> the display onset time is delayed correspondingly. In this
> >> case the vblank timestamp computed at vblank irq time would
> >> be too early, and we need a way to calculate an estimated
> >> pageflip timestamp that will be later than the vblank timestamp.
> >>
> >> How a driver determines such a "late flip" timestamp is hw
> >> and driver specific, but this patch adds a new helper function
> >> that allows the driver to propose such an alternate "late flip"
> >> timestamp for use in pageflip events:
> >>
> >> drm_crtc_set_vrr_pageflip_timestamp(crtc, flip_timestamp);
> >>
> >> When sending out pageflip events, we now compare that proposed
> >> flip_timestamp against the vblank timestamp of the current
> >> vblank of flip completion and choose to send out the greater/
> >> later timestamp as flip completion timestamp.
> >>
> >> The most simple way for a kms driver to supply a suitable
> >> flip_timestamp in vrr mode would be to simply take a timestamp
> >> at start of the pageflip completion handler, e.g., pageflip
> >> irq handler: flip_timestamp = ktime_get(); and then set that
> >> as proposed "late" alternative timestamp via ...
> >> drm_crtc_set_vrr_pageflip_timestamp(crtc, flip_timestamp);
> >>
> >> More clever approaches could try to add some corrective offset
> >> for fixed back porch duration, or ideally use hardware features
> >> like hw timestamps to calculate the exact end time of vblank.
> >>
> >> Signed-off-by: Mario Kleiner 
> >> Cc: Nicholas Kazlauskas 
> >> Cc: Harry Wentland 
> >> Cc: Alex Deucher 
> >
> > Uh, this looks like a pretty bad hack. Can't we fix amdgpu to only give 
> > us
> > the right timestampe, once? With this I guess if you do a vblank query 
> > in
> > between the wrong and the right vblank you'll get the bogus value. Not
> > really great for userspace.
> > -Daniel
> 
>  I think we calculate the timestamp and send the vblank event both within
>  the pageflip IRQ handler so calculating the right pageflip timestamp
>  once could probably be done. I'm not sure if it's easier than proposing
>  a later flip time with an API like this though.
> 
>  The actual scanout time should be known from the page-flip handler so
>  the semantics for VRR on/off remain the same. This is because the
>  page-flip triggers entering the back porch if we're in the extended
>  front porch.
> 
>  But scanout time from vblank events for something like
>  DRM_IOCTL_WAIT_VBLANK are going to be wrong in most cases and are only
>  treated as estimates. If we're in the regular front porch then the
>  timing to scanout is based on the fixed duration front porch for the
>  current mode. If we're in the extended back porch then it's technically
>  driver defined but the most reasonable guess is to assume that the front
>  porch is going to end at any moment, so just return the length of the
>  back porch for getting the scanout time.
> 
>  Proposing the late timestamp shouldn't affect vblank event in the
>  DRM_IOCTL_WAIT_VBLANK case and should only be used in the page-flip

[PATCH 3/3] drm/dsc: Change infoframe_pack to payload_pack

2019-02-13 Thread David Francis
The function drm_dsc_pps_infoframe_pack only
packed the payload portion of the infoframe.
Change the input struct to the PPS payload
to clarify the function's purpose and allow
for drivers with their own handling of sdp.
(e.g. drivers with their own struct for
all SDP transactions)

Signed-off-by: David Francis 
---
 drivers/gpu/drm/drm_dsc.c | 86 +++
 drivers/gpu/drm/i915/intel_vdsc.c |  2 +-
 include/drm/drm_dsc.h |  2 +-
 3 files changed, 45 insertions(+), 45 deletions(-)

diff --git a/drivers/gpu/drm/drm_dsc.c b/drivers/gpu/drm/drm_dsc.c
index 9e675dd39a44..4ada4d4f59ac 100644
--- a/drivers/gpu/drm/drm_dsc.c
+++ b/drivers/gpu/drm/drm_dsc.c
@@ -38,42 +38,42 @@ void drm_dsc_dp_pps_header_init(struct 
drm_dsc_pps_infoframe *pps_sdp)
 EXPORT_SYMBOL(drm_dsc_dp_pps_header_init);
 
 /**
- * drm_dsc_pps_infoframe_pack() - Populates the DSC PPS infoframe
+ * drm_dsc_pps_payload_pack() - Populates the DSC PPS payload
  * using the DSC configuration parameters in the order expected
  * by the DSC Display Sink device. For the DSC, the sink device
  * expects the PPS payload in the big endian format for the fields
  * that span more than 1 byte.
  *
- * @pps_sdp:
- * Secondary data packet for DSC Picture Parameter Set
+ * @pps_payload:
+ * DSC Picture Parameter Set
  * @dsc_cfg:
  * DSC Configuration data filled by driver
  */
-void drm_dsc_pps_infoframe_pack(struct drm_dsc_pps_infoframe *pps_sdp,
+void drm_dsc_pps_payload_pack(struct drm_dsc_picture_parameter_set 
*pps_payload,
const struct drm_dsc_config *dsc_cfg)
 {
int i;
 
/* Protect against someone accidently changing struct size */
-   BUILD_BUG_ON(sizeof(pps_sdp->pps_payload) !=
+   BUILD_BUG_ON(sizeof(*pps_payload) !=
 DP_SDP_PPS_HEADER_PAYLOAD_BYTES_MINUS_1 + 1);
 
-   memset(_sdp->pps_payload, 0, sizeof(pps_sdp->pps_payload));
+   memset(pps_payload, 0, sizeof(*pps_payload));
 
/* PPS 0 */
-   pps_sdp->pps_payload.dsc_version =
+   pps_payload->dsc_version =
dsc_cfg->dsc_version_minor |
dsc_cfg->dsc_version_major << DSC_PPS_VERSION_MAJOR_SHIFT;
 
/* PPS 1, 2 is 0 */
 
/* PPS 3 */
-   pps_sdp->pps_payload.pps_3 =
+   pps_payload->pps_3 =
dsc_cfg->line_buf_depth |
dsc_cfg->bits_per_component << DSC_PPS_BPC_SHIFT;
 
/* PPS 4 */
-   pps_sdp->pps_payload.pps_4 =
+   pps_payload->pps_4 =
((dsc_cfg->bits_per_pixel & DSC_PPS_BPP_HIGH_MASK) >>
 DSC_PPS_MSB_SHIFT) |
dsc_cfg->vbr_enable << DSC_PPS_VBR_EN_SHIFT |
@@ -82,7 +82,7 @@ void drm_dsc_pps_infoframe_pack(struct drm_dsc_pps_infoframe 
*pps_sdp,
dsc_cfg->block_pred_enable << DSC_PPS_BLOCK_PRED_EN_SHIFT;
 
/* PPS 5 */
-   pps_sdp->pps_payload.bits_per_pixel_low =
+   pps_payload->bits_per_pixel_low =
(dsc_cfg->bits_per_pixel & DSC_PPS_LSB_MASK);
 
/*
@@ -93,103 +93,103 @@ void drm_dsc_pps_infoframe_pack(struct 
drm_dsc_pps_infoframe *pps_sdp,
 */
 
/* PPS 6, 7 */
-   pps_sdp->pps_payload.pic_height = cpu_to_be16(dsc_cfg->pic_height);
+   pps_payload->pic_height = cpu_to_be16(dsc_cfg->pic_height);
 
/* PPS 8, 9 */
-   pps_sdp->pps_payload.pic_width = cpu_to_be16(dsc_cfg->pic_width);
+   pps_payload->pic_width = cpu_to_be16(dsc_cfg->pic_width);
 
/* PPS 10, 11 */
-   pps_sdp->pps_payload.slice_height = cpu_to_be16(dsc_cfg->slice_height);
+   pps_payload->slice_height = cpu_to_be16(dsc_cfg->slice_height);
 
/* PPS 12, 13 */
-   pps_sdp->pps_payload.slice_width = cpu_to_be16(dsc_cfg->slice_width);
+   pps_payload->slice_width = cpu_to_be16(dsc_cfg->slice_width);
 
/* PPS 14, 15 */
-   pps_sdp->pps_payload.chunk_size = 
cpu_to_be16(dsc_cfg->slice_chunk_size);
+   pps_payload->chunk_size = cpu_to_be16(dsc_cfg->slice_chunk_size);
 
/* PPS 16 */
-   pps_sdp->pps_payload.initial_xmit_delay_high =
+   pps_payload->initial_xmit_delay_high =
((dsc_cfg->initial_xmit_delay &
  DSC_PPS_INIT_XMIT_DELAY_HIGH_MASK) >>
 DSC_PPS_MSB_SHIFT);
 
/* PPS 17 */
-   pps_sdp->pps_payload.initial_xmit_delay_low =
+   pps_payload->initial_xmit_delay_low =
(dsc_cfg->initial_xmit_delay & DSC_PPS_LSB_MASK);
 
/* PPS 18, 19 */
-   pps_sdp->pps_payload.initial_dec_delay =
+   pps_payload->initial_dec_delay =
cpu_to_be16(dsc_cfg->initial_dec_delay);
 
/* PPS 20 is 0 */
 
/* PPS 21 */
-   pps_sdp->pps_payload.initial_scale_value =
+   pps_payload->initial_scale_value =
dsc_cfg->initial_scale_value;
 
/* PPS 22, 23 */
-   pps_sdp->pps_payload.scale_increment_interval =
+   pps_payload->scale_increment_interval =

[PATCH 0/3] Make DRM DSC helpers more generally usable

2019-02-13 Thread David Francis
drm_dsc could use some work so that drm drivers other than
i915 can make use of it their own DSC implementations

Move rc compute, a function that forms part of the DSC spec,
into drm. Update it to DSC 1.2. Also change the packing function
to operate only on the packing struct, to allow for drivers with
their own SDP struct headers

David Francis (3):
  drm/i915: Move dsc rate params compute into drm
  drm/dsc: Add native 420 and 422 support to compute_rc_params
  drm/dsc: Change infoframe_pack to payload_pack

 drivers/gpu/drm/drm_dsc.c | 236 --
 drivers/gpu/drm/i915/intel_vdsc.c | 131 +
 include/drm/drm_dsc.h |   7 +-
 3 files changed, 200 insertions(+), 174 deletions(-)

-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 2/3] drm/dsc: Add native 420 and 422 support to compute_rc_params

2019-02-13 Thread David Francis
Native 420 and 422 transfer modes are new in DSC1.2

In these modes, each two pixels of a slice are treated as one
pixel, so the slice width is half as large (round down) for
the purposes of calucating the groups per line and chunk size
in bytes

In native 422 mode, each pixel has four components, so the
mux component of a group is larger by one additional mux word
and one additional component

Now that there is native 422 support, the configuration option
previously called enable422 is renamed to simple_422 to avoid
confusion

Signed-off-by: David Francis 
---
 drivers/gpu/drm/drm_dsc.c | 31 +++
 drivers/gpu/drm/i915/intel_vdsc.c |  4 ++--
 include/drm/drm_dsc.h |  4 ++--
 3 files changed, 27 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/drm_dsc.c b/drivers/gpu/drm/drm_dsc.c
index 4b0e3c9c3ff8..9e675dd39a44 100644
--- a/drivers/gpu/drm/drm_dsc.c
+++ b/drivers/gpu/drm/drm_dsc.c
@@ -77,7 +77,7 @@ void drm_dsc_pps_infoframe_pack(struct drm_dsc_pps_infoframe 
*pps_sdp,
((dsc_cfg->bits_per_pixel & DSC_PPS_BPP_HIGH_MASK) >>
 DSC_PPS_MSB_SHIFT) |
dsc_cfg->vbr_enable << DSC_PPS_VBR_EN_SHIFT |
-   dsc_cfg->enable422 << DSC_PPS_SIMPLE422_SHIFT |
+   dsc_cfg->simple_422 << DSC_PPS_SIMPLE422_SHIFT |
dsc_cfg->convert_rgb << DSC_PPS_CONVERT_RGB_SHIFT |
dsc_cfg->block_pred_enable << DSC_PPS_BLOCK_PRED_EN_SHIFT;
 
@@ -246,19 +246,34 @@ int drm_dsc_compute_rc_parameters(struct drm_dsc_config 
*vdsc_cfg)
unsigned long final_scale = 0;
unsigned long rbs_min = 0;
 
-   /* Number of groups used to code each line of a slice */
-   groups_per_line = DIV_ROUND_UP(vdsc_cfg->slice_width,
-  DSC_RC_PIXELS_PER_GROUP);
+   if (vdsc_cfg->native_420 || vdsc_cfg->native_422) {
+   /* Number of groups used to code each line of a slice */
+   groups_per_line = DIV_ROUND_UP(vdsc_cfg->slice_width / 2,
+  DSC_RC_PIXELS_PER_GROUP);
 
-   /* chunksize in Bytes */
-   vdsc_cfg->slice_chunk_size = DIV_ROUND_UP(vdsc_cfg->slice_width *
- vdsc_cfg->bits_per_pixel,
- (8 * 16));
+   /* chunksize in Bytes */
+   vdsc_cfg->slice_chunk_size = DIV_ROUND_UP(vdsc_cfg->slice_width 
/ 2 *
+ 
vdsc_cfg->bits_per_pixel,
+ (8 * 16));
+   } else {
+   /* Number of groups used to code each line of a slice */
+   groups_per_line = DIV_ROUND_UP(vdsc_cfg->slice_width,
+  DSC_RC_PIXELS_PER_GROUP);
+
+   /* chunksize in Bytes */
+   vdsc_cfg->slice_chunk_size = DIV_ROUND_UP(vdsc_cfg->slice_width 
*
+ 
vdsc_cfg->bits_per_pixel,
+ (8 * 16));
+   }
 
if (vdsc_cfg->convert_rgb)
num_extra_mux_bits = 3 * (vdsc_cfg->mux_word_size +
  (4 * vdsc_cfg->bits_per_component + 4)
  - 2);
+   else if (vdsc_cfg->native_422)
+   num_extra_mux_bits = 4 * vdsc_cfg->mux_word_size +
+   (4 * vdsc_cfg->bits_per_component + 4) +
+   3 * (4 * vdsc_cfg->bits_per_component) - 2;
else
num_extra_mux_bits = 3 * vdsc_cfg->mux_word_size +
(4 * vdsc_cfg->bits_per_component + 4) +
diff --git a/drivers/gpu/drm/i915/intel_vdsc.c 
b/drivers/gpu/drm/i915/intel_vdsc.c
index c76cec8bfb74..7702c5c8b3f2 100644
--- a/drivers/gpu/drm/i915/intel_vdsc.c
+++ b/drivers/gpu/drm/i915/intel_vdsc.c
@@ -369,7 +369,7 @@ int intel_dp_compute_dsc_params(struct intel_dp *intel_dp,
DSC_1_1_MAX_LINEBUF_DEPTH_BITS : line_buf_depth;
 
/* Gen 11 does not support YCbCr */
-   vdsc_cfg->enable422 = false;
+   vdsc_cfg->simple_422 = false;
/* Gen 11 does not support VBR */
vdsc_cfg->vbr_enable = false;
vdsc_cfg->block_pred_enable =
@@ -496,7 +496,7 @@ static void intel_configure_pps_for_dsc_encoder(struct 
intel_encoder *encoder,
pps_val |= DSC_BLOCK_PREDICTION;
if (vdsc_cfg->convert_rgb)
pps_val |= DSC_COLOR_SPACE_CONVERSION;
-   if (vdsc_cfg->enable422)
+   if (vdsc_cfg->simple_422)
pps_val |= DSC_422_ENABLE;
if (vdsc_cfg->vbr_enable)
pps_val |= DSC_VBR_ENABLE;
diff --git a/include/drm/drm_dsc.h b/include/drm/drm_dsc.h
index ad43494f1cc8..4e55e37943d7 100644
--- a/include/drm/drm_dsc.h
+++ b/include/drm/drm_dsc.h
@@ -70,10 +70,10 @@ struct 

[PATCH 1/3] drm/i915: Move dsc rate params compute into drm

2019-02-13 Thread David Francis
The function intel_compute_rc_parameters is part of the dsc spec
and is not driver-specific. Other drm drivers might like to use
it.  The function is not changed; just moved and renamed.

Signed-off-by: David Francis 
---
 drivers/gpu/drm/drm_dsc.c | 133 ++
 drivers/gpu/drm/i915/intel_vdsc.c | 125 +---
 include/drm/drm_dsc.h |   1 +
 3 files changed, 135 insertions(+), 124 deletions(-)

diff --git a/drivers/gpu/drm/drm_dsc.c b/drivers/gpu/drm/drm_dsc.c
index bc2b23adb072..4b0e3c9c3ff8 100644
--- a/drivers/gpu/drm/drm_dsc.c
+++ b/drivers/gpu/drm/drm_dsc.c
@@ -11,6 +11,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -226,3 +227,135 @@ void drm_dsc_pps_infoframe_pack(struct 
drm_dsc_pps_infoframe *pps_sdp,
/* PPS 94 - 127 are O */
 }
 EXPORT_SYMBOL(drm_dsc_pps_infoframe_pack);
+
+/**
+ * drm_dsc_compute_rc_parameters() - Write rate control
+ * parameters to the dsc configuration. Some configuration
+ * fields must be present beforehand.
+ *
+ * @dsc_cfg:
+ * DSC Configuration data partially filled by driver
+ */
+int drm_dsc_compute_rc_parameters(struct drm_dsc_config *vdsc_cfg)
+{
+   unsigned long groups_per_line = 0;
+   unsigned long groups_total = 0;
+   unsigned long num_extra_mux_bits = 0;
+   unsigned long slice_bits = 0;
+   unsigned long hrd_delay = 0;
+   unsigned long final_scale = 0;
+   unsigned long rbs_min = 0;
+
+   /* Number of groups used to code each line of a slice */
+   groups_per_line = DIV_ROUND_UP(vdsc_cfg->slice_width,
+  DSC_RC_PIXELS_PER_GROUP);
+
+   /* chunksize in Bytes */
+   vdsc_cfg->slice_chunk_size = DIV_ROUND_UP(vdsc_cfg->slice_width *
+ vdsc_cfg->bits_per_pixel,
+ (8 * 16));
+
+   if (vdsc_cfg->convert_rgb)
+   num_extra_mux_bits = 3 * (vdsc_cfg->mux_word_size +
+ (4 * vdsc_cfg->bits_per_component + 4)
+ - 2);
+   else
+   num_extra_mux_bits = 3 * vdsc_cfg->mux_word_size +
+   (4 * vdsc_cfg->bits_per_component + 4) +
+   2 * (4 * vdsc_cfg->bits_per_component) - 2;
+   /* Number of bits in one Slice */
+   slice_bits = 8 * vdsc_cfg->slice_chunk_size * vdsc_cfg->slice_height;
+
+   while ((num_extra_mux_bits > 0) &&
+  ((slice_bits - num_extra_mux_bits) % vdsc_cfg->mux_word_size))
+   num_extra_mux_bits--;
+
+   if (groups_per_line < vdsc_cfg->initial_scale_value - 8)
+   vdsc_cfg->initial_scale_value = groups_per_line + 8;
+
+   /* scale_decrement_interval calculation according to DSC spec 1.11 */
+   if (vdsc_cfg->initial_scale_value > 8)
+   vdsc_cfg->scale_decrement_interval = groups_per_line /
+   (vdsc_cfg->initial_scale_value - 8);
+   else
+   vdsc_cfg->scale_decrement_interval = 
DSC_SCALE_DECREMENT_INTERVAL_MAX;
+
+   vdsc_cfg->final_offset = vdsc_cfg->rc_model_size -
+   (vdsc_cfg->initial_xmit_delay *
+vdsc_cfg->bits_per_pixel + 8) / 16 + num_extra_mux_bits;
+
+   if (vdsc_cfg->final_offset >= vdsc_cfg->rc_model_size) {
+   DRM_DEBUG_KMS("FinalOfs < RcModelSze for this 
InitialXmitDelay\n");
+   return -ERANGE;
+   }
+
+   final_scale = (vdsc_cfg->rc_model_size * 8) /
+   (vdsc_cfg->rc_model_size - vdsc_cfg->final_offset);
+   if (vdsc_cfg->slice_height > 1)
+   /*
+* NflBpgOffset is 16 bit value with 11 fractional bits
+* hence we multiply by 2^11 for preserving the
+* fractional part
+*/
+   vdsc_cfg->nfl_bpg_offset = 
DIV_ROUND_UP((vdsc_cfg->first_line_bpg_offset << 11),
+   (vdsc_cfg->slice_height 
- 1));
+   else
+   vdsc_cfg->nfl_bpg_offset = 0;
+
+   /* 2^16 - 1 */
+   if (vdsc_cfg->nfl_bpg_offset > 65535) {
+   DRM_DEBUG_KMS("NflBpgOffset is too large for this slice 
height\n");
+   return -ERANGE;
+   }
+
+   /* Number of groups used to code the entire slice */
+   groups_total = groups_per_line * vdsc_cfg->slice_height;
+
+   /* slice_bpg_offset is 16 bit value with 11 fractional bits */
+   vdsc_cfg->slice_bpg_offset = DIV_ROUND_UP(((vdsc_cfg->rc_model_size -
+   vdsc_cfg->initial_offset +
+   num_extra_mux_bits) << 11),
+ groups_total);
+
+   if (final_scale > 9) {
+   /*
+* ScaleIncrementInterval =
+* finaloffset/((NflBpgOffset + 

Re: [PATCH 2/3] drm: Add basic helper to allow precise pageflip timestamps in vrr.

2019-02-13 Thread Kazlauskas, Nicholas
On 2/13/19 4:50 AM, Daniel Vetter wrote:
> On Tue, Feb 12, 2019 at 10:32:31PM +0100, Mario Kleiner wrote:
>> On Mon, Feb 11, 2019 at 6:04 PM Daniel Vetter  wrote:
>>>
>>> On Mon, Feb 11, 2019 at 4:01 PM Kazlauskas, Nicholas
>>>  wrote:

 On 2/11/19 3:35 AM, Daniel Vetter wrote:
> On Mon, Feb 11, 2019 at 04:22:24AM +0100, Mario Kleiner wrote:
>> The pageflip completion timestamps transmitted to userspace
>> via pageflip completion events are supposed to describe the
>> time at which the first pixel of the new post-pageflip scanout
>> buffer leaves the video output of the gpu. This time is
>> identical to end of vblank, when active scanout starts.
>>
>> For a crtc in standard fixed refresh rate, the end of vblank
>> is identical to the vblank timestamps calculated by
>> drm_update_vblank_count() at each vblank interrupt, or each
>> vblank dis-/enable. Therefore pageflip events just carry
>> that vblank timestamp as their pageflip timestamp.
>>
>> For a crtc switched to variable refresh rate mode (vrr), the
>> pageflip completion timestamps are identical to the vblank
>> timestamps iff the pageflip was executed early in vblank,
>> before the minimum vblank duration elapsed. In this case
>> the time of display onset is identical to when the crtc
>> is running in fixed refresh rate.
>>
>> However, if a pageflip completes later in the vblank, inside
>> the "extended front porch" in vrr mode, then the vblank will
>> terminate at a fixed (back porch) duration after flip, so
>> the display onset time is delayed correspondingly. In this
>> case the vblank timestamp computed at vblank irq time would
>> be too early, and we need a way to calculate an estimated
>> pageflip timestamp that will be later than the vblank timestamp.
>>
>> How a driver determines such a "late flip" timestamp is hw
>> and driver specific, but this patch adds a new helper function
>> that allows the driver to propose such an alternate "late flip"
>> timestamp for use in pageflip events:
>>
>> drm_crtc_set_vrr_pageflip_timestamp(crtc, flip_timestamp);
>>
>> When sending out pageflip events, we now compare that proposed
>> flip_timestamp against the vblank timestamp of the current
>> vblank of flip completion and choose to send out the greater/
>> later timestamp as flip completion timestamp.
>>
>> The most simple way for a kms driver to supply a suitable
>> flip_timestamp in vrr mode would be to simply take a timestamp
>> at start of the pageflip completion handler, e.g., pageflip
>> irq handler: flip_timestamp = ktime_get(); and then set that
>> as proposed "late" alternative timestamp via ...
>> drm_crtc_set_vrr_pageflip_timestamp(crtc, flip_timestamp);
>>
>> More clever approaches could try to add some corrective offset
>> for fixed back porch duration, or ideally use hardware features
>> like hw timestamps to calculate the exact end time of vblank.
>>
>> Signed-off-by: Mario Kleiner 
>> Cc: Nicholas Kazlauskas 
>> Cc: Harry Wentland 
>> Cc: Alex Deucher 
>
> Uh, this looks like a pretty bad hack. Can't we fix amdgpu to only give us
> the right timestampe, once? With this I guess if you do a vblank query in
> between the wrong and the right vblank you'll get the bogus value. Not
> really great for userspace.
> -Daniel

 I think we calculate the timestamp and send the vblank event both within
 the pageflip IRQ handler so calculating the right pageflip timestamp
 once could probably be done. I'm not sure if it's easier than proposing
 a later flip time with an API like this though.

 The actual scanout time should be known from the page-flip handler so
 the semantics for VRR on/off remain the same. This is because the
 page-flip triggers entering the back porch if we're in the extended
 front porch.

 But scanout time from vblank events for something like
 DRM_IOCTL_WAIT_VBLANK are going to be wrong in most cases and are only
 treated as estimates. If we're in the regular front porch then the
 timing to scanout is based on the fixed duration front porch for the
 current mode. If we're in the extended back porch then it's technically
 driver defined but the most reasonable guess is to assume that the front
 porch is going to end at any moment, so just return the length of the
 back porch for getting the scanout time.

 Proposing the late timestamp shouldn't affect vblank event in the
 DRM_IOCTL_WAIT_VBLANK case and should only be used in the page-flip
 event case. I'm not sure if that's what's guaranteed to happen with this
 patch though. There doesn't seem to be any locking on either
 dev->vblank_time_lock or the vblank->seqlock so while it's likely to get
 the same vblank event back as the 

Re: [PATCH] drm/amd/display: Use vrr friendly pageflip throttling in DC.

2019-02-13 Thread Daniel Vetter
On Wed, Feb 13, 2019 at 11:54 AM Michel Dänzer  wrote:
>
> On 2019-02-13 10:53 a.m., Daniel Vetter wrote:
> > On Mon, Feb 11, 2019 at 04:01:12PM +0100, Michel Dänzer wrote:
> >> On 2019-02-09 7:52 a.m., Mario Kleiner wrote:
> >>> In VRR mode, keep track of the vblank count of the last
> >>> completed pageflip in amdgpu_crtc->last_flip_vblank, as
> >>> recorded in the pageflip completion handler after each
> >>> completed flip.
> >>>
> >>> Use that count to prevent mmio programming a new pageflip
> >>> within the same vblank in which the last pageflip completed,
> >>> iow. to throttle pageflips to at most one flip per video
> >>> frame, while at the same time allowing to request a flip
> >>> not only before start of vblank, but also anywhere within
> >>> vblank.
> >>>
> >>> The old logic did the same, and made sense for regular fixed
> >>> refresh rate flipping, but in vrr mode it prevents requesting
> >>> a flip anywhere inside the possibly huge vblank, thereby
> >>> reducing framerate in vrr mode instead of improving it, by
> >>> delaying a slightly delayed flip requests up to a maximum
> >>> vblank duration + 1 scanout duration. This would limit VRR
> >>> usefulness to only help applications with a very high GPU
> >>> demand, which can submit the flip request before start of
> >>> vblank, but then have to wait long for fences to complete.
> >>>
> >>> With this method a flip can be both requested and - after
> >>> fences have completed - executed, ie. it doesn't matter if
> >>> the request (amdgpu_dm_do_flip()) gets delayed until deep
> >>> into the extended vblank due to cpu execution delays. This
> >>> also allows clients which want to regulate framerate within
> >>> the vrr range a much more fine-grained control of flip timing,
> >>> a feature that might be useful for video playback, and is
> >>> very useful for neuroscience/vision research applications.
> >>>
> >>> In regular non-VRR mode, retain the old flip submission
> >>> behavior. This to keep flip scheduling for fullscreen X11/GLX
> >>> OpenGL clients intact, if they use the GLX_OML_sync_control
> >>> extensions glXSwapBufferMscOML(, ..., target_msc,...) function
> >>> with a specific target_msc target vblank count.
> >>>
> >>> glXSwapBuffersMscOML() or DRI3/Present PresentPixmap() will
> >>> not flip at the proper target_msc for a non-zero target_msc
> >>> if VRR mode is active with this patch. They'd often flip one
> >>> frame too early. However, this limitation should not matter
> >>> much in VRR mode, as scheduling based on vblank counts is
> >>> pretty futile/unusable under variable refresh duration
> >>> anyway, so no real extra harm is done.
> >>>
> >>> According to some testing already done with this patch by
> >>> Nicholas on top of my tests, IGT tests didn't report any
> >>> problems. If fixes stuttering and flickering when flipping
> >>> at rates below the minimum vrr refresh rate.
> >>>
> >>> Fixes: bb47de736661 ("drm/amdgpu: Set FreeSync state using drm VRR
> >>> properties")
> >>> Signed-off-by: Mario Kleiner 
> >>> Cc: 
> >>> Cc: Nicholas Kazlauskas 
> >>> Cc: Harry Wentland 
> >>> Cc: Alex Deucher 
> >>> Cc: Michel Dänzer 
> >>
> >> I wonder if this couldn't be solved in a simpler / cleaner way by making
> >> use of the target MSC passed to the page_flip_target hook.
> >
> > Requiring that all compositors that use VRR also have to use page_flip
> > target (which is not yet exposed on the atomic side yet at all) just
> > because amdgpu doesn't sound like a great idea. I think better to handle
> > this in the amdgpu kernel driver, for similar reasons we've originally
> > added this.
>
> We've originally added what?

The pageflip delay right after you receive a vblank. amdgpu did accept
pageflips for the current frame even after the vblank for the same was
sent out already, breaking non-amdgpu specific compositors. This is
why the page_flip target was added, so that amdgpu could still do
this, while delaying the page_flip for everyone else.

Side effect of that code is that the refresh rate goes slow if you
page_flip without a target and happen to issue it right in the vblank
(somewhat unlikely, given that scanout takes longer than blank
period). "Fixing" that by requiring everyone to specificy the target
(which isn't enabled even for atomic) doesn't sound like a good
solution to me. And page_flip target is an amdgpu-only feature.

> Also not sure what you mean by "because amdgpu". Other drivers which
> want to support VRR might also have to deal with this issue one way or
> another, as it's due to page flips submitted during a vertical blank
> period only being expected to take effect during the following vertical
> blank period normally.

Yeah I expect we'll need to do something similar for intel vrr. That's
why I'm in this discussion. Making the vblank and page_flip timestamps
agree, plus issueing page_flips as soon as possible (without violating
the ordering rules generic userspace expects) sounds like the best
option here.

Re: [PATCH 2/3] drm: Add basic helper to allow precise pageflip timestamps in vrr.

2019-02-13 Thread Daniel Vetter
On Wed, Feb 13, 2019 at 12:05 PM Mario Kleiner
 wrote:
>
> On Wed, Feb 13, 2019 at 10:56 AM Chris Wilson  
> wrote:
> >
> > Quoting Daniel Vetter (2019-02-13 09:50:55)
> > > On Tue, Feb 12, 2019 at 10:32:31PM +0100, Mario Kleiner wrote:
> > > > I think all kms drivers try to call drm_crtc_handle_vblank() at start
> > > > of vblank to give Mesa the most time for frontbuffer rendering for
> > > > classic X. But vblank events are also used for scheduling bufferswaps
> > > > or other stuff for redirected windowed rendering, or via api's like
> > > > OML_sync_controls glXWaitForMscOML, so there might be other things
> > > > affected by a more delayed vblank handling.
> > >
> > > The frontbuffer rendering is very much X driver specific, and I think
> > > -amdgpu/radeon is the only one that requires this. No i915 driver ever
> > > used the vblank interrupt to schedule frontbuffer blits, we use some
> > > CS-side stalls.
> >
> > Fwiw, the Present midlayer does use vblank scheduling for inplace copy
> > updates. Not that I wish to encourage anyone to use frontbuffer
> > rendering.
> > -Chris
>
> Yes, that's what i meant. Under DRI2 at least AMD, Intel and nouveau
> have throttling based on CS stalls to avoid tearing and do throttling.
> DRI3/Present last time i checked just waited for a vblank event and
> then triggered the blit - something that causes tearing even when
> triggered at start of vblank.

Hm, might be good to document all that stuff somewhere. But in the end
I'd say the answer is "don't do frontbuffer rendering". I'm not sure
we managed to document all the rules for existing vblank vs. flip
interactions anywhere. At least I didn't find anything with a look at
our docs.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[bug report] drm/amd/display: Calc vline position in dc.

2019-02-13 Thread Dan Carpenter via amd-gfx
Hello Yongqiang Sun,

The patch 810ece19ee74: "drm/amd/display: Calc vline position in dc."
from Jan 24, 2019, leads to the following static checker warning:

drivers/gpu/drm/amd/amdgpu/../display/dc/dcn10/dcn10_optc.c:152 
calc_vline_position()
warn: inconsistent indenting

drivers/gpu/drm/amd/amdgpu/../display/dc/dcn10/dcn10_optc.c
134 static void calc_vline_position(
135 struct timing_generator *optc,
136 const struct dc_crtc_timing *dc_crtc_timing,
137 unsigned long long vsync_delta,
138 uint32_t *start_line,
139 uint32_t *end_line)
140 {
141 unsigned long long req_delta_tens_of_usec = 
div64_u64((vsync_delta + ), 1);
142 unsigned long long pix_clk_hundreds_khz = 
div64_u64((dc_crtc_timing->pix_clk_100hz + 999), 1000);
143 uint32_t req_delta_lines = (uint32_t) div64_u64(
144 (req_delta_tens_of_usec * pix_clk_hundreds_khz 
+ dc_crtc_timing->h_total - 1),
145 
dc_crtc_timing->h_total);
146 
147 uint32_t vsync_line = get_start_vline(optc, dc_crtc_timing);
148 
149 if (req_delta_lines != 0)
150 req_delta_lines--;
 ^^^
My guess is that everything is indented one extra tab.


151 
--> 152 if (req_delta_lines > vsync_line)
 ^^^
It's also possible that this was supposed to be part of the if
statement?  Missing curly braces?  But I don't know for sure.

153 *start_line = dc_crtc_timing->v_total - 
(req_delta_lines - vsync_line) - 1;
154 else
155 *start_line = vsync_line - req_delta_lines;
156 
157 *end_line = *start_line + 2;
158 
159 if (*end_line >= dc_crtc_timing->v_total)
160 *end_line = 2;
161 }

regards,
dan carpenter
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH 2/3] drm: Add basic helper to allow precise pageflip timestamps in vrr.

2019-02-13 Thread Mario Kleiner via amd-gfx
On Wed, Feb 13, 2019 at 10:50 AM Daniel Vetter  wrote:
>
> On Tue, Feb 12, 2019 at 10:32:31PM +0100, Mario Kleiner wrote:
> > On Mon, Feb 11, 2019 at 6:04 PM Daniel Vetter  wrote:
> > >
> > > On Mon, Feb 11, 2019 at 4:01 PM Kazlauskas, Nicholas
> > >  wrote:
> > > >
> > > > On 2/11/19 3:35 AM, Daniel Vetter wrote:
> > > > > On Mon, Feb 11, 2019 at 04:22:24AM +0100, Mario Kleiner wrote:
> > > > >> The pageflip completion timestamps transmitted to userspace
...
> > > > >
> > > > > Uh, this looks like a pretty bad hack. Can't we fix amdgpu to only 
> > > > > give us
> > > > > the right timestampe, once? With this I guess if you do a vblank 
> > > > > query in
> > > > > between the wrong and the right vblank you'll get the bogus value. Not
> > > > > really great for userspace.
> > > > > -Daniel
> > > >
> > > > I think we calculate the timestamp and send the vblank event both within
> > > > the pageflip IRQ handler so calculating the right pageflip timestamp
> > > > once could probably be done. I'm not sure if it's easier than proposing
> > > > a later flip time with an API like this though.
> > > >
> > > > The actual scanout time should be known from the page-flip handler so
> > > > the semantics for VRR on/off remain the same. This is because the
> > > > page-flip triggers entering the back porch if we're in the extended
> > > > front porch.
> > > >
> > > > But scanout time from vblank events for something like
> > > > DRM_IOCTL_WAIT_VBLANK are going to be wrong in most cases and are only
> > > > treated as estimates. If we're in the regular front porch then the
> > > > timing to scanout is based on the fixed duration front porch for the
> > > > current mode. If we're in the extended back porch then it's technically
> > > > driver defined but the most reasonable guess is to assume that the front
> > > > porch is going to end at any moment, so just return the length of the
> > > > back porch for getting the scanout time.
> > > >
> > > > Proposing the late timestamp shouldn't affect vblank event in the
> > > > DRM_IOCTL_WAIT_VBLANK case and should only be used in the page-flip
> > > > event case. I'm not sure if that's what's guaranteed to happen with this
> > > > patch though. There doesn't seem to be any locking on either
> > > > dev->vblank_time_lock or the vblank->seqlock so while it's likely to get
> > > > the same vblank event back as the one just stored I don't think it's
> > > > guaranteed.
> > >
> > > That's the inconsistency I mean to highlight - the timestamp for the
> > > same frame as observed through flip complete and through the
> > > wait_vblank ioctl can differ. Which they really shouldn't.
> > >
> >
> > Ideally they shouldn't differ. The kernel docs for drm_crtc_state say
> > that vblank and pageflip timestamps should always match. But then the
> > kernel docs for "Variable refresh properties" in drm_connector.c for
> > vblank timestamps were changed for the VRR implementation in Linux
> > 5.0-rc to redefine them when in VRR mode. They are defined, but
> > probably rather useless for any practical purpose, like this:
> >
> > "The semantics for the vertical blank timestamp differ when
> > variable refresh rate is active. The vertical blank timestamp
> > is defined to be an estimate using the current mode's fixed
> > refresh rate timings. The semantics for the page-flip event
> > timestamp remain the same."
>
> Uh I missed that. That sounds like nonsense tbh.
>
> > So our docs contradict each other as of Linux 5.0-rc. Certainly having
> > useful vblank timetamps would be useful.
>
> Yup, imo vblank should still match page_flip. Otherwise I expect a lot of
> hilarity will ensue.
>
> > > Now added complication is that amdgpu sends out vblank events really
> > > early, which is used by userspace to do frontbuffer rendering in the
> > > vblank time. But I don't think anyone wants to do both VRR and
> >
> > I think all kms drivers try to call drm_crtc_handle_vblank() at start
> > of vblank to give Mesa the most time for frontbuffer rendering for
> > classic X. But vblank events are also used for scheduling bufferswaps
> > or other stuff for redirected windowed rendering, or via api's like
> > OML_sync_controls glXWaitForMscOML, so there might be other things
> > affected by a more delayed vblank handling.
>
> The frontbuffer rendering is very much X driver specific, and I think
> -amdgpu/radeon is the only one that requires this. No i915 driver ever
> used the vblank interrupt to schedule frontbuffer blits, we use some
> CS-side stalls.
>
> Wrt scheduling pageflips: The rule is that right after you've received the
> vblank for frame X, then an immediately schedule pageflip should hit X+1,
> but not X. amdgpu had this broken, but it's fixed since a while.
>
> > > frontbuffer rendering, hence I think it should be possible to create
> > > correct vblank timestamps for the VRR case, while leaving the current
> > > logic in place for everything else. But that means moving the entire
> > > vblank 

Re: [PATCH 2/3] drm: Add basic helper to allow precise pageflip timestamps in vrr.

2019-02-13 Thread Mario Kleiner via amd-gfx
On Wed, Feb 13, 2019 at 10:56 AM Chris Wilson  wrote:
>
> Quoting Daniel Vetter (2019-02-13 09:50:55)
> > On Tue, Feb 12, 2019 at 10:32:31PM +0100, Mario Kleiner wrote:
> > > I think all kms drivers try to call drm_crtc_handle_vblank() at start
> > > of vblank to give Mesa the most time for frontbuffer rendering for
> > > classic X. But vblank events are also used for scheduling bufferswaps
> > > or other stuff for redirected windowed rendering, or via api's like
> > > OML_sync_controls glXWaitForMscOML, so there might be other things
> > > affected by a more delayed vblank handling.
> >
> > The frontbuffer rendering is very much X driver specific, and I think
> > -amdgpu/radeon is the only one that requires this. No i915 driver ever
> > used the vblank interrupt to schedule frontbuffer blits, we use some
> > CS-side stalls.
>
> Fwiw, the Present midlayer does use vblank scheduling for inplace copy
> updates. Not that I wish to encourage anyone to use frontbuffer
> rendering.
> -Chris

Yes, that's what i meant. Under DRI2 at least AMD, Intel and nouveau
have throttling based on CS stalls to avoid tearing and do throttling.
DRI3/Present last time i checked just waited for a vblank event and
then triggered the blit - something that causes tearing even when
triggered at start of vblank.

-mario
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH] drm/amd/display: Use vrr friendly pageflip throttling in DC.

2019-02-13 Thread Michel Dänzer
On 2019-02-13 10:53 a.m., Daniel Vetter wrote:
> On Mon, Feb 11, 2019 at 04:01:12PM +0100, Michel Dänzer wrote:
>> On 2019-02-09 7:52 a.m., Mario Kleiner wrote:
>>> In VRR mode, keep track of the vblank count of the last
>>> completed pageflip in amdgpu_crtc->last_flip_vblank, as
>>> recorded in the pageflip completion handler after each
>>> completed flip.
>>>
>>> Use that count to prevent mmio programming a new pageflip
>>> within the same vblank in which the last pageflip completed,
>>> iow. to throttle pageflips to at most one flip per video
>>> frame, while at the same time allowing to request a flip
>>> not only before start of vblank, but also anywhere within
>>> vblank.
>>>
>>> The old logic did the same, and made sense for regular fixed
>>> refresh rate flipping, but in vrr mode it prevents requesting
>>> a flip anywhere inside the possibly huge vblank, thereby
>>> reducing framerate in vrr mode instead of improving it, by
>>> delaying a slightly delayed flip requests up to a maximum
>>> vblank duration + 1 scanout duration. This would limit VRR
>>> usefulness to only help applications with a very high GPU
>>> demand, which can submit the flip request before start of
>>> vblank, but then have to wait long for fences to complete.
>>>
>>> With this method a flip can be both requested and - after
>>> fences have completed - executed, ie. it doesn't matter if
>>> the request (amdgpu_dm_do_flip()) gets delayed until deep
>>> into the extended vblank due to cpu execution delays. This
>>> also allows clients which want to regulate framerate within
>>> the vrr range a much more fine-grained control of flip timing,
>>> a feature that might be useful for video playback, and is
>>> very useful for neuroscience/vision research applications.
>>>
>>> In regular non-VRR mode, retain the old flip submission
>>> behavior. This to keep flip scheduling for fullscreen X11/GLX
>>> OpenGL clients intact, if they use the GLX_OML_sync_control
>>> extensions glXSwapBufferMscOML(, ..., target_msc,...) function
>>> with a specific target_msc target vblank count.
>>>
>>> glXSwapBuffersMscOML() or DRI3/Present PresentPixmap() will
>>> not flip at the proper target_msc for a non-zero target_msc
>>> if VRR mode is active with this patch. They'd often flip one
>>> frame too early. However, this limitation should not matter
>>> much in VRR mode, as scheduling based on vblank counts is
>>> pretty futile/unusable under variable refresh duration
>>> anyway, so no real extra harm is done.
>>>
>>> According to some testing already done with this patch by
>>> Nicholas on top of my tests, IGT tests didn't report any
>>> problems. If fixes stuttering and flickering when flipping
>>> at rates below the minimum vrr refresh rate.
>>>
>>> Fixes: bb47de736661 ("drm/amdgpu: Set FreeSync state using drm VRR
>>> properties")
>>> Signed-off-by: Mario Kleiner 
>>> Cc: 
>>> Cc: Nicholas Kazlauskas 
>>> Cc: Harry Wentland 
>>> Cc: Alex Deucher 
>>> Cc: Michel Dänzer 
>>
>> I wonder if this couldn't be solved in a simpler / cleaner way by making
>> use of the target MSC passed to the page_flip_target hook.
> 
> Requiring that all compositors that use VRR also have to use page_flip
> target (which is not yet exposed on the atomic side yet at all) just
> because amdgpu doesn't sound like a great idea. I think better to handle
> this in the amdgpu kernel driver, for similar reasons we've originally
> added this.

We've originally added what?

Also not sure what you mean by "because amdgpu". Other drivers which
want to support VRR might also have to deal with this issue one way or
another, as it's due to page flips submitted during a vertical blank
period only being expected to take effect during the following vertical
blank period normally.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH 2/3] drm: Add basic helper to allow precise pageflip timestamps in vrr.

2019-02-13 Thread Chris Wilson
Quoting Daniel Vetter (2019-02-13 09:50:55)
> On Tue, Feb 12, 2019 at 10:32:31PM +0100, Mario Kleiner wrote:
> > I think all kms drivers try to call drm_crtc_handle_vblank() at start
> > of vblank to give Mesa the most time for frontbuffer rendering for
> > classic X. But vblank events are also used for scheduling bufferswaps
> > or other stuff for redirected windowed rendering, or via api's like
> > OML_sync_controls glXWaitForMscOML, so there might be other things
> > affected by a more delayed vblank handling.
> 
> The frontbuffer rendering is very much X driver specific, and I think
> -amdgpu/radeon is the only one that requires this. No i915 driver ever
> used the vblank interrupt to schedule frontbuffer blits, we use some
> CS-side stalls.

Fwiw, the Present midlayer does use vblank scheduling for inplace copy
updates. Not that I wish to encourage anyone to use frontbuffer
rendering.
-Chris
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH] drm/amd/display: Use vrr friendly pageflip throttling in DC.

2019-02-13 Thread Daniel Vetter
On Mon, Feb 11, 2019 at 04:01:12PM +0100, Michel Dänzer wrote:
> On 2019-02-09 7:52 a.m., Mario Kleiner wrote:
> > In VRR mode, keep track of the vblank count of the last
> > completed pageflip in amdgpu_crtc->last_flip_vblank, as
> > recorded in the pageflip completion handler after each
> > completed flip.
> > 
> > Use that count to prevent mmio programming a new pageflip
> > within the same vblank in which the last pageflip completed,
> > iow. to throttle pageflips to at most one flip per video
> > frame, while at the same time allowing to request a flip
> > not only before start of vblank, but also anywhere within
> > vblank.
> > 
> > The old logic did the same, and made sense for regular fixed
> > refresh rate flipping, but in vrr mode it prevents requesting
> > a flip anywhere inside the possibly huge vblank, thereby
> > reducing framerate in vrr mode instead of improving it, by
> > delaying a slightly delayed flip requests up to a maximum
> > vblank duration + 1 scanout duration. This would limit VRR
> > usefulness to only help applications with a very high GPU
> > demand, which can submit the flip request before start of
> > vblank, but then have to wait long for fences to complete.
> > 
> > With this method a flip can be both requested and - after
> > fences have completed - executed, ie. it doesn't matter if
> > the request (amdgpu_dm_do_flip()) gets delayed until deep
> > into the extended vblank due to cpu execution delays. This
> > also allows clients which want to regulate framerate within
> > the vrr range a much more fine-grained control of flip timing,
> > a feature that might be useful for video playback, and is
> > very useful for neuroscience/vision research applications.
> > 
> > In regular non-VRR mode, retain the old flip submission
> > behavior. This to keep flip scheduling for fullscreen X11/GLX
> > OpenGL clients intact, if they use the GLX_OML_sync_control
> > extensions glXSwapBufferMscOML(, ..., target_msc,...) function
> > with a specific target_msc target vblank count.
> > 
> > glXSwapBuffersMscOML() or DRI3/Present PresentPixmap() will
> > not flip at the proper target_msc for a non-zero target_msc
> > if VRR mode is active with this patch. They'd often flip one
> > frame too early. However, this limitation should not matter
> > much in VRR mode, as scheduling based on vblank counts is
> > pretty futile/unusable under variable refresh duration
> > anyway, so no real extra harm is done.
> > 
> > According to some testing already done with this patch by
> > Nicholas on top of my tests, IGT tests didn't report any
> > problems. If fixes stuttering and flickering when flipping
> > at rates below the minimum vrr refresh rate.
> > 
> > Fixes: bb47de736661 ("drm/amdgpu: Set FreeSync state using drm VRR
> > properties")
> > Signed-off-by: Mario Kleiner 
> > Cc: 
> > Cc: Nicholas Kazlauskas 
> > Cc: Harry Wentland 
> > Cc: Alex Deucher 
> > Cc: Michel Dänzer 
> 
> I wonder if this couldn't be solved in a simpler / cleaner way by making
> use of the target MSC passed to the page_flip_target hook.

Requiring that all compositors that use VRR also have to use page_flip
target (which is not yet exposed on the atomic side yet at all) just
because amdgpu doesn't sound like a great idea. I think better to handle
this in the amdgpu kernel driver, for similar reasons we've originally
added this.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH 2/3] drm: Add basic helper to allow precise pageflip timestamps in vrr.

2019-02-13 Thread Daniel Vetter
On Tue, Feb 12, 2019 at 10:32:31PM +0100, Mario Kleiner wrote:
> On Mon, Feb 11, 2019 at 6:04 PM Daniel Vetter  wrote:
> >
> > On Mon, Feb 11, 2019 at 4:01 PM Kazlauskas, Nicholas
> >  wrote:
> > >
> > > On 2/11/19 3:35 AM, Daniel Vetter wrote:
> > > > On Mon, Feb 11, 2019 at 04:22:24AM +0100, Mario Kleiner wrote:
> > > >> The pageflip completion timestamps transmitted to userspace
> > > >> via pageflip completion events are supposed to describe the
> > > >> time at which the first pixel of the new post-pageflip scanout
> > > >> buffer leaves the video output of the gpu. This time is
> > > >> identical to end of vblank, when active scanout starts.
> > > >>
> > > >> For a crtc in standard fixed refresh rate, the end of vblank
> > > >> is identical to the vblank timestamps calculated by
> > > >> drm_update_vblank_count() at each vblank interrupt, or each
> > > >> vblank dis-/enable. Therefore pageflip events just carry
> > > >> that vblank timestamp as their pageflip timestamp.
> > > >>
> > > >> For a crtc switched to variable refresh rate mode (vrr), the
> > > >> pageflip completion timestamps are identical to the vblank
> > > >> timestamps iff the pageflip was executed early in vblank,
> > > >> before the minimum vblank duration elapsed. In this case
> > > >> the time of display onset is identical to when the crtc
> > > >> is running in fixed refresh rate.
> > > >>
> > > >> However, if a pageflip completes later in the vblank, inside
> > > >> the "extended front porch" in vrr mode, then the vblank will
> > > >> terminate at a fixed (back porch) duration after flip, so
> > > >> the display onset time is delayed correspondingly. In this
> > > >> case the vblank timestamp computed at vblank irq time would
> > > >> be too early, and we need a way to calculate an estimated
> > > >> pageflip timestamp that will be later than the vblank timestamp.
> > > >>
> > > >> How a driver determines such a "late flip" timestamp is hw
> > > >> and driver specific, but this patch adds a new helper function
> > > >> that allows the driver to propose such an alternate "late flip"
> > > >> timestamp for use in pageflip events:
> > > >>
> > > >> drm_crtc_set_vrr_pageflip_timestamp(crtc, flip_timestamp);
> > > >>
> > > >> When sending out pageflip events, we now compare that proposed
> > > >> flip_timestamp against the vblank timestamp of the current
> > > >> vblank of flip completion and choose to send out the greater/
> > > >> later timestamp as flip completion timestamp.
> > > >>
> > > >> The most simple way for a kms driver to supply a suitable
> > > >> flip_timestamp in vrr mode would be to simply take a timestamp
> > > >> at start of the pageflip completion handler, e.g., pageflip
> > > >> irq handler: flip_timestamp = ktime_get(); and then set that
> > > >> as proposed "late" alternative timestamp via ...
> > > >> drm_crtc_set_vrr_pageflip_timestamp(crtc, flip_timestamp);
> > > >>
> > > >> More clever approaches could try to add some corrective offset
> > > >> for fixed back porch duration, or ideally use hardware features
> > > >> like hw timestamps to calculate the exact end time of vblank.
> > > >>
> > > >> Signed-off-by: Mario Kleiner 
> > > >> Cc: Nicholas Kazlauskas 
> > > >> Cc: Harry Wentland 
> > > >> Cc: Alex Deucher 
> > > >
> > > > Uh, this looks like a pretty bad hack. Can't we fix amdgpu to only give 
> > > > us
> > > > the right timestampe, once? With this I guess if you do a vblank query 
> > > > in
> > > > between the wrong and the right vblank you'll get the bogus value. Not
> > > > really great for userspace.
> > > > -Daniel
> > >
> > > I think we calculate the timestamp and send the vblank event both within
> > > the pageflip IRQ handler so calculating the right pageflip timestamp
> > > once could probably be done. I'm not sure if it's easier than proposing
> > > a later flip time with an API like this though.
> > >
> > > The actual scanout time should be known from the page-flip handler so
> > > the semantics for VRR on/off remain the same. This is because the
> > > page-flip triggers entering the back porch if we're in the extended
> > > front porch.
> > >
> > > But scanout time from vblank events for something like
> > > DRM_IOCTL_WAIT_VBLANK are going to be wrong in most cases and are only
> > > treated as estimates. If we're in the regular front porch then the
> > > timing to scanout is based on the fixed duration front porch for the
> > > current mode. If we're in the extended back porch then it's technically
> > > driver defined but the most reasonable guess is to assume that the front
> > > porch is going to end at any moment, so just return the length of the
> > > back porch for getting the scanout time.
> > >
> > > Proposing the late timestamp shouldn't affect vblank event in the
> > > DRM_IOCTL_WAIT_VBLANK case and should only be used in the page-flip
> > > event case. I'm not sure if that's what's guaranteed to happen with this
> > > patch though. There doesn't seem to be any locking on