[PATCH] drm/amdgpu/sriov: skip programing some regs with new L1 policy

2020-03-01 Thread Tiecheng Zhou
With the new L1 policy, some regs are blocked in the guest and are
programmed on the host side instead. So skip programming these regs
under SR-IOV.

The regs are:
GCMC_VM_FB_LOCATION_TOP
GCMC_VM_FB_LOCATION_BASE
MMMC_VM_FB_LOCATION_TOP
MMMC_VM_FB_LOCATION_BASE
GCMC_VM_SYSTEM_APERTURE_HIGH_ADDR
GCMC_VM_SYSTEM_APERTURE_LOW_ADDR
MMMC_VM_SYSTEM_APERTURE_HIGH_ADDR
MMMC_VM_SYSTEM_APERTURE_LOW_ADDR
HDP_NONSURFACE_BASE
HDP_NONSURFACE_BASE_HI
GCMC_VM_AGP_TOP
GCMC_VM_AGP_BOT
GCMC_VM_AGP_BASE

Signed-off-by: Tiecheng Zhou 
---
 drivers/gpu/drm/amd/amdgpu/gfxhub_v2_0.c | 55 +++-
 drivers/gpu/drm/amd/amdgpu/mmhub_v2_0.c  | 29 ++---
 2 files changed, 37 insertions(+), 47 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfxhub_v2_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfxhub_v2_0.c
index e0654a216ab5..cc866c367939 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfxhub_v2_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfxhub_v2_0.c
@@ -81,24 +81,31 @@ static void gfxhub_v2_0_init_system_aperture_regs(struct 
amdgpu_device *adev)
 {
uint64_t value;
 
-   /* Disable AGP. */
-   WREG32_SOC15(GC, 0, mmGCMC_VM_AGP_BASE, 0);
-   WREG32_SOC15(GC, 0, mmGCMC_VM_AGP_TOP, 0);
-   WREG32_SOC15(GC, 0, mmGCMC_VM_AGP_BOT, 0x00FF);
-
-   /* Program the system aperture low logical page number. */
-   WREG32_SOC15(GC, 0, mmGCMC_VM_SYSTEM_APERTURE_LOW_ADDR,
-adev->gmc.vram_start >> 18);
-   WREG32_SOC15(GC, 0, mmGCMC_VM_SYSTEM_APERTURE_HIGH_ADDR,
-adev->gmc.vram_end >> 18);
-
-   /* Set default page address. */
-   value = adev->vram_scratch.gpu_addr - adev->gmc.vram_start
-   + adev->vm_manager.vram_base_offset;
-   WREG32_SOC15(GC, 0, mmGCMC_VM_SYSTEM_APERTURE_DEFAULT_ADDR_LSB,
-(u32)(value >> 12));
-   WREG32_SOC15(GC, 0, mmGCMC_VM_SYSTEM_APERTURE_DEFAULT_ADDR_MSB,
-(u32)(value >> 44));
+   if (!amdgpu_sriov_vf(adev)) {
+   /*
+* the new L1 policy will block SRIOV guest from writing
+* these regs, and they will be programed at host.
+* so skip programing these regs.
+*/
+   /* Disable AGP. */
+   WREG32_SOC15(GC, 0, mmGCMC_VM_AGP_BASE, 0);
+   WREG32_SOC15(GC, 0, mmGCMC_VM_AGP_TOP, 0);
+   WREG32_SOC15(GC, 0, mmGCMC_VM_AGP_BOT, 0x00FF);
+
+   /* Program the system aperture low logical page number. */
+   WREG32_SOC15(GC, 0, mmGCMC_VM_SYSTEM_APERTURE_LOW_ADDR,
+adev->gmc.vram_start >> 18);
+   WREG32_SOC15(GC, 0, mmGCMC_VM_SYSTEM_APERTURE_HIGH_ADDR,
+adev->gmc.vram_end >> 18);
+
+   /* Set default page address. */
+   value = adev->vram_scratch.gpu_addr - adev->gmc.vram_start
+   + adev->vm_manager.vram_base_offset;
+   WREG32_SOC15(GC, 0, mmGCMC_VM_SYSTEM_APERTURE_DEFAULT_ADDR_LSB,
+(u32)(value >> 12));
+   WREG32_SOC15(GC, 0, mmGCMC_VM_SYSTEM_APERTURE_DEFAULT_ADDR_MSB,
+(u32)(value >> 44));
+   }
 
/* Program "protection fault". */
WREG32_SOC15(GC, 0, mmGCVM_L2_PROTECTION_FAULT_DEFAULT_ADDR_LO32,
@@ -260,18 +267,6 @@ static void gfxhub_v2_0_program_invalidation(struct 
amdgpu_device *adev)
 
 int gfxhub_v2_0_gart_enable(struct amdgpu_device *adev)
 {
-   if (amdgpu_sriov_vf(adev)) {
-   /*
-* GCMC_VM_FB_LOCATION_BASE/TOP is NULL for VF, becuase they are
-* VF copy registers so vbios post doesn't program them, for
-* SRIOV driver need to program them
-*/
-   WREG32_SOC15(GC, 0, mmGCMC_VM_FB_LOCATION_BASE,
-adev->gmc.vram_start >> 24);
-   WREG32_SOC15(GC, 0, mmGCMC_VM_FB_LOCATION_TOP,
-adev->gmc.vram_end >> 24);
-   }
-
/* GART Enable. */
gfxhub_v2_0_init_gart_aperture_regs(adev);
gfxhub_v2_0_init_system_aperture_regs(adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v2_0.c 
b/drivers/gpu/drm/amd/amdgpu/mmhub_v2_0.c
index bde189680521..fb3f228458e5 100644
--- a/drivers/gpu/drm/amd/amdgpu/mmhub_v2_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v2_0.c
@@ -72,11 +72,18 @@ static void mmhub_v2_0_init_system_aperture_regs(struct 
amdgpu_device *adev)
WREG32_SOC15(MMHUB, 0, mmMMMC_VM_AGP_TOP, 0);
WREG32_SOC15(MMHUB, 0, mmMMMC_VM_AGP_BOT, 0x00FF);
 
-   /* Program the system aperture low logical page number. */
-   WREG32_SOC15(MMHUB, 0, mmMMMC_VM_SYSTEM_APERTURE_LOW_ADDR,
-adev->gmc.vram_start >> 18);
-   WREG32_SOC15(MMHUB, 0, mmMMMC_VM_SYSTEM_APERTURE_HIGH_ADDR,
-adev->gmc.vram_end >> 18);
+   if (!amdgpu_sriov_vf(adev)) {
+   /*
+* the new L1 policy will 

[PATCH] drm/amdgpu/runpm: disable runpm on Vega10

2020-03-01 Thread Feifei Xu
Some framework tests fail if runpm is enabled on Vega10.
Disable it until the issue is fixed.

Signed-off-by: Feifei Xu 
Tested-by: Kyle Chen 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
index 0f3563926ad1..7c1e0d9f2c26 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -175,6 +175,7 @@ int amdgpu_driver_load_kms(struct drm_device *dev, unsigned 
long flags)
else if (amdgpu_device_supports_baco(dev) &&
 (amdgpu_runtime_pm != 0) &&
 (adev->asic_type >= CHIP_TOPAZ) &&
+(adev->asic_type != CHIP_VEGA10) &&
 (adev->asic_type != CHIP_VEGA20) &&
 (adev->asic_type != CHIP_ARCTURUS)) /* enable runpm on VI+ */
adev->runpm = true;
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH] drm/amd/display: Fix pageflip event race condition for DCN.

2020-03-01 Thread Mario Kleiner
Commit '16f17eda8bad ("drm/amd/display: Send vblank and user
events at vsartup for DCN")' introduces a new way of pageflip
completion handling for DCN, and some trouble.

The current implementation introduces a race condition, which
can cause pageflip completion events to be sent out one vblank
too early, thereby confusing userspace and causing flicker:

prepare_flip_isr():

1. Pageflip programming takes the ddev->event_lock.
2. Sets acrtc->pflip_status = AMDGPU_FLIP_SUBMITTED.
3. Releases ddev->event_lock.

--> Deadline for surface address regs double-buffering passes on
target pipe.

4. dc_commit_updates_for_stream() MMIO programs the new pageflip
   into hw, but too late for current vblank.

=> pflip_status == AMDGPU_FLIP_SUBMITTED, but flip won't complete
   in current vblank due to missing the double-buffering deadline
   by a tiny bit.

5. VSTARTUP trigger point in vblank is reached, VSTARTUP irq fires,
   dm_dcn_crtc_high_irq() gets called.

6. Detects pflip_status == AMDGPU_FLIP_SUBMITTED and assumes the
   pageflip has been completed/will complete in this vblank and
   sends out pageflip completion event to userspace and resets
   pflip_status = AMDGPU_FLIP_NONE.

=> Flip completion event sent out one vblank too early.

This behaviour has been observed a couple of times during my testing
with measurement hardware.

The commit message says that the extra flip event code was added to
dm_dcn_crtc_high_irq() to avoid missing pageflip events when the
pflip irq doesn't fire, because the "DCH HUBP" component is clock
gated and doesn't fire pflip irqs in that state. It also says that
this clock gating may happen if no planes are active. This suggests
that the problem addressed by that commit can't happen if planes
are active.

The proposed solution is therefore to only execute the extra pflip
completion code iff the count of active planes is zero and otherwise
leave pflip completion handling to the pflip irq handler, for a
more race-free experience.
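
As a rough illustration of the proposed gating, here is a minimal,
self-contained model of the decision the vblank handler makes after the
change. The struct, enum and function names are hypothetical stand-ins
and not the driver's real types; only the condition (flip submitted and
zero active planes) is taken from the description above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's flip states. */
enum flip_status { FLIP_NONE, FLIP_SUBMITTED };

/* Hypothetical, simplified model of the per-CRTC state involved. */
struct crtc_model {
	enum flip_status pflip_status;
	int active_planes;   /* number of enabled planes on this CRTC */
	bool event_pending;  /* models "acrtc->event != NULL" */
	int events_sent;     /* completion events delivered so far */
};

/*
 * Models the vblank (VSTARTUP) handler's decision after the fix:
 * complete a pending flip here only when no planes are active, since
 * that is the only case where the pflip irq may not fire.
 * Returns true if this handler completed the flip.
 */
static bool vblank_irq_complete_flip(struct crtc_model *c)
{
	assert(c->active_planes >= 0);

	/* With active planes, completion is left to the pflip irq handler. */
	if (c->pflip_status != FLIP_SUBMITTED || c->active_planes != 0)
		return false;

	if (c->event_pending) {
		c->events_sent++;
		c->event_pending = false;
	}
	c->pflip_status = FLIP_NONE;
	return true;
}
```

With one or more active planes the model leaves the flip in the
submitted state, so a flip that missed the double-buffering deadline is
not reported as complete one vblank early.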

Note that I don't know if this fixes the problem the original commit
tried to address, as I don't know what the test scenario was. It
does fix the observed too-early pageflip events, though, and points
out the problem introduced.

Fixes: 16f17eda8bad ("drm/amd/display: Send vblank and user events at vsartup 
for DCN")
Signed-off-by: Mario Kleiner 
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c  | 18 +++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 63e8a12a74bc..3502d6d52160 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -522,8 +522,9 @@ static void dm_dcn_crtc_high_irq(void *interrupt_params)
 
acrtc_state = to_dm_crtc_state(acrtc->base.state);
 
-   DRM_DEBUG_DRIVER("crtc:%d, vupdate-vrr:%d\n", acrtc->crtc_id,
-   amdgpu_dm_vrr_active(acrtc_state));
+   DRM_DEBUG_DRIVER("crtc:%d, vupdate-vrr:%d, planes:%d\n", acrtc->crtc_id,
+amdgpu_dm_vrr_active(acrtc_state),
+acrtc_state->active_planes);
 
amdgpu_dm_crtc_handle_crc_irq(&acrtc->base);
drm_crtc_handle_vblank(&acrtc->base);
@@ -543,7 +544,18 @@ static void dm_dcn_crtc_high_irq(void *interrupt_params)
&acrtc_state->vrr_params.adjust);
}
 
-   if (acrtc->pflip_status == AMDGPU_FLIP_SUBMITTED) {
+   /*
+* If there aren't any active_planes then DCH HUBP may be clock-gated.
+* In that case, pageflip completion interrupts won't fire and pageflip
+* completion events won't get delivered. Prevent this by sending
+* pending pageflip events from here if a flip is still pending.
+*
+* If any planes are enabled, use dm_pflip_high_irq() instead, to
+* avoid race conditions between flip programming and completion,
+* which could cause too early flip completion events.
+*/
+   if (acrtc->pflip_status == AMDGPU_FLIP_SUBMITTED &&
+   acrtc_state->active_planes == 0) {
if (acrtc->event) {
drm_crtc_send_vblank_event(&acrtc->base, acrtc->event);
acrtc->event = NULL;
-- 
2.20.1



[PATCH] drm/amdgpu: Update SPM_VMID with the job's vmid when application reserves the vmid

2020-03-01 Thread Jacob He
SPM accesses the video memory according to SPM_VMID. It should be updated
with the job's vmid right before the job is scheduled. SPM_VMID is a
global resource.

Change-Id: Id3881908960398f87e7c95026a54ff83ff826700
Signed-off-by: Jacob He 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index c00696f3017e..c761d3a0b6e8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -1080,8 +1080,12 @@ int amdgpu_vm_flush(struct amdgpu_ring *ring, struct 
amdgpu_job *job,
struct dma_fence *fence = NULL;
bool pasid_mapping_needed = false;
unsigned patch_offset = 0;
+   bool update_spm_vmid_needed = (job->vm && 
(job->vm->reserved_vmid[vmhub] != NULL));
int r;
 
+   if (update_spm_vmid_needed && adev->gfx.rlc.funcs->update_spm_vmid)
+   adev->gfx.rlc.funcs->update_spm_vmid(adev, job->vmid);
+
if (amdgpu_vmid_had_gpu_reset(adev, id)) {
gds_switch_needed = true;
vm_flush_needed = true;
-- 
2.17.1



Re: [Intel-gfx] [Mesa-dev] gitlab.fd.o financial situation and impact on services

2020-03-01 Thread Jason Ekstrand
On Sun, Mar 1, 2020 at 2:49 PM Nicolas Dufresne  wrote:
>
> Hi Jason,
>
> I personally think the suggestions are still relatively good
> brainstorming data for those involved. Of course, for those not involved
> in the CI scripting itself, I'd say just keep in mind that nothing is
> black and white and every change ends up being time-consuming.

Sorry.  I didn't intend to stop a useful brainstorming session.  I'm
just trying to say that CI is useful and we shouldn't hurt our
development flows just to save a little money unless we're truly
desperate.  From what I understand, I don't think we're that desperate
yet.  So I was mostly trying to re-focus the discussion towards
straightforward things we can do to get rid of pointless waste (there
probably is some pretty low-hanging fruit) and away from "OMG X.org is
running out of money; CI as little as possible".  I don't think you're
saying those things; but I've sensed a good bit of fear in this
thread.  (I could just be totally misreading people, but I don't think
so.)

One of the things that someone pointed out on this thread is that we
need data.  Some has been provided here but it's still a bit unclear
exactly what the break-down is so it's hard for people to come up with
good solutions beyond "just do less CI".  We do know that the biggest
cost is egress web traffic and that's something we didn't know before.
My understanding is that people on the X.org board and/or Daniel are
working to get better data.  I'm fairly hopeful that, once we
understand better what the costs are (or even with just the new data
we have), we can bring it down to reasonable and/or come up with money
to pay for it in fairly short order.

Again, sorry I was so terse.  I was just trying to slow the panic.

> Le dimanche 01 mars 2020 à 14:18 -0600, Jason Ekstrand a écrit :
> > I've seen a number of suggestions which will do one or both of those things 
> > including:
> >
> >  - Batching merge requests
>
> Agreed. Or at least I foresee quite complicated code to handle the case
> of one batched merge failing the tests or, worse, flaky tests.
>
> >  - Not running CI on the master branch
>
> A small clarification, this depends on the chosen work-flow. In
> GStreamer, we use a rebase flow, so "merge" button isn't really
> merging. It means that to merge you need your branch to be rebased on
> top of the latest. As it is multi-repo, there is always a tiny chance
> of breakage due to mid-air collision in changes in other repos. What we
> see is that the post "merge" cannot even catch them all (as we already
> observed once). In fact, it usually does not catch anything. Or each
> time it caught something, we only noticed on the next MR. So we are
> really considering doing this as for this specific workflow/project, we
> found very little gain of having it.
>
> With real merge, the code being tested before/after the merge is
> different, and for that I agree with you.

Even with a rebase model, it's still potentially different; though
marge re-runs CI before merging.  I agree the risk is low, however,
and if you have GitLab set up to block MRs that don't pass CI, then
you may be able to drop the master branch to a daily run or something
like that.  Again, should be project-by-project.

> >  - Shutting off CI
>
> Of course :-), especially since we had CI before gitlab in GStreamer
> (just not pre-commit); we don't want to regress that far into the past.
>
> >  - Preventing CI on other non-MR branches
>
> Another small nuance, mesa does not prevent CI, it only makes it manual
> on non-MR. Users can go click run to get CI results. We could also have
> option to trigger the ci (the opposite of ci.skip) from git command
> line.

Hence my use of "prevent". :-)  It's very useful but, IMO, it should
be opt-in and not opt-out.  I think we agree here. :-)

--Jason


[PATCH] drm/amdgpu: disable 3D pipe 1 on Navi1x

2020-03-01 Thread Tianci Yin
From: "Tianci.Yin" 

[why]
The CP firmware decided to skip setting the state for 3D pipe 1 on Navi1x, as
there is no use case.

[how]
Disable 3D pipe 1 on Navi1x.

Change-Id: I6898bdfe31d4e7908bd9bcfa82b6a75e118e8727
Reviewed-by: Hawking Zhang 
Signed-off-by: Tianci.Yin 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 97 ++
 1 file changed, 51 insertions(+), 46 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 760fe2ebe799..f348512eb8c3 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -52,7 +52,7 @@
  * 1. Primary ring
  * 2. Async ring
  */
-#define GFX10_NUM_GFX_RINGS2
+#define GFX10_NUM_GFX_RINGS_NV1X   1
 #define GFX10_MEC_HPD_SIZE 2048
 
 #define F32_CE_PROGRAM_RAM_SIZE65536
@@ -1305,7 +1305,7 @@ static int gfx_v10_0_sw_init(void *handle)
case CHIP_NAVI14:
case CHIP_NAVI12:
adev->gfx.me.num_me = 1;
-   adev->gfx.me.num_pipe_per_me = 2;
+   adev->gfx.me.num_pipe_per_me = 1;
adev->gfx.me.num_queue_per_pipe = 1;
adev->gfx.mec.num_mec = 2;
adev->gfx.mec.num_pipe_per_mec = 4;
@@ -2711,18 +2711,20 @@ static int gfx_v10_0_cp_gfx_start(struct amdgpu_device 
*adev)
amdgpu_ring_commit(ring);
 
/* submit cs packet to copy state 0 to next available state */
-   ring = &adev->gfx.gfx_ring[1];
-   r = amdgpu_ring_alloc(ring, 2);
-   if (r) {
-   DRM_ERROR("amdgpu: cp failed to lock ring (%d).\n", r);
-   return r;
-   }
-
-   amdgpu_ring_write(ring, PACKET3(PACKET3_CLEAR_STATE, 0));
-   amdgpu_ring_write(ring, 0);
+   if (adev->gfx.num_gfx_rings > 1) {
+   /* maximum supported gfx ring is 2 */
+   ring = &adev->gfx.gfx_ring[1];
+   r = amdgpu_ring_alloc(ring, 2);
+   if (r) {
+   DRM_ERROR("amdgpu: cp failed to lock ring (%d).\n", r);
+   return r;
+   }
 
-   amdgpu_ring_commit(ring);
+   amdgpu_ring_write(ring, PACKET3(PACKET3_CLEAR_STATE, 0));
+   amdgpu_ring_write(ring, 0);
 
+   amdgpu_ring_commit(ring);
+   }
return 0;
 }
 
@@ -2819,39 +2821,41 @@ static int gfx_v10_0_cp_gfx_resume(struct amdgpu_device 
*adev)
mutex_unlock(&adev->srbm_mutex);
 
/* Init gfx ring 1 for pipe 1 */
-   mutex_lock(&adev->srbm_mutex);
-   gfx_v10_0_cp_gfx_switch_pipe(adev, PIPE_ID1);
-   ring = &adev->gfx.gfx_ring[1];
-   rb_bufsz = order_base_2(ring->ring_size / 8);
-   tmp = REG_SET_FIELD(0, CP_RB1_CNTL, RB_BUFSZ, rb_bufsz);
-   tmp = REG_SET_FIELD(tmp, CP_RB1_CNTL, RB_BLKSZ, rb_bufsz - 2);
-   WREG32_SOC15(GC, 0, mmCP_RB1_CNTL, tmp);
-   /* Initialize the ring buffer's write pointers */
-   ring->wptr = 0;
-   WREG32_SOC15(GC, 0, mmCP_RB1_WPTR, lower_32_bits(ring->wptr));
-   WREG32_SOC15(GC, 0, mmCP_RB1_WPTR_HI, upper_32_bits(ring->wptr));
-   /* Set the wb address wether it's enabled or not */
-   rptr_addr = adev->wb.gpu_addr + (ring->rptr_offs * 4);
-   WREG32_SOC15(GC, 0, mmCP_RB1_RPTR_ADDR, lower_32_bits(rptr_addr));
-   WREG32_SOC15(GC, 0, mmCP_RB1_RPTR_ADDR_HI, upper_32_bits(rptr_addr) &
-   CP_RB1_RPTR_ADDR_HI__RB_RPTR_ADDR_HI_MASK);
-   wptr_gpu_addr = adev->wb.gpu_addr + (ring->wptr_offs * 4);
-   WREG32_SOC15(GC, 0, mmCP_RB_WPTR_POLL_ADDR_LO,
-   lower_32_bits(wptr_gpu_addr));
-   WREG32_SOC15(GC, 0, mmCP_RB_WPTR_POLL_ADDR_HI,
-   upper_32_bits(wptr_gpu_addr));
-
-   mdelay(1);
-   WREG32_SOC15(GC, 0, mmCP_RB1_CNTL, tmp);
-
-   rb_addr = ring->gpu_addr >> 8;
-   WREG32_SOC15(GC, 0, mmCP_RB1_BASE, rb_addr);
-   WREG32_SOC15(GC, 0, mmCP_RB1_BASE_HI, upper_32_bits(rb_addr));
-   WREG32_SOC15(GC, 0, mmCP_RB1_ACTIVE, 1);
-
-   gfx_v10_0_cp_gfx_set_doorbell(adev, ring);
-   mutex_unlock(&adev->srbm_mutex);
-
+   if (adev->gfx.num_gfx_rings > 1) {
+   mutex_lock(&adev->srbm_mutex);
+   gfx_v10_0_cp_gfx_switch_pipe(adev, PIPE_ID1);
+   /* maximum supported gfx ring is 2 */
+   ring = &adev->gfx.gfx_ring[1];
+   rb_bufsz = order_base_2(ring->ring_size / 8);
+   tmp = REG_SET_FIELD(0, CP_RB1_CNTL, RB_BUFSZ, rb_bufsz);
+   tmp = REG_SET_FIELD(tmp, CP_RB1_CNTL, RB_BLKSZ, rb_bufsz - 2);
+   WREG32_SOC15(GC, 0, mmCP_RB1_CNTL, tmp);
+   /* Initialize the ring buffer's write pointers */
+   ring->wptr = 0;
+   WREG32_SOC15(GC, 0, mmCP_RB1_WPTR, lower_32_bits(ring->wptr));
+   WREG32_SOC15(GC, 0, mmCP_RB1_WPTR_HI, 
upper_32_bits(ring->wptr));
+   /* Set the wb address wether it's enabled or not */
+   rptr_addr = adev->wb.gpu_addr + (ring->rptr_offs * 4);
+   

RE: [PATCH] drm/amdgpu: Rearm IRQ in Navi10 SR-IOV if IRQ lost

2020-03-01 Thread Liu, Monk
Hi Samir,

It looks like this is your first upstream patch.

The format of your description needs to change:

Modify:
[PATCH] drm/amdgpu: Rearm IRQ in Navi10 SR-IOV if IRQ lost
To:
drm/amdgpu: Rearm IRQ in Navi10 SR-IOV if IRQ lost

With that changed, you can have my RB
(meaning you can add "Reviewed-by: Monk Liu " to the end of
your commit description).

_
Monk Liu|GPU Virtualization Team |AMD


-Original Message-
From: amd-gfx  On Behalf Of Samir Dhume
Sent: Friday, February 7, 2020 4:00 AM
To: amd-gfx@lists.freedesktop.org
Cc: Dhume, Samir 
Subject: [PATCH] drm/amdgpu: Rearm IRQ in Navi10 SR-IOV if IRQ lost

Ported from Vega10. SDMA stress tests sometimes see IRQ lost.

Signed-off-by: Samir Dhume 
---
 drivers/gpu/drm/amd/amdgpu/navi10_ih.c | 36 ++
 1 file changed, 36 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/navi10_ih.c 
b/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
index cf557a428298..e08245a446fc 100644
--- a/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
@@ -32,6 +32,7 @@
 #include "soc15_common.h"
 #include "navi10_ih.h"
 
+#define MAX_REARM_RETRY 10
 
 static void navi10_ih_set_interrupt_funcs(struct amdgpu_device *adev);
 
@@ -283,6 +284,38 @@ static void navi10_ih_decode_iv(struct amdgpu_device *adev,
ih->rptr += 32;
 }
 
+/**
+ * navi10_ih_irq_rearm - rearm IRQ if lost
+ *
+ * @adev: amdgpu_device pointer
+ *
+ */
+static void navi10_ih_irq_rearm(struct amdgpu_device *adev,
+  struct amdgpu_ih_ring *ih)
+{
+   uint32_t reg_rptr = 0;
+   uint32_t v = 0;
+   uint32_t i = 0;
+
+   if (ih == &adev->irq.ih)
+   reg_rptr = SOC15_REG_OFFSET(OSSSYS, 0, mmIH_RB_RPTR);
+   else if (ih == &adev->irq.ih1)
+   reg_rptr = SOC15_REG_OFFSET(OSSSYS, 0, mmIH_RB_RPTR_RING1);
+   else if (ih == &adev->irq.ih2)
+   reg_rptr = SOC15_REG_OFFSET(OSSSYS, 0, mmIH_RB_RPTR_RING2);
+   else
+   return;
+
+   /* Rearm IRQ / re-write doorbell if doorbell write is lost */
+   for (i = 0; i < MAX_REARM_RETRY; i++) {
+   v = RREG32_NO_KIQ(reg_rptr);
+   if ((v < ih->ring_size) && (v != ih->rptr))
+   WDOORBELL32(ih->doorbell_index, ih->rptr);
+   else
+   break;
+   }
+}
+
 /**
  * navi10_ih_set_rptr - set the IH ring buffer rptr
  *
@@ -297,6 +330,9 @@ static void navi10_ih_set_rptr(struct amdgpu_device *adev,
/* XXX check if swapping is necessary on BE */
*ih->rptr_cpu = ih->rptr;
WDOORBELL32(ih->doorbell_index, ih->rptr);
+
+   if (amdgpu_sriov_vf(adev))
+   navi10_ih_irq_rearm(adev, ih);
} else
WREG32_SOC15(OSSSYS, 0, mmIH_RB_RPTR, ih->rptr);
 }
--
2.20.1
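
For readers who want to see the rearm idea in isolation, the following
is a small user-space model of the bounded retry loop in the quoted
patch. Everything here is an assumption made for illustration: the
struct, the simulated "lost" doorbell writes and the function names are
hypothetical, and the register read is replaced by a plain field access.
Only the loop condition and the retry limit mirror the patch.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_REARM_RETRY 10

/* Hypothetical model of an IH ring whose doorbell writes can get lost. */
struct ih_model {
	uint32_t ring_size;
	uint32_t rptr;            /* read pointer the driver wants to publish */
	uint32_t hw_rptr;         /* value the (simulated) hardware has latched */
	unsigned drop_writes;     /* how many upcoming doorbell writes are lost */
	unsigned doorbell_writes; /* total doorbell writes attempted */
};

/* Simulated doorbell write: silently dropped while drop_writes > 0. */
static void doorbell_write(struct ih_model *ih, uint32_t v)
{
	ih->doorbell_writes++;
	if (ih->drop_writes) {
		ih->drop_writes--;
		return;
	}
	ih->hw_rptr = v;
}

/*
 * Re-write the doorbell until the hardware-side rptr matches the
 * driver's rptr, giving up after MAX_REARM_RETRY attempts.
 * Returns the number of re-writes that were issued.
 */
static unsigned ih_irq_rearm(struct ih_model *ih)
{
	unsigned i;

	for (i = 0; i < MAX_REARM_RETRY; i++) {
		uint32_t v = ih->hw_rptr; /* stands in for the register read */

		if (v < ih->ring_size && v != ih->rptr)
			doorbell_write(ih, ih->rptr);
		else
			break;
	}
	return i;
}
```

The retry limit keeps the loop bounded even if every write is lost, at
the cost of possibly leaving the pointers out of sync in that case.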



Re: [Intel-gfx] [Mesa-dev] gitlab.fd.o financial situation and impact on services

2020-03-01 Thread Bridgman, John
[AMD Official Use Only - Internal Distribution Only]

The one suggestion I saw that definitely seemed worth looking at was adding 
download caches if the larger CI systems didn't already have them.

Then again, do we know that CI traffic is generating the bulk of the costs? My 
guess would have been that individual developers and users would be generating 
as much traffic as the CI rigs.


From: amd-gfx on behalf of Jason Ekstrand
Sent: March 1, 2020 3:18 PM
To: Jacob Lifshay; Nicolas Dufresne
Cc: Erik Faye-Lund; Daniel Vetter; Michel Dänzer; X.Org development;
amd-gfx list; wayland; X.Org Foundation Board; Xorg Members List; dri-devel;
Mesa Dev; intel-gfx; Discussion of the development of and with GStreamer
Subject: Re: [Intel-gfx] [Mesa-dev] gitlab.fd.o financial situation and impact 
on services

I don't think we need to worry so much about the cost of CI that we need to 
micro-optimize to get the minimal number of CI runs. We especially shouldn't 
if it begins to impact coffee quality, people's ability to merge patches in a 
timely manner, or visibility into what went wrong when CI fails. I've seen a 
number of suggestions which will do one or both of those things including:

 - Batching merge requests
 - Not running CI on the master branch
 - Shutting off CI
 - Preventing CI on other non-MR branches
 - Disabling CI on WIP MRs
 - I'm sure there are more...

I think there are things we can do to make CI runs more efficient with some 
sort of end-point caching and we can probably find some truly wasteful CI to 
remove. Most of the things in the list above, I've seen presented by people who 
are only lightly involved in the project to my knowledge (no offense to anyone 
intended).  Developers depend on the CI system for their day-to-day work and 
hampering it will only slow down development, reduce code quality, and 
ultimately hurt our customers and community. If we're so desperate as to be 
considering painful solutions which will have a negative impact on development, 
we're better off trying to find more money.

--Jason


On March 1, 2020 13:51:32 Jacob Lifshay  wrote:

One idea for Marge-bot (don't know if you already do this):
Rust-lang has their bot (bors) automatically group together a few merge 
requests into a single merge commit, which it then tests; then, when the tests 
pass, it merges. This could help reduce CI runs to once a day (or some other 
rate). If the tests fail, then it could automatically deduce which one failed, 
by recursive subdivision or similar. There's also a mechanism to adjust 
priority and grouping behavior when the defaults aren't sufficient.

Jacob
___
Intel-gfx mailing list
intel-...@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx




Re: [Intel-gfx] [Mesa-dev] gitlab.fd.o financial situation and impact on services

2020-03-01 Thread Jason Ekstrand
I don't think we need to worry so much about the cost of CI that we need to 
micro-optimize to get the minimal number of CI runs. We especially 
shouldn't if it begins to impact coffee quality, people's ability to merge 
patches in a timely manner, or visibility into what went wrong when CI 
fails. I've seen a number of suggestions which will do one or both of those 
things including:


- Batching merge requests
- Not running CI on the master branch
- Shutting off CI
- Preventing CI on other non-MR branches
- Disabling CI on WIP MRs
- I'm sure there are more...

I think there are things we can do to make CI runs more efficient with some 
sort of end-point caching and we can probably find some truly wasteful CI 
to remove. Most of the things in the list above, I've seen presented by 
people who are only lightly involved in the project to my knowledge (no 
offense to anyone intended).  Developers depend on the CI system for their 
day-to-day work and hampering it will only slow down development, reduce 
code quality, and ultimately hurt our customers and community. If we're so 
desperate as to be considering painful solutions which will have a negative 
impact on development, we're better off trying to find more money.


--Jason

On March 1, 2020 13:51:32 Jacob Lifshay  wrote:

One idea for Marge-bot (don't know if you already do this):
Rust-lang has their bot (bors) automatically group together a few merge 
requests into a single merge commit, which it then tests; then, when the 
tests pass, it merges. This could help reduce CI runs to once a day (or 
some other rate). If the tests fail, then it could automatically deduce 
which one failed, by recursive subdivision or similar. There's also a 
mechanism to adjust priority and grouping behavior when the defaults aren't 
sufficient.


Jacob


Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-03-01 Thread Jacob Lifshay
One idea for Marge-bot (don't know if you already do this):
Rust-lang has their bot (bors) automatically group together a few merge
requests into a single merge commit, which it then tests; then, when the
tests pass, it merges. This could help reduce CI runs to once a day (or
some other rate). If the tests fail, then it could automatically deduce
which one failed, by recursive subdivision or similar. There's also a
mechanism to adjust priority and grouping behavior when the defaults aren't
sufficient.
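
To make the "recursive subdivision" idea concrete, here is a minimal
sketch of locating a failing merge request in a batch by repeated
halving. It is not how bors or Marge-bot are implemented; the callback
type, the fake CI and all names are hypothetical. It also assumes the
simplest case: exactly one merge request is at fault and any batch
containing it fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback: run CI on merge requests [lo, hi); true = pass. */
typedef bool (*batch_test_fn)(size_t lo, size_t hi, void *ctx);

/*
 * Find the failing merge request in [lo, hi) by halving the batch.
 * Returns hi if the whole batch passes. Assumes a single culprit, so a
 * passing first half implies the culprit is in the second half.
 */
static size_t find_culprit(size_t lo, size_t hi, batch_test_fn test, void *ctx)
{
	assert(lo < hi);

	if (test(lo, hi, ctx))
		return hi; /* whole batch is fine, nothing to bisect */

	while (hi - lo > 1) {
		size_t mid = lo + (hi - lo) / 2;

		if (!test(lo, mid, ctx))
			hi = mid; /* failure is in the first half */
		else
			lo = mid; /* first half passed, look in the second */
	}
	return lo;
}

/* Hypothetical fake CI used only to exercise the sketch. */
struct fake_ci {
	size_t culprit; /* index of the one bad merge request */
	unsigned runs;  /* number of CI runs consumed */
};

static bool fake_ci_run(size_t lo, size_t hi, void *ctx)
{
	struct fake_ci *ci = ctx;

	ci->runs++;
	return !(lo <= ci->culprit && ci->culprit < hi);
}
```

For a batch of N merge requests this needs roughly log2(N) extra CI
runs after a failure, and only one run when the whole batch passes.
Flaky tests would break the single-culprit assumption.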

Jacob


Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-03-01 Thread Michel Dänzer
On 2020-02-29 8:46 p.m., Nicolas Dufresne wrote:
> Le samedi 29 février 2020 à 19:14 +0100, Timur Kristóf a écrit :
>>
>> 1. I think we should completely disable running the CI on MRs which are
>> marked WIP. Speaking from personal experience, I usually make a lot of
>> changes to my MRs before they are merged, so it is a waste of CI
>> resources.

Interesting idea, do you want to create an MR implementing it?


> In the mean time, you can help by taking the habit to use:
> 
>   git push -o ci.skip

That breaks Marge Bot.


> Notably, we would like to get rid of the post merge CI, as in a rebase
> flow like we have in GStreamer, it's a really minor risk.

That should be pretty easy, see Mesa and
https://docs.gitlab.com/ce/ci/variables/predefined_variables.html.
Something like this should work:

  rules:
    - if: '$CI_PROJECT_NAMESPACE != "gstreamer"'
      when: never

This is another interesting idea we could consider for Mesa as well. It
would however require (mostly) banning direct pushes to the main repository.


>> 2. Maybe we could take this one step further and only allow the CI to
>> be only triggered manually instead of automatically on every push.

That would again break Marge Bot.


-- 
Earthling Michel Dänzer   |   https://redhat.com
Libre software enthusiast | Mesa and X developer