The engine provides a mirror of the CSB (context status buffer) in the
HWSP (hardware status page). If we use cacheable reads from the HWSP,
we can shave off a few mmio reads per context-switch interrupt (which
are quite frequent!). Just removing a couple of mmio reads is not
enough to actually reduce any latency, but it does give a small
reduction in overall cpu usage.

Much appreciation to Ben for dropping the bombshell that the CSB was in
the HWSP and to Michel for digging out the details.

Suggested-by: Ben Widawsky <[email protected]>
Signed-off-by: Chris Wilson <[email protected]>
Cc: Michel Thierry <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Cc: Mika Kuoppala <[email protected]>
---
 drivers/gpu/drm/i915/intel_lrc.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)
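
For readers unfamiliar with the layout, here is a minimal userspace
sketch of the access pattern the patch switches to. It only models what
the diff shows: the mirror starts at dword 0x10 of the status page and
each event occupies two dwords (status, then context id). The ring size,
the page size, the status bit and the helper name csb_count_completed()
are assumptions made for illustration and are not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

#define HWSP_DWORDS      1024  /* assumed: one 4 KiB page of 32-bit words */
#define CSB_HWSP_INDEX   0x10  /* dword index of the CSB mirror, as in the patch */
#define CSB_ENTRIES      6     /* assumed ring size for this sketch */
#define STATUS_COMPLETED 0x1u  /* placeholder bit, not the real hardware mask */

/*
 * Walk the CSB mirror from head to tail using plain memory loads instead
 * of mmio reads. head is advanced before each load (assumed ordering).
 * Returns the number of entries with the placeholder "completed" bit set
 * and stores the context id of the last such entry in *last_ctx.
 */
static unsigned int csb_count_completed(const volatile uint32_t *hwsp,
                                        unsigned int head, unsigned int tail,
                                        uint32_t *last_ctx)
{
        /* volatile: in the real case the hardware updates this memory */
        const volatile uint32_t *buf = &hwsp[CSB_HWSP_INDEX];
        unsigned int completed = 0;

        assert(head < CSB_ENTRIES && tail < CSB_ENTRIES);

        while (head != tail) {
                uint32_t status;

                if (++head == CSB_ENTRIES)
                        head = 0;

                status = buf[2 * head];           /* first dword: status */
                if (!(status & STATUS_COMPLETED))
                        continue;

                *last_ctx = buf[2 * head + 1];    /* second dword: context id */
                completed++;
        }

        return completed;
}
```

A caller would pass a pointer to the start of the status page together
with the head and tail indices it obtained elsewhere; in the driver those
still come from the mmio pointer register, which this patch leaves alone.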

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 9d231d0e427d..e413465a552b 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -547,8 +547,8 @@ static void intel_lrc_irq_handler(unsigned long data)
        while (test_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted)) {
                u32 __iomem *csb_mmio =
                        dev_priv->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine));
-               u32 __iomem *buf =
-                       dev_priv->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_BUF_LO(engine, 0));
+               /* The HWSP contains a (cacheable) mirror of the CSB */
+               u32 *buf = &engine->status_page.page_addr[0x10];
                unsigned int head, tail;
 
                /* The write will be ordered by the uncached read (itself
@@ -590,13 +590,12 @@ static void intel_lrc_irq_handler(unsigned long data)
                         * status notifier.
                         */
 
-                       status = readl(buf + 2 * head);
+                       status = buf[2 * head];
                        if (!(status & GEN8_CTX_STATUS_COMPLETED_MASK))
                                continue;
 
                        /* Check the context/desc id for this event matches */
-                       GEM_DEBUG_BUG_ON(readl(buf + 2 * head + 1) !=
-                                        port->context_id);
+                       GEM_DEBUG_BUG_ON(buf[2 * head + 1] != port->context_id);
 
                        rq = port_unpack(port, &count);
                        GEM_BUG_ON(count == 0);
-- 
2.13.2

_______________________________________________
Intel-gfx mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/intel-gfx