Re: [Intel-gfx] [PATCH 2/2] drm/i915: Make GPU pages movable

2016-11-15 Thread akash goel
On Wed, Nov 16, 2016 at 6:55 AM, Hugh Dickins  wrote:
> On Mon, 14 Nov 2016, akash goel wrote:
>> On Thu, Nov 10, 2016 at 1:00 PM, Goel, Akash  wrote:
>> > On 11/10/2016 12:09 PM, Hugh Dickins wrote:
>> >> On Fri, 4 Nov 2016, akash.g...@intel.com wrote:
>> >>> @@ -4185,6 +4189,8 @@ struct drm_i915_gem_object *
>> >>>
>> >>> mask = GFP_HIGHUSER | __GFP_RECLAIMABLE;
>> >>> +   if (IS_ENABLED(MIGRATION))
>
> Oh, I knew I'd seen a line like that recently, and it was bugging me
> that I ought to search my mailboxes for it; but now I'm glad to find
> it again.  If that condition stays, it would really need to say
>   if (IS_ENABLED(CONFIG_MIGRATION))
> wouldn't it?
>
Sorry this was a blooper, should have been
 if (IS_ENABLED(CONFIG_MIGRATION))

>> >>> +   mask |= __GFP_MOVABLE;
>> >>
>> >>
>> >> I was going to suggest just make that unconditional,
>> >> mask = GFP_HIGHUSER_MOVABLE | __GFP_RECLAIMABLE;
>> >>
>> >> But then I wondered what that __GFP_RECLAIMABLE actually achieves?
>> >> These pages are already __GFP_RECLAIM (inside GFP_HIGHUSER) and on
>> >> the LRU.  It affects gfpflags_to_migratetype(), but I'm not familiar
>> >> with what that different migratetype will end up doing.
>> >>
>> >
>> > Will check for this.
>> >
>>
>> The anti-fragmentation technique used by the kernel is based on the idea
>> of grouping pages with identical mobility (UNMOVABLE, RECLAIMABLE,
>> MOVABLE) together.
>
> Yes.
>
>> __GFP_RECLAIMABLE, like  __GFP_MOVABLE, specifies the
>> mobility/migration type of the page and serves a different purpose
>> than __GFP_RECLAIM.
>
> Yes, I was wrong to mention __GFP_RECLAIM above: it describes what
> to do when in difficulty allocating a page, but says nothing at all
> about the nature of the page to be allocated.
>
Right, nicely phrased, thanks.

>>
>> Also as per the below snippet from gfpflags_to_migratetype(), looks
>> like __GFP_MOVABLE &  __GFP_RECLAIMABLE can't be used together, which
>> makes sense.
>> /* Convert GFP flags to their corresponding migrate type */
>> #define GFP_MOVABLE_MASK (__GFP_RECLAIMABLE | __GFP_MOVABLE)
>> static inline int gfpflags_to_migratetype(const gfp_t gfp_flags)
>> {
>> VM_WARN_ON((gfp_flags & GFP_MOVABLE_MASK) == GFP_MOVABLE_MASK);
>> .
>
> You're right, that does exclude them from being used together.  And it
> makes sense inasmuch as they're expected to be applied to quite different
> uses of a page (lru pages versus slab pages).
>
> The comment on __GFP_MOVABLE says "or can be reclaimed"; and
> the comment on __GFP_RECLAIMABLE says "used for slab allocations...".
> Though it does not say "used for allocations not put on a reclaimable
> lru", I think that is the intention; whereas shmem allocations are put
> on a reclaimable lru (though they might need your shrinker to unpin them).
>
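As a rough illustration of the mapping discussed above, the sketch below models in plain userspace C how a pair of mobility bits could be turned into a migratetype. The SK_ names and bit values are stand-ins chosen for this example (they are not copied from the kernel headers quoted in this thread), and where the quoted helper only warns when both bits are set, this model returns an explicit error value instead.

```c
#include <assert.h>

/* Stand-in flag bits for this sketch only; not the kernel's definitions. */
#define SK_GFP_MOVABLE       0x08u
#define SK_GFP_RECLAIMABLE   0x10u
#define SK_GFP_MOVABLE_MASK  (SK_GFP_RECLAIMABLE | SK_GFP_MOVABLE)
#define SK_GFP_MOVABLE_SHIFT 3

/* Hypothetical migratetype numbering, arranged so that a shift is enough. */
enum sk_migratetype {
	SK_MIGRATE_INVALID     = -1,
	SK_MIGRATE_UNMOVABLE   = 0,
	SK_MIGRATE_MOVABLE     = 1,
	SK_MIGRATE_RECLAIMABLE = 2,
};

/*
 * Map the mobility bits of an allocation mask to a migratetype.
 * Setting both bits at once is rejected, mirroring the exclusion
 * that the VM_WARN_ON() in the quoted snippet expresses.
 */
static int sk_gfpflags_to_migratetype(unsigned int gfp_flags)
{
	unsigned int mobility = gfp_flags & SK_GFP_MOVABLE_MASK;

	if (mobility == SK_GFP_MOVABLE_MASK)
		return SK_MIGRATE_INVALID;

	return (int)(mobility >> SK_GFP_MOVABLE_SHIFT);
}
```

With these stand-in values, a mask carrying neither bit falls into the unmovable group, which is why the choice between the two flags only matters where pages are grouped by mobility.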

As per my understanding both  __GFP_MOVABLE & __GFP_RECLAIMABLE type
pages would get added to the LRU list for reclaiming.
Irrespective of whether a shmem page is allocated as __GFP_MOVABLE
type or  __GFP_RECLAIMABLE type, it will be added to the LRU list.

>>
>> So probably would need to update the mask like this,
>>     mask = GFP_HIGHUSER;
>>     if (IS_ENABLED(MIGRATION))
>>         mask |= __GFP_MOVABLE;
>>     else
>>         mask |= __GFP_RECLAIMABLE;
>>
>> Please kindly let us know if this looks fine to you or not.
>
> Thanks for looking into it more deeply.  You leave me thinking that
> it should simply say
>
> mask = GFP_HIGHUSER_MOVABLE;
>
> Which is the default anyway, but it then has the Crestline+Broadwater
> condition to modify the mask further, so it's probably clearest to
> leave the mask = GFP_HIGHUSER_MOVABLE explicit.
>
> GFP_HIGHUSER_MOVABLE is used in many places, and includes __GFP_MOVABLE
> without any condition on CONFIG_MIGRATION - because the migratetype is
> irrelevant if there is no migration, I presume.
>
> Would you lose something by not or'ing in __GFP_RECLAIMABLE when
> CONFIG_MIGRATION=n?  No, because __GFP_RECLAIMABLE is not used for
> anything but the migratetype, and the migratetype is then irrelevant.
> (I didn't study the code closely enough to say whether the grouping
> can still happen even when migration is disabled, but even if it
> does still happen, I can't see that it would have any benefit.)
>
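The suggestion above (start from an unconditionally movable mask, then let the Crestline/Broadwater condition narrow it) can be sketched in userspace C as follows. All SK_ names and bit values are hypothetical stand-ins for this example. Setting a DMA32-style bit in the restricted case is an assumption inferred from the "965gm cannot relocate objects above 4GiB" comment and the WARN_ON in the patch, not something the quoted hunk shows.

```c
#include <assert.h>

/* Stand-in flag bits for this sketch only; not the kernel's definitions. */
#define SK_GFP_HIGHMEM          0x02u
#define SK_GFP_DMA32            0x04u
#define SK_GFP_MOVABLE          0x08u
#define SK_GFP_USER             0x100u
#define SK_GFP_HIGHUSER         (SK_GFP_USER | SK_GFP_HIGHMEM)
#define SK_GFP_HIGHUSER_MOVABLE (SK_GFP_HIGHUSER | SK_GFP_MOVABLE)

/*
 * Build the allocation mask for an object's backing pages.
 * needs_low_4g models the Crestline/Broadwater case, where pages
 * must stay below 4GiB.
 */
static unsigned int sk_object_gfp_mask(int needs_low_4g)
{
	/* Movable regardless of whether migration is configured. */
	unsigned int mask = SK_GFP_HIGHUSER_MOVABLE;

	if (needs_low_4g) {
		mask &= ~SK_GFP_HIGHMEM;
		mask |= SK_GFP_DMA32;	/* assumption, see lead-in */
	}

	return mask;
}
```

The point of the sketch is only that the mobility bit is set in one place and survives the platform-specific adjustment, so no IS_ENABLED() test is needed around it.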
The 

Re: [Intel-gfx] [PATCH 2/2] drm/i915: Make GPU pages movable

2016-11-13 Thread akash goel
On Thu, Nov 10, 2016 at 1:00 PM, Goel, Akash  wrote:
>
>
> On 11/10/2016 12:09 PM, Hugh Dickins wrote:
>>
>> On Fri, 4 Nov 2016, akash.g...@intel.com wrote:
>>>
>>> From: Chris Wilson 
>>>
>>> On a long run of more than 2-3 days, physical memory tends to get
>>> fragmented severely, which considerably slows down the system. In such a
>>> scenario, the shrinker is also unable to help as lack of memory is not
>>> the actual problem, since it has been observed that there are enough free
>>> pages of 0 order. This also manifests itself when an individual zone in
>>> the mm runs out of pages and if we cannot migrate pages between zones,
>>> the kernel hits an out-of-memory even though there are free pages (and
>>> often all of swap) available.
>>>
>>> To address the issue of external fragmentation, kernel does a compaction
>>> (which involves migration of pages) but its efficacy depends upon how
>>> many pages are marked as MOVABLE, as only those pages can be migrated.
>>>
>>> Currently the backing pages for GPU buffers are allocated from shmemfs
>>> with GFP_RECLAIMABLE flag, in units of 4KB pages.  In the case of limited
>>> swap space, it may not be possible always to reclaim or swap-out pages of
>>> all the inactive objects, to make way for free space allowing formation
>>> of higher order groups of physically-contiguous pages on compaction.
>>>
>>> Just marking the GPU pages as MOVABLE will not suffice, as i915.ko has to
>>> pin the pages if they are in use by GPU, which will prevent their
>>> migration. So the migratepage callback in shmem is also hooked up to get
>>> a notification when kernel initiates the page migration. On the
>>> notification, i915.ko appropriately unpins the pages.  With this we can
>>> effectively mark the GPU pages as MOVABLE and hence mitigate the
>>> fragmentation problem.
>>>
>>> v2:
>>>  - Rename the migration routine to gem_shrink_migratepage, move it to the
>>>    shrinker file, and use the existing constructs (Chris)
>>>  - To cleanup, add a new helper function to encapsulate all page
>>>    migration skip conditions (Chris)
>>>  - Add a new local helper function in shrinker file, for dropping the
>>>    backing pages, and call the same from gem_shrink() also (Chris)
>>>
>>> v3:
>>>  - Fix/invert the check on the return value of unsafe_drop_pages (Chris)
>>>
>>> v4:
>>>  - Minor tidy
>>>
>>> v5:
>>>  - Fix unsafe usage of unsafe_drop_pages()
>>>  - Rebase onto vmap-notifier
>>>
>>> v6:
>>> - Remove i915_gem_object_get/put across unsafe_drop_pages() as with
>>>   struct_mutex protection object can't disappear. (Chris)
>>>
>>> Testcase: igt/gem_shrink
>>> Bugzilla: (e.g.) https://bugs.freedesktop.org/show_bug.cgi?id=90254
>>> Cc: Hugh Dickins 
>>> Cc: linux...@kvack.org
>>> Signed-off-by: Sourab Gupta 
>>> Signed-off-by: Akash Goel 
>>> Signed-off-by: Chris Wilson 
>>> Reviewed-by: Joonas Lahtinen 
>>> Reviewed-by: Chris Wilson 
>>
>>
>> I'm confused!  But perhaps it's gone around and around between you all,
>> I'm not sure what the rules are then.  I think this sequence implies
>> that Sourab wrote it originally, then Akash and Chris passed it on
>> with refinements - but then Chris wouldn't add Reviewed-by.
>>
> Thank you very much for the review and sorry for all the needless confusion.
>
> Chris actually conceived the patches and prepared an initial version of them
> (hence he is the Author).
> I & Sourab did the further refinements and fixed issues (all those
> page_private stuff).
> Chris then reviewed the final patch and also recently did a rebase for it.
>
>
>>> ---
>>>  drivers/gpu/drm/i915/i915_drv.h  |   2 +
>>>  drivers/gpu/drm/i915/i915_gem.c  |   9 ++-
>>>  drivers/gpu/drm/i915/i915_gem_shrinker.c | 132 +++
>>>  3 files changed, 142 insertions(+), 1 deletion(-)
>>>
snip
>>
>>> @@ -4185,6 +4189,8 @@ struct drm_i915_gem_object *
>>> goto fail;
>>>
>>> mask = GFP_HIGHUSER | __GFP_RECLAIMABLE;
>>> +   if (IS_ENABLED(MIGRATION))
>>> +   mask |= __GFP_MOVABLE;
>>
>>
>> I was going to suggest just make that unconditional,
>> mask = GFP_HIGHUSER_MOVABLE |

[Intel-gfx] [PATCH 1/2] shmem: Support for registration of driver/file owner specific ops

2016-11-04 Thread akash . goel
From: Chris Wilson 

This provides support for the drivers or shmem file owners to register
a set of callbacks, which can be invoked from the address space
operations methods implemented by shmem.  This allows the file owners to
hook into the shmem address space operations to do some extra/custom
operations in addition to the default ones.

The private_data field of address_space struct is used to store the
pointer to driver specific ops.  Currently only one ops field is defined,
which is migratepage, but can be extended on an as-needed basis.

The need for driver specific operations arises since some of the
operations (like migratepage) may not be handled completely within shmem,
so as to be effective, and would need some driver specific handling also.
Specifically, i915.ko would like to participate in migratepage().
i915.ko uses shmemfs to provide swappable backing storage for its user
objects, but when those objects are in use by the GPU it must pin the
entire object until the GPU is idle.  As a result, large chunks of memory
can be arbitrarily withdrawn from page migration, resulting in premature
out-of-memory due to fragmentation.  However, if i915.ko can receive the
migratepage() request, it can then flush the object from the GPU, remove
its pin and thus enable the migration.

Since gfx allocations are one of the major consumers of system memory, it
is imperative to have such a mechanism to effectively deal with
fragmentation.  Hence the need for such a provision for initiating
driver specific actions during address space operations.

v2:
- Drop dev_ prefix from the members of shmem_dev_info structure. (Joonas)
- Change the return type of shmem_set_device_op() to void and remove the
  check for pre-existing data. (Joonas)
- Rename shmem_set_device_op() to shmem_set_dev_info() to be consistent
  with shmem_dev_info structure. (Joonas)

Cc: Hugh Dickins 
Cc: linux...@kvack.org
Cc: linux-ker...@vger.linux.org
Signed-off-by: Sourab Gupta 
Signed-off-by: Akash Goel 
Reviewed-by: Chris Wilson 
---
 include/linux/shmem_fs.h | 13 +
 mm/shmem.c   | 17 -
 2 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
index ff078e7..454c3ba 100644
--- a/include/linux/shmem_fs.h
+++ b/include/linux/shmem_fs.h
@@ -39,11 +39,24 @@ struct shmem_sb_info {
unsigned long shrinklist_len; /* Length of shrinklist */
 };
 
+struct shmem_dev_info {
+   void *private_data;
+   int (*migratepage)(struct address_space *mapping,
+  struct page *newpage, struct page *page,
+  enum migrate_mode mode, void *dev_priv_data);
+};
+
 static inline struct shmem_inode_info *SHMEM_I(struct inode *inode)
 {
return container_of(inode, struct shmem_inode_info, vfs_inode);
 }
 
+static inline void shmem_set_dev_info(struct address_space *mapping,
+ struct shmem_dev_info *info)
+{
+   mapping->private_data = info;
+}
+
 /*
  * Functions in mm/shmem.c called directly from elsewhere:
  */
diff --git a/mm/shmem.c b/mm/shmem.c
index ad7813d..fce8de3 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1290,6 +1290,21 @@ static int shmem_writepage(struct page *page, struct writeback_control *wbc)
return 0;
 }
 
+#ifdef CONFIG_MIGRATION
+static int shmem_migratepage(struct address_space *mapping,
+struct page *newpage, struct page *page,
+enum migrate_mode mode)
+{
+   struct shmem_dev_info *dev_info = mapping->private_data;
+
+   if (dev_info && dev_info->migratepage)
+   return dev_info->migratepage(mapping, newpage, page,
+mode, dev_info->private_data);
+
+   return migrate_page(mapping, newpage, page, mode);
+}
+#endif
+
 #if defined(CONFIG_NUMA) && defined(CONFIG_TMPFS)
 static void shmem_show_mpol(struct seq_file *seq, struct mempolicy *mpol)
 {
@@ -3654,7 +3669,7 @@ static void shmem_destroy_inodecache(void)
.write_end  = shmem_write_end,
 #endif
 #ifdef CONFIG_MIGRATION
-   .migratepage= migrate_page,
+   .migratepage= shmem_migratepage,
 #endif
.error_remove_page = generic_error_remove_page,
 };
-- 
1.9.2

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
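
The dispatch that shmem_migratepage() performs in the patch above can be modeled in userspace roughly as below. This is a minimal sketch, not the kernel code: the sk_ types are simplified stand-ins for address_space and shmem_dev_info, the page arguments are dropped, and the driver hook that refuses migration with -EAGAIN while a "busy" flag is set is a hypothetical example of what an owner might do.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified stand-in for struct address_space. */
struct sk_mapping {
	void *private_data;
};

/* Simplified stand-in for struct shmem_dev_info. */
struct sk_dev_info {
	void *private_data;
	int (*migratepage)(struct sk_mapping *mapping, int mode,
			   void *dev_priv_data);
};

/* Counterpart of shmem_set_dev_info(): attach the owner's ops. */
static void sk_set_dev_info(struct sk_mapping *mapping,
			    struct sk_dev_info *info)
{
	mapping->private_data = info;
}

/* Stand-in for the generic migrate_page() path; always succeeds here. */
static int sk_default_migrate(struct sk_mapping *mapping, int mode)
{
	(void)mapping;
	(void)mode;
	return 0;
}

/* Prefer the owner's hook when one is registered, else fall back. */
static int sk_migratepage(struct sk_mapping *mapping, int mode)
{
	struct sk_dev_info *info = mapping->private_data;

	if (info && info->migratepage)
		return info->migratepage(mapping, mode, info->private_data);

	return sk_default_migrate(mapping, mode);
}

/* Hypothetical owner hook: refuse while busy, otherwise migrate. */
static int sk_driver_migratepage(struct sk_mapping *mapping, int mode,
				 void *dev_priv_data)
{
	const int *busy = dev_priv_data;

	if (*busy)
		return -EAGAIN;

	return sk_default_migrate(mapping, mode);
}

/* Small driver for the sketch: optionally register the hook, then migrate. */
static int sk_demo(int register_hook, int busy)
{
	struct sk_mapping mapping = { .private_data = NULL };
	struct sk_dev_info info = {
		.private_data = &busy,
		.migratepage = sk_driver_migratepage,
	};

	if (register_hook)
		sk_set_dev_info(&mapping, &info);

	return sk_migratepage(&mapping, 0);
}
```

A mapping with nothing registered behaves exactly as before the patch, which is what the NULL check on the private data is for.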


[Intel-gfx] [PATCH 2/2] drm/i915: Make GPU pages movable

2016-11-04 Thread akash . goel
From: Chris Wilson 

On a long run of more than 2-3 days, physical memory tends to get
fragmented severely, which considerably slows down the system. In such a
scenario, the shrinker is also unable to help as lack of memory is not
the actual problem, since it has been observed that there are enough free
pages of 0 order. This also manifests itself when an individual zone in
the mm runs out of pages and if we cannot migrate pages between zones,
the kernel hits an out-of-memory even though there are free pages (and
often all of swap) available.

To address the issue of external fragmentation, kernel does a compaction
(which involves migration of pages) but its efficacy depends upon how
many pages are marked as MOVABLE, as only those pages can be migrated.

Currently the backing pages for GPU buffers are allocated from shmemfs
with GFP_RECLAIMABLE flag, in units of 4KB pages.  In the case of limited
swap space, it may not be possible always to reclaim or swap-out pages of
all the inactive objects, to make way for free space allowing formation
of higher order groups of physically-contiguous pages on compaction.

Just marking the GPU pages as MOVABLE will not suffice, as i915.ko has to
pin the pages if they are in use by GPU, which will prevent their
migration. So the migratepage callback in shmem is also hooked up to get
a notification when kernel initiates the page migration. On the
notification, i915.ko appropriately unpins the pages.  With this we can
effectively mark the GPU pages as MOVABLE and hence mitigate the
fragmentation problem.

v2:
 - Rename the migration routine to gem_shrink_migratepage, move it to the
   shrinker file, and use the existing constructs (Chris)
 - To cleanup, add a new helper function to encapsulate all page migration
   skip conditions (Chris)
 - Add a new local helper function in shrinker file, for dropping the
   backing pages, and call the same from gem_shrink() also (Chris)

v3:
 - Fix/invert the check on the return value of unsafe_drop_pages (Chris)

v4:
 - Minor tidy

v5:
 - Fix unsafe usage of unsafe_drop_pages()
 - Rebase onto vmap-notifier

v6:
- Remove i915_gem_object_get/put across unsafe_drop_pages() as with
  struct_mutex protection object can't disappear. (Chris)

Testcase: igt/gem_shrink
Bugzilla: (e.g.) https://bugs.freedesktop.org/show_bug.cgi?id=90254
Cc: Hugh Dickins 
Cc: linux...@kvack.org
Signed-off-by: Sourab Gupta 
Signed-off-by: Akash Goel 
Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
Reviewed-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_drv.h  |   2 +
 drivers/gpu/drm/i915/i915_gem.c  |   9 ++-
 drivers/gpu/drm/i915/i915_gem_shrinker.c | 132 +++
 3 files changed, 142 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 4735b417..7f2717b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1357,6 +1357,8 @@ struct intel_l3_parity {
 };
 
 struct i915_gem_mm {
+   struct shmem_dev_info shmem_info;
+
/** Memory allocator for GTT stolen memory */
struct drm_mm stolen;
/** Protects the usage of the GTT stolen memory allocator. This is
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 1f995ce..f0d4ce7 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2164,6 +2164,7 @@ void __i915_gem_object_invalidate(struct drm_i915_gem_object *obj)
if (obj->mm.madv == I915_MADV_WILLNEED)
mark_page_accessed(page);
 
+   set_page_private(page, 0);
put_page(page);
}
obj->mm.dirty = false;
@@ -2310,6 +2311,7 @@ static unsigned int swiotlb_max_size(void)
sg->length += PAGE_SIZE;
}
last_pfn = page_to_pfn(page);
+   set_page_private(page, (unsigned long)obj);
 
/* Check that the i965g/gm workaround works. */
WARN_ON((gfp & __GFP_DMA32) && (last_pfn >= 0x00100000UL));
@@ -2334,8 +2336,10 @@ static unsigned int swiotlb_max_size(void)
 
 err_pages:
sg_mark_end(sg);
-   for_each_sgt_page(page, sgt_iter, st)
+   for_each_sgt_page(page, sgt_iter, st) {
+   set_page_private(page, 0);
put_page(page);
+   }
sg_free_table(st);
kfree(st);
 
@@ -4185,6 +4189,8 @@ struct drm_i915_gem_object *
goto fail;
 
mask = GFP_HIGHUSER | __GFP_RECLAIMABLE;
+   if (IS_ENABLED(MIGRATION))
+   mask |= __GFP_MOVABLE;
if (IS_CRESTLINE(dev_priv) || IS_BROADWATER(dev_priv)) {
/* 965gm cannot relocate objects above 4GiB. */
mask &= ~__GFP_HIGHMEM;
@@ -4193,6 +4199,7 @@ struct drm_i915_gem_object *
 
mapping = obj->base.filp->f_mapping;
mapping_set_gfp_mask(mapping, mask);
+

[Intel-gfx] [PATCH 2/2] drm/i915: Make GPU pages movable

2016-11-04 Thread akash . goel
From: Chris Wilson 

On a long run of more than 2-3 days, physical memory tends to get
fragmented severely, which considerably slows down the system. In such a
scenario, the shrinker is also unable to help as lack of memory is not
the actual problem, since it has been observed that there are enough free
pages of 0 order. This also manifests itself when an individual zone in
the mm runs out of pages and if we cannot migrate pages between zones,
the kernel hits an out-of-memory even though there are free pages (and
often all of swap) available.

To address the issue of external fragmentation, kernel does a compaction
(which involves migration of pages) but its efficacy depends upon how
many pages are marked as MOVABLE, as only those pages can be migrated.

Currently the backing pages for GPU buffers are allocated from shmemfs
with GFP_RECLAIMABLE flag, in units of 4KB pages.  In the case of limited
swap space, it may not be possible always to reclaim or swap-out pages of
all the inactive objects, to make way for free space allowing formation
of higher order groups of physically-contiguous pages on compaction.

Just marking the GPU pages as MOVABLE will not suffice, as i915.ko has to
pin the pages if they are in use by GPU, which will prevent their
migration. So the migratepage callback in shmem is also hooked up to get
a notification when kernel initiates the page migration. On the
notification, i915.ko appropriately unpins the pages.  With this we can
effectively mark the GPU pages as MOVABLE and hence mitigate the
fragmentation problem.

v2:
 - Rename the migration routine to gem_shrink_migratepage, move it to the
   shrinker file, and use the existing constructs (Chris)
 - To cleanup, add a new helper function to encapsulate all page migration
   skip conditions (Chris)
 - Add a new local helper function in shrinker file, for dropping the
   backing pages, and call the same from gem_shrink() also (Chris)

v3:
 - Fix/invert the check on the return value of unsafe_drop_pages (Chris)

v4:
 - Minor tidy

v5:
 - Fix unsafe usage of unsafe_drop_pages()
 - Rebase onto vmap-notifier

Testcase: igt/gem_shrink
Bugzilla: (e.g.) https://bugs.freedesktop.org/show_bug.cgi?id=90254
Cc: Hugh Dickins 
Cc: linux...@kvack.org
Signed-off-by: Sourab Gupta 
Signed-off-by: Akash Goel 
Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
Reviewed-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_drv.h  |   2 +
 drivers/gpu/drm/i915/i915_gem.c  |   9 ++-
 drivers/gpu/drm/i915/i915_gem_shrinker.c | 134 +++
 3 files changed, 144 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 4735b417..7f2717b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1357,6 +1357,8 @@ struct intel_l3_parity {
 };
 
 struct i915_gem_mm {
+   struct shmem_dev_info shmem_info;
+
/** Memory allocator for GTT stolen memory */
struct drm_mm stolen;
/** Protects the usage of the GTT stolen memory allocator. This is
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 1f995ce..f0d4ce7 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2164,6 +2164,7 @@ void __i915_gem_object_invalidate(struct drm_i915_gem_object *obj)
if (obj->mm.madv == I915_MADV_WILLNEED)
mark_page_accessed(page);
 
+   set_page_private(page, 0);
put_page(page);
}
obj->mm.dirty = false;
@@ -2310,6 +2311,7 @@ static unsigned int swiotlb_max_size(void)
sg->length += PAGE_SIZE;
}
last_pfn = page_to_pfn(page);
+   set_page_private(page, (unsigned long)obj);
 
/* Check that the i965g/gm workaround works. */
WARN_ON((gfp & __GFP_DMA32) && (last_pfn >= 0x00100000UL));
@@ -2334,8 +2336,10 @@ static unsigned int swiotlb_max_size(void)
 
 err_pages:
sg_mark_end(sg);
-   for_each_sgt_page(page, sgt_iter, st)
+   for_each_sgt_page(page, sgt_iter, st) {
+   set_page_private(page, 0);
put_page(page);
+   }
sg_free_table(st);
kfree(st);
 
@@ -4185,6 +4189,8 @@ struct drm_i915_gem_object *
goto fail;
 
mask = GFP_HIGHUSER | __GFP_RECLAIMABLE;
+   if (IS_ENABLED(MIGRATION))
+   mask |= __GFP_MOVABLE;
if (IS_CRESTLINE(dev_priv) || IS_BROADWATER(dev_priv)) {
/* 965gm cannot relocate objects above 4GiB. */
mask &= ~__GFP_HIGHMEM;
@@ -4193,6 +4199,7 @@ struct drm_i915_gem_object *
 
mapping = obj->base.filp->f_mapping;
mapping_set_gfp_mask(mapping, mask);
+   shmem_set_dev_info(mapping, &dev_priv->mm.shmem_info);
 
i915_gem_object_init(obj, &i915_gem_object_ops);
 
diff -

[Intel-gfx] [PATCH 1/2] shmem: Support for registration of driver/file owner specific ops

2016-11-04 Thread akash . goel
From: Chris Wilson 

This provides support for the drivers or shmem file owners to register
a set of callbacks, which can be invoked from the address space
operations methods implemented by shmem.  This allows the file owners to
hook into the shmem address space operations to do some extra/custom
operations in addition to the default ones.

The private_data field of address_space struct is used to store the
pointer to driver specific ops.  Currently only one ops field is defined,
which is migratepage, but can be extended on an as-needed basis.

The need for driver specific operations arises since some of the
operations (like migratepage) may not be handled completely within shmem,
so as to be effective, and would need some driver specific handling also.
Specifically, i915.ko would like to participate in migratepage().
i915.ko uses shmemfs to provide swappable backing storage for its user
objects, but when those objects are in use by the GPU it must pin the
entire object until the GPU is idle.  As a result, large chunks of memory
can be arbitrarily withdrawn from page migration, resulting in premature
out-of-memory due to fragmentation.  However, if i915.ko can receive the
migratepage() request, it can then flush the object from the GPU, remove
its pin and thus enable the migration.

Since gfx allocations are one of the major consumers of system memory, it
is imperative to have such a mechanism to effectively deal with
fragmentation.  Hence the need for such a provision for initiating
driver specific actions during address space operations.

v2:
- Drop dev_ prefix from the members of shmem_dev_info structure. (Joonas)
- Change the return type of shmem_set_device_op() to void and remove the
  check for pre-existing data. (Joonas)
- Rename shmem_set_device_op() to shmem_set_dev_info() to be consistent
  with shmem_dev_info structure. (Joonas)

Cc: Hugh Dickins 
Cc: linux...@kvack.org
Cc: linux-ker...@vger.linux.org
Signed-off-by: Sourab Gupta 
Signed-off-by: Akash Goel 
Reviewed-by: Chris Wilson 
---
 include/linux/shmem_fs.h | 13 +
 mm/shmem.c   | 17 -
 2 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
index ff078e7..22796a0 100644
--- a/include/linux/shmem_fs.h
+++ b/include/linux/shmem_fs.h
@@ -39,11 +39,24 @@ struct shmem_sb_info {
unsigned long shrinklist_len; /* Length of shrinklist */
 };
 
+struct shmem_dev_info {
+   void *private_data;
+   int (*migratepage)(struct address_space *mapping,
+  struct page *newpage, struct page *page,
+  enum migrate_mode mode, void *dev_priv_data);
+};
+
 static inline struct shmem_inode_info *SHMEM_I(struct inode *inode)
 {
return container_of(inode, struct shmem_inode_info, vfs_inode);
 }
 
+static inline void shmem_set_dev_info(struct address_space *mapping,
+struct shmem_dev_info *info)
+{
+   mapping->private_data = info;
+}
+
 /*
  * Functions in mm/shmem.c called directly from elsewhere:
  */
diff --git a/mm/shmem.c b/mm/shmem.c
index ad7813d..bf71ddd 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1290,6 +1290,21 @@ static int shmem_writepage(struct page *page, struct writeback_control *wbc)
return 0;
 }
 
+#ifdef CONFIG_MIGRATION
+static int shmem_migratepage(struct address_space *mapping,
+struct page *newpage, struct page *page,
+enum migrate_mode mode)
+{
+   struct shmem_dev_info *dev_info = mapping->private_data;
+
+   if (dev_info && dev_info->migratepage)
+   return dev_info->migratepage(mapping, newpage, page,
+   mode, dev_info->private_data);
+
+   return migrate_page(mapping, newpage, page, mode);
+}
+#endif
+
 #if defined(CONFIG_NUMA) && defined(CONFIG_TMPFS)
 static void shmem_show_mpol(struct seq_file *seq, struct mempolicy *mpol)
 {
@@ -3654,7 +3669,7 @@ static void shmem_destroy_inodecache(void)
.write_end  = shmem_write_end,
 #endif
 #ifdef CONFIG_MIGRATION
-   .migratepage= migrate_page,
+   .migratepage= shmem_migratepage,
 #endif
.error_remove_page = generic_error_remove_page,
 };
-- 
1.9.2



[Intel-gfx] [PATCH v9] igt/gem_trtt: Exercise the TRTT hardware

2016-10-27 Thread akash . goel
From: Akash Goel 

This patch provides the testcase to exercise the TRTT hardware.

Some platforms have additional address translation hardware in the form
of a Tiled Resource Translation Table (TR-TT), which provides an extra level
of abstraction over PPGTT.
This is useful for mapping Sparse/Tiled texture resources.

TR-TT is tightly coupled with PPGTT: a new instance of TR-TT will be required
for a new PPGTT instance, but TR-TT may not be enabled for every context.
1/16th of the 48bit PPGTT space is earmarked for the translation by TR-TT;
which such chunk to use is conveyed to HW through a register.
Any GFX address which lies in that reserved 44 bit range will be translated
through TR-TT first and then through PPGTT to get the actual physical address.
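
The segment arithmetic described above can be sketched in plain C as follows. This is only an illustration of "1/16th of a 48 bit space": the SK_ names are made up for this example, addresses are treated as plain 48 bit values (canonical sign extension is ignored), and how the chosen segment is encoded in the hardware register is not modeled.

```c
#include <assert.h>
#include <stdint.h>

#define SK_PPGTT_BITS   48
#define SK_SEGMENT_BITS 44
/* 2^(48-44) = 16 candidate segments. */
#define SK_NUM_SEGMENTS (1u << (SK_PPGTT_BITS - SK_SEGMENT_BITS))

/* Start address of segment n, for n in [0, SK_NUM_SEGMENTS). */
static uint64_t sk_segment_base(unsigned int n)
{
	return (uint64_t)n << SK_SEGMENT_BITS;
}

/* Which of the 16 segments a 48 bit address falls into. */
static unsigned int sk_segment_of(uint64_t addr)
{
	return (unsigned int)((addr >> SK_SEGMENT_BITS) &
			      (SK_NUM_SEGMENTS - 1));
}

/* Non-zero if addr lies inside the segment currently given to TR-TT. */
static int sk_in_trtt_segment(uint64_t addr, unsigned int trtt_segment)
{
	return sk_segment_of(addr) == trtt_segment;
}
```

Only addresses for which sk_in_trtt_segment() holds would take the extra TR-TT walk in this model; everything else goes straight through PPGTT.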

TRTT is constructed as a 3 level tile Table. Each tile is 64KB in size, which
leaves behind 44-16=28 address bits. The 28 bits are partitioned as 9+9+10, and
each level is contained within a 4KB page, hence L3 and L2 are composed of
512 64b entries and L1 is composed of 1024 32b entries.
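
The 9+9+10 split above can be written out as a small C sketch. It assumes that L3 takes the most significant 9 bits and L1 the least significant 10 bits of the tile number, which matches the L3 -> L2 -> L1 walk order but is not stated explicitly in this description; the sk_ names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SK_SEGMENT_BITS 44	/* size of the TR-TT segment */
#define SK_TILE_SHIFT   16	/* 64KB tiles */
#define SK_L1_BITS      10	/* 1024 entries x 32b = 4KB */
#define SK_L2_BITS      9	/*  512 entries x 64b = 4KB */
#define SK_L3_BITS      9	/*  512 entries x 64b = 4KB */

struct sk_trtt_index {
	unsigned int l3;		/* index into the L3 table */
	unsigned int l2;		/* index into the L2 table */
	unsigned int l1;		/* index into the L1 table */
	uint32_t tile_offset;		/* byte offset inside the 64KB tile */
};

/* Split an address inside the TR-TT segment into per-level indices. */
static struct sk_trtt_index sk_trtt_split(uint64_t addr)
{
	uint64_t in_segment = addr & ((UINT64_C(1) << SK_SEGMENT_BITS) - 1);
	uint64_t tile = in_segment >> SK_TILE_SHIFT;	/* 28 bit tile number */
	struct sk_trtt_index idx;

	idx.tile_offset = (uint32_t)(in_segment &
				     ((UINT64_C(1) << SK_TILE_SHIFT) - 1));
	idx.l1 = (unsigned int)(tile & ((1u << SK_L1_BITS) - 1));
	idx.l2 = (unsigned int)((tile >> SK_L1_BITS) &
				((1u << SK_L2_BITS) - 1));
	idx.l3 = (unsigned int)((tile >> (SK_L1_BITS + SK_L2_BITS)) &
				((1u << SK_L3_BITS) - 1));
	return idx;
}
```

The bit counts add up as 16 + 10 + 9 + 9 = 44, and both table shapes (512 entries of 8 bytes, 1024 entries of 4 bytes) fill exactly one 4KB page.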

There is a provision to keep TR-TT Tables in virtual space, where the pages of
TRTT tables will be mapped to PPGTT. This is the adopted mode, as in this mode
UMD will have full control over TR-TT management, with bare minimum support
from KMD.
So the entries of L3 table will contain the PPGTT offset of L2 Table pages,
similarly entries of L2 table will contain the PPGTT offset of L1 Table pages.
The entries of L1 table will contain the PPGTT offset of BOs actually backing
the Sparse resources.

I915_GEM_CONTEXT_SETPARAM ioctl is used to request KMD to enable TRTT for a
certain context, a new I915_CONTEXT_PARAM_ENABLE_TRTT param has been
added to the CONTEXT_SETPARAM ioctl for that purpose.

v2:
 - Add new wrapper function __gem_context_require_param and used that
   to detect the TR-TT support
 - Use igt_main macro, rename certain function, remove extra white space,
   cleanup the code (Chris)
 - Enhance the basic subtest to exercise all possible TR-TT segment start
   locations (i.e. 16 of them) & for every iteration create a new context.

v3:
 - Get rid of some superfluous local variables (Chris)
 - Add asserts to validate whether the GFX address used in MI_STORE_DATA_IMM
   command is in canonical form & is correctly aligned or not (Chris)
 - Remove clearing of errno in has_trtt_support function (Chris)
 - Use the 48B_ADDRESS flag for batch buffer BO also (Chris)
 - Rebased.

v4:
 - Add new subtest for invalid settings.
 - Add new local function query_trtt to check the Driver state (Chris)
 - Add new helper function gem_uses_64b_ppgtt to detect 64bit PPGTT support
 - Remove local functions uses_full_ppgtt & has_softpin_support, instead use
   existing wrappers gem_has_softpin & gem_uses_64b_ppgtt (Chris).
 - Remove redundant bit masking in emit_store_xxx functions (Chris).

v5:
 - Add 2 new subtests checking the forceful eviction of active/hanging
   objects overlapping with the TR-TT segment (Chris).
 - Move gen8_canonical_addr to igt_aux as it's needed by other tests that
   do soft pinning, and not just gem_softpin (Michel)

v6:
 - Allow each subtest to have their own private drm file instance.
 - Update the basic subtest to check each segment location for both default
   and User created contexts (Chris).
 - Reorder igt_require(softpin/64b_ppgtt/trtt) to have trtt first (Chris).
 - Update the invalid subtest to check for the improper value of Null &
   Invalid tiles.

v7:
 - Add interruptible version of the 2 eviction subtests (Chris)
 - Correct the alignment check for the MI Store qword command (Chris).

v8:
 - Use new igt_interruptible macro for the interruptible subtests, as it's
   more effective & precise.
 - Remove interruptible version of evict_hang subtest, as it doesn't provide
   any additional coverage (Chris)
 - Rebased.

v9:
- Rebased.

Cc: Chris Wilson 
Signed-off-by: Michel Thierry 
Signed-off-by: Akash Goel 
Reviewed-by: Michel Thierry  (v5)
---
 lib/igt_aux.c  |  11 +
 lib/igt_aux.h  |   1 +
 lib/igt_gt.h   |   1 +
 lib/ioctl_wrappers.c   |  39 ++-
 lib/ioctl_wrappers.h   |   3 +
 tests/Makefile.sources |   1 +
 tests/gem_softpin.c|  18 +-
 tests/gem_trtt.c   | 630 +
 8 files changed, 681 insertions(+), 23 deletions(-)
 create mode 100644 tests/gem_trtt.c

diff --git a/lib/igt_aux.c b/lib/igt_aux.c
index 421f6d4..9cb4666 100644
--- a/lib/igt_aux.c
+++ b/lib/igt_aux.c
@@ -487,6 +487,17 @@ void igt_exchange_int(void *array, unsigned i, unsigned j)
int_arr[j] = tmp;
 }
 
+/* igt_canonical_addr
+ * Used to convert any address into canonical form, i.e. [63:48] == [47].
+ * Based on kernel's sign_extend64 implementation.
+ * @address - a virtual address
+*/
+uint64_t igt_canonical_addr(uint64_t address)
+{
+   __u8 shift = 63 - GEN8_HIGH_ADDRESS_BIT;
+   return (__s64)(address << shift) >> shift;
+}
+
 /**
  * i

[Intel-gfx] [PATCH v2] drm/i915/guc: WA to address the Ringbuffer coherency issue

2016-10-25 Thread akash . goel
From: Akash Goel 

Driver accesses the ringbuffer pages, via GMADR BAR, if the pages are
pinned in mappable aperture portion of GGTT and for ringbuffer pages
allocated from Stolen memory, access can only be done through GMADR BAR.
In case of GuC based submission, updates done in ringbuffer via GMADR
may not get committed to memory by the time the Command streamer starts
reading them, resulting in fetching of stale data.

For Host based submission, there is no such problem, as the write to Ring
Tail or ELSP register happens from the Host side prior to submission.
Access to any GFX register from CPU side goes to GTTMMADR BAR and Hw already
enforces the ordering between outstanding GMADR writes & new GTTMMADR access.
MMIO writes from GuC side do not go to GTTMMADR BAR as GuC communication to
registers within GT is contained within GT, so ordering is not enforced,
resulting in a race, which can manifest in the form of a hang.

To ensure the flush of in-flight GMADR writes, a POSTING READ is done to
GuC register prior to doorbell ring.
There is already a similar WA in i915_gem_object_flush_gtt_write_domain(),
which takes care of GMADR writes from User space to GEM buffers, but not the
ringbuffer writes from KMD.
This WA is needed on all recent HW.

v2:
- Use POSTING_READ_FW instead of POSTING_READ, as GuC registers do not lie
  in any forcewake domain range and so the overhead of the spinlock & search
  in the forcewake table is avoidable. (Chris)

Cc: Chris Wilson 
Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index bf65ffa..74235ea 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -634,6 +634,7 @@ static int guc_ring_doorbell(struct i915_guc_client *gc)
  */
 static void i915_guc_submit(struct drm_i915_gem_request *rq)
 {
+   struct drm_i915_private *dev_priv = rq->i915;
unsigned int engine_id = rq->engine->id;
struct intel_guc *guc = &rq->i915->guc;
struct i915_guc_client *client = guc->execbuf_client;
@@ -641,6 +642,11 @@ static void i915_guc_submit(struct drm_i915_gem_request *rq)
 
spin_lock(&client->wq_lock);
guc_wq_item_append(client, rq);
+
+   /* WA to flush out the pending GMADR writes to ring buffer. */
+   if (i915_vma_is_map_and_fenceable(rq->ring->vma))
+   POSTING_READ_FW(GUC_STATUS);
+
b_ret = guc_ring_doorbell(client);
 
client->submissions[engine_id] += 1;
-- 
1.9.2

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 1/2] shmem: Support for registration of Driver/file owner specific ops

2016-10-19 Thread akash goel
On Thu, Mar 24, 2016 at 5:41 PM, Joonas Lahtinen
 wrote:
> On ke, 2016-03-23 at 11:39 +0530, akash.g...@intel.com wrote:
>> From: Chris Wilson 
>>
>> This provides support for the Drivers or shmem file owners to register
>> a set of callbacks, which can be invoked from the address space operations
>> methods implemented by shmem.
>> This allows the file owners to hook into the shmem address space operations
>> to do some extra/custom operations in addition to the default ones.
>>
>> The private_data field of address_space struct is used to store the pointer
>> to driver specific ops.
>> Currently only one ops field is defined, which is migratepage, but can be
>> extended on need basis.
>>
>> The need for driver specific operations arises since some of the operations
>> (like migratepage) may not be handled completely within shmem, so as to be
>> effective, and would need some driver specific handling also.
>>
>> Specifically, i915.ko would like to participate in migratepage().
>> i915.ko uses shmemfs to provide swappable backing storage for its user
>> objects, but when those objects are in use by the GPU it must pin the entire
>> object until the GPU is idle. As a result, large chunks of memory can be
>> arbitrarily withdrawn from page migration, resulting in premature
>> out-of-memory due to fragmentation. However, if i915.ko can receive the
>> migratepage() request, it can then flush the object from the GPU, remove
>> its pin and thus enable the migration.
>>
>> Since Gfx allocations are one of the major consumers of system memory, it's
>> imperative to have such a mechanism to effectively deal with fragmentation.
>> And therefore the need for such a provision for initiating driver specific
>> actions during address space operations.
>>
>> Cc: Hugh Dickins 
>> Cc: linux...@kvack.org
>> Signed-off-by: Sourab Gupta 
>> Signed-off-by: Akash Goel 
>> ---
>>  include/linux/shmem_fs.h | 17 +
>>  mm/shmem.c   | 17 -
>>  2 files changed, 33 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
>> index 4d4780c..6cfa76a 100644
>> --- a/include/linux/shmem_fs.h
>> +++ b/include/linux/shmem_fs.h
>> @@ -34,11 +34,28 @@ struct shmem_sb_info {
>>   struct mempolicy *mpol; /* default memory policy for mappings */
>>  };
>>
>> +struct shmem_dev_info {
>> + void *dev_private_data;
>> + int (*dev_migratepage)(struct address_space *mapping,
>> +struct page *newpage, struct page *page,
>> +enum migrate_mode mode, void *dev_priv_data);
>
> One might want to have a separate shmem_dev_operations struct or
> similar.
>
Sorry for the very late turnaround.

Sorry, I couldn't get your point here. Are you suggesting renaming the
structure to shmem_dev_operations?

>> +};
>> +
>>  static inline struct shmem_inode_info *SHMEM_I(struct inode *inode)
>>  {
>>   return container_of(inode, struct shmem_inode_info, vfs_inode);
>>  }
>>
>> +static inline int shmem_set_device_ops(struct address_space *mapping,
>> + struct shmem_dev_info *info)
>> +{
>> + if (mapping->private_data != NULL)
>> + return -EEXIST;
>> +
>
> I did a quick random peek and most set functions are just void and
> override existing data. I'd suggest the same.
>
>> + mapping->private_data = info;
>
Fine, will change the return type to void and remove the check.

> Also, doesn't this kinda steal the mapping->private_data, might that be
> unexpected for the user? I notice currently it's not being touched at
> all.
>
Sorry, by user do you mean the shmem client that called shmem_file_setup()?
It seems clients are not expected to touch mapping->private_data, so
shmemfs can safely use it.

Best regards
Akash

>> + return 0;
>> +}
>> +
>>  /*
>>   * Functions in mm/shmem.c called directly from elsewhere:
>>   */
>> diff --git a/mm/shmem.c b/mm/shmem.c
>> index 440e2a7..f8625c4 100644
>> --- a/mm/shmem.c
>> +++ b/mm/shmem.c
>> @@ -952,6 +952,21 @@ redirty:
>>   return 0;
>>  }
>>
>> +#ifdef CONFIG_MIGRATION
>> +static int shmem_migratepage(struct address_space *mapping,
>> +  struct page *newpage, struct page *page,
>> +  enum migrate_mode mode)
>> +{
>> + struct shmem_dev_info *dev_info = mapping->private_data;
>&

[Intel-gfx] [PATCH] drm/i915/guc: WA to address the Ringbuffer coherency issue

2016-10-14 Thread akash . goel
From: Akash Goel 

Driver accesses the ringbuffer pages, via GMADR BAR, if the pages are
pinned in mappable aperture portion of GGTT and for ringbuffer pages
allocated from Stolen memory, access can only be done through GMADR BAR.
In case of GuC based submission, updates done in ringbuffer via GMADR
may not get committed to memory by the time the Command streamer starts
reading them, resulting in fetching of stale data.
For Host based submission, such problem is not there as the write to Ring
Tail or ELSP register happens from the Host side prior to submission.
Access to any GFX register from CPU side goes to GTTMMADR BAR and Hw already
enforces the ordering between outstanding GMADR writes & new GTTMMADR access.
MMIO writes from GuC side do not go to GTTMMADR BAR as GuC communication to
registers within GT is contained within GT, so ordering is not enforced
resulting in a race, which can manifest in form of a hang.
To ensure the flush of in flight GMADR writes, a POSTING READ is done to
GuC register prior to doorbell ring.
There is already a similar WA in i915_gem_object_flush_gtt_write_domain(),
which takes care of GMADR writes from User space to GEM buffers, but not the
ringbuffer writes from KMD.
This WA is needed on all recent HW.

Cc: Chris Wilson 
Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index a1f76c8..43c8a72 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -601,6 +601,7 @@ static int guc_ring_doorbell(struct i915_guc_client *gc)
  */
 static void i915_guc_submit(struct drm_i915_gem_request *rq)
 {
+   struct drm_i915_private *dev_priv = rq->i915;
unsigned int engine_id = rq->engine->id;
struct intel_guc *guc = &rq->i915->guc;
struct i915_guc_client *client = guc->execbuf_client;
@@ -608,6 +609,11 @@ static void i915_guc_submit(struct drm_i915_gem_request *rq)
 
spin_lock(&client->wq_lock);
guc_wq_item_append(client, rq);
+
+   /* WA to flush out the pending GMADR writes to ring buffer. */
+   if (i915_vma_is_map_and_fenceable(rq->ring->vma))
+   POSTING_READ(GUC_STATUS);
+
b_ret = guc_ring_doorbell(client);
 
client->submissions[engine_id] += 1;
-- 
1.9.2



[Intel-gfx] [PATCH v11] drm/i915: Allocate intel_engine_cs structure only for the enabled engines

2016-10-13 Thread akash . goel
From: Akash Goel 

With many more rings likely to be added in future, the drm_i915_private
structure could bloat, because an array of type intel_engine_cs is embedded
inside it.
struct intel_engine_cs engine[I915_NUM_ENGINES];
This is still fine, as there is generally only a single instance of the
drm_i915_private structure, but not all of the possible rings would be
enabled or active on most platforms. Some memory can be saved by
allocating the intel_engine_cs structure only for the enabled/active engines.
Currently the engine/ring ID is kept static and dev_priv->engine[] is simply
indexed using the enums defined in intel_engine_id.
To save memory and continue using the static engine/ring IDs, 'engine' is
defined as an array of pointers.
struct intel_engine_cs *engine[I915_NUM_ENGINES];
dev_priv->engine[engine_ID] will be NULL for disabled engine instances.
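
As a rough user-space sketch of the pointer-array scheme described above (not the i915 code itself): the type names, the engine list and the macro body below are simplified assumptions made for illustration. Only the idea comes from this patch, namely static IDs indexing an array of pointers, NULL for disabled engines, and an iterator macro that takes a local 'id' variable and skips the NULL slots.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for intel_engine_id/intel_engine_cs. */
enum engine_id { RCS = 0, BCS, VCS, VCS2, VECS, NUM_ENGINES };

struct engine_cs {
	enum engine_id id;
	const char *name;
};

struct device_private {
	/* One slot per possible engine; NULL means "not enabled". */
	struct engine_cs *engine[NUM_ENGINES];
};

/* Walk all static IDs and run the loop body only for non-NULL slots.
 * The "if (!...) {} else" form keeps the body attached to the else branch. */
#define for_each_engine(engine__, dev__, id__) \
	for ((id__) = 0; (id__) < NUM_ENGINES; (id__)++) \
		if (!((engine__) = (dev__)->engine[(id__)])) {} else

/* Build a bitmask with one bit per enabled engine, indexed by engine ID. */
static unsigned int enabled_engine_mask(struct device_private *dev)
{
	struct engine_cs *engine;
	enum engine_id id;
	unsigned int mask = 0;

	for_each_engine(engine, dev, id)
		mask |= 1u << engine->id;

	return mask;
}
```

Because the IDs stay static, code that indexes the array directly keeps working, as long as it checks for NULL before dereferencing.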

There is a text size reduction of 928 bytes, from 1028200 to 1027272, for the
i915.o file (but for the i915.ko file the text size remains the same, at
1193131 bytes).

v2:
- Remove the engine iterator field added in drm_i915_private structure,
  instead pass a local iterator variable to the for_each_engine**
  macros. (Chris)
- Do away with intel_engine_initialized() and instead directly use the
  NULL pointer check on engine pointer. (Chris)

v3:
- Remove for_each_engine_id() macro, as the updated macro for_each_engine()
  can be used in place of it. (Chris)
- Protect the access to Render engine Fault register with a NULL check, as
  engine specific init is done later in Driver load sequence.

v4:
- Use !!dev_priv->engine[VCS] style for the engine check in getparam. (Chris)
- Kill the superfluous init_engine_lists().

v5:
- Cleanup the intel_engines_init() & intel_engines_setup(), with respect to
  allocation of intel_engine_cs structure. (Chris)

v6:
- Rebase.

v7:
- Optimize the for_each_engine_masked() macro. (Chris)
- Change the type of 'iter' local variable to enum intel_engine_id. (Chris)
- Rebase.

v8: Rebase.

v9: Rebase.

v10:
- For index calculation use engine ID instead of pointer based arithmetic in
  intel_engine_sync_index() as engine pointers are not contiguous now (Chris)
- For appropriateness, rename local enum variable 'iter' to 'id'. (Joonas)
- Use for_each_engine macro for cleanup in intel_engines_init() and remove
  check for NULL engine pointer in cleanup() routines. (Joonas)

v11: Rebase.

Cc: Chris Wilson 
Signed-off-by: Akash Goel 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_cmd_parser.c |  3 +-
 drivers/gpu/drm/i915/i915_debugfs.c| 59 ++
 drivers/gpu/drm/i915/i915_drv.c| 11 +++---
 drivers/gpu/drm/i915/i915_drv.h| 22 ---
 drivers/gpu/drm/i915/i915_gem.c| 29 +++
 drivers/gpu/drm/i915/i915_gem_context.c| 15 +---
 drivers/gpu/drm/i915/i915_gem_evict.c  |  3 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  6 +--
 drivers/gpu/drm/i915/i915_gem_gtt.c| 14 +--
 drivers/gpu/drm/i915/i915_gem_request.c|  5 ++-
 drivers/gpu/drm/i915/i915_gpu_error.c  | 28 +++---
 drivers/gpu/drm/i915/i915_guc_submission.c | 12 +++---
 drivers/gpu/drm/i915/i915_irq.c| 46 ---
 drivers/gpu/drm/i915/intel_breadcrumbs.c   |  6 ++-
 drivers/gpu/drm/i915/intel_display.c   |  8 ++--
 drivers/gpu/drm/i915/intel_engine_cs.c | 26 +
 drivers/gpu/drm/i915/intel_guc_loader.c|  6 ++-
 drivers/gpu/drm/i915/intel_lrc.c   | 11 +++---
 drivers/gpu/drm/i915/intel_overlay.c   |  2 +-
 drivers/gpu/drm/i915/intel_pm.c| 17 ++---
 drivers/gpu/drm/i915/intel_ringbuffer.c| 15 
 drivers/gpu/drm/i915/intel_ringbuffer.h|  8 +---
 22 files changed, 194 insertions(+), 158 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
index 70980f8..f191d7b 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -1308,10 +1308,11 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine,
 int i915_cmd_parser_get_version(struct drm_i915_private *dev_priv)
 {
struct intel_engine_cs *engine;
+   enum intel_engine_id id;
bool active = false;
 
/* If the command parser is not enabled, report 0 - unsupported */
-   for_each_engine(engine, dev_priv) {
+   for_each_engine(engine, dev_priv, id) {
if (intel_engine_needs_cmd_parser(engine)) {
active = true;
break;
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 2e312e0..b20c1cc 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -150,7 +150,7 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
   obj->bas

[Intel-gfx] [PATCH 16/18] drm/i915: Use SSE4.1 movntdqa based memcpy for sampling GuC log buffer

2016-10-12 Thread akash . goel
From: Akash Goel 

To ensure that we always get up-to-date data from the log buffer, it's
better to access the buffer through an uncached CPU mapping. Also, given the
way the buffer is accessed from the GuC & Host side, manually flushing the
cache may not always be effective if a cached CPU mapping is used. In order
to avoid any performance drop & have fast reads from the GuC log buffer, use
the SSE4.1 movntdqa based memcpy function i915_memcpy_from_wc, as copying
using movntdqa from WC type memory is almost as fast as reading from WB
memory. This way the log buffer sampling time will not increase, and the
Driver will be able to deal with the flush interrupt storm when GuC is
generating logs at a very high rate.
Ideally SSE 4.1 should be present on all chipsets supporting GuC based
submissions, but if not then logging will not be enabled.

v2: Rebase.

v3: Squash the WC type vmalloc mapping patch with this patch. (Chris)

Suggested-by: Chris Wilson 
Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 25 ++---
 1 file changed, 18 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 92bc14a..24d356d 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -1148,18 +1148,16 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
 
/* Just copy the newly written data */
if (read_offset > write_offset) {
-   memcpy(dst_data, src_data, write_offset);
+   i915_memcpy_from_wc(dst_data, src_data, write_offset);
bytes_to_copy = buffer_size - read_offset;
} else {
bytes_to_copy = write_offset - read_offset;
}
-   memcpy(dst_data + read_offset,
-  src_data + read_offset, bytes_to_copy);
+   i915_memcpy_from_wc(dst_data + read_offset,
+   src_data + read_offset, bytes_to_copy);
 
src_data += buffer_size;
dst_data += buffer_size;
-
-   /* FIXME: invalidate/flush for log buffer needed */
}
 
if (log_buf_snapshot_state)
@@ -1219,8 +1217,11 @@ static int guc_log_create_extras(struct intel_guc *guc)
return 0;
 
if (!guc->log.buf_addr) {
-   /* Create a vmalloc mapping of log buffer pages */
-   vaddr = i915_gem_object_pin_map(guc->log.vma->obj, I915_MAP_WB);
+   /* Create a WC (Uncached for read) vmalloc mapping of log
+* buffer pages, so that we can directly get the data
+* (up-to-date) from memory.
+*/
+   vaddr = i915_gem_object_pin_map(guc->log.vma->obj, I915_MAP_WC);
if (IS_ERR(vaddr)) {
ret = PTR_ERR(vaddr);
DRM_ERROR("Couldn't map log buffer pages %d\n", ret);
@@ -1263,6 +1264,16 @@ static void guc_log_create(struct intel_guc *guc)
 
vma = guc->log.vma;
if (!vma) {
+   /* We require SSE 4.1 for fast reads from the GuC log buffer and
+* it should be present on the chipsets supporting GuC based
+* submissions.
+*/
+   if (WARN_ON(!i915_memcpy_from_wc(NULL, NULL, 0))) {
+   /* logging will not be enabled */
+   i915.guc_log_level = -1;
+   return;
+   }
+
vma = guc_allocate_vma(guc, size);
if (IS_ERR(vma)) {
/* logging will be off */
-- 
1.9.2



[Intel-gfx] [PATCH 15/18] drm/i915: Debugfs support for GuC logging control

2016-10-12 Thread akash . goel
From: Sagar Arun Kamble 

This patch provides a debugfs interface, i915_guc_log_control, for
on-the-fly enabling/disabling of logging in GuC firmware and for controlling
the verbosity level of logs.
The value written to the file should have bit 0 set to enable logging, and
bits 4-7 should contain the verbosity info.
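
The bit layout described above can be sketched in plain C as follows. This is only an illustration: the patch itself decodes the value through union guc_log_control (fields logging_enabled and verbosity), whose definition is not shown in this excerpt. The helper and macro names below are hypothetical, and explicit shifts and masks are used instead of bitfields because bitfield layout is implementation-defined.

```c
#include <stdint.h>

/* Layout taken from the commit message: bit 0 enables logging,
 * bits 4-7 carry the verbosity level. All names are hypothetical. */
#define LOG_ENABLE_BIT      0x1u
#define LOG_VERBOSITY_SHIFT 4
#define LOG_VERBOSITY_MASK  0xfu

/* Returns 1 if bit 0 (logging enabled) is set in the control value. */
static int log_control_enabled(uint64_t val)
{
	return (val & LOG_ENABLE_BIT) != 0;
}

/* Extracts the verbosity level from bits 4-7 of the control value. */
static unsigned int log_control_verbosity(uint64_t val)
{
	return (unsigned int)((val >> LOG_VERBOSITY_SHIFT) & LOG_VERBOSITY_MASK);
}

/* Builds a control value from an enable flag and a verbosity level;
 * verbosity bits above the 4-bit field are discarded. */
static uint64_t log_control_pack(int enabled, unsigned int verbosity)
{
	return (enabled ? LOG_ENABLE_BIT : 0u) |
	       ((uint64_t)(verbosity & LOG_VERBOSITY_MASK) << LOG_VERBOSITY_SHIFT);
}
```

With this layout, a value of 0x21 would mean "logging enabled, verbosity 2", and 0x30 would carry verbosity 3 with logging disabled.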

v2: Add a forceful flush, to collect left over logs, on disabling logging.
Useful for Validation.

v3: Besides minor cleanup, implement read method for the debugfs file and
set the guc_log_level to -1 when logging is disabled. (Tvrtko)

v4: Minor cleanup & rebase. (Tvrtko)

v5:
- Lock struct_mutex after the NULL check for guc log buffer vma. (Chris)
- Rebase.

Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_debugfs.c| 41 -
 drivers/gpu/drm/i915/i915_guc_submission.c | 59 ++
 drivers/gpu/drm/i915/intel_guc.h   |  1 +
 3 files changed, 100 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index e04329b..3690911 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2540,6 +2540,44 @@ static int i915_guc_log_dump(struct seq_file *m, void *data)
return 0;
 }
 
+static int i915_guc_log_control_get(void *data, u64 *val)
+{
+   struct drm_device *dev = data;
+   struct drm_i915_private *dev_priv = to_i915(dev);
+
+   if (!dev_priv->guc.log.vma)
+   return -EINVAL;
+
+   *val = i915.guc_log_level;
+
+   return 0;
+}
+
+static int i915_guc_log_control_set(void *data, u64 val)
+{
+   struct drm_device *dev = data;
+   struct drm_i915_private *dev_priv = to_i915(dev);
+   int ret;
+
+   if (!dev_priv->guc.log.vma)
+   return -EINVAL;
+
+   ret = mutex_lock_interruptible(&dev->struct_mutex);
+   if (ret)
+   return ret;
+
+   intel_runtime_pm_get(dev_priv);
+   ret = i915_guc_log_control(dev_priv, val);
+   intel_runtime_pm_put(dev_priv);
+
+   mutex_unlock(&dev->struct_mutex);
+   return ret;
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(i915_guc_log_control_fops,
+   i915_guc_log_control_get, i915_guc_log_control_set,
+   "%lld\n");
+
 static int i915_edp_psr_status(struct seq_file *m, void *data)
 {
struct drm_i915_private *dev_priv = node_to_i915(m->private);
@@ -5441,7 +5479,8 @@ static const struct i915_debugfs_files {
{"i915_fbc_false_color", &i915_fbc_fc_fops},
{"i915_dp_test_data", &i915_displayport_test_data_fops},
{"i915_dp_test_type", &i915_displayport_test_type_fops},
-   {"i915_dp_test_active", &i915_displayport_test_active_fops}
+   {"i915_dp_test_active", &i915_displayport_test_active_fops},
+   {"i915_guc_log_control", &i915_guc_log_control_fops}
 };
 
 void intel_display_crc_init(struct drm_i915_private *dev_priv)
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 1564b65..92bc14a 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -193,6 +193,16 @@ static int host2guc_force_logbuffer_flush(struct intel_guc *guc)
return host2guc_action(guc, data, 2);
 }
 
+static int host2guc_logging_control(struct intel_guc *guc, u32 control_val)
+{
+   u32 data[2];
+
+   data[0] = HOST2GUC_ACTION_UK_LOG_ENABLE_LOGGING;
+   data[1] = control_val;
+
+   return host2guc_action(guc, data, 2);
+}
+
 /*
  * Initialise, update, or clear doorbell data shared with the GuC
  *
@@ -1603,3 +1613,52 @@ void i915_guc_register(struct drm_i915_private *dev_priv)
guc_log_late_setup(&dev_priv->guc);
mutex_unlock(&dev_priv->drm.struct_mutex);
 }
+
+int i915_guc_log_control(struct drm_i915_private *dev_priv, u64 control_val)
+{
+   union guc_log_control log_param;
+   int ret;
+
+   log_param.value = control_val;
+
+   if (log_param.verbosity < GUC_LOG_VERBOSITY_MIN ||
+   log_param.verbosity > GUC_LOG_VERBOSITY_MAX)
+   return -EINVAL;
+
+   /* This combination doesn't make sense & won't have any effect */
+   if (!log_param.logging_enabled && (i915.guc_log_level < 0))
+   return 0;
+
+   ret = host2guc_logging_control(&dev_priv->guc, log_param.value);
+   if (ret < 0) {
+   DRM_DEBUG_DRIVER("host2guc action failed %d\n", ret);
+   return ret;
+   }
+
+   i915.guc_log_level = log_param.verbosity;
+
+   /* If log_level was set as -1 at boot time, then the relay channel file
+* wouldn't have been created by now and inte

[Intel-gfx] [PATCH 17/18] drm/i915: Early creation of relay channel for capturing boot time logs

2016-10-12 Thread akash . goel
From: Akash Goel 

As per the current i915 Driver load sequence, debugfs registration is done
at the end and so the relay channel debugfs file is also created after that
but the GuC firmware is loaded much earlier in the sequence.
As a result Driver could miss capturing the boot-time logs of GuC firmware
if there are flush interrupts from the GuC side.
Relay has a provision to support early logging, where initially only the
relay channel is created, to have buffers for storing logs, and later on the
channel can be associated with a debugfs file at an appropriate time.
This patch makes use of that, which allows the Driver to capture boot time
logs as well; these can be collected once Userspace comes up.

v2:
- Remove the couple of FIXMEs, as now the relay channel will be created
  early before enabling the flush interrupts, so no possibility of relay
  channel pointer being modified & read at the same time from 2 different
  execution contexts.
- Rebase.

v3:
- Add a comment to justify setting 'is_global' before the NULL check on the
  parent directory dentry pointer.

Suggested-by: Chris Wilson 
Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 71 --
 1 file changed, 47 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 24d356d..43de16f 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -915,15 +915,16 @@ static struct dentry *create_buf_file_callback(const char *filename,
 {
struct dentry *buf_file;
 
-   if (!parent)
-   return NULL;
-
/* This to enable the use of a single buffer for the relay channel and
 * correspondingly have a single file exposed to User, through which
 * it can collect the logs in order without any post-processing.
+* Need to set 'is_global' even if parent is NULL for early logging.
 */
*is_global = 1;
 
+   if (!parent)
+   return NULL;
+
/* Not using the channel filename passed as an argument, since for each
 * channel relay appends the corresponding CPU number to the filename
 * passed in relay_open(). This should be fine as relay just needs a
@@ -956,13 +957,40 @@ static void guc_log_remove_relay_file(struct intel_guc *guc)
relay_close(guc->log.relay_chan);
 }
 
-static int guc_log_create_relay_file(struct intel_guc *guc)
+static int guc_log_create_relay_channel(struct intel_guc *guc)
 {
struct drm_i915_private *dev_priv = guc_to_i915(guc);
struct rchan *guc_log_relay_chan;
-   struct dentry *log_dir;
size_t n_subbufs, subbuf_size;
 
+   /* Keep the size of sub buffers same as shared log buffer */
+   subbuf_size = guc->log.vma->obj->base.size;
+
+   /* Store up to 8 snapshots, which is large enough to buffer sufficient
+* boot time logs and provides enough leeway to User, in terms of
+* latency, for consuming the logs from relay. Also doesn't take
+* up too much memory.
+*/
+   n_subbufs = 8;
+
+   guc_log_relay_chan = relay_open(NULL, NULL, subbuf_size,
+   n_subbufs, &relay_callbacks, dev_priv);
+   if (!guc_log_relay_chan) {
+   DRM_ERROR("Couldn't create relay chan for GuC logging\n");
+   return -ENOMEM;
+   }
+
+   GEM_BUG_ON(guc_log_relay_chan->subbuf_size < subbuf_size);
+   guc->log.relay_chan = guc_log_relay_chan;
+   return 0;
+}
+
+static int guc_log_create_relay_file(struct intel_guc *guc)
+{
+   struct drm_i915_private *dev_priv = guc_to_i915(guc);
+   struct dentry *log_dir;
+   int ret;
+
/* For now create the log file in /sys/kernel/debug/dri/0 dir */
log_dir = dev_priv->drm.primary->debugfs_root;
 
@@ -982,26 +1010,12 @@ static int guc_log_create_relay_file(struct intel_guc *guc)
return -ENODEV;
}
 
-   /* Keep the size of sub buffers same as shared log buffer */
-   subbuf_size = guc->log.vma->obj->base.size;
-
-   /* Store up to 8 snapshots, which is large enough to buffer sufficient
-* boot time logs and provides enough leeway to User, in terms of
-* latency, for consuming the logs from relay. Also doesn't take
-* up too much memory.
-*/
-   n_subbufs = 8;
-
-   guc_log_relay_chan = relay_open("guc_log", log_dir, subbuf_size,
-   n_subbufs, &relay_callbacks, dev_priv);
-   if (!guc_log_relay_chan) {
-   DRM_ERROR("Couldn't create relay chan for GuC logging\n");
-   return -ENOMEM;
+   ret = relay_late_setup_files(guc->log.relay_chan, "guc_log", log_dir);
+   if (ret) {
+  

[Intel-gfx] [PATCH 18/18] drm/i915: Mark the GuC log buffer flush interrupts handling WQ as freezable

2016-10-12 Thread akash . goel
From: Akash Goel 

The GuC log buffer flush work item has to do a register access to send the
ack to GuC, and this work item, if not synced before suspend, can potentially
get executed after the GFX device is suspended. The work item function uses
rpm get/put calls around the Hw access, which covers the rpm suspend case,
but for system suspend a sync would be required, as the kernel can
potentially schedule work items even after some devices, including GFX, have
been suspended. But the sync has to be done only for the system suspend case,
as a sync along with rpm get/put can cause a deadlock on the rpm suspend path.
To have the sync, but as a NOOP, on the rpm suspend path as well, this work
item could have been queued from the irq handler only when the device is
runtime active, with the device kept active while the work item is pending
or executing. But an interrupt can come even after the device is out of use,
so that approach could lead to this work item being missed.

By marking the workqueue, dedicated for handling GuC log buffer flush
interrupts, as freezable we don't have to bother about flushing of this
work item from the suspend hooks, the pending work item if any will be
either executed before the suspend or scheduled later on resume. This way
the handling of log buffer flush work item can be kept same between system
suspend & rpm suspend.

Suggested-by: Imre Deak 
Cc: Imre Deak 
Signed-off-by: Akash Goel 
Reviewed-by: Imre Deak 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 15 ---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 43de16f..d476db7 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -1257,10 +1257,19 @@ static int guc_log_create_extras(struct intel_guc *guc)
if (!guc->log.flush_wq) {
INIT_WORK(&guc->log.flush_work, guc_capture_logs_work);
 
-   /* Need a dedicated wq to process log buffer flush interrupts
-* from GuC without much delay so as to avoid any loss of logs.
+/*
+* GuC log buffer flush work item has to do register access to
+* send the ack to GuC and this work item, if not synced before
+* suspend, can potentially get executed after the GFX device is
+* suspended.
+* By marking the WQ as freezable, we don't have to bother about
+* flushing of this work item from the suspend hooks, the pending
+* work item if any will be either executed before the suspend
+* or scheduled later on resume. This way the handling of work
+* item can be kept same between system suspend & rpm suspend.
 */
-   guc->log.flush_wq = alloc_ordered_workqueue("i915-guc_log", WQ_HIGHPRI);
+   guc->log.flush_wq = alloc_ordered_workqueue("i915-guc_log",
+   WQ_HIGHPRI | WQ_FREEZABLE);
if (guc->log.flush_wq == NULL) {
DRM_ERROR("Couldn't allocate the wq for GuC logging\n");
return -ENOMEM;
-- 
1.9.2



[Intel-gfx] [PATCH 11/18] drm/i915: Optimization to reduce the sampling time of GuC log buffer

2016-10-12 Thread akash . goel
From: Akash Goel 

GuC firmware sends an interrupt to flush the log buffer when it becomes
half full, so Driver doesn't really need to sample the complete buffer
and can just copy only the newly written data by GuC into the local
buffer, i.e. as per the read & write pointer values.
Moreover the flush interrupt would generally come for one type of log
buffer, when it becomes half full, so at that time the other 2 types of
log buffer would have comparatively much less unread data in them.
In case of an overflow reported by GuC, the Driver does need to copy the
entire buffer, as the whole buffer would contain unread data.
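
The copy strategy described above can be sketched in user-space C roughly as follows. This is a simplified illustration and not the driver function: copy_new_log_data and its parameters are hypothetical names, and the return value (bytes copied) is added here only to make the behaviour easy to check. As in the patch, data is copied to the same offsets in the destination, and the whole buffer is copied on overflow or when the offsets look invalid.

```c
#include <stddef.h>
#include <string.h>

/* Copy only the unread region of a circular log buffer from src to the
 * same offsets in dst. Returns the number of bytes copied. */
static size_t copy_new_log_data(unsigned char *dst, const unsigned char *src,
				size_t buffer_size, size_t read_offset,
				size_t write_offset, int overflow)
{
	size_t copied = 0;
	size_t bytes_to_copy;

	/* On overflow, or if the offsets are out of range and therefore
	 * unreliable, fall back to copying the whole buffer. */
	if (overflow || read_offset > buffer_size || write_offset > buffer_size) {
		read_offset = 0;
		write_offset = buffer_size;
	}

	if (read_offset > write_offset) {
		/* The writer wrapped around: copy the head [0, write_offset)
		 * here, and the tail [read_offset, buffer_size) below. */
		memcpy(dst, src, write_offset);
		copied += write_offset;
		bytes_to_copy = buffer_size - read_offset;
	} else {
		/* No wrap: the unread data is [read_offset, write_offset). */
		bytes_to_copy = write_offset - read_offset;
	}
	memcpy(dst + read_offset, src + read_offset, bytes_to_copy);

	return copied + bytes_to_copy;
}
```

For an 8-byte buffer with the read offset at 6 and the write offset at 2, only bytes 6-7 and 0-1 are copied, i.e. 4 bytes instead of 8.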

v2: Rebase.

v3: Fix the blooper of doing the copy twice. (Tvrtko)

v4: Add curlies for 'else' case also, matching the 'if'. (Tvrtko)

Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 28 +---
 1 file changed, 25 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index e6239d3..c584362 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -1057,11 +1057,12 @@ static unsigned int guc_get_log_buffer_size(enum guc_log_buffer_type type)
 
 static void guc_read_update_log_buffer(struct intel_guc *guc)
 {
+   unsigned int buffer_size, read_offset, write_offset, bytes_to_copy, full_cnt;
struct guc_log_buffer_state *log_buf_state, *log_buf_snapshot_state;
struct guc_log_buffer_state log_buf_state_local;
-   unsigned int buffer_size, write_offset, full_cnt;
enum guc_log_buffer_type type;
void *src_data, *dst_data;
+   bool new_overflow;
 
if (WARN_ON(!guc->log.buf_addr))
return;
@@ -1084,12 +1085,13 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
memcpy(&log_buf_state_local, log_buf_state,
   sizeof(struct guc_log_buffer_state));
buffer_size = guc_get_log_buffer_size(type);
+   read_offset = log_buf_state_local.read_ptr;
write_offset = log_buf_state_local.sampled_write_ptr;
full_cnt = log_buf_state_local.buffer_full_cnt;
 
/* Bookkeeping stuff */
guc->log.flush_count[type] += log_buf_state_local.flush_to_file;
-   guc_check_log_buf_overflow(guc, type, full_cnt);
+   new_overflow = guc_check_log_buf_overflow(guc, type, full_cnt);
 
/* Update the state of shared log buffer */
log_buf_state->read_ptr = write_offset;
@@ -1112,7 +1114,27 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
log_buf_snapshot_state++;
 
/* Now copy the actual logs. */
-   memcpy(dst_data, src_data, buffer_size);
+   if (unlikely(new_overflow)) {
+   /* copy the whole buffer in case of overflow */
+   read_offset = 0;
+   write_offset = buffer_size;
+   } else if (unlikely((read_offset > buffer_size) ||
+   (write_offset > buffer_size))) {
+   DRM_ERROR("invalid log buffer state\n");
+   /* copy whole buffer as offsets are unreliable */
+   read_offset = 0;
+   write_offset = buffer_size;
+   }
+
+   /* Just copy the newly written data */
+   if (read_offset > write_offset) {
+   memcpy(dst_data, src_data, write_offset);
+   bytes_to_copy = buffer_size - read_offset;
+   } else {
+   bytes_to_copy = write_offset - read_offset;
+   }
+   memcpy(dst_data + read_offset,
+  src_data + read_offset, bytes_to_copy);
 
src_data += buffer_size;
dst_data += buffer_size;
-- 
1.9.2
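
The copy logic in the hunk above treats the log buffer as a circular
buffer and copies only the region written since the last read. A minimal
standalone sketch of that idea follows; copy_new_log_data() and its
parameters are hypothetical names chosen for illustration and are not
part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not from the patch: copy only the bytes written
 * since the last read, keeping them at the same offsets in dst as in src.
 * Returns the number of bytes copied.
 */
static size_t copy_new_log_data(unsigned char *dst, const unsigned char *src,
				size_t size, size_t rd, size_t wr,
				int overflow)
{
	assert(dst && src);

	/* On overflow or implausible offsets, fall back to a full copy. */
	if (overflow || rd > size || wr > size) {
		rd = 0;
		wr = size;
	}

	if (rd > wr) {
		/* Wrapped: new data is [rd, size) followed by [0, wr). */
		memcpy(dst, src, wr);
		memcpy(dst + rd, src + rd, size - rd);
		return wr + (size - rd);
	}

	/* Not wrapped: new data is the single range [rd, wr). */
	memcpy(dst + rd, src + rd, wr - rd);
	return wr - rd;
}
```

Keeping the data at identical offsets in the destination means a
userspace parser can presumably still use the read/write pointers from
the copied state header to locate the valid region.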

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 14/18] drm/i915: Support for forceful flush of GuC log buffer

2016-10-12 Thread akash . goel
From: Sagar Arun Kamble 

GuC firmware sends a flush interrupt to Host when the log buffer is half
full, and only at that time does it update the log buffer state.
But in certain cases, as described below, it could be useful to have the
log buffer flushed and its state updated even when the buffer is only
partially full. For that, GuC firmware supports a Host2GuC action to force
a log buffer flush.

For validation requirements, a forceful flush is needed to collect the
leftover logs on disabling logging. The same can be done before proceeding
with a GPU/GuC reset, as there could be some data in the log buffer which
is yet to be captured, and those logs would be particularly useful to
understand why the reset was initiated.

Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 30 ++
 drivers/gpu/drm/i915/intel_guc.h   |  1 +
 2 files changed, 31 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index c584362..1564b65 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -183,6 +183,16 @@ static int host2guc_logbuffer_flush_complete(struct intel_guc *guc)
return host2guc_action(guc, data, 1);
 }
 
+static int host2guc_force_logbuffer_flush(struct intel_guc *guc)
+{
+   u32 data[2];
+
+   data[0] = HOST2GUC_ACTION_FORCE_LOG_BUFFER_FLUSH;
+   data[1] = 0;
+
+   return host2guc_action(guc, data, 2);
+}
+
 /*
  * Initialise, update, or clear doorbell data shared with the GuC
  *
@@ -1554,6 +1564,26 @@ void i915_guc_capture_logs(struct drm_i915_private *dev_priv)
intel_runtime_pm_put(dev_priv);
 }
 
+void i915_guc_flush_logs(struct drm_i915_private *dev_priv)
+{
+   if (!i915.enable_guc_submission || (i915.guc_log_level < 0))
+   return;
+
+   /* First disable the interrupts, will be re-enabled afterwards */
+   gen9_disable_guc_interrupts(dev_priv);
+
+   /* Before initiating the forceful flush, wait for any pending/ongoing
+* flush to complete otherwise forceful flush may not actually happen.
+*/
+   flush_work(&dev_priv->guc.log.flush_work);
+
+   /* Ask GuC to update the log buffer state */
+   host2guc_force_logbuffer_flush(&dev_priv->guc);
+
+   /* GuC would have updated log buffer by now, so capture it */
+   i915_guc_capture_logs(dev_priv);
+}
+
 void i915_guc_unregister(struct drm_i915_private *dev_priv)
 {
if (!i915.enable_guc_submission)
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index d034c98..71da75e 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -185,6 +185,7 @@ void i915_guc_wq_unreserve(struct drm_i915_gem_request *request);
 void i915_guc_submission_disable(struct drm_i915_private *dev_priv);
 void i915_guc_submission_fini(struct drm_i915_private *dev_priv);
 void i915_guc_capture_logs(struct drm_i915_private *dev_priv);
+void i915_guc_flush_logs(struct drm_i915_private *dev_priv);
 void i915_guc_register(struct drm_i915_private *dev_priv);
 void i915_guc_unregister(struct drm_i915_private *dev_priv);
 
-- 
1.9.2



[Intel-gfx] [PATCH 12/18] drm/i915: Increase GuC log buffer size to reduce flush interrupts

2016-10-12 Thread akash . goel
From: Akash Goel 

In cases where GuC generates logs at a very high rate, the rate of flush
interrupts is correspondingly very high.
So far a total of 8 pages were allocated for storing both ISR & DPC logs.
As per the half-full draining protocol followed by GuC, doubling the
number of pages cuts the frequency of flush interrupts almost in half,
which in turn reduces the logging overhead.
So now 8 pages apiece are allocated for ISR & DPC logs.
This also reduces the output log file size, apart from reducing the
flush interrupt count. With the original settings, 44 KB was needed for
one snapshot. With the modified settings, 76 KB is needed for a snapshot,
which is equivalent to 2 snapshots of the original setting. So there is a
12 KB saving for every 88 KB, compared to the original setting.

Suggested-by: Tvrtko Ursulin 
Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/intel_guc_fwif.h | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h b/drivers/gpu/drm/i915/intel_guc_fwif.h
index adb1ffd..324ea90 100644
--- a/drivers/gpu/drm/i915/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
@@ -104,9 +104,9 @@
 #define   GUC_LOG_ALLOC_IN_MEGABYTE(1 << 3)
 #define   GUC_LOG_CRASH_PAGES  1
 #define   GUC_LOG_CRASH_SHIFT  4
-#define   GUC_LOG_DPC_PAGES3
+#define   GUC_LOG_DPC_PAGES7
 #define   GUC_LOG_DPC_SHIFT6
-#define   GUC_LOG_ISR_PAGES3
+#define   GUC_LOG_ISR_PAGES7
 #define   GUC_LOG_ISR_SHIFT9
 #define   GUC_LOG_BUF_ADDR_SHIFT   12
 
@@ -437,9 +437,9 @@ enum guc_log_buffer_type {
  *|   Crash dump state header |
  * Page1  +---+
  *|   ISR logs|
- * Page5  +---+
- *|   DPC logs|
  * Page9  +---+
+ *|   DPC logs|
+ * Page17 +---+
  *| Crash Dump logs   |
  *+---+
  *
-- 
1.9.2
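
The snapshot sizes quoted in the commit message above (44 KB, 76 KB and
the 12 KB saving) can be checked with a short calculation. The sketch
below assumes 4 KB pages and one page for the state header, as suggested
by the layout comment in the diff; snapshot_kb() is a hypothetical helper
and takes actual page counts, i.e. the GUC_LOG_*_PAGES values plus one.

```c
#include <assert.h>

#define PAGE_KB 4u	/* assumption: 4 KB pages */

/*
 * Hypothetical helper: size of one log buffer snapshot in KB, made up of
 * one state header page plus the ISR, DPC and crash dump log pages.
 */
static unsigned int snapshot_kb(unsigned int isr_pages,
				unsigned int dpc_pages,
				unsigned int crash_pages)
{
	assert(isr_pages && dpc_pages && crash_pages);

	return (1 + isr_pages + dpc_pages + crash_pages) * PAGE_KB;
}
```

With 4 ISR, 4 DPC and 2 crash dump pages this gives 11 pages, or 44 KB;
with 8 ISR and 8 DPC pages it gives 19 pages, or 76 KB, which is 12 KB
less than two of the old snapshots.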



[Intel-gfx] [PATCH 10/18] drm/i915: Add stats for GuC log buffer flush interrupts

2016-10-12 Thread akash . goel
From: Akash Goel 

GuC firmware sends an interrupt to flush the log buffer when it
becomes half full. GuC firmware also tracks how many times the
buffer overflowed.
It would be useful to maintain statistics on how many flush
interrupts were received and for which type of log buffer,
along with the overflow count of each buffer type.
Augmented i915_log_info debugfs to report back these statistics.

v2:
- Update the logic to detect multiple overflows between the 2
  flush interrupts and also log a message for overflow (Tvrtko)
- Track the number of times there was no free sub buffer to capture
  the GuC log buffer. (Tvrtko)

v3:
- Fix the printf field width for overflow counter, set it to 10 as per the
  max value of u32, which takes 10 digits in decimal form. (Tvrtko)

v4:
- Move the log buffer overflow handling to a new function for better
  readability. (Tvrtko)
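
The overflow accounting described in the changelog above has to cope with
a counter that is only 4 bits wide on the firmware side. A minimal
standalone sketch of that bookkeeping follows; struct overflow_stats and
account_overflow() are hypothetical names, and the modulo-16 difference is
just one way to express the wrap handling.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-buffer bookkeeping; names are illustrative only. */
struct overflow_stats {
	unsigned int prev_cnt;	/* last sampled 4-bit counter value */
	unsigned int total;	/* running total of overflows seen */
};

/*
 * Fold a freshly sampled 4-bit counter into the running total.
 * Returns true if at least one new overflow was observed.
 */
static bool account_overflow(struct overflow_stats *s, unsigned int cnt)
{
	assert(s && cnt < 16);

	if (cnt == s->prev_cnt)
		return false;

	/* Modulo-16 difference handles a single wrap of the counter. */
	s->total += (cnt - s->prev_cnt) & 0xF;
	s->prev_cnt = cnt;
	return true;
}
```

Note that with this scheme 16 or more overflows between two samples
cannot be told apart from fewer, since the counter wraps back onto values
already seen.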

Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_debugfs.c| 28 +++
 drivers/gpu/drm/i915/i915_guc_submission.c | 31 +-
 drivers/gpu/drm/i915/i915_irq.c|  2 ++
 drivers/gpu/drm/i915/intel_guc.h   |  7 +++
 4 files changed, 67 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 0fb9938..e04329b 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2408,6 +2408,32 @@ static int i915_guc_load_status_info(struct seq_file *m, void *data)
return 0;
 }
 
+static void i915_guc_log_info(struct seq_file *m,
+ struct drm_i915_private *dev_priv)
+{
+   struct intel_guc *guc = &dev_priv->guc;
+
+   seq_puts(m, "\nGuC logging stats:\n");
+
+   seq_printf(m, "\tISR:   flush count %10u, overflow count %10u\n",
+  guc->log.flush_count[GUC_ISR_LOG_BUFFER],
+  guc->log.total_overflow_count[GUC_ISR_LOG_BUFFER]);
+
+   seq_printf(m, "\tDPC:   flush count %10u, overflow count %10u\n",
+  guc->log.flush_count[GUC_DPC_LOG_BUFFER],
+  guc->log.total_overflow_count[GUC_DPC_LOG_BUFFER]);
+
+   seq_printf(m, "\tCRASH: flush count %10u, overflow count %10u\n",
+  guc->log.flush_count[GUC_CRASH_DUMP_LOG_BUFFER],
+  guc->log.total_overflow_count[GUC_CRASH_DUMP_LOG_BUFFER]);
+
+   seq_printf(m, "\tTotal flush interrupt count: %u\n",
+  guc->log.flush_interrupt_count);
+
+   seq_printf(m, "\tCapture miss count: %u\n",
+  guc->log.capture_miss_count);
+}
+
 static void i915_guc_client_info(struct seq_file *m,
 struct drm_i915_private *dev_priv,
 struct i915_guc_client *client)
@@ -2481,6 +2507,8 @@ static int i915_guc_info(struct seq_file *m, void *data)
seq_printf(m, "\nGuC execbuf client @ %p:\n", guc.execbuf_client);
i915_guc_client_info(m, dev_priv, &client);
 
+   i915_guc_log_info(m, dev_priv);
+
/* Add more as required ... */
 
return 0;
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 04b2e87..e6239d3 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -1016,6 +1016,29 @@ static void *guc_get_write_buffer(struct intel_guc *guc)
return relay_reserve(guc->log.relay_chan, 0);
 }
 
+static bool
+guc_check_log_buf_overflow(struct intel_guc *guc,
+  enum guc_log_buffer_type type, unsigned int full_cnt)
+{
+   unsigned int prev_full_cnt = guc->log.prev_overflow_count[type];
+   bool overflow = false;
+
+   if (full_cnt != prev_full_cnt) {
+   overflow = true;
+
+   guc->log.prev_overflow_count[type] = full_cnt;
+   guc->log.total_overflow_count[type] += full_cnt - prev_full_cnt;
+
+   if (full_cnt < prev_full_cnt) {
+   /* buffer_full_cnt is a 4 bit counter */
+   guc->log.total_overflow_count[type] += 16;
+   }
+   DRM_ERROR_RATELIMITED("GuC log buffer overflow\n");
+   }
+
+   return overflow;
+}
+
 static unsigned int guc_get_log_buffer_size(enum guc_log_buffer_type type)
 {
switch (type) {
@@ -1036,7 +1059,7 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
 {
struct guc_log_buffer_state *log_buf_state, *log_buf_snapshot_state;
struct guc_log_buffer_state log_buf_state_local;
-   unsigned int buffer_size, write_offset;
+   unsigned int buffer_size, write_offset, full_cnt;
enum guc_log_buffer_type type;
void *src_data, *dst_data;
 
@@ -1062,6 +1085,11 @@ static void guc_re

[Intel-gfx] [PATCH 04/18] drm/i915: Add low level set of routines for programming PM IER/IIR/IMR register set

2016-10-12 Thread akash . goel
From: Akash Goel 

So far PM IER/IIR/IMR registers were being used only for Turbo related
interrupts. But interrupts coming from GuC also use the same set.
As a precursor to supporting GuC interrupts, added new low level routines
so as to allow sharing the programming of PM IER/IIR/IMR registers between
Turbo & GuC.
Also, similar to PM IMR, a bitmask is now maintained for the PM IER
register, to allow easy sharing of it between Turbo & GuC without
involving a read-modify-write (rmw) operation.

v2:
- For appropriateness & to avoid any ambiguity, rename old functions
  enable/disable pm_irq to mask/unmask pm_irq and rename new functions
  enable/disable pm_interrupts to enable/disable pm_irq. (Tvrtko)
- Use u32 in place of uint32_t. (Tvrtko)

v3:
- Rename the fields pm_irq_mask & pm_ier_mask and do some cleanup. (Chris)
- Rebase.

v4: Fix the inadvertent disabling of User interrupt for VECS ring causing
failure for certain IGTs.

v5: Use dev_priv with HAS_VEBOX macro. (Tvrtko)
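
The shadow-mask update that lets Turbo and GuC share the PM IMR register
can be sketched in isolation. In the sketch below update_imr_shadow() is a
hypothetical helper that mirrors the arithmetic on the cached IMR value;
the precondition check is an assumption added for the example, and no
register access is modelled.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: compute the new IMR shadow value. A set bit in
 * IMR masks (blocks) the interrupt, so bits of interrupt_mask that are
 * also in enabled_irq_mask are cleared, and the remaining bits of
 * interrupt_mask are set. Bits outside interrupt_mask are untouched.
 */
static uint32_t update_imr_shadow(uint32_t imr, uint32_t interrupt_mask,
				  uint32_t enabled_irq_mask)
{
	/* Assumption: only bits inside interrupt_mask may be enabled. */
	assert((enabled_irq_mask & ~interrupt_mask) == 0);

	imr &= ~interrupt_mask;
	imr |= ~enabled_irq_mask & interrupt_mask;
	return imr;
}
```

Because each caller passes only its own bits in interrupt_mask, one user
of the register cannot disturb the bits owned by the other, and the cached
value avoids reading the register back before every update.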

Suggested-by: Chris Wilson 
Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_drv.h |  3 +-
 drivers/gpu/drm/i915/i915_irq.c | 75 ++---
 drivers/gpu/drm/i915/intel_drv.h|  3 ++
 drivers/gpu/drm/i915/intel_ringbuffer.c |  4 +-
 4 files changed, 57 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index bf397b6..3870786 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1834,7 +1834,8 @@ struct drm_i915_private {
u32 de_irq_mask[I915_MAX_PIPES];
};
u32 gt_irq_mask;
-   u32 pm_irq_mask;
+   u32 pm_imr;
+   u32 pm_ier;
u32 pm_rps_events;
u32 pipestat_irq_mask[I915_MAX_PIPES];
 
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index ddff6f9..3a7ab2e 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -303,18 +303,18 @@ static void snb_update_pm_irq(struct drm_i915_private *dev_priv,
 
assert_spin_locked(&dev_priv->irq_lock);
 
-   new_val = dev_priv->pm_irq_mask;
+   new_val = dev_priv->pm_imr;
new_val &= ~interrupt_mask;
new_val |= (~enabled_irq_mask & interrupt_mask);
 
-   if (new_val != dev_priv->pm_irq_mask) {
-   dev_priv->pm_irq_mask = new_val;
-   I915_WRITE(gen6_pm_imr(dev_priv), dev_priv->pm_irq_mask);
+   if (new_val != dev_priv->pm_imr) {
+   dev_priv->pm_imr = new_val;
+   I915_WRITE(gen6_pm_imr(dev_priv), dev_priv->pm_imr);
POSTING_READ(gen6_pm_imr(dev_priv));
}
 }
 
-void gen6_enable_pm_irq(struct drm_i915_private *dev_priv, uint32_t mask)
+void gen6_unmask_pm_irq(struct drm_i915_private *dev_priv, u32 mask)
 {
if (WARN_ON(!intel_irqs_enabled(dev_priv)))
return;
@@ -322,28 +322,54 @@ void gen6_enable_pm_irq(struct drm_i915_private *dev_priv, uint32_t mask)
snb_update_pm_irq(dev_priv, mask, mask);
 }
 
-static void __gen6_disable_pm_irq(struct drm_i915_private *dev_priv,
- uint32_t mask)
+static void __gen6_mask_pm_irq(struct drm_i915_private *dev_priv, u32 mask)
 {
snb_update_pm_irq(dev_priv, mask, 0);
 }
 
-void gen6_disable_pm_irq(struct drm_i915_private *dev_priv, uint32_t mask)
+void gen6_mask_pm_irq(struct drm_i915_private *dev_priv, u32 mask)
 {
if (WARN_ON(!intel_irqs_enabled(dev_priv)))
return;
 
-   __gen6_disable_pm_irq(dev_priv, mask);
+   __gen6_mask_pm_irq(dev_priv, mask);
 }
 
-void gen6_reset_rps_interrupts(struct drm_i915_private *dev_priv)
+void gen6_reset_pm_iir(struct drm_i915_private *dev_priv, u32 reset_mask)
 {
i915_reg_t reg = gen6_pm_iir(dev_priv);
 
-   spin_lock_irq(&dev_priv->irq_lock);
-   I915_WRITE(reg, dev_priv->pm_rps_events);
-   I915_WRITE(reg, dev_priv->pm_rps_events);
+   assert_spin_locked(&dev_priv->irq_lock);
+
+   I915_WRITE(reg, reset_mask);
+   I915_WRITE(reg, reset_mask);
POSTING_READ(reg);
+}
+
+void gen6_enable_pm_irq(struct drm_i915_private *dev_priv, u32 enable_mask)
+{
+   assert_spin_locked(&dev_priv->irq_lock);
+
+   dev_priv->pm_ier |= enable_mask;
+   I915_WRITE(gen6_pm_ier(dev_priv), dev_priv->pm_ier);
+   gen6_unmask_pm_irq(dev_priv, enable_mask);
+   /* unmask_pm_irq provides an implicit barrier (POSTING_READ) */
+}
+
+void gen6_disable_pm_irq(struct drm_i915_private *dev_priv, u32 disable_mask)
+{
+   assert_spin_locked(&dev_priv->irq_lock);
+
+   dev_priv->pm_ier &= ~disable_mask;
+   __gen6_mask_pm_irq(dev_priv, disable_mask);
+   I915_WRITE(gen6_pm_ier(dev_priv), dev_priv->pm_ier);
+   /* a barrier is missing here, but one isn't really needed */
+}
+
+void gen6_reset_rps_interrup

[Intel-gfx] [PATCH 13/18] drm/i915: Augment i915 error state to include the dump of GuC log buffer

2016-10-12 Thread akash . goel
From: Akash Goel 

Added the dump of GuC log buffer to i915 error state, as the contents of
GuC log buffer would also be useful to determine why the GPU reset
was triggered.

v2:
- For uniformity use existing helper function print_error_obj() to
  dump out contents of GuC log buffer, pretty printing is better left
  to userspace. (Chris)
- Skip the dumping of GuC log buffer when logging is disabled as it
  won't be of any use.
- Rebase.

v3: Rebase.

Suggested-by: Chris Wilson 
Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_drv.h   |  1 +
 drivers/gpu/drm/i915/i915_gpu_error.c | 15 +++
 2 files changed, 16 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f8389af..c67dea5 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -777,6 +777,7 @@ struct drm_i915_error_state {
struct intel_overlay_error_state *overlay;
struct intel_display_error_state *display;
struct drm_i915_error_object *semaphore;
+   struct drm_i915_error_object *guc_log;
 
struct drm_i915_error_engine {
int engine_id;
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 78cc13b..fc0f0b4 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -692,6 +692,8 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
 
print_error_obj(m, NULL, "Semaphores", error->semaphore);
 
+   print_error_obj(m, NULL, "GuC log buffer", error->guc_log);
+
if (error->overlay)
intel_overlay_print_error_state(m, error->overlay);
 
@@ -772,6 +774,7 @@ static void i915_error_state_free(struct kref *error_ref)
}
 
i915_error_object_free(error->semaphore);
+   i915_error_object_free(error->guc_log);
 
for (i = 0; i < ARRAY_SIZE(error->active_bo); i++)
kfree(error->active_bo[i]);
@@ -1382,6 +1385,17 @@ static void i915_capture_pinned_buffers(struct drm_i915_private *dev_priv,
error->pinned_bo = bo;
 }
 
+static void i915_gem_capture_guc_log_buffer(struct drm_i915_private *dev_priv,
+   struct drm_i915_error_state *error)
+{
+   /* Capturing log buf contents won't be useful if logging was disabled */
+   if (!dev_priv->guc.log.vma || (i915.guc_log_level < 0))
+   return;
+
+   error->guc_log = i915_error_object_create(dev_priv,
+ dev_priv->guc.log.vma);
+}
+
 /* Capture all registers which don't fit into another category. */
 static void i915_capture_reg_state(struct drm_i915_private *dev_priv,
   struct drm_i915_error_state *error)
@@ -1506,6 +1520,7 @@ static int capture(void *data)
i915_gem_record_rings(error->i915, error);
i915_capture_active_buffers(error->i915, error);
i915_capture_pinned_buffers(error->i915, error);
+   i915_gem_capture_guc_log_buffer(error->i915, error);
 
do_gettimeofday(&error->time);
 
-- 
1.9.2



[Intel-gfx] [PATCH 08/18] drm/i915: Add a relay backed debugfs interface for capturing GuC logs

2016-10-12 Thread akash . goel
From: Akash Goel 

Added a new debugfs interface '/sys/kernel/debug/dri/guc_log' for the
User to capture GuC firmware logs. The relay framework is used to implement
the interface, so the Driver just has to use a relay API to store
snapshots of the GuC log buffer in the buffer managed by relay.
The snapshot will be taken when GuC firmware sends a log buffer flush
interrupt and up to four snapshots could be stored in the relay buffer.
The relay buffer will be operated in a mode where it will overwrite the
data not yet collected by User.
Besides the mmap method, through which User can directly access the relay
buffer contents, relay also supports the 'poll' method. Through the 'poll'
call on the log file, User can learn whenever a new snapshot of the
log buffer is taken by Driver, and so can run in tandem with the Driver and
capture the logs in a sustained/streaming manner, without any loss of data.

v2: Defer the creation of relay channel & associated debugfs file, as
debugfs setup is now done at the end of i915 Driver load. (Chris)

v3:
- Switch to no-overwrite mode for relay.
- Fix the relay sub buffer switching sequence.

v4:
- Update i915 Kconfig to select RELAY config. (Tvrtko)
- Log a message when there is no sub buffer available to capture
  the GuC log buffer. (Tvrtko)
- Increase the number of relay sub buffers to 8 from 4, to have
  sufficient buffering for boot time logs

v5:
- Fix the alignment, indentation issues and some minor cleanup. (Tvrtko)
- Update the comment to elaborate on why a relay channel has to be
  associated with the debugfs file. (Tvrtko)

v6:
- Move the write to 'is_global' after the NULL check on parent directory
  dentry pointer. (Tvrtko)

v7: Add a BUG_ON to validate relay buffer allocation size. (Chris)
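
The no-overwrite behaviour adopted in v3 can be illustrated with a toy
model of a ring of sub buffers. This is only a sketch of the policy and
does not use the relay data structures or API; struct subbuf_ring,
try_produce() and consume() are hypothetical names, and N_SUBBUFS is set
to 8 only to match the v4 note.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define N_SUBBUFS 8	/* matches the v4 note; otherwise arbitrary */

/* Hypothetical ring of sub buffers; not the relay data structures. */
struct subbuf_ring {
	size_t produced;	/* sub buffers filled by the producer */
	size_t consumed;	/* sub buffers released by the consumer */
};

static bool ring_full(const struct subbuf_ring *r)
{
	assert(r->produced >= r->consumed);
	return r->produced - r->consumed >= N_SUBBUFS;
}

/* No-overwrite policy: refuse to switch when no sub buffer is free. */
static bool try_produce(struct subbuf_ring *r)
{
	if (ring_full(r))
		return false;	/* new snapshot dropped, old data kept */
	r->produced++;
	return true;
}

/* Consumer releases the oldest filled sub buffer, if there is one. */
static void consume(struct subbuf_ring *r)
{
	if (r->consumed < r->produced)
		r->consumed++;
}
```

The trade-off is that a slow consumer causes new snapshots to be lost
instead of corrupting sub buffers that are still being read.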

Testcase: igt/tools/intel_guc_logger

Suggested-by: Chris Wilson 
Signed-off-by: Sourab Gupta 
Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/Kconfig   |   1 +
 drivers/gpu/drm/i915/i915_drv.c|   2 +
 drivers/gpu/drm/i915/i915_guc_submission.c | 213 -
 drivers/gpu/drm/i915/intel_guc.h   |   3 +
 4 files changed, 217 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
index 6aedc96..105c468 100644
--- a/drivers/gpu/drm/i915/Kconfig
+++ b/drivers/gpu/drm/i915/Kconfig
@@ -12,6 +12,7 @@ config DRM_I915
select DRM_KMS_HELPER
select DRM_PANEL
select DRM_MIPI_DSI
+   select RELAY
# i915 depends on ACPI_VIDEO when ACPI is enabled
# but for select to work, need to select ACPI_VIDEO's dependencies, ick
select BACKLIGHT_LCD_SUPPORT if ACPI
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 89d3222..09a2944 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1128,6 +1128,7 @@ static void i915_driver_register(struct drm_i915_private *dev_priv)
/* Reveal our presence to userspace */
if (drm_dev_register(dev, 0) == 0) {
i915_debugfs_register(dev_priv);
+   i915_guc_register(dev_priv);
i915_setup_sysfs(dev_priv);
} else
DRM_ERROR("Failed to register driver for userspace access!\n");
@@ -1166,6 +1167,7 @@ static void i915_driver_unregister(struct drm_i915_private *dev_priv)
intel_opregion_unregister(dev_priv);
 
i915_teardown_sysfs(dev_priv);
+   i915_guc_unregister(dev_priv);
i915_debugfs_unregister(dev_priv);
drm_dev_unregister(&dev_priv->drm);
 
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index b3c81a2..2cdc3ca 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -23,6 +23,8 @@
  */
 #include 
 #include 
+#include 
+#include 
 #include "i915_drv.h"
 #include "intel_guc.h"
 
@@ -856,13 +858,160 @@ err:
return NULL;
 }
 
+/*
+ * Sub buffer switch callback. Called whenever relay has to switch to a new
+ * sub buffer, relay stays on the same sub buffer if 0 is returned.
+ */
+static int subbuf_start_callback(struct rchan_buf *buf,
+void *subbuf,
+void *prev_subbuf,
+size_t prev_padding)
+{
+   /* Use no-overwrite mode by default, where relay will stop accepting
+* new data if there are no empty sub buffers left.
+* There is no strict synchronization enforced by relay between Consumer
+* and Producer. In overwrite mode, there is a possibility of getting
+* inconsistent/garbled data, the producer could be writing on to the
+* same sub buffer from which Consumer is reading. This can't be avoided
+* unless Consumer is fast enough and ca

[Intel-gfx] [PATCH 05/18] drm/i915: Support for GuC interrupts

2016-10-12 Thread akash . goel
From: Sagar Arun Kamble 

There are certain types of interrupts which Host can receive from GuC.
GuC ukernel sends an interrupt to Host for certain events, for example
to have Host retrieve/consume the logs generated by ukernel.
This patch adds support to receive interrupts from GuC but currently
enables & partially handles only the interrupt sent by GuC ukernel.
Future patches will add support for handling other interrupt types.

v2:
- Use common low level routines for PM IER/IIR programming (Chris)
- Rename interrupt functions to gen9_xxx from gen8_xxx (Chris)
- Replace disabling of wake ref asserts with rpm get/put (Chris)

v3:
- Update comments for more clarity. (Tvrtko)
- Remove the masking of GuC interrupt, which was kept masked till the
  start of bottom half, it's not really needed as there is only a
  single instance of work item & wq is ordered. (Tvrtko)

v4:
- Rebase.
- Rename guc_events to pm_guc_events so as to be indicative of the
  register/control block it is associated with. (Chris)
- Add handling for back to back log buffer flush interrupts.

v5:
- Move the read & clearing of register, containing Guc2Host message
  bits, outside the irq spinlock. (Tvrtko)

v6:
- Move the log buffer flush interrupt related stuff to the following
  patch so as to do only generic bits in this patch. (Tvrtko)
- Rebase.

v7:
- Remove the interrupts_enabled check from gen9_guc_irq_handler, want to
  process that last interrupt also before disabling the interrupt, sync
  against the work queued by irq handler will be done by caller disabling
  the interrupt.

Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_drv.h|  1 +
 drivers/gpu/drm/i915/i915_guc_submission.c |  5 +++
 drivers/gpu/drm/i915/i915_irq.c| 56 +++---
 drivers/gpu/drm/i915/i915_reg.h| 11 ++
 drivers/gpu/drm/i915/intel_drv.h   |  3 ++
 drivers/gpu/drm/i915/intel_guc.h   |  3 ++
 drivers/gpu/drm/i915/intel_guc_loader.c|  4 +++
 7 files changed, 79 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 3870786..f8389af 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1837,6 +1837,7 @@ struct drm_i915_private {
u32 pm_imr;
u32 pm_ier;
u32 pm_rps_events;
+   u32 pm_guc_events;
u32 pipestat_irq_mask[I915_MAX_PIPES];
 
struct i915_hotplug hotplug;
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index ef38a32..889a72b 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -1084,6 +1084,8 @@ int intel_guc_suspend(struct drm_device *dev)
if (guc->guc_fw.guc_fw_load_status != GUC_FIRMWARE_SUCCESS)
return 0;
 
+   gen9_disable_guc_interrupts(dev_priv);
+
ctx = dev_priv->kernel_context;
 
data[0] = HOST2GUC_ACTION_ENTER_S_STATE;
@@ -1110,6 +1112,9 @@ int intel_guc_resume(struct drm_device *dev)
if (guc->guc_fw.guc_fw_load_status != GUC_FIRMWARE_SUCCESS)
return 0;
 
+   if (i915.guc_log_level >= 0)
+   gen9_enable_guc_interrupts(dev_priv);
+
ctx = dev_priv->kernel_context;
 
data[0] = HOST2GUC_ACTION_EXIT_S_STATE;
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 3a7ab2e..69f737f 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -170,6 +170,7 @@ static void gen5_assert_iir_is_zero(struct drm_i915_private *dev_priv,
 } while (0)
 
 static void gen6_rps_irq_handler(struct drm_i915_private *dev_priv, u32 pm_iir);
+static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv, u32 pm_iir);
 
 /* For display hotplug interrupt */
 static inline void
@@ -417,6 +418,38 @@ void gen6_disable_rps_interrupts(struct drm_i915_private *dev_priv)
gen6_reset_rps_interrupts(dev_priv);
 }
 
+void gen9_reset_guc_interrupts(struct drm_i915_private *dev_priv)
+{
+   spin_lock_irq(&dev_priv->irq_lock);
+   gen6_reset_pm_iir(dev_priv, dev_priv->pm_guc_events);
+   spin_unlock_irq(&dev_priv->irq_lock);
+}
+
+void gen9_enable_guc_interrupts(struct drm_i915_private *dev_priv)
+{
+   spin_lock_irq(&dev_priv->irq_lock);
+   if (!dev_priv->guc.interrupts_enabled) {
+   WARN_ON_ONCE(I915_READ(gen6_pm_iir(dev_priv)) &
+  dev_priv->pm_guc_events);
+   dev_priv->guc.interrupts_enabled = true;
+   gen6_enable_pm_irq(dev_priv, dev_priv->pm_guc_events);
+   }
+   spin_unlock_irq(&dev_priv->irq_lock);
+}
+
+void gen9_disable_guc_interrupts(struct drm_i915_private *dev_priv)
+{
+   spin_lock_irq(&dev_priv->irq_lock);
+   dev_priv-

[Intel-gfx] [PATCH 09/18] drm/i915: New lock to serialize the Host2GuC actions

2016-10-12 Thread akash . goel
From: Akash Goel 

With the addition of new Host2GuC actions related to GuC logging, there
is a need for a lock to serialize them, as they can execute concurrently
with each other and also with other existing actions.

v2: Use mutex in place of spinlock to serialize, as sleep can happen
while waiting for the action's response from GuC. (Tvrtko)

v3: To conform to the general rules, acquire mutex before taking the
forcewake. (Tvrtko)

Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 3 +++
 drivers/gpu/drm/i915/intel_guc.h   | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 2cdc3ca..04b2e87 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -87,6 +87,7 @@ static int host2guc_action(struct intel_guc *guc, u32 *data, u32 len)
if (WARN_ON(len < 1 || len > 15))
return -EINVAL;
 
+   mutex_lock(&guc->action_lock);
intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
 
dev_priv->guc.action_count += 1;
@@ -125,6 +126,7 @@ static int host2guc_action(struct intel_guc *guc, u32 *data, u32 len)
dev_priv->guc.action_status = status;
 
intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
+   mutex_unlock(&guc->action_lock);
 
return ret;
 }
@@ -1365,6 +1367,7 @@ int i915_guc_submission_init(struct drm_i915_private *dev_priv)
 
guc->ctx_pool_vma = vma;
ida_init(&guc->ctx_ids);
+   mutex_init(&guc->action_lock);
guc_log_create(guc);
guc_addon_create(guc);
 
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index e7999dd..907d13a 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -157,6 +157,9 @@ struct intel_guc {
 
uint64_t submissions[I915_NUM_ENGINES];
uint32_t last_seqno[I915_NUM_ENGINES];
+
+   /* To serialize the Host2GuC actions */
+   struct mutex action_lock;
 };
 
 /* intel_guc_loader.c */
-- 
1.9.2



[Intel-gfx] [PATCH 06/18] drm/i915: Handle log buffer flush interrupt event from GuC

2016-10-12 Thread akash . goel
From: Sagar Arun Kamble 

GuC ukernel sends an interrupt to Host to flush the log buffer
and expects Host to correspondingly update the read pointer
information in the state structure, once it has consumed the
log buffer contents by copying them to a file or buffer.
Even if Host couldn't copy the contents, it can still update the
read pointer so that logging state is not disturbed on GuC side.

v2:
- Use a dedicated workqueue for handling flush interrupt. (Tvrtko)
- Reduce the overall log buffer copying time by skipping the copy of
  crash buffer area for regular cases and copying only the state
  structure data in first page.

v3:
 - Create a vmalloc mapping of log buffer. (Chris)
 - Cover the flush acknowledgment under rpm get & put.(Chris)
 - Revert the change of skipping the copy of crash dump area, as
   not really needed, will be covered by subsequent patch.

v4:
 - Destroy the wq under the same condition in which it was created,
   pass dev_priv pointer instead of dev to newly added GuC function,
   add more comments & rename variable for clarity. (Tvrtko)

v5:
- Allocate & destroy the dedicated wq, for handling flush interrupt,
  from the setup/teardown routines of GuC logging. (Chris)
- Validate the log buffer size value retrieved from state structure
  and do some minor cleanup. (Tvrtko)
- Fix error/warnings reported by checkpatch. (Tvrtko)
- Rebase.

v6:
 - Remove the interrupts_enabled check from guc_capture_logs_work, need
   to process that last work item also, queued just before disabling the
   interrupt as log buffer flush interrupt handling is a bit different
   case where GuC is actually expecting an ACK from host, which should be
   provided to keep the logging going.
   Sync against the work will be done by caller disabling the interrupt.
 - Don't sample the log buffer size value from state structure, directly
   use the expected value to move the pointer & do the copy and that cannot
   go wrong (out of bounds) as Driver only allocated the log buffer and the
   relay buffers. Driver should refrain from interpreting the log packet,
   as much as possible and let Userspace parser detect the anomaly. (Chris)

v7:
- Use switch statement instead of 'if else' for retrieving the GuC log
  buffer size. (Tvrtko)
- Refactored the log buffer copying function and shortened the names of
  a couple of variables for better readability. (Tvrtko)

v8:
- Make the dedicated wq a high priority one to further reduce the
  turnaround time of handling the log buffer flush event from GuC.

Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 186 +
 drivers/gpu/drm/i915/i915_irq.c|  28 -
 drivers/gpu/drm/i915/intel_guc.h   |   4 +
 3 files changed, 217 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 889a72b..b3c81a2 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -170,6 +170,15 @@ static int host2guc_sample_forcewake(struct intel_guc *guc,
return host2guc_action(guc, data, ARRAY_SIZE(data));
 }
 
+static int host2guc_logbuffer_flush_complete(struct intel_guc *guc)
+{
+   u32 data[1];
+
+   data[0] = HOST2GUC_ACTION_LOG_BUFFER_FILE_FLUSH_COMPLETE;
+
+   return host2guc_action(guc, data, 1);
+}
+
 /*
  * Initialise, update, or clear doorbell data shared with the GuC
  *
@@ -847,6 +856,163 @@ err:
return NULL;
 }
 
+static void guc_move_to_next_buf(struct intel_guc *guc)
+{
+}
+
+static void *guc_get_write_buffer(struct intel_guc *guc)
+{
+   return NULL;
+}
+
+static unsigned int guc_get_log_buffer_size(enum guc_log_buffer_type type)
+{
+   switch (type) {
+   case GUC_ISR_LOG_BUFFER:
+   return (GUC_LOG_ISR_PAGES + 1) * PAGE_SIZE;
+   case GUC_DPC_LOG_BUFFER:
+   return (GUC_LOG_DPC_PAGES + 1) * PAGE_SIZE;
+   case GUC_CRASH_DUMP_LOG_BUFFER:
+   return (GUC_LOG_CRASH_PAGES + 1) * PAGE_SIZE;
+   default:
+   MISSING_CASE(type);
+   }
+
+   return 0;
+}
+
+static void guc_read_update_log_buffer(struct intel_guc *guc)
+{
+   struct guc_log_buffer_state *log_buf_state, *log_buf_snapshot_state;
+   struct guc_log_buffer_state log_buf_state_local;
+   unsigned int buffer_size, write_offset;
+   enum guc_log_buffer_type type;
+   void *src_data, *dst_data;
+
+   if (WARN_ON(!guc->log.buf_addr))
+   return;
+
+   /* Get the pointer to shared GuC log buffer */
+   log_buf_state = src_data = guc->log.buf_addr;
+
+   /* Get the pointer to local buffer to store the logs */
+   log_buf_snapshot_state = dst_data = guc_get_write_buffer(guc);
+
+   /* Actual logs are present from the 2nd page */
+   src_data += PAGE_SIZE;
+   

[Intel-gfx] [PATCH 07/18] relay: Use per CPU constructs for the relay channel buffer pointers

2016-10-12 Thread akash . goel
From: Akash Goel 

relay essentially needs to maintain a per CPU array of channel buffer
pointers, but it creates that array manually.
Instead it's better to use the per CPU constructs, provided by the
kernel, to allocate & access the array of pointers to channel buffers.

This patch is queued for merge in linux-next via akpm.

v2: Include  in relay.h so that it pulls in the percpu
api explicitly. (Chris)

Link: http://lkml.kernel.org/r/1470909140-25919-1-git-send-email-akash.g...@intel.com
Cc: Chris Wilson 
Cc: Tvrtko Ursulin 
Signed-off-by: Akash Goel 
Reviewed-by: Chris Wilson 
---
 include/linux/relay.h | 17 +++-
 kernel/relay.c| 74 +--
 2 files changed, 52 insertions(+), 39 deletions(-)

diff --git a/include/linux/relay.h b/include/linux/relay.h
index d7c8359..eb295e3 100644
--- a/include/linux/relay.h
+++ b/include/linux/relay.h
@@ -19,6 +19,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /*
  * Tracks changes to rchan/rchan_buf structs
@@ -63,7 +64,7 @@ struct rchan
struct kref kref;   /* channel refcount */
void *private_data; /* for user-defined data */
size_t last_toobig; /* tried to log event > subbuf size */
-   struct rchan_buf *buf[NR_CPUS]; /* per-cpu channel buffers */
+   struct rchan_buf ** __percpu buf; /* per-cpu channel buffers */
int is_global;  /* One global buffer ? */
struct list_head list;  /* for channel list */
struct dentry *parent;  /* parent dentry passed to open */
@@ -204,7 +205,7 @@ static inline void relay_write(struct rchan *chan,
struct rchan_buf *buf;
 
local_irq_save(flags);
-   buf = chan->buf[smp_processor_id()];
+   buf = *this_cpu_ptr(chan->buf);
if (unlikely(buf->offset + length > chan->subbuf_size))
length = relay_switch_subbuf(buf, length);
memcpy(buf->data + buf->offset, data, length);
@@ -230,12 +231,12 @@ static inline void __relay_write(struct rchan *chan,
 {
struct rchan_buf *buf;
 
-   buf = chan->buf[get_cpu()];
+   buf = *get_cpu_ptr(chan->buf);
if (unlikely(buf->offset + length > buf->chan->subbuf_size))
length = relay_switch_subbuf(buf, length);
memcpy(buf->data + buf->offset, data, length);
buf->offset += length;
-   put_cpu();
+   put_cpu_ptr(chan->buf);
 }
 
 /**
@@ -251,17 +252,19 @@ static inline void __relay_write(struct rchan *chan,
  */
 static inline void *relay_reserve(struct rchan *chan, size_t length)
 {
-   void *reserved;
-   struct rchan_buf *buf = chan->buf[smp_processor_id()];
+   void *reserved = NULL;
+   struct rchan_buf *buf = *get_cpu_ptr(chan->buf);
 
if (unlikely(buf->offset + length > buf->chan->subbuf_size)) {
length = relay_switch_subbuf(buf, length);
if (!length)
-   return NULL;
+   goto end;
}
reserved = buf->data + buf->offset;
buf->offset += length;
 
+end:
+   put_cpu_ptr(chan->buf);
return reserved;
 }
 
diff --git a/kernel/relay.c b/kernel/relay.c
index d797502..f55ab82 100644
--- a/kernel/relay.c
+++ b/kernel/relay.c
@@ -214,7 +214,7 @@ static void relay_destroy_buf(struct rchan_buf *buf)
__free_page(buf->page_array[i]);
relay_free_page_array(buf->page_array);
}
-   chan->buf[buf->cpu] = NULL;
+   *per_cpu_ptr(chan->buf, buf->cpu) = NULL;
kfree(buf->padding);
kfree(buf);
kref_put(&chan->kref, relay_destroy_channel);
@@ -382,20 +382,21 @@ static void __relay_reset(struct rchan_buf *buf, unsigned int init)
  */
 void relay_reset(struct rchan *chan)
 {
+   struct rchan_buf *buf;
unsigned int i;
 
if (!chan)
return;
 
-   if (chan->is_global && chan->buf[0]) {
-   __relay_reset(chan->buf[0], 0);
+   if (chan->is_global && (buf = *per_cpu_ptr(chan->buf, 0))) {
+   __relay_reset(buf, 0);
return;
}
 
mutex_lock(&relay_channels_mutex);
for_each_possible_cpu(i)
-   if (chan->buf[i])
-   __relay_reset(chan->buf[i], 0);
+   if ((buf = *per_cpu_ptr(chan->buf, i)))
+   __relay_reset(buf, 0);
mutex_unlock(&relay_channels_mutex);
 }
 EXPORT_SYMBOL_GPL(relay_reset);
@@ -440,7 +441,7 @@ static struct rchan_buf *relay_open_buf(struct rchan *chan, unsigned int cpu)
struct dentry *dentry;
 
if (chan->is_global)
-   return chan->buf[0];
+   return *per_cpu_ptr(chan->buf, 0);
 
buf = relay_create_buf(chan);
   

[Intel-gfx] [PATCH v10 00/18] Support for sustained capturing of GuC firmware logs

2016-10-12 Thread akash . goel
From: Akash Goel 

GuC firmware logs its debug messages into a Host-GuC shared memory buffer
and sends a Flush interrupt to Host when the buffer is half full.
GuC firmware follows a half-full draining protocol: it expects that while it
is writing to the 2nd half of the buffer, the 1st half gets consumed by Host,
and that it then gets a flush completed acknowledgment from Host, so that
it does not end up doing any overwrite causing loss of logs.
So far the flush interrupt wasn't enabled on Host side & User could capture
the contents/snapshot of the log buffer through the 'i915_guc_log_dump'
debugfs iface. But this couldn't meet a couple of key requirements,
especially of Validation: the first is to ensure capturing of all boot time
logs even with a high verbosity level, and the second is to enable capturing
of logs in a sustained manner, like for the entire duration of a workload.
Now Driver will enable the flush interrupt and, on receiving it, will copy
the contents of the log buffer into its local buffer. The local buffer is
big enough to contain multiple snapshots of the log buffer, giving ample
time to User to pull boot time messages.
A debugfs interface '/sys/kernel/debug/dri/guc_log' has been added for User
to collect the logs. The relay framework is used to implement this interface,
so Driver just has to use a relay API to store snapshots of the GuC log
buffer in a buffer managed by relay. The relay buffer can be operated in a
mode equivalent to 'dmesg -c', where the old data, not yet collected by User,
will be overwritten if the buffer becomes full, or in no-overwrite mode,
where relay will stop accepting new data if all sub buffers are full.
The latter mode is used to avoid the possibility of getting garbled data.
Besides the mmap method, through which User can directly access the relay
buffer contents, relay also supports the 'poll' method. Through the 'poll'
call on the log file, User can come to know whenever a new snapshot of the
log buffer is taken by Driver, so it can run in tandem with the Driver and
thus capture logs in a sustained/streaming manner, without any loss of data.

The logs can be captured from relay backed debugfs file through the utility
igt/tools/intel_guc_logger.
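
For illustration only, the 'poll' based capture described above could look
roughly like the user-space sketch below. It assumes nothing beyond POSIX
poll()/read(); the helper name capture_once() and its parameters are
hypothetical and are not taken from intel_guc_logger. The fd would typically
come from opening the relay backed debugfs file.

```c
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of intel_guc_logger): wait up to timeout_ms
 * for new data on fd, then read at most len bytes into buf.
 * Returns the number of bytes read, 0 on timeout or end of file, -1 on error.
 */
static ssize_t capture_once(int fd, void *buf, size_t len, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int ret;

	/* Restart the wait if it gets interrupted by a signal */
	do {
		ret = poll(&pfd, 1, timeout_ms);
	} while (ret < 0 && errno == EINTR);

	if (ret < 0)
		return -1;
	if (ret == 0)
		return 0;	/* nothing new within the timeout */

	return read(fd, buf, len);
}
```

A capture loop would call such a helper repeatedly and append whatever it
returns to the output file, until the User stops it.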

v2: Rebased to the latest drm-intel-nightly.

v3: Aligned with the modification of late debugfs registration, at the end of
i915 Driver load. Did cleanup as per Tvrtko's review comments, added 3
new patches to optimize the log-buffer flush interrupt handling, gather
and report the logging related stats.

v4: Added 2 new patches to further optimize the log-buffer flush interrupt
handling. Did cleanup as per Chris's review comments, fixed couple of
issues related to clearing of Guc2Host message register. Switched to
no-overwrite mode for the relay.

v5: Added a new patch to avail MOVNTDQA instruction based fast memcpy provided
by a patch from Chris. Dropped the rt priority kthread patch, after
evaluating all the optimizations with certain benchmarks like
synmark_oglmultithread, synmark_oglbatch5 which generate flush interrupts
almost every ms or less. Updated the older patches as per the review
comments from Tvrtko and Chris W. Added a new patch to augment i915 error
state with the GuC log buffer contents. Fixed the issue of User interrupt
getting disabled for VEBOX ring, causing failure for certain IGTs.
Also included 2 patches to support early logging for capturing boot
time logs and use per CPU constructs on the relay side so as to address
a WARNING issue with the call to relay_reserve(), without disabling
preemption.

v6: Mainly did the rebasing, refactoring, cleanup as per the review comments
and fixed error/warnings reported by checkpatch.

v7: Added a new patch to complete the pending log buffer flush work item in
system suspend case. Cleaned up the irq handler & work item function
by removing the check for GuC interrupts.

v8: Replaced the patch added in last version with a patch which marks the
GuC log buffer flush interrupt handling WQ as freezable, as per the inputs
from Imre. Refactored the log buffer sampling function and added a new
helper function to improve the readability as per suggestions from Tvrtko.

v9: As per Chris's comment, removed the forceful flush of GuC log buffer from
the error state capture path as that could have disturbed the atomicity
required in error state path. Squashed the wc type vmalloc mapping patch
with SSE4.1 movntdqa based memcpy patch. Added a BUG_ON for the relay
buffer allocation size.

v10: Mainly rebasing. Made the dedicated WQ as a high priority one.

Akash Goel (12):
  drm/i915: New structure to contain GuC logging related fields
  drm/i915: Add low level set of routines for programming PM IER/IIR/IMR
register set
  relay: Use per CPU constructs for the relay channel buffer pointers
  drm/i915: 

[Intel-gfx] [PATCH 03/18] drm/i915: New structure to contain GuC logging related fields

2016-10-12 Thread akash . goel
From: Akash Goel 

So far there were 2 fields related to GuC logs in 'intel_guc' structure.
For the support of capturing GuC logs & storing them in a local buffer,
multiple new fields would have to be added. This warrants a separate
structure to contain the fields related to GuC logging state.
Added a new structure 'intel_guc_log' and instance of it inside
'intel_guc' structure.

v2: Rebase.

Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_debugfs.c| 4 ++--
 drivers/gpu/drm/i915/i915_guc_submission.c | 8 
 drivers/gpu/drm/i915/intel_guc.h   | 8 ++--
 drivers/gpu/drm/i915/intel_guc_loader.c| 2 +-
 4 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 358663e..0fb9938 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2492,10 +2492,10 @@ static int i915_guc_log_dump(struct seq_file *m, void *data)
struct drm_i915_gem_object *obj;
int i = 0, pg;
 
-   if (!dev_priv->guc.log_vma)
+   if (!dev_priv->guc.log.vma)
return 0;
 
-   obj = dev_priv->guc.log_vma->obj;
+   obj = dev_priv->guc.log.vma->obj;
for (pg = 0; pg < obj->base.size / PAGE_SIZE; pg++) {
u32 *log = kmap_atomic(i915_gem_object_get_page(obj, pg));
 
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 2ce98aaf..ef38a32 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -862,7 +862,7 @@ static void guc_log_create(struct intel_guc *guc)
GUC_LOG_ISR_PAGES + 1 +
GUC_LOG_CRASH_PAGES + 1) << PAGE_SHIFT;
 
-   vma = guc->log_vma;
+   vma = guc->log.vma;
if (!vma) {
vma = guc_allocate_vma(guc, size);
if (IS_ERR(vma)) {
@@ -871,7 +871,7 @@ static void guc_log_create(struct intel_guc *guc)
return;
}
 
-   guc->log_vma = vma;
+   guc->log.vma = vma;
}
 
/* each allocated unit is a page */
@@ -881,7 +881,7 @@ static void guc_log_create(struct intel_guc *guc)
(GUC_LOG_CRASH_PAGES << GUC_LOG_CRASH_SHIFT);
 
offset = i915_ggtt_offset(vma) >> PAGE_SHIFT; /* in pages */
-   guc->log_flags = (offset << GUC_LOG_BUF_ADDR_SHIFT) | flags;
+   guc->log.flags = (offset << GUC_LOG_BUF_ADDR_SHIFT) | flags;
 }
 
 static void guc_policies_init(struct guc_policies *policies)
@@ -1063,7 +1063,7 @@ void i915_guc_submission_fini(struct drm_i915_private *dev_priv)
struct intel_guc *guc = &dev_priv->guc;
 
i915_vma_unpin_and_release(&guc->ads_vma);
-   i915_vma_unpin_and_release(&guc->log_vma);
+   i915_vma_unpin_and_release(&guc->log.vma);
 
if (guc->ctx_pool_vma)
ida_destroy(&guc->ctx_ids);
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index 5cdf7aa..c732941 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -123,10 +123,14 @@ struct intel_guc_fw {
uint32_t ucode_offset;
 };
 
+struct intel_guc_log {
+   uint32_t flags;
+   struct i915_vma *vma;
+};
+
 struct intel_guc {
struct intel_guc_fw guc_fw;
-   uint32_t log_flags;
-   struct i915_vma *log_vma;
+   struct intel_guc_log log;
 
struct i915_vma *ads_vma;
struct i915_vma *ctx_pool_vma;
diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c
index c37829f..410d98a 100644
--- a/drivers/gpu/drm/i915/intel_guc_loader.c
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -209,7 +209,7 @@ static void guc_params_init(struct drm_i915_private *dev_priv)
params[GUC_CTL_FEATURE] |= GUC_CTL_DISABLE_SCHEDULER |
GUC_CTL_VCS2_ENABLED;
 
-   params[GUC_CTL_LOG_PARAMS] = guc->log_flags;
+   params[GUC_CTL_LOG_PARAMS] = guc->log.flags;
 
if (i915.guc_log_level >= 0) {
params[GUC_CTL_DEBUG] =
-- 
1.9.2

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 02/18] drm/i915: Add GuC ukernel logging related fields to fw interface file

2016-10-12 Thread akash . goel
From: Sagar Arun Kamble 

The first page of the GuC log buffer contains state info or meta data
which is required to parse the logs contained in the subsequent pages.
The structure representing the state info is added to interface file
as Driver would need to handle log buffer flush interrupts from GuC.
Added an enum for the different message/event types that can be sent
by the GuC ukernel to Host.
Also added 2 new Host to GuC action types to inform GuC when Host has
flushed the log buffer and to forcefully cause the GuC to send a new
log buffer flush interrupt.
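
As a rough illustration of the handshake that the state structure
coordinates, here is a stand-alone sketch. The struct and function names are
hypothetical and only mirror the fields named in this patch; the real
structure packs flush_to_file into a 1-bit field, and the copying of the log
data itself is left out.

```c
#include <stdint.h>

/* Illustrative mirror of the state header fields used by the handshake */
struct log_state_sketch {
	uint32_t read_ptr;		/* updated by Host only */
	uint32_t write_ptr;		/* updated by GuC only */
	uint32_t sampled_write_ptr;
	uint32_t flush_to_file;		/* 1-bit field in the real structure */
};

/* GuC side: buffer got half full, so sample write_ptr and request a flush */
static void guc_request_flush(struct log_state_sketch *st)
{
	st->flush_to_file = 1;
	st->sampled_write_ptr = st->write_ptr;
	/* ... GuC would raise the flush interrupt here ... */
}

/* Host side: called after the log data has been read out of the buffer */
static void host_ack_flush(struct log_state_sketch *st)
{
	st->flush_to_file = 0;
	st->read_ptr = st->sampled_write_ptr;
	/* ... Host would now send the flush complete action to GuC ... */
}
```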

v2:
- Make documentation of log buffer state structure more elaborate &
  rename LOGBUFFERFLUSH action to LOG_BUFFER_FLUSH for consistency. (Tvrtko)

v3: Add GuC log buffer layout diagram for more clarity.

Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/intel_guc_fwif.h | 78 +++
 1 file changed, 78 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h b/drivers/gpu/drm/i915/intel_guc_fwif.h
index e40db2d..adb1ffd 100644
--- a/drivers/gpu/drm/i915/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
@@ -419,15 +419,87 @@ struct guc_ads {
u32 reserved2[4];
 } __packed;
 
+/* GuC logging structures */
+
+enum guc_log_buffer_type {
+   GUC_ISR_LOG_BUFFER,
+   GUC_DPC_LOG_BUFFER,
+   GUC_CRASH_DUMP_LOG_BUFFER,
+   GUC_MAX_LOG_BUFFER
+};
+
+/**
+ * DOC: GuC Log buffer Layout
+ *
+ * Page0  +---+
+ *|   ISR state header (32 bytes) |
+ *|  DPC state header |
+ *|   Crash dump state header |
+ * Page1  +---+
+ *|   ISR logs|
+ * Page5  +---+
+ *|   DPC logs|
+ * Page9  +---+
+ *| Crash Dump logs   |
+ *+---+
+ *
+ * Below state structure is used for coordination of retrieval of GuC firmware
+ * logs. Separate state is maintained for each log buffer type.
+ * read_ptr points to the location where i915 read last in log buffer and
+ * is read only for GuC firmware. write_ptr is incremented by GuC with number
+ * of bytes written for each log entry and is read only for i915.
+ * When any type of log buffer becomes half full, GuC sends a flush interrupt.
+ * GuC firmware expects that while it is writing to 2nd half of the buffer,
+ * first half would get consumed by Host and then get a flush completed
+ * acknowledgment from Host, so that it does not end up doing any overwrite
+ * causing loss of logs. So when buffer gets half filled & i915 has requested
+ * for interrupt, GuC will set flush_to_file field, set the sampled_write_ptr
+ * to the value of write_ptr and raise the interrupt.
+ * On receiving the interrupt i915 should read the buffer, clear flush_to_file
+ * field and also update read_ptr with the value of sampled_write_ptr, before
+ * sending an acknowledgment to GuC. marker & version fields are for internal
+ * usage of GuC and opaque to i915. buffer_full_cnt field is incremented every
+ * time GuC detects the log buffer overflow.
+ */
+struct guc_log_buffer_state {
+   u32 marker[2];
+   u32 read_ptr;
+   u32 write_ptr;
+   u32 size;
+   u32 sampled_write_ptr;
+   union {
+   struct {
+   u32 flush_to_file:1;
+   u32 buffer_full_cnt:4;
+   u32 reserved:27;
+   };
+   u32 flags;
+   };
+   u32 version;
+} __packed;
+
+union guc_log_control {
+   struct {
+   u32 logging_enabled:1;
+   u32 reserved1:3;
+   u32 verbosity:4;
+   u32 reserved2:24;
+   };
+   u32 value;
+} __packed;
+
 /* This Action will be programmed in C180 - SOFT_SCRATCH_O_REG */
 enum host2guc_action {
HOST2GUC_ACTION_DEFAULT = 0x0,
HOST2GUC_ACTION_SAMPLE_FORCEWAKE = 0x6,
HOST2GUC_ACTION_ALLOCATE_DOORBELL = 0x10,
HOST2GUC_ACTION_DEALLOCATE_DOORBELL = 0x20,
+   HOST2GUC_ACTION_LOG_BUFFER_FILE_FLUSH_COMPLETE = 0x30,
+   HOST2GUC_ACTION_FORCE_LOG_BUFFER_FLUSH = 0x302,
HOST2GUC_ACTION_ENTER_S_STATE = 0x501,
HOST2GUC_ACTION_EXIT_S_STATE = 0x502,
HOST2GUC_ACTION_SLPC_REQUEST = 0x3003,
+   HOST2GUC_ACTION_UK_LOG_ENABLE_LOGGING = 0x0E000,
HOST2GUC_ACTION_LIMIT
 };
 
@@ -449,4 +521,10 @@ enum guc2host_status {
GUC2HOST_STATUS_GENERIC_FAIL = GUC2HOST_STATUS(0xF000)
 };
 
+/* This action will be programmed in C1BC - SOFT_SCRATCH_15_REG */
+enum guc2host_message {
+   GUC2HOST_MSG_CRASH_DUMP_POSTED = (1 << 1),
+   GUC2HOST_MSG_FLUSH_LOG_BUFFER = (1 << 3)
+};
+
 #endif
-- 
1.9.2


[Intel-gfx] [PATCH 01/18] drm/i915: Decouple GuC log setup from verbosity parameter

2016-10-12 Thread akash . goel
From: Sagar Arun Kamble 

GuC Log buffer allocation was tied up with verbosity level module param
i915.guc_log_level. User would be given a provision to enable firmware
logging at runtime, through a host2guc action, and not necessarily during
Driver load time. But the address of log buffer can be passed only in
init params, at firmware load time, so GuC has to be reset and firmware
needs to be reloaded to pass the log buffer address at runtime.
To avoid reset of GuC & reload of firmware, allocation of log buffer will
be done always but logging would be enabled initially on GuC side based on
the value of module parameter guc_log_level.

v2: Update commit message to describe the constraint with allocation of
log buffer at runtime. (Tvrtko)

v3: Rebase.

Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 3 ---
 drivers/gpu/drm/i915/intel_guc_loader.c| 6 --
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 3106dcc..2ce98aaf 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -853,9 +853,6 @@ static void guc_log_create(struct intel_guc *guc)
unsigned long offset;
uint32_t size, flags;
 
-   if (i915.guc_log_level < GUC_LOG_VERBOSITY_MIN)
-   return;
-
if (i915.guc_log_level > GUC_LOG_VERBOSITY_MAX)
i915.guc_log_level = GUC_LOG_VERBOSITY_MAX;
 
diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c
index 7ace96b..c37829f 100644
--- a/drivers/gpu/drm/i915/intel_guc_loader.c
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -209,11 +209,13 @@ static void guc_params_init(struct drm_i915_private *dev_priv)
params[GUC_CTL_FEATURE] |= GUC_CTL_DISABLE_SCHEDULER |
GUC_CTL_VCS2_ENABLED;
 
+   params[GUC_CTL_LOG_PARAMS] = guc->log_flags;
+
if (i915.guc_log_level >= 0) {
-   params[GUC_CTL_LOG_PARAMS] = guc->log_flags;
params[GUC_CTL_DEBUG] =
i915.guc_log_level << GUC_LOG_VERBOSITY_SHIFT;
-   }
+   } else
+   params[GUC_CTL_DEBUG] = GUC_LOG_DISABLED;
 
if (guc->ads_vma) {
u32 ads = i915_ggtt_offset(guc->ads_vma) >> PAGE_SHIFT;
-- 
1.9.2



[Intel-gfx] [PATCH v4] tools/intel_guc_logger: Utility for capturing GuC firmware logs in a file

2016-10-10 Thread akash . goel
From: Akash Goel 

This patch provides a test utility which helps capture GuC firmware logs and
then dump them to file.
The logs are pulled from a debugfs file '/sys/kernel/debug/dri/guc_log' and
by default stored into a file 'guc_log_dump.dat'. The name, including the
location, of the output file can be changed through a command line argument.

The utility goes into an infinite loop where it waits for the arrival of new
logs, and as soon as a new set of logs is produced it captures them in its
local buffer, which is then flushed out to the file on disk.
Whenever logging needs to be ended, User can stop this utility (CTRL+C).

Before entering into a loop, it first discards whatever logs are present in
the debugfs file.
This way User can first launch this utility and then start a workload/activity
for which GuC firmware logs are to be actually captured, and keep running the
utility for as long as it's needed; once the workload is over, this utility
can be forcefully stopped.

If the logging wasn't enabled on GuC side by the Driver at boot time, utility
will first enable the logging and later on when it is stopped (CTRL+C) it will
also pause the logging on GuC side.
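
As a small illustration of the value that the utility writes to the log
control file, the sketch below mirrors the expression used in
guc_log_control() in this patch (bit 0 enables logging, verbosity starts at
bit 4). The function name is hypothetical, and masking the verbosity to 4
bits is an added assumption, not something the utility does.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: build the value written to i915_guc_log_control.
 * Bit 0 enables logging and the verbosity level sits at bits 7:4.
 * Masking to 4 bits is an assumption made for this sketch.
 */
static uint64_t guc_log_control_value(bool enable, unsigned int verbosity)
{
	if (!enable)
		return 0;

	return ((uint64_t)(verbosity & 0xf) << 4) | 0x1;
}
```

With the default verbosity level of 3, enabling the logging gives 0x31 and
pausing it gives 0.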

v2:
- Use combination of alarm system call & SIGALRM signal to run the utility
  for required duration. (Tvrtko)
- Fix inconsistencies, do minor cleanup and refactoring. (Tvrtko)

v3:
- Fix discrepancy for the output file command line option and update the
  Usage/help string.

v4:
- Update the exit condition for flusher thread, now will exit only after
  the capture loop is over and not when the flag to stop logging is set.
  This handles a corner case, due to which the dump of last captured buffer
  was getting missed.
- Add a newline character at the end of assert messages.
- Avoid the assert for the case, which occurs very rarely, when there are no
  bytes read from the relay file.

Cc: Tvrtko Ursulin 
Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin  (v3)
---
 tools/Makefile.sources   |   1 +
 tools/intel_guc_logger.c | 438 +++
 2 files changed, 439 insertions(+)
 create mode 100644 tools/intel_guc_logger.c

diff --git a/tools/Makefile.sources b/tools/Makefile.sources
index 2bb6c8e..be58871 100644
--- a/tools/Makefile.sources
+++ b/tools/Makefile.sources
@@ -19,6 +19,7 @@ tools_prog_lists =\
intel_gpu_time  \
intel_gpu_top   \
intel_gtt   \
+   intel_guc_logger\
intel_infoframes\
intel_l3_parity \
intel_lid   \
diff --git a/tools/intel_guc_logger.c b/tools/intel_guc_logger.c
new file mode 100644
index 000..159a54e
--- /dev/null
+++ b/tools/intel_guc_logger.c
@@ -0,0 +1,438 @@
+
+#define _GNU_SOURCE  /* For using O_DIRECT */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "igt.h"
+
+#define MB(x) ((uint64_t)(x) * 1024 * 1024)
+#ifndef PAGE_SIZE
+  #define PAGE_SIZE 4096
+#endif
+/* Currently the size of GuC log buffer is 19 pages & so is the size of relay
+ * subbuffer. If the size changes in future, then this define also needs to be
+ * updated accordingly.
+ */
+#define SUBBUF_SIZE (19*PAGE_SIZE)
+/* Need large buffering from logger side to hide the DISK IO latency, Driver
+ * can only store 8 snapshots of GuC log buffer in relay.
+ */
+#define NUM_SUBBUFS 100
+
+#define RELAY_FILE_NAME  "guc_log"
+#define DEFAULT_OUTPUT_FILE_NAME  "guc_log_dump.dat"
+#define CONTROL_FILE_NAME "i915_guc_log_control"
+
+char *read_buffer;
+char *out_filename;
+int poll_timeout = 2; /* by default 2ms timeout */
+pthread_mutex_t mutex;
+pthread_t flush_thread;
+int verbosity_level = 3; /* by default capture logs at max verbosity */
+uint32_t produced, consumed;
+uint64_t total_bytes_written;
+int num_buffers = NUM_SUBBUFS;
+int relay_fd, outfile_fd = -1;
+uint32_t test_duration, max_filesize;
+pthread_cond_t underflow_cond, overflow_cond;
+bool stop_logging, discard_oldlogs, capturing_stopped;
+
+static void guc_log_control(bool enable_logging)
+{
+   int control_fd;
+   char data[19];
+   uint64_t val;
+   int ret;
+
+   control_fd = igt_debugfs_open(CONTROL_FILE_NAME, O_WRONLY);
+   igt_assert_f(control_fd >= 0, "couldn't open the guc log control file\n");
+
+   val = enable_logging ? ((verbosity_level << 4) | 0x1) : 0;
+
+   ret = snprintf(data, sizeof(data), "0x%" PRIx64, val);
+   igt_assert(ret > 2 && ret < sizeof(data));
+
+   ret = write(control_fd, data, ret);
+   igt_assert_f(ret > 0, "couldn't write to the log control file\n");
+
+   close(control_fd);
+}
+
+static void int_sig_handler(int sig)
+{
+   igt_info("received s

[Intel-gfx] [PATCH v10] drm/i915: Allocate intel_engine_cs structure only for the enabled engines

2016-10-07 Thread akash . goel
From: Akash Goel 

With the possibility of many more rings being added in future, the
drm_i915_private structure could bloat, as an array of type
intel_engine_cs is embedded inside it.
struct intel_engine_cs engine[I915_NUM_ENGINES];
This is still fine, as generally only a single instance of the
drm_i915_private structure is used, but not all of the possible rings would
be enabled or active on most of the platforms. Some memory can be saved by
allocating intel_engine_cs structure only for the enabled/active engines.
Currently the engine/ring ID is kept static and dev_priv->engine[] is simply
indexed using the enums defined in intel_engine_id.
To save memory and continue using the static engine/ring IDs, 'engine' is
defined as an array of pointers.
struct intel_engine_cs *engine[I915_NUM_ENGINES];
dev_priv->engine[engine_ID] will be NULL for disabled engine instances.
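
The idea can be sketched in plain user-space C as below. This is only an
illustration under stated assumptions: all names (gpu_dev, engines_init,
for_each_engine_sketch, ...) are hypothetical stand-ins and not the i915
definitions, and calloc()/free() stand in for the kernel allocators.

```c
#include <stddef.h>
#include <stdlib.h>

enum engine_id { RCS = 0, BCS, VCS, VCS2, VECS, NUM_ENGINES };

struct engine {
	enum engine_id id;
};

struct gpu_dev {
	struct engine *engine[NUM_ENGINES];	/* NULL for disabled engines */
};

/* Visit only the engines that were allocated, keeping the static IDs */
#define for_each_engine_sketch(e, dev, id) \
	for ((id) = 0; (id) < NUM_ENGINES; (id)++) \
		if (((e) = (dev)->engine[(id)]) == NULL) {} else

static void engines_fini(struct gpu_dev *dev)
{
	enum engine_id id;

	for (id = 0; id < NUM_ENGINES; id++) {
		free(dev->engine[id]);	/* free(NULL) is a no-op */
		dev->engine[id] = NULL;
	}
}

/* Allocate a structure only for the engines whose bit is set in mask */
static int engines_init(struct gpu_dev *dev, unsigned int mask)
{
	enum engine_id id;

	for (id = 0; id < NUM_ENGINES; id++)
		dev->engine[id] = NULL;

	for (id = 0; id < NUM_ENGINES; id++) {
		if (!(mask & (1u << id)))
			continue;

		dev->engine[id] = calloc(1, sizeof(*dev->engine[id]));
		if (!dev->engine[id]) {
			engines_fini(dev);
			return -1;
		}
		dev->engine[id]->id = id;
	}

	return 0;
}

static unsigned int count_engines(struct gpu_dev *dev)
{
	struct engine *e;
	enum engine_id id;
	unsigned int n = 0;

	for_each_engine_sketch(e, dev, id)
		n += (e->id == id);

	return n;
}
```

Since the array index stays equal to the engine ID, callers can keep using
dev->engine[VCS] style lookups, as long as they check for NULL first.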

There is a text size reduction of 928 bytes, from 1028200 to 1027272, for the
i915.o file (but for the i915.ko file the text size remains the same, at
1193131 bytes).

v2:
- Remove the engine iterator field added in drm_i915_private structure,
  instead pass a local iterator variable to the for_each_engine**
  macros. (Chris)
- Do away with intel_engine_initialized() and instead directly use the
  NULL pointer check on engine pointer. (Chris)

v3:
- Remove for_each_engine_id() macro, as the updated macro for_each_engine()
  can be used in place of it. (Chris)
- Protect the access to Render engine Fault register with a NULL check, as
  engine specific init is done later in Driver load sequence.

v4:
- Use !!dev_priv->engine[VCS] style for the engine check in getparam. (Chris)
- Kill the superfluous init_engine_lists().

v5:
- Cleanup the intel_engines_init() & intel_engines_setup(), with respect to
  allocation of intel_engine_cs structure. (Chris)

v6:
- Rebase.

v7:
- Optimize the for_each_engine_masked() macro. (Chris)
- Change the type of 'iter' local variable to enum intel_engine_id. (Chris)
- Rebase.

v8: Rebase.

v9: Rebase.

v10:
- For index calculation use engine ID instead of pointer based arithmetic in
  intel_engine_sync_index() as engine pointers are not contiguous now (Chris)
- For appropriateness, rename local enum variable 'iter' to 'id'. (Joonas)
- Use for_each_engine macro for cleanup in intel_engines_init() and remove
  check for NULL engine pointer in cleanup() routines. (Joonas)

Cc: Chris Wilson 
Signed-off-by: Akash Goel 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_cmd_parser.c |  3 +-
 drivers/gpu/drm/i915/i915_debugfs.c| 57 ++
 drivers/gpu/drm/i915/i915_drv.c| 11 +++---
 drivers/gpu/drm/i915/i915_drv.h| 22 
 drivers/gpu/drm/i915/i915_gem.c| 29 +++
 drivers/gpu/drm/i915/i915_gem_context.c| 15 
 drivers/gpu/drm/i915/i915_gem_evict.c  |  3 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  6 ++--
 drivers/gpu/drm/i915/i915_gem_gtt.c| 14 +---
 drivers/gpu/drm/i915/i915_gem_request.c|  5 +--
 drivers/gpu/drm/i915/i915_gpu_error.c  | 30 
 drivers/gpu/drm/i915/i915_guc_submission.c | 12 ---
 drivers/gpu/drm/i915/i915_irq.c| 44 ---
 drivers/gpu/drm/i915/intel_breadcrumbs.c   |  6 ++--
 drivers/gpu/drm/i915/intel_display.c   |  8 ++---
 drivers/gpu/drm/i915/intel_engine_cs.c | 26 ++
 drivers/gpu/drm/i915/intel_guc_loader.c|  6 ++--
 drivers/gpu/drm/i915/intel_lrc.c   | 11 +++---
 drivers/gpu/drm/i915/intel_overlay.c   |  2 +-
 drivers/gpu/drm/i915/intel_pm.c| 17 +
 drivers/gpu/drm/i915/intel_ringbuffer.c| 15 
 drivers/gpu/drm/i915/intel_ringbuffer.h|  8 +
 22 files changed, 193 insertions(+), 157 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
index 70980f8..f191d7b 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -1308,10 +1308,11 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine,
 int i915_cmd_parser_get_version(struct drm_i915_private *dev_priv)
 {
struct intel_engine_cs *engine;
+   enum intel_engine_id id;
bool active = false;
 
/* If the command parser is not enabled, report 0 - unsupported */
-   for_each_engine(engine, dev_priv) {
+   for_each_engine(engine, dev_priv, id) {
if (intel_engine_needs_cmd_parser(engine)) {
active = true;
break;
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 20689f1..ba4f832 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -150,7 +150,7 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
   obj->bas

Re: [Intel-gfx] [PATCH v9] drm/i915: Allocate intel_engine_cs structure only for the enabled engines

2016-10-07 Thread akash goel
On Fri, Oct 7, 2016 at 4:19 PM, Joonas Lahtinen
 wrote:
> On pe, 2016-10-07 at 15:03 +0530, akash.g...@intel.com wrote:
>> > From: Akash Goel 
>>
>> With the possibility of addition of many more number of rings in future,
>> the drm_i915_private structure could bloat as an array, of type
>> intel_engine_cs, is embedded inside it.
>>   struct intel_engine_cs engine[I915_NUM_ENGINES];
>> Though this is still fine as generally there is only a single instance of
>> drm_i915_private structure used, but not all of the possible rings would be
>> enabled or active on most of the platforms. Some memory can be saved by
>> allocating intel_engine_cs structure only for the enabled/active engines.
>> Currently the engine/ring ID is kept static and dev_priv->engine[] is simply
>> indexed using the enums defined in intel_engine_id.
>> To save memory and continue using the static engine/ring IDs, 'engine' is
>> defined as an array of pointers.
>>   struct intel_engine_cs *engine[I915_NUM_ENGINES];
>> dev_priv->engine[engine_ID] will be NULL for disabled engine instances.
>>
>> v2:
>> - Remove the engine iterator field added in drm_i915_private structure,
>>   instead pass a local iterator variable to the for_each_engine**
>>   macros. (Chris)
>> - Do away with intel_engine_initialized() and instead directly use the
>>   NULL pointer check on engine pointer. (Chris)
>>
>> v3:
>> - Remove for_each_engine_id() macro, as the updated macro for_each_engine()
>>   can be used in place of it. (Chris)
>> - Protect the access to Render engine Fault register with a NULL check, as
>>   engine specific init is done later in Driver load sequence.
>>
>> v4:
>> - Use !!dev_priv->engine[VCS] style for the engine check in getparam. (Chris)
>> - Kill the superfluous init_engine_lists().
>>
>> v5:
>> - Cleanup the intel_engines_init() & intel_engines_setup(), with respect to
>>   allocation of intel_engine_cs structure. (Chris)
>>
>> v6:
>> - Rebase.
>>
>> v7:
>> - Optimize the for_each_engine_masked() macro. (Chris)
>> - Change the type of 'iter' local variable to enum intel_engine_id. (Chris)
>> - Rebase.
>>
>
> Would not it be consistent to go with 'id' everywhere rather than
> 'iter'. Consistency is good, and my vote for 'id' as it's more
> descriptive?

Fine, will go with 'id' everywhere then.

>
>> @@ -153,9 +163,9 @@ int intel_engines_init(struct drm_device *dev)
>>  cleanup:
>>   for (i = 0; i < I915_NUM_ENGINES; i++) {
>
> Use for_each_engine here too.
>

Fine. After using 'for_each_engine' here, the NULL pointer check on
'engine' below can be removed safely.
As you suggested, I will not keep the BUG_ON either.

>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
>> index 936f6f6..08303e3 100644
>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>> @@ -1645,7 +1645,7 @@ void intel_logical_ring_cleanup(struct intel_engine_cs *engine)
>>  {
>>   struct drm_i915_private *dev_priv;
>>
>> - if (!intel_engine_initialized(engine))
>> + if (!engine)
>>   return;
>
> Remove this check or make it GEM_BUG_ON(!engine); but I don't think we
> need that much paranoia.
>
>> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
>> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
>
> 
>
>> @@ -2091,7 +2092,7 @@ void intel_engine_cleanup(struct intel_engine_cs *engine)
>>  {
>>   struct drm_i915_private *dev_priv;
>>
>> - if (!intel_engine_initialized(engine))
>> + if (!engine)
>>   return;
>
> Same as above.
>
> With those points fixed;
>
Chris found an issue inside intel_engine_sync_index(), will add his
suggested fix also in the next version.


> Reviewed-by: Joonas Lahtinen 

Thanks much for the review.

Best regards
Akash

>
> Regards, Joonas
> --
> Joonas Lahtinen
> Open Source Technology Center
> Intel Corporation
> ___
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH v9] drm/i915: Allocate intel_engine_cs structure only for the enabled engines

2016-10-07 Thread akash . goel
From: Akash Goel 

With many more rings likely to be added in future, the drm_i915_private
structure could bloat, because an array of type intel_engine_cs is
embedded inside it:
  struct intel_engine_cs engine[I915_NUM_ENGINES];
This is tolerable, as there is generally only a single instance of the
drm_i915_private structure, but not all of the possible rings are enabled
or active on most platforms. Some memory can be saved by allocating the
intel_engine_cs structure only for the enabled/active engines.
Currently the engine/ring ID is kept static and dev_priv->engine[] is simply
indexed using the enums defined in intel_engine_id.
To save memory and continue using the static engine/ring IDs, 'engine' is
now defined as an array of pointers:
  struct intel_engine_cs *engine[I915_NUM_ENGINES];
dev_priv->engine[engine_ID] will be NULL for disabled engine instances.
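
The pointer-array scheme described above can be sketched in plain C. This is
only an illustration under stated assumptions: the type names, the engine ID
list, and the macro name for_each_present_engine are hypothetical stand-ins,
not the driver's actual definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's types; names are illustrative. */
enum engine_id { RCS = 0, BCS, VCS, VCS2, VECS, NUM_ENGINES };

struct engine {
	enum engine_id id;
	const char *name;
};

struct device {
	/* Array of pointers: a slot stays NULL when that engine is absent. */
	struct engine *engine[NUM_ENGINES];
};

/*
 * Walk every slot by its static ID and skip the NULL ones, so the loop
 * body only ever sees engines that were actually allocated.
 */
#define for_each_present_engine(engine__, dev__, id__) \
	for ((id__) = 0; (id__) < NUM_ENGINES; (id__)++) \
		if (!((engine__) = (dev__)->engine[(id__)])) {} else

/* Count the allocated engines using the iterator above. */
static unsigned int count_engines(struct device *dev)
{
	struct engine *engine;
	enum engine_id id;
	unsigned int count = 0;

	assert(dev != NULL);
	for_each_present_engine(engine, dev, id)
		count++;
	return count;
}
```

Because the iterator itself filters out NULL slots, callers do not need a
separate "is this engine initialized" check inside the loop body.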

v2:
- Remove the engine iterator field added in drm_i915_private structure,
  instead pass a local iterator variable to the for_each_engine**
  macros. (Chris)
- Do away with intel_engine_initialized() and instead directly use the
  NULL pointer check on engine pointer. (Chris)

v3:
- Remove for_each_engine_id() macro, as the updated macro for_each_engine()
  can be used in place of it. (Chris)
- Protect the access to Render engine Fault register with a NULL check, as
  engine specific init is done later in Driver load sequence.

v4:
- Use !!dev_priv->engine[VCS] style for the engine check in getparam. (Chris)
- Kill the superfluous init_engine_lists().

v5:
- Cleanup the intel_engines_init() & intel_engines_setup(), with respect to
  allocation of intel_engine_cs structure. (Chris)

v6:
- Rebase.

v7:
- Optimize the for_each_engine_masked() macro. (Chris)
- Change the type of 'iter' local variable to enum intel_engine_id. (Chris)
- Rebase.

v8: Rebase.

v9: Rebase.

Cc: Chris Wilson 
Signed-off-by: Akash Goel 
---
 drivers/gpu/drm/i915/i915_cmd_parser.c |  3 +-
 drivers/gpu/drm/i915/i915_debugfs.c| 57 ++
 drivers/gpu/drm/i915/i915_drv.c| 11 +++---
 drivers/gpu/drm/i915/i915_drv.h| 22 
 drivers/gpu/drm/i915/i915_gem.c| 29 +++
 drivers/gpu/drm/i915/i915_gem_context.c| 15 
 drivers/gpu/drm/i915/i915_gem_evict.c  |  3 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  6 ++--
 drivers/gpu/drm/i915/i915_gem_gtt.c| 14 +---
 drivers/gpu/drm/i915/i915_gem_request.c|  5 +--
 drivers/gpu/drm/i915/i915_gpu_error.c  | 30 
 drivers/gpu/drm/i915/i915_guc_submission.c | 12 ---
 drivers/gpu/drm/i915/i915_irq.c| 44 ---
 drivers/gpu/drm/i915/intel_breadcrumbs.c   |  6 ++--
 drivers/gpu/drm/i915/intel_display.c   |  8 ++---
 drivers/gpu/drm/i915/intel_engine_cs.c | 22 
 drivers/gpu/drm/i915/intel_guc_loader.c|  6 ++--
 drivers/gpu/drm/i915/intel_lrc.c   | 10 --
 drivers/gpu/drm/i915/intel_overlay.c   |  2 +-
 drivers/gpu/drm/i915/intel_pm.c| 17 +
 drivers/gpu/drm/i915/intel_ringbuffer.c| 14 +---
 drivers/gpu/drm/i915/intel_ringbuffer.h|  6 
 22 files changed, 191 insertions(+), 151 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
index 70980f8..bbcd2ec 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -1308,10 +1308,11 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine,
 int i915_cmd_parser_get_version(struct drm_i915_private *dev_priv)
 {
struct intel_engine_cs *engine;
+   enum intel_engine_id iter;
bool active = false;
 
/* If the command parser is not enabled, report 0 - unsupported */
-   for_each_engine(engine, dev_priv) {
+   for_each_engine(engine, dev_priv, iter) {
if (intel_engine_needs_cmd_parser(engine)) {
active = true;
break;
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 20689f1..f9fd949 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -150,7 +150,7 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
   obj->base.size / 1024,
   obj->base.read_domains,
   obj->base.write_domain);
-   for_each_engine_id(engine, dev_priv, id)
+   for_each_engine(engine, dev_priv, id)
seq_printf(m, "%x ",
   i915_gem_active_get_seqno(&obj->last_read[id],
 
&obj->base.dev->struct_mutex));
@@ -332,11 +332,12 @@ static void print_batch_pool_stats(struct seq_file *m,
struct drm_i915_gem_object *obj;
struct file_stats stats;

[Intel-gfx] [PATCH v3] tools/intel_guc_logger: Utility for capturing GuC firmware logs in a file

2016-09-08 Thread akash . goel
From: Akash Goel 

This patch provides a test utility which helps capture GuC firmware logs and
then dump them to file.
The logs are pulled from a debugfs file '/sys/kernel/debug/dri/guc_log' and
by default stored into a file 'guc_log_dump.dat'. The name, including the
location, of the output file can be changed through a command line argument.

The utility goes into an infinite loop where it waits for the arrival of new
logs. As soon as a new set of logs is produced, it captures them in its local
buffer, which is then flushed out to the file on disk.
Whenever logging needs to end, the user can stop the utility (CTRL+C).

Before entering the loop, it first discards whatever logs are present in
the debugfs file.
This way the user can launch the utility first and then start the
workload/activity for which GuC firmware logs are to be captured. The utility
can be kept running for as long as it's needed; for example, once the workload
is over, the utility can be forcefully stopped.

If logging wasn't enabled on the GuC side by the driver at boot time, the
utility will first enable logging, and later on when it is stopped (CTRL+C)
it will also pause logging on the GuC side.
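
The enable/pause step amounts to writing a small hex value to the log control
file. The sketch below mirrors the encoding used by guc_log_control() in this
patch (bit 0 enables logging, verbosity sits from bit 4 upward). The helper
name format_log_control and the 0..3 verbosity range are assumptions made for
illustration.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Build the string written to the log control file: bit 0 enables logging
 * and the verbosity level sits in bits 4 and up, as in the patch's
 * guc_log_control(). Returns the string length, or -1 if it did not fit.
 */
static int format_log_control(char *buf, size_t len, bool enable, int verbosity)
{
	uint64_t val;
	int ret;

	/* Assumed range: the patch treats 3 as the maximum verbosity. */
	assert(buf != NULL && verbosity >= 0 && verbosity <= 3);

	val = enable ? (((uint64_t)verbosity << 4) | 0x1) : 0;
	ret = snprintf(buf, len, "0x%" PRIx64, val);
	if (ret < 0 || (size_t)ret >= len)
		return -1;
	return ret;
}
```

With the default verbosity of 3, enabling produces the string "0x31" and
pausing produces "0x0".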

v2:
- Use combination of alarm system call & SIGALRM signal to run the utility
  for required duration. (Tvrtko)
- Fix inconsistencies, do minor cleanup and refactoring. (Tvrtko)

v3:
- Fix discrepancy for the output file command line option and update the
  Usage/help string.

Cc: Tvrtko Ursulin 
Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 tools/Makefile.sources   |   1 +
 tools/intel_guc_logger.c | 428 +++
 2 files changed, 429 insertions(+)
 create mode 100644 tools/intel_guc_logger.c

diff --git a/tools/Makefile.sources b/tools/Makefile.sources
index 2bb6c8e..be58871 100644
--- a/tools/Makefile.sources
+++ b/tools/Makefile.sources
@@ -19,6 +19,7 @@ tools_prog_lists =\
intel_gpu_time  \
intel_gpu_top   \
intel_gtt   \
+   intel_guc_logger\
intel_infoframes\
intel_l3_parity \
intel_lid   \
diff --git a/tools/intel_guc_logger.c b/tools/intel_guc_logger.c
new file mode 100644
index 000..c7addd5
--- /dev/null
+++ b/tools/intel_guc_logger.c
@@ -0,0 +1,428 @@
+
+#define _GNU_SOURCE  /* For using O_DIRECT */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "igt.h"
+
+#define MB(x) ((uint64_t)(x) * 1024 * 1024)
+#ifndef PAGE_SIZE
+  #define PAGE_SIZE 4096
+#endif
+/* Currently the size of GuC log buffer is 19 pages & so is the size of relay
+ * subbuffer. If the size changes in future, then this define also needs to be
+ * updated accordingly.
+ */
+#define SUBBUF_SIZE (19*PAGE_SIZE)
+/* Need large buffering from logger side to hide the DISK IO latency, Driver
+ * can only store 8 snapshots of GuC log buffer in relay.
+ */
+#define NUM_SUBBUFS 100
+
+#define RELAY_FILE_NAME  "guc_log"
+#define DEFAULT_OUTPUT_FILE_NAME  "guc_log_dump.dat"
+#define CONTROL_FILE_NAME "i915_guc_log_control"
+
+char *read_buffer;
+char *out_filename;
+int poll_timeout = 2; /* by default 2ms timeout */
+pthread_mutex_t mutex;
+pthread_t flush_thread;
+int verbosity_level = 3; /* by default capture logs at max verbosity */
+uint32_t produced, consumed;
+uint64_t total_bytes_written;
+int num_buffers = NUM_SUBBUFS;
+int relay_fd, outfile_fd = -1;
+bool stop_logging, discard_oldlogs;
+uint32_t test_duration, max_filesize;
+pthread_cond_t underflow_cond, overflow_cond;
+
+static void guc_log_control(bool enable_logging)
+{
+   int control_fd;
+   char data[19];
+   uint64_t val;
+   int ret;
+
+   control_fd = igt_debugfs_open(CONTROL_FILE_NAME, O_WRONLY);
+   igt_assert_f(control_fd >= 0, "couldn't open the guc log control file");
+
+   val = enable_logging ? ((verbosity_level << 4) | 0x1) : 0;
+
+   ret = snprintf(data, sizeof(data), "0x%" PRIx64, val);
+   igt_assert(ret > 2 && ret < sizeof(data));
+
+   ret = write(control_fd, data, ret);
+   igt_assert_f(ret > 0, "couldn't write to the log control file");
+
+   close(control_fd);
+}
+
+static void int_sig_handler(int sig)
+{
+   igt_info("received signal %d\n", sig);
+
+   stop_logging = true;
+}
+
+static void pull_leftover_data(void)
+{
+   unsigned int bytes_read = 0;
+   int ret;
+
+   do {
+   /* Read the logs from relay buffer */
+   ret = read(relay_fd, read_buffer, SUBBUF_SIZE);
+   if (!ret)
+   break;
+
+   igt_assert_f(ret > 0, "failed to read from the guc log file&quo

[Intel-gfx] [PATCH 17/18] drm/i915: Early creation of relay channel for capturing boot time logs

2016-09-08 Thread akash . goel
From: Akash Goel 

In the current i915 driver load sequence, debugfs registration is done
at the end, so the relay channel debugfs file is also created only then,
but the GuC firmware is loaded much earlier in the sequence.
As a result the driver could miss capturing the boot-time logs of the GuC
firmware if there are flush interrupts from the GuC side.
Relay has a provision for early logging: initially only the relay channel
is created, to provide buffers for storing logs, and later the channel is
associated with a debugfs file at an appropriate time.
This patch uses that provision, which allows the driver to capture
boot-time logs as well; they can be collected once userspace comes up.

v2:
- Remove the couple of FIXMEs, as now the relay channel will be created
  early before enabling the flush interrupts, so no possibility of relay
  channel pointer being modified & read at the same time from 2 different
  execution contexts.
- Rebase.

v3:
- Add a comment to justify setting 'is_global' before the NULL check on the
  parent directory dentry pointer.

Suggested-by: Chris Wilson 
Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 75 --
 1 file changed, 51 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 9ca9e0d..b9fede0 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -894,15 +894,16 @@ static struct dentry *create_buf_file_callback(const char *filename,
 {
struct dentry *buf_file;
 
-   if (!parent)
-   return NULL;
-
/* This to enable the use of a single buffer for the relay channel and
 * correspondingly have a single file exposed to User, through which
 * it can collect the logs in order without any post-processing.
+* Need to set 'is_global' even if parent is NULL for early logging.
 */
*is_global = 1;
 
+   if (!parent)
+   return NULL;
+
/* Not using the channel filename passed as an argument, since for each
 * channel relay appends the corresponding CPU number to the filename
 * passed in relay_open(). This should be fine as relay just needs a
@@ -935,13 +936,40 @@ static void guc_remove_log_relay_file(struct intel_guc *guc)
relay_close(guc->log.relay_chan);
 }
 
-static int guc_create_log_relay_file(struct intel_guc *guc)
+static int guc_create_relay_channel(struct intel_guc *guc)
 {
struct drm_i915_private *dev_priv = guc_to_i915(guc);
struct rchan *guc_log_relay_chan;
-   struct dentry *log_dir;
size_t n_subbufs, subbuf_size;
 
+   /* Keep the size of sub buffers same as shared log buffer */
+   subbuf_size = guc->log.vma->obj->base.size;
+
+   /* Store up to 8 snapshots, which is large enough to buffer sufficient
+* boot time logs and provides enough leeway to User, in terms of
+* latency, for consuming the logs from relay. Also doesn't take
+* up too much memory.
+*/
+   n_subbufs = 8;
+
+   guc_log_relay_chan = relay_open(NULL, NULL, subbuf_size,
+   n_subbufs, &relay_callbacks, dev_priv);
+   if (!guc_log_relay_chan) {
+   DRM_ERROR("Couldn't create relay chan for GuC logging\n");
+   return -ENOMEM;
+   }
+
+   GEM_BUG_ON(guc_log_relay_chan->subbuf_size < subbuf_size);
+   guc->log.relay_chan = guc_log_relay_chan;
+   return 0;
+}
+
+static int guc_create_log_relay_file(struct intel_guc *guc)
+{
+   struct drm_i915_private *dev_priv = guc_to_i915(guc);
+   struct dentry *log_dir;
+   int ret;
+
/* For now create the log file in /sys/kernel/debug/dri/0 dir */
log_dir = dev_priv->drm.primary->debugfs_root;
 
@@ -961,26 +989,12 @@ static int guc_create_log_relay_file(struct intel_guc *guc)
return -ENODEV;
}
 
-   /* Keep the size of sub buffers same as shared log buffer */
-   subbuf_size = guc->log.vma->obj->base.size;
-
-   /* Store up to 8 snapshots, which is large enough to buffer sufficient
-* boot time logs and provides enough leeway to User, in terms of
-* latency, for consuming the logs from relay. Also doesn't take
-* up too much memory.
-*/
-   n_subbufs = 8;
-
-   guc_log_relay_chan = relay_open("guc_log", log_dir, subbuf_size,
-   n_subbufs, &relay_callbacks, dev_priv);
-   if (!guc_log_relay_chan) {
-   DRM_ERROR("Couldn't create relay chan for GuC logging\n");
-   return -ENOMEM;
+   ret = relay_late_setup_files(guc->log.relay_chan, "guc_log", log_dir);
+   if (ret) {
+  

[Intel-gfx] [PATCH 07/18] relay: Use per CPU constructs for the relay channel buffer pointers

2016-09-08 Thread akash . goel
From: Akash Goel 

relay essentially needs to maintain a per-CPU array of channel buffer
pointers, but it creates that array manually.
Instead it is better to use the per-CPU constructs provided by the
kernel to allocate and access the array of pointers to channel buffers.

This patch is queued for merge in linux-next via akpm.

v2: Include  in relay.h so that it pulls in the percpu
api explicitly. (Chris)

Link: http://lkml.kernel.org/r/1470909140-25919-1-git-send-email-akash.g...@intel.com
Cc: Chris Wilson 
Cc: Tvrtko Ursulin 
Signed-off-by: Akash Goel 
Reviewed-by: Chris Wilson 
---
 include/linux/relay.h | 17 +++-
 kernel/relay.c| 74 +--
 2 files changed, 52 insertions(+), 39 deletions(-)

diff --git a/include/linux/relay.h b/include/linux/relay.h
index d7c8359..eb295e3 100644
--- a/include/linux/relay.h
+++ b/include/linux/relay.h
@@ -19,6 +19,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /*
  * Tracks changes to rchan/rchan_buf structs
@@ -63,7 +64,7 @@ struct rchan
struct kref kref;   /* channel refcount */
void *private_data; /* for user-defined data */
size_t last_toobig; /* tried to log event > subbuf size */
-   struct rchan_buf *buf[NR_CPUS]; /* per-cpu channel buffers */
+   struct rchan_buf ** __percpu buf; /* per-cpu channel buffers */
int is_global;  /* One global buffer ? */
struct list_head list;  /* for channel list */
struct dentry *parent;  /* parent dentry passed to open */
@@ -204,7 +205,7 @@ static inline void relay_write(struct rchan *chan,
struct rchan_buf *buf;
 
local_irq_save(flags);
-   buf = chan->buf[smp_processor_id()];
+   buf = *this_cpu_ptr(chan->buf);
if (unlikely(buf->offset + length > chan->subbuf_size))
length = relay_switch_subbuf(buf, length);
memcpy(buf->data + buf->offset, data, length);
@@ -230,12 +231,12 @@ static inline void __relay_write(struct rchan *chan,
 {
struct rchan_buf *buf;
 
-   buf = chan->buf[get_cpu()];
+   buf = *get_cpu_ptr(chan->buf);
if (unlikely(buf->offset + length > buf->chan->subbuf_size))
length = relay_switch_subbuf(buf, length);
memcpy(buf->data + buf->offset, data, length);
buf->offset += length;
-   put_cpu();
+   put_cpu_ptr(chan->buf);
 }
 
 /**
@@ -251,17 +252,19 @@ static inline void __relay_write(struct rchan *chan,
  */
 static inline void *relay_reserve(struct rchan *chan, size_t length)
 {
-   void *reserved;
-   struct rchan_buf *buf = chan->buf[smp_processor_id()];
+   void *reserved = NULL;
+   struct rchan_buf *buf = *get_cpu_ptr(chan->buf);
 
if (unlikely(buf->offset + length > buf->chan->subbuf_size)) {
length = relay_switch_subbuf(buf, length);
if (!length)
-   return NULL;
+   goto end;
}
reserved = buf->data + buf->offset;
buf->offset += length;
 
+end:
+   put_cpu_ptr(chan->buf);
return reserved;
 }
 
diff --git a/kernel/relay.c b/kernel/relay.c
index d797502..f55ab82 100644
--- a/kernel/relay.c
+++ b/kernel/relay.c
@@ -214,7 +214,7 @@ static void relay_destroy_buf(struct rchan_buf *buf)
__free_page(buf->page_array[i]);
relay_free_page_array(buf->page_array);
}
-   chan->buf[buf->cpu] = NULL;
+   *per_cpu_ptr(chan->buf, buf->cpu) = NULL;
kfree(buf->padding);
kfree(buf);
kref_put(&chan->kref, relay_destroy_channel);
@@ -382,20 +382,21 @@ static void __relay_reset(struct rchan_buf *buf, unsigned int init)
  */
 void relay_reset(struct rchan *chan)
 {
+   struct rchan_buf *buf;
unsigned int i;
 
if (!chan)
return;
 
-   if (chan->is_global && chan->buf[0]) {
-   __relay_reset(chan->buf[0], 0);
+   if (chan->is_global && (buf = *per_cpu_ptr(chan->buf, 0))) {
+   __relay_reset(buf, 0);
return;
}
 
mutex_lock(&relay_channels_mutex);
for_each_possible_cpu(i)
-   if (chan->buf[i])
-   __relay_reset(chan->buf[i], 0);
+   if ((buf = *per_cpu_ptr(chan->buf, i)))
+   __relay_reset(buf, 0);
mutex_unlock(&relay_channels_mutex);
 }
 EXPORT_SYMBOL_GPL(relay_reset);
@@ -440,7 +441,7 @@ static struct rchan_buf *relay_open_buf(struct rchan *chan, unsigned int cpu)
struct dentry *dentry;
 
if (chan->is_global)
-   return chan->buf[0];
+   return *per_cpu_ptr(chan->buf, 0);
 
buf = relay_create_buf(chan);
   

[Intel-gfx] [PATCH 12/18] drm/i915: Increase GuC log buffer size to reduce flush interrupts

2016-09-08 Thread akash . goel
From: Akash Goel 

In cases where GuC generates logs at a very high rate, the rate of flush
interrupts is correspondingly very high.
So far a total of 8 pages was allocated for storing both ISR & DPC logs.
Under the half-full draining protocol followed by GuC, doubling the
number of pages cuts the frequency of flush interrupts almost in half,
which in turn reduces the logging overhead.
So now 8 pages apiece are allocated for ISR & DPC logs.
Apart from reducing the flush interrupt count, this also helps reduce the
output log file size. With the original settings, 44 KB was needed for one
snapshot. With the modified settings, 76 KB is needed for a snapshot, which
is equivalent to 2 snapshots of the original setting. So 12 KB is saved for
every 88 KB, compared with the original setting.
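
The snapshot sizes quoted above can be reproduced with a little arithmetic.
The sketch below assumes 4 KB pages and assumes each *_PAGES define encodes
"pages minus one" (an inference from this patch, where a define of 7 is
described as 8 pages); the helper name log_buffer_kb is hypothetical.

```c
#include <assert.h>

#define PAGE_KB 4u /* assumes 4 KB pages */

/*
 * Total log buffer size in KB. Each *_pages argument is the raw define
 * value, taken here to mean "pages minus one" (an inference from the
 * patch, not a documented rule). One extra page holds the state headers.
 */
static unsigned int log_buffer_kb(unsigned int isr_pages,
				  unsigned int dpc_pages,
				  unsigned int crash_pages)
{
	unsigned int pages;

	/* Generous bound that keeps the sum far away from overflow. */
	assert(isr_pages < 256 && dpc_pages < 256 && crash_pages < 256);

	pages = 1 + (isr_pages + 1) + (dpc_pages + 1) + (crash_pages + 1);
	return pages * PAGE_KB;
}
```

With the old defines, log_buffer_kb(3, 3, 1) covers 11 pages, i.e. 44 KB. With
the new ones, log_buffer_kb(7, 7, 1) covers 19 pages, i.e. 76 KB. Two old
snapshots take 88 KB, hence the 12 KB saving.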

Suggested-by: Tvrtko Ursulin 
Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/intel_guc_fwif.h | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h b/drivers/gpu/drm/i915/intel_guc_fwif.h
index adb1ffd..324ea90 100644
--- a/drivers/gpu/drm/i915/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
@@ -104,9 +104,9 @@
 #define   GUC_LOG_ALLOC_IN_MEGABYTE(1 << 3)
 #define   GUC_LOG_CRASH_PAGES  1
 #define   GUC_LOG_CRASH_SHIFT  4
-#define   GUC_LOG_DPC_PAGES3
+#define   GUC_LOG_DPC_PAGES7
 #define   GUC_LOG_DPC_SHIFT6
-#define   GUC_LOG_ISR_PAGES3
+#define   GUC_LOG_ISR_PAGES7
 #define   GUC_LOG_ISR_SHIFT9
 #define   GUC_LOG_BUF_ADDR_SHIFT   12
 
@@ -437,9 +437,9 @@ enum guc_log_buffer_type {
  *|   Crash dump state header |
  * Page1  +---+
  *|   ISR logs|
- * Page5  +---+
- *|   DPC logs|
  * Page9  +---+
+ *|   DPC logs|
+ * Page17 +---+
  *| Crash Dump logs   |
  *+---+
  *
-- 
1.9.2



[Intel-gfx] [PATCH 10/18] drm/i915: Add stats for GuC log buffer flush interrupts

2016-09-08 Thread akash . goel
From: Akash Goel 

GuC firmware sends an interrupt to flush the log buffer when it
becomes half full. GuC firmware also tracks how many times the
buffer overflowed.
It would be useful to maintain statistics of how many flush
interrupts were received and for which type of log buffer,
along with the overflow count of each buffer type.
The i915_log_info debugfs output is augmented to report these statistics.
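
The overflow accounting has to cope with the firmware's overflow counter being
only 4 bits wide (as noted in the diff below). A minimal sketch of that
bookkeeping follows; the struct and function names are hypothetical, and it
uses a modulo-16 difference, which gives the same total as the patch's
"add 16 when the counter went backwards" approach.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-buffer bookkeeping; field names are illustrative. */
struct overflow_stats {
	unsigned int prev_count;  /* last 4-bit counter value seen */
	unsigned int total_count; /* running total of overflows */
};

/*
 * Fold a newly sampled 4-bit overflow counter into the running total.
 * Returns 1 if at least one overflow happened since the last sample.
 */
static int update_overflow(struct overflow_stats *s, unsigned int full_cnt)
{
	unsigned int delta;

	assert(s != NULL && full_cnt < 16);
	if (full_cnt == s->prev_count)
		return 0;

	/* Modulo-16 difference copes with the counter wrapping past 15. */
	delta = (full_cnt - s->prev_count) & 0xf;
	s->prev_count = full_cnt;
	s->total_count += delta;
	return 1;
}
```

One limitation of any such scheme: if the counter advances by an exact
multiple of 16 between two samples, the overflows go unnoticed.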

v2:
- Update the logic to detect multiple overflows between the 2
  flush interrupts and also log a message for overflow (Tvrtko)
- Track the number of times there was no free sub buffer to capture
  the GuC log buffer. (Tvrtko)

v3:
- Fix the printf field width for overflow counter, set it to 10 as per the
  max value of u32, which takes 10 digits in decimal form. (Tvrtko)

v4:
- Move the log buffer overflow handling to a new function for better
  readability. (Tvrtko)

Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_debugfs.c| 28 +++
 drivers/gpu/drm/i915/i915_guc_submission.c | 31 +-
 drivers/gpu/drm/i915/i915_irq.c|  2 ++
 drivers/gpu/drm/i915/intel_guc.h   |  7 +++
 4 files changed, 67 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index c93d423..0ee8411 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2433,6 +2433,32 @@ static int i915_guc_load_status_info(struct seq_file *m, void *data)
return 0;
 }
 
+static void i915_guc_log_info(struct seq_file *m,
+ struct drm_i915_private *dev_priv)
+{
+   struct intel_guc *guc = &dev_priv->guc;
+
+   seq_puts(m, "\nGuC logging stats:\n");
+
+   seq_printf(m, "\tISR:   flush count %10u, overflow count %10u\n",
+  guc->log.flush_count[GUC_ISR_LOG_BUFFER],
+  guc->log.total_overflow_count[GUC_ISR_LOG_BUFFER]);
+
+   seq_printf(m, "\tDPC:   flush count %10u, overflow count %10u\n",
+  guc->log.flush_count[GUC_DPC_LOG_BUFFER],
+  guc->log.total_overflow_count[GUC_DPC_LOG_BUFFER]);
+
+   seq_printf(m, "\tCRASH: flush count %10u, overflow count %10u\n",
+  guc->log.flush_count[GUC_CRASH_DUMP_LOG_BUFFER],
+  guc->log.total_overflow_count[GUC_CRASH_DUMP_LOG_BUFFER]);
+
+   seq_printf(m, "\tTotal flush interrupt count: %u\n",
+  guc->log.flush_interrupt_count);
+
+   seq_printf(m, "\tCapture miss count: %u\n",
+  guc->log.capture_miss_count);
+}
+
 static void i915_guc_client_info(struct seq_file *m,
 struct drm_i915_private *dev_priv,
 struct i915_guc_client *client)
@@ -2506,6 +2532,8 @@ static int i915_guc_info(struct seq_file *m, void *data)
seq_printf(m, "\nGuC execbuf client @ %p:\n", guc.execbuf_client);
i915_guc_client_info(m, dev_priv, &client);
 
+   i915_guc_log_info(m, dev_priv);
+
/* Add more as required ... */
 
return 0;
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 6eaf4af..dbf7c84 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -995,6 +995,29 @@ static void *guc_get_write_buffer(struct intel_guc *guc)
return relay_reserve(guc->log.relay_chan, 0);
 }
 
+static bool
+guc_check_log_buf_overflow(struct intel_guc *guc,
+  enum guc_log_buffer_type type, unsigned int full_cnt)
+{
+   unsigned int prev_full_cnt = guc->log.prev_overflow_count[type];
+   bool overflow = false;
+
+   if (full_cnt != prev_full_cnt) {
+   overflow = true;
+
+   guc->log.prev_overflow_count[type] = full_cnt;
+   guc->log.total_overflow_count[type] += full_cnt - prev_full_cnt;
+
+   if (full_cnt < prev_full_cnt) {
+   /* buffer_full_cnt is a 4 bit counter */
+   guc->log.total_overflow_count[type] += 16;
+   }
+   DRM_ERROR_RATELIMITED("GuC log buffer overflow\n");
+   }
+
+   return overflow;
+}
+
 static unsigned int guc_get_log_buffer_size(enum guc_log_buffer_type type)
 {
switch (type) {
@@ -1015,7 +1038,7 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
 {
struct guc_log_buffer_state *log_buf_state, *log_buf_snapshot_state;
struct guc_log_buffer_state log_buf_state_local;
-   unsigned int buffer_size, write_offset;
+   unsigned int buffer_size, write_offset, full_cnt;
enum guc_log_buffer_type type;
void *src_data, *dst_data;
 
@@ -1041,6 +1064,11 @@ static void guc_re

[Intel-gfx] [PATCH 02/18] drm/i915: Add GuC ukernel logging related fields to fw interface file

2016-09-08 Thread akash . goel
From: Sagar Arun Kamble 

The first page of the GuC log buffer contains state info or meta data
which is required to parse the logs contained in the subsequent pages.
The structure representing the state info is added to interface file
as Driver would need to handle log buffer flush interrupts from GuC.
Added an enum for the different message/event types that can be sent
by the GuC ukernel to Host.
Also added 2 new Host to GuC action types: one to inform GuC when Host has
flushed the log buffer, and one to forcefully cause the GuC to send a new
log buffer flush interrupt.
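
The host side of the flush handshake documented in the diff below (read the
buffer, clear flush_to_file, set read_ptr to sampled_write_ptr, then
acknowledge) can be modelled roughly as follows. This is a simplified sketch,
not the driver's code: the struct is trimmed, the flag mask
LOG_FLUSH_TO_FILE and the function complete_flush are hypothetical, and the
"acknowledge only if a flush was requested" rule is an assumption of the model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LOG_FLUSH_TO_FILE 0x1u /* hypothetical mask for the flush flag */

/* Simplified per-buffer state; only the fields the handshake touches. */
struct log_state {
	uint32_t read_ptr;
	uint32_t write_ptr;
	uint32_t sampled_write_ptr;
	uint32_t flags;
};

/*
 * Host side of the handshake, after the log data has been copied out:
 * advance read_ptr to the sampled write position and clear the flush
 * request. Returns 1 if an acknowledgment should now be sent.
 */
static int complete_flush(struct log_state *st)
{
	assert(st != NULL);
	if (!(st->flags & LOG_FLUSH_TO_FILE))
		return 0; /* no flush was requested for this buffer */

	st->read_ptr = st->sampled_write_ptr;
	st->flags &= ~LOG_FLUSH_TO_FILE;
	return 1;
}
```

Using sampled_write_ptr rather than write_ptr matters because the firmware may
keep writing while the host is still copying the first half.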

v2:
- Make documentation of log buffer state structure more elaborate &
  rename LOGBUFFERFLUSH action to LOG_BUFFER_FLUSH for consistency.(Tvrtko)

v3: Add GuC log buffer layout diagram for more clarity.

Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/intel_guc_fwif.h | 78 +++
 1 file changed, 78 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h b/drivers/gpu/drm/i915/intel_guc_fwif.h
index e40db2d..adb1ffd 100644
--- a/drivers/gpu/drm/i915/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
@@ -419,15 +419,87 @@ struct guc_ads {
u32 reserved2[4];
 } __packed;
 
+/* GuC logging structures */
+
+enum guc_log_buffer_type {
+   GUC_ISR_LOG_BUFFER,
+   GUC_DPC_LOG_BUFFER,
+   GUC_CRASH_DUMP_LOG_BUFFER,
+   GUC_MAX_LOG_BUFFER
+};
+
+/**
+ * DOC: GuC Log buffer Layout
+ *
+ * Page0  +---+
+ *|   ISR state header (32 bytes) |
+ *|  DPC state header |
+ *|   Crash dump state header |
+ * Page1  +---+
+ *|   ISR logs|
+ * Page5  +---+
+ *|   DPC logs|
+ * Page9  +---+
+ *| Crash Dump logs   |
+ *+---+
+ *
+ * Below state structure is used for coordination of retrieval of GuC firmware
+ * logs. Separate state is maintained for each log buffer type.
+ * read_ptr points to the location where i915 read last in log buffer and
+ * is read only for GuC firmware. write_ptr is incremented by GuC with number
+ * of bytes written for each log entry and is read only for i915.
+ * When any type of log buffer becomes half full, GuC sends a flush interrupt.
+ * GuC firmware expects that while it is writing to 2nd half of the buffer,
+ * first half would get consumed by Host and then get a flush completed
+ * acknowledgment from Host, so that it does not end up doing any overwrite
+ * causing loss of logs. So when buffer gets half filled & i915 has requested
+ * for interrupt, GuC will set flush_to_file field, set the sampled_write_ptr
+ * to the value of write_ptr and raise the interrupt.
+ * On receiving the interrupt i915 should read the buffer, clear flush_to_file
+ * field and also update read_ptr with the value of sampled_write_ptr, before
+ * sending an acknowledgment to GuC. marker & version fields are for internal
+ * usage of GuC and opaque to i915. buffer_full_cnt field is incremented every
+ * time GuC detects the log buffer overflow.
+ */
+struct guc_log_buffer_state {
+   u32 marker[2];
+   u32 read_ptr;
+   u32 write_ptr;
+   u32 size;
+   u32 sampled_write_ptr;
+   union {
+   struct {
+   u32 flush_to_file:1;
+   u32 buffer_full_cnt:4;
+   u32 reserved:27;
+   };
+   u32 flags;
+   };
+   u32 version;
+} __packed;
+
+union guc_log_control {
+   struct {
+   u32 logging_enabled:1;
+   u32 reserved1:3;
+   u32 verbosity:4;
+   u32 reserved2:24;
+   };
+   u32 value;
+} __packed;
+
 /* This Action will be programmed in C180 - SOFT_SCRATCH_O_REG */
 enum host2guc_action {
HOST2GUC_ACTION_DEFAULT = 0x0,
HOST2GUC_ACTION_SAMPLE_FORCEWAKE = 0x6,
HOST2GUC_ACTION_ALLOCATE_DOORBELL = 0x10,
HOST2GUC_ACTION_DEALLOCATE_DOORBELL = 0x20,
+   HOST2GUC_ACTION_LOG_BUFFER_FILE_FLUSH_COMPLETE = 0x30,
+   HOST2GUC_ACTION_FORCE_LOG_BUFFER_FLUSH = 0x302,
HOST2GUC_ACTION_ENTER_S_STATE = 0x501,
HOST2GUC_ACTION_EXIT_S_STATE = 0x502,
HOST2GUC_ACTION_SLPC_REQUEST = 0x3003,
+   HOST2GUC_ACTION_UK_LOG_ENABLE_LOGGING = 0x0E000,
HOST2GUC_ACTION_LIMIT
 };
 
@@ -449,4 +521,10 @@ enum guc2host_status {
GUC2HOST_STATUS_GENERIC_FAIL = GUC2HOST_STATUS(0xF000)
 };
 
+/* This action will be programmed in C1BC - SOFT_SCRATCH_15_REG */
+enum guc2host_message {
+   GUC2HOST_MSG_CRASH_DUMP_POSTED = (1 << 1),
+   GUC2HOST_MSG_FLUSH_LOG_BUFFER = (1 << 3)
+};
+
 #endif
-- 
1.9.2

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists

[Intel-gfx] [PATCH 11/18] drm/i915: Optimization to reduce the sampling time of GuC log buffer

2016-09-08 Thread akash . goel
From: Akash Goel 

GuC firmware sends an interrupt to flush the log buffer when it becomes
half full, so the Driver doesn't need to sample the complete buffer; it
can copy only the data newly written by GuC into the local buffer, as
delimited by the read & write pointer values.
Moreover, the flush interrupt generally comes for one type of log buffer,
when that buffer becomes half full, so at that time the other 2 types of
log buffer will hold comparatively little unread data.
If GuC reports an overflow, the Driver does need to copy the entire
buffer, as the whole buffer then contains unread data.
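
For illustration only, the copy strategy described above can be sketched
in plain userspace C. The helper name copy_new_log_data() and its
parameters are hypothetical and are not part of this patch; the sketch
assumes a circular buffer where unread data lies between the read and
write offsets and may wrap around the end.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not in the patch): copy only the unread region of a
 * circular log buffer from src to dst, keeping each byte at the same offset.
 * Returns the number of bytes copied.
 */
static size_t copy_new_log_data(unsigned char *dst, const unsigned char *src,
                                size_t size, size_t read_off, size_t write_off,
                                int overflow)
{
    size_t copied = 0;

    /* On overflow or out-of-range offsets, fall back to the whole buffer. */
    if (overflow || read_off > size || write_off > size) {
        read_off = 0;
        write_off = size;
    }

    if (read_off > write_off) {
        /* Wrapped: copy the head [0, write_off), then the tail [read_off, size). */
        memcpy(dst, src, write_off);
        copied += write_off;
        memcpy(dst + read_off, src + read_off, size - read_off);
        copied += size - read_off;
    } else {
        /* Contiguous: copy [read_off, write_off). */
        memcpy(dst + read_off, src + read_off, write_off - read_off);
        copied += write_off - read_off;
    }

    return copied;
}
```

With an 8 byte buffer, read offset 6 and write offset 2, the sketch copies
bytes 0-1 and 6-7 and leaves bytes 2-5 of the destination untouched.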

v2: Rebase.

v3: Fix the blooper of doing the copy twice. (Tvrtko)

v4: Add curlies for 'else' case also, matching the 'if'. (Tvrtko)

Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 28 +---
 1 file changed, 25 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index dbf7c84..ae52d10 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -1036,11 +1036,12 @@ static unsigned int guc_get_log_buffer_size(enum guc_log_buffer_type type)
 
 static void guc_read_update_log_buffer(struct intel_guc *guc)
 {
+   unsigned int buffer_size, read_offset, write_offset, bytes_to_copy, full_cnt;
struct guc_log_buffer_state *log_buf_state, *log_buf_snapshot_state;
struct guc_log_buffer_state log_buf_state_local;
-   unsigned int buffer_size, write_offset, full_cnt;
enum guc_log_buffer_type type;
void *src_data, *dst_data;
+   bool new_overflow;
 
if (WARN_ON(!guc->log.buf_addr))
return;
@@ -1063,12 +1064,13 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
memcpy(&log_buf_state_local, log_buf_state,
   sizeof(struct guc_log_buffer_state));
buffer_size = guc_get_log_buffer_size(type);
+   read_offset = log_buf_state_local.read_ptr;
write_offset = log_buf_state_local.sampled_write_ptr;
full_cnt = log_buf_state_local.buffer_full_cnt;
 
/* Bookkeeping stuff */
guc->log.flush_count[type] += log_buf_state_local.flush_to_file;
-   guc_check_log_buf_overflow(guc, type, full_cnt);
+   new_overflow = guc_check_log_buf_overflow(guc, type, full_cnt);
 
/* Update the state of shared log buffer */
log_buf_state->read_ptr = write_offset;
@@ -1091,7 +1093,27 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
log_buf_snapshot_state++;
 
/* Now copy the actual logs. */
-   memcpy(dst_data, src_data, buffer_size);
+   if (unlikely(new_overflow)) {
+   /* copy the whole buffer in case of overflow */
+   read_offset = 0;
+   write_offset = buffer_size;
+   } else if (unlikely((read_offset > buffer_size) ||
+   (write_offset > buffer_size))) {
+   DRM_ERROR("invalid log buffer state\n");
+   /* copy whole buffer as offsets are unreliable */
+   read_offset = 0;
+   write_offset = buffer_size;
+   }
+
+   /* Just copy the newly written data */
+   if (read_offset > write_offset) {
+   memcpy(dst_data, src_data, write_offset);
+   bytes_to_copy = buffer_size - read_offset;
+   } else {
+   bytes_to_copy = write_offset - read_offset;
+   }
+   memcpy(dst_data + read_offset,
+  src_data + read_offset, bytes_to_copy);
 
src_data += buffer_size;
dst_data += buffer_size;
-- 
1.9.2



[Intel-gfx] [PATCH 15/18] drm/i915: Debugfs support for GuC logging control

2016-09-08 Thread akash . goel
From: Sagar Arun Kamble 

This patch provides the debugfs interface i915_guc_log_control for
enabling/disabling logging in GuC firmware on the fly and for controlling
the verbosity level of the logs.
The value written to the file should have bit 0 set to enable logging,
and bits 4-7 should contain the verbosity level.
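
As a rough sketch of the value layout described above (bit 0 for enable,
bits 4-7 for verbosity), a userspace tool could compose the value as
follows. The helper guc_log_control_value() is hypothetical and only
illustrates the bit packing; it is not part of this patch.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not in the patch): build the value to write to the
 * debugfs file. Bit 0 enables logging, bits 4-7 carry the verbosity level.
 * Bits 1-3 and 8-31 are left as zero.
 */
static uint32_t guc_log_control_value(int enable, unsigned int verbosity)
{
    uint32_t value = 0;

    if (enable)
        value |= 1u;                    /* bit 0: logging enabled */
    value |= (verbosity & 0xfu) << 4;   /* bits 4-7: verbosity */

    return value;
}
```

For example, enabling logging at verbosity 2 gives 1 | (2 << 4) = 33, which
would then be written to the file as a decimal number.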

v2: Add a forceful flush, to collect left over logs, on disabling logging.
Useful for Validation.

v3: Besides minor cleanup, implement read method for the debugfs file and
set the guc_log_level to -1 when logging is disabled. (Tvrtko)

v4: Minor cleanup & rebase. (Tvrtko)

v5:
- Lock struct_mutex after the NULL check for guc log buffer vma. (Chris)
- Rebase.

Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_debugfs.c| 41 -
 drivers/gpu/drm/i915/i915_guc_submission.c | 59 ++
 drivers/gpu/drm/i915/intel_guc.h   |  1 +
 3 files changed, 100 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 0ee8411..6d7bf5d 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2565,6 +2565,44 @@ static int i915_guc_log_dump(struct seq_file *m, void *data)
return 0;
 }
 
+static int i915_guc_log_control_get(void *data, u64 *val)
+{
+   struct drm_device *dev = data;
+   struct drm_i915_private *dev_priv = to_i915(dev);
+
+   if (!dev_priv->guc.log.vma)
+   return -EINVAL;
+
+   *val = i915.guc_log_level;
+
+   return 0;
+}
+
+static int i915_guc_log_control_set(void *data, u64 val)
+{
+   struct drm_device *dev = data;
+   struct drm_i915_private *dev_priv = to_i915(dev);
+   int ret;
+
+   if (!dev_priv->guc.log.vma)
+   return -EINVAL;
+
+   ret = mutex_lock_interruptible(&dev->struct_mutex);
+   if (ret)
+   return ret;
+
+   intel_runtime_pm_get(dev_priv);
+   ret = i915_guc_log_control(dev_priv, val);
+   intel_runtime_pm_put(dev_priv);
+
+   mutex_unlock(&dev->struct_mutex);
+   return ret;
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(i915_guc_log_control_fops,
+   i915_guc_log_control_get, i915_guc_log_control_set,
+   "%lld\n");
+
 static int i915_edp_psr_status(struct seq_file *m, void *data)
 {
struct drm_i915_private *dev_priv = node_to_i915(m->private);
@@ -5332,7 +5370,8 @@ static const struct i915_debugfs_files {
{"i915_fbc_false_color", &i915_fbc_fc_fops},
{"i915_dp_test_data", &i915_displayport_test_data_fops},
{"i915_dp_test_type", &i915_displayport_test_type_fops},
-   {"i915_dp_test_active", &i915_displayport_test_active_fops}
+   {"i915_dp_test_active", &i915_displayport_test_active_fops},
+   {"i915_guc_log_control", &i915_guc_log_control_fops}
 };
 
 void intel_display_crc_init(struct drm_i915_private *dev_priv)
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index b8f6baf..617ded1 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -193,6 +193,16 @@ static int host2guc_force_logbuffer_flush(struct intel_guc *guc)
return host2guc_action(guc, data, 2);
 }
 
+static int host2guc_logging_control(struct intel_guc *guc, u32 control_val)
+{
+   u32 data[2];
+
+   data[0] = HOST2GUC_ACTION_UK_LOG_ENABLE_LOGGING;
+   data[1] = control_val;
+
+   return host2guc_action(guc, data, 2);
+}
+
 /*
  * Initialise, update, or clear doorbell data shared with the GuC
  *
@@ -1572,3 +1582,52 @@ void i915_guc_register(struct drm_i915_private *dev_priv)
guc_log_late_setup(&dev_priv->guc);
mutex_unlock(&dev_priv->drm.struct_mutex);
 }
+
+int i915_guc_log_control(struct drm_i915_private *dev_priv, u64 control_val)
+{
+   union guc_log_control log_param;
+   int ret;
+
+   log_param.value = control_val;
+
+   if (log_param.verbosity < GUC_LOG_VERBOSITY_MIN ||
+   log_param.verbosity > GUC_LOG_VERBOSITY_MAX)
+   return -EINVAL;
+
+   /* This combination doesn't make sense & won't have any effect */
+   if (!log_param.logging_enabled && (i915.guc_log_level < 0))
+   return 0;
+
+   ret = host2guc_logging_control(&dev_priv->guc, log_param.value);
+   if (ret < 0) {
+   DRM_DEBUG_DRIVER("host2guc action failed %d\n", ret);
+   return ret;
+   }
+
+   i915.guc_log_level = log_param.verbosity;
+
+   /* If log_level was set as -1 at boot time, then the relay channel file
+* wouldn't have been created by now and inte

[Intel-gfx] [PATCH 09/18] drm/i915: New lock to serialize the Host2GuC actions

2016-09-08 Thread akash . goel
From: Akash Goel 

With the addition of new Host2GuC actions related to GuC logging, a lock
is needed to serialize them, as they can execute concurrently with each
other and also with other existing actions.

v2: Use mutex in place of spinlock to serialize, as sleep can happen
while waiting for the action's response from GuC. (Tvrtko)

v3: To conform to the general rules, acquire mutex before taking the
forcewake. (Tvrtko)

Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 3 +++
 drivers/gpu/drm/i915/intel_guc.h   | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index e2863b7..6eaf4af 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -87,6 +87,7 @@ static int host2guc_action(struct intel_guc *guc, u32 *data, u32 len)
if (WARN_ON(len < 1 || len > 15))
return -EINVAL;
 
+   mutex_lock(&guc->action_lock);
intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
 
dev_priv->guc.action_count += 1;
@@ -125,6 +126,7 @@ static int host2guc_action(struct intel_guc *guc, u32 *data, u32 len)
dev_priv->guc.action_status = status;
 
intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
+   mutex_unlock(&guc->action_lock);
 
return ret;
 }
@@ -1343,6 +1345,7 @@ int i915_guc_submission_init(struct drm_i915_private *dev_priv)
 
guc->ctx_pool_vma = vma;
ida_init(&guc->ctx_ids);
+   mutex_init(&guc->action_lock);
guc_create_log(guc);
guc_create_ads(guc);
 
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index 3299cce..1704495 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -155,6 +155,9 @@ struct intel_guc {
 
uint64_t submissions[I915_NUM_ENGINES];
uint32_t last_seqno[I915_NUM_ENGINES];
+
+   /* To serialize the Host2GuC actions */
+   struct mutex action_lock;
 };
 
 /* intel_guc_loader.c */
-- 
1.9.2



[Intel-gfx] [PATCH 08/18] drm/i915: Add a relay backed debugfs interface for capturing GuC logs

2016-09-08 Thread akash . goel
From: Akash Goel 

Add a new debugfs interface '/sys/kernel/debug/dri/guc_log' for the
User to capture GuC firmware logs. The relay framework is used to
implement the interface, so the Driver only has to use a relay API to
store snapshots of the GuC log buffer in the buffer managed by relay.
A snapshot is taken when GuC firmware sends a log buffer flush
interrupt, and up to four snapshots can be stored in the relay buffer.
The relay buffer is operated in a mode where it overwrites data not yet
collected by the User.
Besides the mmap method, through which the User can directly access the
relay buffer contents, relay also supports the 'poll' method. Through a
'poll' call on the log file, the User learns whenever the Driver takes a
new snapshot of the log buffer, so it can run in tandem with the Driver
and capture the logs in a sustained/streaming manner, without any loss
of data.

v2: Defer the creation of relay channel & associated debugfs file, as
debugfs setup is now done at the end of i915 Driver load. (Chris)

v3:
- Switch to no-overwrite mode for relay.
- Fix the relay sub buffer switching sequence.

v4:
- Update i915 Kconfig to select RELAY config. (Tvrtko)
- Log a message when there is no sub buffer available to capture
  the GuC log buffer. (Tvrtko)
- Increase the number of relay sub buffers to 8 from 4, to have
  sufficient buffering for boot time logs

v5:
- Fix the alignment, indentation issues and some minor cleanup. (Tvrtko)
- Update the comment to elaborate on why a relay channel has to be
  associated with the debugfs file. (Tvrtko)

v6:
- Move the write to 'is_global' after the NULL check on parent directory
  dentry pointer. (Tvrtko)

v7: Add a BUG_ON to validate relay buffer allocation size. (Chris)

Testcase: igt/tools/intel_guc_logger

Suggested-by: Chris Wilson 
Signed-off-by: Sourab Gupta 
Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/Kconfig   |   1 +
 drivers/gpu/drm/i915/i915_drv.c|   2 +
 drivers/gpu/drm/i915/i915_guc_submission.c | 213 -
 drivers/gpu/drm/i915/intel_guc.h   |   3 +
 4 files changed, 217 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
index 7769e46..fc900d2 100644
--- a/drivers/gpu/drm/i915/Kconfig
+++ b/drivers/gpu/drm/i915/Kconfig
@@ -11,6 +11,7 @@ config DRM_I915
select DRM_KMS_HELPER
select DRM_PANEL
select DRM_MIPI_DSI
+   select RELAY
# i915 depends on ACPI_VIDEO when ACPI is enabled
# but for select to work, need to select ACPI_VIDEO's dependencies, ick
select BACKLIGHT_LCD_SUPPORT if ACPI
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 02c34d6..b61f1c5 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1122,6 +1122,7 @@ static void i915_driver_register(struct drm_i915_private *dev_priv)
/* Reveal our presence to userspace */
if (drm_dev_register(dev, 0) == 0) {
i915_debugfs_register(dev_priv);
+   i915_guc_register(dev_priv);
i915_setup_sysfs(dev_priv);
} else
DRM_ERROR("Failed to register driver for userspace access!\n");
@@ -1160,6 +1161,7 @@ static void i915_driver_unregister(struct drm_i915_private *dev_priv)
intel_opregion_unregister(dev_priv);
 
i915_teardown_sysfs(dev_priv);
+   i915_guc_unregister(dev_priv);
i915_debugfs_unregister(dev_priv);
drm_dev_unregister(&dev_priv->drm);
 
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 21e7c183..e2863b7 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -23,6 +23,8 @@
  */
 #include 
 #include 
+#include 
+#include 
 #include "i915_drv.h"
 #include "intel_guc.h"
 
@@ -835,13 +837,160 @@ err:
return NULL;
 }
 
+/*
+ * Sub buffer switch callback. Called whenever relay has to switch to a new
+ * sub buffer, relay stays on the same sub buffer if 0 is returned.
+ */
+static int subbuf_start_callback(struct rchan_buf *buf,
+void *subbuf,
+void *prev_subbuf,
+size_t prev_padding)
+{
+   /* Use no-overwrite mode by default, where relay will stop accepting
+* new data if there are no empty sub buffers left.
+* There is no strict synchronization enforced by relay between Consumer
+* and Producer. In overwrite mode, there is a possibility of getting
+* inconsistent/garbled data, the producer could be writing on to the
+* same sub buffer from which Consumer is reading. This can't be avoided
+* unless Consumer is fast enough and ca

[Intel-gfx] [PATCH 06/18] drm/i915: Handle log buffer flush interrupt event from GuC

2016-09-08 Thread akash . goel
From: Sagar Arun Kamble 

GuC ukernel sends an interrupt to Host to flush the log buffer
and expects Host to correspondingly update the read pointer
information in the state structure, once it has consumed the
log buffer contents by copying them to a file or buffer.
Even if Host couldn't copy the contents, it can still update the
read pointer so that logging state is not disturbed on GuC side.
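
The handshake described above can be sketched as a small userspace C
model. The struct below is a simplified stand-in for the shared state
structure (field names follow this series, the layout is illustrative
only), and host_handle_flush() with its ack_sent flag is hypothetical; in
the Driver the acknowledgment is a Host2GuC action rather than a flag.

```c
#include <stdint.h>

/* Simplified, illustrative stand-in for the shared log buffer state. */
struct log_state {
    uint32_t read_ptr;
    uint32_t write_ptr;
    uint32_t sampled_write_ptr;
    uint32_t flush_to_file;
};

/*
 * Hypothetical host-side flush handler (not the Driver's implementation).
 * Returns the offset up to which the buffer is considered consumed.
 */
static uint32_t host_handle_flush(struct log_state *shared, int *ack_sent)
{
    /* Work on a local snapshot so later firmware updates are not mixed in. */
    struct log_state snap = *shared;

    /* Log data up to snap.sampled_write_ptr would be copied out here;
     * the pointer update below is done even if that copy is skipped. */

    shared->read_ptr = snap.sampled_write_ptr;
    shared->flush_to_file = 0;

    /* Stands in for sending the flush-complete acknowledgment. */
    *ack_sent = 1;

    return snap.sampled_write_ptr;
}
```

The point of the ordering is that the read pointer and flush flag are
updated before the acknowledgment goes out, so the firmware never sees an
ack paired with stale state.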

v2:
- Use a dedicated workqueue for handling flush interrupt. (Tvrtko)
- Reduce the overall log buffer copying time by skipping the copy of
  crash buffer area for regular cases and copying only the state
  structure data in first page.

v3:
 - Create a vmalloc mapping of log buffer. (Chris)
 - Cover the flush acknowledgment under rpm get & put. (Chris)
 - Revert the change of skipping the copy of crash dump area, as
   not really needed, will be covered by subsequent patch.

v4:
 - Destroy the wq under the same condition in which it was created,
   pass dev_priv pointer instead of dev to newly added GuC function,
   add more comments & rename variable for clarity. (Tvrtko)

v5:
- Allocate & destroy the dedicated wq, for handling flush interrupt,
  from the setup/teardown routines of GuC logging. (Chris)
- Validate the log buffer size value retrieved from state structure
  and do some minor cleanup. (Tvrtko)
- Fix error/warnings reported by checkpatch. (Tvrtko)
- Rebase.

v6:
 - Remove the interrupts_enabled check from guc_capture_logs_work, need
   to process that last work item also, queued just before disabling the
   interrupt as log buffer flush interrupt handling is a bit different
   case where GuC is actually expecting an ACK from host, which should be
   provided to keep the logging going.
   Sync against the work will be done by caller disabling the interrupt.
 - Don't sample the log buffer size value from state structure, directly
   use the expected value to move the pointer & do the copy and that cannot
   go wrong (out of bounds) as Driver only allocated the log buffer and the
   relay buffers. Driver should refrain from interpreting the log packet,
   as much as possible and let Userspace parser detect the anomaly. (Chris)

v7:
- Use switch statement instead of 'if else' for retrieving the GuC log
  buffer size. (Tvrtko)
- Refactored the log buffer copying function and shortened the names of
  a couple of variables for better readability. (Tvrtko)

Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 186 +
 drivers/gpu/drm/i915/i915_irq.c|  28 -
 drivers/gpu/drm/i915/intel_guc.h   |   4 +
 3 files changed, 217 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index acd9c3d..21e7c183 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -170,6 +170,15 @@ static int host2guc_sample_forcewake(struct intel_guc *guc,
return host2guc_action(guc, data, ARRAY_SIZE(data));
 }
 
+static int host2guc_logbuffer_flush_complete(struct intel_guc *guc)
+{
+   u32 data[1];
+
+   data[0] = HOST2GUC_ACTION_LOG_BUFFER_FILE_FLUSH_COMPLETE;
+
+   return host2guc_action(guc, data, 1);
+}
+
 /*
  * Initialise, update, or clear doorbell data shared with the GuC
  *
@@ -826,6 +835,163 @@ err:
return NULL;
 }
 
+static void guc_move_to_next_buf(struct intel_guc *guc)
+{
+}
+
+static void *guc_get_write_buffer(struct intel_guc *guc)
+{
+   return NULL;
+}
+
+static unsigned int guc_get_log_buffer_size(enum guc_log_buffer_type type)
+{
+   switch (type) {
+   case GUC_ISR_LOG_BUFFER:
+   return (GUC_LOG_ISR_PAGES + 1) * PAGE_SIZE;
+   case GUC_DPC_LOG_BUFFER:
+   return (GUC_LOG_DPC_PAGES + 1) * PAGE_SIZE;
+   case GUC_CRASH_DUMP_LOG_BUFFER:
+   return (GUC_LOG_CRASH_PAGES + 1) * PAGE_SIZE;
+   default:
+   MISSING_CASE(type);
+   }
+
+   return 0;
+}
+
+static void guc_read_update_log_buffer(struct intel_guc *guc)
+{
+   struct guc_log_buffer_state *log_buf_state, *log_buf_snapshot_state;
+   struct guc_log_buffer_state log_buf_state_local;
+   unsigned int buffer_size, write_offset;
+   enum guc_log_buffer_type type;
+   void *src_data, *dst_data;
+
+   if (WARN_ON(!guc->log.buf_addr))
+   return;
+
+   /* Get the pointer to shared GuC log buffer */
+   log_buf_state = src_data = guc->log.buf_addr;
+
+   /* Get the pointer to local buffer to store the logs */
+   log_buf_snapshot_state = dst_data = guc_get_write_buffer(guc);
+
+   /* Actual logs are present from the 2nd page */
+   src_data += PAGE_SIZE;
+   dst_data += PAGE_SIZE;
+
+   for (type = GUC_ISR_LOG_BUFFER; type < GUC_MAX_LOG_BUFFER; type++) {
+   /* Make a copy 

[Intel-gfx] [PATCH 13/18] drm/i915: Augment i915 error state to include the dump of GuC log buffer

2016-09-08 Thread akash . goel
From: Akash Goel 

Added the dump of GuC log buffer to i915 error state, as the contents of
the GuC log buffer would also be useful to determine why the GPU reset
was triggered.

v2:
- For uniformity use existing helper function print_error_obj() to
  dump out contents of GuC log buffer, pretty printing is better left
  to userspace. (Chris)
- Skip the dumping of GuC log buffer when logging is disabled as it
  won't be of any use.
- Rebase.

Suggested-by: Chris Wilson 
Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_drv.h   |  1 +
 drivers/gpu/drm/i915/i915_gpu_error.c | 20 
 2 files changed, 21 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 7e7eb17..ea02037 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -763,6 +763,7 @@ struct drm_i915_error_state {
struct intel_overlay_error_state *overlay;
struct intel_display_error_state *display;
struct drm_i915_error_object *semaphore;
+   struct drm_i915_error_object *guc_log;
 
struct drm_i915_error_engine {
int engine_id;
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 334f15d..f9eebd9 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -576,6 +576,13 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
}
}
 
+   obj = error->guc_log;
+   if (obj) {
+   err_printf(m, "GuC log buffer = 0x%08x\n",
+  lower_32_bits(obj->gtt_offset));
+   print_error_obj(m, obj);
+   }
+
if (error->overlay)
intel_overlay_print_error_state(m, error->overlay);
 
@@ -656,6 +663,7 @@ static void i915_error_state_free(struct kref *error_ref)
}
 
i915_error_object_free(error->semaphore);
+   i915_error_object_free(error->guc_log);
 
for (i = 0; i < ARRAY_SIZE(error->active_bo); i++)
kfree(error->active_bo[i]);
@@ -1302,6 +1310,17 @@ static void i915_capture_pinned_buffers(struct drm_i915_private *dev_priv,
error->pinned_bo = bo;
 }
 
+static void i915_gem_capture_guc_log_buffer(struct drm_i915_private *dev_priv,
+   struct drm_i915_error_state *error)
+{
+   /* Capturing log buf contents won't be useful if logging was disabled */
+   if (!dev_priv->guc.log.vma || (i915.guc_log_level < 0))
+   return;
+
+   error->guc_log = i915_error_object_create(dev_priv,
+ dev_priv->guc.log.vma);
+}
+
 /* Capture all registers which don't fit into another category. */
 static void i915_capture_reg_state(struct drm_i915_private *dev_priv,
   struct drm_i915_error_state *error)
@@ -1453,6 +1472,7 @@ void i915_capture_error_state(struct drm_i915_private *dev_priv,
i915_gem_record_rings(dev_priv, error);
i915_capture_active_buffers(dev_priv, error);
i915_capture_pinned_buffers(dev_priv, error);
+   i915_gem_capture_guc_log_buffer(dev_priv, error);
 
do_gettimeofday(&error->time);
 
-- 
1.9.2



[Intel-gfx] [PATCH 16/18] drm/i915: Use SSE4.1 movntdqa based memcpy for sampling GuC log buffer

2016-09-08 Thread akash . goel
From: Akash Goel 

To ensure that we always get up-to-date data from the log buffer, it's
better to access the buffer through an uncached CPU mapping. Also, given
the way the buffer is accessed from the GuC & Host sides, manually doing
a cache flush may not always be effective if a cached CPU mapping is used.
To avoid any performance drop & have fast reads from the GuC log buffer,
use the SSE4.1 movntdqa based memcpy function i915_memcpy_from_wc, as
copying with movntdqa from WC type memory is almost as fast as reading
from WB memory.
This way the log buffer sampling time does not increase, so the Driver
can still deal with the flush interrupt storm when GuC is generating logs
at a very high rate.
Ideally SSE 4.1 should be present on all chipsets supporting GuC based
submissions, but if not then logging will not be enabled.

v2: Rebase.

v3: Squash the WC type vmalloc mapping patch with this patch. (Chris)

Suggested-by: Chris Wilson 
Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 25 ++---
 1 file changed, 18 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 617ded1..9ca9e0d 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -1127,18 +1127,16 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
 
/* Just copy the newly written data */
if (read_offset > write_offset) {
-   memcpy(dst_data, src_data, write_offset);
+   i915_memcpy_from_wc(dst_data, src_data, write_offset);
bytes_to_copy = buffer_size - read_offset;
} else {
bytes_to_copy = write_offset - read_offset;
}
-   memcpy(dst_data + read_offset,
-  src_data + read_offset, bytes_to_copy);
+   i915_memcpy_from_wc(dst_data + read_offset,
+   src_data + read_offset, bytes_to_copy);
 
src_data += buffer_size;
dst_data += buffer_size;
-
-   /* FIXME: invalidate/flush for log buffer needed */
}
 
if (log_buf_snapshot_state)
@@ -1198,8 +1196,11 @@ static int guc_create_log_extras(struct intel_guc *guc)
return 0;
 
if (!guc->log.buf_addr) {
-   /* Create a vmalloc mapping of log buffer pages */
-   vaddr = i915_gem_object_pin_map(guc->log.vma->obj, I915_MAP_WB);
+   /* Create a WC (Uncached for read) vmalloc mapping of log
+* buffer pages, so that we can directly get the data
+* (up-to-date) from memory.
+*/
+   vaddr = i915_gem_object_pin_map(guc->log.vma->obj, I915_MAP_WC);
if (IS_ERR(vaddr)) {
ret = PTR_ERR(vaddr);
DRM_ERROR("Couldn't map log buffer pages %d\n", ret);
@@ -1242,6 +1243,16 @@ static void guc_create_log(struct intel_guc *guc)
 
vma = guc->log.vma;
if (!vma) {
+   /* We require SSE 4.1 for fast reads from the GuC log buffer and
+* it should be present on the chipsets supporting GuC based
+* submissions.
+*/
+   if (WARN_ON(!i915_memcpy_from_wc(NULL, NULL, 0))) {
+   /* logging will not be enabled */
+   i915.guc_log_level = -1;
+   return;
+   }
+
vma = guc_allocate_vma(guc, size);
if (IS_ERR(vma)) {
/* logging will be off */
-- 
1.9.2



[Intel-gfx] [PATCH 18/18] drm/i915: Mark the GuC log buffer flush interrupts handling WQ as freezable

2016-09-08 Thread akash . goel
From: Akash Goel 

The GuC log buffer flush work item has to do a register access to send the
ack to GuC, and this work item, if not synced before suspend, can
potentially get executed after the GFX device is suspended. The work item
function uses rpm get/put calls around the Hw access, which covers the rpm
suspend case. For system suspend, however, a sync would be required, as
the kernel can potentially schedule work items even after some devices,
including GFX, have been put to suspend. But the sync has to be done only
for the system suspend case, as a sync along with rpm get/put can cause a
deadlock on the rpm suspend path.
To keep the sync, but effectively as a NOOP, on the rpm suspend path too,
this work item could have been queued from the irq handler only when the
device is runtime active, with the device kept active while the work item
is pending or executing. But an interrupt can arrive even after the device
is out of use, so that approach could lead to this work item being missed.

By marking the workqueue dedicated to handling GuC log buffer flush
interrupts as freezable, we don't have to bother about flushing this work
item from the suspend hooks; a pending work item, if any, will either be
executed before the suspend or scheduled later on resume. This way the
handling of the log buffer flush work item can be kept the same between
system suspend & rpm suspend.

Suggested-by: Imre Deak 
Cc: Imre Deak 
Signed-off-by: Akash Goel 
Reviewed-by: Imre Deak 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index b9fede0..593a01d 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -1238,8 +1238,19 @@ static int guc_create_log_extras(struct intel_guc *guc)
 
/* Need a dedicated wq to process log buffer flush interrupts
 * from GuC without much delay so as to avoid any loss of logs.
+*
+* GuC log buffer flush work item has to do register access to
+* send the ack to GuC and this work item, if not synced before
+* suspend, can potentially get executed after the GFX device is
+* suspended.
+* By marking the WQ as freezable, we don't have to bother about
+* flushing of this work item from the suspend hooks, the pending
+* work item if any will be either executed before the suspend
+* or scheduled later on resume. This way the handling of work
+* item can be kept same between system suspend & rpm suspend.
 */
-   guc->log.flush_wq = alloc_ordered_workqueue("i915-guc_log", 0);
+   guc->log.flush_wq =
+   alloc_ordered_workqueue("i915-guc_log", WQ_FREEZABLE);
if (guc->log.flush_wq == NULL) {
DRM_ERROR("Couldn't allocate the wq for GuC logging\n");
return -ENOMEM;
-- 
1.9.2



[Intel-gfx] [PATCH 14/18] drm/i915: Support for forceful flush of GuC log buffer

2016-09-08 Thread akash . goel
From: Sagar Arun Kamble 

GuC firmware sends a flush interrupt to Host when the log buffer is half
full, and only at that time does it update the log buffer state.
But in certain cases, as described below, it could be useful to have the
flush and state update happen even when the log buffer is only partially
full. For that, GuC firmware supports a force log buffer flush Host2GuC
action.

For Validation requirements, a forceful flush is needed to collect the
leftover logs on disabling logging. The same can be done before proceeding
with a GPU/GuC reset, as there could be some data in the log buffer which
is yet to be captured, and those logs would be particularly useful to
understand why the reset was initiated.

Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 30 ++
 drivers/gpu/drm/i915/intel_guc.h   |  1 +
 2 files changed, 31 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index ae52d10..b8f6baf 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -183,6 +183,16 @@ static int host2guc_logbuffer_flush_complete(struct intel_guc *guc)
return host2guc_action(guc, data, 1);
 }
 
+static int host2guc_force_logbuffer_flush(struct intel_guc *guc)
+{
+   u32 data[2];
+
+   data[0] = HOST2GUC_ACTION_FORCE_LOG_BUFFER_FLUSH;
+   data[1] = 0;
+
+   return host2guc_action(guc, data, 2);
+}
+
 /*
  * Initialise, update, or clear doorbell data shared with the GuC
  *
@@ -1523,6 +1533,26 @@ void i915_guc_capture_logs(struct drm_i915_private *dev_priv)
intel_runtime_pm_put(dev_priv);
 }
 
+void i915_guc_flush_logs(struct drm_i915_private *dev_priv)
+{
+   if (!i915.enable_guc_submission || (i915.guc_log_level < 0))
+   return;
+
+   /* First disable the interrupts, will be re-enabled afterwards */
+   gen9_disable_guc_interrupts(dev_priv);
+
+   /* Before initiating the forceful flush, wait for any pending/ongoing
+* flush to complete otherwise forceful flush may not actually happen.
+*/
+   flush_work(&dev_priv->guc.log.flush_work);
+
+   /* Ask GuC to update the log buffer state */
+   host2guc_force_logbuffer_flush(&dev_priv->guc);
+
+   /* GuC would have updated log buffer by now, so capture it */
+   i915_guc_capture_logs(dev_priv);
+}
+
 void i915_guc_unregister(struct drm_i915_private *dev_priv)
 {
if (!i915.enable_guc_submission)
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index 8598f38..d918241 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -182,6 +182,7 @@ int i915_guc_wq_check_space(struct drm_i915_gem_request *rq);
 void i915_guc_submission_disable(struct drm_i915_private *dev_priv);
 void i915_guc_submission_fini(struct drm_i915_private *dev_priv);
 void i915_guc_capture_logs(struct drm_i915_private *dev_priv);
+void i915_guc_flush_logs(struct drm_i915_private *dev_priv);
 void i915_guc_register(struct drm_i915_private *dev_priv);
 void i915_guc_unregister(struct drm_i915_private *dev_priv);
 
-- 
1.9.2

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 04/18] drm/i915: Add low level set of routines for programming PM IER/IIR/IMR register set

2016-09-08 Thread akash . goel
From: Akash Goel 

So far the PM IER/IIR/IMR registers were being used only for Turbo related
interrupts. But interrupts coming from GuC also use the same set.
As a precursor to supporting GuC interrupts, add new low level routines
that allow sharing the programming of the PM IER/IIR/IMR registers between
Turbo & GuC.
Also, similar to PM IMR, maintain a bitmask for the PM IER register, so that
it can be shared easily between Turbo & GuC without a read-modify-write (rmw)
operation.
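
A minimal sketch of the cached-bitmask idea follows, written as standalone C
rather than driver code. 'struct pm_regs', 'struct pm_shadow' and the pm_*
helpers are hypothetical names for illustration, and plain struct fields
stand in for the register writes; only the mask arithmetic follows
snb_update_pm_irq() from this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the hardware registers. */
struct pm_regs {
	uint32_t ier_hw;  /* last value written to the enable register */
	uint32_t imr_hw;  /* last value written to the mask register */
};

/* Software copies, so an update never needs to read a register back. */
struct pm_shadow {
	uint32_t ier;     /* bit set = interrupt enabled */
	uint32_t imr;     /* bit set = interrupt masked */
};

/* Change only the bits in interrupt_mask; leave other users' bits alone. */
static void pm_update_imr(struct pm_regs *hw, struct pm_shadow *sw,
			  uint32_t interrupt_mask, uint32_t enabled_irq_mask)
{
	uint32_t new_val = sw->imr;

	assert((enabled_irq_mask & ~interrupt_mask) == 0);

	new_val &= ~interrupt_mask;
	new_val |= ~enabled_irq_mask & interrupt_mask;

	if (new_val != sw->imr) {
		sw->imr = new_val;
		hw->imr_hw = new_val;
	}
}

static void pm_enable(struct pm_regs *hw, struct pm_shadow *sw, uint32_t bits)
{
	sw->ier |= bits;
	hw->ier_hw = sw->ier;
	pm_update_imr(hw, sw, bits, bits);  /* unmask the same bits */
}

static void pm_disable(struct pm_regs *hw, struct pm_shadow *sw, uint32_t bits)
{
	sw->ier &= ~bits;
	pm_update_imr(hw, sw, bits, 0);     /* mask before disabling */
	hw->ier_hw = sw->ier;
}
```

With two users holding disjoint bit sets, one user can enable or disable its
own bits without disturbing the other's and without reading the register.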

v2:
- For appropriateness & to avoid any ambiguity, rename old functions
  enable/disable pm_irq to unmask/mask pm_irq and rename new functions
  enable/disable pm_interrupts to enable/disable pm_irq. (Tvrtko)
- Use u32 in place of uint32_t. (Tvrtko)

v3:
- Rename the fields pm_irq_mask & pm_ier_mask and do some cleanup. (Chris)
- Rebase.

v4: Fix the inadvertent disabling of User interrupt for VECS ring causing
failure for certain IGTs.

v5: Use dev_priv with HAS_VEBOX macro. (Tvrtko)

Suggested-by: Chris Wilson 
Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_drv.h |  3 +-
 drivers/gpu/drm/i915/i915_irq.c | 75 ++---
 drivers/gpu/drm/i915/intel_drv.h|  3 ++
 drivers/gpu/drm/i915/intel_ringbuffer.c |  4 +-
 4 files changed, 57 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f39bede..51b5334 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1810,7 +1810,8 @@ struct drm_i915_private {
u32 de_irq_mask[I915_MAX_PIPES];
};
u32 gt_irq_mask;
-   u32 pm_irq_mask;
+   u32 pm_imr;
+   u32 pm_ier;
u32 pm_rps_events;
u32 pipestat_irq_mask[I915_MAX_PIPES];
 
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 82358d4..c88df94 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -303,18 +303,18 @@ static void snb_update_pm_irq(struct drm_i915_private *dev_priv,
 
assert_spin_locked(&dev_priv->irq_lock);
 
-   new_val = dev_priv->pm_irq_mask;
+   new_val = dev_priv->pm_imr;
new_val &= ~interrupt_mask;
new_val |= (~enabled_irq_mask & interrupt_mask);
 
-   if (new_val != dev_priv->pm_irq_mask) {
-   dev_priv->pm_irq_mask = new_val;
-   I915_WRITE(gen6_pm_imr(dev_priv), dev_priv->pm_irq_mask);
+   if (new_val != dev_priv->pm_imr) {
+   dev_priv->pm_imr = new_val;
+   I915_WRITE(gen6_pm_imr(dev_priv), dev_priv->pm_imr);
POSTING_READ(gen6_pm_imr(dev_priv));
}
 }
 
-void gen6_enable_pm_irq(struct drm_i915_private *dev_priv, uint32_t mask)
+void gen6_unmask_pm_irq(struct drm_i915_private *dev_priv, u32 mask)
 {
if (WARN_ON(!intel_irqs_enabled(dev_priv)))
return;
@@ -322,28 +322,54 @@ void gen6_enable_pm_irq(struct drm_i915_private *dev_priv, uint32_t mask)
snb_update_pm_irq(dev_priv, mask, mask);
 }
 
-static void __gen6_disable_pm_irq(struct drm_i915_private *dev_priv,
- uint32_t mask)
+static void __gen6_mask_pm_irq(struct drm_i915_private *dev_priv, u32 mask)
 {
snb_update_pm_irq(dev_priv, mask, 0);
 }
 
-void gen6_disable_pm_irq(struct drm_i915_private *dev_priv, uint32_t mask)
+void gen6_mask_pm_irq(struct drm_i915_private *dev_priv, u32 mask)
 {
if (WARN_ON(!intel_irqs_enabled(dev_priv)))
return;
 
-   __gen6_disable_pm_irq(dev_priv, mask);
+   __gen6_mask_pm_irq(dev_priv, mask);
 }
 
-void gen6_reset_rps_interrupts(struct drm_i915_private *dev_priv)
+void gen6_reset_pm_iir(struct drm_i915_private *dev_priv, u32 reset_mask)
 {
i915_reg_t reg = gen6_pm_iir(dev_priv);
 
-   spin_lock_irq(&dev_priv->irq_lock);
-   I915_WRITE(reg, dev_priv->pm_rps_events);
-   I915_WRITE(reg, dev_priv->pm_rps_events);
+   assert_spin_locked(&dev_priv->irq_lock);
+
+   I915_WRITE(reg, reset_mask);
+   I915_WRITE(reg, reset_mask);
POSTING_READ(reg);
+}
+
+void gen6_enable_pm_irq(struct drm_i915_private *dev_priv, u32 enable_mask)
+{
+   assert_spin_locked(&dev_priv->irq_lock);
+
+   dev_priv->pm_ier |= enable_mask;
+   I915_WRITE(gen6_pm_ier(dev_priv), dev_priv->pm_ier);
+   gen6_unmask_pm_irq(dev_priv, enable_mask);
+   /* unmask_pm_irq provides an implicit barrier (POSTING_READ) */
+}
+
+void gen6_disable_pm_irq(struct drm_i915_private *dev_priv, u32 disable_mask)
+{
+   assert_spin_locked(&dev_priv->irq_lock);
+
+   dev_priv->pm_ier &= ~disable_mask;
+   __gen6_mask_pm_irq(dev_priv, disable_mask);
+   I915_WRITE(gen6_pm_ier(dev_priv), dev_priv->pm_ier);
+   /* a barrier is missing here, but we don't really need one */
+}
+
+void gen6_reset_rps_interrup

[Intel-gfx] [PATCH 05/18] drm/i915: Support for GuC interrupts

2016-09-08 Thread akash . goel
From: Sagar Arun Kamble 

There are certain types of interrupts which Host can receive from GuC.
GuC ukernel sends an interrupt to Host for certain events, for example
when Host should retrieve/consume the logs generated by ukernel.
This patch adds support for receiving interrupts from GuC, but currently
enables & partially handles only the interrupt sent by GuC ukernel.
Future patches will add support for handling other interrupt types.

v2:
- Use common low level routines for PM IER/IIR programming (Chris)
- Rename interrupt functions to gen9_xxx from gen8_xxx (Chris)
- Replace disabling of wake ref asserts with rpm get/put (Chris)

v3:
- Update comments for more clarity. (Tvrtko)
- Remove the masking of GuC interrupt, which was kept masked till the
  start of bottom half; it's not really needed as there is only a
  single instance of work item & wq is ordered. (Tvrtko)

v4:
- Rebase.
- Rename guc_events to pm_guc_events so as to be indicative of the
  register/control block it is associated with. (Chris)
- Add handling for back to back log buffer flush interrupts.

v5:
- Move the read & clearing of register, containing Guc2Host message
  bits, outside the irq spinlock. (Tvrtko)

v6:
- Move the log buffer flush interrupt related stuff to the following
  patch so as to do only generic bits in this patch. (Tvrtko)
- Rebase.

v7:
- Remove the interrupts_enabled check from gen9_guc_irq_handler, as the
  last interrupt should also be processed before the interrupt is
  disabled. Syncing against the work queued by the irq handler will be
  done by the caller disabling the interrupt.

Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_drv.h|  1 +
 drivers/gpu/drm/i915/i915_guc_submission.c |  5 +++
 drivers/gpu/drm/i915/i915_irq.c| 56 +++---
 drivers/gpu/drm/i915/i915_reg.h| 11 ++
 drivers/gpu/drm/i915/intel_drv.h   |  3 ++
 drivers/gpu/drm/i915/intel_guc.h   |  3 ++
 drivers/gpu/drm/i915/intel_guc_loader.c|  4 +++
 7 files changed, 79 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 51b5334..7e7eb17 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1813,6 +1813,7 @@ struct drm_i915_private {
u32 pm_imr;
u32 pm_ier;
u32 pm_rps_events;
+   u32 pm_guc_events;
u32 pipestat_irq_mask[I915_MAX_PIPES];
 
struct i915_hotplug hotplug;
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index c234a96..acd9c3d 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -1053,6 +1053,8 @@ int intel_guc_suspend(struct drm_device *dev)
if (guc->guc_fw.guc_fw_load_status != GUC_FIRMWARE_SUCCESS)
return 0;
 
+   gen9_disable_guc_interrupts(dev_priv);
+
ctx = dev_priv->kernel_context;
 
data[0] = HOST2GUC_ACTION_ENTER_S_STATE;
@@ -1079,6 +1081,9 @@ int intel_guc_resume(struct drm_device *dev)
if (guc->guc_fw.guc_fw_load_status != GUC_FIRMWARE_SUCCESS)
return 0;
 
+   if (i915.guc_log_level >= 0)
+   gen9_enable_guc_interrupts(dev_priv);
+
ctx = dev_priv->kernel_context;
 
data[0] = HOST2GUC_ACTION_EXIT_S_STATE;
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index c88df94..04c48ff 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -170,6 +170,7 @@ static void gen5_assert_iir_is_zero(struct drm_i915_private *dev_priv,
 } while (0)
 
 static void gen6_rps_irq_handler(struct drm_i915_private *dev_priv, u32 pm_iir);
+static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv, u32 pm_iir);
 
 /* For display hotplug interrupt */
 static inline void
@@ -411,6 +412,38 @@ void gen6_disable_rps_interrupts(struct drm_i915_private *dev_priv)
gen6_reset_rps_interrupts(dev_priv);
 }
 
+void gen9_reset_guc_interrupts(struct drm_i915_private *dev_priv)
+{
+   spin_lock_irq(&dev_priv->irq_lock);
+   gen6_reset_pm_iir(dev_priv, dev_priv->pm_guc_events);
+   spin_unlock_irq(&dev_priv->irq_lock);
+}
+
+void gen9_enable_guc_interrupts(struct drm_i915_private *dev_priv)
+{
+   spin_lock_irq(&dev_priv->irq_lock);
+   if (!dev_priv->guc.interrupts_enabled) {
+   WARN_ON_ONCE(I915_READ(gen6_pm_iir(dev_priv)) &
+  dev_priv->pm_guc_events);
+   dev_priv->guc.interrupts_enabled = true;
+   gen6_enable_pm_irq(dev_priv, dev_priv->pm_guc_events);
+   }
+   spin_unlock_irq(&dev_priv->irq_lock);
+}
+
+void gen9_disable_guc_interrupts(struct drm_i915_private *dev_priv)
+{
+   spin_lock_irq(&dev_priv->irq_lock);
+   dev_priv-

[Intel-gfx] [PATCH 03/18] drm/i915: New structure to contain GuC logging related fields

2016-09-08 Thread akash . goel
From: Akash Goel 

So far there were 2 fields related to GuC logs in the 'intel_guc' structure.
To support capturing GuC logs & storing them in a local buffer, multiple
new fields would have to be added. This warrants a separate structure to
contain the fields related to GuC logging state.
Add a new structure 'intel_guc_log' and an instance of it inside the
'intel_guc' structure.

v2: Rebase.

Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_debugfs.c| 4 ++--
 drivers/gpu/drm/i915/i915_guc_submission.c | 8 
 drivers/gpu/drm/i915/intel_guc.h   | 8 ++--
 drivers/gpu/drm/i915/intel_guc_loader.c| 2 +-
 4 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 02b627e..c93d423 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2517,10 +2517,10 @@ static int i915_guc_log_dump(struct seq_file *m, void *data)
struct drm_i915_gem_object *obj;
int i = 0, pg;
 
-   if (!dev_priv->guc.log_vma)
+   if (!dev_priv->guc.log.vma)
return 0;
 
-   obj = dev_priv->guc.log_vma->obj;
+   obj = dev_priv->guc.log.vma->obj;
for (pg = 0; pg < obj->base.size / PAGE_SIZE; pg++) {
u32 *log = kmap_atomic(i915_gem_object_get_page(obj, pg));
 
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index c681680..c234a96 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -841,7 +841,7 @@ static void guc_create_log(struct intel_guc *guc)
GUC_LOG_ISR_PAGES + 1 +
GUC_LOG_CRASH_PAGES + 1) << PAGE_SHIFT;
 
-   vma = guc->log_vma;
+   vma = guc->log.vma;
if (!vma) {
vma = guc_allocate_vma(guc, size);
if (IS_ERR(vma)) {
@@ -850,7 +850,7 @@ static void guc_create_log(struct intel_guc *guc)
return;
}
 
-   guc->log_vma = vma;
+   guc->log.vma = vma;
}
 
/* each allocated unit is a page */
@@ -860,7 +860,7 @@ static void guc_create_log(struct intel_guc *guc)
(GUC_LOG_CRASH_PAGES << GUC_LOG_CRASH_SHIFT);
 
offset = i915_ggtt_offset(vma) >> PAGE_SHIFT; /* in pages */
-   guc->log_flags = (offset << GUC_LOG_BUF_ADDR_SHIFT) | flags;
+   guc->log.flags = (offset << GUC_LOG_BUF_ADDR_SHIFT) | flags;
 }
 
 static void init_guc_policies(struct guc_policies *policies)
@@ -1032,7 +1032,7 @@ void i915_guc_submission_fini(struct drm_i915_private *dev_priv)
struct intel_guc *guc = &dev_priv->guc;
 
i915_vma_unpin_and_release(&guc->ads_vma);
-   i915_vma_unpin_and_release(&guc->log_vma);
+   i915_vma_unpin_and_release(&guc->log.vma);
 
if (guc->ctx_pool_vma)
ida_destroy(&guc->ctx_ids);
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index c973262..d32023c 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -121,10 +121,14 @@ struct intel_guc_fw {
uint32_t ucode_offset;
 };
 
+struct intel_guc_log {
+   uint32_t flags;
+   struct i915_vma *vma;
+};
+
 struct intel_guc {
struct intel_guc_fw guc_fw;
-   uint32_t log_flags;
-   struct i915_vma *log_vma;
+   struct intel_guc_log log;
 
struct i915_vma *ads_vma;
struct i915_vma *ctx_pool_vma;
diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c
index f747fe0..17f3ff1 100644
--- a/drivers/gpu/drm/i915/intel_guc_loader.c
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -189,7 +189,7 @@ static void set_guc_init_params(struct drm_i915_private *dev_priv)
params[GUC_CTL_FEATURE] |= GUC_CTL_DISABLE_SCHEDULER |
GUC_CTL_VCS2_ENABLED;
 
-   params[GUC_CTL_LOG_PARAMS] = guc->log_flags;
+   params[GUC_CTL_LOG_PARAMS] = guc->log.flags;
 
if (i915.guc_log_level >= 0) {
params[GUC_CTL_DEBUG] =
-- 
1.9.2



[Intel-gfx] [PATCH 01/18] drm/i915: Decouple GuC log setup from verbosity parameter

2016-09-08 Thread akash . goel
From: Sagar Arun Kamble 

GuC log buffer allocation was tied to the verbosity level module param
i915.guc_log_level. User will be given a provision to enable firmware
logging at runtime, through a host2guc action, and not necessarily at
Driver load time. But the address of the log buffer can be passed only in
init params, at firmware load time, so GuC would have to be reset and the
firmware reloaded to pass the log buffer address at runtime.
To avoid a reset of GuC & reload of firmware, the log buffer will always be
allocated, but logging will initially be enabled on the GuC side based on
the value of module parameter guc_log_level.
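
The resulting parameter setup can be sketched as below. This is a standalone
illustration: the SKETCH_* indices, shift and disable bit are placeholder
values picked for the example and are not claimed to match the real GuC
definitions, and sketch_set_log_params() is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder values for this sketch only, not the real GuC ABI. */
#define SKETCH_CTL_LOG_PARAMS   0
#define SKETCH_CTL_DEBUG        1
#define SKETCH_CTL_MAX          2
#define SKETCH_VERBOSITY_SHIFT  0
#define SKETCH_LOG_DISABLED     (1u << 6)

static void sketch_set_log_params(uint32_t *params, uint32_t log_flags,
				  int log_level)
{
	assert(params != NULL);

	/* The log buffer address is passed unconditionally... */
	params[SKETCH_CTL_LOG_PARAMS] = log_flags;

	/* ...and only the verbosity depends on the module parameter. */
	if (log_level >= 0)
		params[SKETCH_CTL_DEBUG] =
			(uint32_t)log_level << SKETCH_VERBOSITY_SHIFT;
	else
		params[SKETCH_CTL_DEBUG] = SKETCH_LOG_DISABLED;
}
```

Because the buffer address is always present, logging can later be switched
on at runtime without going through a firmware reload.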

v2: Update commit message to describe the constraint with allocation of
log buffer at runtime. (Tvrtko)

v3: Rebase.

Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 3 ---
 drivers/gpu/drm/i915/intel_guc_loader.c| 6 --
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 77526d7..c681680 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -832,9 +832,6 @@ static void guc_create_log(struct intel_guc *guc)
unsigned long offset;
uint32_t size, flags;
 
-   if (i915.guc_log_level < GUC_LOG_VERBOSITY_MIN)
-   return;
-
if (i915.guc_log_level > GUC_LOG_VERBOSITY_MAX)
i915.guc_log_level = GUC_LOG_VERBOSITY_MAX;
 
diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c
index 853928f..f747fe0 100644
--- a/drivers/gpu/drm/i915/intel_guc_loader.c
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -189,11 +189,13 @@ static void set_guc_init_params(struct drm_i915_private *dev_priv)
params[GUC_CTL_FEATURE] |= GUC_CTL_DISABLE_SCHEDULER |
GUC_CTL_VCS2_ENABLED;
 
+   params[GUC_CTL_LOG_PARAMS] = guc->log_flags;
+
if (i915.guc_log_level >= 0) {
-   params[GUC_CTL_LOG_PARAMS] = guc->log_flags;
params[GUC_CTL_DEBUG] =
i915.guc_log_level << GUC_LOG_VERBOSITY_SHIFT;
-   }
+   } else
+   params[GUC_CTL_DEBUG] = GUC_LOG_DISABLED;
 
if (guc->ads_vma) {
u32 ads = i915_ggtt_offset(guc->ads_vma) >> PAGE_SHIFT;
-- 
1.9.2



[Intel-gfx] [PATCH v9 00/18] Support for sustained capturing of GuC firmware logs

2016-09-08 Thread akash . goel
From: Akash Goel 

GuC firmware logs its debug messages into a Host-GuC shared memory buffer
and when the buffer is half full it sends a Flush interrupt to Host.
GuC firmware follows the half-full draining protocol, where it expects that
while it is writing to the 2nd half of the buffer, the 1st half will get
consumed by Host and that it will then get a flush completed acknowledgment
from Host, so that it does not end up doing any overwrite causing loss of logs.

So far the flush interrupt wasn't enabled on Host side & User could capture
the contents/snapshot of log buffer through 'i915_guc_log_dump' debugfs iface.
But this couldn't meet a couple of key requirements, especially of Validation:
the first is to ensure capturing of all boot time logs even with high verbosity
level and the second is to enable capturing of logs in a sustained manner,
like for the entire duration of a workload.

Now Driver will enable the flush interrupt and, on receiving it, will copy the
contents of the log buffer into its local buffer. The size of the local buffer
will be big enough to contain multiple snapshots of the log buffer, giving
ample time to User to pull boot time messages.

A debugfs interface '/sys/kernel/debug/dri/guc_log' has been added for User to
collect the logs. The relay framework is used to implement this interface, so
Driver just has to use a relay API to store snapshots of the GuC log buffer
in a buffer managed by relay. The relay buffer can be operated in a mode,
equivalent to 'dmesg -c', where the old data, not yet collected by User, will
be overwritten if the buffer becomes full, or it can be operated in
no-overwrite mode, where relay will stop accepting new data if all sub buffers
are full. The latter mode is used to avoid the possibility of getting garbled
data.

Besides the mmap method, through which User can directly access the relay
buffer contents, relay also supports the 'poll' method. Through the 'poll'
call on the log file, User can come to know whenever a new snapshot of the log
buffer is taken by Driver, so can run in tandem with the Driver and thus
capture logs in a sustained/streaming manner, without any loss of data.
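
On the User side, the 'poll' based collection could look roughly like the
helper below. It is a sketch using only POSIX poll() and read();
sketch_poll_read() is a hypothetical name, and whether a plain read() is the
best way to pull a snapshot after poll() returns is an assumption here, since
relay also offers mmap.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Wait up to timeout_ms for data on fd, then read at most len bytes.
 * Returns the number of bytes read, 0 on timeout or end of file, and
 * -1 on error. Interrupted calls are retried.
 */
static ssize_t sketch_poll_read(int fd, void *buf, size_t len, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	ssize_t n;
	int ret;

	assert(buf != NULL && len > 0);

	do {
		ret = poll(&pfd, 1, timeout_ms);
	} while (ret < 0 && errno == EINTR);

	if (ret <= 0)
		return ret;	/* 0: timeout, -1: error */

	do {
		n = read(fd, buf, len);
	} while (n < 0 && errno == EINTR);

	return n;
}
```

A collector would call this in a loop with a buffer of one sub buffer in
size, appending whatever is returned to its output file.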

The logs can be captured from relay backed debugfs file through the utility
igt/tools/intel_guc_logger.

v2: Rebased to the latest drm-intel-nightly.

v3: Aligned with the modification of late debugfs registration, at the end of
i915 Driver load. Did cleanup as per Tvrtko's review comments, added 3
new patches to optimize the log-buffer flush interrupt handling, gather
and report the logging related stats.

v4: Added 2 new patches to further optimize the log-buffer flush interrupt
handling. Did cleanup as per Chris's review comments, fixed couple of
issues related to clearing of Guc2Host message register. Switched to
no-overwrite mode for the relay.

v5: Added a new patch to avail MOVNTDQA instruction based fast memcpy provided
by a patch from Chris. Dropped the rt priority kthread patch, after
evaluating all the optimizations with certain benchmarks like
synmark_oglmultithread, synmark_oglbatch5 which generate flush interrupts
almost every ms or less. Updated the older patches as per the review
comments from Tvrtko and Chris W. Added a new patch to augment i915 error
state with the GuC log buffer contents. Fixed the issue of User interrupt
getting disabled for VEBOX ring, causing failure for certain IGTs.
Also included 2 patches to support early logging for capturing boot
time logs and use per CPU constructs on the relay side so as to address
a WARNING issue with the call to relay_reserve(), without disabling
preemption.

v6: Mainly did the rebasing, refactoring, cleanup as per the review comments
and fixed error/warnings reported by checkpatch.

v7: Added a new patch to complete the pending log buffer flush work item in
system suspend case. Cleaned up the irq handler & work item function
by removing the check for GuC interrupts.

v8: Replaced the patch added in last version with a patch which marks the
GuC log buffer flush interrupt handling WQ as freezable, as per the inputs
from Imre. Refactored the log buffer sampling function and added a new
helper function to improve the readability as per suggestions from Tvrtko.

v9: As per Chris's comment, removed the forceful flush of GuC log buffer from
the error state capture path as that could have disturbed the atomicity
required in error state path. Squashed the wc type vmalloc mapping patch
with SSE4.1 movntdqa based memcpy patch. Added a BUG_ON for the relay
buffer allocation size.

Akash Goel (12):
  drm/i915: New structure to contain GuC logging related fields
  drm/i915: Add low level set of routines for programming PM IER/IIR/IMR
register set
  relay: Use per CPU constructs for the relay channel buffer pointers
  drm/i915: Add a relay backed debugfs interface for capturing GuC logs
  drm

[Intel-gfx] [PATCH v2] tools/intel_guc_logger: Utility for capturing GuC firmware logs in a file

2016-09-07 Thread akash . goel
From: Akash Goel 

This patch provides a test utility which helps capture GuC firmware logs and
then dump them to a file.
The logs are pulled from a debugfs file '/sys/kernel/debug/dri/guc_log' and
stored into a file '/tmp/guc_log_dump.dat'; the name of the output file can
be changed through a command line argument.

The utility goes into an infinite loop where it waits for the arrival of new
logs, and as soon as a new set of logs is produced it captures them in its
local buffer, which is then flushed out to the file on disk.
Whenever logging needs to be ended, User can stop this utility (CTRL+C).

Before entering the loop, it first discards whatever logs are present in
the debugfs file.
This way User can first launch this utility and then start a workload/activity
for which GuC firmware logs are to be actually captured, and keep running the
utility for as long as it's needed; once the workload is over, this utility
can be forcefully stopped.
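
Discarding stale data before the capture loop amounts to reading until
nothing is left. The helper below is a generic sketch of that step, not the
utility's own pull_leftover_data(); sketch_discard_old_data() is a
hypothetical name and, unlike the utility, it tolerates short reads.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read and discard chunks from fd until read() reports that nothing is
 * left (returns 0). Returns the number of bytes discarded, or -1 on a
 * read error.
 */
static long sketch_discard_old_data(int fd, char *scratch, size_t chunk)
{
	long total = 0;

	assert(scratch != NULL && chunk > 0);

	for (;;) {
		ssize_t n = read(fd, scratch, chunk);

		if (n == 0)
			break;			/* nothing left to discard */
		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, try again */
			return -1;
		}
		total += n;
	}

	return total;
}
```

The returned byte count is only useful for a debug message; the data itself
is thrown away so that the capture starts from fresh logs.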

If the logging wasn't enabled on GuC side by the Driver at boot time, utility
will first enable the logging and later on when it is stopped (CTRL+C) it will
also pause the logging on GuC side.

v2:
- Use combination of alarm system call & SIGALRM signal to run the utility
  for required duration. (Tvrtko)
- Fix inconsistencies, do minor cleanup and refactoring. (Tvrtko)
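
The alarm + SIGALRM combination mentioned for v2 can be sketched as follows.
This is an illustration with hypothetical names (sketch_stop,
sketch_on_signal, sketch_arm_stop), not the code of the utility; it only
uses POSIX sigaction() and alarm().

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Set from the signal handler, polled by the capture loop. */
static volatile sig_atomic_t sketch_stop;

static void sketch_on_signal(int sig)
{
	(void)sig;
	sketch_stop = 1;	/* only set a flag inside the handler */
}

/*
 * Arrange for the stop flag to be raised on CTRL+C and, when seconds is
 * non-zero, after the requested duration. Returns 0 on success.
 */
static int sketch_arm_stop(unsigned int seconds)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = sketch_on_signal;
	sigemptyset(&sa.sa_mask);

	if (sigaction(SIGINT, &sa, NULL) || sigaction(SIGALRM, &sa, NULL))
		return -1;

	alarm(seconds);		/* alarm(0) only cancels a pending alarm */
	return 0;
}
```

The capture loop then just checks sketch_stop on each iteration, so the same
exit path serves both CTRL+C and the timed run.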

Cc: Tvrtko Ursulin 
Signed-off-by: Akash Goel 
---
 tools/Makefile.sources   |   1 +
 tools/intel_guc_logger.c | 425 +++
 2 files changed, 426 insertions(+)
 create mode 100644 tools/intel_guc_logger.c

diff --git a/tools/Makefile.sources b/tools/Makefile.sources
index 2bb6c8e..be58871 100644
--- a/tools/Makefile.sources
+++ b/tools/Makefile.sources
@@ -19,6 +19,7 @@ tools_prog_lists =\
intel_gpu_time  \
intel_gpu_top   \
intel_gtt   \
+   intel_guc_logger\
intel_infoframes\
intel_l3_parity \
intel_lid   \
diff --git a/tools/intel_guc_logger.c b/tools/intel_guc_logger.c
new file mode 100644
index 000..6e03282
--- /dev/null
+++ b/tools/intel_guc_logger.c
@@ -0,0 +1,425 @@
+
+#define _GNU_SOURCE  /* For using O_DIRECT */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "igt.h"
+
+#define MB(x) ((uint64_t)(x) * 1024 * 1024)
+#ifndef PAGE_SIZE
+  #define PAGE_SIZE 4096
+#endif
+#define SUBBUF_SIZE (19*PAGE_SIZE)
+/* Need large buffering from logger side to hide the DISK IO latency, Driver
+ * can only store 8 snapshots of GuC log buffer in relay.
+ */
+#define NUM_SUBBUFS 100
+
+#define RELAY_FILE_NAME  "guc_log"
+#define DEFAULT_OUTPUT_FILE_NAME  "guc_log_dump.dat"
+#define CONTROL_FILE_NAME "i915_guc_log_control"
+
+char *read_buffer;
+char *out_filename;
+int poll_timeout = 2; /* by default 2ms timeout */
+pthread_mutex_t mutex;
+pthread_t flush_thread;
+int verbosity_level = 3; /* by default capture logs at max verbosity */
+uint32_t produced, consumed;
+uint64_t total_bytes_written;
+int num_buffers = NUM_SUBBUFS;
+int relay_fd, outfile_fd = -1;
+bool stop_logging, discard_oldlogs;
+uint32_t test_duration, max_filesize;
+pthread_cond_t underflow_cond, overflow_cond;
+
+static void guc_log_control(bool enable_logging)
+{
+   int control_fd;
+   char data[19];
+   uint64_t val;
+   int ret;
+
+   control_fd = igt_debugfs_open(CONTROL_FILE_NAME, O_WRONLY);
+   igt_assert_f(control_fd >= 0, "couldn't open the guc log control file");
+
+   val = enable_logging ? ((verbosity_level << 4) | 0x1) : 0;
+
+   ret = snprintf(data, sizeof(data), "0x%" PRIx64, val);
+   igt_assert(ret > 2 && ret < sizeof(data));
+
+   ret = write(control_fd, data, ret);
+   igt_assert_f(ret > 0, "couldn't write to the log control file");
+
+   close(control_fd);
+}
+
+static void int_sig_handler(int sig)
+{
+   igt_info("received signal %d\n", sig);
+
+   stop_logging = true;
+}
+
+static void pull_leftover_data(void)
+{
+   unsigned int bytes_read = 0;
+   int ret;
+
+   do {
+   /* Read the logs from relay buffer */
+   ret = read(relay_fd, read_buffer, SUBBUF_SIZE);
+   if (!ret)
+   break;
+
+   igt_assert_f(ret > 0, "failed to read from the guc log file");
+   igt_assert_f(ret == SUBBUF_SIZE, "invalid read from relay file");
+
+   bytes_read += ret;
+
+   if (outfile_fd >= 0) {
+   ret = write(outfile_fd, read_buffer, SUBBUF_SIZE);
+   igt_assert_f(ret == SUBBUF_SIZE, "couldn't dump the logs in a file"

[Intel-gfx] [PATCH] tools/intel_guc_logger: Utility for capturing GuC firmware logs in a file

2016-09-06 Thread akash . goel
From: Akash Goel 

This patch provides a test utility which helps capture GuC firmware logs and
then dump them to a file.
The logs are pulled from a debugfs file '/sys/kernel/debug/dri/guc_log' and
stored into a file '/tmp/guc_log_dump.dat'; the name of the output file can
be changed through a command line argument.

The utility goes into an infinite loop where it waits for the arrival of new
logs, and as soon as a new set of logs is produced it captures them in its
local buffer, which is then flushed out to the file on disk.
Whenever logging needs to be ended, User can stop this utility (CTRL+C).

Before entering the loop, it first discards whatever logs are present in
the debugfs file.
This way User can first launch this utility and then start a workload/activity
for which GuC firmware logs are to be actually captured, and keep running the
utility for as long as it's needed; once the workload is over, this utility
can be forcefully stopped.

If the logging wasn't enabled on GuC side by the Driver at boot time, utility
will first enable the logging and later on when it is stopped (CTRL+C) it will
also pause the logging on GuC side.

Signed-off-by: Akash Goel 
---
 tools/Makefile.sources   |   1 +
 tools/intel_guc_logger.c | 441 +++
 2 files changed, 442 insertions(+)
 create mode 100644 tools/intel_guc_logger.c

diff --git a/tools/Makefile.sources b/tools/Makefile.sources
index 2bb6c8e..be58871 100644
--- a/tools/Makefile.sources
+++ b/tools/Makefile.sources
@@ -19,6 +19,7 @@ tools_prog_lists =\
intel_gpu_time  \
intel_gpu_top   \
intel_gtt   \
+   intel_guc_logger\
intel_infoframes\
intel_l3_parity \
intel_lid   \
diff --git a/tools/intel_guc_logger.c b/tools/intel_guc_logger.c
new file mode 100644
index 000..92172fa
--- /dev/null
+++ b/tools/intel_guc_logger.c
@@ -0,0 +1,441 @@
+
+#define _GNU_SOURCE  /* For using O_DIRECT */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "igt.h"
+
+#define MB(x) ((uint64_t)(x) * 1024 * 1024)
+#ifndef PAGE_SIZE
+  #define PAGE_SIZE 4096
+#endif
+#define SUBBUF_SIZE (19*PAGE_SIZE)
+/* Need large buffering from logger side to hide the DISK IO latency, Driver
+ * can only store 8 snapshots of GuC log buffer in relay.
+ */
+#define NUM_SUBBUFS 100
+
+#define RELAY_FILE_NAME  "guc_log"
+#define CONTROL_FILE_NAME "i915_guc_log_control"
+
+char *read_buffer;
+char *out_filename;
+int poll_timeout = 2; /* by default 2ms timeout */
+pthread_mutex_t mutex;
+pthread_t flush_thread;
+int verbosity_level = 3; /* by default capture logs at max verbosity */
+uint32_t produced, consumed;
+uint64_t total_bytes_written;
+int num_buffers = NUM_SUBBUFS;
+int relay_fd, outfile_fd = -1;
+bool stop_logging, discard_oldlogs;
+uint32_t test_duration, max_filesize;
+pthread_cond_t underflow_cond, overflow_cond;
+
+static void guc_log_control(bool enable_logging)
+{
+   int control_fd;
+   char data[19];
+   uint64_t val;
+   int ret;
+
+   control_fd = igt_debugfs_open(CONTROL_FILE_NAME, O_WRONLY);
+   if (control_fd < 0)
+   igt_assert_f(0, "Couldn't open the guc log control file");
+
+   val = enable_logging ? ((verbosity_level << 4) | 0x1) : 0;
+
+   snprintf(data, sizeof(data), "0x%" PRIx64, val);
+   ret = write(control_fd, data, strlen(data) + 1);
+   if (ret < 0)
+   igt_assert_f(0, "Couldn't write to the log control file");
+
+   close(control_fd);
+}
+
+static void int_sig_handler(int sig)
+{
+   igt_info("Received signal %d\n", sig);
+
+   stop_logging = true;
+}
+
+static void pull_leftover_data(void)
+{
+   unsigned int bytes_read = 0;
+   int ret;
+
+   while (1) {
+   /* Read the logs from relay buffer */
+   ret = read(relay_fd, read_buffer, SUBBUF_SIZE);
+   if (!ret)
+   break;
+   else if (ret < 0)
+   igt_assert_f(0, "Failed to read from the guc log file");
+   else if (ret < SUBBUF_SIZE)
+   igt_assert_f(0, "invalid read from relay file");
+
+   bytes_read += ret;
+
+   if (outfile_fd > 0) {
+   ret = write(outfile_fd, read_buffer, SUBBUF_SIZE);
+   if (ret < 0)
+   igt_assert_f(0, "Couldn't dump the logs in a file");
+   total_bytes_written += ret;
+   }
+   };
+
+   igt_debug("%u bytes flushed\n", bytes_read);
+}
+
+static int num_filled_bu

[Intel-gfx] [PATCH 15/19] drm/i915: Debugfs support for GuC logging control

2016-08-19 Thread akash . goel
From: Sagar Arun Kamble 

This patch provides debugfs interface i915_guc_log_control for
on the fly enabling/disabling of logging in GuC firmware and controlling
the verbosity level of logs.
The value written to the file should have bit 0 set to enable logging and
bits 4-7 should contain the verbosity info.
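
The layout of the control value can be illustrated with a few helpers. They
are a standalone sketch with hypothetical names (SKETCH_CTRL_*,
sketch_log_control_*), following only the bit positions stated above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_CTRL_ENABLE_BIT       0x1u  /* bit 0 */
#define SKETCH_CTRL_VERBOSITY_SHIFT  4     /* bits 4-7 */
#define SKETCH_CTRL_VERBOSITY_MASK   0xfu

/* Build the value to be written to the control file. */
static uint64_t sketch_log_control_encode(bool enable, unsigned int verbosity)
{
	assert(verbosity <= SKETCH_CTRL_VERBOSITY_MASK);

	if (!enable)
		return 0;

	return ((uint64_t)verbosity << SKETCH_CTRL_VERBOSITY_SHIFT) |
	       SKETCH_CTRL_ENABLE_BIT;
}

static bool sketch_log_control_enabled(uint64_t val)
{
	return (val & SKETCH_CTRL_ENABLE_BIT) != 0;
}

static unsigned int sketch_log_control_verbosity(uint64_t val)
{
	return (unsigned int)((val >> SKETCH_CTRL_VERBOSITY_SHIFT) &
			      SKETCH_CTRL_VERBOSITY_MASK);
}
```

For example, enabling logging at verbosity 3 gives 0x31, and writing 0
disables logging.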

v2: Add a forceful flush, to collect left over logs, on disabling logging.
Useful for Validation.

v3: Besides minor cleanup, implement read method for the debugfs file and
set the guc_log_level to -1 when logging is disabled. (Tvrtko)

v4: Minor cleanup & rebase. (Tvrtko)

Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_debugfs.c| 44 -
 drivers/gpu/drm/i915/i915_guc_submission.c | 62 ++
 drivers/gpu/drm/i915/intel_guc.h   |  1 +
 3 files changed, 106 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index aaa51bc..bfc22bf 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2626,6 +2626,47 @@ static int i915_guc_log_dump(struct seq_file *m, void *data)
return 0;
 }
 
+static int i915_guc_log_control_get(void *data, u64 *val)
+{
+   struct drm_device *dev = data;
+   struct drm_i915_private *dev_priv = to_i915(dev);
+
+   if (!dev_priv->guc.log.vma)
+   return -EINVAL;
+
+   *val = i915.guc_log_level;
+
+   return 0;
+}
+
+static int i915_guc_log_control_set(void *data, u64 val)
+{
+   struct drm_device *dev = data;
+   struct drm_i915_private *dev_priv = to_i915(dev);
+   int ret;
+
+   ret = mutex_lock_interruptible(&dev->struct_mutex);
+   if (ret)
+   return ret;
+
+   if (!dev_priv->guc.log.vma) {
+   ret = -EINVAL;
+   goto end;
+   }
+
+   intel_runtime_pm_get(dev_priv);
+   ret = i915_guc_log_control(dev_priv, val);
+   intel_runtime_pm_put(dev_priv);
+
+end:
+   mutex_unlock(&dev->struct_mutex);
+   return ret;
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(i915_guc_log_control_fops,
+   i915_guc_log_control_get, i915_guc_log_control_set,
+   "%lld\n");
+
 static int i915_edp_psr_status(struct seq_file *m, void *data)
 {
struct drm_info_node *node = m->private;
@@ -5436,7 +5477,8 @@ static const struct i915_debugfs_files {
{"i915_fbc_false_color", &i915_fbc_fc_fops},
{"i915_dp_test_data", &i915_displayport_test_data_fops},
{"i915_dp_test_type", &i915_displayport_test_type_fops},
-   {"i915_dp_test_active", &i915_displayport_test_active_fops}
+   {"i915_dp_test_active", &i915_displayport_test_active_fops},
+   {"i915_guc_log_control", &i915_guc_log_control_fops}
 };
 
 void intel_display_crc_init(struct drm_device *dev)
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 7ce586d..fda14e7 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -195,6 +195,16 @@ static int host2guc_force_logbuffer_flush(struct intel_guc *guc)
return host2guc_action(guc, data, 2);
 }
 
+static int host2guc_logging_control(struct intel_guc *guc, u32 control_val)
+{
+   u32 data[2];
+
+   data[0] = HOST2GUC_ACTION_UK_LOG_ENABLE_LOGGING;
+   data[1] = control_val;
+
+   return host2guc_action(guc, data, 2);
+}
+
 /*
  * Initialise, update, or clear doorbell data shared with the GuC
  *
@@ -1576,3 +1586,55 @@ void i915_guc_register(struct drm_i915_private *dev_priv)
guc_log_late_setup(&dev_priv->guc);
mutex_unlock(&dev_priv->drm.struct_mutex);
 }
+
+int i915_guc_log_control(struct drm_i915_private *dev_priv, u64 control_val)
+{
+   union guc_log_control log_param;
+   int ret;
+
+   log_param.value = control_val;
+
+   if (log_param.verbosity < GUC_LOG_VERBOSITY_MIN ||
+   log_param.verbosity > GUC_LOG_VERBOSITY_MAX)
+   return -EINVAL;
+
+   /* This combination doesn't make sense & won't have any effect */
+   if (!log_param.logging_enabled && (i915.guc_log_level < 0))
+   return 0;
+
+   ret = host2guc_logging_control(&dev_priv->guc, log_param.value);
+   if (ret < 0) {
+   DRM_DEBUG_DRIVER("host2guc action failed %d\n", ret);
+   return ret;
+   }
+
+   i915.guc_log_level = log_param.verbosity;
+
+   /* If log_level was set as -1 at boot time, then the relay channel file
+* wouldn't have been created by now and interrupts also would not have
+* been enabled.
+*/
+   if (!dev_priv->guc.log.relay_chan)

[Intel-gfx] [PATCH 16/19] drm/i915: Use uncached(WC) mapping for accessing the GuC log buffer

2016-08-19 Thread akash . goel
From: Akash Goel 

Host needs to sample the GuC log buffer on every flush interrupt from GuC.
To ensure that we always get up-to-date data from the log buffer, it's
better to access the buffer through an uncached CPU mapping. Also, given the
way the buffer is accessed from the GuC & Host sides, manually flushing the
cache may not always be effective if a cached CPU mapping is used.
Uncached reads may carry some performance cost, but the reliability of the
data will be ensured.

v2: Rebase.

v3: Rebase.

v4: Rebase.

Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index fda14e7..7c889e1 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -1138,8 +1138,6 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
 
src_data += buffer_size;
dst_data += buffer_size;
-
-   /* FIXME: invalidate/flush for log buffer needed */
}
 
if (log_buf_snapshot_state)
@@ -1202,8 +1200,11 @@ static int guc_create_log_extras(struct intel_guc *guc)
return 0;
 
if (!guc->log.buf_addr) {
-   /* Create a vmalloc mapping of log buffer pages */
-   vaddr = i915_gem_object_pin_map(guc->log.vma->obj, I915_MAP_WB);
+   /* Create a WC (Uncached for read) vmalloc mapping of log
+* buffer pages, so that we can directly get the data
+* (up-to-date) from memory.
+*/
+   vaddr = i915_gem_object_pin_map(guc->log.vma->obj, I915_MAP_WC);
if (IS_ERR(vaddr)) {
ret = PTR_ERR(vaddr);
DRM_ERROR("Couldn't map log buffer pages %d\n", ret);
-- 
1.9.2

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 07/19] relay: Use per CPU constructs for the relay channel buffer pointers

2016-08-19 Thread akash . goel
From: Akash Goel 

relay essentially needs to maintain a per-CPU array of channel buffer
pointers, but it creates that array manually.
Instead, it's better to use the per-CPU constructs provided by the
kernel to allocate & access the array of pointers to channel buffers.

v2: Include  in relay.h so that it pulls in the percpu
api explicitly. (Chris)

Cc: Chris Wilson 
Cc: Tvrtko Ursulin 
Signed-off-by: Akash Goel 
Reviewed-by: Chris Wilson 
---
 include/linux/relay.h | 17 +++-
 kernel/relay.c| 74 +--
 2 files changed, 52 insertions(+), 39 deletions(-)

diff --git a/include/linux/relay.h b/include/linux/relay.h
index d7c8359..eb295e3 100644
--- a/include/linux/relay.h
+++ b/include/linux/relay.h
@@ -19,6 +19,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /*
  * Tracks changes to rchan/rchan_buf structs
@@ -63,7 +64,7 @@ struct rchan
struct kref kref;   /* channel refcount */
void *private_data; /* for user-defined data */
size_t last_toobig; /* tried to log event > subbuf size */
-   struct rchan_buf *buf[NR_CPUS]; /* per-cpu channel buffers */
+   struct rchan_buf ** __percpu buf; /* per-cpu channel buffers */
int is_global;  /* One global buffer ? */
struct list_head list;  /* for channel list */
struct dentry *parent;  /* parent dentry passed to open */
@@ -204,7 +205,7 @@ static inline void relay_write(struct rchan *chan,
struct rchan_buf *buf;
 
local_irq_save(flags);
-   buf = chan->buf[smp_processor_id()];
+   buf = *this_cpu_ptr(chan->buf);
if (unlikely(buf->offset + length > chan->subbuf_size))
length = relay_switch_subbuf(buf, length);
memcpy(buf->data + buf->offset, data, length);
@@ -230,12 +231,12 @@ static inline void __relay_write(struct rchan *chan,
 {
struct rchan_buf *buf;
 
-   buf = chan->buf[get_cpu()];
+   buf = *get_cpu_ptr(chan->buf);
if (unlikely(buf->offset + length > buf->chan->subbuf_size))
length = relay_switch_subbuf(buf, length);
memcpy(buf->data + buf->offset, data, length);
buf->offset += length;
-   put_cpu();
+   put_cpu_ptr(chan->buf);
 }
 
 /**
@@ -251,17 +252,19 @@ static inline void __relay_write(struct rchan *chan,
  */
 static inline void *relay_reserve(struct rchan *chan, size_t length)
 {
-   void *reserved;
-   struct rchan_buf *buf = chan->buf[smp_processor_id()];
+   void *reserved = NULL;
+   struct rchan_buf *buf = *get_cpu_ptr(chan->buf);
 
if (unlikely(buf->offset + length > buf->chan->subbuf_size)) {
length = relay_switch_subbuf(buf, length);
if (!length)
-   return NULL;
+   goto end;
}
reserved = buf->data + buf->offset;
buf->offset += length;
 
+end:
+   put_cpu_ptr(chan->buf);
return reserved;
 }
 
diff --git a/kernel/relay.c b/kernel/relay.c
index d797502..f55ab82 100644
--- a/kernel/relay.c
+++ b/kernel/relay.c
@@ -214,7 +214,7 @@ static void relay_destroy_buf(struct rchan_buf *buf)
__free_page(buf->page_array[i]);
relay_free_page_array(buf->page_array);
}
-   chan->buf[buf->cpu] = NULL;
+   *per_cpu_ptr(chan->buf, buf->cpu) = NULL;
kfree(buf->padding);
kfree(buf);
kref_put(&chan->kref, relay_destroy_channel);
@@ -382,20 +382,21 @@ static void __relay_reset(struct rchan_buf *buf, unsigned int init)
  */
 void relay_reset(struct rchan *chan)
 {
+   struct rchan_buf *buf;
unsigned int i;
 
if (!chan)
return;
 
-   if (chan->is_global && chan->buf[0]) {
-   __relay_reset(chan->buf[0], 0);
+   if (chan->is_global && (buf = *per_cpu_ptr(chan->buf, 0))) {
+   __relay_reset(buf, 0);
return;
}
 
mutex_lock(&relay_channels_mutex);
for_each_possible_cpu(i)
-   if (chan->buf[i])
-   __relay_reset(chan->buf[i], 0);
+   if ((buf = *per_cpu_ptr(chan->buf, i)))
+   __relay_reset(buf, 0);
mutex_unlock(&relay_channels_mutex);
 }
 EXPORT_SYMBOL_GPL(relay_reset);
@@ -440,7 +441,7 @@ static struct rchan_buf *relay_open_buf(struct rchan *chan, unsigned int cpu)
struct dentry *dentry;
 
if (chan->is_global)
-   return chan->buf[0];
+   return *per_cpu_ptr(chan->buf, 0);
 
buf = relay_create_buf(chan);
if (!buf)
@@ -464,7 +465,7 @@ static struct rchan_buf *relay_open_buf(struct rchan *chan, unsigned int cpu)
__relay_reset(buf, 1);
 
   

[Intel-gfx] [PATCH 17/19] drm/i915: Use SSE4.1 movntdqa based memcpy for sampling GuC log buffer

2016-08-19 Thread akash . goel
From: Akash Goel 

In order to have fast reads from the GuC log buffer, use the SSE4.1 movntdqa
based memcpy function i915_memcpy_from_wc.
The GuC log buffer has a WC type vmalloc mapping, and copying using movntdqa
from WC type memory is almost as fast as reading from WB memory.
This will further reduce the log buffer sampling time, which is badly needed
to deal with the flush interrupt storm when GuC is generating logs at a
very high rate.
Ideally SSE 4.1 should be present on all chipsets supporting GuC based
submissions, but if not then logging will not be enabled.

v2: Rebase.

Suggested-by: Chris Wilson 
Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 16 +---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 7c889e1..c214a7c 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -1128,13 +1128,13 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
 
/* Just copy the newly written data */
if (read_offset > write_offset) {
-   memcpy(dst_data, src_data, write_offset);
+   i915_memcpy_from_wc(dst_data, src_data, write_offset);
bytes_to_copy = buffer_size - read_offset;
} else {
bytes_to_copy = write_offset - read_offset;
}
-   memcpy(dst_data + read_offset,
-  src_data + read_offset, bytes_to_copy);
+   i915_memcpy_from_wc(dst_data + read_offset,
+   src_data + read_offset, bytes_to_copy);
 
src_data += buffer_size;
dst_data += buffer_size;
@@ -1247,6 +1247,16 @@ static void guc_create_log(struct intel_guc *guc)
 
vma = guc->log.vma;
if (!vma) {
+   /* We require SSE 4.1 for fast reads from the GuC log buffer and
+* it should be present on the chipsets supporting GuC based
+* submissions.
+*/
+   if (WARN_ON(!i915_memcpy_from_wc(NULL, NULL, 0))) {
+   /* logging will not be enabled */
+   i915.guc_log_level = -1;
+   return;
+   }
+
vma = guc_allocate_vma(guc, size);
if (IS_ERR(vma)) {
/* logging will be off */
-- 
1.9.2


[Intel-gfx] [PATCH 14/19] drm/i915: Forcefully flush GuC log buffer on reset

2016-08-19 Thread akash . goel
From: Sagar Arun Kamble 

Before capturing the GuC logs as part of the error state, a force log
buffer flush action should be sent to GuC, prior to proceeding with the GPU
reset and re-initializing GuC. There could be some data in the log buffer
which is yet to be captured, and those logs would be particularly useful for
understanding why the GPU reset was initiated.

v2:
- Avoid the wait via flush_work, to serialize against an ongoing log
  buffer flush, from the error state capture path. (Chris)
- Rebase.

Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_gpu_error.c  |  2 ++
 drivers/gpu/drm/i915/i915_guc_submission.c | 30 ++
 drivers/gpu/drm/i915/intel_guc.h   |  1 +
 3 files changed, 33 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 94297aa..b73c671 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1301,6 +1301,8 @@ static void i915_gem_capture_guc_log_buffer(struct drm_i915_private *dev_priv,
if (!dev_priv->guc.log.vma || (i915.guc_log_level < 0))
return;
 
+   i915_guc_flush_logs(dev_priv, false);
+
error->guc_log = i915_error_object_create(dev_priv,
  dev_priv->guc.log.vma);
 }
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 27b4047..7ce586d 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -185,6 +185,16 @@ static int host2guc_logbuffer_flush_complete(struct intel_guc *guc)
return host2guc_action(guc, data, 1);
 }
 
+static int host2guc_force_logbuffer_flush(struct intel_guc *guc)
+{
+   u32 data[2];
+
+   data[0] = HOST2GUC_ACTION_FORCE_LOG_BUFFER_FLUSH;
+   data[1] = 0;
+
+   return host2guc_action(guc, data, 2);
+}
+
 /*
  * Initialise, update, or clear doorbell data shared with the GuC
  *
@@ -1527,6 +1537,26 @@ void i915_guc_capture_logs(struct drm_i915_private *dev_priv)
intel_runtime_pm_put(dev_priv);
 }
 
+void i915_guc_flush_logs(struct drm_i915_private *dev_priv, bool can_wait)
+{
+   if (!i915.enable_guc_submission || (i915.guc_log_level < 0))
+   return;
+
+   /* First disable the interrupts, will be re-enabled afterwards */
+   gen9_disable_guc_interrupts(dev_priv);
+
+   /* Before initiating the forceful flush, wait for any pending/ongoing
+* flush to complete otherwise forceful flush may not happen, but wait
+* can't be done for some paths like error state capture in which case
+* take a chance & directly attempt the forceful flush.
+*/
+   if (can_wait)
+   flush_work(&dev_priv->guc.log.flush_work);
+
+   /* Ask GuC to update the log buffer state */
+   host2guc_force_logbuffer_flush(&dev_priv->guc);
+}
+
 void i915_guc_unregister(struct drm_i915_private *dev_priv)
 {
if (!i915.enable_guc_submission)
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index 8598f38..d7eda42 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -182,6 +182,7 @@ int i915_guc_wq_check_space(struct drm_i915_gem_request *rq);
 void i915_guc_submission_disable(struct drm_i915_private *dev_priv);
 void i915_guc_submission_fini(struct drm_i915_private *dev_priv);
 void i915_guc_capture_logs(struct drm_i915_private *dev_priv);
+void i915_guc_flush_logs(struct drm_i915_private *dev_priv, bool can_wait);
 void i915_guc_register(struct drm_i915_private *dev_priv);
 void i915_guc_unregister(struct drm_i915_private *dev_priv);
 
-- 
1.9.2


[Intel-gfx] [PATCH 11/19] drm/i915: Optimization to reduce the sampling time of GuC log buffer

2016-08-19 Thread akash . goel
From: Akash Goel 

GuC firmware sends an interrupt to flush the log buffer when it becomes
half full, so the Driver doesn't need to sample the complete buffer
and can copy just the data newly written by GuC into the local
buffer, i.e. as per the read & write pointer values.
Moreover, the flush interrupt would generally come for one type of log
buffer, when it becomes half full, so at that time the other 2 types of
log buffer would have comparatively much less unread data in them.
In case of an overflow reported by GuC, the Driver does need to copy the
entire buffer, as the whole buffer would contain unread data.
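
The copy strategy described above can be sketched in plain userspace C as
follows. This is only an illustration under stated assumptions: the helper
name copy_unread() and its signature are hypothetical, and the snapshot is
assumed to be the same size as the log buffer so that each byte keeps its
original offset.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy only the unread part of a circular log buffer into a same-sized
 * snapshot, keeping every byte at its original offset. Returns the
 * number of bytes copied. Hypothetical helper, for illustration only. */
static size_t copy_unread(unsigned char *dst, const unsigned char *src,
                          size_t size, size_t read_off, size_t write_off,
                          int overflow)
{
    size_t copied = 0;

    /* On overflow, or if the offsets are implausible, take everything. */
    if (overflow || read_off > size || write_off > size) {
        read_off = 0;
        write_off = size;
    }

    if (read_off > write_off) {
        /* Wrapped: unread data is [0, write_off) plus [read_off, size). */
        memcpy(dst, src, write_off);
        copied += write_off;
        write_off = size;
    }

    /* Copy the (remaining) contiguous region [read_off, write_off). */
    memcpy(dst + read_off, src + read_off, write_off - read_off);
    copied += write_off - read_off;

    return copied;
}
```

In the common non-wrapped case only write_off - read_off bytes are touched,
which is where the saving over copying the full buffer comes from.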

v2: Rebase.

v3: Fix the blooper of doing the copy twice. (Tvrtko)

v4: Add curlies for 'else' case also, matching the 'if'. (Tvrtko)

Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 28 +---
 1 file changed, 25 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index d416fae..27b4047 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -1037,11 +1037,12 @@ static unsigned int guc_get_log_buffer_size(enum guc_log_buffer_type type)
 
 static void guc_read_update_log_buffer(struct intel_guc *guc)
 {
+   unsigned int buffer_size, read_offset, write_offset, bytes_to_copy, full_cnt;
struct guc_log_buffer_state *log_buf_state, *log_buf_snapshot_state;
struct guc_log_buffer_state log_buf_state_local;
-   unsigned int buffer_size, write_offset, full_cnt;
enum guc_log_buffer_type type;
void *src_data, *dst_data;
+   bool new_overflow;
 
if (WARN_ON(!guc->log.buf_addr))
return;
@@ -1064,12 +1065,13 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
memcpy(&log_buf_state_local, log_buf_state,
   sizeof(struct guc_log_buffer_state));
buffer_size = guc_get_log_buffer_size(type);
+   read_offset = log_buf_state_local.read_ptr;
write_offset = log_buf_state_local.sampled_write_ptr;
full_cnt = log_buf_state_local.buffer_full_cnt;
 
/* Bookkeeping stuff */
guc->log.flush_count[type] += log_buf_state_local.flush_to_file;
-   guc_check_log_buf_overflow(guc, type, full_cnt);
+   new_overflow = guc_check_log_buf_overflow(guc, type, full_cnt);
 
/* Update the state of shared log buffer */
log_buf_state->read_ptr = write_offset;
@@ -1092,7 +1094,27 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
log_buf_snapshot_state++;
 
/* Now copy the actual logs. */
-   memcpy(dst_data, src_data, buffer_size);
+   if (unlikely(new_overflow)) {
+   /* copy the whole buffer in case of overflow */
+   read_offset = 0;
+   write_offset = buffer_size;
+   } else if (unlikely((read_offset > buffer_size) ||
+   (write_offset > buffer_size))) {
+   DRM_ERROR("invalid log buffer state\n");
+   /* copy whole buffer as offsets are unreliable */
+   read_offset = 0;
+   write_offset = buffer_size;
+   }
+
+   /* Just copy the newly written data */
+   if (read_offset > write_offset) {
+   memcpy(dst_data, src_data, write_offset);
+   bytes_to_copy = buffer_size - read_offset;
+   } else {
+   bytes_to_copy = write_offset - read_offset;
+   }
+   memcpy(dst_data + read_offset,
+  src_data + read_offset, bytes_to_copy);
 
src_data += buffer_size;
dst_data += buffer_size;
-- 
1.9.2


[Intel-gfx] [PATCH 18/19] drm/i915: Early creation of relay channel for capturing boot time logs

2016-08-19 Thread akash . goel
From: Akash Goel 

As per the current i915 Driver load sequence, debugfs registration is done
at the end, so the relay channel debugfs file is also created only then,
but the GuC firmware is loaded much earlier in the sequence.
As a result the Driver could miss capturing the boot-time logs of the GuC
firmware if there are flush interrupts from the GuC side.
Relay has a provision to support early logging, where initially only the
relay channel is created, to have buffers for storing logs, and later on the
channel is associated with a debugfs file at an appropriate time.
This patch uses that provision, which allows the Driver to capture boot-time
logs as well; they can be collected once Userspace comes up.

v2:
- Remove the couple of FIXMEs, as now the relay channel will be created
  early before enabling the flush interrupts, so no possibility of relay
  channel pointer being modified & read at the same time from 2 different
  execution contexts.
- Rebase.

v3:
- Add a comment to justify setting 'is_global' before the NULL check on the
  parent directory dentry pointer.

Suggested-by: Chris Wilson 
Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 73 --
 1 file changed, 50 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index c214a7c..d4a1c84 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -896,15 +896,16 @@ static struct dentry *create_buf_file_callback(const char *filename,
 {
struct dentry *buf_file;
 
-   if (!parent)
-   return NULL;
-
/* This to enable the use of a single buffer for the relay channel and
 * correspondingly have a single file exposed to User, through which
 * it can collect the logs in order without any post-processing.
+* Need to set 'is_global' even if parent is NULL for early logging.
 */
*is_global = 1;
 
+   if (!parent)
+   return NULL;
+
/* Not using the channel filename passed as an argument, since for each
 * channel relay appends the corresponding CPU number to the filename
 * passed in relay_open(). This should be fine as relay just needs a
@@ -937,13 +938,39 @@ static void guc_remove_log_relay_file(struct intel_guc *guc)
relay_close(guc->log.relay_chan);
 }
 
-static int guc_create_log_relay_file(struct intel_guc *guc)
+static int guc_create_relay_channel(struct intel_guc *guc)
 {
struct drm_i915_private *dev_priv = guc_to_i915(guc);
struct rchan *guc_log_relay_chan;
-   struct dentry *log_dir;
size_t n_subbufs, subbuf_size;
 
+   /* Keep the size of sub buffers same as shared log buffer */
+   subbuf_size = guc->log.vma->obj->base.size;
+
+   /* Store up to 8 snapshots, which is large enough to buffer sufficient
+* boot time logs and provides enough leeway to User, in terms of
+* latency, for consuming the logs from relay. Also doesn't take
+* up too much memory.
+*/
+   n_subbufs = 8;
+
+   guc_log_relay_chan = relay_open(NULL, NULL, subbuf_size,
+   n_subbufs, &relay_callbacks, dev_priv);
+   if (!guc_log_relay_chan) {
+   DRM_ERROR("Couldn't create relay chan for GuC logging\n");
+   return -ENOMEM;
+   }
+
+   guc->log.relay_chan = guc_log_relay_chan;
+   return 0;
+}
+
+static int guc_create_log_relay_file(struct intel_guc *guc)
+{
+   struct drm_i915_private *dev_priv = guc_to_i915(guc);
+   struct dentry *log_dir;
+   int ret;
+
/* For now create the log file in /sys/kernel/debug/dri/0 dir */
log_dir = dev_priv->drm.primary->debugfs_root;
 
@@ -963,25 +990,12 @@ static int guc_create_log_relay_file(struct intel_guc *guc)
return -ENODEV;
}
 
-   /* Keep the size of sub buffers same as shared log buffer */
-   subbuf_size = guc->log.vma->obj->base.size;
-
-   /* Store up to 8 snapshots, which is large enough to buffer sufficient
-* boot time logs and provides enough leeway to User, in terms of
-* latency, for consuming the logs from relay. Also doesn't take
-* up too much memory.
-*/
-   n_subbufs = 8;
-
-   guc_log_relay_chan = relay_open("guc_log", log_dir, subbuf_size,
-   n_subbufs, &relay_callbacks, dev_priv);
-   if (!guc_log_relay_chan) {
-   DRM_ERROR("Couldn't create relay chan for GuC logging\n");
-   return -ENOMEM;
+   ret = relay_late_setup_files(guc->log.relay_chan, "guc_log", log_dir);
+   if (ret) {
+   DRM_ERROR("Couldn't associate relay chan with fil

[Intel-gfx] [PATCH 10/19] drm/i915: Add stats for GuC log buffer flush interrupts

2016-08-19 Thread akash . goel
From: Akash Goel 

GuC firmware sends an interrupt to flush the log buffer when it
becomes half full. GuC firmware also tracks how many times the
buffer overflowed.
It would be useful to maintain statistics of how many flush
interrupts were received and for which type of log buffer,
along with the overflow count of each buffer type.
Augmented the i915_guc_info debugfs file to report these statistics.
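
For reference, the overflow accounting can be illustrated with a small
standalone sketch. It assumes, as the code comment in the patch below
states, that the firmware's buffer_full_cnt is a 4 bit counter, so it wraps
at 16. The helper name overflow_delta() is hypothetical and this is not the
driver's implementation.

```c
#include <assert.h>

/* Assumption: the firmware overflow counter is 4 bits wide (wraps at 16). */
#define FULL_CNT_MODULUS 16u

/* Number of new overflows between two samples of the wrapping counter.
 * Unsigned subtraction wraps modulo 2^32, which is a multiple of 16, so
 * reducing the difference modulo 16 also handles the wrapped case. */
static unsigned int overflow_delta(unsigned int prev_cnt, unsigned int cur_cnt)
{
    return (cur_cnt - prev_cnt) % FULL_CNT_MODULUS;
}
```

A counter going from 14 to 1 yields a delta of 3. A run of 16 or more
overflows between two samples cannot be told apart from a smaller one,
which is an inherent limit of a 4 bit counter.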

v2:
- Update the logic to detect multiple overflows between the 2
  flush interrupts and also log a message for overflow (Tvrtko)
- Track the number of times there was no free sub buffer to capture
  the GuC log buffer. (Tvrtko)

v3:
- Fix the printf field width for overflow counter, set it to 10 as per the
  max value of u32, which takes 10 digits in decimal form. (Tvrtko)

v4:
- Move the log buffer overflow handling to a new function for better
  readability. (Tvrtko)

Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_debugfs.c| 28 +++
 drivers/gpu/drm/i915/i915_guc_submission.c | 31 +-
 drivers/gpu/drm/i915/i915_irq.c|  2 ++
 drivers/gpu/drm/i915/intel_guc.h   |  7 +++
 4 files changed, 67 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index b2cfcf4..aaa51bc 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2491,6 +2491,32 @@ static int i915_guc_load_status_info(struct seq_file *m, void *data)
return 0;
 }
 
+static void i915_guc_log_info(struct seq_file *m,
+ struct drm_i915_private *dev_priv)
+{
+   struct intel_guc *guc = &dev_priv->guc;
+
+   seq_puts(m, "\nGuC logging stats:\n");
+
+   seq_printf(m, "\tISR:   flush count %10u, overflow count %10u\n",
+  guc->log.flush_count[GUC_ISR_LOG_BUFFER],
+  guc->log.total_overflow_count[GUC_ISR_LOG_BUFFER]);
+
+   seq_printf(m, "\tDPC:   flush count %10u, overflow count %10u\n",
+  guc->log.flush_count[GUC_DPC_LOG_BUFFER],
+  guc->log.total_overflow_count[GUC_DPC_LOG_BUFFER]);
+
+   seq_printf(m, "\tCRASH: flush count %10u, overflow count %10u\n",
+  guc->log.flush_count[GUC_CRASH_DUMP_LOG_BUFFER],
+  guc->log.total_overflow_count[GUC_CRASH_DUMP_LOG_BUFFER]);
+
+   seq_printf(m, "\tTotal flush interrupt count: %u\n",
+  guc->log.flush_interrupt_count);
+
+   seq_printf(m, "\tCapture miss count: %u\n",
+  guc->log.capture_miss_count);
+}
+
 static void i915_guc_client_info(struct seq_file *m,
 struct drm_i915_private *dev_priv,
 struct i915_guc_client *client)
@@ -2565,6 +2591,8 @@ static int i915_guc_info(struct seq_file *m, void *data)
seq_printf(m, "\nGuC execbuf client @ %p:\n", guc.execbuf_client);
i915_guc_client_info(m, dev_priv, &client);
 
+   i915_guc_log_info(m, dev_priv);
+
/* Add more as required ... */
 
return 0;
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 6f945ef..d416fae 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -996,6 +996,29 @@ static void *guc_get_write_buffer(struct intel_guc *guc)
return relay_reserve(guc->log.relay_chan, 0);
 }
 
+static bool
+guc_check_log_buf_overflow(struct intel_guc *guc,
+  enum guc_log_buffer_type type, unsigned int full_cnt)
+{
+   unsigned int prev_full_cnt = guc->log.prev_overflow_count[type];
+   bool overflow = false;
+
+   if (full_cnt != prev_full_cnt) {
+   overflow = true;
+
+   guc->log.prev_overflow_count[type] = full_cnt;
+   guc->log.total_overflow_count[type] += full_cnt - prev_full_cnt;
+
+   if (full_cnt < prev_full_cnt) {
+   /* buffer_full_cnt is a 4 bit counter */
+   guc->log.total_overflow_count[type] += 16;
+   }
+   DRM_ERROR_RATELIMITED("GuC log buffer overflow\n");
+   }
+
+   return overflow;
+}
+
 static unsigned int guc_get_log_buffer_size(enum guc_log_buffer_type type)
 {
switch (type) {
@@ -1016,7 +1039,7 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
 {
struct guc_log_buffer_state *log_buf_state, *log_buf_snapshot_state;
struct guc_log_buffer_state log_buf_state_local;
-   unsigned int buffer_size, write_offset;
+   unsigned int buffer_size, write_offset, full_cnt;
enum guc_log_buffer_type type;
void *src_data, *dst_data;
 
@@ -1042,6 +1065,11 @@ static void guc_re

[Intel-gfx] [PATCH 19/19] drm/i915: Mark the GuC log buffer flush interrupts handling WQ as freezable

2016-08-19 Thread akash . goel
From: Akash Goel 

The GuC log buffer flush work item has to do a register access to send the
ack to GuC, and this work item, if not synced before suspend, can potentially
get executed after the GFX device is suspended. The work item function uses
rpm get/put calls around the Hw access, which covers the rpm suspend case,
but for system suspend a sync would be required, as the kernel can potentially
schedule work items even after some devices, including GFX, have been
suspended. That sync has to be done only for the system suspend case,
as a sync combined with rpm get/put can cause a deadlock in the rpm suspend
path.
To make the sync effectively a NOOP in the rpm suspend path as well, this work
item could have been queued from the irq handler only when the device is
runtime active, with the device kept active while the work item is pending or
executing. But an interrupt can arrive even after the device is out of use,
so that approach could lead to this work item being missed.

By marking the workqueue dedicated to handling GuC log buffer flush
interrupts as freezable, we don't have to bother about flushing this
work item from the suspend hooks; the pending work item, if any, will be
either executed before the suspend or scheduled later on resume. This way
the handling of the log buffer flush work item can be kept the same between
system suspend & rpm suspend.

Suggested-by: Imre Deak 
Cc: Imre Deak 
Signed-off-by: Akash Goel 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index d4a1c84..bb25404 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -1242,8 +1242,19 @@ static int guc_create_log_extras(struct intel_guc *guc)
 
/* Need a dedicated wq to process log buffer flush interrupts
 * from GuC without much delay so as to avoid any loss of logs.
+*
+* GuC log buffer flush work item has to do register access to
+* send the ack to GuC and this work item, if not synced before
+* suspend, can potentially get executed after the GFX device is
+* suspended.
+* By marking the WQ as freezable, we don't have to bother about
+* flushing of this work item from the suspend hooks, the pending
+* work item if any will be either executed before the suspend
+* or scheduled later on resume. This way the handling of work
+* item can be kept same between system suspend & rpm suspend.
 */
-   guc->log.flush_wq = alloc_ordered_workqueue("i915-guc_log", 0);
+   guc->log.flush_wq =
+   alloc_ordered_workqueue("i915-guc_log", WQ_FREEZABLE);
if (guc->log.flush_wq == NULL) {
DRM_ERROR("Couldn't allocate the wq for GuC logging\n");
return -ENOMEM;
-- 
1.9.2


[Intel-gfx] [PATCH 09/19] drm/i915: New lock to serialize the Host2GuC actions

2016-08-19 Thread akash . goel
From: Akash Goel 

With the addition of new Host2GuC actions related to GuC logging, a lock
is needed to serialize them, as they can execute concurrently
with each other and also with other existing actions.

v2: Use mutex in place of spinlock to serialize, as sleep can happen
while waiting for the action's response from GuC. (Tvrtko)

v3: To conform to the general rules, acquire mutex before taking the
forcewake. (Tvrtko)

Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 3 +++
 drivers/gpu/drm/i915/intel_guc.h   | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 8853b07..6f945ef 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -87,6 +87,7 @@ static int host2guc_action(struct intel_guc *guc, u32 *data, u32 len)
if (WARN_ON(len < 1 || len > 15))
return -EINVAL;
 
+   mutex_lock(&guc->action_lock);
intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
 
dev_priv->guc.action_count += 1;
@@ -127,6 +128,7 @@ static int host2guc_action(struct intel_guc *guc, u32 *data, u32 len)
dev_priv->guc.action_status = status;
 
intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
+   mutex_unlock(&guc->action_lock);
 
return ret;
 }
@@ -1347,6 +1349,7 @@ int i915_guc_submission_init(struct drm_i915_private *dev_priv)
 
guc->ctx_pool_vma = vma;
ida_init(&guc->ctx_ids);
+   mutex_init(&guc->action_lock);
guc_create_log(guc);
guc_create_ads(guc);
 
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index 3299cce..1704495 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -155,6 +155,9 @@ struct intel_guc {
 
uint64_t submissions[I915_NUM_ENGINES];
uint32_t last_seqno[I915_NUM_ENGINES];
+
+   /* To serialize the Host2GuC actions */
+   struct mutex action_lock;
 };
 
 /* intel_guc_loader.c */
-- 
1.9.2



[Intel-gfx] [PATCH 12/19] drm/i915: Increase GuC log buffer size to reduce flush interrupts

2016-08-19 Thread akash . goel
From: Akash Goel 

In cases where GuC generates logs at a very high rate, the rate of flush
interrupts is correspondingly very high.
So far a total of 8 pages was allocated for storing both ISR & DPC logs.
As per the half-full draining protocol followed by GuC, by doubling
the number of pages, the frequency of flush interrupts can be cut down
to almost half, which then helps in reducing the logging overhead.
So now allocating 8 pages apiece for ISR & DPC logs.
This also helps in reducing the output log file size, apart from
reducing the flush interrupt count. With the original settings,
44 KB was needed for one snapshot. With modified settings, 76 KB is
needed for a snapshot which will be equivalent to 2 snapshots of the
original setting. So 12 KB is saved for every 88 KB, compared to the original setting.

Suggested-by: Tvrtko Ursulin 
Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/intel_guc_fwif.h | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h b/drivers/gpu/drm/i915/intel_guc_fwif.h
index 47ef7e8..49dbe25 100644
--- a/drivers/gpu/drm/i915/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
@@ -104,9 +104,9 @@
 #define   GUC_LOG_ALLOC_IN_MEGABYTE(1 << 3)
 #define   GUC_LOG_CRASH_PAGES  1
 #define   GUC_LOG_CRASH_SHIFT  4
-#define   GUC_LOG_DPC_PAGES3
+#define   GUC_LOG_DPC_PAGES7
 #define   GUC_LOG_DPC_SHIFT6
-#define   GUC_LOG_ISR_PAGES3
+#define   GUC_LOG_ISR_PAGES7
 #define   GUC_LOG_ISR_SHIFT9
 #define   GUC_LOG_BUF_ADDR_SHIFT   12
 
@@ -436,9 +436,9 @@ enum guc_log_buffer_type {
  *|   Crash dump state header |
  * Page1  +---+
  *|   ISR logs|
- * Page5  +---+
- *|   DPC logs|
  * Page9  +---+
+ *|   DPC logs|
+ * Page17 +---+
  *| Crash Dump logs   |
  *+---+
  *
-- 
1.9.2
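
The snapshot sizes quoted in the PATCH 12/19 commit message can be
reproduced with a little arithmetic. This is only a sketch, assuming 4 KB
pages and assuming that every *_PAGES define stores (pages - 1), with one
extra page for the state headers as in the layout diagram. The helper names
guc_log_snapshot_kb() and guc_log_print_savings() are made up for
illustration and are not part of the driver.

```c
#include <stdio.h>

#define PAGE_KB              4   /* assumption: 4 KB pages */
#define GUC_LOG_CRASH_PAGES  1   /* from intel_guc_fwif.h, stored as count - 1 */

/*
 * Hypothetical helper: total snapshot size in KB for given DPC/ISR settings.
 * Each *_PAGES value encodes (pages - 1); one extra page holds the state
 * headers (Page0 in the layout diagram).
 */
static unsigned int guc_log_snapshot_kb(unsigned int dpc_pages,
					unsigned int isr_pages)
{
	unsigned int pages = 1 +                      /* state headers */
			     (isr_pages + 1) +
			     (dpc_pages + 1) +
			     (GUC_LOG_CRASH_PAGES + 1);

	return pages * PAGE_KB;
}

/* Prints the before/after comparison made in the commit message. */
static void guc_log_print_savings(void)
{
	unsigned int old_kb = guc_log_snapshot_kb(3, 3);
	unsigned int new_kb = guc_log_snapshot_kb(7, 7);

	printf("old: %u KB, new: %u KB, saved per %u KB: %u KB\n",
	       old_kb, new_kb, 2 * old_kb, 2 * old_kb - new_kb);
}
```

With the old defines (3, 3) this gives 11 pages and with the new ones (7, 7)
it gives 19 pages, which matches the 44 KB and 76 KB figures above.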



[Intel-gfx] [PATCH 02/19] drm/i915: Add GuC ukernel logging related fields to fw interface file

2016-08-19 Thread akash . goel
From: Sagar Arun Kamble 

The first page of the GuC log buffer contains state info or meta data
which is required to parse the logs contained in the subsequent pages.
The structure representing the state info is added to interface file
as Driver would need to handle log buffer flush interrupts from GuC.
Added an enum for the different message/event types that can be sent
by the GuC ukernel to Host.
Also added 2 new Host to GuC action types to inform GuC when Host has
flushed the log buffer and to forcefully cause the GuC to send a new
log buffer flush interrupt.

v2:
- Make documentation of log buffer state structure more elaborate &
  rename LOGBUFFERFLUSH action to LOG_BUFFER_FLUSH for consistency.(Tvrtko)

v3: Add GuC log buffer layout diagram for more clarity.

Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/intel_guc_fwif.h | 78 +++
 1 file changed, 78 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h b/drivers/gpu/drm/i915/intel_guc_fwif.h
index 944786d..47ef7e8 100644
--- a/drivers/gpu/drm/i915/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
@@ -418,15 +418,87 @@ struct guc_ads {
u32 reserved2[4];
 } __packed;
 
+/* GuC logging structures */
+
+enum guc_log_buffer_type {
+   GUC_ISR_LOG_BUFFER,
+   GUC_DPC_LOG_BUFFER,
+   GUC_CRASH_DUMP_LOG_BUFFER,
+   GUC_MAX_LOG_BUFFER
+};
+
+/**
+ * DOC: GuC Log buffer Layout
+ *
+ * Page0  +---+
+ *|   ISR state header (32 bytes) |
+ *|  DPC state header |
+ *|   Crash dump state header |
+ * Page1  +---+
+ *|   ISR logs|
+ * Page5  +---+
+ *|   DPC logs|
+ * Page9  +---+
+ *| Crash Dump logs   |
+ *+---+
+ *
+ * Below state structure is used for coordination of retrieval of GuC firmware
+ * logs. Separate state is maintained for each log buffer type.
+ * read_ptr points to the location where i915 read last in log buffer and
+ * is read only for GuC firmware. write_ptr is incremented by GuC with number
+ * of bytes written for each log entry and is read only for i915.
+ * When any type of log buffer becomes half full, GuC sends a flush interrupt.
+ * GuC firmware expects that while it is writing to 2nd half of the buffer,
+ * first half would get consumed by Host and then get a flush completed
+ * acknowledgment from Host, so that it does not end up doing any overwrite
+ * causing loss of logs. So when buffer gets half filled & i915 has requested
+ * for interrupt, GuC will set flush_to_file field, set the sampled_write_ptr
+ * to the value of write_ptr and raise the interrupt.
+ * On receiving the interrupt i915 should read the buffer, clear flush_to_file
+ * field and also update read_ptr with the value of sampled_write_ptr, before
+ * sending an acknowledgment to GuC. marker & version fields are for internal
+ * usage of GuC and opaque to i915. buffer_full_cnt field is incremented every
+ * time GuC detects the log buffer overflow.
+ */
+struct guc_log_buffer_state {
+   u32 marker[2];
+   u32 read_ptr;
+   u32 write_ptr;
+   u32 size;
+   u32 sampled_write_ptr;
+   union {
+   struct {
+   u32 flush_to_file:1;
+   u32 buffer_full_cnt:4;
+   u32 reserved:27;
+   };
+   u32 flags;
+   };
+   u32 version;
+} __packed;
+
+union guc_log_control {
+   struct {
+   u32 logging_enabled:1;
+   u32 reserved1:3;
+   u32 verbosity:4;
+   u32 reserved2:24;
+   };
+   u32 value;
+} __packed;
+
 /* This Action will be programmed in C180 - SOFT_SCRATCH_O_REG */
 enum host2guc_action {
HOST2GUC_ACTION_DEFAULT = 0x0,
HOST2GUC_ACTION_SAMPLE_FORCEWAKE = 0x6,
HOST2GUC_ACTION_ALLOCATE_DOORBELL = 0x10,
HOST2GUC_ACTION_DEALLOCATE_DOORBELL = 0x20,
+   HOST2GUC_ACTION_LOG_BUFFER_FILE_FLUSH_COMPLETE = 0x30,
+   HOST2GUC_ACTION_FORCE_LOG_BUFFER_FLUSH = 0x302,
HOST2GUC_ACTION_ENTER_S_STATE = 0x501,
HOST2GUC_ACTION_EXIT_S_STATE = 0x502,
HOST2GUC_ACTION_SLPC_REQUEST = 0x3003,
+   HOST2GUC_ACTION_UK_LOG_ENABLE_LOGGING = 0x0E000,
HOST2GUC_ACTION_LIMIT
 };
 
@@ -448,4 +520,10 @@ enum guc2host_status {
GUC2HOST_STATUS_GENERIC_FAIL = GUC2HOST_STATUS(0xF000)
 };
 
+/* This action will be programmed in C1BC - SOFT_SCRATCH_15_REG */
+enum guc2host_message {
+   GUC2HOST_MSG_CRASH_DUMP_POSTED = (1 << 1),
+   GUC2HOST_MSG_FLUSH_LOG_BUFFER = (1 << 3)
+};
+
 #endif
-- 
1.9.2
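
The handshake described in the DOC comment of PATCH 02/19 can be sketched
in plain C. This is a userspace illustration only, not the driver code: the
struct mirrors guc_log_buffer_state but flattens the bit-field union into a
plain flags word (assuming flush_to_file is bit 0), the pointers are assumed
to be byte offsets into a circular buffer of 'size' bytes, and
host_handle_flush() is a hypothetical name. The case of a completely full
buffer is ignored for simplicity.

```c
#include <stdint.h>
#include <string.h>

#define FLUSH_TO_FILE_BIT  (1u << 0)  /* assumption: flush_to_file is bit 0 */

/* Mirrors guc_log_buffer_state, with the bit-field union flattened. */
struct log_buffer_state {
	uint32_t marker[2];
	uint32_t read_ptr;           /* written by host, read-only for firmware */
	uint32_t write_ptr;          /* written by firmware, read-only for host */
	uint32_t size;
	uint32_t sampled_write_ptr;  /* write_ptr sampled when flush was raised */
	uint32_t flags;
	uint32_t version;
};

/*
 * Hypothetical host-side handler for one log type. Copies the bytes between
 * read_ptr and sampled_write_ptr into 'dst', then clears flush_to_file and
 * advances read_ptr, as the DOC comment asks. Returns the number of bytes
 * copied; the caller would send the flush-complete acknowledgment afterwards.
 */
static uint32_t host_handle_flush(struct log_buffer_state *state,
				  const uint8_t *log, uint8_t *dst)
{
	uint32_t rd = state->read_ptr;
	uint32_t wr = state->sampled_write_ptr;
	uint32_t copied;

	if (!(state->flags & FLUSH_TO_FILE_BIT))
		return 0;                          /* no flush was requested */

	if (wr >= rd) {
		copied = wr - rd;
		memcpy(dst, log + rd, copied);
	} else {                                   /* data wraps past the end */
		uint32_t tail = state->size - rd;

		memcpy(dst, log + rd, tail);
		memcpy(dst + tail, log, wr);
		copied = tail + wr;
	}

	state->flags &= ~FLUSH_TO_FILE_BIT;        /* clear flush_to_file */
	state->read_ptr = wr;                      /* consume up to the sample */
	return copied;
}
```

The point of the ordering is that read_ptr only moves after the data has been
copied out, so the firmware never sees space as free while it is still being
read.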


[Intel-gfx] [PATCH 13/19] drm/i915: Augment i915 error state to include the dump of GuC log buffer

2016-08-19 Thread akash . goel
From: Akash Goel 

Added the dump of GuC log buffer to i915 error state, as the contents of
GuC log buffer would also be useful to determine why the GPU reset
was triggered.

v2:
- For uniformity use existing helper function print_error_obj() to
  dump out contents of GuC log buffer, pretty printing is better left
  to userspace. (Chris)
- Skip the dumping of GuC log buffer when logging is disabled as it
  won't be of any use.
- Rebase.

Suggested-by: Chris Wilson 
Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_drv.h   |  1 +
 drivers/gpu/drm/i915/i915_gpu_error.c | 20 
 2 files changed, 21 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 08553b9..de89596 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -734,6 +734,7 @@ struct drm_i915_error_state {
struct intel_overlay_error_state *overlay;
struct intel_display_error_state *display;
struct drm_i915_error_object *semaphore;
+   struct drm_i915_error_object *guc_log;
 
struct drm_i915_error_engine {
int engine_id;
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 0c3f30c..94297aa 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -572,6 +572,13 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
}
}
 
+   obj = error->guc_log;
+   if (obj) {
+   err_printf(m, "GuC log buffer = 0x%08x\n",
+  lower_32_bits(obj->gtt_offset));
+   print_error_obj(m, obj);
+   }
+
if (error->overlay)
intel_overlay_print_error_state(m, error->overlay);
 
@@ -651,6 +658,7 @@ static void i915_error_state_free(struct kref *error_ref)
}
 
i915_error_object_free(error->semaphore);
+   i915_error_object_free(error->guc_log);
 
for (i = 0; i < ARRAY_SIZE(error->active_bo); i++)
kfree(error->active_bo[i]);
@@ -1286,6 +1294,17 @@ static void i915_capture_pinned_buffers(struct drm_i915_private *dev_priv,
error->pinned_bo = bo;
 }
 
+static void i915_gem_capture_guc_log_buffer(struct drm_i915_private *dev_priv,
+   struct drm_i915_error_state *error)
+{
+   /* Capturing log buf contents won't be useful if logging was disabled */
+   if (!dev_priv->guc.log.vma || (i915.guc_log_level < 0))
+   return;
+
+   error->guc_log = i915_error_object_create(dev_priv,
+ dev_priv->guc.log.vma);
+}
+
 /* Capture all registers which don't fit into another category. */
 static void i915_capture_reg_state(struct drm_i915_private *dev_priv,
   struct drm_i915_error_state *error)
@@ -1437,6 +1456,7 @@ void i915_capture_error_state(struct drm_i915_private *dev_priv,
i915_gem_record_rings(dev_priv, error);
i915_capture_active_buffers(dev_priv, error);
i915_capture_pinned_buffers(dev_priv, error);
+   i915_gem_capture_guc_log_buffer(dev_priv, error);
 
do_gettimeofday(&error->time);
 
-- 
1.9.2



[Intel-gfx] [PATCH 05/19] drm/i915: Support for GuC interrupts

2016-08-19 Thread akash . goel
From: Sagar Arun Kamble 

There are certain types of interrupts which Host can receive from GuC.
GuC ukernel sends an interrupt to Host for certain events, for example
when the logs generated by ukernel need to be retrieved/consumed.
This patch adds support to receive interrupts from GuC but currently
enables & partially handles only the interrupt sent by GuC ukernel.
Future patches will add support for handling other interrupt types.

v2:
- Use common low level routines for PM IER/IIR programming (Chris)
- Rename interrupt functions to gen9_xxx from gen8_xxx (Chris)
- Replace disabling of wake ref asserts with rpm get/put (Chris)

v3:
- Update comments for more clarity. (Tvrtko)
- Remove the masking of GuC interrupt, which was kept masked till the
  start of bottom half, it's not really needed as there is only a
  single instance of work item & wq is ordered. (Tvrtko)

v4:
- Rebase.
- Rename guc_events to pm_guc_events so as to be indicative of the
  register/control block it is associated with. (Chris)
- Add handling for back to back log buffer flush interrupts.

v5:
- Move the read & clearing of register, containing Guc2Host message
  bits, outside the irq spinlock. (Tvrtko)

v6:
- Move the log buffer flush interrupt related stuff to the following
  patch so as to do only generic bits in this patch. (Tvrtko)
- Rebase.

v7:
- Remove the interrupts_enabled check from gen9_guc_irq_handler, want to
  process that last interrupt also before disabling the interrupt, sync
  against the work queued by irq handler will be done by caller disabling
  the interrupt.

Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_drv.h|  1 +
 drivers/gpu/drm/i915/i915_guc_submission.c |  5 +++
 drivers/gpu/drm/i915/i915_irq.c| 56 +++---
 drivers/gpu/drm/i915/i915_reg.h| 11 ++
 drivers/gpu/drm/i915/intel_drv.h   |  3 ++
 drivers/gpu/drm/i915/intel_guc.h   |  3 ++
 drivers/gpu/drm/i915/intel_guc_loader.c|  4 +++
 7 files changed, 79 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index a187ea7..08553b9 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1783,6 +1783,7 @@ struct drm_i915_private {
u32 pm_imr;
u32 pm_ier;
u32 pm_rps_events;
+   u32 pm_guc_events;
u32 pipestat_irq_mask[I915_MAX_PIPES];
 
struct i915_hotplug hotplug;
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 34fb893..b062da6 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -1055,6 +1055,8 @@ int intel_guc_suspend(struct drm_device *dev)
if (guc->guc_fw.guc_fw_load_status != GUC_FIRMWARE_SUCCESS)
return 0;
 
+   gen9_disable_guc_interrupts(dev_priv);
+
ctx = dev_priv->kernel_context;
 
data[0] = HOST2GUC_ACTION_ENTER_S_STATE;
@@ -1081,6 +1083,9 @@ int intel_guc_resume(struct drm_device *dev)
if (guc->guc_fw.guc_fw_load_status != GUC_FIRMWARE_SUCCESS)
return 0;
 
+   if (i915.guc_log_level >= 0)
+   gen9_enable_guc_interrupts(dev_priv);
+
ctx = dev_priv->kernel_context;
 
data[0] = HOST2GUC_ACTION_EXIT_S_STATE;
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 83d708a..fc1fe72 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -170,6 +170,7 @@ static void gen5_assert_iir_is_zero(struct drm_i915_private *dev_priv,
 } while (0)
 
 static void gen6_rps_irq_handler(struct drm_i915_private *dev_priv, u32 pm_iir);
+static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv, u32 pm_iir);
 
 /* For display hotplug interrupt */
 static inline void
@@ -411,6 +412,38 @@ void gen6_disable_rps_interrupts(struct drm_i915_private *dev_priv)
gen6_reset_rps_interrupts(dev_priv);
 }
 
+void gen9_reset_guc_interrupts(struct drm_i915_private *dev_priv)
+{
+   spin_lock_irq(&dev_priv->irq_lock);
+   gen6_reset_pm_iir(dev_priv, dev_priv->pm_guc_events);
+   spin_unlock_irq(&dev_priv->irq_lock);
+}
+
+void gen9_enable_guc_interrupts(struct drm_i915_private *dev_priv)
+{
+   spin_lock_irq(&dev_priv->irq_lock);
+   if (!dev_priv->guc.interrupts_enabled) {
+   WARN_ON_ONCE(I915_READ(gen6_pm_iir(dev_priv)) &
+  dev_priv->pm_guc_events);
+   dev_priv->guc.interrupts_enabled = true;
+   gen6_enable_pm_irq(dev_priv, dev_priv->pm_guc_events);
+   }
+   spin_unlock_irq(&dev_priv->irq_lock);
+}
+
+void gen9_disable_guc_interrupts(struct drm_i915_private *dev_priv)
+{
+   spin_lock_irq(&dev_priv->irq_lock);
+   dev_priv-

[Intel-gfx] [PATCH 08/19] drm/i915: Add a relay backed debugfs interface for capturing GuC logs

2016-08-19 Thread akash . goel
From: Akash Goel 

Added a new debugfs interface '/sys/kernel/debug/dri/guc_log' for the
User to capture GuC firmware logs. Used the relay framework to implement
the interface, where Driver just has to use a relay API to store
snapshots of the GuC log buffer in the buffer managed by relay.
The snapshot will be taken when GuC firmware sends a log buffer flush
interrupt and up to eight snapshots can be stored in the relay buffer.
The relay buffer will be operated in no-overwrite mode, where relay stops
accepting new data once all sub buffers are full.
Besides mmap method, through which User can directly access the relay
buffer contents, relay also supports the 'poll' method. Through the 'poll'
call on log file, User can come to know whenever a new snapshot of the
log buffer is taken by Driver, so can run in tandem with the Driver and
capture the logs in a sustained/streaming manner, without any loss of data.

v2: Defer the creation of relay channel & associated debugfs file, as
debugfs setup is now done at the end of i915 Driver load. (Chris)

v3:
- Switch to no-overwrite mode for relay.
- Fix the relay sub buffer switching sequence.

v4:
- Update i915 Kconfig to select RELAY config. (Tvrtko)
- Log a message when there is no sub buffer available to capture
  the GuC log buffer. (Tvrtko)
- Increase the number of relay sub buffers to 8 from 4, to have
  sufficient buffering for boot time logs

v5:
- Fix the alignment, indentation issues and some minor cleanup. (Tvrtko)
- Update the comment to elaborate on why a relay channel has to be
  associated with the debugfs file. (Tvrtko)

v6:
- Move the write to 'is_global' after the NULL check on parent directory
  dentry pointer. (Tvrtko)

Suggested-by: Chris Wilson 
Signed-off-by: Sourab Gupta 
Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/Kconfig   |   1 +
 drivers/gpu/drm/i915/i915_drv.c|   2 +
 drivers/gpu/drm/i915/i915_guc_submission.c | 212 -
 drivers/gpu/drm/i915/intel_guc.h   |   3 +
 4 files changed, 216 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
index 7769e46..fc900d2 100644
--- a/drivers/gpu/drm/i915/Kconfig
+++ b/drivers/gpu/drm/i915/Kconfig
@@ -11,6 +11,7 @@ config DRM_I915
select DRM_KMS_HELPER
select DRM_PANEL
select DRM_MIPI_DSI
+   select RELAY
# i915 depends on ACPI_VIDEO when ACPI is enabled
# but for select to work, need to select ACPI_VIDEO's dependencies, ick
select BACKLIGHT_LCD_SUPPORT if ACPI
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 13ae340..cdee60b 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1133,6 +1133,7 @@ static void i915_driver_register(struct drm_i915_private *dev_priv)
/* Reveal our presence to userspace */
if (drm_dev_register(dev, 0) == 0) {
i915_debugfs_register(dev_priv);
+   i915_guc_register(dev_priv);
i915_setup_sysfs(dev);
} else
DRM_ERROR("Failed to register driver for userspace access!\n");
@@ -1171,6 +1172,7 @@ static void i915_driver_unregister(struct drm_i915_private *dev_priv)
intel_opregion_unregister(dev_priv);
 
i915_teardown_sysfs(&dev_priv->drm);
+   i915_guc_unregister(dev_priv);
i915_debugfs_unregister(dev_priv);
drm_dev_unregister(&dev_priv->drm);
 
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 80e8725..8853b07 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -23,6 +23,8 @@
  */
 #include 
 #include 
+#include 
+#include 
 #include "i915_drv.h"
 #include "intel_guc.h"
 
@@ -837,13 +839,159 @@ err:
return NULL;
 }
 
+/*
+ * Sub buffer switch callback. Called whenever relay has to switch to a new
+ * sub buffer, relay stays on the same sub buffer if 0 is returned.
+ */
+static int subbuf_start_callback(struct rchan_buf *buf,
+void *subbuf,
+void *prev_subbuf,
+size_t prev_padding)
+{
+   /* Use no-overwrite mode by default, where relay will stop accepting
+* new data if there are no empty sub buffers left.
+* There is no strict synchronization enforced by relay between Consumer
+* and Producer. In overwrite mode, there is a possibility of getting
+* inconsistent/garbled data, the producer could be writing on to the
+* same sub buffer from which Consumer is reading. This can't be avoided
+* unless Consumer is fast enough and can always run in tandem with
+* Producer.
+*/
+   if (relay_buf_full(buf))
+   return 0;
+

[Intel-gfx] [PATCH 03/19] drm/i915: New structure to contain GuC logging related fields

2016-08-19 Thread akash . goel
From: Akash Goel 

So far there were 2 fields related to GuC logs in 'intel_guc' structure.
For the support of capturing GuC logs & storing them in a local buffer,
multiple new fields would have to be added. This warrants a separate
structure to contain the fields related to GuC logging state.
Added a new structure 'intel_guc_log' and instance of it inside
'intel_guc' structure.

v2: Rebase.

Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_debugfs.c| 4 ++--
 drivers/gpu/drm/i915/i915_guc_submission.c | 8 
 drivers/gpu/drm/i915/intel_guc.h   | 8 ++--
 drivers/gpu/drm/i915/intel_guc_loader.c| 2 +-
 4 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index b89478a..b2cfcf4 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2578,10 +2578,10 @@ static int i915_guc_log_dump(struct seq_file *m, void *data)
struct drm_i915_gem_object *obj;
int i = 0, pg;
 
-   if (!dev_priv->guc.log_vma)
+   if (!dev_priv->guc.log.vma)
return 0;
 
-   obj = dev_priv->guc.log_vma->obj;
+   obj = dev_priv->guc.log.vma->obj;
for (pg = 0; pg < obj->base.size / PAGE_SIZE; pg++) {
u32 *log = kmap_atomic(i915_gem_object_get_page(obj, pg));
 
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index fec5898..34fb893 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -843,7 +843,7 @@ static void guc_create_log(struct intel_guc *guc)
GUC_LOG_ISR_PAGES + 1 +
GUC_LOG_CRASH_PAGES + 1) << PAGE_SHIFT;
 
-   vma = guc->log_vma;
+   vma = guc->log.vma;
if (!vma) {
vma = guc_allocate_vma(guc, size);
if (IS_ERR(vma)) {
@@ -852,7 +852,7 @@ static void guc_create_log(struct intel_guc *guc)
return;
}
 
-   guc->log_vma = vma;
+   guc->log.vma = vma;
}
 
/* each allocated unit is a page */
@@ -862,7 +862,7 @@ static void guc_create_log(struct intel_guc *guc)
(GUC_LOG_CRASH_PAGES << GUC_LOG_CRASH_SHIFT);
 
offset = i915_ggtt_offset(vma) >> PAGE_SHIFT; /* in pages */
-   guc->log_flags = (offset << GUC_LOG_BUF_ADDR_SHIFT) | flags;
+   guc->log.flags = (offset << GUC_LOG_BUF_ADDR_SHIFT) | flags;
 }
 
 static void init_guc_policies(struct guc_policies *policies)
@@ -1034,7 +1034,7 @@ void i915_guc_submission_fini(struct drm_i915_private *dev_priv)
struct intel_guc *guc = &dev_priv->guc;
 
i915_vma_unpin_and_release(&guc->ads_vma);
-   i915_vma_unpin_and_release(&guc->log_vma);
+   i915_vma_unpin_and_release(&guc->log.vma);
 
if (guc->ctx_pool_vma)
ida_destroy(&guc->ctx_ids);
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index c973262..d32023c 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -121,10 +121,14 @@ struct intel_guc_fw {
uint32_t ucode_offset;
 };
 
+struct intel_guc_log {
+   uint32_t flags;
+   struct i915_vma *vma;
+};
+
 struct intel_guc {
struct intel_guc_fw guc_fw;
-   uint32_t log_flags;
-   struct i915_vma *log_vma;
+   struct intel_guc_log log;
 
struct i915_vma *ads_vma;
struct i915_vma *ctx_pool_vma;
diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c
index b3751c4..6fd24d9 100644
--- a/drivers/gpu/drm/i915/intel_guc_loader.c
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -187,7 +187,7 @@ static void set_guc_init_params(struct drm_i915_private *dev_priv)
params[GUC_CTL_FEATURE] |= GUC_CTL_DISABLE_SCHEDULER |
GUC_CTL_VCS2_ENABLED;
 
-   params[GUC_CTL_LOG_PARAMS] = guc->log_flags;
+   params[GUC_CTL_LOG_PARAMS] = guc->log.flags;
 
if (i915.guc_log_level >= 0) {
params[GUC_CTL_DEBUG] =
-- 
1.9.2



[Intel-gfx] [PATCH 01/19] drm/i915: Decouple GuC log setup from verbosity parameter

2016-08-19 Thread akash . goel
From: Sagar Arun Kamble 

GuC Log buffer allocation was tied up with verbosity level module param
i915.guc_log_level. User would be given a provision to enable firmware
logging at runtime, through a host2guc action, and not necessarily during
Driver load time. But the address of log buffer can be passed only in
init params, at firmware load time, so GuC has to be reset and firmware
needs to be reloaded to pass the log buffer address at runtime.
To avoid reset of GuC & reload of firmware, allocation of log buffer will
be done always but logging would be enabled initially on GuC side based on
the value of module parameter guc_log_level.

v2: Update commit message to describe the constraint with allocation of
log buffer at runtime. (Tvrtko)

v3: Rebase.

Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 3 ---
 drivers/gpu/drm/i915/intel_guc_loader.c| 6 --
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index bb40792..fec5898 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -834,9 +834,6 @@ static void guc_create_log(struct intel_guc *guc)
unsigned long offset;
uint32_t size, flags;
 
-   if (i915.guc_log_level < GUC_LOG_VERBOSITY_MIN)
-   return;
-
if (i915.guc_log_level > GUC_LOG_VERBOSITY_MAX)
i915.guc_log_level = GUC_LOG_VERBOSITY_MAX;
 
diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c
index 324812d..b3751c4 100644
--- a/drivers/gpu/drm/i915/intel_guc_loader.c
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -187,11 +187,13 @@ static void set_guc_init_params(struct drm_i915_private *dev_priv)
params[GUC_CTL_FEATURE] |= GUC_CTL_DISABLE_SCHEDULER |
GUC_CTL_VCS2_ENABLED;
 
+   params[GUC_CTL_LOG_PARAMS] = guc->log_flags;
+
if (i915.guc_log_level >= 0) {
-   params[GUC_CTL_LOG_PARAMS] = guc->log_flags;
params[GUC_CTL_DEBUG] =
i915.guc_log_level << GUC_LOG_VERBOSITY_SHIFT;
-   }
+   } else
+   params[GUC_CTL_DEBUG] = GUC_LOG_DISABLED;
 
if (guc->ads_vma) {
u32 ads = i915_ggtt_offset(guc->ads_vma) >> PAGE_SHIFT;
-- 
1.9.2



[Intel-gfx] [PATCH v8 00/19] Support for sustained capturing of GuC firmware logs

2016-08-19 Thread akash . goel
From: Akash Goel 

GuC firmware logs its debug messages into a Host-GuC shared memory buffer
and when the buffer is half full it sends a Flush interrupt to Host.
GuC firmware follows the half-full draining protocol where it expects that
while it is writing to 2nd half of the buffer, 1st half would get consumed
by Host and then get a flush completed acknowledgment from Host, so that
it does not end up doing any overwrite causing loss of logs.
So far flush interrupt wasn't enabled on Host side & User could capture the
contents/snapshot of log buffer through 'i915_guc_log_dump' debugfs iface.
But this couldn't meet a couple of key requirements, especially of Validation,
first is to ensure capturing of all boot time logs even with high verbosity
level and second is to enable capturing of logs in a sustained manner like
for the entire duration of a workload.
Now Driver will enable flush interrupt and on receiving it, would copy the
contents of log buffer into its local buffer. The size of local buffer would
be big enough to contain multiple snapshots of the log buffer giving ample
time to User to pull boot time messages.
Have added a debugfs interface '/sys/kernel/debug/dri/guc_log' for User to
collect the logs. Availed relay framework to implement this interface, where
Driver will have to just use a relay API to store snapshots of GuC log buffer
in a buffer managed by relay. The relay buffer can be operated in a mode,
equivalent to 'dmesg -c' where the old data, not yet collected by User, will
be overwritten if buffer becomes full or it can be operated in no-overwrite
mode where relay will stop accepting new data if all sub buffers are full.
Have used the latter mode to avoid the possibility of getting garbled data. 
Besides mmap method, through which User can directly access the relay
buffer contents, relay also supports the 'poll' method. Through the 'poll'
call on log file, User can come to know whenever a new snapshot of the log
buffer is taken by Driver, so can run in tandem with the Driver and thus
capture logs in a sustained/streaming manner, without any loss of data.
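
The difference between the two relay modes described above comes down to
what the producer does when every sub buffer is still unread. The following
is a userspace model of the no-overwrite choice, loosely following the
produced/consumed counting that a relay-style channel performs. It is a
sketch under stated assumptions: struct subbuf_ring and the ring_*()
helpers are hypothetical names, and the sub buffer count of 8 is taken from
the PATCH 08/19 changelog.

```c
#include <stdbool.h>
#include <stddef.h>

#define N_SUBBUFS 8   /* sub buffer count mentioned in the PATCH 08/19 changelog */

/* Hypothetical model of a relay channel buffer, tracking only the counters. */
struct subbuf_ring {
	size_t produced;   /* sub buffers filled by the producer (Driver) */
	size_t consumed;   /* sub buffers released by the consumer (User) */
	size_t dropped;    /* snapshots discarded because the ring was full */
};

static bool ring_full(const struct subbuf_ring *r)
{
	return r->produced - r->consumed >= N_SUBBUFS;
}

/*
 * Producer side, no-overwrite mode: when no sub buffer is free the snapshot
 * is dropped instead of overwriting data the consumer may still be reading.
 */
static bool ring_store_snapshot(struct subbuf_ring *r)
{
	if (ring_full(r)) {
		r->dropped++;
		return false;
	}
	r->produced++;
	return true;
}

/* Consumer side: releases the oldest sub buffer, if there is one. */
static bool ring_consume(struct subbuf_ring *r)
{
	if (r->consumed == r->produced)
		return false;
	r->consumed++;
	return true;
}
```

In overwrite mode ring_store_snapshot() would advance unconditionally, which
is what opens the window for the consumer to read a sub buffer while it is
being rewritten.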

v2: Rebased to the latest drm-intel-nightly.

v3: Aligned with the modification of late debugfs registration, at the end of
i915 Driver load. Did cleanup as per Tvrtko's review comments, added 3
new patches to optimize the log-buffer flush interrupt handling, gather
and report the logging related stats.

v4: Added 2 new patches to further optimize the log-buffer flush interrupt
handling. Did cleanup as per Chris's review comments, fixed couple of
issues related to clearing of Guc2Host message register. Switched to
no-overwrite mode for the relay.

v5: Added a new patch to use the MOVNTDQA instruction based fast memcpy provided
by a patch from Chris. Dropped the rt priority kthread patch, after
evaluating all the optimizations with certain benchmarks like
synmark_oglmultithread, synmark_oglbatch5 which generate flush interrupts
at intervals of about 1 ms or less. Updated the older patches as per the review
comments from Tvrtko and Chris W. Added a new patch to augment i915 error
state with the GuC log buffer contents. Fixed the issue of User interrupt
getting disabled for VEBOX ring, causing failure for certain IGTs.
Also included 2 patches to support early logging for capturing boot
time logs and use per CPU constructs on the relay side so as to address
a WARNING issue with the call to relay_reserve(), without disabling
preemption.

v6: Mainly did the rebasing, refactoring, cleanup as per the review comments
and fixed error/warnings reported by checkpatch.

v7: Added a new patch to complete the pending log buffer flush work item in
system suspend case. Cleaned up the irq handler & work item function
by removing the check for GuC interrupts.

v8: Replaced the patch added in last version with a patch which marks the
GuC log buffer flush interrupt handling WQ as freezable, as per the inputs
from Imre. Refactored the log buffer sampling function and added a new
helper function to improve the readability as per suggestions from Tvrtko.

Akash Goel (13):
  drm/i915: New structure to contain GuC logging related fields
  drm/i915: Add low level set of routines for programming PM IER/IIR/IMR
register set
  relay: Use per CPU constructs for the relay channel buffer pointers
  drm/i915: Add a relay backed debugfs interface for capturing GuC logs
  drm/i915: New lock to serialize the Host2GuC actions
  drm/i915: Add stats for GuC log buffer flush interrupts
  drm/i915: Optimization to reduce the sampling time of GuC log buffer
  drm/i915: Increase GuC log buffer size to reduce flush interrupts
  drm/i915: Augment i915 error state to include the dump of GuC log
buffer
  drm/i915: Use uncached(WC) mapping for accessing the GuC log buffer
  drm/i915: Use SSE4.1 movntdqa based memcpy for sam

[Intel-gfx] [PATCH 04/19] drm/i915: Add low level set of routines for programming PM IER/IIR/IMR register set

2016-08-19 Thread akash . goel
From: Akash Goel 

So far PM IER/IIR/IMR registers were being used only for Turbo related
interrupts. But interrupts coming from GuC also use the same set.
As a precursor to supporting GuC interrupts, added new low level routines
so as to allow sharing the programming of PM IER/IIR/IMR registers between
Turbo & GuC.
Also similar to PM IMR, maintaining a bitmask for PM IER register, to allow
easy sharing of it between Turbo & GuC without involving a rmw operation.
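
The benefit of keeping cached copies of IMR and IER can be shown with a
small userspace model. This is a sketch only: struct pm_irq_state and the
pm_*() helpers are hypothetical, the hw_* fields merely stand in for the
registers, and the bit values used in the usage note are made up. The IMR
arithmetic is the same as in snb_update_pm_irq() in the diff below.

```c
#include <stdint.h>

/* Hypothetical model of the cached PM interrupt state; not the i915 structs. */
struct pm_irq_state {
	uint32_t pm_imr;   /* cached IMR: a set bit means the interrupt is masked */
	uint32_t pm_ier;   /* cached IER: a set bit means the interrupt is enabled */
	uint32_t hw_imr;   /* stands in for the IMR register */
	uint32_t hw_ier;   /* stands in for the IER register */
};

/* Same mask arithmetic as snb_update_pm_irq() in the diff. */
static void pm_update_imr(struct pm_irq_state *s, uint32_t interrupt_mask,
			  uint32_t enabled_irq_mask)
{
	uint32_t new_val = s->pm_imr;

	new_val &= ~interrupt_mask;                     /* drop old state of these bits */
	new_val |= ~enabled_irq_mask & interrupt_mask;  /* mask the ones not enabled */

	if (new_val != s->pm_imr) {
		s->pm_imr = new_val;
		s->hw_imr = new_val;                    /* plain write, no register read */
	}
}

/* IER is updated from the cached copy, so the hardware is never read back. */
static void pm_enable_ier(struct pm_irq_state *s, uint32_t enable_mask)
{
	s->pm_ier |= enable_mask;
	s->hw_ier = s->pm_ier;
}

static void pm_disable_ier(struct pm_irq_state *s, uint32_t disable_mask)
{
	s->pm_ier &= ~disable_mask;
	s->hw_ier = s->pm_ier;
}
```

For instance, with made-up bit groups 0x70 for Turbo and 0x03000000 for GuC,
disabling the Turbo bits through pm_disable_ier() leaves the GuC bits in IER
untouched, because each user only changes its own bits in the shared copy.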

v2:
- For appropriateness & to avoid any ambiguity, rename old functions
  enable/disable pm_irq to mask/unmask pm_irq and rename new functions
  enable/disable pm_interrupts to enable/disable pm_irq. (Tvrtko)
- Use u32 in place of uint32_t. (Tvrtko)

v3:
- Rename the fields pm_irq_mask & pm_ier_mask and do some cleanup. (Chris)
- Rebase.

v4: Fix the inadvertent disabling of User interrupt for VECS ring causing
failure for certain IGTs.

v5: Use dev_priv with HAS_VEBOX macro. (Tvrtko)

Suggested-by: Chris Wilson 
Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_drv.h |  3 +-
 drivers/gpu/drm/i915/i915_irq.c | 75 ++---
 drivers/gpu/drm/i915/intel_drv.h|  3 ++
 drivers/gpu/drm/i915/intel_ringbuffer.c |  4 +-
 4 files changed, 57 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 35caa9b..a187ea7 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1780,7 +1780,8 @@ struct drm_i915_private {
u32 de_irq_mask[I915_MAX_PIPES];
};
u32 gt_irq_mask;
-   u32 pm_irq_mask;
+   u32 pm_imr;
+   u32 pm_ier;
u32 pm_rps_events;
u32 pipestat_irq_mask[I915_MAX_PIPES];
 
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index ebb83d5..83d708a 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -303,18 +303,18 @@ static void snb_update_pm_irq(struct drm_i915_private *dev_priv,
 
assert_spin_locked(&dev_priv->irq_lock);
 
-   new_val = dev_priv->pm_irq_mask;
+   new_val = dev_priv->pm_imr;
new_val &= ~interrupt_mask;
new_val |= (~enabled_irq_mask & interrupt_mask);
 
-   if (new_val != dev_priv->pm_irq_mask) {
-   dev_priv->pm_irq_mask = new_val;
-   I915_WRITE(gen6_pm_imr(dev_priv), dev_priv->pm_irq_mask);
+   if (new_val != dev_priv->pm_imr) {
+   dev_priv->pm_imr = new_val;
+   I915_WRITE(gen6_pm_imr(dev_priv), dev_priv->pm_imr);
POSTING_READ(gen6_pm_imr(dev_priv));
}
 }
 
-void gen6_enable_pm_irq(struct drm_i915_private *dev_priv, uint32_t mask)
+void gen6_unmask_pm_irq(struct drm_i915_private *dev_priv, u32 mask)
 {
if (WARN_ON(!intel_irqs_enabled(dev_priv)))
return;
@@ -322,28 +322,54 @@ void gen6_enable_pm_irq(struct drm_i915_private *dev_priv, uint32_t mask)
snb_update_pm_irq(dev_priv, mask, mask);
 }
 
-static void __gen6_disable_pm_irq(struct drm_i915_private *dev_priv,
- uint32_t mask)
+static void __gen6_mask_pm_irq(struct drm_i915_private *dev_priv, u32 mask)
 {
snb_update_pm_irq(dev_priv, mask, 0);
 }
 
-void gen6_disable_pm_irq(struct drm_i915_private *dev_priv, uint32_t mask)
+void gen6_mask_pm_irq(struct drm_i915_private *dev_priv, u32 mask)
 {
if (WARN_ON(!intel_irqs_enabled(dev_priv)))
return;
 
-   __gen6_disable_pm_irq(dev_priv, mask);
+   __gen6_mask_pm_irq(dev_priv, mask);
 }
 
-void gen6_reset_rps_interrupts(struct drm_i915_private *dev_priv)
+void gen6_reset_pm_iir(struct drm_i915_private *dev_priv, u32 reset_mask)
 {
i915_reg_t reg = gen6_pm_iir(dev_priv);
 
-   spin_lock_irq(&dev_priv->irq_lock);
-   I915_WRITE(reg, dev_priv->pm_rps_events);
-   I915_WRITE(reg, dev_priv->pm_rps_events);
+   assert_spin_locked(&dev_priv->irq_lock);
+
+   I915_WRITE(reg, reset_mask);
+   I915_WRITE(reg, reset_mask);
POSTING_READ(reg);
+}
+
+void gen6_enable_pm_irq(struct drm_i915_private *dev_priv, u32 enable_mask)
+{
+   assert_spin_locked(&dev_priv->irq_lock);
+
+   dev_priv->pm_ier |= enable_mask;
+   I915_WRITE(gen6_pm_ier(dev_priv), dev_priv->pm_ier);
+   gen6_unmask_pm_irq(dev_priv, enable_mask);
+   /* unmask_pm_irq provides an implicit barrier (POSTING_READ) */
+}
+
+void gen6_disable_pm_irq(struct drm_i915_private *dev_priv, u32 disable_mask)
+{
+   assert_spin_locked(&dev_priv->irq_lock);
+
+   dev_priv->pm_ier &= ~disable_mask;
+   __gen6_mask_pm_irq(dev_priv, disable_mask);
+   I915_WRITE(gen6_pm_ier(dev_priv), dev_priv->pm_ier);
+   /* though a barrier is missing here, but don't really need a one */
+}
+
+void gen6_reset_rps_interrup

[Intel-gfx] [PATCH 06/19] drm/i915: Handle log buffer flush interrupt event from GuC

2016-08-19 Thread akash . goel
From: Sagar Arun Kamble 

GuC ukernel sends an interrupt to Host to flush the log buffer
and expects Host to correspondingly update the read pointer
information in the state structure, once it has consumed the
log buffer contents by copying them to a file or buffer.
Even if Host couldn't copy the contents, it can still update the
read pointer so that logging state is not disturbed on GuC side.

v2:
- Use a dedicated workqueue for handling flush interrupt. (Tvrtko)
- Reduce the overall log buffer copying time by skipping the copy of
  crash buffer area for regular cases and copying only the state
  structure data in first page.

v3:
 - Create a vmalloc mapping of log buffer. (Chris)
 - Cover the flush acknowledgment under rpm get & put. (Chris)
 - Revert the change of skipping the copy of crash dump area, as
   not really needed, will be covered by subsequent patch.

v4:
 - Destroy the wq under the same condition in which it was created,
   pass dev_priv pointer instead of dev to newly added GuC function,
   add more comments & rename variable for clarity. (Tvrtko)

v5:
- Allocate & destroy the dedicated wq, for handling flush interrupt,
  from the setup/teardown routines of GuC logging. (Chris)
- Validate the log buffer size value retrieved from state structure
  and do some minor cleanup. (Tvrtko)
- Fix error/warnings reported by checkpatch. (Tvrtko)
- Rebase.

v6:
 - Remove the interrupts_enabled check from guc_capture_logs_work, need
   to process that last work item also, queued just before disabling the
   interrupt as log buffer flush interrupt handling is a bit different
   case where GuC is actually expecting an ACK from host, which should be
   provided to keep the logging going.
   Sync against the work will be done by caller disabling the interrupt.
 - Don't sample the log buffer size value from state structure, directly
   use the expected value to move the pointer & do the copy; that cannot
   go wrong (out of bounds) as the Driver itself allocated both the log
   buffer and the relay buffers. Driver should refrain from interpreting
   the log packet as much as possible and let the Userspace parser detect
   any anomaly. (Chris)

v7:
- Use switch statement instead of 'if else' for retrieving the GuC log
  buffer size. (Tvrtko)
- Refactored the log buffer copying function and shortened the names of
  a couple of variables for better readability. (Tvrtko)

Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 189 +
 drivers/gpu/drm/i915/i915_irq.c|  28 -
 drivers/gpu/drm/i915/intel_guc.h   |   4 +
 3 files changed, 220 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index b062da6..80e8725 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -172,6 +172,15 @@ static int host2guc_sample_forcewake(struct intel_guc *guc,
return host2guc_action(guc, data, ARRAY_SIZE(data));
 }
 
+static int host2guc_logbuffer_flush_complete(struct intel_guc *guc)
+{
+   u32 data[1];
+
+   data[0] = HOST2GUC_ACTION_LOG_BUFFER_FILE_FLUSH_COMPLETE;
+
+   return host2guc_action(guc, data, 1);
+}
+
 /*
  * Initialise, update, or clear doorbell data shared with the GuC
  *
@@ -828,6 +837,166 @@ err:
return NULL;
 }
 
+static void guc_move_to_next_buf(struct intel_guc *guc)
+{
+}
+
+static void *guc_get_write_buffer(struct intel_guc *guc)
+{
+   return NULL;
+}
+
+static unsigned int guc_get_log_buffer_size(enum guc_log_buffer_type type)
+{
+   switch (type) {
+   case GUC_ISR_LOG_BUFFER:
+   return (GUC_LOG_ISR_PAGES + 1) * PAGE_SIZE;
+   case GUC_DPC_LOG_BUFFER:
+   return (GUC_LOG_DPC_PAGES + 1) * PAGE_SIZE;
+   case GUC_CRASH_DUMP_LOG_BUFFER:
+   return (GUC_LOG_CRASH_PAGES + 1) * PAGE_SIZE;
+   default:
+   MISSING_CASE(type);
+   }
+
+   return 0;
+}
+
+static void guc_read_update_log_buffer(struct intel_guc *guc)
+{
+   struct guc_log_buffer_state *log_buf_state, *log_buf_snapshot_state;
+   struct guc_log_buffer_state log_buf_state_local;
+   unsigned int buffer_size, write_offset;
+   enum guc_log_buffer_type type;
+   void *src_data, *dst_data;
+
+   if (WARN_ON(!guc->log.buf_addr))
+   return;
+
+   /* Get the pointer to shared GuC log buffer */
+   log_buf_state = src_data = guc->log.buf_addr;
+
+   /* Get the pointer to local buffer to store the logs */
+   log_buf_snapshot_state = dst_data = guc_get_write_buffer(guc);
+
+   /* Actual logs are present from the 2nd page */
+   src_data += PAGE_SIZE;
+   dst_data += PAGE_SIZE;
+
+   for (type = GUC_ISR_LOG_BUFFER; type < GUC_MAX_LOG_BUFFER; type++) {
+   /* Make a copy of the state structure, inside GuC log

[Intel-gfx] [PATCH 16/19] drm/i915: Use uncached(WC) mapping for accessing the GuC log buffer

2016-08-17 Thread akash . goel
From: Akash Goel 

Host needs to sample the GuC log buffer on every flush interrupt from GuC.
To ensure that we always get up-to-date data from the log buffer, it's
better to access the buffer through an uncached CPU mapping. Also, given
the way the buffer is accessed from the GuC & Host side, manually flushing
the cache may not always be effective if a cached CPU mapping is used.
Uncached reads could have some performance implication, but the
reliability of the data will be ensured.

v2: Rebase.

v3: Rebase.

v4: Rebase.

Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 5a302fe..3e8af71 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -1126,8 +1126,6 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
dst_data_ptr += buffer_size;
}
 
-   /* FIXME: invalidate/flush for log buffer needed */
-
/* Update the read pointer in the shared log buffer */
log_buffer_state->read_ptr = write_offset;
 
@@ -1196,8 +1194,11 @@ static int guc_create_log_extras(struct intel_guc *guc)
return 0;
 
if (!guc->log.buf_addr) {
-   /* Create a vmalloc mapping of log buffer pages */
-   vaddr = i915_gem_object_pin_map(guc->log.vma->obj, I915_MAP_WB);
+   /* Create a WC (Uncached for read) vmalloc mapping of log
+* buffer pages, so that we can directly get the data
+* (up-to-date) from memory.
+*/
+   vaddr = i915_gem_object_pin_map(guc->log.vma->obj, I915_MAP_WC);
if (IS_ERR(vaddr)) {
ret = PTR_ERR(vaddr);
DRM_ERROR("Couldn't map log buffer pages %d\n", ret);
-- 
1.9.2

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 18/19] drm/i915: Early creation of relay channel for capturing boot time logs

2016-08-17 Thread akash . goel
From: Akash Goel 

As per the current i915 Driver load sequence, debugfs registration is done
at the end, so the relay channel debugfs file is also created only then,
but the GuC firmware is loaded much earlier in the sequence.
As a result the Driver could miss capturing the boot-time logs of the GuC
firmware if there are flush interrupts from the GuC side.
Relay has a provision to support early logging, where initially only the
relay channel is created, to have buffers for storing logs, and later on
the channel is associated with a debugfs file at an appropriate time.
Use that provision, which allows the Driver to capture boot-time logs
too; these can be collected once Userspace comes up.

v2:
- Remove the couple of FIXMEs, as now the relay channel will be created
  early before enabling the flush interrupts, so no possibility of relay
  channel pointer being modified & read at the same time from 2 different
  execution contexts.
- Rebase.

v3:
- Add a comment to justify setting 'is_global' before the NULL check on the
  parent directory dentry pointer.

Suggested-by: Chris Wilson 
Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 73 --
 1 file changed, 50 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 6b81fbd..ef0c116 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -896,15 +896,16 @@ static struct dentry *create_buf_file_callback(const char *filename,
 {
struct dentry *buf_file;
 
-   if (!parent)
-   return NULL;
-
/* This to enable the use of a single buffer for the relay channel and
 * correspondingly have a single file exposed to User, through which
 * it can collect the logs in order without any post-processing.
+* Need to set 'is_global' even if parent is NULL for early logging.
 */
*is_global = 1;
 
+   if (!parent)
+   return NULL;
+
/* Not using the channel filename passed as an argument, since for each
 * channel relay appends the corresponding CPU number to the filename
 * passed in relay_open(). This should be fine as relay just needs a
@@ -937,13 +938,39 @@ static void guc_remove_log_relay_file(struct intel_guc *guc)
relay_close(guc->log.relay_chan);
 }
 
-static int guc_create_log_relay_file(struct intel_guc *guc)
+static int guc_create_relay_channel(struct intel_guc *guc)
 {
struct drm_i915_private *dev_priv = guc_to_i915(guc);
struct rchan *guc_log_relay_chan;
-   struct dentry *log_dir;
size_t n_subbufs, subbuf_size;
 
+   /* Keep the size of sub buffers same as shared log buffer */
+   subbuf_size = guc->log.vma->obj->base.size;
+
+   /* Store up to 8 snapshots, which is large enough to buffer sufficient
+* boot time logs and provides enough leeway to User, in terms of
+* latency, for consuming the logs from relay. Also doesn't take
+* up too much memory.
+*/
+   n_subbufs = 8;
+
+   guc_log_relay_chan = relay_open(NULL, NULL, subbuf_size,
+   n_subbufs, &relay_callbacks, dev_priv);
+   if (!guc_log_relay_chan) {
+   DRM_ERROR("Couldn't create relay chan for GuC logging\n");
+   return -ENOMEM;
+   }
+
+   guc->log.relay_chan = guc_log_relay_chan;
+   return 0;
+}
+
+static int guc_create_log_relay_file(struct intel_guc *guc)
+{
+   struct drm_i915_private *dev_priv = guc_to_i915(guc);
+   struct dentry *log_dir;
+   int ret;
+
/* For now create the log file in /sys/kernel/debug/dri/0 dir */
log_dir = dev_priv->drm.primary->debugfs_root;
 
@@ -963,25 +990,12 @@ static int guc_create_log_relay_file(struct intel_guc *guc)
return -ENODEV;
}
 
-   /* Keep the size of sub buffers same as shared log buffer */
-   subbuf_size = guc->log.vma->obj->base.size;
-
-   /* Store up to 8 snapshots, which is large enough to buffer sufficient
-* boot time logs and provides enough leeway to User, in terms of
-* latency, for consuming the logs from relay. Also doesn't take
-* up too much memory.
-*/
-   n_subbufs = 8;
-
-   guc_log_relay_chan = relay_open("guc_log", log_dir, subbuf_size,
-   n_subbufs, &relay_callbacks, dev_priv);
-   if (!guc_log_relay_chan) {
-   DRM_ERROR("Couldn't create relay chan for GuC logging\n");
-   return -ENOMEM;
+   ret = relay_late_setup_files(guc->log.relay_chan, "guc_log", log_dir);
+   if (ret) {
+   DRM_ERROR("Couldn't associate relay chan with fil

[Intel-gfx] [PATCH 19/19] drm/i915: Sync against the GuC log buffer flush work item on system suspend

2016-08-17 Thread akash . goel
From: Akash Goel 

The GuC log buffer flush work item does a register access to send the ack
to GuC, and this work item, if not synced before suspend, can potentially
get executed after the GFX device is suspended.
The work item function uses rpm_get/rpm_put calls around the Hw access,
which covers the runtime suspend case, but for the system suspend case
(which can be done asynchronously/forcefully) a sync is required, as the
kernel can potentially schedule work items even after some devices,
including GFX, have been put to suspend.
Also the sync has to be done conditionally, i.e. only for the system
suspend case, as a sync combined with the rpm_get/rpm_put calls can cause
a deadlock in the rpm suspend path.

Cc: Imre Deak 
Signed-off-by: Akash Goel 
---
 drivers/gpu/drm/i915/i915_drv.c| 4 ++--
 drivers/gpu/drm/i915/i915_guc_submission.c | 8 +++-
 drivers/gpu/drm/i915/intel_guc.h   | 2 +-
 3 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index cdee60b..2ae0ad4 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1427,7 +1427,7 @@ static int i915_drm_suspend(struct drm_device *dev)
goto out;
}
 
-   intel_guc_suspend(dev);
+   intel_guc_suspend(dev, false);
 
intel_display_suspend(dev);
 
@@ -2321,7 +2321,7 @@ static int intel_runtime_suspend(struct device *device)
i915_gem_release_all_mmaps(dev_priv);
mutex_unlock(&dev->struct_mutex);
 
-   intel_guc_suspend(dev);
+   intel_guc_suspend(dev, true);
 
intel_runtime_pm_disable_interrupts(dev_priv);
 
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index ef0c116..1af8a8b 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -1519,7 +1519,7 @@ void i915_guc_submission_fini(struct drm_i915_private *dev_priv)
  * intel_guc_suspend() - notify GuC entering suspend state
  * @dev:   drm device
  */
-int intel_guc_suspend(struct drm_device *dev)
+int intel_guc_suspend(struct drm_device *dev, bool rpm_suspend)
 {
struct drm_i915_private *dev_priv = to_i915(dev);
struct intel_guc *guc = &dev_priv->guc;
@@ -1530,6 +1530,12 @@ int intel_guc_suspend(struct drm_device *dev)
return 0;
 
gen9_disable_guc_interrupts(dev_priv);
+   /* Sync is needed only for the system suspend case, runtime suspend
+* case is covered due to rpm get/put calls used around Hw access in
+* the work item function.
+*/
+   if (!rpm_suspend && (i915.guc_log_level >= 0))
+   flush_work(&dev_priv->guc.log.flush_work);
 
ctx = dev_priv->kernel_context;
 
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index b56fe24..1367314 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -172,7 +172,7 @@ extern void intel_guc_init(struct drm_device *dev);
 extern int intel_guc_setup(struct drm_device *dev);
 extern void intel_guc_fini(struct drm_device *dev);
 extern const char *intel_guc_fw_status_repr(enum intel_guc_fw_status status);
-extern int intel_guc_suspend(struct drm_device *dev);
+extern int intel_guc_suspend(struct drm_device *dev, bool rpm_suspend);
 extern int intel_guc_resume(struct drm_device *dev);
 
 /* i915_guc_submission.c */
-- 
1.9.2


[Intel-gfx] [PATCH 10/19] drm/i915: Add stats for GuC log buffer flush interrupts

2016-08-17 Thread akash . goel
From: Akash Goel 

GuC firmware sends an interrupt to flush the log buffer when it
becomes half full. GuC firmware also tracks how many times the
buffer overflowed.
It would be useful to maintain statistics of how many flush
interrupts were received and for which type of log buffer,
along with the overflow count of each buffer type.
Augmented i915_log_info debugfs to report back these statistics.

v2:
- Update the logic to detect multiple overflows between the 2
  flush interrupts and also log a message for overflow (Tvrtko)
- Track the number of times there was no free sub buffer to capture
  the GuC log buffer. (Tvrtko)

v3:
- Fix the printf field width for overflow counter, set it to 10 as per the
  max value of u32, which takes 10 digits in decimal form. (Tvrtko)
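
The overflow bookkeeping described above can be illustrated with a small
standalone sketch. It assumes, as the comment in the patch says, that the
firmware's buffer_full_cnt is a 4 bit counter. The names struct
overflow_stats and overflow_update() are hypothetical, and the sketch uses
a masked (modulo 16) difference, which is a variant of the logic in the
patch rather than a copy of it.

```c
#include <assert.h>
#include <stdint.h>

#define FULL_CNT_BITS	4u
#define FULL_CNT_MASK	((1u << FULL_CNT_BITS) - 1u)

/* Hypothetical per-buffer bookkeeping, not the driver's structure. */
struct overflow_stats {
	uint32_t prev;	/* last sampled value of the 4 bit counter */
	uint32_t total;	/* accumulated overflow count */
};

/* Fold a newly sampled 4 bit counter value into the running total. */
void overflow_update(struct overflow_stats *st, uint32_t sampled)
{
	uint32_t cur = sampled & FULL_CNT_MASK;
	/* Modulo 16 difference; also correct when cur < prev after a wrap. */
	uint32_t delta = (cur - st->prev) & FULL_CNT_MASK;

	st->total += delta;
	st->prev = cur;
}
```

With a counter this narrow, 16 or more overflows between two samples
cannot be told apart from fewer, so the total is a lower bound in that
case.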

Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_debugfs.c| 28 
 drivers/gpu/drm/i915/i915_guc_submission.c | 20 
 drivers/gpu/drm/i915/i915_irq.c|  2 ++
 drivers/gpu/drm/i915/intel_guc.h   |  7 +++
 4 files changed, 57 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index b2cfcf4..aaa51bc 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2491,6 +2491,32 @@ static int i915_guc_load_status_info(struct seq_file *m, void *data)
return 0;
 }
 
+static void i915_guc_log_info(struct seq_file *m,
+ struct drm_i915_private *dev_priv)
+{
+   struct intel_guc *guc = &dev_priv->guc;
+
+   seq_puts(m, "\nGuC logging stats:\n");
+
+   seq_printf(m, "\tISR:   flush count %10u, overflow count %10u\n",
+  guc->log.flush_count[GUC_ISR_LOG_BUFFER],
+  guc->log.total_overflow_count[GUC_ISR_LOG_BUFFER]);
+
+   seq_printf(m, "\tDPC:   flush count %10u, overflow count %10u\n",
+  guc->log.flush_count[GUC_DPC_LOG_BUFFER],
+  guc->log.total_overflow_count[GUC_DPC_LOG_BUFFER]);
+
+   seq_printf(m, "\tCRASH: flush count %10u, overflow count %10u\n",
+  guc->log.flush_count[GUC_CRASH_DUMP_LOG_BUFFER],
+  guc->log.total_overflow_count[GUC_CRASH_DUMP_LOG_BUFFER]);
+
+   seq_printf(m, "\tTotal flush interrupt count: %u\n",
+  guc->log.flush_interrupt_count);
+
+   seq_printf(m, "\tCapture miss count: %u\n",
+  guc->log.capture_miss_count);
+}
+
 static void i915_guc_client_info(struct seq_file *m,
 struct drm_i915_private *dev_priv,
 struct i915_guc_client *client)
@@ -2565,6 +2591,8 @@ static int i915_guc_info(struct seq_file *m, void *data)
seq_printf(m, "\nGuC execbuf client @ %p:\n", guc.execbuf_client);
i915_guc_client_info(m, dev_priv, &client);
 
+   i915_guc_log_info(m, dev_priv);
+
/* Add more as required ... */
 
return 0;
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 99f4afa..8b53a22 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -1036,6 +1036,25 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
   sizeof(struct guc_log_buffer_state));
buffer_size = guc_get_log_buffer_size(type);
 
+   /* Bookkeeping stuff */
+   guc->log.flush_count[type] += log_buffer_state_local.flush_to_file;
+   if (log_buffer_state_local.buffer_full_cnt !=
+   guc->log.prev_overflow_count[type]) {
+   guc->log.total_overflow_count[type] +=
+   (log_buffer_state_local.buffer_full_cnt -
+guc->log.prev_overflow_count[type]);
+
+   if (log_buffer_state_local.buffer_full_cnt <
+   guc->log.prev_overflow_count[type]) {
+   /* buffer_full_cnt is a 4 bit counter */
+   guc->log.total_overflow_count[type] += 16;
+   }
+
+   guc->log.prev_overflow_count[type] =
+   log_buffer_state_local.buffer_full_cnt;
+   DRM_ERROR_RATELIMITED("GuC log buffer overflow\n");
+   }
+
if (log_buffer_snapshot_state) {
/* First copy the state structure in snapshot buffer */
memcpy(log_buffer_snapshot_state, &log_buffer_state_local,
@@ -1076,6 +1095,7 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
 * getting consumed by User at a slow rate.
 */
 

[Intel-gfx] [PATCH 17/19] drm/i915: Use SSE4.1 movntdqa based memcpy for sampling GuC log buffer

2016-08-17 Thread akash . goel
From: Akash Goel 

In order to have fast reads from the GuC log buffer, use the SSE4.1
movntdqa based memcpy function i915_memcpy_from_wc.
The GuC log buffer has a WC type vmalloc mapping, and copying from WC type
memory using movntdqa is almost as fast as reading from WB memory.
This further reduces the log buffer sampling time, which is badly needed
to deal with the flush interrupt storm when GuC is generating logs at a
very high rate.
Ideally SSE 4.1 should be present on all chipsets supporting GuC based
submissions, but if not then logging will not be enabled.
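
For reference, the copy pattern that this patch speeds up can be sketched
in userspace C. The sketch uses plain memcpy(), since i915_memcpy_from_wc
is only available inside the driver. The function name
copy_new_log_data() and its parameters are hypothetical; only the
read_offset/write_offset wrap handling mirrors the hunk below.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy only the bytes written since the last read, placing them at the
 * same offsets in dst. Returns the number of bytes copied.
 */
size_t copy_new_log_data(unsigned char *dst, const unsigned char *src,
			 size_t buffer_size, size_t read_offset,
			 size_t write_offset)
{
	size_t copied;

	assert(read_offset <= buffer_size && write_offset <= buffer_size);

	if (read_offset <= write_offset) {
		/* No wrap: one contiguous chunk. */
		copied = write_offset - read_offset;
		memcpy(dst + read_offset, src + read_offset, copied);
	} else {
		/* Writer wrapped: copy the tail first, then the head. */
		copied = buffer_size - read_offset;
		memcpy(dst + read_offset, src + read_offset, copied);
		memcpy(dst, src, write_offset);
		copied += write_offset;
	}

	return copied;
}
```

Swapping memcpy() for a faster WC read routine does not change this
structure, only the cost of each chunk.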

v2: Rebase.

Suggested-by: Chris Wilson 
Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 23 ++-
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 3e8af71..6b81fbd 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -,15 +,18 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
/* Just copy the newly written data */
if (read_offset <= write_offset) {
bytes_to_copy = write_offset - read_offset;
-   memcpy(dst_data_ptr + read_offset,
-  src_data_ptr + read_offset, bytes_to_copy);
+   i915_memcpy_from_wc(dst_data_ptr + read_offset,
+   src_data_ptr + read_offset,
+   bytes_to_copy);
} else {
bytes_to_copy = buffer_size - read_offset;
-   memcpy(dst_data_ptr + read_offset,
-  src_data_ptr + read_offset, bytes_to_copy);
+   i915_memcpy_from_wc(dst_data_ptr + read_offset,
+   src_data_ptr + read_offset,
+   bytes_to_copy);
 
bytes_to_copy = write_offset;
-   memcpy(dst_data_ptr, src_data_ptr, bytes_to_copy);
+   i915_memcpy_from_wc(dst_data_ptr, src_data_ptr,
+   bytes_to_copy);
}
 
src_data_ptr += buffer_size;
@@ -1241,6 +1244,16 @@ static void guc_create_log(struct intel_guc *guc)
 
vma = guc->log.vma;
if (!vma) {
+   /* We require SSE 4.1 for fast reads from the GuC log buffer and
+* it should be present on the chipsets supporting GuC based
+* submisssions.
+*/
+   if (WARN_ON(!i915_memcpy_from_wc(NULL, NULL, 0))) {
+   /* logging will not be enabled */
+   i915.guc_log_level = -1;
+   return;
+   }
+
vma = guc_allocate_vma(guc, size);
if (IS_ERR(vma)) {
/* logging will be off */
-- 
1.9.2


[Intel-gfx] [PATCH 15/19] drm/i915: Debugfs support for GuC logging control

2016-08-17 Thread akash . goel
From: Sagar Arun Kamble 

This patch provides the debugfs interface i915_guc_log_control for
on-the-fly enabling/disabling of logging in the GuC firmware and for
controlling the verbosity level of the logs.
The value written to the file should have bit 0 set to enable logging, and
bits 4-7 should contain the verbosity info.
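
The bit layout described above can be made concrete with a short sketch.
The macro and function names here are hypothetical (the patch itself
decodes the value through union guc_log_control, whose definition is not
shown in this mail), and the valid verbosity range is not checked.

```c
#include <assert.h>
#include <stdint.h>

#define LOG_CTRL_ENABLE			(1u << 0)	/* bit 0 */
#define LOG_CTRL_VERBOSITY_SHIFT	4		/* bits 4-7 */
#define LOG_CTRL_VERBOSITY_MASK		(0xfu << LOG_CTRL_VERBOSITY_SHIFT)

/* Build the value to be written to the debugfs file. */
uint32_t log_ctrl_pack(int enable, uint32_t verbosity)
{
	uint32_t val = (verbosity << LOG_CTRL_VERBOSITY_SHIFT) &
		       LOG_CTRL_VERBOSITY_MASK;

	if (enable)
		val |= LOG_CTRL_ENABLE;

	return val;
}

/* Extract the enable flag from a control value. */
int log_ctrl_enabled(uint32_t val)
{
	return (val & LOG_CTRL_ENABLE) != 0;
}

/* Extract the verbosity field from a control value. */
uint32_t log_ctrl_verbosity(uint32_t val)
{
	return (val & LOG_CTRL_VERBOSITY_MASK) >> LOG_CTRL_VERBOSITY_SHIFT;
}
```

Under this layout, writing 33 (0x21) would request logging enabled at
verbosity 2, assuming that verbosity value is within the range the
firmware accepts.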

v2: Add a forceful flush, to collect left over logs, on disabling logging.
Useful for Validation.

v3: Besides minor cleanup, implement read method for the debugfs file and
set the guc_log_level to -1 when logging is disabled. (Tvrtko)

v4: Minor cleanup & rebase. (Tvrtko)

Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_debugfs.c| 44 -
 drivers/gpu/drm/i915/i915_guc_submission.c | 62 ++
 drivers/gpu/drm/i915/intel_guc.h   |  1 +
 3 files changed, 106 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index aaa51bc..bfc22bf 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2626,6 +2626,47 @@ static int i915_guc_log_dump(struct seq_file *m, void *data)
return 0;
 }
 
+static int i915_guc_log_control_get(void *data, u64 *val)
+{
+   struct drm_device *dev = data;
+   struct drm_i915_private *dev_priv = to_i915(dev);
+
+   if (!dev_priv->guc.log.vma)
+   return -EINVAL;
+
+   *val = i915.guc_log_level;
+
+   return 0;
+}
+
+static int i915_guc_log_control_set(void *data, u64 val)
+{
+   struct drm_device *dev = data;
+   struct drm_i915_private *dev_priv = to_i915(dev);
+   int ret;
+
+   ret = mutex_lock_interruptible(&dev->struct_mutex);
+   if (ret)
+   return ret;
+
+   if (!dev_priv->guc.log.vma) {
+   ret = -EINVAL;
+   goto end;
+   }
+
+   intel_runtime_pm_get(dev_priv);
+   ret = i915_guc_log_control(dev_priv, val);
+   intel_runtime_pm_put(dev_priv);
+
+end:
+   mutex_unlock(&dev->struct_mutex);
+   return ret;
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(i915_guc_log_control_fops,
+   i915_guc_log_control_get, i915_guc_log_control_set,
+   "%lld\n");
+
 static int i915_edp_psr_status(struct seq_file *m, void *data)
 {
struct drm_info_node *node = m->private;
@@ -5436,7 +5477,8 @@ static const struct i915_debugfs_files {
{"i915_fbc_false_color", &i915_fbc_fc_fops},
{"i915_dp_test_data", &i915_displayport_test_data_fops},
{"i915_dp_test_type", &i915_displayport_test_type_fops},
-   {"i915_dp_test_active", &i915_displayport_test_active_fops}
+   {"i915_dp_test_active", &i915_displayport_test_active_fops},
+   {"i915_guc_log_control", &i915_guc_log_control_fops}
 };
 
 void intel_display_crc_init(struct drm_device *dev)
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index beeeb23..5a302fe 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -195,6 +195,16 @@ static int host2guc_force_logbuffer_flush(struct intel_guc *guc)
return host2guc_action(guc, data, 2);
 }
 
+static int host2guc_logging_control(struct intel_guc *guc, u32 control_val)
+{
+   u32 data[2];
+
+   data[0] = HOST2GUC_ACTION_UK_LOG_ENABLE_LOGGING;
+   data[1] = control_val;
+
+   return host2guc_action(guc, data, 2);
+}
+
 /*
  * Initialise, update, or clear doorbell data shared with the GuC
  *
@@ -1570,3 +1580,55 @@ void i915_guc_register(struct drm_i915_private *dev_priv)
guc_log_late_setup(&dev_priv->guc);
mutex_unlock(&dev_priv->drm.struct_mutex);
 }
+
+int i915_guc_log_control(struct drm_i915_private *dev_priv, u64 control_val)
+{
+   union guc_log_control log_param;
+   int ret;
+
+   log_param.value = control_val;
+
+   if (log_param.verbosity < GUC_LOG_VERBOSITY_MIN ||
+   log_param.verbosity > GUC_LOG_VERBOSITY_MAX)
+   return -EINVAL;
+
+   /* This combination doesn't make sense & won't have any effect */
+   if (!log_param.logging_enabled && (i915.guc_log_level < 0))
+   return 0;
+
+   ret = host2guc_logging_control(&dev_priv->guc, log_param.value);
+   if (ret < 0) {
+   DRM_DEBUG_DRIVER("host2guc action failed %d\n", ret);
+   return ret;
+   }
+
+   i915.guc_log_level = log_param.verbosity;
+
+   /* If log_level was set as -1 at boot time, then the relay channel file
+* wouldn't have been created by now and interrupts also would not have
+* been enabled.
+*/
+   if (!dev_priv->guc.log.relay_chan)

[Intel-gfx] [PATCH 14/19] drm/i915: Forcefully flush GuC log buffer on reset

2016-08-17 Thread akash . goel
From: Sagar Arun Kamble 

Before capturing the GuC logs as a part of error state, a forced log
buffer flush action should be sent to GuC before proceeding with GPU reset
and re-initializing GuC. There could be some data in the log buffer which
is yet to be captured, and those logs would be particularly useful for
understanding why the GPU reset was initiated.

v2:
- Avoid the wait via flush_work, to serialize against an ongoing log
  buffer flush, from the error state capture path. (Chris)
- Rebase.

Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
---
 drivers/gpu/drm/i915/i915_gpu_error.c  |  2 ++
 drivers/gpu/drm/i915/i915_guc_submission.c | 30 ++
 drivers/gpu/drm/i915/intel_guc.h   |  1 +
 3 files changed, 33 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 94297aa..b73c671 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1301,6 +1301,8 @@ static void i915_gem_capture_guc_log_buffer(struct drm_i915_private *dev_priv,
if (!dev_priv->guc.log.vma || (i915.guc_log_level < 0))
return;
 
+   i915_guc_flush_logs(dev_priv, false);
+
error->guc_log = i915_error_object_create(dev_priv,
  dev_priv->guc.log.vma);
 }
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index e46828b..beeeb23 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -185,6 +185,16 @@ static int host2guc_logbuffer_flush_complete(struct intel_guc *guc)
return host2guc_action(guc, data, 1);
 }
 
+static int host2guc_force_logbuffer_flush(struct intel_guc *guc)
+{
+   u32 data[2];
+
+   data[0] = HOST2GUC_ACTION_FORCE_LOG_BUFFER_FLUSH;
+   data[1] = 0;
+
+   return host2guc_action(guc, data, 2);
+}
+
 /*
  * Initialise, update, or clear doorbell data shared with the GuC
  *
@@ -1521,6 +1531,26 @@ void i915_guc_capture_logs(struct drm_i915_private *dev_priv)
intel_runtime_pm_put(dev_priv);
 }
 
+void i915_guc_flush_logs(struct drm_i915_private *dev_priv, bool can_wait)
+{
+   if (!i915.enable_guc_submission || (i915.guc_log_level < 0))
+   return;
+
+   /* First disable the interrupts, will be re-enabled afterwards */
+   gen9_disable_guc_interrupts(dev_priv);
+
+   /* Before initiating the forceful flush, wait for any pending/ongoing
+* flush to complete otherwise forceful flush may not happen, but wait
+* can't be done for some paths like error state capture in which case
+* take a chance & directly attempt the forceful flush.
+*/
+   if (can_wait)
+   flush_work(&dev_priv->guc.log.flush_work);
+
+   /* Ask GuC to update the log buffer state */
+   host2guc_force_logbuffer_flush(&dev_priv->guc);
+}
+
 void i915_guc_unregister(struct drm_i915_private *dev_priv)
 {
if (!i915.enable_guc_submission)
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index 8598f38..d7eda42 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -182,6 +182,7 @@ int i915_guc_wq_check_space(struct drm_i915_gem_request 
*rq);
 void i915_guc_submission_disable(struct drm_i915_private *dev_priv);
 void i915_guc_submission_fini(struct drm_i915_private *dev_priv);
 void i915_guc_capture_logs(struct drm_i915_private *dev_priv);
+void i915_guc_flush_logs(struct drm_i915_private *dev_priv, bool can_wait);
 void i915_guc_register(struct drm_i915_private *dev_priv);
 void i915_guc_unregister(struct drm_i915_private *dev_priv);
 
-- 
1.9.2

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 12/19] drm/i915: Increase GuC log buffer size to reduce flush interrupts

2016-08-17 Thread akash . goel
From: Akash Goel 

In cases where GuC generates logs at a very high rate, the rate of flush
interrupts is correspondingly very high.
So far a total of 8 pages were allocated for storing both ISR & DPC logs.
As per the half-full draining protocol followed by GuC, doubling the
number of pages can cut the frequency of flush interrupts down to almost
half, which in turn reduces the logging overhead.
So now allocate 8 pages apiece for ISR & DPC logs.
This also helps in reducing the output log file size, apart from
reducing the flush interrupt count. With the original settings,
44 KB was needed for one snapshot. With the modified settings, 76 KB is
needed for a snapshot, which is equivalent to 2 snapshots of the
original setting. So there is a 12 KB saving, every 88 KB, over the
original setting.
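
The 44 KB, 76 KB and 12 KB figures can be cross-checked with a small
calculation. This is only a sketch: it assumes 4 KB pages and that each
GUC_LOG_*_PAGES value encodes "page count minus one", which is consistent
with the numbers quoted in this message. The helper names are hypothetical
and not part of the patch.

```c
#include <assert.h>

/* Assumption: 4 KB pages. */
enum { PAGE_KB = 4 };

/*
 * Snapshot size in KB: one state page plus the ISR, DPC and crash dump
 * regions. Each argument is a "pages minus one" value, as assumed above.
 */
static unsigned int guc_snapshot_kb(unsigned int isr_pages_m1,
                                    unsigned int dpc_pages_m1,
                                    unsigned int crash_pages_m1)
{
    unsigned int pages = 1 + (isr_pages_m1 + 1) + (dpc_pages_m1 + 1) +
                         (crash_pages_m1 + 1);

    return pages * PAGE_KB;
}

/* KB saved by one new snapshot compared with two old snapshots. */
static unsigned int guc_saving_kb(void)
{
    unsigned int old_kb = guc_snapshot_kb(3, 3, 1); /* old ISR/DPC values */
    unsigned int new_kb = guc_snapshot_kb(7, 7, 1); /* new ISR/DPC values */

    assert(2 * old_kb >= new_kb);
    return 2 * old_kb - new_kb;
}
```

Under these assumptions the old layout comes to 11 pages and the new one
to 19 pages, so one new snapshot replaces two old ones while being 3 pages
smaller.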

Suggested-by: Tvrtko Ursulin 
Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/intel_guc_fwif.h | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h 
b/drivers/gpu/drm/i915/intel_guc_fwif.h
index 47ef7e8..49dbe25 100644
--- a/drivers/gpu/drm/i915/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
@@ -104,9 +104,9 @@
 #define   GUC_LOG_ALLOC_IN_MEGABYTE(1 << 3)
 #define   GUC_LOG_CRASH_PAGES  1
 #define   GUC_LOG_CRASH_SHIFT  4
-#define   GUC_LOG_DPC_PAGES3
+#define   GUC_LOG_DPC_PAGES7
 #define   GUC_LOG_DPC_SHIFT6
-#define   GUC_LOG_ISR_PAGES3
+#define   GUC_LOG_ISR_PAGES7
 #define   GUC_LOG_ISR_SHIFT9
 #define   GUC_LOG_BUF_ADDR_SHIFT   12
 
@@ -436,9 +436,9 @@ enum guc_log_buffer_type {
  *|   Crash dump state header |
  * Page1  +---+
  *|   ISR logs|
- * Page5  +---+
- *|   DPC logs|
  * Page9  +---+
+ *|   DPC logs|
+ * Page17 +---+
  *| Crash Dump logs   |
  *+---+
  *
-- 
1.9.2



[Intel-gfx] [PATCH 13/19] drm/i915: Augment i915 error state to include the dump of GuC log buffer

2016-08-17 Thread akash . goel
From: Akash Goel 

Added the dump of GuC log buffer to i915 error state, as the contents of
GuC log buffer would also be useful to determine why the GPU reset
was triggered.

v2:
- For uniformity use existing helper function print_error_obj() to
  dump out contents of GuC log buffer, pretty printing is better left
  to userspace. (Chris)
- Skip the dumping of GuC log buffer when logging is disabled as it
  won't be of any use.
- Rebase.

Suggested-by: Chris Wilson 
Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_drv.h   |  1 +
 drivers/gpu/drm/i915/i915_gpu_error.c | 20 
 2 files changed, 21 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 08553b9..de89596 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -734,6 +734,7 @@ struct drm_i915_error_state {
struct intel_overlay_error_state *overlay;
struct intel_display_error_state *display;
struct drm_i915_error_object *semaphore;
+   struct drm_i915_error_object *guc_log;
 
struct drm_i915_error_engine {
int engine_id;
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
b/drivers/gpu/drm/i915/i915_gpu_error.c
index 0c3f30c..94297aa 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -572,6 +572,13 @@ int i915_error_state_to_str(struct 
drm_i915_error_state_buf *m,
}
}
 
+   obj = error->guc_log;
+   if (obj) {
+   err_printf(m, "GuC log buffer = 0x%08x\n",
+  lower_32_bits(obj->gtt_offset));
+   print_error_obj(m, obj);
+   }
+
if (error->overlay)
intel_overlay_print_error_state(m, error->overlay);
 
@@ -651,6 +658,7 @@ static void i915_error_state_free(struct kref *error_ref)
}
 
i915_error_object_free(error->semaphore);
+   i915_error_object_free(error->guc_log);
 
for (i = 0; i < ARRAY_SIZE(error->active_bo); i++)
kfree(error->active_bo[i]);
@@ -1286,6 +1294,17 @@ static void i915_capture_pinned_buffers(struct 
drm_i915_private *dev_priv,
error->pinned_bo = bo;
 }
 
+static void i915_gem_capture_guc_log_buffer(struct drm_i915_private *dev_priv,
+   struct drm_i915_error_state *error)
+{
+   /* Capturing log buf contents won't be useful if logging was disabled */
+   if (!dev_priv->guc.log.vma || (i915.guc_log_level < 0))
+   return;
+
+   error->guc_log = i915_error_object_create(dev_priv,
+ dev_priv->guc.log.vma);
+}
+
 /* Capture all registers which don't fit into another category. */
 static void i915_capture_reg_state(struct drm_i915_private *dev_priv,
   struct drm_i915_error_state *error)
@@ -1437,6 +1456,7 @@ void i915_capture_error_state(struct drm_i915_private 
*dev_priv,
i915_gem_record_rings(dev_priv, error);
i915_capture_active_buffers(dev_priv, error);
i915_capture_pinned_buffers(dev_priv, error);
+   i915_gem_capture_guc_log_buffer(dev_priv, error);
 
do_gettimeofday(&error->time);
 
-- 
1.9.2



[Intel-gfx] [PATCH 11/19] drm/i915: Optimization to reduce the sampling time of GuC log buffer

2016-08-17 Thread akash . goel
From: Akash Goel 

GuC firmware sends an interrupt to flush the log buffer when it becomes
half full, so Driver doesn't really need to sample the complete buffer
and can copy only the data newly written by GuC into the local
buffer, i.e. as per the read & write pointer values.
Moreover the flush interrupt would generally come for one type of log
buffer, when it becomes half full, so at that time the other 2 types of
log buffer would have comparatively much less unread data in them.
In case of overflow reported by GuC, Driver does need to copy the entire
buffer as the whole buffer would contain unread data.
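
The copy described above can be sketched in plain C as follows. This is a
user-space illustration, not the driver code: copy_new_log_data() and its
parameters are hypothetical names, and the wrap-around branch (tail first,
then head) is an assumption about how a circular buffer would normally be
handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy only the unread region of a circular log buffer from src to the
 * same offsets in dst. On overflow, or if an offset lies outside the
 * buffer, fall back to copying the whole buffer.
 * Returns the number of bytes copied.
 */
static size_t copy_new_log_data(unsigned char *dst, const unsigned char *src,
                                size_t size, size_t read_off,
                                size_t write_off, int overflow)
{
    size_t copied;

    assert(dst != NULL && src != NULL);

    if (overflow || read_off > size || write_off > size) {
        /* Offsets are stale or unreliable: take everything. */
        read_off = 0;
        write_off = size;
    }

    if (read_off <= write_off) {
        /* Unread data is one contiguous run. */
        copied = write_off - read_off;
        memcpy(dst + read_off, src + read_off, copied);
    } else {
        /* Writer wrapped: copy the tail, then the head. */
        size_t tail = size - read_off;

        memcpy(dst + read_off, src + read_off, tail);
        memcpy(dst, src, write_off);
        copied = tail + write_off;
    }

    return copied;
}
```

Copying to the same offsets in the destination keeps the snapshot layout
identical to the source buffer, so a parser can apply the read & write
pointers from the state header unchanged.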

v2: Rebase.

v3: Fix the blooper of doing the copy twice. (Tvrtko)

v4: Add curlies for 'else' case also, matching the 'if'. (Tvrtko)

Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 40 +-
 1 file changed, 34 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 8b53a22..e46828b 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -1011,8 +1011,9 @@ static void guc_read_update_log_buffer(struct intel_guc 
*guc)
struct guc_log_buffer_state *log_buffer_state, 
*log_buffer_snapshot_state;
struct guc_log_buffer_state log_buffer_state_local;
void *src_data_ptr, *dst_data_ptr;
-   unsigned int buffer_size;
+   unsigned int buffer_size, read_offset, write_offset, bytes_to_copy;
enum guc_log_buffer_type type;
+   bool new_overflow;
 
if (WARN_ON(!guc->log.buf_addr))
return;
@@ -1035,11 +1036,14 @@ static void guc_read_update_log_buffer(struct intel_guc 
*guc)
memcpy(&log_buffer_state_local, log_buffer_state,
   sizeof(struct guc_log_buffer_state));
buffer_size = guc_get_log_buffer_size(type);
+   read_offset = log_buffer_state_local.read_ptr;
+   write_offset = log_buffer_state_local.sampled_write_ptr;
 
/* Bookkeeping stuff */
guc->log.flush_count[type] += 
log_buffer_state_local.flush_to_file;
if (log_buffer_state_local.buffer_full_cnt !=
guc->log.prev_overflow_count[type]) {
+   new_overflow = 1;
guc->log.total_overflow_count[type] +=
(log_buffer_state_local.buffer_full_cnt -
 guc->log.prev_overflow_count[type]);
@@ -1053,6 +1057,8 @@ static void guc_read_update_log_buffer(struct intel_guc 
*guc)
guc->log.prev_overflow_count[type] =
log_buffer_state_local.buffer_full_cnt;
DRM_ERROR_RATELIMITED("GuC log buffer overflow\n");
+   } else {
+   new_overflow = 0;
}
 
if (log_buffer_snapshot_state) {
@@ -1065,13 +1071,36 @@ static void guc_read_update_log_buffer(struct intel_guc 
*guc)
 * for consistency set the write pointer value to same
 * value of sampled_write_ptr in the snapshot buffer.
 */
-   log_buffer_snapshot_state->write_ptr =
-   log_buffer_snapshot_state->sampled_write_ptr;
+   log_buffer_snapshot_state->write_ptr = write_offset;
 
log_buffer_snapshot_state++;
 
/* Now copy the actual logs. */
-   memcpy(dst_data_ptr, src_data_ptr, buffer_size);
+   if (unlikely(new_overflow)) {
+   /* copy the whole buffer in case of overflow */
+   read_offset = 0;
+   write_offset = buffer_size;
+   } else if (unlikely((read_offset > buffer_size) ||
+   (write_offset > buffer_size))) {
+   DRM_ERROR("invalid log buffer state\n");
+   /* copy whole buffer as offsets are unreliable 
*/
+   read_offset = 0;
+   write_offset = buffer_size;
+   }
+
+   /* Just copy the newly written data */
+   if (read_offset <= write_offset) {
+   bytes_to_copy = write_offset - read_offset;
+   memcpy(dst_data_ptr + read_offset,
+  src_data_ptr + read_offset, 
bytes_to_copy);
+   } else {
+   bytes_to_copy = buffer_size - read_offset;
+   memcpy(dst_

[Intel-gfx] [PATCH 09/19] drm/i915: New lock to serialize the Host2GuC actions

2016-08-17 Thread akash . goel
From: Akash Goel 

With the addition of new Host2GuC actions related to GuC logging, a lock
is needed to serialize them, as they can execute concurrently with each
other and also with other existing actions.

v2: Use mutex in place of spinlock to serialize, as sleep can happen
while waiting for the action's response from GuC. (Tvrtko)

v3: To conform to the general rules, acquire mutex before taking the
forcewake. (Tvrtko)

Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 3 +++
 drivers/gpu/drm/i915/intel_guc.h   | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 1713ce8..99f4afa 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -87,6 +87,7 @@ static int host2guc_action(struct intel_guc *guc, u32 *data, 
u32 len)
if (WARN_ON(len < 1 || len > 15))
return -EINVAL;
 
+   mutex_lock(&guc->action_lock);
intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
 
dev_priv->guc.action_count += 1;
@@ -127,6 +128,7 @@ static int host2guc_action(struct intel_guc *guc, u32 
*data, u32 len)
dev_priv->guc.action_status = status;
 
intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
+   mutex_unlock(&guc->action_lock);
 
return ret;
 }
@@ -1344,6 +1346,7 @@ int i915_guc_submission_init(struct drm_i915_private 
*dev_priv)
 
guc->ctx_pool_vma = vma;
ida_init(&guc->ctx_ids);
+   mutex_init(&guc->action_lock);
guc_create_log(guc);
guc_create_ads(guc);
 
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index 3299cce..1704495 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -155,6 +155,9 @@ struct intel_guc {
 
uint64_t submissions[I915_NUM_ENGINES];
uint32_t last_seqno[I915_NUM_ENGINES];
+
+   /* To serialize the Host2GuC actions */
+   struct mutex action_lock;
 };
 
 /* intel_guc_loader.c */
-- 
1.9.2



[Intel-gfx] [PATCH 05/19] drm/i915: Support for GuC interrupts

2016-08-17 Thread akash . goel
From: Sagar Arun Kamble 

There are certain types of interrupts which Host can receive from GuC.
GuC ukernel sends an interrupt to Host for certain events, for example
to ask Host to retrieve/consume the logs generated by ukernel.
This patch adds support to receive interrupts from GuC but currently
enables & partially handles only the interrupt sent by GuC ukernel.
Future patches will add support for handling other interrupt types.

v2:
- Use common low level routines for PM IER/IIR programming (Chris)
- Rename interrupt functions to gen9_xxx from gen8_xxx (Chris)
- Replace disabling of wake ref asserts with rpm get/put (Chris)

v3:
- Update comments for more clarity. (Tvrtko)
- Remove the masking of GuC interrupt, which was kept masked till the
  start of bottom half, it's not really needed as there is only a
  single instance of work item & wq is ordered. (Tvrtko)

v4:
- Rebase.
- Rename guc_events to pm_guc_events so as to be indicative of the
  register/control block it is associated with. (Chris)
- Add handling for back to back log buffer flush interrupts.

v5:
- Move the read & clearing of register, containing Guc2Host message
  bits, outside the irq spinlock. (Tvrtko)

v6:
- Move the log buffer flush interrupt related stuff to the following
  patch so as to do only generic bits in this patch. (Tvrtko)
- Rebase.

v7:
- Remove the interrupts_enabled check from gen9_guc_irq_handler, want to
  process that last interrupt also before disabling the interrupt, sync
  against the work queued by irq handler will be done by caller disabling
  the interrupt.

Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
---
 drivers/gpu/drm/i915/i915_drv.h|  1 +
 drivers/gpu/drm/i915/i915_guc_submission.c |  5 +++
 drivers/gpu/drm/i915/i915_irq.c| 56 +++---
 drivers/gpu/drm/i915/i915_reg.h| 11 ++
 drivers/gpu/drm/i915/intel_drv.h   |  3 ++
 drivers/gpu/drm/i915/intel_guc.h   |  3 ++
 drivers/gpu/drm/i915/intel_guc_loader.c|  4 +++
 7 files changed, 79 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index a187ea7..08553b9 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1783,6 +1783,7 @@ struct drm_i915_private {
u32 pm_imr;
u32 pm_ier;
u32 pm_rps_events;
+   u32 pm_guc_events;
u32 pipestat_irq_mask[I915_MAX_PIPES];
 
struct i915_hotplug hotplug;
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 34fb893..b062da6 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -1055,6 +1055,8 @@ int intel_guc_suspend(struct drm_device *dev)
if (guc->guc_fw.guc_fw_load_status != GUC_FIRMWARE_SUCCESS)
return 0;
 
+   gen9_disable_guc_interrupts(dev_priv);
+
ctx = dev_priv->kernel_context;
 
data[0] = HOST2GUC_ACTION_ENTER_S_STATE;
@@ -1081,6 +1083,9 @@ int intel_guc_resume(struct drm_device *dev)
if (guc->guc_fw.guc_fw_load_status != GUC_FIRMWARE_SUCCESS)
return 0;
 
+   if (i915.guc_log_level >= 0)
+   gen9_enable_guc_interrupts(dev_priv);
+
ctx = dev_priv->kernel_context;
 
data[0] = HOST2GUC_ACTION_EXIT_S_STATE;
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 83d708a..fc1fe72 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -170,6 +170,7 @@ static void gen5_assert_iir_is_zero(struct drm_i915_private 
*dev_priv,
 } while (0)
 
 static void gen6_rps_irq_handler(struct drm_i915_private *dev_priv, u32 
pm_iir);
+static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv, u32 
pm_iir);
 
 /* For display hotplug interrupt */
 static inline void
@@ -411,6 +412,38 @@ void gen6_disable_rps_interrupts(struct drm_i915_private 
*dev_priv)
gen6_reset_rps_interrupts(dev_priv);
 }
 
+void gen9_reset_guc_interrupts(struct drm_i915_private *dev_priv)
+{
+   spin_lock_irq(&dev_priv->irq_lock);
+   gen6_reset_pm_iir(dev_priv, dev_priv->pm_guc_events);
+   spin_unlock_irq(&dev_priv->irq_lock);
+}
+
+void gen9_enable_guc_interrupts(struct drm_i915_private *dev_priv)
+{
+   spin_lock_irq(&dev_priv->irq_lock);
+   if (!dev_priv->guc.interrupts_enabled) {
+   WARN_ON_ONCE(I915_READ(gen6_pm_iir(dev_priv)) &
+  dev_priv->pm_guc_events);
+   dev_priv->guc.interrupts_enabled = true;
+   gen6_enable_pm_irq(dev_priv, dev_priv->pm_guc_events);
+   }
+   spin_unlock_irq(&dev_priv->irq_lock);
+}
+
+void gen9_disable_guc_interrupts(struct drm_i915_private *dev_priv)
+{
+   spin_lock_irq(&dev_priv->irq_lock);
+   dev_priv->guc.interru

[Intel-gfx] [PATCH 07/19] relay: Use per CPU constructs for the relay channel buffer pointers

2016-08-17 Thread akash . goel
From: Akash Goel 

relay essentially needs to maintain a per CPU array of channel buffer
pointers, but it creates that array manually.
Instead it's better to use the per CPU constructs provided by the
kernel to allocate & access the array of pointers to channel buffers.

v2: Include  in relay.h so that it pulls in the percpu
api explicitly. (Chris)

Cc: Chris Wilson 
Cc: Tvrtko Ursulin 
Signed-off-by: Akash Goel 
Reviewed-by: Chris Wilson 
---
 include/linux/relay.h | 17 +++-
 kernel/relay.c| 74 +--
 2 files changed, 52 insertions(+), 39 deletions(-)

diff --git a/include/linux/relay.h b/include/linux/relay.h
index d7c8359..eb295e3 100644
--- a/include/linux/relay.h
+++ b/include/linux/relay.h
@@ -19,6 +19,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /*
  * Tracks changes to rchan/rchan_buf structs
@@ -63,7 +64,7 @@ struct rchan
struct kref kref;   /* channel refcount */
void *private_data; /* for user-defined data */
size_t last_toobig; /* tried to log event > subbuf size */
-   struct rchan_buf *buf[NR_CPUS]; /* per-cpu channel buffers */
+   struct rchan_buf ** __percpu buf; /* per-cpu channel buffers */
int is_global;  /* One global buffer ? */
struct list_head list;  /* for channel list */
struct dentry *parent;  /* parent dentry passed to open */
@@ -204,7 +205,7 @@ static inline void relay_write(struct rchan *chan,
struct rchan_buf *buf;
 
local_irq_save(flags);
-   buf = chan->buf[smp_processor_id()];
+   buf = *this_cpu_ptr(chan->buf);
if (unlikely(buf->offset + length > chan->subbuf_size))
length = relay_switch_subbuf(buf, length);
memcpy(buf->data + buf->offset, data, length);
@@ -230,12 +231,12 @@ static inline void __relay_write(struct rchan *chan,
 {
struct rchan_buf *buf;
 
-   buf = chan->buf[get_cpu()];
+   buf = *get_cpu_ptr(chan->buf);
if (unlikely(buf->offset + length > buf->chan->subbuf_size))
length = relay_switch_subbuf(buf, length);
memcpy(buf->data + buf->offset, data, length);
buf->offset += length;
-   put_cpu();
+   put_cpu_ptr(chan->buf);
 }
 
 /**
@@ -251,17 +252,19 @@ static inline void __relay_write(struct rchan *chan,
  */
 static inline void *relay_reserve(struct rchan *chan, size_t length)
 {
-   void *reserved;
-   struct rchan_buf *buf = chan->buf[smp_processor_id()];
+   void *reserved = NULL;
+   struct rchan_buf *buf = *get_cpu_ptr(chan->buf);
 
if (unlikely(buf->offset + length > buf->chan->subbuf_size)) {
length = relay_switch_subbuf(buf, length);
if (!length)
-   return NULL;
+   goto end;
}
reserved = buf->data + buf->offset;
buf->offset += length;
 
+end:
+   put_cpu_ptr(chan->buf);
return reserved;
 }
 
diff --git a/kernel/relay.c b/kernel/relay.c
index d797502..f55ab82 100644
--- a/kernel/relay.c
+++ b/kernel/relay.c
@@ -214,7 +214,7 @@ static void relay_destroy_buf(struct rchan_buf *buf)
__free_page(buf->page_array[i]);
relay_free_page_array(buf->page_array);
}
-   chan->buf[buf->cpu] = NULL;
+   *per_cpu_ptr(chan->buf, buf->cpu) = NULL;
kfree(buf->padding);
kfree(buf);
kref_put(&chan->kref, relay_destroy_channel);
@@ -382,20 +382,21 @@ static void __relay_reset(struct rchan_buf *buf, unsigned 
int init)
  */
 void relay_reset(struct rchan *chan)
 {
+   struct rchan_buf *buf;
unsigned int i;
 
if (!chan)
return;
 
-   if (chan->is_global && chan->buf[0]) {
-   __relay_reset(chan->buf[0], 0);
+   if (chan->is_global && (buf = *per_cpu_ptr(chan->buf, 0))) {
+   __relay_reset(buf, 0);
return;
}
 
mutex_lock(&relay_channels_mutex);
for_each_possible_cpu(i)
-   if (chan->buf[i])
-   __relay_reset(chan->buf[i], 0);
+   if ((buf = *per_cpu_ptr(chan->buf, i)))
+   __relay_reset(buf, 0);
mutex_unlock(&relay_channels_mutex);
 }
 EXPORT_SYMBOL_GPL(relay_reset);
@@ -440,7 +441,7 @@ static struct rchan_buf *relay_open_buf(struct rchan *chan, 
unsigned int cpu)
struct dentry *dentry;
 
if (chan->is_global)
-   return chan->buf[0];
+   return *per_cpu_ptr(chan->buf, 0);
 
buf = relay_create_buf(chan);
if (!buf)
@@ -464,7 +465,7 @@ static struct rchan_buf *relay_open_buf(struct rchan *chan, 
unsigned int cpu)
__relay_reset(buf, 1);
 
   

[Intel-gfx] [PATCH 08/19] drm/i915: Add a relay backed debugfs interface for capturing GuC logs

2016-08-17 Thread akash . goel
From: Akash Goel 

Added a new debugfs interface '/sys/kernel/debug/dri/guc_log' for the
User to capture GuC firmware logs. Used the relay framework to implement
the interface, where Driver just has to use a relay API to store
snapshots of the GuC log buffer in the buffer managed by relay.
A snapshot will be taken when GuC firmware sends a log buffer flush
interrupt, and up to four snapshots can be stored in the relay buffer.
The relay buffer will be operated in a mode where it will overwrite the
data not yet collected by User.
Besides the mmap method, through which User can directly access the relay
buffer contents, relay also supports the 'poll' method. Through the 'poll'
call on the log file, User can learn whenever a new snapshot of the
log buffer is taken by Driver, and so can run in tandem with the Driver and
capture the logs in a sustained/streaming manner, without any loss of data.
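
On the User side, the 'poll' based capture could look roughly like the
sketch below. It is a minimal illustration using only the standard poll(2)
and read(2) calls on an already opened file descriptor; wait_and_read() is
a hypothetical helper name, and how the log file is opened and how the
captured data is stored are left to the caller.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Wait up to timeout_ms for fd to become readable, then read at most cap
 * bytes into buf. Returns the number of bytes read, 0 on timeout or end
 * of file, and -1 on error or hang-up without data.
 */
static ssize_t wait_and_read(int fd, void *buf, size_t cap, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ready;

    assert(buf != NULL);

    ready = poll(&pfd, 1, timeout_ms);
    if (ready < 0)
        return -1;
    if (ready == 0)
        return 0;               /* nothing new within the timeout */
    if (!(pfd.revents & POLLIN))
        return -1;              /* POLLERR or POLLHUP with no data */

    return read(fd, buf, cap);
}
```

A capture tool would call this in a loop and append whatever it reads to a
file, so that it keeps pace with the snapshots produced by Driver.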

v2: Defer the creation of relay channel & associated debugfs file, as
debugfs setup is now done at the end of i915 Driver load. (Chris)

v3:
- Switch to no-overwrite mode for relay.
- Fix the relay sub buffer switching sequence.

v4:
- Update i915 Kconfig to select RELAY config. (Tvrtko)
- Log a message when there is no sub buffer available to capture
  the GuC log buffer. (Tvrtko)
- Increase the number of relay sub buffers to 8 from 4, to have
  sufficient buffering for boot time logs

v5:
- Fix the alignment, indentation issues and some minor cleanup. (Tvrtko)
- Update the comment to elaborate on why a relay channel has to be
  associated with the debugfs file. (Tvrtko)

v6:
- Move the write to 'is_global' after the NULL check on parent directory
  dentry pointer. (Tvrtko)

Suggested-by: Chris Wilson 
Signed-off-by: Sourab Gupta 
Signed-off-by: Akash Goel 
---
 drivers/gpu/drm/i915/Kconfig   |   1 +
 drivers/gpu/drm/i915/i915_drv.c|   2 +
 drivers/gpu/drm/i915/i915_guc_submission.c | 212 -
 drivers/gpu/drm/i915/intel_guc.h   |   3 +
 4 files changed, 216 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
index 7769e46..fc900d2 100644
--- a/drivers/gpu/drm/i915/Kconfig
+++ b/drivers/gpu/drm/i915/Kconfig
@@ -11,6 +11,7 @@ config DRM_I915
select DRM_KMS_HELPER
select DRM_PANEL
select DRM_MIPI_DSI
+   select RELAY
# i915 depends on ACPI_VIDEO when ACPI is enabled
# but for select to work, need to select ACPI_VIDEO's dependencies, ick
select BACKLIGHT_LCD_SUPPORT if ACPI
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 13ae340..cdee60b 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1133,6 +1133,7 @@ static void i915_driver_register(struct drm_i915_private 
*dev_priv)
/* Reveal our presence to userspace */
if (drm_dev_register(dev, 0) == 0) {
i915_debugfs_register(dev_priv);
+   i915_guc_register(dev_priv);
i915_setup_sysfs(dev);
} else
DRM_ERROR("Failed to register driver for userspace access!\n");
@@ -1171,6 +1172,7 @@ static void i915_driver_unregister(struct 
drm_i915_private *dev_priv)
intel_opregion_unregister(dev_priv);
 
i915_teardown_sysfs(&dev_priv->drm);
+   i915_guc_unregister(dev_priv);
i915_debugfs_unregister(dev_priv);
drm_dev_unregister(&dev_priv->drm);
 
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
index ade51cb..1713ce8 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -23,6 +23,8 @@
  */
 #include 
 #include 
+#include 
+#include 
 #include "i915_drv.h"
 #include "intel_guc.h"
 
@@ -837,13 +839,159 @@ err:
return NULL;
 }
 
+/*
+ * Sub buffer switch callback. Called whenever relay has to switch to a new
+ * sub buffer, relay stays on the same sub buffer if 0 is returned.
+ */
+static int subbuf_start_callback(struct rchan_buf *buf,
+void *subbuf,
+void *prev_subbuf,
+size_t prev_padding)
+{
+   /* Use no-overwrite mode by default, where relay will stop accepting
+* new data if there are no empty sub buffers left.
+* There is no strict synchronization enforced by relay between Consumer
+* and Producer. In overwrite mode, there is a possibility of getting
+* inconsistent/garbled data, the producer could be writing on to the
+* same sub buffer from which Consumer is reading. This can't be avoided
+* unless Consumer is fast enough and can always run in tandem with
+* Producer.
+*/
+   if (relay_buf_full(buf))
+   return 0;
+
+   return

[Intel-gfx] [PATCH 02/19] drm/i915: Add GuC ukernel logging related fields to fw interface file

2016-08-17 Thread akash . goel
From: Sagar Arun Kamble 

The first page of the GuC log buffer contains state info or meta data
which is required to parse the logs contained in the subsequent pages.
The structure representing the state info is added to the interface file,
as Driver would need to handle log buffer flush interrupts from GuC.
Added an enum for the different message/event types that can be sent
by the GuC ukernel to Host.
Also added 2 new Host to GuC action types, to inform GuC when Host has
flushed the log buffer and to forcefully cause the GuC to send a new
log buffer flush interrupt.

v2:
- Make documentation of log buffer state structure more elaborate &
  rename LOGBUFFERFLUSH action to LOG_BUFFER_FLUSH for consistency. (Tvrtko)

v3: Add GuC log buffer layout diagram for more clarity.

Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/intel_guc_fwif.h | 78 +++
 1 file changed, 78 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h 
b/drivers/gpu/drm/i915/intel_guc_fwif.h
index 944786d..47ef7e8 100644
--- a/drivers/gpu/drm/i915/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
@@ -418,15 +418,87 @@ struct guc_ads {
u32 reserved2[4];
 } __packed;
 
+/* GuC logging structures */
+
+enum guc_log_buffer_type {
+   GUC_ISR_LOG_BUFFER,
+   GUC_DPC_LOG_BUFFER,
+   GUC_CRASH_DUMP_LOG_BUFFER,
+   GUC_MAX_LOG_BUFFER
+};
+
+/**
+ * DOC: GuC Log buffer Layout
+ *
+ * Page0  +---+
+ *|   ISR state header (32 bytes) |
+ *|  DPC state header |
+ *|   Crash dump state header |
+ * Page1  +---+
+ *|   ISR logs|
+ * Page5  +---+
+ *|   DPC logs|
+ * Page9  +---+
+ *| Crash Dump logs   |
+ *+---+
+ *
+ * Below state structure is used for coordination of retrieval of GuC firmware
+ * logs. Separate state is maintained for each log buffer type.
+ * read_ptr points to the location where i915 read last in log buffer and
+ * is read only for GuC firmware. write_ptr is incremented by GuC with number
+ * of bytes written for each log entry and is read only for i915.
+ * When any type of log buffer becomes half full, GuC sends a flush interrupt.
+ * GuC firmware expects that while it is writing to 2nd half of the buffer,
+ * first half would get consumed by Host and then get a flush completed
+ * acknowledgment from Host, so that it does not end up doing any overwrite
+ * causing loss of logs. So when buffer gets half filled & i915 has requested
+ * for interrupt, GuC will set flush_to_file field, set the sampled_write_ptr
+ * to the value of write_ptr and raise the interrupt.
+ * On receiving the interrupt i915 should read the buffer, clear flush_to_file
+ * field and also update read_ptr with the value of sampled_write_ptr, before
+ * sending an acknowledgment to GuC. marker & version fields are for internal
+ * usage of GuC and opaque to i915. buffer_full_cnt field is incremented every
+ * time GuC detects the log buffer overflow.
+ */
+struct guc_log_buffer_state {
+   u32 marker[2];
+   u32 read_ptr;
+   u32 write_ptr;
+   u32 size;
+   u32 sampled_write_ptr;
+   union {
+   struct {
+   u32 flush_to_file:1;
+   u32 buffer_full_cnt:4;
+   u32 reserved:27;
+   };
+   u32 flags;
+   };
+   u32 version;
+} __packed;
+
+union guc_log_control {
+   struct {
+   u32 logging_enabled:1;
+   u32 reserved1:3;
+   u32 verbosity:4;
+   u32 reserved2:24;
+   };
+   u32 value;
+} __packed;
+
 /* This Action will be programmed in C180 - SOFT_SCRATCH_O_REG */
 enum host2guc_action {
HOST2GUC_ACTION_DEFAULT = 0x0,
HOST2GUC_ACTION_SAMPLE_FORCEWAKE = 0x6,
HOST2GUC_ACTION_ALLOCATE_DOORBELL = 0x10,
HOST2GUC_ACTION_DEALLOCATE_DOORBELL = 0x20,
+   HOST2GUC_ACTION_LOG_BUFFER_FILE_FLUSH_COMPLETE = 0x30,
+   HOST2GUC_ACTION_FORCE_LOG_BUFFER_FLUSH = 0x302,
HOST2GUC_ACTION_ENTER_S_STATE = 0x501,
HOST2GUC_ACTION_EXIT_S_STATE = 0x502,
HOST2GUC_ACTION_SLPC_REQUEST = 0x3003,
+   HOST2GUC_ACTION_UK_LOG_ENABLE_LOGGING = 0x0E000,
HOST2GUC_ACTION_LIMIT
 };
 
@@ -448,4 +520,10 @@ enum guc2host_status {
GUC2HOST_STATUS_GENERIC_FAIL = GUC2HOST_STATUS(0xF000)
 };
 
+/* This action will be programmed in C1BC - SOFT_SCRATCH_15_REG */
+enum guc2host_message {
+   GUC2HOST_MSG_CRASH_DUMP_POSTED = (1 << 1),
+   GUC2HOST_MSG_FLUSH_LOG_BUFFER = (1 << 3)
+};
+
 #endif
-- 
1.9.2


[Intel-gfx] [PATCH 06/19] drm/i915: Handle log buffer flush interrupt event from GuC

2016-08-17 Thread akash . goel
From: Sagar Arun Kamble 

GuC ukernel sends an interrupt to Host to flush the log buffer
and expects Host to correspondingly update the read pointer
information in the state structure, once it has consumed the
log buffer contents by copying them to a file or buffer.
Even if Host couldn't copy the contents, it can still update the
read pointer so that logging state is not disturbed on GuC side.
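
The Host-side bookkeeping that precedes the acknowledgment can be modelled
as below. This is a simplified sketch, not the driver code: struct
log_state keeps only the fields relevant here and models flush_to_file as
a plain word rather than a bitfield, and log_state_mark_consumed() is a
hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified model of the per-buffer state shared with the firmware. */
struct log_state {
    uint32_t read_ptr;          /* written by Host, read by firmware */
    uint32_t write_ptr;         /* written by firmware, read by Host */
    uint32_t sampled_write_ptr; /* write_ptr value latched at interrupt time */
    uint32_t flush_to_file;     /* set by firmware when a flush is requested */
};

/*
 * Mark everything up to the sampled write pointer as consumed and clear
 * the flush request. This is done whether or not the contents could be
 * copied, so that the firmware's logging state is not disturbed.
 */
static void log_state_mark_consumed(struct log_state *st)
{
    assert(st != NULL);

    st->read_ptr = st->sampled_write_ptr;
    st->flush_to_file = 0;
}
```

Only after this update would Host send the flush complete action, which
tells the firmware that the consumed region can be reused.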

v2:
- Use a dedicated workqueue for handling flush interrupt. (Tvrtko)
- Reduce the overall log buffer copying time by skipping the copy of
  crash buffer area for regular cases and copying only the state
  structure data in first page.

v3:
 - Create a vmalloc mapping of log buffer. (Chris)
 - Cover the flush acknowledgment under rpm get & put. (Chris)
 - Revert the change of skipping the copy of crash dump area, as
   not really needed, will be covered by subsequent patch.

v4:
 - Destroy the wq under the same condition in which it was created,
   pass dev_priv pointer instead of dev to newly added GuC function,
   add more comments & rename variable for clarity. (Tvrtko)

v5:
- Allocate & destroy the dedicated wq, for handling flush interrupt,
  from the setup/teardown routines of GuC logging. (Chris)
- Validate the log buffer size value retrieved from state structure
  and do some minor cleanup. (Tvrtko)
- Fix error/warnings reported by checkpatch. (Tvrtko)
- Rebase.

v6:
 - Remove the interrupts_enabled check from guc_capture_logs_work: the
   last work item, queued just before disabling the interrupt, also needs
   to be processed, because for a log buffer flush interrupt GuC actually
   expects an ACK from host, which should be provided to keep the logging
   going.
   Sync against the work will be done by caller disabling the interrupt.
 - Don't sample the log buffer size value from state structure; directly
   use the expected value to move the pointer & do the copy, which cannot
   go out of bounds as only Driver allocated the log buffer and the
   relay buffers. Driver should refrain from interpreting the log packet
   as much as possible and let the Userspace parser detect the anomaly. (Chris)

Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 186 +
 drivers/gpu/drm/i915/i915_irq.c|  28 -
 drivers/gpu/drm/i915/intel_guc.h   |   4 +
 3 files changed, 217 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index b062da6..ade51cb 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -172,6 +172,15 @@ static int host2guc_sample_forcewake(struct intel_guc *guc,
return host2guc_action(guc, data, ARRAY_SIZE(data));
 }
 
+static int host2guc_logbuffer_flush_complete(struct intel_guc *guc)
+{
+   u32 data[1];
+
+   data[0] = HOST2GUC_ACTION_LOG_BUFFER_FILE_FLUSH_COMPLETE;
+
+   return host2guc_action(guc, data, 1);
+}
+
 /*
  * Initialise, update, or clear doorbell data shared with the GuC
  *
@@ -828,6 +837,163 @@ err:
return NULL;
 }
 
+static void guc_move_to_next_buf(struct intel_guc *guc)
+{
+}
+
+static void *guc_get_write_buffer(struct intel_guc *guc)
+{
+   return NULL;
+}
+
+static unsigned int guc_get_log_buffer_size(enum guc_log_buffer_type type)
+{
+   if (type == GUC_ISR_LOG_BUFFER)
+   return (GUC_LOG_ISR_PAGES + 1) * PAGE_SIZE;
+   else if (type == GUC_DPC_LOG_BUFFER)
+   return (GUC_LOG_DPC_PAGES + 1) * PAGE_SIZE;
+   else
+   return (GUC_LOG_CRASH_PAGES + 1) * PAGE_SIZE;
+}
+
+static void guc_read_update_log_buffer(struct intel_guc *guc)
+{
+   struct guc_log_buffer_state *log_buffer_state, *log_buffer_snapshot_state;
+   struct guc_log_buffer_state log_buffer_state_local;
+   void *src_data_ptr, *dst_data_ptr;
+   unsigned int buffer_size;
+   enum guc_log_buffer_type type;
+
+   if (WARN_ON(!guc->log.buf_addr))
+   return;
+
+   /* Get the pointer to shared GuC log buffer */
+   log_buffer_state = src_data_ptr = guc->log.buf_addr;
+
+   /* Get the pointer to local buffer to store the logs */
+   dst_data_ptr = log_buffer_snapshot_state = guc_get_write_buffer(guc);
+
+   /* Actual logs are present from the 2nd page */
+   src_data_ptr += PAGE_SIZE;
+   dst_data_ptr += PAGE_SIZE;
+
+   for (type = GUC_ISR_LOG_BUFFER; type < GUC_MAX_LOG_BUFFER; type++) {
+   /* Make a copy of the state structure in GuC log buffer (which
+* is uncached mapped) on the stack to avoid reading from it
+* multiple times.
+*/
+   memcpy(&log_buffer_state_local, log_buffer_state,
+  sizeof(struct guc_log_buffer_state));
+   buffer_size = guc_get_log_

[Intel-gfx] [PATCH 03/19] drm/i915: New structure to contain GuC logging related fields

2016-08-17 Thread akash . goel
From: Akash Goel 

So far there were 2 fields related to GuC logs in 'intel_guc' structure.
For the support of capturing GuC logs & storing them in a local buffer,
multiple new fields would have to be added. This warrants a separate
structure to contain the fields related to GuC logging state.
Added a new structure 'intel_guc_log' and instance of it inside
'intel_guc' structure.

v2: Rebase.

Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_debugfs.c| 4 ++--
 drivers/gpu/drm/i915/i915_guc_submission.c | 8 
 drivers/gpu/drm/i915/intel_guc.h   | 8 ++--
 drivers/gpu/drm/i915/intel_guc_loader.c| 2 +-
 4 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index b89478a..b2cfcf4 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2578,10 +2578,10 @@ static int i915_guc_log_dump(struct seq_file *m, void *data)
struct drm_i915_gem_object *obj;
int i = 0, pg;
 
-   if (!dev_priv->guc.log_vma)
+   if (!dev_priv->guc.log.vma)
return 0;
 
-   obj = dev_priv->guc.log_vma->obj;
+   obj = dev_priv->guc.log.vma->obj;
for (pg = 0; pg < obj->base.size / PAGE_SIZE; pg++) {
u32 *log = kmap_atomic(i915_gem_object_get_page(obj, pg));
 
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index fec5898..34fb893 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -843,7 +843,7 @@ static void guc_create_log(struct intel_guc *guc)
GUC_LOG_ISR_PAGES + 1 +
GUC_LOG_CRASH_PAGES + 1) << PAGE_SHIFT;
 
-   vma = guc->log_vma;
+   vma = guc->log.vma;
if (!vma) {
vma = guc_allocate_vma(guc, size);
if (IS_ERR(vma)) {
@@ -852,7 +852,7 @@ static void guc_create_log(struct intel_guc *guc)
return;
}
 
-   guc->log_vma = vma;
+   guc->log.vma = vma;
}
 
/* each allocated unit is a page */
@@ -862,7 +862,7 @@ static void guc_create_log(struct intel_guc *guc)
(GUC_LOG_CRASH_PAGES << GUC_LOG_CRASH_SHIFT);
 
offset = i915_ggtt_offset(vma) >> PAGE_SHIFT; /* in pages */
-   guc->log_flags = (offset << GUC_LOG_BUF_ADDR_SHIFT) | flags;
+   guc->log.flags = (offset << GUC_LOG_BUF_ADDR_SHIFT) | flags;
 }
 
 static void init_guc_policies(struct guc_policies *policies)
@@ -1034,7 +1034,7 @@ void i915_guc_submission_fini(struct drm_i915_private *dev_priv)
struct intel_guc *guc = &dev_priv->guc;
 
i915_vma_unpin_and_release(&guc->ads_vma);
-   i915_vma_unpin_and_release(&guc->log_vma);
+   i915_vma_unpin_and_release(&guc->log.vma);
 
if (guc->ctx_pool_vma)
ida_destroy(&guc->ctx_ids);
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index c973262..d32023c 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -121,10 +121,14 @@ struct intel_guc_fw {
uint32_t ucode_offset;
 };
 
+struct intel_guc_log {
+   uint32_t flags;
+   struct i915_vma *vma;
+};
+
 struct intel_guc {
struct intel_guc_fw guc_fw;
-   uint32_t log_flags;
-   struct i915_vma *log_vma;
+   struct intel_guc_log log;
 
struct i915_vma *ads_vma;
struct i915_vma *ctx_pool_vma;
diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c
index b3751c4..6fd24d9 100644
--- a/drivers/gpu/drm/i915/intel_guc_loader.c
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -187,7 +187,7 @@ static void set_guc_init_params(struct drm_i915_private *dev_priv)
params[GUC_CTL_FEATURE] |= GUC_CTL_DISABLE_SCHEDULER |
GUC_CTL_VCS2_ENABLED;
 
-   params[GUC_CTL_LOG_PARAMS] = guc->log_flags;
+   params[GUC_CTL_LOG_PARAMS] = guc->log.flags;
 
if (i915.guc_log_level >= 0) {
params[GUC_CTL_DEBUG] =
-- 
1.9.2


[Intel-gfx] [PATCH 04/19] drm/i915: Add low level set of routines for programming PM IER/IIR/IMR register set

2016-08-17 Thread akash . goel
From: Akash Goel 

So far PM IER/IIR/IMR registers were being used only for Turbo related
interrupts. But interrupts coming from GuC also use the same set.
As a precursor to supporting GuC interrupts, added new low level routines
so as to allow sharing the programming of PM IER/IIR/IMR registers between
Turbo & GuC.
Also, similar to PM IMR, a bitmask is now maintained for the PM IER register,
to allow easy sharing of it between Turbo & GuC without involving a rmw
operation.
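
The idea of keeping a software copy of the enable register, so that two
users can share it without a read-modify-write of the hardware register, can
be sketched as below. This is a standalone illustration, not the driver
code: 'struct pm_regs', pm_enable() and pm_disable() are hypothetical names
and the register is modelled as a plain variable.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: one hardware register plus its cached software copy. */
struct pm_regs {
	uint32_t hw_ier;     /* stands in for the PM IER register */
	uint32_t shadow_ier; /* cached value, playing the role of pm_ier */
};

/* Enable some interrupt bits without ever reading the register back. */
static void pm_enable(struct pm_regs *r, uint32_t mask)
{
	assert(r);
	r->shadow_ier |= mask;
	r->hw_ier = r->shadow_ier; /* a single register write */
}

/* Disable some interrupt bits, leaving the other user's bits untouched. */
static void pm_disable(struct pm_regs *r, uint32_t mask)
{
	assert(r);
	r->shadow_ier &= ~mask;
	r->hw_ier = r->shadow_ier;
}
```

Because every update goes through the cached value, disabling one user's
bits cannot clear the bits owned by the other user, provided all callers
hold the same lock while updating.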

v2:
- For appropriateness & to avoid any ambiguity, rename old functions
  enable/disable pm_irq to unmask/mask pm_irq and rename new functions
  enable/disable pm_interrupts to enable/disable pm_irq. (Tvrtko)
- Use u32 in place of uint32_t. (Tvrtko)

v3:
- Rename the fields pm_irq_mask & pm_ier_mask and do some cleanup. (Chris)
- Rebase.

v4: Fix the inadvertent disabling of User interrupt for VECS ring causing
failure for certain IGTs.

v5: Use dev_priv with HAS_VEBOX macro. (Tvrtko)

Suggested-by: Chris Wilson 
Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_drv.h |  3 +-
 drivers/gpu/drm/i915/i915_irq.c | 75 ++---
 drivers/gpu/drm/i915/intel_drv.h|  3 ++
 drivers/gpu/drm/i915/intel_ringbuffer.c |  4 +-
 4 files changed, 57 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 35caa9b..a187ea7 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1780,7 +1780,8 @@ struct drm_i915_private {
u32 de_irq_mask[I915_MAX_PIPES];
};
u32 gt_irq_mask;
-   u32 pm_irq_mask;
+   u32 pm_imr;
+   u32 pm_ier;
u32 pm_rps_events;
u32 pipestat_irq_mask[I915_MAX_PIPES];
 
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index ebb83d5..83d708a 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -303,18 +303,18 @@ static void snb_update_pm_irq(struct drm_i915_private *dev_priv,
 
assert_spin_locked(&dev_priv->irq_lock);
 
-   new_val = dev_priv->pm_irq_mask;
+   new_val = dev_priv->pm_imr;
new_val &= ~interrupt_mask;
new_val |= (~enabled_irq_mask & interrupt_mask);
 
-   if (new_val != dev_priv->pm_irq_mask) {
-   dev_priv->pm_irq_mask = new_val;
-   I915_WRITE(gen6_pm_imr(dev_priv), dev_priv->pm_irq_mask);
+   if (new_val != dev_priv->pm_imr) {
+   dev_priv->pm_imr = new_val;
+   I915_WRITE(gen6_pm_imr(dev_priv), dev_priv->pm_imr);
POSTING_READ(gen6_pm_imr(dev_priv));
}
 }
 
-void gen6_enable_pm_irq(struct drm_i915_private *dev_priv, uint32_t mask)
+void gen6_unmask_pm_irq(struct drm_i915_private *dev_priv, u32 mask)
 {
if (WARN_ON(!intel_irqs_enabled(dev_priv)))
return;
@@ -322,28 +322,54 @@ void gen6_enable_pm_irq(struct drm_i915_private *dev_priv, uint32_t mask)
snb_update_pm_irq(dev_priv, mask, mask);
 }
 
-static void __gen6_disable_pm_irq(struct drm_i915_private *dev_priv,
- uint32_t mask)
+static void __gen6_mask_pm_irq(struct drm_i915_private *dev_priv, u32 mask)
 {
snb_update_pm_irq(dev_priv, mask, 0);
 }
 
-void gen6_disable_pm_irq(struct drm_i915_private *dev_priv, uint32_t mask)
+void gen6_mask_pm_irq(struct drm_i915_private *dev_priv, u32 mask)
 {
if (WARN_ON(!intel_irqs_enabled(dev_priv)))
return;
 
-   __gen6_disable_pm_irq(dev_priv, mask);
+   __gen6_mask_pm_irq(dev_priv, mask);
 }
 
-void gen6_reset_rps_interrupts(struct drm_i915_private *dev_priv)
+void gen6_reset_pm_iir(struct drm_i915_private *dev_priv, u32 reset_mask)
 {
i915_reg_t reg = gen6_pm_iir(dev_priv);
 
-   spin_lock_irq(&dev_priv->irq_lock);
-   I915_WRITE(reg, dev_priv->pm_rps_events);
-   I915_WRITE(reg, dev_priv->pm_rps_events);
+   assert_spin_locked(&dev_priv->irq_lock);
+
+   I915_WRITE(reg, reset_mask);
+   I915_WRITE(reg, reset_mask);
POSTING_READ(reg);
+}
+
+void gen6_enable_pm_irq(struct drm_i915_private *dev_priv, u32 enable_mask)
+{
+   assert_spin_locked(&dev_priv->irq_lock);
+
+   dev_priv->pm_ier |= enable_mask;
+   I915_WRITE(gen6_pm_ier(dev_priv), dev_priv->pm_ier);
+   gen6_unmask_pm_irq(dev_priv, enable_mask);
+   /* unmask_pm_irq provides an implicit barrier (POSTING_READ) */
+}
+
+void gen6_disable_pm_irq(struct drm_i915_private *dev_priv, u32 disable_mask)
+{
+   assert_spin_locked(&dev_priv->irq_lock);
+
+   dev_priv->pm_ier &= ~disable_mask;
+   __gen6_mask_pm_irq(dev_priv, disable_mask);
+   I915_WRITE(gen6_pm_ier(dev_priv), dev_priv->pm_ier);
+   /* though a barrier is missing here, but don't really need a one */
+}
+
+void gen6_reset_rps_interrup

[Intel-gfx] [PATCH v7 00/19] Support for sustained capturing of GuC firmware logs

2016-08-17 Thread akash . goel
From: Akash Goel 

GuC firmware logs its debug messages into a Host-GuC shared memory buffer
and, when the buffer is half full, sends a Flush interrupt to Host.
GuC firmware follows a half-full draining protocol: it expects that, while
it is writing to the 2nd half of the buffer, the 1st half gets consumed by
Host, and that Host then sends a flush completed acknowledgment, so that
GuC does not overwrite unread data and cause loss of logs.
So far the flush interrupt wasn't enabled on Host side & User could only
capture a snapshot of the log buffer through the 'i915_guc_log_dump' debugfs
iface. This couldn't meet a couple of key requirements, especially of
Validation: first, to capture all boot time logs even at a high verbosity
level; second, to capture logs in a sustained manner, e.g. for the entire
duration of a workload.
Now Driver will enable the flush interrupt and, on receiving it, will copy
the contents of the log buffer into its local buffer. The local buffer is
big enough to contain multiple snapshots of the log buffer, giving User
ample time to pull boot time messages.
Added a debugfs interface '/sys/kernel/debug/dri/guc_log' for User to
collect the logs. The relay framework is used to implement this interface:
Driver just uses a relay API to store snapshots of the GuC log buffer in a
buffer managed by relay. The relay buffer can be operated in a mode
equivalent to 'dmesg -c', where old data not yet collected by User is
overwritten if the buffer becomes full, or in no-overwrite mode, where relay
stops accepting new data once all sub buffers are full.
The latter mode is used, to avoid the possibility of getting garbled data.
Besides the mmap method, through which User can directly access the relay
buffer contents, relay also supports the 'poll' method. Through a 'poll'
call on the log file, User gets to know whenever a new snapshot of the log
buffer is taken by Driver, so it can run in tandem with the Driver and thus
capture logs in a sustained/streaming manner, without any loss of data.
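
The no-overwrite behaviour described above can be modelled with two
counters. This is a toy model written for illustration only; it is not the
relay implementation. 'struct subbuf_ring', ring_reserve() and
ring_consume() are hypothetical names, and the sub buffer count of 8 is an
assumed value for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define N_SUBBUFS 8 /* assumed number of sub buffers, for illustration */

/* Hypothetical bookkeeping for a ring of equally sized sub buffers. */
struct subbuf_ring {
	size_t produced; /* sub buffers filled by the producer (Driver) */
	size_t consumed; /* sub buffers released by the consumer (User) */
	size_t dropped;  /* snapshots rejected because the ring was full */
};

/* Producer side: claim a sub buffer, or refuse if all are still unread. */
static bool ring_reserve(struct subbuf_ring *r)
{
	assert(r);
	if (r->produced - r->consumed >= N_SUBBUFS) {
		r->dropped++; /* no-overwrite: old data is never clobbered */
		return false;
	}
	r->produced++;
	return true;
}

/* Consumer side: release the oldest unread sub buffer, if there is one. */
static bool ring_consume(struct subbuf_ring *r)
{
	assert(r);
	if (r->consumed == r->produced)
		return false;
	r->consumed++;
	return true;
}
```

In this model a slow consumer loses whole new snapshots rather than getting
a mix of old and new data in one sub buffer, which is the trade-off the
no-overwrite mode makes.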

v2: Rebased to the latest drm-intel-nightly.

v3: Aligned with the modification of late debugfs registration, at the end of
i915 Driver load. Did cleanup as per Tvrtko's review comments, added 3
new patches to optimize the log-buffer flush interrupt handling, gather
and report the logging related stats.

v4: Added 2 new patches to further optimize the log-buffer flush interrupt
handling. Did cleanup as per Chris's review comments, fixed couple of
issues related to clearing of Guc2Host message register. Switched to
no-overwrite mode for the relay.

v5: Added a new patch to use the MOVNTDQA instruction based fast memcpy
provided by a patch from Chris. Dropped the rt priority kthread patch, after
evaluating all the optimizations with certain benchmarks like
synmark_oglmultithread, synmark_oglbatch5 which generate flush interrupts
at intervals of about a ms or less. Updated the older patches as per the review
comments from Tvrtko and Chris W. Added a new patch to augment i915 error
state with the GuC log buffer contents. Fixed the issue of User interrupt
getting disabled for VEBOX ring, causing failure for certain IGTs.
Also included 2 patches to support early logging for capturing boot
time logs and use per CPU constructs on the relay side so as to address
a WARNING issue with the call to relay_reserve(), without disabling
preemption.

v6: Mainly did the rebasing, refactoring, cleanup as per the review comments
and fixed error/warnings reported by checkpatch.

v7: Added a new patch to complete the pending log buffer flush work item in
system suspend case. Cleaned up the irq handler & work item function
by removing the check for GuC interrupts.

Akash Goel (13):
  drm/i915: New structure to contain GuC logging related fields
  drm/i915: Add low level set of routines for programming PM IER/IIR/IMR
register set
  relay: Use per CPU constructs for the relay channel buffer pointers
  drm/i915: Add a relay backed debugfs interface for capturing GuC logs
  drm/i915: New lock to serialize the Host2GuC actions
  drm/i915: Add stats for GuC log buffer flush interrupts
  drm/i915: Optimization to reduce the sampling time of GuC log buffer
  drm/i915: Increase GuC log buffer size to reduce flush interrupts
  drm/i915: Augment i915 error state to include the dump of GuC log
buffer
  drm/i915: Use uncached(WC) mapping for acessing the GuC log buffer
  drm/i915: Use SSE4.1 movntdqa based memcpy for sampling GuC log buffer
  drm/i915: Early creation of relay channel for capturing boot time logs
  drm/i915: Sync against the GuC log buffer flush work item on system
suspend

Sagar Arun Kamble (6):
  drm/i915: Decouple GuC log setup from verbosity parameter
  drm/i915: Add GuC ukernel logging related

[Intel-gfx] [PATCH 01/19] drm/i915: Decouple GuC log setup from verbosity parameter

2016-08-17 Thread akash . goel
From: Sagar Arun Kamble 

GuC log buffer allocation was tied to the verbosity level module param
i915.guc_log_level. User will be given a provision to enable firmware
logging at runtime, through a host2guc action, and not necessarily at
Driver load time. But the address of the log buffer can be passed only in
init params, at firmware load time, so enabling logging at runtime would
otherwise require a GuC reset and a firmware reload just to pass the log
buffer address.
To avoid the reset of GuC & reload of firmware, the log buffer will always
be allocated, while logging is initially enabled on the GuC side based on
the value of module parameter guc_log_level.

v2: Update commit message to describe the constraint with allocation of
log buffer at runtime. (Tvrtko)

v3: Rebase.

Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 3 ---
 drivers/gpu/drm/i915/intel_guc_loader.c| 6 --
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index bb40792..fec5898 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -834,9 +834,6 @@ static void guc_create_log(struct intel_guc *guc)
unsigned long offset;
uint32_t size, flags;
 
-   if (i915.guc_log_level < GUC_LOG_VERBOSITY_MIN)
-   return;
-
if (i915.guc_log_level > GUC_LOG_VERBOSITY_MAX)
i915.guc_log_level = GUC_LOG_VERBOSITY_MAX;
 
diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c
index 324812d..b3751c4 100644
--- a/drivers/gpu/drm/i915/intel_guc_loader.c
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -187,11 +187,13 @@ static void set_guc_init_params(struct drm_i915_private *dev_priv)
params[GUC_CTL_FEATURE] |= GUC_CTL_DISABLE_SCHEDULER |
GUC_CTL_VCS2_ENABLED;
 
+   params[GUC_CTL_LOG_PARAMS] = guc->log_flags;
+
if (i915.guc_log_level >= 0) {
-   params[GUC_CTL_LOG_PARAMS] = guc->log_flags;
params[GUC_CTL_DEBUG] =
i915.guc_log_level << GUC_LOG_VERBOSITY_SHIFT;
-   }
+   } else
+   params[GUC_CTL_DEBUG] = GUC_LOG_DISABLED;
 
if (guc->ads_vma) {
u32 ads = i915_ggtt_offset(guc->ads_vma) >> PAGE_SHIFT;
-- 
1.9.2


[Intel-gfx] [PATCH 18/18] drm/i915: Early creation of relay channel for capturing boot time logs

2016-08-15 Thread akash . goel
From: Akash Goel 

As per the current i915 Driver load sequence, debugfs registration is done
at the end, so the relay channel debugfs file is also created only then,
but the GuC firmware is loaded much earlier in the sequence.
As a result Driver could miss capturing the boot-time logs of GuC firmware
if there are flush interrupts from the GuC side.
Relay has a provision to support early logging, where initially only the
relay channel is created, to have buffers for storing logs, and later on the
channel is associated with a debugfs file at the appropriate time.
This patch uses that provision, which allows Driver to capture boot time
logs too; they can be collected once Userspace comes up.

v2:
- Remove the couple of FIXMEs, as now the relay channel will be created
  early before enabling the flush interrupts, so no possibility of relay
  channel pointer being modified & read at the same time from 2 different
  execution contexts.
- Rebase.

Suggested-by: Chris Wilson 
Signed-off-by: Akash Goel 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 66 +-
 1 file changed, 46 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 8733c19..34e4335 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -937,13 +937,39 @@ static void guc_remove_log_relay_file(struct intel_guc *guc)
relay_close(guc->log.relay_chan);
 }
 
-static int guc_create_log_relay_file(struct intel_guc *guc)
+static int guc_create_relay_channel(struct intel_guc *guc)
 {
struct drm_i915_private *dev_priv = guc_to_i915(guc);
struct rchan *guc_log_relay_chan;
-   struct dentry *log_dir;
size_t n_subbufs, subbuf_size;
 
+   /* Keep the size of sub buffers same as shared log buffer */
+   subbuf_size = guc->log.vma->obj->base.size;
+
+   /* Store up to 8 snapshots, which is large enough to buffer sufficient
+* boot time logs and provides enough leeway to User, in terms of
+* latency, for consuming the logs from relay. Also doesn't take
+* up too much memory.
+*/
+   n_subbufs = 8;
+
+   guc_log_relay_chan = relay_open(NULL, NULL, subbuf_size,
+   n_subbufs, &relay_callbacks, dev_priv);
+   if (!guc_log_relay_chan) {
+   DRM_ERROR("Couldn't create relay chan for GuC logging\n");
+   return -ENOMEM;
+   }
+
+   guc->log.relay_chan = guc_log_relay_chan;
+   return 0;
+}
+
+static int guc_create_log_relay_file(struct intel_guc *guc)
+{
+   struct drm_i915_private *dev_priv = guc_to_i915(guc);
+   struct dentry *log_dir;
+   int ret;
+
/* For now create the log file in /sys/kernel/debug/dri/0 dir */
log_dir = dev_priv->drm.primary->debugfs_root;
 
@@ -963,25 +989,12 @@ static int guc_create_log_relay_file(struct intel_guc *guc)
return -ENODEV;
}
 
-   /* Keep the size of sub buffers same as shared log buffer */
-   subbuf_size = guc->log.vma->obj->base.size;
-
-   /* Store up to 8 snapshots, which is large enough to buffer sufficient
-* boot time logs and provides enough leeway to User, in terms of
-* latency, for consuming the logs from relay. Also doesn't take
-* up too much memory.
-*/
-   n_subbufs = 8;
-
-   guc_log_relay_chan = relay_open("guc_log", log_dir, subbuf_size,
-   n_subbufs, &relay_callbacks, dev_priv);
-   if (!guc_log_relay_chan) {
-   DRM_ERROR("Couldn't create relay chan for GuC logging\n");
-   return -ENOMEM;
+   ret = relay_late_setup_files(guc->log.relay_chan, "guc_log", log_dir);
+   if (ret) {
+   DRM_ERROR("Couldn't associate relay chan with file %d\n", ret);
+   return ret;
}
 
-   /* FIXME: Cover the update under a lock ? */
-   guc->log.relay_chan = guc_log_relay_chan;
return 0;
 }
 
@@ -1001,7 +1014,6 @@ static void guc_move_to_next_buf(struct intel_guc *guc)
 
 static void *guc_get_write_buffer(struct intel_guc *guc)
 {
-   /* FIXME: Cover the check under a lock ? */
if (!guc->log.relay_chan)
return NULL;
 
@@ -1227,6 +1239,16 @@ static int guc_create_log_extras(struct intel_guc *guc)
guc->log.buf_addr = vaddr;
}
 
+   if (!guc->log.relay_chan) {
+   /* Create a relay channel, so that we have buffers for storing
+* the GuC firmware logs, the channel will be linked with a file
+* later on when debugfs is registered.
+*/
+   ret = guc_create_relay_channel(guc);
+   if (ret)
+   return ret;
+  

[Intel-gfx] [PATCH 15/18] drm/i915: Debugfs support for GuC logging control

2016-08-15 Thread akash . goel
From: Sagar Arun Kamble 

This patch provides a debugfs interface, i915_guc_log_control, for
enabling/disabling logging in GuC firmware on the fly and for controlling
the verbosity level of the logs.
The value written to the file should have bit 0 set to enable logging, and
bits 4-7 should contain the verbosity info.
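
The bit layout described above can be written down as a small
encode/decode helper. This is a userspace-style sketch based only on the
description in this message (bit 0 for enable, bits 4-7 for verbosity); the
macro and function names are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LOG_ENABLE_BIT      (1u << 0)                     /* bit 0 */
#define LOG_VERBOSITY_SHIFT 4                             /* bits 4-7 */
#define LOG_VERBOSITY_MASK  (0xFu << LOG_VERBOSITY_SHIFT)

/* Build the control value from an enable flag and a verbosity level. */
static uint32_t log_control_encode(bool enable, unsigned int verbosity)
{
	uint32_t val = enable ? LOG_ENABLE_BIT : 0;

	assert(verbosity <= 0xF); /* must fit in the 4-bit field */
	val |= ((uint32_t)verbosity << LOG_VERBOSITY_SHIFT) & LOG_VERBOSITY_MASK;
	return val;
}

/* Extract the enable flag from a control value. */
static bool log_control_enabled(uint32_t val)
{
	return (val & LOG_ENABLE_BIT) != 0;
}

/* Extract the verbosity level from a control value. */
static unsigned int log_control_verbosity(uint32_t val)
{
	return (val & LOG_VERBOSITY_MASK) >> LOG_VERBOSITY_SHIFT;
}
```

Under this reading of the layout, enabling logging at verbosity 2
corresponds to the value 0x21 (33 in decimal).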

v2: Add a forceful flush, to collect left over logs, on disabling logging.
Useful for Validation.

v3: Besides minor cleanup, implement read method for the debugfs file and
set the guc_log_level to -1 when logging is disabled. (Tvrtko)

v4: Minor cleanup & rebase. (Tvrtko)

Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
---
 drivers/gpu/drm/i915/i915_debugfs.c| 44 -
 drivers/gpu/drm/i915/i915_guc_submission.c | 62 ++
 drivers/gpu/drm/i915/intel_guc.h   |  1 +
 3 files changed, 106 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index aaa51bc..bfc22bf 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2626,6 +2626,47 @@ static int i915_guc_log_dump(struct seq_file *m, void *data)
return 0;
 }
 
+static int i915_guc_log_control_get(void *data, u64 *val)
+{
+   struct drm_device *dev = data;
+   struct drm_i915_private *dev_priv = to_i915(dev);
+
+   if (!dev_priv->guc.log.vma)
+   return -EINVAL;
+
+   *val = i915.guc_log_level;
+
+   return 0;
+}
+
+static int i915_guc_log_control_set(void *data, u64 val)
+{
+   struct drm_device *dev = data;
+   struct drm_i915_private *dev_priv = to_i915(dev);
+   int ret;
+
+   ret = mutex_lock_interruptible(&dev->struct_mutex);
+   if (ret)
+   return ret;
+
+   if (!dev_priv->guc.log.vma) {
+   ret = -EINVAL;
+   goto end;
+   }
+
+   intel_runtime_pm_get(dev_priv);
+   ret = i915_guc_log_control(dev_priv, val);
+   intel_runtime_pm_put(dev_priv);
+
+end:
+   mutex_unlock(&dev->struct_mutex);
+   return ret;
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(i915_guc_log_control_fops,
+   i915_guc_log_control_get, i915_guc_log_control_set,
+   "%lld\n");
+
 static int i915_edp_psr_status(struct seq_file *m, void *data)
 {
struct drm_info_node *node = m->private;
@@ -5436,7 +5477,8 @@ static const struct i915_debugfs_files {
{"i915_fbc_false_color", &i915_fbc_fc_fops},
{"i915_dp_test_data", &i915_displayport_test_data_fops},
{"i915_dp_test_type", &i915_displayport_test_type_fops},
-   {"i915_dp_test_active", &i915_displayport_test_active_fops}
+   {"i915_dp_test_active", &i915_displayport_test_active_fops},
+   {"i915_guc_log_control", &i915_guc_log_control_fops}
 };
 
 void intel_display_crc_init(struct drm_device *dev)
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 85df2f3..be6c727 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -195,6 +195,16 @@ static int host2guc_force_logbuffer_flush(struct intel_guc *guc)
return host2guc_action(guc, data, 2);
 }
 
+static int host2guc_logging_control(struct intel_guc *guc, u32 control_val)
+{
+   u32 data[2];
+
+   data[0] = HOST2GUC_ACTION_UK_LOG_ENABLE_LOGGING;
+   data[1] = control_val;
+
+   return host2guc_action(guc, data, 2);
+}
+
 /*
  * Initialise, update, or clear doorbell data shared with the GuC
  *
@@ -1585,3 +1595,55 @@ void i915_guc_register(struct drm_i915_private *dev_priv)
guc_log_late_setup(&dev_priv->guc);
mutex_unlock(&dev_priv->drm.struct_mutex);
 }
+
+int i915_guc_log_control(struct drm_i915_private *dev_priv, u64 control_val)
+{
+   union guc_log_control log_param;
+   int ret;
+
+   log_param.value = control_val;
+
+   if (log_param.verbosity < GUC_LOG_VERBOSITY_MIN ||
+   log_param.verbosity > GUC_LOG_VERBOSITY_MAX)
+   return -EINVAL;
+
+   /* This combination doesn't make sense & won't have any effect */
+   if (!log_param.logging_enabled && (i915.guc_log_level < 0))
+   return 0;
+
+   ret = host2guc_logging_control(&dev_priv->guc, log_param.value);
+   if (ret < 0) {
+   DRM_DEBUG_DRIVER("host2guc action failed %d\n", ret);
+   return ret;
+   }
+
+   i915.guc_log_level = log_param.verbosity;
+
+   /* If log_level was set as -1 at boot time, then the relay channel file
+* wouldn't have been created by now and interrupts also would not have
+* been enabled.
+*/
+   if (!dev_priv->guc.log.relay_chan) {
+   ret = guc_log_late_

[Intel-gfx] [PATCH 11/18] drm/i915: Optimization to reduce the sampling time of GuC log buffer

2016-08-15 Thread akash . goel
From: Akash Goel 

GuC firmware sends an interrupt to flush the log buffer when it becomes
half full, so Driver doesn't really need to sample the complete buffer
and can copy only the data newly written by GuC into the local
buffer, i.e. as per the read & write pointer values.
Moreover the flush interrupt generally comes for one type of log
buffer, when it becomes half full, so at that time the other 2 types of
log buffer would have comparatively much less unread data in them.
In case of an overflow reported by GuC, Driver does need to copy the entire
buffer, as the whole buffer would contain unread data.
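
The copy of only the newly written region, including the case where the
write offset has wrapped around behind the read offset, can be sketched as
follows. This is a standalone illustration and not the patch itself:
copy_new_data() is a hypothetical name, and the second memcpy in the
wrapped case is an assumption about how the remaining part would be copied.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy only the unread part of a circular log buffer into a snapshot of
 * the same size. Returns the number of bytes copied.
 */
static size_t copy_new_data(uint8_t *dst, const uint8_t *src, size_t size,
			    size_t read_off, size_t write_off, bool overflow)
{
	assert(dst && src);

	/* On overflow, or with unreliable offsets, take the whole buffer. */
	if (overflow || read_off > size || write_off > size) {
		read_off = 0;
		write_off = size;
	}

	if (read_off <= write_off) {
		/* Unread data is one contiguous region. */
		memcpy(dst + read_off, src + read_off, write_off - read_off);
		return write_off - read_off;
	}

	/* Wrapped: copy the tail first, then the part at the start. */
	memcpy(dst + read_off, src + read_off, size - read_off);
	memcpy(dst, src, write_off);
	return (size - read_off) + write_off;
}
```

Data is copied to the same offsets in the snapshot as in the source, so the
read and write pointers stored in the copied state header remain meaningful
to a parser that reads the snapshot.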

v2: Rebase.

v3: Fix the blooper of doing the copy twice. (Tvrtko)

Signed-off-by: Akash Goel 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 40 +-
 1 file changed, 34 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index c7b4a57..b8d6313 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -1003,6 +1003,8 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
void *src_data_ptr, *dst_data_ptr;
unsigned int buffer_size, expected_size;
enum guc_log_buffer_type type;
+   unsigned int read_offset, write_offset, bytes_to_copy;
+   bool new_overflow;
 
if (WARN_ON(!guc->log.buf_addr))
return;
@@ -1025,11 +1027,14 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
memcpy(&log_buffer_state_local, log_buffer_state,
   sizeof(struct guc_log_buffer_state));
buffer_size = log_buffer_state_local.size;
+   read_offset = log_buffer_state_local.read_ptr;
+   write_offset = log_buffer_state_local.sampled_write_ptr;
 
/* Bookkeeping stuff */
guc->log.flush_count[type] += log_buffer_state_local.flush_to_file;
if (log_buffer_state_local.buffer_full_cnt !=
guc->log.prev_overflow_count[type]) {
+   new_overflow = 1;
guc->log.total_overflow_count[type] +=
(log_buffer_state_local.buffer_full_cnt -
 guc->log.prev_overflow_count[type]);
@@ -1043,7 +1048,8 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
guc->log.prev_overflow_count[type] =
log_buffer_state_local.buffer_full_cnt;
DRM_ERROR_RATELIMITED("GuC log buffer overflow\n");
-   }
+   } else
+   new_overflow = 0;
 
if (log_buffer_snapshot_state) {
/* First copy the state structure in snapshot buffer */
@@ -1055,8 +1061,7 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
 * for consistency set the write pointer value to same
 * value of sampled_write_ptr in the snapshot buffer.
 */
-   log_buffer_snapshot_state->write_ptr =
-   log_buffer_snapshot_state->sampled_write_ptr;
+   log_buffer_snapshot_state->write_ptr = write_offset;
 
log_buffer_snapshot_state++;
 
@@ -1079,7 +1084,31 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
buffer_size = expected_size;
}
 
-   memcpy(dst_data_ptr, src_data_ptr, buffer_size);
+   if (unlikely(new_overflow)) {
+   /* copy the whole buffer in case of overflow */
+   read_offset = 0;
+   write_offset = buffer_size;
+   } else if (unlikely((read_offset > buffer_size) ||
+   (write_offset > buffer_size))) {
+   DRM_ERROR("invalid log buffer state\n");
+   /* copy whole buffer as offsets are unreliable */
+   read_offset = 0;
+   write_offset = buffer_size;
+   }
+
+   /* Just copy the newly written data */
+   if (read_offset <= write_offset) {
+   bytes_to_copy = write_offset - read_offset;
+   memcpy(dst_data_ptr + read_offset,
+  src_data_ptr + read_offset, bytes_to_copy);
+   } else {
+   bytes_to_copy = buffer_size - read_offset;
+   memcpy(dst_data_ptr + read_offset,
+   

[Intel-gfx] [PATCH 02/18] drm/i915: Add GuC ukernel logging related fields to fw interface file

2016-08-15 Thread akash . goel
From: Sagar Arun Kamble 

The first page of the GuC log buffer contains state info or meta data
which is required to parse the logs contained in the subsequent pages.
The structure representing the state info is added to interface file
as Driver would need to handle log buffer flush interrupts from GuC.
Added an enum for the different message/event types that can be sent
by the GuC ukernel to Host.
Also added 2 new Host to GuC action types to inform GuC when Host has
flushed the log buffer and to forcefully cause the GuC to send a new
log buffer flush interrupt.

v2:
- Make documentation of log buffer state structure more elaborate &
  rename LOGBUFFERFLUSH action to LOG_BUFFER_FLUSH for consistency.(Tvrtko)

v3: Add GuC log buffer layout diagram for more clarity.

Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/intel_guc_fwif.h | 78 +++
 1 file changed, 78 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h b/drivers/gpu/drm/i915/intel_guc_fwif.h
index 944786d..47ef7e8 100644
--- a/drivers/gpu/drm/i915/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
@@ -418,15 +418,87 @@ struct guc_ads {
u32 reserved2[4];
 } __packed;
 
+/* GuC logging structures */
+
+enum guc_log_buffer_type {
+   GUC_ISR_LOG_BUFFER,
+   GUC_DPC_LOG_BUFFER,
+   GUC_CRASH_DUMP_LOG_BUFFER,
+   GUC_MAX_LOG_BUFFER
+};
+
+/**
+ * DOC: GuC Log buffer Layout
+ *
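+ * Page0  +-------------------------------+
+ *        |   ISR state header (32 bytes) |
+ *        |      DPC state header         |
+ *        |   Crash dump state header     |
+ * Page1  +-------------------------------+
+ *        |           ISR logs            |
+ * Page5  +-------------------------------+
+ *        |           DPC logs            |
+ * Page9  +-------------------------------+
+ *        |         Crash Dump logs       |
+ *        +-------------------------------+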
+ *
+ * Below state structure is used for coordination of retrieval of GuC firmware
+ * logs. Separate state is maintained for each log buffer type.
+ * read_ptr points to the location where i915 read last in log buffer and
+ * is read only for GuC firmware. write_ptr is incremented by GuC with number
+ * of bytes written for each log entry and is read only for i915.
+ * When any type of log buffer becomes half full, GuC sends a flush interrupt.
+ * GuC firmware expects that while it is writing to 2nd half of the buffer,
+ * first half would get consumed by Host and then get a flush completed
+ * acknowledgment from Host, so that it does not end up doing any overwrite
+ * causing loss of logs. So when buffer gets half filled & i915 has requested
+ * for interrupt, GuC will set flush_to_file field, set the sampled_write_ptr
+ * to the value of write_ptr and raise the interrupt.
+ * On receiving the interrupt i915 should read the buffer, clear flush_to_file
+ * field and also update read_ptr with the value of sampled_write_ptr, before
+ * sending an acknowledgment to GuC. marker & version fields are for internal
+ * usage of GuC and opaque to i915. buffer_full_cnt field is incremented every
+ * time GuC detects the log buffer overflow.
+ */
+struct guc_log_buffer_state {
+   u32 marker[2];
+   u32 read_ptr;
+   u32 write_ptr;
+   u32 size;
+   u32 sampled_write_ptr;
+   union {
+   struct {
+   u32 flush_to_file:1;
+   u32 buffer_full_cnt:4;
+   u32 reserved:27;
+   };
+   u32 flags;
+   };
+   u32 version;
+} __packed;
+
+union guc_log_control {
+   struct {
+   u32 logging_enabled:1;
+   u32 reserved1:3;
+   u32 verbosity:4;
+   u32 reserved2:24;
+   };
+   u32 value;
+} __packed;
+
 /* This Action will be programmed in C180 - SOFT_SCRATCH_O_REG */
 enum host2guc_action {
HOST2GUC_ACTION_DEFAULT = 0x0,
HOST2GUC_ACTION_SAMPLE_FORCEWAKE = 0x6,
HOST2GUC_ACTION_ALLOCATE_DOORBELL = 0x10,
HOST2GUC_ACTION_DEALLOCATE_DOORBELL = 0x20,
+   HOST2GUC_ACTION_LOG_BUFFER_FILE_FLUSH_COMPLETE = 0x30,
+   HOST2GUC_ACTION_FORCE_LOG_BUFFER_FLUSH = 0x302,
HOST2GUC_ACTION_ENTER_S_STATE = 0x501,
HOST2GUC_ACTION_EXIT_S_STATE = 0x502,
HOST2GUC_ACTION_SLPC_REQUEST = 0x3003,
+   HOST2GUC_ACTION_UK_LOG_ENABLE_LOGGING = 0x0E000,
HOST2GUC_ACTION_LIMIT
 };
 
@@ -448,4 +520,10 @@ enum guc2host_status {
GUC2HOST_STATUS_GENERIC_FAIL = GUC2HOST_STATUS(0xF000)
 };
 
+/* This action will be programmed in C1BC - SOFT_SCRATCH_15_REG */
+enum guc2host_message {
+   GUC2HOST_MSG_CRASH_DUMP_POSTED = (1 << 1),
+   GUC2HOST_MSG_FLUSH_LOG_BUFFER = (1 << 3)
+};
+
 #endif
-- 
1.9.2
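
For readers following the DOC comment in the patch above, here is a
minimal userspace sketch of the host side of the flush handshake. It is
not the i915 implementation. The struct layout is copied from the patch,
with uint32_t standing in for u32, but host_consume_log() and its
parameters are hypothetical names, and the wrap-around copy is an
assumption about how a circular log region would be drained. Sending the
actual acknowledgment (HOST2GUC_ACTION_LOG_BUFFER_FILE_FLUSH_COMPLETE)
is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Field layout copied from the patch; uint32_t stands in for u32. */
struct guc_log_buffer_state {
	uint32_t marker[2];
	uint32_t read_ptr;
	uint32_t write_ptr;
	uint32_t size;
	uint32_t sampled_write_ptr;
	union {
		struct {
			uint32_t flush_to_file:1;
			uint32_t buffer_full_cnt:4;
			uint32_t reserved:27;
		};
		uint32_t flags;
	};
	uint32_t version;
};

/* The layout diagram gives 32 bytes per state header. */
static_assert(sizeof(struct guc_log_buffer_state) == 32,
	      "state header must be 32 bytes");

/*
 * Hypothetical helper: copy the bytes between read_ptr and
 * sampled_write_ptr from the log region src into dst (both are
 * state->size bytes long), then do the bookkeeping the DOC comment
 * asks of the host. Returns the number of bytes copied.
 */
static uint32_t host_consume_log(struct guc_log_buffer_state *state,
				 const uint8_t *src, uint8_t *dst)
{
	uint32_t size = state->size;
	uint32_t rd = state->read_ptr;
	uint32_t wr = state->sampled_write_ptr;
	uint32_t copied;

	/* Offsets outside the buffer cannot be trusted: copy everything. */
	if (rd > size || wr > size) {
		rd = 0;
		wr = size;
	}

	if (rd <= wr) {
		copied = wr - rd;
		memcpy(dst + rd, src + rd, copied);
	} else {
		/* Assumed wrap-around: copy the tail, then the head. */
		memcpy(dst + rd, src + rd, size - rd);
		memcpy(dst, src, wr);
		copied = (size - rd) + wr;
	}

	/* Clear the flag and advance read_ptr before acknowledging. */
	state->flush_to_file = 0;
	state->read_ptr = state->sampled_write_ptr;
	return copied;
}
```

A caller would invoke this once per log buffer type when the flush
interrupt arrives, passing that buffer's state header from Page0 and
the start of its log region.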

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists
