Thanks Joonas! :)

-----Original Message-----
From: Joonas Lahtinen [mailto:[email protected]] 
Sent: Tuesday, August 29, 2017 12:37 PM
To: Wang, Zhi A <[email protected]>; [email protected]; 
[email protected]
Cc: [email protected]; [email protected]; Widawsky, Benjamin 
<[email protected]>; Vivi, Rodrigo <[email protected]>
Subject: Re: [RFCv5 2/2] drm/i915: Introduce private PAT management

On Tue, 2017-08-29 at 16:00 +0800, Zhi Wang wrote:
> The private PAT management is to support PPAT entry manipulation. Two 
> APIs are introduced for dynamically managing PPAT entries: 
> intel_ppat_get and intel_ppat_put.
> 
> intel_ppat_get will search for an existing PPAT entry which perfectly 
> matches the required PPAT value. If not, it will try to allocate or 
> return a partially matched PPAT entry if there is any available PPAT 
> indexes or not.
> 
> intel_ppat_put will put back the PPAT entry which comes from 
> intel_ppat_get. If it's dynamically allocated, the reference count 
> will be decreased. If the reference count turns into zero, the PPAT 
> index is freed again.
> 
> Besides, another two callbacks are introduced to support the private 
> PAT management framework. One is ppat->update(), which writes the PPAT 
> configurations in ppat->entries into HW. Another one is ppat->match, 
> which will return a score to show how two PPAT values match with each other.
> 
> v5:
> 
> - Add check and warnnings for those platforms which don't have PPAT.
> 
> v3:
> 
> - Introduce dirty bitmap for PPAT registers. (Chris)
> - Change the name of the pointer "dev_priv" to "i915". (Chris)
> - intel_ppat_{get, put} returns/takes a const intel_ppat_entry *. 
> (Chris)
> 
> v2:
> 
> - API re-design. (Chris)
> 
> Cc: Ben Widawsky <[email protected]>
> Cc: Rodrigo Vivi <[email protected]>
> Cc: Chris Wilson <[email protected]>
> Cc: Joonas Lahtinen <[email protected]>
> Signed-off-by: Zhi Wang <[email protected]>

<SNIP>

> -static void cnl_setup_private_ppat(struct drm_i915_private *dev_priv)
> +static struct intel_ppat_entry *alloc_ppat_entry(struct intel_ppat *ppat,
> +                                              unsigned int index,
> +                                              u8 value)
>  {
> +     struct intel_ppat_entry *entry = &ppat->entries[index];
> +
> +     entry->ppat = ppat;
> +     entry->value = value;
> +     kref_init(&entry->ref_count);
> +     set_bit(index, ppat->used);
> +     set_bit(index, ppat->dirty);
> +
> +     return entry;
> +}
> +
> +static void free_ppat_entry(struct intel_ppat_entry *entry) {
> +     struct intel_ppat *ppat = entry->ppat;
> +     int index = entry - ppat->entries;
> +
> +     entry->value = ppat->dummy_value;
> +     clear_bit(index, ppat->used);
> +     set_bit(index, ppat->dirty);
> +}

Above functions should be __ prefixed as they do no checking if they override 
existing data. The suitable names might be __ppat_get and __ppat_put.

> +
> +/**
> + * intel_ppat_get - get a usable PPAT entry
> + * @i915: i915 device instance
> + * @value: the PPAT value required by the caller
> + *
> + * The function tries to search if there is an existing PPAT entry 
> +which
> + * matches with the required value. If perfectly matched, the 
> +existing PPAT
> + * entry will be used. If only partially matched, it will try to 
> +check if
> + * there is any available PPAT index. If yes, it will allocate a new 
> +PPAT
> + * index for the required entry and update the HW. If not, the 
> +partially
> + * matched entry will be used.
> + */
> +const struct intel_ppat_entry *intel_ppat_get(struct drm_i915_private 
> +*i915,

Maybe split the function type and name to avoid exceeding 80 chars on next line.

> +                                           u8 value)
> +{
> +     struct intel_ppat *ppat = &i915->ppat;
> +     struct intel_ppat_entry *entry;
> +     int i, used;
> +     unsigned int score, best_score;
> +
> +     if (WARN_ON(!ppat->max_entries))
> +             return ERR_PTR(-ENODEV);

No need for extra check like this, this will just lead to ENOSPC.

> +
> +     score = best_score = 0;
> +     used = 0;

This variable behaves more like scanned.

> +
> +     /* First, find a suitable value from available entries */

The next two lines repeat this information, no need to document "what".

> +     for_each_set_bit(i, ppat->used, ppat->max_entries) {

                score could be declared in this scope.

> +             score = ppat->match(ppat->entries[i].value, value);
> +             /* Perfect match */

This comment is literally repeating what code says.

> +             if (score == INTEL_PPAT_PERFECT_MATCH) {
> +                     entry = &ppat->entries[i];
> +                     kref_get(&entry->ref_count);
> +                     return entry;
> +             }
> +
> +             if (score > best_score) {
> +                     entry = &ppat->entries[i];

Above could be simplified:

                        if (score == INTEL_PPAT_PERFECT_MATCH) {
                                kref_get(&entry->ref);
                                return entry;
                        }

> +                     best_score = score;
> +             }
> +             used++;
> +     }
> +
> +     /* No matched entry and we can't allocate a new entry. */

DRM_ERROR replicates this comment's information.

> +     if (!best_score && used == ppat->max_entries) {
> +             DRM_ERROR("Fail to get PPAT entry\n");

DRM_DEBUG_DRIVER at most.

> +             return ERR_PTR(-ENOSPC);
> +     }
> +
> +     /*
> +      * Found a matched entry which is not perfect,
> +      * and we can't allocate a new entry.
> +      */
> +     if (best_score && used == ppat->max_entries) {
> +             kref_get(&entry->ref_count);
> +             return entry;
> +     }

Above code could be combined:

        if (scanned == ppat->max_entries) {
                if(!best_score)
                        return ERR_PTR(-ENOSPC);
                kref_get(&entry->ref);
                return entry;
        }

> +
> +     /* Allocate a new entry */

This comment is also just telling "what", which we can see from code.

> +     i = find_first_zero_bit(ppat->used, ppat->max_entries);
> +     entry = alloc_ppat_entry(ppat, i, value);
> +     ppat->update(i915);
> +     return entry;
> +}
> +
> +static void put_ppat(struct kref *kref)

ppat_release might cause less confusion, otherwise there will be 3 put 
functions.

> +{
> +     struct intel_ppat_entry *entry =
> +             container_of(kref, struct intel_ppat_entry, ref_count);
> +     struct drm_i915_private *i915 = entry->ppat->i915;
> +
> +     free_ppat_entry(entry);
> +     entry->ppat->update(i915);
> +}
> +
> +/**
> + * intel_ppat_put - put back the PPAT entry got from intel_ppat_get()
> + * @entry: an intel PPAT entry
> + *
> + * Put back the PPAT entry got from intel_ppat_get(). If the PPAT 
> +index of the
> + * entry is dynamically allocated, its reference count will be 
> +decreased. Once
> + * the reference count becomes into zero, the PPAT index becomes free again.
> + */
> +void intel_ppat_put(const struct intel_ppat_entry *entry) {
> +     struct intel_ppat *ppat = entry->ppat;
> +     int index = entry - ppat->entries;
> +
> +     if (WARN_ON(!ppat->max_entries))
> +             return;

This is clearly a kernel programmer error (and a serious one, so could be 
GEM_BUG_ON).

> +
> +     kref_put(&ppat->entries[index].ref_count, put_ppat); }
> +
> +static void cnl_private_pat_update(struct drm_i915_private *dev_priv) 
> +{
> +     struct intel_ppat *ppat = &dev_priv->ppat;
> +     int i;
> +
> +     for_each_set_bit(i, ppat->dirty, ppat->max_entries) {
> +             clear_bit(i, ppat->dirty);
> +             I915_WRITE(GEN10_PAT_INDEX(i), ppat->entries[i].value);

Clearing the bit after write is the logical thing to do.

> +     }
> +}
> +
> +static void bdw_private_pat_update(struct drm_i915_private *dev_priv) 
> +{
> +     struct intel_ppat *ppat = &dev_priv->ppat;
> +     u64 pat = 0;
> +     int i;
> +
> +     for (i = 0; i < ppat->max_entries; i++)
> +             pat |= GEN8_PPAT(i, ppat->entries[i].value);
> +
> +     bitmap_clear(ppat->dirty, 0, ppat->max_entries);
> +
> +     I915_WRITE(GEN8_PRIVATE_PAT_LO, pat);
> +     I915_WRITE(GEN8_PRIVATE_PAT_HI, pat >> 32);

While moving the code, lower_32_bits and upper_32_bits, it should really warn 
without them?

> +}
> +
> +#define gen8_pat_ca(v) ((v) & 0x3)
> +#define gen8_pat_tc(v) (((v) >> 2) & 0x3) #define gen8_pat_age(v) 
> +(((v) >> 4) & 0x3)

Magic numbers. I'm also not 100% sure if KMD will want partial matches, or if 
GVT should pass in a scoring function.

> +
> +static unsigned int bdw_private_pat_match(u8 src, u8 dst) {
> +     unsigned int score = 0;
> +
> +     /* Cache attribute has to be matched. */
> +     if (gen8_pat_ca(src) != gen8_pat_ca(dst))
> +             return 0;
> +
> +     if (gen8_pat_age(src) == gen8_pat_age(dst))
> +             score += 1;
> +
> +     if (gen8_pat_tc(src) == gen8_pat_tc(dst))
> +             score += 2;

I'd lift this check to have them all in importance order.

> +
> +     if (score == 3)
> +             return INTEL_PPAT_PERFECT_MATCH;
> +
> +     return score;
> +}
> +
> +#define chv_get_snoop(v) (((v) >> 6) & 0x1)

Magic numbers, which need to be in i915_reg.h with #define. Should also be 
chv_pat_snoop();

> +
> +static unsigned int chv_private_pat_match(u8 src, u8 dst) {
> +     if (chv_get_snoop(src) == chv_get_snoop(dst))
> +             return INTEL_PPAT_PERFECT_MATCH;
> +
> +     return 0;

        return chv_pat_snoop(src) == chv_pat_snoop(dst) ?
               INTEL_PPAT_PERFECT_MATCH : 0;

> +}
> +
> +static void cnl_setup_private_ppat(struct intel_ppat *ppat) {
> +     ppat->max_entries = 8;
> +     ppat->update = cnl_private_pat_update;
> +     ppat->match = bdw_private_pat_match;
> +     ppat->dummy_value = GEN8_PPAT_LLCELLC | GEN8_PPAT_AGE(3);

"dummy_value" is not a very descriptive name.

> +
>       /* XXX: spec is unclear if this is still needed for CNL+ */
> -     if (!USES_PPGTT(dev_priv)) {
> -             I915_WRITE(GEN10_PAT_INDEX(0), GEN8_PPAT_UC);
> +     if (!USES_PPGTT(ppat->i915)) {
> +             alloc_ppat_entry(ppat, 0, GEN8_PPAT_UC);
>               return;
>       }
>  
> -     I915_WRITE(GEN10_PAT_INDEX(0), GEN8_PPAT_WB | GEN8_PPAT_LLC);
> -     I915_WRITE(GEN10_PAT_INDEX(1), GEN8_PPAT_WC | GEN8_PPAT_LLCELLC);
> -     I915_WRITE(GEN10_PAT_INDEX(2), GEN8_PPAT_WT | GEN8_PPAT_LLCELLC);
> -     I915_WRITE(GEN10_PAT_INDEX(3), GEN8_PPAT_UC);
> -     I915_WRITE(GEN10_PAT_INDEX(4), GEN8_PPAT_LLCELLC | GEN8_PPAT_AGE(0));
> -     I915_WRITE(GEN10_PAT_INDEX(5), GEN8_PPAT_LLCELLC | GEN8_PPAT_AGE(1));
> -     I915_WRITE(GEN10_PAT_INDEX(6), GEN8_PPAT_LLCELLC | GEN8_PPAT_AGE(2));
> -     I915_WRITE(GEN10_PAT_INDEX(7), GEN8_PPAT_LLCELLC | GEN8_PPAT_AGE(3));
> +     alloc_ppat_entry(ppat, 0, GEN8_PPAT_WB | GEN8_PPAT_LLC);
> +     alloc_ppat_entry(ppat, 2, GEN8_PPAT_WT | GEN8_PPAT_LLCELLC);
> +     alloc_ppat_entry(ppat, 3, GEN8_PPAT_UC);
> +     alloc_ppat_entry(ppat, 4, GEN8_PPAT_LLCELLC | GEN8_PPAT_AGE(0));

Maybe I'm not following at all, but this seems like quite a change to me? 0-3 
used to be WB, WC, WT, UC, now 1 is unset, like is 5-7.

I'd leave all changes to registers to a later patch and leave this patch to do 
what the title says, "Introduce private PAT management".

>  static void setup_private_pat(struct drm_i915_private *dev_priv)  {
> +     struct intel_ppat *ppat = &dev_priv->ppat;
> +     int i;
> +
> +     ppat->i915 = dev_priv;
> +
> +     /* Load per-platform PPAT configurations */

This comment is again just taking space.

>       if (INTEL_GEN(dev_priv) >= 10)
> -             cnl_setup_private_ppat(dev_priv);
> +             cnl_setup_private_ppat(ppat);
>       else if (IS_CHERRYVIEW(dev_priv) || IS_GEN9_LP(dev_priv))
> -             chv_setup_private_ppat(dev_priv);
> +             chv_setup_private_ppat(ppat);
>       else
> -             bdw_setup_private_ppat(dev_priv);
> +             bdw_setup_private_ppat(ppat);
> +
> +     GEM_BUG_ON(ppat->max_entries > INTEL_MAX_PPAT_ENTRIES);
> +
> +     if (!ppat->max_entries)
> +             return;

I'm not sure why we have "!max_entries" checks going, we should have one 
GEM_BUG_ON at this point and in the rest you can assume it to be nonzero.

> +
> +     /* Fill unused PPAT entries with dummy PPAT value */
> +     for_each_clear_bit(i, ppat->used, ppat->max_entries) {
> +             ppat->entries[i].value = ppat->dummy_value;
> +             ppat->entries[i].ppat = ppat;
> +             set_bit(i, ppat->dirty);
> +     }
> +
> +     /* Write the HW */

If the callback was named update_hw(), this comment would be unnecessary.

> +     ppat->update(dev_priv);
>  }

<SNIP>

> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
> @@ -536,6 +536,40 @@ i915_vm_to_ggtt(struct i915_address_space *vm)
>       return container_of(vm, struct i915_ggtt, base);  }
>  
> +#define INTEL_MAX_PPAT_ENTRIES 8
> +#define INTEL_PPAT_PERFECT_MATCH (~0U)
> +
> +struct intel_ppat;
> +
> +struct intel_ppat_entry {
> +     struct intel_ppat *ppat;
> +     struct kref ref_count;

        Just "ref", like elsewhere in code.

> +     u8 value;
> +};
> +
> +struct intel_ppat {
> +     struct intel_ppat_entry entries[INTEL_MAX_PPAT_ENTRIES];
> +     DECLARE_BITMAP(used, INTEL_MAX_PPAT_ENTRIES);
> +     DECLARE_BITMAP(dirty, INTEL_MAX_PPAT_ENTRIES);
> +

This really goes with the previous group of variable. So no newline.

> +     unsigned int max_entries;
> +
> +     u8 dummy_value;

Better name, like "clear_value" and short description may be useful.

> +     /*
> +      * Return a score to show how two PPAT values match,
> +      * a INTEL_PPAT_PERFECT_MATCH indicates a perfect match
> +      */
> +     unsigned int (*match)(u8 src, u8 dst);
> +     /* Write the PPAT configuration into HW. */
> +     void (*update)(struct drm_i915_private *i915);
> +
> +     struct drm_i915_private *i915;
> +};
> +
> +const struct intel_ppat_entry *intel_ppat_get(struct drm_i915_private 
> +*i915,

Same here with type vs name.

> +                                           u8 value);
> +void intel_ppat_put(const struct intel_ppat_entry *entry);

Regards, joonas
--
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
_______________________________________________
Intel-gfx mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Reply via email to