Re: [PULL] drm-misc-fixes

2021-10-14 Thread Dave Airlie
> - Respun clock fixes for vc4/hdmi.

I was uneasy with these patches due to the number and size of them at
this point in the cycle. Is there any major problem leaving them until
next? I think fixes needs a hard reset and rebase to rc6 when it's
tagged.

If these are super-urgent fixes then I'd rather they come in a topic
branch I can give to Linus separately.

Dave.


[git pull] drm fixes for 5.15-rc6

2021-10-14 Thread Dave Airlie
Hi Linus,

This is the fixes pull for 5.15-rc6.

It has a few scattered msm and i915 fixes, a few core fixes and a
mediatek feature revert.

I've had to pick a bunch of patches into this, as the drm-misc-fixes
tree had a bunch of vc4 patches I wasn't comfortable with sending to
you at least as part of this, they were delayed due to your reverts.
If it's really useful as fixes I'll do a separate pull.

Dave.

drm-fixes-2021-10-15-1:
drm fixes for 5.15-rc6

core:
- clamp fbdev size
- edid cap blocks read to avoid out of bounds

panel:
- fix missing crc32 dependency

msm:
- Fix a new crash on dev file close if the dev file was opened when
  GPU is not loaded (such as missing fw in initrd)
- Switch to single drm_sched_entity per priority level per drm_file
  to unbreak multi-context userspace
- Serialize GMU access to fix GMU OOB errors
- Various error path fixes
- A couple integer overflow fixes
- Fix mdp5 cursor plane WARNs

i915:
- Fix ACPI object leak
- Fix context leak in user proto-context creation
- Fix missing i915_sw_fence_fini call

hyperv:
- hide hw pointer

nouveau:
- fix engine selection bit

r128:
- fix UML build

rcar-du:
- unconncted LVDS regression fix

mediatek:
- revert CMDQ refinement patches
The following changes since commit 64570fbc14f8d7cb3fe3995f20e26bc25ce4b2cc:

  Linux 5.15-rc5 (2021-10-10 17:01:59 -0700)

are available in the Git repository at:

  git://anongit.freedesktop.org/drm/drm tags/drm-fixes-2021-10-15-1

for you to fetch changes up to a14bc107edd0c108bda2245e50daa22f91c95d20:

  drm/panel: olimex-lcd-olinuxino: select CRC32 (2021-10-15 15:05:13 +1000)


drm fixes for 5.15-rc6

core:
- clamp fbdev size
- edid cap blocks read to avoid out of bounds

panel:
- fix missing crc32 dependency

msm:
- Fix a new crash on dev file close if the dev file was opened when
  GPU is not loaded (such as missing fw in initrd)
- Switch to single drm_sched_entity per priority level per drm_file
  to unbreak multi-context userspace
- Serialize GMU access to fix GMU OOB errors
- Various error path fixes
- A couple integer overflow fixes
- Fix mdp5 cursor plane WARNs

i915:
- Fix ACPI object leak
- Fix context leak in user proto-context creation
- Fix missing i915_sw_fence_fini call

hyperv:
- hide hw pointer

nouveau:
- fix engine selection bit

r128:
- fix UML build

rcar-du:
- unconncted LVDS regression fix

mediatek:
- revert CMDQ refinement patches


Arnd Bergmann (1):
  drm/msm/submit: fix overflow check on 64-bit architectures

Chun-Kuang Hu (5):
  Revert "drm/mediatek: Clear pending flag when cmdq packet is done"
  Revert "drm/mediatek: Add cmdq_handle in mtk_crtc"
  Revert "drm/mediatek: Detect CMDQ execution timeout"
  Revert "drm/mediatek: Remove struct cmdq_client"
  Revert "drm/mediatek: Use mailbox rx_callback instead of cmdq_task_cb"

Colin Ian King (1):
  drm/msm: Fix null pointer dereference on pointer edp

Dan Carpenter (4):
  drm/msm/a4xx: fix error handling in a4xx_gpu_init()
  drm/msm/a3xx: fix error handling in a3xx_gpu_init()
  drm/msm/dsi: Fix an error code in msm_dsi_modeset_init()
  drm/msm/dsi: fix off by one in dsi_bus_clk_enable error handling

Dave Airlie (3):
  Merge tag 'drm-msm-fixes-2021-10-11' of
https://gitlab.freedesktop.org/drm/msm into drm-fixes
  Merge tag 'drm-intel-fixes-2021-10-14' of
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
  Merge tag 'mediatek-drm-fixes-5.15' of
https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux
into drm-fixes

Dexuan Cui (1):
  drm/hyperv: Fix double mouse pointers

Dmitry Baryshkov (2):
  drm/msm/mdp5: fix cursor-related warnings
  drm/msm/dsi/phy: fix clock names in 28nm_8960 phy

Douglas Anderson (1):
  drm/edid: In connector_bad_edid() cap num_of_ext by num_blocks read

Fabio Estevam (1):
  drm/msm: Do not run snapshot on non-DPU devices

Kuogee Hsieh (1):
  drm/msm/dp: only signal audio when disconnected detected at dp_pm_resume

Laurent Pinchart (1):
  drm: rcar-du: Don't create encoder for unconnected LVDS outputs

Marek Vasut (2):
  drm/msm: Avoid potential overflow in timeout_to_jiffies()
  drm/nouveau/fifo: Reinstate the correct engine bit programming

Marijn Suijten (1):
  drm/msm/dsi: dsi_phy_14nm: Take ready-bit into account in poll_for_ready

Matthew Auld (1):
  drm/i915: remember to call i915_sw_fence_fini

Matthew Brost (1):
  drm/i915: Fix bug in user proto-context creation that leaked contexts

Randy Dunlap (1):
  drm/r128: fix build for UML

Rob Clark (5):
  drm/msm: Fix crash on dev file close
  drm/msm/a6xx: Serialize GMU communication
  drm/msm/a6xx: Track current ctx by seqno
  drm/msm: A bit more docs + cleanup
  drm/msm: One sched entity per process per priority

Robert Foss (1):
  drm/msm/dpu: Fix address of SM8150 

Re: [Intel-gfx] [PATCH v5] drm/i915/gt: move remaining debugfs interfaces into gt

2021-10-14 Thread Lucas De Marchi

On Thu, Oct 14, 2021 at 02:11:34AM +0200, Andi Shyti wrote:

Hi Lucas,

On Wed, Oct 13, 2021 at 05:04:27PM -0700, Lucas De Marchi wrote:

On Wed, Oct 13, 2021 at 12:17:38AM +0200, Andi Shyti wrote:
> From: Andi Shyti 
>
> The following interfaces:
>
>  i915_wedged
>  i915_forcewake_user
>
> are dependent on gt values. Put them inside gt/ and drop the
> "i915_" prefix name. This would be the new structure:
>
>  dri/0/gt
>  |
>  +-- forcewake_user
>  |
>  \-- reset
>
> For backwards compatibility with existing igt (and the slight
> semantic difference between operating on the i915 abi entry
> points and the deep gt info):
>
>  dri/0
>  |
>  +-- i915_wedged
>  |
>  \-- i915_forcewake_user
>
> remain at the top level.
>
> Signed-off-by: Andi Shyti 
> Cc: Tvrtko Ursulin 
> Cc: Chris Wilson 
> Reviewed-by: Lucas De Marchi 

do you want me to push this?


yes, please.


done, thanks.

Now, about igt: eventually we need to update it to use the gt
debugfs file. Is this something you have already or is it something
we are waiting on multi-gt to land?

Lucas De Marchi



Thanks,
Andi


Re: [PATCH v1 2/2] mm: remove extra ZONE_DEVICE struct page refcount

2021-10-14 Thread Sierra Guiza, Alejandro (Alex)



On 10/14/2021 3:57 PM, Ralph Campbell wrote:


On 10/14/21 11:01 AM, Jason Gunthorpe wrote:

On Thu, Oct 14, 2021 at 10:35:27AM -0700, Ralph Campbell wrote:


I ran xfstests-dev using the kernel boot option to "fake" a pmem device
when I first posted this patch. The tests ran OK (or at least the same
tests passed with and without my patch).

Hmm. I know nothing of xfstests but

tests/generic/413

Looks kind of like it might cover this situation?

Did it run for you?

Jason


I don't remember. I'll have to rerun the test which might take a day 
or two

to set up again.


I just ran this generic/413 on my side using pmem fake device. It does fail.
I remember we proposed a fix on this patch before try_get_page was removed.
@@ -1186,7 +1153,7 @@ bool __must_check try_grab_page(struct page *page, 
unsigned int flags);

 static inline __must_check bool try_get_page(struct page *page)
 {
    page = compound_head(page);
-   if (WARN_ON_ONCE(page_ref_count(page) <= 0))
+   if (WARN_ON_ONCE(page_ref_count(page) < 
(int)!is_zone_device_page(page)))

    return false;
    page_ref_inc(page);
    return true;

Alex



RE: [RFC v1 3/6] drm: Add a capability flag to support additional flip completion signalling

2021-10-14 Thread Kasireddy, Vivek
Hi Pekka,
Thank you for reviewing this patch.
 
> On Mon, 13 Sep 2021 16:35:26 -0700
> Vivek Kasireddy  wrote:
> 
> > If a driver supports this capability, it means that there would be an
> > additional signalling mechanism for a page flip completion in addition
> > to out_fence or DRM_MODE_PAGE_FLIP_EVENT.
> >
> > This capability may only be relevant for Virtual KMS drivers and is 
> > currently
> > used only by virtio-gpu. Also, it can provide a potential solution for:
> > https://gitlab.freedesktop.org/wayland/weston/-/issues/514
> >
> > Signed-off-by: Vivek Kasireddy 
> > ---
> >  drivers/gpu/drm/drm_ioctl.c   | 3 +++
> >  include/drm/drm_mode_config.h | 8 
> >  include/uapi/drm/drm.h| 1 +
> >  3 files changed, 12 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c
> > index 8b8744dcf691..8a420844f8bc 100644
> > --- a/drivers/gpu/drm/drm_ioctl.c
> > +++ b/drivers/gpu/drm/drm_ioctl.c
> > @@ -302,6 +302,9 @@ static int drm_getcap(struct drm_device *dev, void 
> > *data, struct
> drm_file *file_
> > case DRM_CAP_CRTC_IN_VBLANK_EVENT:
> > req->value = 1;
> > break;
> > +   case DRM_CAP_RELEASE_FENCE:
> > +   req->value = dev->mode_config.release_fence;
> > +   break;
> 
> Hi Vivek,
> 
> is this actually necessary?
> 
> I would think that userspace figures out the existence of the release
> fence capability by seeing that the KMS property "RELEASE_FENCE_PTR"
> either exists or not.
[Vivek] Yeah, that makes sense. However, in order for the userspace to not see
this property, we'd have to prevent drm core from exposing it; which means we
need to check dev->mode_config.release_fence before attaching the property
to the crtc.

> 
> However, would we not need a client cap instead?
> 
> If a KMS driver knows that userspace is aware of "RELEASE_FENCE_PTR"
> and will use it when necessary, then the KMS driver can send the
> pageflip completion without waiting for the host OS to signal the old
> buffer as free for re-use.
[Vivek] Right, the KMS driver can just look at whether the release_fence was
added by the userspace (in the atomic commit) to determine whether it needs
to wait for the old fb.

> 
> If the KMS driver does not know that userspace can handle pageflip
> completing "too early", then it has no choice but to wait until the old
> buffer is really free before signalling pageflip completion.
> 
> Wouldn't that make sense?
[Vivek] Yes; DRM_CAP_RELEASE_FENCE may not be necessary to
implement the behavior you suggest which makes sense.

> 
> 
> Otherwise, this proposal sounds fine to me.
[Vivek] Did you get a chance to review the Weston MR:
https://gitlab.freedesktop.org/wayland/weston/-/merge_requests/668

Could you please take a look?

Thanks,
Vivek

> 
> 
> Thanks,
> pq
> 
> 
> > default:
> > return -EINVAL;
> > }
> > diff --git a/include/drm/drm_mode_config.h b/include/drm/drm_mode_config.h
> > index 12b964540069..944bebf359d7 100644
> > --- a/include/drm/drm_mode_config.h
> > +++ b/include/drm/drm_mode_config.h
> > @@ -935,6 +935,14 @@ struct drm_mode_config {
> >  */
> > bool normalize_zpos;
> >
> > +   /**
> > +* @release_fence:
> > +*
> > +* If this option is set, it means there would be an additional 
> > signalling
> > +* mechanism for a page flip completion.
> > +*/
> > +   bool release_fence;
> > +
> > /**
> >  * @modifiers_property: Plane property to list support modifier/format
> >  * combination.
> > diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h
> > index 3b810b53ba8b..8b8985f65581 100644
> > --- a/include/uapi/drm/drm.h
> > +++ b/include/uapi/drm/drm.h
> > @@ -767,6 +767,7 @@ struct drm_gem_open {
> >   * Documentation/gpu/drm-mm.rst, section "DRM Sync Objects".
> >   */
> >  #define DRM_CAP_SYNCOBJ_TIMELINE   0x14
> > +#define DRM_CAP_RELEASE_FENCE  0x15
> >
> >  /* DRM_IOCTL_GET_CAP ioctl argument type */
> >  struct drm_get_cap {



Re: [PATCH v8 6/8] drm/i915/ttm: move shrinker management into adjust_lru

2021-10-14 Thread kernel test robot

Hi Matthew,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on drm-tip/drm-tip]
[also build test WARNING on next-20211012]
[cannot apply to drm-intel/for-linux-next drm-exynos/exynos-drm-next 
tegra-drm/drm/tegra/for-next airlied/drm-next v5.15-rc5]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Matthew-Auld/drm-i915-gem-Break-out-some-shmem-backend-utils/20211011-230443
base:   git://anongit.freedesktop.org/drm/drm-tip drm-tip
config: i386-randconfig-c001-20211012 (attached as .config)
compiler: clang version 14.0.0 (https://github.com/llvm/llvm-project 
c3dcf39554dbea780d6cb7e12239451ba47a2668)
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# 
https://github.com/0day-ci/linux/commit/43578548b98c603bc1bd2840b3e50bfde8bf0772
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review 
Matthew-Auld/drm-i915-gem-Break-out-some-shmem-backend-utils/20211011-230443
git checkout 43578548b98c603bc1bd2840b3e50bfde8bf0772
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=i386 
clang-analyzer

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 


clang-analyzer warnings: (new ones prefixed by >>)

>> drivers/gpu/drm/i915/gem/i915_gem_ttm.c:865:23: warning: Value stored to 
'i915_tt' during its initialization is never read [clang-analyzer-deadcode.DeadStores]
   struct i915_ttm_tt *i915_tt =
   ^~~

vim +/i915_tt +865 drivers/gpu/drm/i915/gem/i915_gem_ttm.c

213d5092776345a Thomas Hellström 2021-06-10  819
b6e913e19c54edd Thomas Hellström 2021-06-29  820  static int 
__i915_ttm_get_pages(struct drm_i915_gem_object *obj,
b6e913e19c54edd Thomas Hellström 2021-06-29  821
struct ttm_placement *placement)
213d5092776345a Thomas Hellström 2021-06-10  822  {
213d5092776345a Thomas Hellström 2021-06-10  823struct 
ttm_buffer_object *bo = i915_gem_to_ttm(obj);
213d5092776345a Thomas Hellström 2021-06-10  824struct 
ttm_operation_ctx ctx = {
213d5092776345a Thomas Hellström 2021-06-10  825.interruptible 
= true,
213d5092776345a Thomas Hellström 2021-06-10  826.no_wait_gpu = 
false,
213d5092776345a Thomas Hellström 2021-06-10  827};
213d5092776345a Thomas Hellström 2021-06-10  828struct sg_table *st;
b07a6483839a838 Thomas Hellström 2021-06-18  829int real_num_busy;
213d5092776345a Thomas Hellström 2021-06-10  830int ret;
213d5092776345a Thomas Hellström 2021-06-10  831
b07a6483839a838 Thomas Hellström 2021-06-18  832/* First try only the 
requested placement. No eviction. */
b6e913e19c54edd Thomas Hellström 2021-06-29  833real_num_busy = 
fetch_and_zero(>num_busy_placement);
b6e913e19c54edd Thomas Hellström 2021-06-29  834ret = ttm_bo_validate(bo, 
placement, );
b07a6483839a838 Thomas Hellström 2021-06-18  835if (ret) {
b07a6483839a838 Thomas Hellström 2021-06-18  836ret = 
i915_ttm_err_to_gem(ret);
b07a6483839a838 Thomas Hellström 2021-06-18  837/*
b07a6483839a838 Thomas Hellström 2021-06-18  838 * Anything 
that wants to restart the operation gets to
b07a6483839a838 Thomas Hellström 2021-06-18  839 * do that.
b07a6483839a838 Thomas Hellström 2021-06-18  840 */
b07a6483839a838 Thomas Hellström 2021-06-18  841if (ret == 
-EDEADLK || ret == -EINTR || ret == -ERESTARTSYS ||
b07a6483839a838 Thomas Hellström 2021-06-18  842ret == 
-EAGAIN)
b07a6483839a838 Thomas Hellström 2021-06-18  843return 
ret;
b07a6483839a838 Thomas Hellström 2021-06-18  844
b07a6483839a838 Thomas Hellström 2021-06-18  845/*
b07a6483839a838 Thomas Hellström 2021-06-18  846 * If the 
initial attempt fails, allow all accepted placements,
b07a6483839a838 Thomas Hellström 2021-06-18  847 * evicting if 
necessary.
b07a6483839a838 Thomas Hellström 2021-06-18  848 */
b6e913e19c54edd Thomas Hellström 2021-06-29  849
placement->num_busy_placement = real_num_busy;
b6e913e19c54edd Thomas Hellström 2021-06-29  850ret = 
ttm_bo_validate(bo, placement, );
213d5092776345a Thomas Hellström 2021-06-10  851if (ret)
b07a6483839a838 Thomas Hellström 2021-06-18  852return 
i915_ttm_err_to_gem(ret);
b07a6483839a838 Thomas Hellström 2021-06-18  853}
213d5092776345a Thomas 

Re: [bug report] drm/msm: dsi: Handle dual-channel for 6G as well

2021-10-14 Thread Jessica Zhang

Hey Dan,

On 10/1/2021 5:31 AM, Dan Carpenter wrote:

Hello Sean Paul,

The patch a6bcddbc2ee1: "drm/msm: dsi: Handle dual-channel for 6G as
well" from Jul 25, 2018, leads to the following
Smatch static checker warning:

drivers/gpu/drm/msm/dsi/dsi_host.c:729 dsi_calc_clk_rate_6g()
warn: wrong type for 'msm_host->esc_clk_rate' (should be 'ulong')

drivers/gpu/drm/msm/dsi/dsi_host.c
 721 int dsi_calc_clk_rate_6g(struct msm_dsi_host *msm_host, bool 
is_bonded_dsi)
 722 {
 723 if (!msm_host->mode) {
 724 pr_err("%s: mode not set\n", __func__);
 725 return -EINVAL;
 726 }
 727
 728 dsi_calc_pclk(msm_host, is_bonded_dsi);
--> 729 msm_host->esc_clk_rate = clk_get_rate(msm_host->esc_clk);
 ^^
I don't know why Smatch is suddenly warning about ancient msm code, but
clock rates should be unsigned long.  (I don't remember why).

 730 return 0;
 731 }


I'm unable to recreate the warning with Smatch. After running 
build_kernel_data.sh, I ran `/smatch_scripts/kchecker 
drivers/gpu/drm/msm/dsi/dsi_host.c` and got the following output:


CHECK scripts/mod/empty.c
CALL scripts/checksyscalls.sh
CALL scripts/atomic/check-atomics.sh
CHECK arch/arm64/kernel/vdso/vgettimeofday.c
CC drivers/gpu/drm/msm/dsi/dsi_host.o
CHECK drivers/gpu/drm/msm/dsi/dsi_host.c
drivers/gpu/drm/msm/dsi/dsi_host.c:2380 msm_dsi_host_power_on() warn: 
missing error code 'ret'


Is there a specific .config you're using (that's not the default 
mainline defconfig)? If so, can you please share it?


Thanks,

Jessica Zhang


regards,
dan carpenter


Re: [PATCH v1 2/2] mm: remove extra ZONE_DEVICE struct page refcount

2021-10-14 Thread Dan Williams
On Thu, Oct 14, 2021 at 4:06 PM Jason Gunthorpe  wrote:
>
> On Thu, Oct 14, 2021 at 12:01:14PM -0700, Dan Williams wrote:
> > > > Does anyone know why devmap is pte_special anyhow?
> >
> > It does not need to be special as mentioned here:
> >
> > https://lore.kernel.org/all/CAPcyv4iFeVDVPn6uc=aksyuvkiu3-fk-n16ijvzq3n8ot00...@mail.gmail.com/
>
> I added a remark there
>
> Not special means more to me, it means devmap should do the refcounts
> properly like normal memory pages.
>
> It means vm_normal_page should return !NULL and it means insert_page,
> not insert_pfn should be used to install them in the PTE. VMAs should
> not be MIXED MAP, but normal struct page maps.
>
> I think this change alone would fix all the refcount problems
> everwhere in DAX and devmap.
>
> > The refcount dependencies also go away after this...
> >
> > https://lore.kernel.org/all/161604050866.1463742.7759521510383551055.st...@dwillia2-desk3.amr.corp.intel.com/
> >
> > ...but you can see that patches 1 and 2 in that series depend on being
> > able to guarantee that all mappings are invalidated when the undelying
> > device that owns the pgmap goes away.
>
> If I have put everything together right this is because of what I
> pointed to here. FS-DAX is installing 0 refcount pages into PTEs and
> expecting that to work sanely.
>
> This means the page map cannot be removed until all the PTEs are fully
> flushed, which buggily doesn't happen because of the missing unplug.
>
> However, this is all because nobody incrd a refcount to represent the
> reference in the PTE and since this ment that 0 refcount pages were
> wrongly stuffed into PTEs then devmap used the refcount == 1 hack to
> unbreak GUP?
>
> So.. Is there some reason why devmap pages are trying so hard to avoid
> sane refcounting???

I wouldn't put it that way. It's more that the original sin of
ZONE_DEVICE that sought to reuse the lru field space, by never having
a zero recount, then got layered upon and calcified in malignant ways.
In the meantime surrounding infrastructure got decrustified. Work like
the 'struct page' cleanup among other things, made it clearer and
clearer over time that the original design choice needed to be fixed.

> If the PTE itself holds the refcount (by not being special) then there
> is no need for the pagemap stuff in GUP. pagemap already waits for
> refs to go to 0 so the missing shootdown during nvdimm unplug will
> cause pagemap to block until the address spaces are invalidated. IMHO
> this is already better than the current buggy situation of allowing
> continued PTE reference to memory that is now removed from the system.
>
> > For that to happen there needs to be communication back to the FS for
> > device-gone / failure events. That work is in progress via this
> > series:
> >
> > https://lore.kernel.org/all/20210924130959.2695749-1-ruansy.f...@fujitsu.com/
>
> This is fine, but I don't think it should block fixing the mm side -
> the end result here still cannot be 0 ref count pages installed in
> PTEs.
>
> Fixing that does not depend on shootdown during device removal, right?
>
> It requires holding refcounts while pages are installed into address
> spaces - and this lack is a direct cause of making the PTEs all
> special and using insert_pfn and MIXED_MAP.

The MIXED_MAP and insert_pfn were a holdover from page-less DAX, but
now that we have page-available DAX, yes, we can skip the FS
notification and just rely on typical refcounting and hanging until
the FS has a chance to uninstall the PTEs. You're right, the FS
notification is an improvement to the conversion, not a requirement.

However, there still needs to be something in the gup-fast path to
indicate that GUP_LONGTERM is not possible because the PTE represents
a pfn that can not support typical page-cache behavior for truncate
which is to just disconnect the page from the file and keep the page
pinned indefinitely. I think the "no longterm" caveat would be the
only remaining utility of PTE_DEVMAP after the above conversion to use
typical page refcounts throughout DAX.


[PATCH 1/2] drm/i915/pmu: Add a name to the execlists stats

2021-10-14 Thread Umesh Nerlige Ramappa
In preparation for GuC pmu stats, add a name to the execlists stats
structure so that it can be differentiated from the GuC stats.

Signed-off-by: Umesh Nerlige Ramappa 
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c| 14 +++---
 drivers/gpu/drm/i915/gt/intel_engine_stats.h | 33 +++--
 drivers/gpu/drm/i915/gt/intel_engine_types.h | 52 +++-
 3 files changed, 53 insertions(+), 46 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 2ae57e4656a3..38436f4b5706 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -361,7 +361,7 @@ static int intel_engine_setup(struct intel_gt *gt, enum 
intel_engine_id id)
DRIVER_CAPS(i915)->has_logical_contexts = true;
 
ewma__engine_latency_init(>latency);
-   seqcount_init(>stats.lock);
+   seqcount_init(>stats.execlists.lock);
 
ATOMIC_INIT_NOTIFIER_HEAD(>context_status_notifier);
 
@@ -1876,15 +1876,16 @@ void intel_engine_dump(struct intel_engine_cs *engine,
 static ktime_t __intel_engine_get_busy_time(struct intel_engine_cs *engine,
ktime_t *now)
 {
-   ktime_t total = engine->stats.total;
+   struct intel_engine_execlists_stats *stats = >stats.execlists;
+   ktime_t total = stats->total;
 
/*
 * If the engine is executing something at the moment
 * add it to the total.
 */
*now = ktime_get();
-   if (READ_ONCE(engine->stats.active))
-   total = ktime_add(total, ktime_sub(*now, engine->stats.start));
+   if (READ_ONCE(stats->active))
+   total = ktime_add(total, ktime_sub(*now, stats->start));
 
return total;
 }
@@ -1898,13 +1899,14 @@ static ktime_t __intel_engine_get_busy_time(struct 
intel_engine_cs *engine,
  */
 ktime_t intel_engine_get_busy_time(struct intel_engine_cs *engine, ktime_t 
*now)
 {
+   struct intel_engine_execlists_stats *stats = >stats.execlists;
unsigned int seq;
ktime_t total;
 
do {
-   seq = read_seqcount_begin(>stats.lock);
+   seq = read_seqcount_begin(>lock);
total = __intel_engine_get_busy_time(engine, now);
-   } while (read_seqcount_retry(>stats.lock, seq));
+   } while (read_seqcount_retry(>lock, seq));
 
return total;
 }
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_stats.h 
b/drivers/gpu/drm/i915/gt/intel_engine_stats.h
index 24fbdd94351a..8e762d683e50 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_stats.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_stats.h
@@ -15,45 +15,46 @@
 
 static inline void intel_engine_context_in(struct intel_engine_cs *engine)
 {
+   struct intel_engine_execlists_stats *stats = >stats.execlists;
unsigned long flags;
 
-   if (engine->stats.active) {
-   engine->stats.active++;
+   if (stats->active) {
+   stats->active++;
return;
}
 
/* The writer is serialised; but the pmu reader may be from hardirq */
local_irq_save(flags);
-   write_seqcount_begin(>stats.lock);
+   write_seqcount_begin(>lock);
 
-   engine->stats.start = ktime_get();
-   engine->stats.active++;
+   stats->start = ktime_get();
+   stats->active++;
 
-   write_seqcount_end(>stats.lock);
+   write_seqcount_end(>lock);
local_irq_restore(flags);
 
-   GEM_BUG_ON(!engine->stats.active);
+   GEM_BUG_ON(!stats->active);
 }
 
 static inline void intel_engine_context_out(struct intel_engine_cs *engine)
 {
+   struct intel_engine_execlists_stats *stats = >stats.execlists;
unsigned long flags;
 
-   GEM_BUG_ON(!engine->stats.active);
-   if (engine->stats.active > 1) {
-   engine->stats.active--;
+   GEM_BUG_ON(!stats->active);
+   if (stats->active > 1) {
+   stats->active--;
return;
}
 
local_irq_save(flags);
-   write_seqcount_begin(>stats.lock);
+   write_seqcount_begin(>lock);
 
-   engine->stats.active--;
-   engine->stats.total =
-   ktime_add(engine->stats.total,
- ktime_sub(ktime_get(), engine->stats.start));
+   stats->active--;
+   stats->total = ktime_add(stats->total,
+ktime_sub(ktime_get(), stats->start));
 
-   write_seqcount_end(>stats.lock);
+   write_seqcount_end(>lock);
local_irq_restore(flags);
 }
 
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h 
b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index 9167ce52487c..b820a2c1124e 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -257,6 +257,33 @@ struct intel_engine_execlists {
 
 #define INTEL_ENGINE_CS_MAX_NAME 8
 
+struct intel_engine_execlists_stats {
+   /**
+* @active: Number 

[PATCH 2/2] drm/i915/pmu: Connect engine busyness stats from GuC to pmu

2021-10-14 Thread Umesh Nerlige Ramappa
With GuC handling scheduling, i915 is not aware of the time that a
context is scheduled in and out of the engine. Since i915 pmu relies on
this info to provide engine busyness to the user, GuC shares this info
with i915 for all engines using shared memory. For each engine, this
info contains:

- total busyness: total time that the context was running (total)
- id: id of the running context (id)
- start timestamp: timestamp when the context started running (start)

At the time (now) of sampling the engine busyness, if the id is valid
(!= ~0), and start is non-zero, then the context is considered to be
active and the engine busyness is calculated using the below equation

engine busyness = total + (now - start)

All times are obtained from the gt clock base. For inactive contexts,
engine busyness is just equal to the total.

The start and total values provided by GuC are 32 bits and wrap around
in a few minutes. Since perf pmu provides busyness as 64 bit
monotonically increasing values, there is a need for this implementation
to account for overflows and extend the time to 64 bits before returning
busyness to the user. In order to do that, a worker runs periodically at
frequency = 1/8th the time it takes for the timestamp to wrap. As an
example, that would be once in 27 seconds for a gt clock frequency of
19.2 MHz.

Note:
There might be an overaccounting of busyness due to the fact that GuC
may be updating the total and start values while kmd is reading them.
(i.e kmd may read the updated total and the stale start). In such a
case, user may see higher busyness value followed by smaller ones which
would eventually catch up to the higher value.

v2: (Tvrtko)
- Include details in commit message
- Move intel engine busyness function into execlist code
- Use union inside engine->stats
- Use natural type for ping delay jiffies
- Drop active_work condition checks
- Use for_each_engine if iterating all engines
- Drop seq locking, use spinlock at guc level to update engine stats
- Document worker specific details

v3: (Tvrtko/Umesh)
- Demarcate guc and execlist stat objects with comments
- Document known over-accounting issue in commit
- Provide a consistent view of guc state
- Add hooks to gt park/unpark for guc busyness
- Stop/start worker in gt park/unpark path
- Drop inline
- Move spinlock and worker inits to guc initialization
- Drop helpers that are called only once

v4: (Tvrtko/Matt/Umesh)
- Drop addressed opens from commit message
- Get runtime pm in ping, remove from the park path
- Use cancel_delayed_work_sync in disable_submission path
- Update stats during reset prepare
- Skip ping if reset in progress
- Explicitly name execlists and guc stats objects
- Since disable_submission is called from many places, move resetting
  stats to intel_guc_submission_reset_prepare

v5: (Tvrtko)
- Add a trylock helper that does not sleep and synchronize PMU event
  callbacks and worker with gt reset

v6: (CI BAT failures)
- DUTs using execlist submission failed to boot since __gt_unpark is
  called during i915 load. This ends up calling the guc busyness unpark
  hook and results in kiskstarting an uninitialized worker. Let
  park/unpark hooks check if guc submission has been initialized.
- drop cant_sleep() from trylock hepler since rcu_read_lock takes care
  of that.

Signed-off-by: John Harrison 
Signed-off-by: Umesh Nerlige Ramappa 
Acked-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c |  28 +-
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |  33 ++-
 .../drm/i915/gt/intel_execlists_submission.c  |  34 +++
 drivers/gpu/drm/i915/gt/intel_gt_pm.c |   2 +
 drivers/gpu/drm/i915/gt/intel_reset.c |  15 +
 drivers/gpu/drm/i915/gt/intel_reset.h |   1 +
 .../gpu/drm/i915/gt/uc/abi/guc_actions_abi.h  |   1 +
 drivers/gpu/drm/i915/gt/uc/intel_guc.h|  30 ++
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c|  21 ++
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.h|   5 +
 drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   |  13 +
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 273 ++
 .../gpu/drm/i915/gt/uc/intel_guc_submission.h |   2 +
 drivers/gpu/drm/i915/i915_reg.h   |   2 +
 14 files changed, 432 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 38436f4b5706..6b783fdcba2a 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -1873,23 +1873,6 @@ void intel_engine_dump(struct intel_engine_cs *engine,
intel_engine_print_breadcrumbs(engine, m);
 }
 
-static ktime_t __intel_engine_get_busy_time(struct intel_engine_cs *engine,
-   ktime_t *now)
-{
-   struct intel_engine_execlists_stats *stats = >stats.execlists;
-   ktime_t total = stats->total;
-
-   /*
-* If the engine is executing something at the moment
-* add it to the total.
- 

Re: [PATCH v12 05/35] dt-bindings: clock: tegra-car: Document new clock sub-nodes

2021-10-14 Thread Dmitry Osipenko
15.10.2021 03:45, Stephen Boyd пишет:
> Quoting Dmitry Osipenko (2021-10-14 17:43:49)
>> 15.10.2021 03:16, Stephen Boyd пишет:
>>> Quoting Dmitry Osipenko (2021-09-20 11:11:15)
 diff --git 
 a/Documentation/devicetree/bindings/clock/nvidia,tegra20-car.yaml 
 b/Documentation/devicetree/bindings/clock/nvidia,tegra20-car.yaml
 index 459d2a525393..f832abb7f11a 100644
 --- a/Documentation/devicetree/bindings/clock/nvidia,tegra20-car.yaml
 +++ b/Documentation/devicetree/bindings/clock/nvidia,tegra20-car.yaml
 @@ -42,6 +42,36 @@ properties:
"#reset-cells":
  const: 1
  
 +patternProperties:
 +  "^(sclk)|(pll-[cem])$":
 +type: object
 +properties:
 +  compatible:
 +enum:
 +  - nvidia,tegra20-sclk
 +  - nvidia,tegra30-sclk
 +  - nvidia,tegra30-pllc
 +  - nvidia,tegra30-plle
 +  - nvidia,tegra30-pllm
 +
 +  operating-points-v2: true
 +
 +  clocks:
 +items:
 +  - description: node's clock
 +
 +  power-domains:
 +maxItems: 1
 +description: phandle to the core SoC power domain
>>>
>>> Is this done to associate the power domain with a particular clk? And an
>>> OPP table with a particular clk?
>>
>> Yes
>>
> 
> Ok. Can Ulf/Viresh review this patch series?

They already did, please see v13 [1].

[1] https://lore.kernel.org/lkml/20210926224058.1252-1-dig...@gmail.com/


Re: [PATCH 2/2] drm/i915/pmu: Connect engine busyness stats from GuC to pmu

2021-10-14 Thread Umesh Nerlige Ramappa

On Thu, Oct 14, 2021 at 09:21:28AM +0100, Tvrtko Ursulin wrote:


On 13/10/2021 01:56, Umesh Nerlige Ramappa wrote:

With GuC handling scheduling, i915 is not aware of the time that a
context is scheduled in and out of the engine. Since i915 pmu relies on
this info to provide engine busyness to the user, GuC shares this info
with i915 for all engines using shared memory. For each engine, this
info contains:

- total busyness: total time that the context was running (total)
- id: id of the running context (id)
- start timestamp: timestamp when the context started running (start)

At the time (now) of sampling the engine busyness, if the id is valid
(!= ~0), and start is non-zero, then the context is considered to be
active and the engine busyness is calculated using the below equation

engine busyness = total + (now - start)

All times are obtained from the gt clock base. For inactive contexts,
engine busyness is just equal to the total.

The start and total values provided by GuC are 32 bits and wrap around
in a few minutes. Since perf pmu provides busyness as 64 bit
monotonically increasing values, there is a need for this implementation
to account for overflows and extend the time to 64 bits before returning
busyness to the user. In order to do that, a worker runs periodically at
frequency = 1/8th the time it takes for the timestamp to wrap. As an
example, that would be once in 27 seconds for a gt clock frequency of
19.2 MHz.

Note:
There might be an overaccounting of busyness due to the fact that GuC
may be updating the total and start values while kmd is reading them.
(i.e kmd may read the updated total and the stale start). In such a
case, user may see higher busyness value followed by smaller ones which
would eventually catch up to the higher value.

v2: (Tvrtko)
- Include details in commit message
- Move intel engine busyness function into execlist code
- Use union inside engine->stats
- Use natural type for ping delay jiffies
- Drop active_work condition checks
- Use for_each_engine if iterating all engines
- Drop seq locking, use spinlock at guc level to update engine stats
- Document worker specific details

v3: (Tvrtko/Umesh)
- Demarcate guc and execlist stat objects with comments
- Document known over-accounting issue in commit
- Provide a consistent view of guc state
- Add hooks to gt park/unpark for guc busyness
- Stop/start worker in gt park/unpark path
- Drop inline
- Move spinlock and worker inits to guc initialization
- Drop helpers that are called only once

v4: (Tvrtko/Matt/Umesh)
- Drop addressed opens from commit message
- Get runtime pm in ping, remove from the park path
- Use cancel_delayed_work_sync in disable_submission path
- Update stats during reset prepare
- Skip ping if reset in progress
- Explicitly name execlists and guc stats objects
- Since disable_submission is called from many places, move resetting
  stats to intel_guc_submission_reset_prepare

v5: (Tvrtko)
- Add a trylock helper that does not sleep and synchronize PMU event
  callbacks and worker with gt reset

Signed-off-by: John Harrison 
Signed-off-by: Umesh Nerlige Ramappa 
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c |  28 +-
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |  33 ++-
 .../drm/i915/gt/intel_execlists_submission.c  |  34 +++
 drivers/gpu/drm/i915/gt/intel_gt_pm.c |   2 +
 drivers/gpu/drm/i915/gt/intel_reset.c |  16 ++
 drivers/gpu/drm/i915/gt/intel_reset.h |   1 +
 .../gpu/drm/i915/gt/uc/abi/guc_actions_abi.h  |   1 +
 drivers/gpu/drm/i915/gt/uc/intel_guc.h|  30 ++
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c|  21 ++
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.h|   5 +
 drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   |  13 +
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 267 ++
 .../gpu/drm/i915/gt/uc/intel_guc_submission.h |   2 +
 drivers/gpu/drm/i915/i915_reg.h   |   2 +
 14 files changed, 427 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 38436f4b5706..6b783fdcba2a 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -1873,23 +1873,6 @@ void intel_engine_dump(struct intel_engine_cs *engine,
intel_engine_print_breadcrumbs(engine, m);
 }
-static ktime_t __intel_engine_get_busy_time(struct intel_engine_cs *engine,
-   ktime_t *now)
-{
-   struct intel_engine_execlists_stats *stats = >stats.execlists;
-   ktime_t total = stats->total;
-
-   /*
-* If the engine is executing something at the moment
-* add it to the total.
-*/
-   *now = ktime_get();
-   if (READ_ONCE(stats->active))
-   total = ktime_add(total, ktime_sub(*now, stats->start));
-
-   return total;
-}
-
 /**
  * intel_engine_get_busy_time() - Return current accumulated engine busyness
  * @engine: engine to 

Re: [PATCH v12 05/35] dt-bindings: clock: tegra-car: Document new clock sub-nodes

2021-10-14 Thread Stephen Boyd
Quoting Dmitry Osipenko (2021-10-14 17:43:49)
> 15.10.2021 03:16, Stephen Boyd пишет:
> > Quoting Dmitry Osipenko (2021-09-20 11:11:15)
> >> diff --git 
> >> a/Documentation/devicetree/bindings/clock/nvidia,tegra20-car.yaml 
> >> b/Documentation/devicetree/bindings/clock/nvidia,tegra20-car.yaml
> >> index 459d2a525393..f832abb7f11a 100644
> >> --- a/Documentation/devicetree/bindings/clock/nvidia,tegra20-car.yaml
> >> +++ b/Documentation/devicetree/bindings/clock/nvidia,tegra20-car.yaml
> >> @@ -42,6 +42,36 @@ properties:
> >>"#reset-cells":
> >>  const: 1
> >>  
> >> +patternProperties:
> >> +  "^(sclk)|(pll-[cem])$":
> >> +type: object
> >> +properties:
> >> +  compatible:
> >> +enum:
> >> +  - nvidia,tegra20-sclk
> >> +  - nvidia,tegra30-sclk
> >> +  - nvidia,tegra30-pllc
> >> +  - nvidia,tegra30-plle
> >> +  - nvidia,tegra30-pllm
> >> +
> >> +  operating-points-v2: true
> >> +
> >> +  clocks:
> >> +items:
> >> +  - description: node's clock
> >> +
> >> +  power-domains:
> >> +maxItems: 1
> >> +description: phandle to the core SoC power domain
> > 
> > Is this done to associate the power domain with a particular clk? And an
> > OPP table with a particular clk?
> 
> Yes
> 

Ok. Can Ulf/Viresh review this patch series?


Re: [PATCH v12 05/35] dt-bindings: clock: tegra-car: Document new clock sub-nodes

2021-10-14 Thread Dmitry Osipenko
15.10.2021 03:16, Stephen Boyd пишет:
> Quoting Dmitry Osipenko (2021-09-20 11:11:15)
>> diff --git a/Documentation/devicetree/bindings/clock/nvidia,tegra20-car.yaml 
>> b/Documentation/devicetree/bindings/clock/nvidia,tegra20-car.yaml
>> index 459d2a525393..f832abb7f11a 100644
>> --- a/Documentation/devicetree/bindings/clock/nvidia,tegra20-car.yaml
>> +++ b/Documentation/devicetree/bindings/clock/nvidia,tegra20-car.yaml
>> @@ -42,6 +42,36 @@ properties:
>>"#reset-cells":
>>  const: 1
>>  
>> +patternProperties:
>> +  "^(sclk)|(pll-[cem])$":
>> +type: object
>> +properties:
>> +  compatible:
>> +enum:
>> +  - nvidia,tegra20-sclk
>> +  - nvidia,tegra30-sclk
>> +  - nvidia,tegra30-pllc
>> +  - nvidia,tegra30-plle
>> +  - nvidia,tegra30-pllm
>> +
>> +  operating-points-v2: true
>> +
>> +  clocks:
>> +items:
>> +  - description: node's clock
>> +
>> +  power-domains:
>> +maxItems: 1
>> +description: phandle to the core SoC power domain
> 
> Is this done to associate the power domain with a particular clk? And an
> OPP table with a particular clk?

Yes



Re: [PATCH] drm: msm: fix building without CONFIG_COMMON_CLK

2021-10-14 Thread Dmitry Baryshkov

On 13/10/2021 17:42, Arnd Bergmann wrote:

From: Arnd Bergmann 

When CONFIG_COMMON_CLOCK is disabled, the 8996 specific
phy code is left out, which results in a link failure:

ld: drivers/gpu/drm/msm/hdmi/hdmi_phy.o:(.rodata+0x3f0): undefined reference to 
`msm_hdmi_phy_8996_cfg'

This was only exposed after it became possible to build
test the driver without the clock interfaces.

Make COMMON_CLK a hard dependency for compile testing,
and simplify it a little based on that.

Fixes: b3ed524f84f5 ("drm/msm: allow compile_test on !ARM")
Reported-by: Randy Dunlap 
Suggested-by: Geert Uytterhoeven 
Signed-off-by: Arnd Bergmann 


This drops dependency on CONFIG_OF. While ARM64 selects OF, pure ARM 
does not.


With that fixed:
Reviewed-by: Dmitry Baryshkov 


---
  drivers/gpu/drm/msm/Kconfig  | 2 +-
  drivers/gpu/drm/msm/Makefile | 6 +++---
  2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/msm/Kconfig b/drivers/gpu/drm/msm/Kconfig
index f5107b6ded7b..cb204912e0f4 100644
--- a/drivers/gpu/drm/msm/Kconfig
+++ b/drivers/gpu/drm/msm/Kconfig
@@ -4,8 +4,8 @@ config DRM_MSM
tristate "MSM DRM"
depends on DRM
depends on ARCH_QCOM || SOC_IMX5 || COMPILE_TEST
+   depends on COMMON_CLK
depends on IOMMU_SUPPORT
-   depends on (OF && COMMON_CLK) || COMPILE_TEST
depends on QCOM_OCMEM || QCOM_OCMEM=n
depends on QCOM_LLCC || QCOM_LLCC=n
depends on QCOM_COMMAND_DB || QCOM_COMMAND_DB=n
diff --git a/drivers/gpu/drm/msm/Makefile b/drivers/gpu/drm/msm/Makefile
index 904535eda0c4..bbee22b54b0c 100644
--- a/drivers/gpu/drm/msm/Makefile
+++ b/drivers/gpu/drm/msm/Makefile
@@ -23,8 +23,10 @@ msm-y := \
hdmi/hdmi_i2c.o \
hdmi/hdmi_phy.o \
hdmi/hdmi_phy_8960.o \
+   hdmi/hdmi_phy_8996.o \
hdmi/hdmi_phy_8x60.o \
hdmi/hdmi_phy_8x74.o \
+   hdmi/hdmi_pll_8960.o \
edp/edp.o \
edp/edp_aux.o \
edp/edp_bridge.o \
@@ -37,6 +39,7 @@ msm-y := \
disp/mdp4/mdp4_dtv_encoder.o \
disp/mdp4/mdp4_lcdc_encoder.o \
disp/mdp4/mdp4_lvds_connector.o \
+   disp/mdp4/mdp4_lvds_pll.o \
disp/mdp4/mdp4_irq.o \
disp/mdp4/mdp4_kms.o \
disp/mdp4/mdp4_plane.o \
@@ -117,9 +120,6 @@ msm-$(CONFIG_DRM_MSM_DP)+= dp/dp_aux.o \
dp/dp_audio.o
  
  msm-$(CONFIG_DRM_FBDEV_EMULATION) += msm_fbdev.o

-msm-$(CONFIG_COMMON_CLK) += disp/mdp4/mdp4_lvds_pll.o
-msm-$(CONFIG_COMMON_CLK) += hdmi/hdmi_pll_8960.o
-msm-$(CONFIG_COMMON_CLK) += hdmi/hdmi_phy_8996.o
  
  msm-$(CONFIG_DRM_MSM_HDMI_HDCP) += hdmi/hdmi_hdcp.o
  




--
With best wishes
Dmitry


Re: [PATCH v12 05/35] dt-bindings: clock: tegra-car: Document new clock sub-nodes

2021-10-14 Thread Stephen Boyd
Quoting Dmitry Osipenko (2021-09-20 11:11:15)
> diff --git a/Documentation/devicetree/bindings/clock/nvidia,tegra20-car.yaml 
> b/Documentation/devicetree/bindings/clock/nvidia,tegra20-car.yaml
> index 459d2a525393..f832abb7f11a 100644
> --- a/Documentation/devicetree/bindings/clock/nvidia,tegra20-car.yaml
> +++ b/Documentation/devicetree/bindings/clock/nvidia,tegra20-car.yaml
> @@ -42,6 +42,36 @@ properties:
>"#reset-cells":
>  const: 1
>  
> +patternProperties:
> +  "^(sclk)|(pll-[cem])$":
> +type: object
> +properties:
> +  compatible:
> +enum:
> +  - nvidia,tegra20-sclk
> +  - nvidia,tegra30-sclk
> +  - nvidia,tegra30-pllc
> +  - nvidia,tegra30-plle
> +  - nvidia,tegra30-pllm
> +
> +  operating-points-v2: true
> +
> +  clocks:
> +items:
> +  - description: node's clock
> +
> +  power-domains:
> +maxItems: 1
> +description: phandle to the core SoC power domain

Is this done to associate the power domain with a particular clk? And an
OPP table with a particular clk?

> +
> +required:
> +  - compatible
> +  - operating-points-v2
> +  - clocks
> +  - power-domains
> +
> +additionalProperties: false
> +
>  required:
>- compatible
>- reg


[PATCH 1/2] drm/msm/hdmi: use bulk regulator API

2021-10-14 Thread Dmitry Baryshkov
Switch to using bulk regulator API instead of hand coding loops.

Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/hdmi/hdmi.c   | 34 +++
 drivers/gpu/drm/msm/hdmi/hdmi.h   |  6 ++--
 drivers/gpu/drm/msm/hdmi/hdmi_bridge.c| 20 -
 drivers/gpu/drm/msm/hdmi/hdmi_connector.c | 24 ++--
 drivers/gpu/drm/msm/hdmi/hdmi_phy.c   | 33 --
 5 files changed, 40 insertions(+), 77 deletions(-)

diff --git a/drivers/gpu/drm/msm/hdmi/hdmi.c b/drivers/gpu/drm/msm/hdmi/hdmi.c
index 737453b6e596..db17a000d968 100644
--- a/drivers/gpu/drm/msm/hdmi/hdmi.c
+++ b/drivers/gpu/drm/msm/hdmi/hdmi.c
@@ -154,19 +154,13 @@ static struct hdmi *msm_hdmi_init(struct platform_device 
*pdev)
ret = -ENOMEM;
goto fail;
}
-   for (i = 0; i < config->hpd_reg_cnt; i++) {
-   struct regulator *reg;
-
-   reg = devm_regulator_get(>dev,
-   config->hpd_reg_names[i]);
-   if (IS_ERR(reg)) {
-   ret = PTR_ERR(reg);
-   DRM_DEV_ERROR(>dev, "failed to get hpd regulator: 
%s (%d)\n",
-   config->hpd_reg_names[i], ret);
-   goto fail;
-   }
+   for (i = 0; i < config->hpd_reg_cnt; i++)
+   hdmi->hpd_regs[i].supply = config->hpd_reg_names[i];
 
-   hdmi->hpd_regs[i] = reg;
+   ret = devm_regulator_bulk_get(>dev, config->hpd_reg_cnt, 
hdmi->hpd_regs);
+   if (ret) {
+   DRM_DEV_ERROR(>dev, "failed to get hpd regulator: %d\n", 
ret);
+   goto fail;
}
 
hdmi->pwr_regs = devm_kcalloc(>dev,
@@ -177,19 +171,11 @@ static struct hdmi *msm_hdmi_init(struct platform_device 
*pdev)
ret = -ENOMEM;
goto fail;
}
-   for (i = 0; i < config->pwr_reg_cnt; i++) {
-   struct regulator *reg;
 
-   reg = devm_regulator_get(>dev,
-   config->pwr_reg_names[i]);
-   if (IS_ERR(reg)) {
-   ret = PTR_ERR(reg);
-   DRM_DEV_ERROR(>dev, "failed to get pwr regulator: 
%s (%d)\n",
-   config->pwr_reg_names[i], ret);
-   goto fail;
-   }
-
-   hdmi->pwr_regs[i] = reg;
+   ret = devm_regulator_bulk_get(>dev, config->pwr_reg_cnt, 
hdmi->pwr_regs);
+   if (ret) {
+   DRM_DEV_ERROR(>dev, "failed to get pwr regulator: %d\n", 
ret);
+   goto fail;
}
 
hdmi->hpd_clks = devm_kcalloc(>dev,
diff --git a/drivers/gpu/drm/msm/hdmi/hdmi.h b/drivers/gpu/drm/msm/hdmi/hdmi.h
index d0b84f0abee1..82261078c6b1 100644
--- a/drivers/gpu/drm/msm/hdmi/hdmi.h
+++ b/drivers/gpu/drm/msm/hdmi/hdmi.h
@@ -56,8 +56,8 @@ struct hdmi {
void __iomem *qfprom_mmio;
phys_addr_t mmio_phy_addr;
 
-   struct regulator **hpd_regs;
-   struct regulator **pwr_regs;
+   struct regulator_bulk_data *hpd_regs;
+   struct regulator_bulk_data *pwr_regs;
struct clk **hpd_clks;
struct clk **pwr_clks;
 
@@ -163,7 +163,7 @@ struct hdmi_phy {
void __iomem *mmio;
struct hdmi_phy_cfg *cfg;
const struct hdmi_phy_funcs *funcs;
-   struct regulator **regs;
+   struct regulator_bulk_data *regs;
struct clk **clks;
 };
 
diff --git a/drivers/gpu/drm/msm/hdmi/hdmi_bridge.c 
b/drivers/gpu/drm/msm/hdmi/hdmi_bridge.c
index 6e380db9287b..f04eb4a70f0d 100644
--- a/drivers/gpu/drm/msm/hdmi/hdmi_bridge.c
+++ b/drivers/gpu/drm/msm/hdmi/hdmi_bridge.c
@@ -28,13 +28,9 @@ static void msm_hdmi_power_on(struct drm_bridge *bridge)
 
pm_runtime_get_sync(>pdev->dev);
 
-   for (i = 0; i < config->pwr_reg_cnt; i++) {
-   ret = regulator_enable(hdmi->pwr_regs[i]);
-   if (ret) {
-   DRM_DEV_ERROR(dev->dev, "failed to enable pwr 
regulator: %s (%d)\n",
-   config->pwr_reg_names[i], ret);
-   }
-   }
+   ret = regulator_bulk_enable(config->pwr_reg_cnt, hdmi->pwr_regs);
+   if (ret)
+   DRM_DEV_ERROR(dev->dev, "failed to enable pwr regulator: %d\n", 
ret);
 
if (config->pwr_clk_cnt > 0) {
DBG("pixclock: %lu", hdmi->pixclock);
@@ -70,13 +66,9 @@ static void power_off(struct drm_bridge *bridge)
for (i = 0; i < config->pwr_clk_cnt; i++)
clk_disable_unprepare(hdmi->pwr_clks[i]);
 
-   for (i = 0; i < config->pwr_reg_cnt; i++) {
-   ret = regulator_disable(hdmi->pwr_regs[i]);
-   if (ret) {
-   DRM_DEV_ERROR(dev->dev, "failed to disable pwr 
regulator: %s (%d)\n",
-   config->pwr_reg_names[i], ret);
-   }
-   }
+   ret = 

[PATCH 2/2] drm/msm/hdmi: switch to drm_bridge_connector

2021-10-14 Thread Dmitry Baryshkov
Merge old hdmi_bridge and hdmi_connector implementations. Use
drm_bridge_connector instead.

Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/Makefile  |   2 +-
 drivers/gpu/drm/msm/hdmi/hdmi.c   |  12 +-
 drivers/gpu/drm/msm/hdmi/hdmi.h   |  19 ++-
 drivers/gpu/drm/msm/hdmi/hdmi_bridge.c|  81 -
 .../msm/hdmi/{hdmi_connector.c => hdmi_hpd.c} | 154 ++
 5 files changed, 109 insertions(+), 159 deletions(-)
 rename drivers/gpu/drm/msm/hdmi/{hdmi_connector.c => hdmi_hpd.c} (62%)

diff --git a/drivers/gpu/drm/msm/Makefile b/drivers/gpu/drm/msm/Makefile
index 904535eda0c4..91b09cda8a9c 100644
--- a/drivers/gpu/drm/msm/Makefile
+++ b/drivers/gpu/drm/msm/Makefile
@@ -19,7 +19,7 @@ msm-y := \
hdmi/hdmi.o \
hdmi/hdmi_audio.o \
hdmi/hdmi_bridge.o \
-   hdmi/hdmi_connector.o \
+   hdmi/hdmi_hpd.o \
hdmi/hdmi_i2c.o \
hdmi/hdmi_phy.o \
hdmi/hdmi_phy_8960.o \
diff --git a/drivers/gpu/drm/msm/hdmi/hdmi.c b/drivers/gpu/drm/msm/hdmi/hdmi.c
index db17a000d968..d1cf4df7188c 100644
--- a/drivers/gpu/drm/msm/hdmi/hdmi.c
+++ b/drivers/gpu/drm/msm/hdmi/hdmi.c
@@ -8,6 +8,8 @@
 #include 
 #include 
 
+#include 
+
 #include 
 #include "hdmi.h"
 
@@ -41,7 +43,7 @@ static irqreturn_t msm_hdmi_irq(int irq, void *dev_id)
struct hdmi *hdmi = dev_id;
 
/* Process HPD: */
-   msm_hdmi_connector_irq(hdmi->connector);
+   msm_hdmi_hpd_irq(hdmi->bridge);
 
/* Process DDC: */
msm_hdmi_i2c_irq(hdmi->i2c);
@@ -283,7 +285,7 @@ int msm_hdmi_modeset_init(struct hdmi *hdmi,
goto fail;
}
 
-   hdmi->connector = msm_hdmi_connector_init(hdmi);
+   hdmi->connector = drm_bridge_connector_init(hdmi->dev, encoder);
if (IS_ERR(hdmi->connector)) {
ret = PTR_ERR(hdmi->connector);
DRM_DEV_ERROR(dev->dev, "failed to create HDMI connector: 
%d\n", ret);
@@ -291,6 +293,8 @@ int msm_hdmi_modeset_init(struct hdmi *hdmi,
goto fail;
}
 
+   drm_connector_attach_encoder(hdmi->connector, hdmi->encoder);
+
hdmi->irq = irq_of_parse_and_map(pdev->dev.of_node, 0);
if (hdmi->irq < 0) {
ret = hdmi->irq;
@@ -307,7 +311,9 @@ int msm_hdmi_modeset_init(struct hdmi *hdmi,
goto fail;
}
 
-   ret = msm_hdmi_hpd_enable(hdmi->connector);
+   drm_bridge_connector_enable_hpd(hdmi->connector);
+
+   ret = msm_hdmi_hpd_enable(hdmi->bridge);
if (ret < 0) {
DRM_DEV_ERROR(>pdev->dev, "failed to enable HPD: %d\n", 
ret);
goto fail;
diff --git a/drivers/gpu/drm/msm/hdmi/hdmi.h b/drivers/gpu/drm/msm/hdmi/hdmi.h
index 82261078c6b1..736f348befb3 100644
--- a/drivers/gpu/drm/msm/hdmi/hdmi.h
+++ b/drivers/gpu/drm/msm/hdmi/hdmi.h
@@ -114,6 +114,13 @@ struct hdmi_platform_config {
struct hdmi_gpio_data gpios[HDMI_MAX_NUM_GPIO];
 };
 
+struct hdmi_bridge {
+   struct drm_bridge base;
+   struct hdmi *hdmi;
+   struct work_struct hpd_work;
+};
+#define to_hdmi_bridge(x) container_of(x, struct hdmi_bridge, base)
+
 void msm_hdmi_set_mode(struct hdmi *hdmi, bool power_on);
 
 static inline void hdmi_write(struct hdmi *hdmi, u32 reg, u32 data)
@@ -230,13 +237,11 @@ void msm_hdmi_audio_set_sample_rate(struct hdmi *hdmi, 
int rate);
 struct drm_bridge *msm_hdmi_bridge_init(struct hdmi *hdmi);
 void msm_hdmi_bridge_destroy(struct drm_bridge *bridge);
 
-/*
- * hdmi connector:
- */
-
-void msm_hdmi_connector_irq(struct drm_connector *connector);
-struct drm_connector *msm_hdmi_connector_init(struct hdmi *hdmi);
-int msm_hdmi_hpd_enable(struct drm_connector *connector);
+void msm_hdmi_hpd_irq(struct drm_bridge *bridge);
+enum drm_connector_status msm_hdmi_bridge_detect(
+   struct drm_bridge *bridge);
+int msm_hdmi_hpd_enable(struct drm_bridge *bridge);
+void msm_hdmi_hpd_disable(struct hdmi_bridge *hdmi_bridge);
 
 /*
  * i2c adapter for ddc:
diff --git a/drivers/gpu/drm/msm/hdmi/hdmi_bridge.c 
b/drivers/gpu/drm/msm/hdmi/hdmi_bridge.c
index f04eb4a70f0d..211b73dddf65 100644
--- a/drivers/gpu/drm/msm/hdmi/hdmi_bridge.c
+++ b/drivers/gpu/drm/msm/hdmi/hdmi_bridge.c
@@ -5,17 +5,16 @@
  */
 
 #include 
+#include 
 
+#include "msm_kms.h"
 #include "hdmi.h"
 
-struct hdmi_bridge {
-   struct drm_bridge base;
-   struct hdmi *hdmi;
-};
-#define to_hdmi_bridge(x) container_of(x, struct hdmi_bridge, base)
-
 void msm_hdmi_bridge_destroy(struct drm_bridge *bridge)
 {
+   struct hdmi_bridge *hdmi_bridge = to_hdmi_bridge(bridge);
+
+   msm_hdmi_hpd_disable(hdmi_bridge);
 }
 
 static void msm_hdmi_power_on(struct drm_bridge *bridge)
@@ -251,14 +250,76 @@ static void msm_hdmi_bridge_mode_set(struct drm_bridge 
*bridge,
msm_hdmi_audio_update(hdmi);
 }
 
+static struct edid *msm_hdmi_bridge_get_edid(struct drm_bridge *bridge,
+   struct drm_connector *connector)
+{
+   

Re: [PATCH] drm/i915/gt: make a gt sysfs group and move power management files

2021-10-14 Thread Sundaresan, Sujaritha



On 10/13/2021 5:08 PM, Andi Shyti wrote:

From: Andi Shyti 

The GT has its own properties and in sysfs they should be grouped
in the 'gt/' directory.

Create a 'gt/' directory in sysfs which will contain gt0...gtN
directories related to each tile configured in the GPU. Move the
power management files inside those directories.

The previous power management files are kept in their original
root directory to avoid breaking the ABI. They point to the tile
'0' and a warning message is printed whenever accessed to. The
deprecated interface needs for the CONFIG_SYSFS_DEPRECATED_V2
flag in order to be generated.

The new sysfs structure will have a similar layout for the 4 tile
case:

/sys/.../card0
  ├── gt
  │   ├── gt0
  │   │   ├── id
  │   │   ├── rc6_enable
  │   │   ├── rc6_residency_ms
  │   │   ├── rps_act_freq_mhz
  │   │   ├── rps_boost_freq_mhz
  │   │   ├── rps_cur_freq_mhz
  │   │   ├── rps_max_freq_mhz
  │   │   ├── rps_min_freq_mhz
  │   │   ├── rps_RP0_freq_mhz
  │   │   ├── rps_RP1_freq_mhz
  │   │   └── rps_RPn_freq_mhz
 .   .
 .   .
 .   .
  │   └── gt3
  │   ├── id
  │   ├── rc6_enable
  │   ├── rc6_residency_ms
  │   ├── rps_act_freq_mhz
  │   ├── rps_boost_freq_mhz
  │   ├── rps_cur_freq_mhz
  │   ├── rps_max_freq_mhz
  │   ├── rps_min_freq_mhz
  │   ├── rps_RP0_freq_mhz
  │   ├── rps_RP1_freq_mhz
  │   └── rps_RPn_freq_mhz
  ├── gt_act_freq_mhz   -+
  ├── gt_boost_freq_mhz  |
  ├── gt_cur_freq_mhz|Original interface
  ├── gt_max_freq_mhz+─-> kept as existing ABI;
  ├── gt_min_freq_mhz|it points to gt0/
  ├── gt_RP0_freq_mhz|
  └── gt_RP1_freq_mhz|
  └── gt_RPn_freq_mhz   -+

Signed-off-by: Andi Shyti 
Signed-off-by: Lucas De Marchi 
Cc: Matt Roper 
Cc: Sujaritha Sundaresan 
Cc: Tvrtko Ursulin 
---
Hi,

this patch needs to go on top of Matt Roper's multitile series:

https://patchwork.freedesktop.org/series/95631/

because it requires multitile support.

Matt if you want and it's not much hassle for you, you can take
this patch along with your series.

Thanks,
Andi

  drivers/gpu/drm/i915/Makefile |   4 +-
  drivers/gpu/drm/i915/gt/intel_gt.c|   3 +
  drivers/gpu/drm/i915/gt/sysfs_gt.c| 130 +
  drivers/gpu/drm/i915/gt/sysfs_gt.h|  45 +++
  drivers/gpu/drm/i915/gt/sysfs_gt_pm.c | 394 ++
  drivers/gpu/drm/i915/gt/sysfs_gt_pm.h |  16 ++
  drivers/gpu/drm/i915/i915_drv.h   |   2 +
  drivers/gpu/drm/i915/i915_reg.h   |   1 +
  drivers/gpu/drm/i915/i915_sysfs.c | 328 +
  drivers/gpu/drm/i915/i915_sysfs.h |   3 +
  10 files changed, 607 insertions(+), 319 deletions(-)
  create mode 100644 drivers/gpu/drm/i915/gt/sysfs_gt.c
  create mode 100644 drivers/gpu/drm/i915/gt/sysfs_gt.h
  create mode 100644 drivers/gpu/drm/i915/gt/sysfs_gt_pm.c
  create mode 100644 drivers/gpu/drm/i915/gt/sysfs_gt_pm.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 21b05ed0e4e8c..f39e00a0d584f 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -120,7 +120,9 @@ gt-y += \
gt/intel_timeline.o \
gt/intel_workarounds.o \
gt/shmem_utils.o \
-   gt/sysfs_engines.o
+   gt/sysfs_engines.o \
+   gt/sysfs_gt.o \
+   gt/sysfs_gt_pm.o
  # autogenerated null render state
  gt-y += \
gt/gen6_renderstate.o \
diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c 
b/drivers/gpu/drm/i915/gt/intel_gt.c
index 0879e30ace7cc..748a21ab717d2 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -21,6 +21,7 @@
  #include "intel_rps.h"
  #include "intel_uncore.h"
  #include "shmem_utils.h"
+#include "sysfs_gt.h"
  #include "pxp/intel_pxp.h"
  
  static void

@@ -448,6 +449,7 @@ void intel_gt_driver_register(struct intel_gt *gt)
intel_rps_driver_register(>rps);
  
  	intel_gt_debugfs_register(gt);

+   intel_gt_sysfs_register(gt);
  }
  
  static int intel_gt_init_scratch(struct intel_gt *gt, unsigned int size)

@@ -764,6 +766,7 @@ void intel_gt_driver_unregister(struct intel_gt *gt)
  {
intel_wakeref_t wakeref;
  
+	intel_gt_sysfs_unregister(gt);

intel_rps_driver_unregister(>rps);
  
  	intel_pxp_fini(>pxp);

diff --git a/drivers/gpu/drm/i915/gt/sysfs_gt.c 
b/drivers/gpu/drm/i915/gt/sysfs_gt.c
new file mode 100644
index 0..0d1398b2d61ce
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/sysfs_gt.c
@@ -0,0 +1,130 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2020 Intel Corporation
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "i915_drv.h"
+#include "i915_sysfs.h"
+#include 

Re: [PATCH 2/2] amd/amdgpu_dm: Verify Gamma and Degamma LUT sizes using DRM Core check

2021-10-14 Thread Sean Paul
On Wed, Oct 13, 2021 at 2:12 PM Mark Yacoub  wrote:
>
> From: Mark Yacoub 
>
> [Why]
> drm_atomic_helper_check_crtc now verifies both legacy and non-legacy LUT
> sizes. There is no need to check it within amdgpu_dm_atomic_check.
>
> [How]
> Remove the local call to verify LUT sizes and use DRM Core function
> instead.
>
> Tested on ChromeOS Zork.
>
> v1:
> Remove amdgpu_dm_verify_lut_sizes everywhere.
>

Reviewed-by: Sean Paul 

> Signed-off-by: Mark Yacoub 
> ---
>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  8 ++---
>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h |  1 -
>  .../amd/display/amdgpu_dm/amdgpu_dm_color.c   | 35 ---
>  3 files changed, 4 insertions(+), 40 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index f74663b6b046e..47f8de1cfc3a5 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -10244,6 +10244,10 @@ static int amdgpu_dm_atomic_check(struct drm_device 
> *dev,
> }
> }
>  #endif
> +   ret = drm_atomic_helper_check_crtcs(state);
> +   if (ret)
> +   return ret;
> +
> for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state, 
> new_crtc_state, i) {
> dm_old_crtc_state = to_dm_crtc_state(old_crtc_state);
>
> @@ -10253,10 +10257,6 @@ static int amdgpu_dm_atomic_check(struct drm_device 
> *dev,
> dm_old_crtc_state->dsc_force_changed == false)
> continue;
>
> -   ret = amdgpu_dm_verify_lut_sizes(new_crtc_state);
> -   if (ret)
> -   goto fail;
> -
> if (!new_crtc_state->enable)
> continue;
>
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
> index fcb9c4a629c32..22730e5542092 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
> @@ -617,7 +617,6 @@ void amdgpu_dm_trigger_timing_sync(struct drm_device 
> *dev);
>  #define MAX_COLOR_LEGACY_LUT_ENTRIES 256
>
>  void amdgpu_dm_init_color_mod(void);
> -int amdgpu_dm_verify_lut_sizes(const struct drm_crtc_state *crtc_state);
>  int amdgpu_dm_update_crtc_color_mgmt(struct dm_crtc_state *crtc);
>  int amdgpu_dm_update_plane_color_mgmt(struct dm_crtc_state *crtc,
>   struct dc_plane_state *dc_plane_state);
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
> index a022e5bb30a5c..319f8a8a89835 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
> @@ -284,37 +284,6 @@ static int __set_input_tf(struct dc_transfer_func *func,
> return res ? 0 : -ENOMEM;
>  }
>
> -/**
> - * Verifies that the Degamma and Gamma LUTs attached to the |crtc_state| are 
> of
> - * the expected size.
> - * Returns 0 on success.
> - */
> -int amdgpu_dm_verify_lut_sizes(const struct drm_crtc_state *crtc_state)
> -{
> -   const struct drm_color_lut *lut = NULL;
> -   uint32_t size = 0;
> -
> -   lut = __extract_blob_lut(crtc_state->degamma_lut, );
> -   if (lut && size != MAX_COLOR_LUT_ENTRIES) {
> -   DRM_DEBUG_DRIVER(
> -   "Invalid Degamma LUT size. Should be %u but got 
> %u.\n",
> -   MAX_COLOR_LUT_ENTRIES, size);
> -   return -EINVAL;
> -   }
> -
> -   lut = __extract_blob_lut(crtc_state->gamma_lut, );
> -   if (lut && size != MAX_COLOR_LUT_ENTRIES &&
> -   size != MAX_COLOR_LEGACY_LUT_ENTRIES) {
> -   DRM_DEBUG_DRIVER(
> -   "Invalid Gamma LUT size. Should be %u (or %u for 
> legacy) but got %u.\n",
> -   MAX_COLOR_LUT_ENTRIES, MAX_COLOR_LEGACY_LUT_ENTRIES,
> -   size);
> -   return -EINVAL;
> -   }
> -
> -   return 0;
> -}
> -
>  /**
>   * amdgpu_dm_update_crtc_color_mgmt: Maps DRM color management to DC stream.
>   * @crtc: amdgpu_dm crtc state
> @@ -348,10 +317,6 @@ int amdgpu_dm_update_crtc_color_mgmt(struct 
> dm_crtc_state *crtc)
> bool is_legacy;
> int r;
>
> -   r = amdgpu_dm_verify_lut_sizes(>base);
> -   if (r)
> -   return r;
> -
> degamma_lut = __extract_blob_lut(crtc->base.degamma_lut, 
> _size);
> regamma_lut = __extract_blob_lut(crtc->base.gamma_lut, _size);
>
> --
> 2.33.0.882.g93a45727a2-goog
>


[Bug 214725] New: simpledrm and i915 both active after boot

2021-10-14 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=214725

Bug ID: 214725
   Summary: simpledrm and i915 both active after boot
   Product: Drivers
   Version: 2.5
Kernel Version: 5.14.11
  Hardware: All
OS: Linux
  Tree: Mainline
Status: NEW
  Severity: normal
  Priority: P1
 Component: Video(DRI - non Intel)
  Assignee: drivers_video-...@kernel-bugs.osdl.org
  Reporter: dennis.lis...@gmail.com
Regression: No

Gentoo Linux, custom kernel build, 5.14.11 with
CONFIG_DRM_SIMPLEDRM=y
CONFIG_DRM_I915=y

Plasma (wayland) sees two monitors: the laptop screen and an unknown one. The
only monitor physically present is the laptop screen. This did not happen a few
months ago (possibly started with 5.14 upgrade).

$ ls /sys/class/drm/
card0  card0-eDP-1  card1  card1-Unknown-1  card2  card2-DP-1  card2-DP-2 
card2-DP-3  renderD128  renderD129  version
$ readlink /sys/class/drm/card0 # Intel GPU
../../devices/pci:00/:00:02.0/drm/card0
$ readlink /sys/class/drm/card1 # Simple DRM
../../devices/platform/simple-framebuffer.0/drm/card1
$ readlink /sys/class/drm/card2 # Nouveau (not used at the moment)
../../devices/pci:00/:00:01.0/:01:00.0/drm/card2

$ dmesg | egrep '(simple|i915)'
[0.00] Command line: BOOT_IMAGE=/kernel-5.14.11 root=[...] ro
acpi_backlight=video resume=[...] i915.enable_gvt=1 quiet
[0.060342] Kernel command line: BOOT_IMAGE=/kernel-5.14.11 root=[...] ro
acpi_backlight=video resume=[...] i915.enable_gvt=1 quiet
[1.076957] i915 :00:02.0: [drm] VT-d active for gfx access
[1.076961] i915 :00:02.0: vgaarb: deactivate vga console
[1.387359] i915 :00:02.0: Direct firmware load for
i915/gvt/vid_0x8086_did_0x191b_rid_0x06.golden_hw_state failed with error -2
[1.404418] i915 :00:02.0: vgaarb: changed VGA decodes:
olddecodes=io+mem,decodes=none:owns=io+mem
[1.404525] i915 :00:02.0: [drm] Disabling framebuffer compression (FBC)
to prevent screen flicker with VT-d enabled
[1.404931] i915 :00:02.0: [drm] Finished loading DMC firmware
i915/skl_dmc_ver1_27.bin (v1.27)
[1.422128] [drm] Initialized i915 1.6.0 20201103 for :00:02.0 on minor
0
[1.425127] [drm] Initialized simpledrm 1.0.0 20200625 for
simple-framebuffer.0 on minor 1
[1.437859] simple-framebuffer simple-framebuffer.0: [drm] fb0: simpledrm
frame buffer device
[1.438058] fbcon: i915 (fb1) is primary device
[2.579133] i915 :00:02.0: [drm] fb1: i915 frame buffer device
[   22.162612] snd_hda_intel :00:1f.3: bound :00:02.0 (ops
i915_audio_component_bind_ops)
[   52.106084] simple-framebuffer simple-framebuffer.0: swiotlb buffer is full
(sz: 8388608 bytes), total 32768 (slots), used 0 (slots)

Looks like the hand-over mechanism does not correctly handle this combination,
possibly due to i915 loading before simpledrm.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

Re: [PATCH v2 10/11] drm/msm/dsi: Add support for DSC configuration

2021-10-14 Thread Dmitry Baryshkov

On 07/10/2021 10:08, Vinod Koul wrote:

When DSC is enabled, we need to configure DSI registers accordingly and
configure the respective stream compression registers.

Add support to calculate the register setting based on DSC params and
timing information and configure these registers.

Signed-off-by: Vinod Koul 
---
  drivers/gpu/drm/msm/dsi/dsi.xml.h  |  10 +++
  drivers/gpu/drm/msm/dsi/dsi_host.c | 123 -
  2 files changed, 132 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/dsi/dsi.xml.h 
b/drivers/gpu/drm/msm/dsi/dsi.xml.h
index 49b551ad1bff..c1c85df58c4b 100644
--- a/drivers/gpu/drm/msm/dsi/dsi.xml.h
+++ b/drivers/gpu/drm/msm/dsi/dsi.xml.h
@@ -706,4 +706,14 @@ static inline uint32_t DSI_VERSION_MAJOR(uint32_t val)
  #define REG_DSI_CPHY_MODE_CTRL
0x02d4
  
  
+#define REG_DSI_VIDEO_COMPRESSION_MODE_CTRL			0x029c

+
+#define REG_DSI_VIDEO_COMPRESSION_MODE_CTRL2   0x02a0
+
+#define REG_DSI_COMMAND_COMPRESSION_MODE_CTRL  0x02a4
+
+#define REG_DSI_COMMAND_COMPRESSION_MODE_CTRL2 0x02a8
+
+#define REG_DSI_COMMAND_COMPRESSION_MODE_CTRL3 0x02ac
+
  #endif /* DSI_XML */
diff --git a/drivers/gpu/drm/msm/dsi/dsi_host.c 
b/drivers/gpu/drm/msm/dsi/dsi_host.c
index ba24458c2e38..86e36a3e97b6 100644
--- a/drivers/gpu/drm/msm/dsi/dsi_host.c
+++ b/drivers/gpu/drm/msm/dsi/dsi_host.c
@@ -946,6 +946,26 @@ static void dsi_ctrl_config(struct msm_dsi_host *msm_host, 
bool enable,
dsi_write(msm_host, REG_DSI_CPHY_MODE_CTRL, BIT(0));
  }
  
+static int dsi_dsc_update_pic_dim(struct msm_display_dsc_config *dsc,

+ int pic_width, int pic_height)
+{
+   if (!dsc || !pic_width || !pic_height) {
+   pr_err("DSI: invalid input: pic_width: %d pic_height: %d\n", 
pic_width, pic_height);
+   return -EINVAL;
+   }
+
+   if ((pic_width % dsc->drm->slice_width) || (pic_height % 
dsc->drm->slice_height)) {
+   pr_err("DSI: pic_dim %dx%d has to be multiple of slice %dx%d\n",
+  pic_width, pic_height, dsc->drm->slice_width, 
dsc->drm->slice_height);
+   return -EINVAL;
+   }


This should go to the mode_valid() callback for the dsi_bridge.


+
+   dsc->drm->pic_width = pic_width;
+   dsc->drm->pic_height = pic_height;
+
+   return 0;
+}
+
  static void dsi_timing_setup(struct msm_dsi_host *msm_host, bool 
is_bonded_dsi)
  {
struct drm_display_mode *mode = msm_host->mode;
@@ -978,7 +998,72 @@ static void dsi_timing_setup(struct msm_dsi_host 
*msm_host, bool is_bonded_dsi)
hdisplay /= 2;
}
  
+	if (msm_host->dsc) {

+   struct msm_display_dsc_config *dsc = msm_host->dsc;
+
+   /* update dsc params with timing params */
+   dsi_dsc_update_pic_dim(dsc, mode->hdisplay, mode->vdisplay);
+   DBG("Mode Width- %d x Height %d\n", dsc->drm->pic_width, 
dsc->drm->pic_height);
+
+   /* we do the calculations for dsc parameters here so that
+* panel can use these parameters
+*/
+   dsi_populate_dsc_params(dsc);
+
+   /* Divide the display by 3 but keep back/font porch and
+* pulse width same
+*/
+   h_total -= hdisplay;
+   hdisplay /= 3;
+   h_total += hdisplay;
+   ha_end = ha_start + hdisplay;
+   }
+
if (msm_host->mode_flags & MIPI_DSI_MODE_VIDEO) {
+   if (msm_host->dsc) {
+   struct msm_display_dsc_config *dsc = msm_host->dsc;
+   u32 reg, intf_width, slice_per_intf;
+   u32 total_bytes_per_intf;
+
+   /* first calculate dsc parameters and then program
+* compress mode registers
+*/
+   intf_width = hdisplay;
+   slice_per_intf = DIV_ROUND_UP(intf_width, 
dsc->drm->slice_width);
+
+   /* If slice_count > slice_per_intf, then use 1
+* This can happen during partial update
+*/
+   dsc->drm->slice_count = 1;


Is the if() missing here? The indentpation and the comment seems unclear 
about that.



+
+   dsc->bytes_in_slice = 
DIV_ROUND_UP(dsc->drm->slice_width * 8, 8);
+   total_bytes_per_intf = dsc->bytes_in_slice * 
slice_per_intf;
+
+   dsc->eol_byte_num = total_bytes_per_intf % 3;
+   dsc->pclk_per_line =  
DIV_ROUND_UP(total_bytes_per_intf, 3);
+   dsc->bytes_per_pkt = dsc->bytes_in_slice * 
dsc->drm->slice_count;
+   dsc->pkt_per_line = slice_per_intf / 
dsc->drm->slice_count;
+
+   reg = 

Re: [PATCH v1 2/2] mm: remove extra ZONE_DEVICE struct page refcount

2021-10-14 Thread Jason Gunthorpe
On Thu, Oct 14, 2021 at 12:01:14PM -0700, Dan Williams wrote:
> > > Does anyone know why devmap is pte_special anyhow?
> 
> It does not need to be special as mentioned here:
> 
> https://lore.kernel.org/all/CAPcyv4iFeVDVPn6uc=aksyuvkiu3-fk-n16ijvzq3n8ot00...@mail.gmail.com/

I added a remark there

Not special means more to me, it means devmap should do the refcounts
properly like normal memory pages.

It means vm_normal_page should return !NULL and it means insert_page,
not insert_pfn should be used to install them in the PTE. VMAs should
not be MIXED MAP, but normal struct page maps.

I think this change alone would fix all the refcount problems
everwhere in DAX and devmap.

> The refcount dependencies also go away after this...
> 
> https://lore.kernel.org/all/161604050866.1463742.7759521510383551055.st...@dwillia2-desk3.amr.corp.intel.com/
>
> ...but you can see that patches 1 and 2 in that series depend on being
> able to guarantee that all mappings are invalidated when the undelying
> device that owns the pgmap goes away.

If I have put everything together right this is because of what I
pointed to here. FS-DAX is installing 0 refcount pages into PTEs and
expecting that to work sanely. 

This means the page map cannot be removed until all the PTEs are fully
flushed, which buggily doesn't happen because of the missing unplug.

However, this is all because nobody incrd a refcount to represent the
reference in the PTE and since this ment that 0 refcount pages were
wrongly stuffed into PTEs then devmap used the refcount == 1 hack to
unbreak GUP?

So.. Is there some reason why devmap pages are trying so hard to avoid
sane refcounting???

If the PTE itself holds the refcount (by not being special) then there
is no need for the pagemap stuff in GUP. pagemap already waits for
refs to go to 0 so the missing shootdown during nvdimm unplug will
cause pagemap to block until the address spaces are invalidated. IMHO
this is already better than the current buggy situation of allowing
continued PTE reference to memory that is now removed from the system.

> For that to happen there needs to be communication back to the FS for
> device-gone / failure events. That work is in progress via this
> series:
> 
> https://lore.kernel.org/all/20210924130959.2695749-1-ruansy.f...@fujitsu.com/

This is fine, but I don't think it should block fixing the mm side -
the end result here still cannot be 0 ref count pages installed in
PTEs.

Fixing that does not depend on shootdown during device removal, right?

It requires holding refcounts while pages are installed into address
spaces - and this lack is a direct cause of making the PTEs all
special and using insert_pfn and MIXED_MAP.

Thanks,
Jason


Re: [PATCH] drm/msm/dp: Use the connector passed to dp_debug_get()

2021-10-14 Thread kernel test robot
Hi Bjorn,

I love your patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v5.15-rc5 next-20211013]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Bjorn-Andersson/drm-msm-dp-Use-the-connector-passed-to-dp_debug_get/20211010-110400
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
7fd2bf83d59a2d32e0d596c5d3e623b9a0e7e2d5
config: arm-qcom_defconfig (attached as .config)
compiler: arm-linux-gnueabi-gcc (GCC) 11.2.0
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# 
https://github.com/0day-ci/linux/commit/e591869849ae9c71d7db25e6b0df588a31ca76cf
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review 
Bjorn-Andersson/drm-msm-dp-Use-the-connector-passed-to-dp_debug_get/20211010-110400
git checkout e591869849ae9c71d7db25e6b0df588a31ca76cf
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross 
ARCH=arm 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

   drivers/gpu/drm/msm/dp/dp_display.c: In function 'msm_dp_debugfs_init':
>> drivers/gpu/drm/msm/dp/dp_display.c:1432:65: error: passing argument 5 of 
>> 'dp_debug_get' from incompatible pointer type 
>> [-Werror=incompatible-pointer-types]
1432 | dp->link, 
dp->dp_display.connector,
 |   
~~^~
 | |
 | 
struct drm_connector *
   In file included from drivers/gpu/drm/msm/dp/dp_display.c:28:
   drivers/gpu/drm/msm/dp/dp_debug.h:63:40: note: expected 'struct 
drm_connector **' but argument is of type 'struct drm_connector *'
  63 | struct drm_connector **connector, struct drm_minor 
*minor)
 | ~~~^
   cc1: some warnings being treated as errors


vim +/dp_debug_get +1432 drivers/gpu/drm/msm/dp/dp_display.c

  1421  
  1422  void msm_dp_debugfs_init(struct msm_dp *dp_display, struct drm_minor 
*minor)
  1423  {
  1424  struct dp_display_private *dp;
  1425  struct device *dev;
  1426  int rc;
  1427  
  1428  dp = container_of(dp_display, struct dp_display_private, 
dp_display);
  1429  dev = >pdev->dev;
  1430  
  1431  dp->debug = dp_debug_get(dev, dp->panel, dp->usbpd,
> 1432  dp->link, 
> dp->dp_display.connector,
  1433  minor);
  1434  if (IS_ERR(dp->debug)) {
  1435  rc = PTR_ERR(dp->debug);
  1436  DRM_ERROR("failed to initialize debug, rc = %d\n", rc);
  1437  dp->debug = NULL;
  1438  }
  1439  }
  1440  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


.config.gz
Description: application/gzip


[PATCH] ARM: dts: qcom-apq8064: stop using legacy clock names for HDMI

2021-10-14 Thread Dmitry Baryshkov
Stop using legacy clock names (with _clk suffix) for HDMI and HDMI PHY
device tree nodes.

Signed-off-by: Dmitry Baryshkov 
---
 arch/arm/boot/dts/qcom-apq8064.dtsi | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/arm/boot/dts/qcom-apq8064.dtsi 
b/arch/arm/boot/dts/qcom-apq8064.dtsi
index 87e5194114d5..6a51cf014994 100644
--- a/arch/arm/boot/dts/qcom-apq8064.dtsi
+++ b/arch/arm/boot/dts/qcom-apq8064.dtsi
@@ -1420,9 +1420,9 @@ hdmi: hdmi-tx@4a0 {
clocks = < HDMI_APP_CLK>,
 < HDMI_M_AHB_CLK>,
 < HDMI_S_AHB_CLK>;
-   clock-names = "core_clk",
- "master_iface_clk",
- "slave_iface_clk";
+   clock-names = "core",
+ "master_iface",
+ "slave_iface";
 
phys = <_phy>;
phy-names = "hdmi-phy";
@@ -1453,7 +1453,7 @@ hdmi_phy: hdmi-phy@4a00400 {
"hdmi_pll";
 
clocks = < HDMI_S_AHB_CLK>;
-   clock-names = "slave_iface_clk";
+   clock-names = "slave_iface";
#phy-cells = <0>;
};
 
-- 
2.33.0



Re: Revert "arm64: dts: qcom: sm8250: remove bus clock from the mdss node for sm8250 target"

2021-10-14 Thread Dmitry Baryshkov
On Thu, 14 Oct 2021 at 19:54, Vladimir Zapolskiy
 wrote:
>
> Hi Dmitry,
>
> On 10/14/21 4:54 PM, Dmitry Baryshkov wrote:
> > From: Amit Pundir 
> >
> > This reverts commit 001ce9785c0674d913531345e86222c965fc8bf4.
> >
> > This upstream commit broke AOSP (post Android 12 merge) build
> > on RB5. The device either silently crashes into USB crash mode
> > after android boot animation or we see a blank blue screen
> > with following dpu errors in dmesg:
> >
> > [  T444] hw recovery is not complete for ctl:3
> > [  T444] [drm:dpu_encoder_phys_vid_prepare_for_kickoff:539] [dpu 
> > error]enc31 intf1 ctl 3 reset failure: -22
> > [  T444] [drm:dpu_encoder_phys_vid_wait_for_commit_done:513] [dpu 
> > error]vblank timeout
> > [  T444] [drm:dpu_kms_wait_for_commit_done:454] [dpu error]wait for commit 
> > done returned -110
> > [C7] [drm:dpu_encoder_frame_done_timeout:2127] [dpu error]enc31 frame 
> > done timeout
> > [  T444] [drm:dpu_encoder_phys_vid_wait_for_commit_done:513] [dpu 
> > error]vblank timeout
> > [  T444] [drm:dpu_kms_wait_for_commit_done:454] [dpu error]wait for commit 
> > done returned -110
> >
> > Signed-off-by: Amit Pundir 
>
> your sob tag is missing.

True. I hope this is fine:

Signed-off-by: Dmitry Baryshkov 

>
> > ---
> >   arch/arm64/boot/dts/qcom/sm8250.dtsi | 3 ++-
> >   1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/arm64/boot/dts/qcom/sm8250.dtsi 
> > b/arch/arm64/boot/dts/qcom/sm8250.dtsi
> > index 8c15d9fed08f..d12e4cbfc852 100644
> > --- a/arch/arm64/boot/dts/qcom/sm8250.dtsi
> > +++ b/arch/arm64/boot/dts/qcom/sm8250.dtsi
> > @@ -2590,9 +2590,10 @@
> >   power-domains = < MDSS_GDSC>;
> >
> >   clocks = < DISP_CC_MDSS_AHB_CLK>,
> > +  < GCC_DISP_HF_AXI_CLK>,
> >< GCC_DISP_SF_AXI_CLK>,
> >< DISP_CC_MDSS_MDP_CLK>;
> > - clock-names = "iface", "nrt_bus", "core";
> > + clock-names = "iface", "bus", "nrt_bus", "core";
> >
> >   assigned-clocks = < DISP_CC_MDSS_MDP_CLK>;
> >   assigned-clock-rates = <46000>;
> >
>
> --
> Best wishes,
> Vladimir



-- 
With best wishes
Dmitry


[PATCH] drm/i915: Avoid bitwise vs logical OR warning in snb_wm_latency_quirk()

2021-10-14 Thread Nathan Chancellor
A new warning in clang points out a place in this file where a bitwise
OR is being used with boolean types:

drivers/gpu/drm/i915/intel_pm.c:3066:12: warning: use of bitwise '|' with 
boolean operands [-Wbitwise-instead-of-logical]
changed = ilk_increase_wm_latency(dev_priv, dev_priv->wm.pri_latency, 
12) |
  
^

This construct is intentional, as it allows every one of the calls to
ilk_increase_wm_latency() to occur (instead of short circuiting with
logical OR) while still caring about the result of each call.

To make this clearer to the compiler, use the '|=' operator to assign
the result of each ilk_increase_wm_latency() call to changed, which
keeps the meaning of the code the same but makes it obvious that every
one of these calls is expected to happen.

Link: https://github.com/ClangBuiltLinux/linux/issues/1473
Reported-by: Nick Desaulniers 
Signed-off-by: Nathan Chancellor 
---
 drivers/gpu/drm/i915/intel_pm.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index f90fe39cf8ca..aaa3a0998e4c 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3050,9 +3050,9 @@ static void snb_wm_latency_quirk(struct drm_i915_private 
*dev_priv)
 * The BIOS provided WM memory latency values are often
 * inadequate for high resolution displays. Adjust them.
 */
-   changed = ilk_increase_wm_latency(dev_priv, dev_priv->wm.pri_latency, 
12) |
-   ilk_increase_wm_latency(dev_priv, dev_priv->wm.spr_latency, 12) 
|
-   ilk_increase_wm_latency(dev_priv, dev_priv->wm.cur_latency, 12);
+   changed = ilk_increase_wm_latency(dev_priv, dev_priv->wm.pri_latency, 
12);
+   changed |= ilk_increase_wm_latency(dev_priv, dev_priv->wm.spr_latency, 
12);
+   changed |= ilk_increase_wm_latency(dev_priv, dev_priv->wm.cur_latency, 
12);
 
if (!changed)
return;

base-commit: d73b17465d6da0a94bc0fcc86b150e1e923e8f71
-- 
2.33.1.637.gf443b226ca



Re: [PATCH v1 2/2] mm: remove extra ZONE_DEVICE struct page refcount

2021-10-14 Thread Ralph Campbell



On 10/14/21 11:01 AM, Jason Gunthorpe wrote:

On Thu, Oct 14, 2021 at 10:35:27AM -0700, Ralph Campbell wrote:


I ran xfstests-dev using the kernel boot option to "fake" a pmem device
when I first posted this patch. The tests ran OK (or at least the same
tests passed with and without my patch).

Hmm. I know nothing of xfstests but

tests/generic/413

Looks kind of like it might cover this situation?

Did it run for you?

Jason


I don't remember. I'll have to rerun the test which might take a day or two
to set up again.



Re: [PATCH v5 1/3] drm/loongson: Add DRM Driver for Loongson 7A1000 bridge chip

2021-10-14 Thread Sam Ravnborg
Hi lichenyang,
On Sat, Sep 11, 2021 at 10:31:31AM +0800, lichenyang wrote:
> From: Chenyang Li 
> 
> This patch adds an initial DRM driver for the Loongson LS7A1000
> bridge chip(LS7A). The LS7A bridge chip contains two display
> controllers, support dual display output. The maximum support for
> each channel display is to 1920x1080@60Hz.
> At present, DC device detection and DRM driver registration are
> completed, the crtc/plane/encoder/connector objects has been
> implemented.
> On Loongson 3A4000 CPU and 7A1000 system, we have achieved the use
> of dual screen, and support dual screen clone mode and expansion
> mode.
> 
> v11:
> - Remove a lot of useless code.
> - Add help information.
> - Delete unnecessary header files.
Looks much better now, thanks.

Can you provide some kind of overview of the HW?
It is confusing that you talk about a bridge for a display driver - is
this something from the HW?
And please look into usign the drm_bridge_connector - as this is what
any modern DRM display driver needs to use today.

Also the connector type needs to be specified - unknown is not
acceptable here.

The mail you sent did not show up at https://lore.kernel.org/dri-devel/
Please fix what is required to make it visible there.
This is where we point people to see the original mails.

Also a cover letter that explains what has been done - and what has not
been done - would be nice.

I look forward to v12,

Sam


Re: [PATCH] dt-bindings: display/bridge: sil,sii9234: Convert to YAML binding

2021-10-14 Thread Rob Herring
On Wed, 06 Oct 2021 17:21:58 +0200, AngeloGioacchino Del Regno wrote:
> Convert the Silicon Image SiI9234 HDMI/MHL bridge documentation to YAML.
> 
> Signed-off-by: AngeloGioacchino Del Regno 
> 
> ---
>  .../bindings/display/bridge/sii9234.txt   |  49 
>  .../bindings/display/bridge/sil,sii9234.yaml  | 110 ++
>  2 files changed, 110 insertions(+), 49 deletions(-)
>  delete mode 100644 
> Documentation/devicetree/bindings/display/bridge/sii9234.txt
>  create mode 100644 
> Documentation/devicetree/bindings/display/bridge/sil,sii9234.yaml
> 

Reviewed-by: Rob Herring 


Re: [PATCH v2] dt-bindings: display/bridge: sil,sii8620: Convert to YAML binding

2021-10-14 Thread Rob Herring
On Wed, Oct 06, 2021 at 05:04:59PM +0200, AngeloGioacchino Del Regno wrote:
> Convert the Silicon Image SiI8620 HDMI/MHL bridge documentation to YAML.
> 
> Signed-off-by: AngeloGioacchino Del Regno 
> 
> ---
>  .../bindings/display/bridge/sil,sii8620.yaml  | 93 +++
>  .../bindings/display/bridge/sil-sii8620.txt   | 33 ---
>  2 files changed, 93 insertions(+), 33 deletions(-)
>  create mode 100644 
> Documentation/devicetree/bindings/display/bridge/sil,sii8620.yaml
>  delete mode 100644 
> Documentation/devicetree/bindings/display/bridge/sil-sii8620.txt
> 
> diff --git 
> a/Documentation/devicetree/bindings/display/bridge/sil,sii8620.yaml 
> b/Documentation/devicetree/bindings/display/bridge/sil,sii8620.yaml
> new file mode 100644
> index ..5a38595b6687
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/display/bridge/sil,sii8620.yaml
> @@ -0,0 +1,93 @@
> +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/display/bridge/sil,sii8620.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: Silicon Image SiI8620 HDMI/MHL bridge
> +
> +maintainers:
> +  - Andrzej Hajda 
> +
> +properties:
> +  compatible:
> +const: sil,sii8620
> +
> +  reg:
> +description: I2C address of the bridge
> +maxItems: 1
> +
> +  clocks:
> +maxItems: 1
> +
> +  clock-names:
> +const: xtal
> +
> +  cvcc10-supply:
> +description: Digital Core Supply Voltage, 1.0V
> +
> +  iovcc18-supply:
> +description: I/O voltage supply, 1.8V
> +
> +  interrupts:
> +maxItems: 1
> +
> +  reset-gpios:
> +description: GPIO connected to the reset pin.
> +maxItems: 1
> +
> +  ports:
> +$ref: /schemas/graph.yaml#/properties/ports
> +
> +properties:
> +  port@0:
> +$ref: /schemas/graph.yaml#/properties/port
> +description:
> +  Video port for HDMI input

No need for 'ports' with only one port. However, this *should* have an 
output port for a connector, so I'm okay with this change. However, 
don't add that now. Just note the addition of 'ports' in the commit msg.

> +
> +required:
> +  - port@0
> +
> +required:
> +  - compatible
> +  - reg
> +  - cvcc10-supply
> +  - iovcc18-supply
> +  - interrupts
> +  - ports
> +
> +additionalProperties: false
> +
> +examples:
> +  - |
> +#include 
> +#include 
> +
> +i2c1 {

i2c {

> +  #address-cells = <1>;
> +  #size-cells = <0>;
> +
> +  bridge@39 {
> +compatible = "sil,sii8620";
> +reg = <0x39>;
> +cvcc10-supply = <_reg>;
> +iovcc18-supply = <_reg>;
> +interrupt-parent = <>;
> +interrupts = <2 IRQ_TYPE_LEVEL_HIGH>;
> +reset-gpios = < 0 GPIO_ACTIVE_HIGH>;
> +
> +ports {
> +  #address-cells = <1>;
> +  #size-cells = <0>;
> +
> +  port@0 {
> +reg = <0>;
> +mhl_to_hdmi: endpoint {
> +  remote-endpoint = <_to_mhl>;
> +};
> +  };
> +};
> +  };
> +};
> +
> +...
> diff --git a/Documentation/devicetree/bindings/display/bridge/sil-sii8620.txt 
> b/Documentation/devicetree/bindings/display/bridge/sil-sii8620.txt
> deleted file mode 100644
> index b05052f7d62f..
> --- a/Documentation/devicetree/bindings/display/bridge/sil-sii8620.txt
> +++ /dev/null
> @@ -1,33 +0,0 @@
> -Silicon Image SiI8620 HDMI/MHL bridge bindings
> -
> -Required properties:
> - - compatible: "sil,sii8620"
> - - reg: i2c address of the bridge
> - - cvcc10-supply: Digital Core Supply Voltage (1.0V)
> - - iovcc18-supply: I/O Supply Voltage (1.8V)
> - - interrupts: interrupt specifier of INT pin
> - - reset-gpios: gpio specifier of RESET pin
> - - clocks, clock-names: specification and name of "xtal" clock
> - - video interfaces: Device node can contain video interface port
> - node for HDMI encoder according to [1].
> -
> -[1]: Documentation/devicetree/bindings/media/video-interfaces.txt
> -
> -Example:
> - sii8620@39 {
> - reg = <0x39>;
> - compatible = "sil,sii8620";
> - cvcc10-supply = <_reg>;
> - iovcc18-supply = <_reg>;
> - interrupt-parent = <>;
> - interrupts = <2 0>;
> - reset-gpio = < 0 0>;
> - clocks = <_system_controller 0>;
> - clock-names = "xtal";
> -
> - port {
> - mhl_to_hdmi: endpoint {
> - remote-endpoint = <_to_mhl>;
> - };
> - };
> - };
> -- 
> 2.33.0
> 
> 


Re: [PATCH] drm: Update MST First Link Slot Information Based on Encoding Format

2021-10-14 Thread Bhawanpreet Lakha

Adding Mikita aswell

On 2021-10-14 4:21 p.m., Bhawanpreet Lakha wrote:


On 2021-10-13 6:25 p.m., Lyude Paul wrote:
Some comments below (also, sorry again for the mixup on the last 
review!)


On Tue, 2021-10-12 at 17:58 -0400, Bhawanpreet Lakha wrote:

8b/10b encoding format requires to reserve the first slot for
recording metadata. Real data transmission starts from the second slot,
with a total of available 63 slots available.

In 128b/132b encoding format, metadata is transmitted separately
in LLCP packet before MTP. Real data transmission starts from
the first slot, with a total of 64 slots available.

v2:
* Remove get_mst_link_encoding_cap
* Move total/start slots to mst_state, and copy it to mst_mgr in
atomic_check

Signed-off-by: Fangzhi Zuo 
Signed-off-by: Bhawanpreet Lakha 
---
  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 28 +++
  drivers/gpu/drm/drm_dp_mst_topology.c | 35 
+++

  include/drm/drm_dp_mst_helper.h   | 13 +++
  3 files changed, 69 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 5020f2d36fe1..4ad50eb0091a 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -10612,6 +10612,8 @@ static int amdgpu_dm_atomic_check(struct 
drm_device

*dev,
  #if defined(CONFIG_DRM_AMD_DC_DCN)
 struct dsc_mst_fairness_vars vars[MAX_PIPES];
  #endif
+   struct drm_dp_mst_topology_state *mst_state;
+   struct drm_dp_mst_topology_mgr *mgr;
   trace_amdgpu_dm_atomic_check_begin(state);
  @@ -10819,6 +10821,32 @@ static int amdgpu_dm_atomic_check(struct 
drm_device

*dev,
 lock_and_validation_needed = true;
 }
  +#if defined(CONFIG_DRM_AMD_DC_DCN)
+   for_each_new_mst_mgr_in_state(state, mgr, mst_state, i) {
+   struct amdgpu_dm_connector *aconnector;
+   struct drm_connector *connector;
+   struct drm_connector_list_iter iter;
+   u8 link_coding_cap;
+
+   if (!mgr->mst_state )
+   continue;

extraneous space


+
+   drm_connector_list_iter_begin(dev, );
+   drm_for_each_connector_iter(connector, ) {
+   int id = connector->index;
+
+   if (id == mst_state->mgr->conn_base_id) {
+   aconnector =
to_amdgpu_dm_connector(connector);
+   link_coding_cap =
dc_link_dp_mst_decide_link_encoding_format(aconnector->dc_link);
+   drm_dp_mst_update_coding_cap(mst_state,
link_coding_cap);
+
+   break;
+   }
+   }
+   drm_connector_list_iter_end();
+
+   }
+#endif
 /**
  * Streams and planes are reset when there are changes that 
affect
  * bandwidth. Anything that affects bandwidth needs to go 
through

diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c
b/drivers/gpu/drm/drm_dp_mst_topology.c
index ad0795afc21c..fb5c47c4cb2e 100644
--- a/drivers/gpu/drm/drm_dp_mst_topology.c
+++ b/drivers/gpu/drm/drm_dp_mst_topology.c
@@ -3368,7 +3368,7 @@ int drm_dp_update_payload_part1(struct
drm_dp_mst_topology_mgr *mgr)
 struct drm_dp_payload req_payload;
 struct drm_dp_mst_port *port;
 int i, j;
-   int cur_slots = 1;
+   int cur_slots = mgr->start_slot;
 bool skip;
   mutex_lock(>payload_lock);
@@ -4321,7 +4321,7 @@ int drm_dp_find_vcpi_slots(struct
drm_dp_mst_topology_mgr *mgr,
 num_slots = DIV_ROUND_UP(pbn, mgr->pbn_div);
   /* max. time slots - one slot for MTP header */
-   if (num_slots > 63)
+   if (num_slots > mgr->total_avail_slots)
 return -ENOSPC;
For reasons I will explain a little further in this email, we might 
want to

drop this…


 return num_slots;
  }
@@ -4333,7 +4333,7 @@ static int drm_dp_init_vcpi(struct
drm_dp_mst_topology_mgr *mgr,
 int ret;
   /* max. time slots - one slot for MTP header */
-   if (slots > 63)
+   if (slots > mgr->total_avail_slots)

…and this


 return -ENOSPC;
   vcpi->pbn = pbn;
@@ -4507,6 +4507,18 @@ int drm_dp_atomic_release_vcpi_slots(struct
drm_atomic_state *state,
  }
  EXPORT_SYMBOL(drm_dp_atomic_release_vcpi_slots);
  +void drm_dp_mst_update_coding_cap(struct drm_dp_mst_topology_state
*mst_state, uint8_t link_coding_cap)

Need some kdocs here


+{
+   if (link_coding_cap == DP_CAP_ANSI_128B132B) {
+   mst_state->total_avail_slots = 64;
+   mst_state->start_slot = 0;
+   }
+
+   DRM_DEBUG_KMS("%s coding format on mgr 0x%p\n",
+   (link_coding_cap == DP_CAP_ANSI_128B132B) ?
"128b/132b":"8b/10b", mst_state->mgr);

need to fix indenting here, and wrap this to 100 chars


+}

Re: [PATCH] drm: Update MST First Link Slot Information Based on Encoding Format

2021-10-14 Thread Bhawanpreet Lakha



On 2021-10-13 6:25 p.m., Lyude Paul wrote:

Some comments below (also, sorry again for the mixup on the last review!)

On Tue, 2021-10-12 at 17:58 -0400, Bhawanpreet Lakha wrote:

8b/10b encoding format requires to reserve the first slot for
recording metadata. Real data transmission starts from the second slot,
with a total of available 63 slots available.

In 128b/132b encoding format, metadata is transmitted separately
in LLCP packet before MTP. Real data transmission starts from
the first slot, with a total of 64 slots available.

v2:
* Remove get_mst_link_encoding_cap
* Move total/start slots to mst_state, and copy it to mst_mgr in
atomic_check

Signed-off-by: Fangzhi Zuo 
Signed-off-by: Bhawanpreet Lakha 
---
  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 28 +++
  drivers/gpu/drm/drm_dp_mst_topology.c | 35 +++
  include/drm/drm_dp_mst_helper.h   | 13 +++
  3 files changed, 69 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 5020f2d36fe1..4ad50eb0091a 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -10612,6 +10612,8 @@ static int amdgpu_dm_atomic_check(struct drm_device
*dev,
  #if defined(CONFIG_DRM_AMD_DC_DCN)
 struct dsc_mst_fairness_vars vars[MAX_PIPES];
  #endif
+   struct drm_dp_mst_topology_state *mst_state;
+   struct drm_dp_mst_topology_mgr *mgr;
  
 trace_amdgpu_dm_atomic_check_begin(state);
  
@@ -10819,6 +10821,32 @@ static int amdgpu_dm_atomic_check(struct drm_device

*dev,
 lock_and_validation_needed = true;
 }
  
+#if defined(CONFIG_DRM_AMD_DC_DCN)

+   for_each_new_mst_mgr_in_state(state, mgr, mst_state, i) {
+   struct amdgpu_dm_connector *aconnector;
+   struct drm_connector *connector;
+   struct drm_connector_list_iter iter;
+   u8 link_coding_cap;
+
+   if (!mgr->mst_state )
+   continue;

extraneous space


+
+   drm_connector_list_iter_begin(dev, );
+   drm_for_each_connector_iter(connector, ) {
+   int id = connector->index;
+
+   if (id == mst_state->mgr->conn_base_id) {
+   aconnector =
to_amdgpu_dm_connector(connector);
+   link_coding_cap =
dc_link_dp_mst_decide_link_encoding_format(aconnector->dc_link);
+   drm_dp_mst_update_coding_cap(mst_state,
link_coding_cap);
+
+   break;
+   }
+   }
+   drm_connector_list_iter_end();
+
+   }
+#endif
 /**
  * Streams and planes are reset when there are changes that affect
  * bandwidth. Anything that affects bandwidth needs to go through
diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c
b/drivers/gpu/drm/drm_dp_mst_topology.c
index ad0795afc21c..fb5c47c4cb2e 100644
--- a/drivers/gpu/drm/drm_dp_mst_topology.c
+++ b/drivers/gpu/drm/drm_dp_mst_topology.c
@@ -3368,7 +3368,7 @@ int drm_dp_update_payload_part1(struct
drm_dp_mst_topology_mgr *mgr)
 struct drm_dp_payload req_payload;
 struct drm_dp_mst_port *port;
 int i, j;
-   int cur_slots = 1;
+   int cur_slots = mgr->start_slot;
 bool skip;
  
 mutex_lock(>payload_lock);

@@ -4321,7 +4321,7 @@ int drm_dp_find_vcpi_slots(struct
drm_dp_mst_topology_mgr *mgr,
 num_slots = DIV_ROUND_UP(pbn, mgr->pbn_div);
  
 /* max. time slots - one slot for MTP header */

-   if (num_slots > 63)
+   if (num_slots > mgr->total_avail_slots)
 return -ENOSPC;

For reasons I will explain a little further in this email, we might want to
drop this…


 return num_slots;
  }
@@ -4333,7 +4333,7 @@ static int drm_dp_init_vcpi(struct
drm_dp_mst_topology_mgr *mgr,
 int ret;
  
 /* max. time slots - one slot for MTP header */

-   if (slots > 63)
+   if (slots > mgr->total_avail_slots)

…and this


 return -ENOSPC;
  
 vcpi->pbn = pbn;

@@ -4507,6 +4507,18 @@ int drm_dp_atomic_release_vcpi_slots(struct
drm_atomic_state *state,
  }
  EXPORT_SYMBOL(drm_dp_atomic_release_vcpi_slots);
  
+void drm_dp_mst_update_coding_cap(struct drm_dp_mst_topology_state

*mst_state, uint8_t link_coding_cap)

Need some kdocs here


+{
+   if (link_coding_cap == DP_CAP_ANSI_128B132B) {
+   mst_state->total_avail_slots = 64;
+   mst_state->start_slot = 0;
+   }
+
+   DRM_DEBUG_KMS("%s coding format on mgr 0x%p\n",
+   (link_coding_cap == DP_CAP_ANSI_128B132B) ?
"128b/132b":"8b/10b", mst_state->mgr);

need to fix indenting here, and wrap this to 100 chars


+}
+EXPORT_SYMBOL(drm_dp_mst_update_coding_cap);
+
  /**
   * 

[RFC PATCH] drm/panel: ilitek-ili9881d: add support for Wanchanglong W552946ABA panel

2021-10-14 Thread Michael Trimarchi
W552946ABA is a panel by Wanchanglong. This panel utilizes the Ilitek ILI9881D
controller.

Add this panel's initialzation sequence and timing to ILI9881D driver.
Tested on px30-evb v11

Signed-off-by: Michael Trimarchi 
---
 drivers/gpu/drm/panel/panel-ilitek-ili9881c.c | 238 +-
 1 file changed, 237 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/panel/panel-ilitek-ili9881c.c 
b/drivers/gpu/drm/panel/panel-ilitek-ili9881c.c
index 0145129d7c66..cf53b43e0907 100644
--- a/drivers/gpu/drm/panel/panel-ilitek-ili9881c.c
+++ b/drivers/gpu/drm/panel/panel-ilitek-ili9881c.c
@@ -42,6 +42,7 @@ struct ili9881c_desc {
const struct ili9881c_instr *init;
const size_t init_length;
const struct drm_display_mode *mode;
+   const unsigned long mode_flags;
 };
 
 struct ili9881c {
@@ -453,6 +454,213 @@ static const struct ili9881c_instr k101_im2byl02_init[] = 
{
ILI9881C_COMMAND_INSTR(0xD3, 0x3F), /* VN0 */
 };
 
+static const struct ili9881c_instr w552946ab_init[] = {
+   ILI9881C_SWITCH_PAGE_INSTR(3),
+   ILI9881C_COMMAND_INSTR(0x01, 0x00),
+   ILI9881C_COMMAND_INSTR(0x02, 0x00),
+   ILI9881C_COMMAND_INSTR(0x03, 0x53),
+   ILI9881C_COMMAND_INSTR(0x04, 0x53),
+   ILI9881C_COMMAND_INSTR(0x05, 0x13),
+   ILI9881C_COMMAND_INSTR(0x06, 0x04),
+   ILI9881C_COMMAND_INSTR(0x07, 0x02),
+   ILI9881C_COMMAND_INSTR(0x08, 0x02),
+   ILI9881C_COMMAND_INSTR(0x09, 0x00),
+   ILI9881C_COMMAND_INSTR(0x0A, 0x00),
+   ILI9881C_COMMAND_INSTR(0x0B, 0x00),
+   ILI9881C_COMMAND_INSTR(0x0C, 0x00),
+   ILI9881C_COMMAND_INSTR(0x0D, 0x00),
+   ILI9881C_COMMAND_INSTR(0x0E, 0x00),
+   ILI9881C_COMMAND_INSTR(0x0F, 0x00),
+
+   ILI9881C_COMMAND_INSTR(0x10, 0x00),
+   ILI9881C_COMMAND_INSTR(0x11, 0x00),
+   ILI9881C_COMMAND_INSTR(0x12, 0x00),
+   ILI9881C_COMMAND_INSTR(0x13, 0x00),
+   ILI9881C_COMMAND_INSTR(0x14, 0x00),
+   ILI9881C_COMMAND_INSTR(0x15, 0x08),
+   ILI9881C_COMMAND_INSTR(0x16, 0x10),
+   ILI9881C_COMMAND_INSTR(0x17, 0x00),
+   ILI9881C_COMMAND_INSTR(0x18, 0x08),
+   ILI9881C_COMMAND_INSTR(0x19, 0x00),
+   ILI9881C_COMMAND_INSTR(0x1A, 0x00),
+   ILI9881C_COMMAND_INSTR(0x1B, 0x00),
+   ILI9881C_COMMAND_INSTR(0x1C, 0x00),
+   ILI9881C_COMMAND_INSTR(0x1D, 0x00),
+   ILI9881C_COMMAND_INSTR(0x1E, 0xC0),
+   ILI9881C_COMMAND_INSTR(0x1F, 0x80),
+
+   ILI9881C_COMMAND_INSTR(0x20, 0x02),
+   ILI9881C_COMMAND_INSTR(0x21, 0x09),
+   ILI9881C_COMMAND_INSTR(0x22, 0x00),
+   ILI9881C_COMMAND_INSTR(0x23, 0x00),
+   ILI9881C_COMMAND_INSTR(0x24, 0x00),
+   ILI9881C_COMMAND_INSTR(0x25, 0x00),
+   ILI9881C_COMMAND_INSTR(0x26, 0x00),
+   ILI9881C_COMMAND_INSTR(0x27, 0x00),
+   ILI9881C_COMMAND_INSTR(0x28, 0x55),
+   ILI9881C_COMMAND_INSTR(0x29, 0x03),
+   ILI9881C_COMMAND_INSTR(0x2A, 0x00),
+   ILI9881C_COMMAND_INSTR(0x2B, 0x00),
+   ILI9881C_COMMAND_INSTR(0x2C, 0x00),
+   ILI9881C_COMMAND_INSTR(0x2D, 0x00),
+   ILI9881C_COMMAND_INSTR(0x2E, 0x00),
+   ILI9881C_COMMAND_INSTR(0x2F, 0x00),
+
+   ILI9881C_COMMAND_INSTR(0x30, 0x00),
+   ILI9881C_COMMAND_INSTR(0x31, 0x00),
+   ILI9881C_COMMAND_INSTR(0x32, 0x00),
+   ILI9881C_COMMAND_INSTR(0x33, 0x00),
+   ILI9881C_COMMAND_INSTR(0x34, 0x04),
+   ILI9881C_COMMAND_INSTR(0x35, 0x05),
+   ILI9881C_COMMAND_INSTR(0x36, 0x05),
+   ILI9881C_COMMAND_INSTR(0x37, 0x00),
+   ILI9881C_COMMAND_INSTR(0x38, 0x3C),
+   ILI9881C_COMMAND_INSTR(0x39, 0x35),
+   ILI9881C_COMMAND_INSTR(0x3A, 0x00),
+   ILI9881C_COMMAND_INSTR(0x3B, 0x40),
+   ILI9881C_COMMAND_INSTR(0x3C, 0x00),
+   ILI9881C_COMMAND_INSTR(0x3D, 0x00),
+   ILI9881C_COMMAND_INSTR(0x3E, 0x00),
+   ILI9881C_COMMAND_INSTR(0x3F, 0x00),
+
+   ILI9881C_COMMAND_INSTR(0x40, 0x00),
+   ILI9881C_COMMAND_INSTR(0x41, 0x88),
+   ILI9881C_COMMAND_INSTR(0x42, 0x00),
+   ILI9881C_COMMAND_INSTR(0x43, 0x00),
+   ILI9881C_COMMAND_INSTR(0x44, 0x1F),
+
+   ILI9881C_COMMAND_INSTR(0x50, 0x01),
+   ILI9881C_COMMAND_INSTR(0x51, 0x23),
+   ILI9881C_COMMAND_INSTR(0x52, 0x45),
+   ILI9881C_COMMAND_INSTR(0x53, 0x67),
+   ILI9881C_COMMAND_INSTR(0x54, 0x89),
+   ILI9881C_COMMAND_INSTR(0x55, 0xaB),
+   ILI9881C_COMMAND_INSTR(0x56, 0x01),
+   ILI9881C_COMMAND_INSTR(0x57, 0x23),
+   ILI9881C_COMMAND_INSTR(0x58, 0x45),
+   ILI9881C_COMMAND_INSTR(0x59, 0x67),
+   ILI9881C_COMMAND_INSTR(0x5A, 0x89),
+   ILI9881C_COMMAND_INSTR(0x5B, 0xAB),
+   ILI9881C_COMMAND_INSTR(0x5C, 0xCD),
+   ILI9881C_COMMAND_INSTR(0x5D, 0xEF),
+   ILI9881C_COMMAND_INSTR(0x5E, 0x03),
+   ILI9881C_COMMAND_INSTR(0x5F, 0x14),
+
+   ILI9881C_COMMAND_INSTR(0x60, 0x15),
+   ILI9881C_COMMAND_INSTR(0x61, 0x0C),
+   ILI9881C_COMMAND_INSTR(0x62, 0x0D),
+   ILI9881C_COMMAND_INSTR(0x63, 0x0E),
+   ILI9881C_COMMAND_INSTR(0x64, 0x0F),
+   

Re: [PATCH] dt-bindings: display/bridge: tc358767: Convert to YAML binding

2021-10-14 Thread Rob Herring
On Wed, Oct 06, 2021 at 03:52:04PM +0200, AngeloGioacchino Del Regno wrote:
> Convert the Toshiba TC358767 txt documentation to YAML.
> 
> Signed-off-by: AngeloGioacchino Del Regno 
> 
> ---
>  .../display/bridge/toshiba,tc358767.txt   |  54 
>  .../display/bridge/toshiba,tc358767.yaml  | 118 ++
>  2 files changed, 118 insertions(+), 54 deletions(-)
>  delete mode 100644 
> Documentation/devicetree/bindings/display/bridge/toshiba,tc358767.txt
>  create mode 100644 
> Documentation/devicetree/bindings/display/bridge/toshiba,tc358767.yaml
> 
> diff --git 
> a/Documentation/devicetree/bindings/display/bridge/toshiba,tc358767.txt 
> b/Documentation/devicetree/bindings/display/bridge/toshiba,tc358767.txt
> deleted file mode 100644
> index 583c5e9dbe6b..
> --- a/Documentation/devicetree/bindings/display/bridge/toshiba,tc358767.txt
> +++ /dev/null
> @@ -1,54 +0,0 @@
> -Toshiba TC358767 eDP bridge bindings
> -
> -Required properties:
> - - compatible: "toshiba,tc358767"
> - - reg: i2c address of the bridge, 0x68 or 0x0f, depending on bootstrap pins
> - - clock-names: should be "ref"
> - - clocks: OF device-tree clock specification for refclk input. The reference
> -   clock rate must be 13 MHz, 19.2 MHz, 26 MHz, or 38.4 MHz.
> -
> -Optional properties:
> - - shutdown-gpios: OF device-tree gpio specification for SD pin
> -   (active high shutdown input)
> - - reset-gpios: OF device-tree gpio specification for RSTX pin
> -(active low system reset)
> - - toshiba,hpd-pin: TC358767 GPIO pin number to which HPD is connected to (0 
> or 1)
> - - ports: the ports node can contain video interface port nodes to connect
> -   to a DPI/DSI source and to an eDP/DP sink according to [1][2]:
> -- port@0: DSI input port
> -- port@1: DPI input port
> -- port@2: eDP/DP output port
> -
> -[1]: Documentation/devicetree/bindings/graph.txt
> -[2]: Documentation/devicetree/bindings/media/video-interfaces.txt
> -
> -Example:
> - edp-bridge@68 {
> - compatible = "toshiba,tc358767";
> - reg = <0x68>;
> - shutdown-gpios = < 23 GPIO_ACTIVE_HIGH>;
> - reset-gpios = < 24 GPIO_ACTIVE_LOW>;
> - clock-names = "ref";
> - clocks = <_refclk>;
> -
> - ports {
> - #address-cells = <1>;
> - #size-cells = <0>;
> -
> - port@1 {
> - reg = <1>;
> -
> - bridge_in: endpoint {
> - remote-endpoint = <_out>;
> - };
> - };
> -
> - port@2 {
> - reg = <2>;
> -
> - bridge_out: endpoint {
> - remote-endpoint = <_in>;
> - };
> - };
> - };
> - };
> diff --git 
> a/Documentation/devicetree/bindings/display/bridge/toshiba,tc358767.yaml 
> b/Documentation/devicetree/bindings/display/bridge/toshiba,tc358767.yaml
> new file mode 100644
> index ..8e27e6f0fc7d
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/display/bridge/toshiba,tc358767.yaml
> @@ -0,0 +1,118 @@
> +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/display/bridge/toshiba,tc358767.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: Toshiba TC358767 MIPI-DSI or MIPI-DPI to DP/eDP bridge
> +
> +maintainers:
> +  - Tomi Valkeinen 
> +
> +properties:
> +  compatible:
> +enum:
> +  - toshiba,tc358767
> +
> +  reg:
> +description: I2C address of the bridge
> +enum: [0x68, 0x0f]
> +
> +  clocks:
> +description:
> +  Reference clock input. The reference clock rate must be 13MHz, 19.2MHz,
> +  26MHz, or 38.4MHz.
> +maxItems: 1
> +
> +  clock-names:
> +const: ref
> +
> +  reset-gpios:
> +description: GPIO connected to the RSTX signal.
> +maxItems: 1
> +
> +  shutdown-gpios:
> +description: GPIO connected to the SD signal.
> +maxItems: 1
> +
> +  toshiba,hpd-pin:
> +$ref: "/schemas/types.yaml#/definitions/uint32"
> +description: TC356767 GPIO pin number to which HPD is connected
> +enum:
> +  - 0
> +  - 1
> +
> +  ports:
> +$ref: /schemas/graph.yaml#/properties/ports
> +
> +properties:
> +  port@0:
> +$ref: /schemas/graph.yaml#/properties/port
> +description:
> +  Video port for MIPI DSI input
> +
> +  port@1:
> +$ref: /schemas/graph.yaml#/properties/port
> +description:
> +  Video port for MIPI DPI input
> +
> +  port@2:
> +$ref: /schemas/graph.yaml#/properties/port
> +description:
> +  Video port for DP/eDP output (panel or connector).
> +
> +oneOf:
> +  - required:
> +  - port@0
> + 

Re: [PATCH] drm/panel: ej030na: Make use of the helper function dev_err_probe()

2021-10-14 Thread Sam Ravnborg
Hi Cai,

On Thu, Sep 16, 2021 at 06:42:24PM +0800, Cai Huoqing wrote:
> When possible use dev_err_probe help to properly deal with the
> PROBE_DEFER error, the benefit is that DEFER issue will be logged
> in the devices_deferred debugfs file.
> And using dev_err_probe() can reduce code size, and the error value
> gets printed.
> 
> Signed-off-by: Cai Huoqing 

Thanks for updating the panel drivers to use dev_err_probe().
I have applied all patches to drm-misc-next (11 in total + one
white space fix).
The patches will show up in -next in 1-2 weeks.

Sam


Re: [PATCH] dt-bindings: display/bridge: tc358764: Convert to YAML binding

2021-10-14 Thread Rob Herring
On Mon, Oct 11, 2021 at 10:12:02PM +0200, Sam Ravnborg wrote:
> Hi AngeloGioacchino,
> 
> On Wed, Oct 06, 2021 at 03:51:50PM +0200, AngeloGioacchino Del Regno wrote:
> > Convert the Toshiba TC358764 txt documentation to YAML.
> > 
> > Signed-off-by: AngeloGioacchino Del Regno 
> > 
> 
> Thanks for all these conversions to DT-schema.
> 
> It would be very good if the changelog could document the warnings they
> triggers when they are used to check the existing dts files.
> This is a good way to document that the warnings are expected.

Really what's missing is adding 'ports'.

I'm fine with just a note below '---' on the intent WRT dtbs_check. I 
assume the intent is to fix the one case which is fine given there is 
only 1. The graph parsing code doesn't care which way is done and we 
prefer having 'ports'.

> 
> While waiting for Rob to review, here is one small nit. See inline
> comment below.
> 
> My personal preference is to use 4 spaces for indent in the examples.
> But two is perfectly fine and there is today no rule for it.
> 
> When you are resending these, then it would be nice with a cover letter
> and all patches in one series. You can then use the cover letter both to
> tell on a higher level what was changed since v1 and to give a status on the
> conversion effort. I hope you have converted all bridge DT-schemas.
> 
>   Sam
> 
> > ---
> >  .../display/bridge/toshiba,tc358764.txt   | 35 ---
> >  .../display/bridge/toshiba,tc358764.yaml  | 94 +++
> >  2 files changed, 94 insertions(+), 35 deletions(-)
> >  delete mode 100644 
> > Documentation/devicetree/bindings/display/bridge/toshiba,tc358764.txt
> >  create mode 100644 
> > Documentation/devicetree/bindings/display/bridge/toshiba,tc358764.yaml
> > 
> > diff --git 
> > a/Documentation/devicetree/bindings/display/bridge/toshiba,tc358764.txt 
> > b/Documentation/devicetree/bindings/display/bridge/toshiba,tc358764.txt
> > deleted file mode 100644
> > index 8f9abf28a8fa..
> > --- a/Documentation/devicetree/bindings/display/bridge/toshiba,tc358764.txt
> > +++ /dev/null
> > @@ -1,35 +0,0 @@
> > -TC358764 MIPI-DSI to LVDS panel bridge
> > -
> > -Required properties:
> > -  - compatible: "toshiba,tc358764"
> > -  - reg: the virtual channel number of a DSI peripheral
> > -  - vddc-supply: core voltage supply, 1.2V
> > -  - vddio-supply: I/O voltage supply, 1.8V or 3.3V
> > -  - vddlvds-supply: LVDS1/2 voltage supply, 3.3V
> > -  - reset-gpios: a GPIO spec for the reset pin
> > -
> > -The device node can contain following 'port' child nodes,
> > -according to the OF graph bindings defined in [1]:
> > -  0: DSI Input, not required, if the bridge is DSI controlled
> > -  1: LVDS Output, mandatory
> > -
> > -[1]: Documentation/devicetree/bindings/media/video-interfaces.txt
> > -
> > -Example:
> > -
> > -   bridge@0 {
> > -   reg = <0>;
> > -   compatible = "toshiba,tc358764";
> > -   vddc-supply = <_1v2_reg>;
> > -   vddio-supply = <_1v8_reg>;
> > -   vddlvds-supply = <_3v3_reg>;
> > -   reset-gpios = < 6 GPIO_ACTIVE_LOW>;
> > -   #address-cells = <1>;
> > -   #size-cells = <0>;
> > -   port@1 {
> > -   reg = <1>;
> > -   lvds_ep: endpoint {
> > -   remote-endpoint = <_ep>;
> > -   };
> > -   };
> > -   };
> > diff --git 
> > a/Documentation/devicetree/bindings/display/bridge/toshiba,tc358764.yaml 
> > b/Documentation/devicetree/bindings/display/bridge/toshiba,tc358764.yaml
> > new file mode 100644
> > index ..267a870b6b0b
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/display/bridge/toshiba,tc358764.yaml
> > @@ -0,0 +1,94 @@
> > +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> > +%YAML 1.2
> > +---
> > +$id: http://devicetree.org/schemas/display/bridge/toshiba,tc358764.yaml#
> > +$schema: http://devicetree.org/meta-schemas/core.yaml#
> > +
> > +title: Toshiba TC358764 MIPI-DSI to LVDS bridge
> > +
> > +maintainers:
> > +  - Andrzej Hajda 
> > +
> > +description: |
> > +  The TC358764 is bridge device which converts MIPI DSI or MIPI DPI to 
> > DP/eDP.
> > +
> > +properties:
> > +  compatible:
> > +enum:
> > +  - toshiba,tc358764
> > +
> > +  reg:
> > +description: Virtual channel number of a DSI peripheral
> > +maxItems: 1
> > +
> > +  reset-gpios:
> > +description: GPIO connected to the reset pin.
> > +maxItems: 1
> > +
> > +  vddc-supply:
> > +description: Core voltage supply, 1.2V
> > +
> > +  vddio-supply:
> > +description: I/O voltage supply, 1.8V or 3.3V
> > +
> > +  vddlvds-supply:
> > +description: LVDS1/2 voltage supply, 3.3V
> > +
> > +  ports:
> > +$ref: /schemas/graph.yaml#/properties/ports
> > +
> > +properties:
> > +  port@0:
> > +$ref: /schemas/graph.yaml#/properties/port
> > +description:
> > +  Video port for MIPI DSI input, if the 

Re: [PATCH] drm: panel: nt36672a: Removed extra whitespace.

2021-10-14 Thread Sam Ravnborg
Hi Cai,

On Thu, Sep 16, 2021 at 03:37:05PM +0800, Cai Huoqing wrote:
> Removed extra whitespace before dev_err_probe() according to coding style.
> 
> Signed-off-by: Cai Huoqing 

Thanks, applied to drm-misc-next.

Sam


Re: [PATCH 0/3] drm/panel: Proper cleanup after mipi_dsi_attach()

2021-10-14 Thread Sam Ravnborg
Hi Brian,

On Thu, Sep 23, 2021 at 05:33:52PM -0700, Brian Norris wrote:
> I've tested a few dual-DSI panel drivers which choke if they PROBE_DEFER
> at the wrong time, so I patched those up in patch 1 and 2. Patch 3 fixes
> the other drivers that I couldn't test, but seem to have all the same
> problem.
> 
> 
> Brian Norris (3):
>   drm/panel: kingdisplay-kd097d04: Delete panel on attach() failure
>   drm/panel: innolux-p079zca: Delete panel on attach() failure
>   drm/panel: Delete panel on mipi_dsi_attach() failure

Thanks for fixing these up, and especially for the fix of the remaining
panel drivers.

I have applied all three patches to drm-misc-next as I did not consider
the fixes urgent.

They will show up in -next in 1-2 weeks.

Sam


RE: [RFC v2 01/22] drm: RFC for Plane Color Hardware Pipeline

2021-10-14 Thread Shankar, Uma



> -Original Message-
> From: Pekka Paalanen 
> Sent: Wednesday, October 13, 2021 12:56 PM
> To: Shankar, Uma 
> Cc: Simon Ser ; daniel.vet...@ffwll.ch; intel-
> g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org;
> harry.wentl...@amd.com; ville.syrj...@linux.intel.com; brian.star...@arm.com;
> sebast...@sebastianwick.net; shashank.sha...@amd.com
> Subject: Re: [RFC v2 01/22] drm: RFC for Plane Color Hardware Pipeline
> 
> On Tue, 12 Oct 2021 19:11:29 +
> "Shankar, Uma"  wrote:
> 
> > > -Original Message-
> > > From: Pekka Paalanen 
> > > Sent: Tuesday, October 12, 2021 5:30 PM
> > > To: Simon Ser 
> > > Cc: Shankar, Uma ;
> > > intel-...@lists.freedesktop.org; dri- de...@lists.freedesktop.org;
> > > harry.wentl...@amd.com; ville.syrj...@linux.intel.com;
> > > brian.star...@arm.com; sebast...@sebastianwick.net;
> > > shashank.sha...@amd.com
> > > Subject: Re: [RFC v2 01/22] drm: RFC for Plane Color Hardware
> > > Pipeline
> > >
> > > On Tue, 12 Oct 2021 10:35:37 +
> > > Simon Ser  wrote:
> > >
> > > > On Tuesday, October 12th, 2021 at 12:30, Pekka Paalanen
> > >  wrote:
> > > >
> > > > > is there a practise of landing proposal documents in the kernel?
> > > > > How does that work, will a kernel tree carry the patch files?
> > > > > Or should this document be worded like documentation for an
> > > > > accepted feature, and then the patches either land or don't?
> > > >
> > > > Once everyone agrees, the RFC can land. I don't think a kernel
> > > > tree is necessary. See:
> > > >
> > > > https://dri.freedesktop.org/docs/drm/gpu/rfc/index.html
> > >
> > > Does this mean the RFC doc patch will land, but the code patches
> > > will remain in the review cycles waiting for userspace proving vehicles?
> > > Rather than e.g. committed as files that people would need to apply
> > > themselves? Or how does one find the code patches corresponding to RFC 
> > > docs?
> >
> > As I understand, this section was added to finalize the design and
> > debate on the UAPI, structures, headers and design etc. Once a general
> > agreement is in place with all the stakeholders, we can have ack on
> > design and approach and get it merged. This hence serves as an approved
> reference for the UAPI, accepted and agreed by community at large.
> >
> > Once the code lands, all the documentation will be added to the right
> > driver sections and helpers, like it's been done currently.
> 
> I'm just wondering: someone browses a kernel tree, and discovers this RFC doc 
> in
> there. They want to see or test the latest (WIP) kernel implementation of it. 
> How will
> they find the code / patches?

Maybe we could include the WIP links here to help with getting the pieces, this 
may include
the driver patches and also the userspace efforts as well.

Regards,
Uma Shankar

> 
> Thanks,
> pq


RE: [RFC v2 01/22] drm: RFC for Plane Color Hardware Pipeline

2021-10-14 Thread Shankar, Uma


> -Original Message-
> From: Pekka Paalanen 
> Sent: Wednesday, October 13, 2021 2:01 PM
> To: Shankar, Uma 
> Cc: harry.wentl...@amd.com; ville.syrj...@linux.intel.com; intel- 
> g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; 
> brian.star...@arm.com; sebast...@sebastianwick.net; 
> shashank.sha...@amd.com
> Subject: Re: [RFC v2 01/22] drm: RFC for Plane Color Hardware Pipeline
> 
> On Tue, 12 Oct 2021 20:58:27 +
> "Shankar, Uma"  wrote:
> 
> > > -Original Message-
> > > From: Pekka Paalanen 
> > > Sent: Tuesday, October 12, 2021 4:01 PM
> > > To: Shankar, Uma 
> > > Cc: intel-...@lists.freedesktop.org; 
> > > dri-devel@lists.freedesktop.org; harry.wentl...@amd.com; 
> > > ville.syrj...@linux.intel.com; brian.star...@arm.com; 
> > > sebast...@sebastianwick.net; shashank.sha...@amd.com
> > > Subject: Re: [RFC v2 01/22] drm: RFC for Plane Color Hardware 
> > > Pipeline
> > >
> > > On Tue,  7 Sep 2021 03:08:43 +0530 Uma Shankar 
> > >  wrote:
> > >
> > > > This is a RFC proposal for plane color hardware blocks.
> > > > It exposes the property interface to userspace and calls out the 
> > > > details or interfaces created and the intended purpose.
> > > >
> > > > Credits: Ville Syrjälä 
> > > > Signed-off-by: Uma Shankar 
> > > > ---
> > > >  Documentation/gpu/rfc/drm_color_pipeline.rst | 167
> > > > +++
> > > >  1 file changed, 167 insertions(+)  create mode 100644 
> > > > Documentation/gpu/rfc/drm_color_pipeline.rst
> > > >
> > > > diff --git a/Documentation/gpu/rfc/drm_color_pipeline.rst
> > > > b/Documentation/gpu/rfc/drm_color_pipeline.rst
> > > > new file mode 100644
> > > > index ..0d1ca858783b
> > > > --- /dev/null
> > > > +++ b/Documentation/gpu/rfc/drm_color_pipeline.rst
> > > > @@ -0,0 +1,167 @@
> > > > +==
> > > > +Display Color Pipeline: Proposed DRM Properties
> 
> ...
> 
> > > > +Proposal is to have below properties for a plane:
> > > > +
> > > > +* Plane Degamma or Pre-Curve:
> > > > +   * This will be used to linearize the input framebuffer data.
> > > > +   * It will apply the reverse of the color transfer function.
> > > > +   * It can be a degamma curve or OETF for HDR.
> > >
> > > As you want to produce light-linear values, you use EOTF or inverse OETF.
> > >
> > > The term OETF has a built-in assumption that that happens in a camera:
> > > it takes in light and produces and electrical signal. Lately I 
> > > have personally started talking about non-linear encoding of color 
> > > values, since EOTF is often associated with displays if nothing 
> > > else is said (taking
> in an electrical signal and producing light).
> > >
> > > So this would be decoding the color values into light-linear color 
> > > values. That is what an EOTF does, yes, but I feel there is a 
> > > nuanced difference. A piece of equipment implements an EOTF by 
> > > turning an electrical signal into light, hence EOTF often refers 
> > > to specific equipment. You could talk about content EOTF to denote 
> > > content value encoding, as opposed to output or display EOTF, but 
> > > that might be confusing if you look at e.g. the diagrams in
> > > BT.2100: is it the EOTF
> or is it the inverse OETF? Is the (inverse?) OOTF included?
> > >
> > > So I try to side-step those questions by talking about encoding.
> >
> > The idea here is that frame buffer presented to display plane engine 
> > will be non-
> linear.
> > So output of a media decode should result in content with EOTF applied.
> 
> Hi,
> 
> sure, but the question is: which EOTF. There can be many different 
> things called "EOTF" in a single pipeline, and then it's up to the 
> document writer to make the difference between them. Comparing two 
> documents with different conventions causes a lot of confusion in my 
> personal experience, so it is good to define the concepts more carefully.

Hi Pekka,
I think you have given some nice inputs to really deep dive and document here.
Will update this with the right expectations wrt what transfer functions to be 
used at various stages in the operation pipeline.

> > So output of a media decode should result in content with EOTF applied.
> 
> I suspect you have it backwards. Media decode produces electrical
> (non-linear) pixel color values. If EOTF was applied, they would be 
> linear instead (and require more memory to achieve the same visual precision).
> 
> If you want to put it this way, you could say "with inverse EOTF 
> applied", but that might be slightly confusing because it is already 
> baked in to the video, it's not something a media decoder has to 
> specifically apply, I think. However, the (inverse) EOTF in this case is the 
> content EOTF, not the display EOTF.
> 
> If content and display EOTF differ, then one must apply first content 
> EOTF and then inverse display EOTF to get values that are correctly 
> encoded for the display. (This is necessary but not sufficient in
> 

Re: [PATCH] drm: fix null-ptr-deref in drm_dev_init_release()

2021-10-14 Thread Sam Ravnborg
Hi Wang,

On Wed, Oct 13, 2021 at 07:41:39PM +0800, Wang Hai wrote:
> I got a null-ptr-deref report:
> 
> [drm:drm_dev_init [drm]] *ERROR* Cannot allocate anonymous inode: -12
> ==
> BUG: KASAN: null-ptr-deref in iput+0x3c/0x4a0
> ...
> Call Trace:
>  dump_stack_lvl+0x6c/0x8b
>  kasan_report.cold+0x64/0xdb
>  __asan_load8+0x69/0x90
>  iput+0x3c/0x4a0
>  drm_dev_init_release+0x39/0xb0 [drm]
>  drm_managed_release+0x158/0x2d0 [drm]
>  drm_dev_init+0x3a7/0x4c0 [drm]
>  __devm_drm_dev_alloc+0x55/0xd0 [drm]
>  mi0283qt_probe+0x8a/0x2b5 [mi0283qt]
>  spi_probe+0xeb/0x130
> ...
>  entry_SYSCALL_64_after_hwframe+0x44/0xae
> 
> If drm_fs_inode_new() fails in drm_dev_init(), dev->anon_inode will point
> to PTR_ERR(...) instead of NULL. This will result in null-ptr-deref when
> drm_fs_inode_free(dev->anon_inode) is called.
> 
> drm_dev_init()
>   drm_fs_inode_new() // fail, dev->anon_inode = PTR_ERR(...)
>   drm_managed_release()
>   drm_dev_init_release()
>   drm_fs_inode_free() // access non-existent anon_inode
> 
> Define a temp variable and assign it to dev->anon_inode if the temp
> variable is not PTR_ERR.
> 
> Fixes: 2cbf7fc6718b ("drm: Use drmm_ for drm_dev_init cleanup")
> Reported-by: Hulk Robot 
> Signed-off-by: Wang Hai 

Nice simple fix. Applied to drm-misc-next as I did not see the fix as
urgent. It will show up in -next in 1-2 weeks.

Sam


Re: [PATCH] drm/bridge: Ignore -EPROBE_DEFER when bridge attach fails

2021-10-14 Thread Sam Ravnborg
Hi Guido,

> > >  
> > > + if (ret != -EPROBE_DEFER) {
> > >  #ifdef CONFIG_OF
> > > - DRM_ERROR("failed to attach bridge %pOF to encoder %s: %d\n",
> > > -   bridge->of_node, encoder->name, ret);
> > > + DRM_ERROR("failed to attach bridge %pOF to encoder %s: %d\n",
> > > +   bridge->of_node, encoder->name, ret);
> > 
> > It would be better to use drm_probe_err().
> 
> That's what i thought initially but since the rest here uses DRM_*
> logging i stuck with it. Happy to change that though.

If we continue to use the deprecated DRM_ logging functions then we will
never get rid of them. And then I like the simplicity of
drm_probe_err().

In the end it is up to you.

Sam


Re: [PATCH v1 2/2] mm: remove extra ZONE_DEVICE struct page refcount

2021-10-14 Thread Dan Williams
On Thu, Oct 14, 2021 at 11:45 AM Matthew Wilcox  wrote:
>
>
> It would probably help if you cc'd Dan on this.

Thanks.

[..]
>
> On Thu, Oct 14, 2021 at 02:06:34PM -0300, Jason Gunthorpe wrote:
> > On Thu, Oct 14, 2021 at 10:39:28AM -0500, Alex Sierra wrote:
> > > From: Ralph Campbell 
> > >
> > > ZONE_DEVICE struct pages have an extra reference count that complicates 
> > > the
> > > code for put_page() and several places in the kernel that need to check 
> > > the
> > > reference count to see that a page is not being used (gup, compaction,
> > > migration, etc.). Clean up the code so the reference count doesn't need to
> > > be treated specially for ZONE_DEVICE.
> > >
> > > Signed-off-by: Ralph Campbell 
> > > Signed-off-by: Alex Sierra 
> > > Reviewed-by: Christoph Hellwig 
> > > ---
> > > v2:
> > > AS: merged this patch in linux 5.11 version
> > >
> > > v5:
> > > AS: add condition at try_grab_page to check for the zone device type, 
> > > while
> > > page ref counter is checked less/equal to zero. In case of device zone, 
> > > pages
> > > ref counter are initialized to zero.
> > >
> > > v7:
> > > AS: fix condition at try_grab_page added at v5, is invalid. It supposed
> > > to fix xfstests/generic/413 test, however, there's a known issue on
> > > this test where DAX mapped area DIO to non-DAX expect to fail.
> > > https://patchwork.kernel.org/project/fstests/patch/1489463960-3579-1-git-send-email-xz...@redhat.com
> > > This condition was removed after rebase over patch series
> > > https://lore.kernel.org/r/20210813044133.1536842-4-jhubb...@nvidia.com
> > > ---
> > >  arch/powerpc/kvm/book3s_hv_uvmem.c |  2 +-
> > >  drivers/gpu/drm/nouveau/nouveau_dmem.c |  2 +-
> > >  fs/dax.c   |  4 +-
> > >  include/linux/dax.h|  2 +-
> > >  include/linux/memremap.h   |  7 +--
> > >  include/linux/mm.h | 11 
> > >  lib/test_hmm.c |  2 +-
> > >  mm/internal.h  |  8 +++
> > >  mm/memcontrol.c|  6 +--
> > >  mm/memremap.c  | 69 +++---
> > >  mm/migrate.c   |  5 --
> > >  mm/page_alloc.c|  3 ++
> > >  mm/swap.c  | 45 ++---
> > >  13 files changed, 46 insertions(+), 120 deletions(-)
> >
> > Has anyone tested this with FSDAX? Does get_user_pages() on fsdax
> > backed memory still work?
> >
> > What refcount value does the struct pages have when they are installed
> > in the PTEs? Remember a 0 refcount will make all the get_user_pages()
> > fail.
> >
> > I'm looking at the call path starting in ext4_punch_hole() and I would
> > expect to see something manipulating the page ref count before
> > the ext4_break_layouts() call path gets to the dax_page_unused() test.
> >
> > All I see is we go into unmap_mapping_pages() - that would normally
> > put back the page references held by PTEs but insert_pfn() has this:
> >
> >   if (pfn_t_devmap(pfn))
> >   entry = pte_mkdevmap(pfn_t_pte(pfn, prot));
> >
> > And:
> >
> > static inline pte_t pte_mkdevmap(pte_t pte)
> > {
> >   return pte_set_flags(pte, _PAGE_SPECIAL|_PAGE_DEVMAP);
> > }
> >
> > Which interacts with vm_normal_page():
> >
> >   if (pte_devmap(pte))
> >   return NULL;
> >
> > To disable that refcounting?
> >
> > So... I have a feeling this will have PTEs pointing to 0 refcount
> > pages? Unless FSDAX is !pte_devmap which is not the case, right?
> >
> > This seems further confirmed by this comment:
> >
> >   /*
> >* If we race get_user_pages_fast() here either we'll see the
> >* elevated page count in the iteration and wait, or
> >* get_user_pages_fast() will see that the page it took a reference
> >* against is no longer mapped in the page tables and bail to the
> >* get_user_pages() slow path.  The slow path is protected by
> >* pte_lock() and pmd_lock(). New references are not taken without
> >* holding those locks, and unmap_mapping_pages() will not zero the
> >* pte or pmd without holding the respective lock, so we are
> >* guaranteed to either see new references or prevent new
> >* references from being established.
> >*/
> >
> > Which seems to explain this scheme relies on unmap_mapping_pages() to
> > fence GUP_fast, not on GUP_fast observing 0 refcounts when it should
> > stop.
> >
> > This seems like it would be properly fixed by using normal page
> > refcounting for PTEs - ie stop using special for these pages?
> >
> > Does anyone know why devmap is pte_special anyhow?

It does not need to be special as mentioned here:

https://lore.kernel.org/all/CAPcyv4iFeVDVPn6uc=aksyuvkiu3-fk-n16ijvzq3n8ot00...@mail.gmail.com/

The refcount dependencies also go away after this...


Re: [PATCH 25/25] drm/i915/execlists: Weak parallel submission support for execlists

2021-10-14 Thread Matthew Brost
On Thu, Oct 14, 2021 at 11:42:41AM -0700, John Harrison wrote:
> On 10/14/2021 10:20, Matthew Brost wrote:
> > A weak implementation of parallel submission (multi-bb execbuf IOCTL) for
> > execlists. Doing as little as possible to support this interface for
> > execlists - basically just passing submit fences between each request
> > generated and virtual engines are not allowed. This is on par with what
> > is there for the existing (hopefully soon deprecated) bonding interface.
> > 
> > We perma-pin these execlists contexts to align with GuC implementation.
> > 
> > Signed-off-by: Matthew Brost 
> > ---
> >   drivers/gpu/drm/i915/gem/i915_gem_context.c   | 10 ++--
> >   drivers/gpu/drm/i915/gt/intel_context.c   |  4 +-
> >   .../drm/i915/gt/intel_execlists_submission.c  | 56 ++-
> >   drivers/gpu/drm/i915/gt/intel_lrc.c   |  2 +
> >   .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  2 -
> >   5 files changed, 64 insertions(+), 10 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
> > b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > index fb33d0322960..35e87a7d0ea9 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > @@ -570,10 +570,6 @@ set_proto_ctx_engines_parallel_submit(struct 
> > i915_user_extension __user *base,
> > struct intel_engine_cs **siblings = NULL;
> > intel_engine_mask_t prev_mask;
> > -   /* FIXME: This is NIY for execlists */
> > -   if (!(intel_uc_uses_guc_submission(>gt.uc)))
> > -   return -ENODEV;
> > -
> > if (get_user(slot, >engine_index))
> > return -EFAULT;
> > @@ -583,6 +579,12 @@ set_proto_ctx_engines_parallel_submit(struct 
> > i915_user_extension __user *base,
> > if (get_user(num_siblings, >num_siblings))
> > return -EFAULT;
> > +   if (!intel_uc_uses_guc_submission(>gt.uc) && num_siblings != 1) {
> > +   drm_dbg(>drm, "Only 1 sibling (%d) supported in non-GuC 
> > mode\n",
> > +   num_siblings);
> > +   return -EINVAL;
> > +   }
> > +
> > if (slot >= set->num_engines) {
> > drm_dbg(>drm, "Invalid placement value, %d >= %d\n",
> > slot, set->num_engines);
> > diff --git a/drivers/gpu/drm/i915/gt/intel_context.c 
> > b/drivers/gpu/drm/i915/gt/intel_context.c
> > index 5634d14052bc..1bec92e1d8e6 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_context.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_context.c
> > @@ -79,7 +79,8 @@ static int intel_context_active_acquire(struct 
> > intel_context *ce)
> > __i915_active_acquire(>active);
> > -   if (intel_context_is_barrier(ce) || intel_engine_uses_guc(ce->engine))
> > +   if (intel_context_is_barrier(ce) || intel_engine_uses_guc(ce->engine) ||
> > +   intel_context_is_parallel(ce))
> > return 0;
> > /* Preallocate tracking nodes */
> > @@ -563,7 +564,6 @@ void intel_context_bind_parent_child(struct 
> > intel_context *parent,
> >  * Callers responsibility to validate that this function is used
> >  * correctly but we use GEM_BUG_ON here ensure that they do.
> >  */
> > -   GEM_BUG_ON(!intel_engine_uses_guc(parent->engine));
> > GEM_BUG_ON(intel_context_is_pinned(parent));
> > GEM_BUG_ON(intel_context_is_child(parent));
> > GEM_BUG_ON(intel_context_is_pinned(child));
> > diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c 
> > b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > index bedb80057046..8cd986bdf26c 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > @@ -927,8 +927,7 @@ static void execlists_submit_ports(struct 
> > intel_engine_cs *engine)
> >   static bool ctx_single_port_submission(const struct intel_context *ce)
> >   {
> > -   return (IS_ENABLED(CONFIG_DRM_I915_GVT) &&
> > -   intel_context_force_single_submission(ce));
> > +   return intel_context_force_single_submission(ce);
> Does this change not affect all execlist operation rather than just parallel
> submission?
> 

I don't think so. The only place that sets single submission was in the
GVT code. I think was an optimization so this would just compile out if
GVT wasn't built.

> >   }
> >   static bool can_merge_ctx(const struct intel_context *prev,
> > @@ -2598,6 +2597,58 @@ static void execlists_context_cancel_request(struct 
> > intel_context *ce,
> >   current->comm);
> >   }
> > +static struct intel_context *
> > +execlists_create_parallel(struct intel_engine_cs **engines,
> > + unsigned int num_siblings,
> > + unsigned int width)
> > +{
> > +   struct intel_engine_cs **siblings = NULL;
> > +   struct intel_context *parent = NULL, *ce, *err;
> > +   int i, j;
> > +
> > +   GEM_BUG_ON(num_siblings != 1);
> > +
> > +   siblings = kmalloc_array(num_siblings,
> > +

Re: [PATCH] drm/bridge: display-connector: fix an uninitialized pointer in probe()

2021-10-14 Thread Sam Ravnborg
Hi Dan,

On Wed, Oct 13, 2021 at 11:08:25AM +0300, Dan Carpenter wrote:
> The "label" pointer is used for debug output.  The code assumes that it
> is either NULL or valid, but it is never set to NULL.  It is either
> valid or uninitialized.
> 
> Fixes: 0c275c30176b ("drm/bridge: Add bridge driver for display connectors")
> Signed-off-by: Dan Carpenter 
Thanks, applied to drm-misc-next.
It was IMO not urgent.

Sam


Re: [PATCH v3] drm: of: Add drm_of_lvds_get_data_mapping

2021-10-14 Thread Sam Ravnborg
Hi Marek,

On Wed, Oct 13, 2021 at 12:42:52AM +0200, Marek Vasut wrote:
> Add helper function to convert DT "data-mapping" property string value
> into media bus format value, and deduplicate the code in panel-lvds.c
> and lvds-codec.c .
> 
> Signed-off-by: Marek Vasut 
> Cc: Laurent Pinchart 
> Cc: Sam Ravnborg 
> To: dri-devel@lists.freedesktop.org
> ---
> V2: Drop bogus semicolon
> V3: Add missing asterisk to return values per kerneldoc
> 
> https://www.kernel.org/doc/html/latest/doc-guide/kernel-doc.html#return-values

Thanks for the follow-up on this, applied to drm-misc-next.

Sam


Re: [PATCH] drm/panel-simple: Add Vivax TPC-9150 panel v6

2021-10-14 Thread Sam Ravnborg
Hi Nikola,
On Mon, Oct 11, 2021 at 11:27:31PM +0200, Nikola Pavlica wrote:
> The model and make of the LCD panel of the Vivax TPC-9150 is unknown,
> hence the panel settings that were retrieved with a FEX dump are named
> after the device NOT the actual panel.
> 
> The LCD in question is a 50 pin MISO TFT LCD panel of the resolution
> 1024x600 used by the aforementioned device.
> 
> Version 2, as Thierry kindly suggested that I fix the order in which the
> panel was ordered compared to others.
> 
> Version 3, filling in the required info suggested by Sam. Plus some
> factual issues that I've corrected myself (tested working)
> 
> Version 4, rearranged the display parameters and fix invalid bit format
> issue. (Thanks Sam)
> 
> Version 5, referred to FEX file instead of manual debugging for
> information.
> 
> Version 6, same as above. This time, it'll be documented.
Looks good now, applied to drm-misc-next.
There was an extra line I deleted while applying - checkpatch --strict
complained.
Patch will show up in -next in 1-2 weeks.

Sam


Re: [PATCH v1 2/2] mm: remove extra ZONE_DEVICE struct page refcount

2021-10-14 Thread Matthew Wilcox


It would probably help if you cc'd Dan on this.
As far as I know he's the only person left who cares about GUP on DAX.

On Thu, Oct 14, 2021 at 02:06:34PM -0300, Jason Gunthorpe wrote:
> On Thu, Oct 14, 2021 at 10:39:28AM -0500, Alex Sierra wrote:
> > From: Ralph Campbell 
> > 
> > ZONE_DEVICE struct pages have an extra reference count that complicates the
> > code for put_page() and several places in the kernel that need to check the
> > reference count to see that a page is not being used (gup, compaction,
> > migration, etc.). Clean up the code so the reference count doesn't need to
> > be treated specially for ZONE_DEVICE.
> > 
> > Signed-off-by: Ralph Campbell 
> > Signed-off-by: Alex Sierra 
> > Reviewed-by: Christoph Hellwig 
> > ---
> > v2:
> > AS: merged this patch in linux 5.11 version
> > 
> > v5:
> > AS: add condition at try_grab_page to check for the zone device type, while
> > page ref counter is checked less/equal to zero. In case of device zone, 
> > pages
> > ref counter are initialized to zero.
> > 
> > v7:
> > AS: fix condition at try_grab_page added at v5, is invalid. It supposed
> > to fix xfstests/generic/413 test, however, there's a known issue on
> > this test where DAX mapped area DIO to non-DAX expect to fail.
> > https://patchwork.kernel.org/project/fstests/patch/1489463960-3579-1-git-send-email-xz...@redhat.com
> > This condition was removed after rebase over patch series
> > https://lore.kernel.org/r/20210813044133.1536842-4-jhubb...@nvidia.com
> > ---
> >  arch/powerpc/kvm/book3s_hv_uvmem.c |  2 +-
> >  drivers/gpu/drm/nouveau/nouveau_dmem.c |  2 +-
> >  fs/dax.c   |  4 +-
> >  include/linux/dax.h|  2 +-
> >  include/linux/memremap.h   |  7 +--
> >  include/linux/mm.h | 11 
> >  lib/test_hmm.c |  2 +-
> >  mm/internal.h  |  8 +++
> >  mm/memcontrol.c|  6 +--
> >  mm/memremap.c  | 69 +++---
> >  mm/migrate.c   |  5 --
> >  mm/page_alloc.c|  3 ++
> >  mm/swap.c  | 45 ++---
> >  13 files changed, 46 insertions(+), 120 deletions(-)
> 
> Has anyone tested this with FSDAX? Does get_user_pages() on fsdax
> backed memory still work?
> 
> What refcount value does the struct pages have when they are installed
> in the PTEs? Remember a 0 refcount will make all the get_user_pages()
> fail.
> 
> I'm looking at the call path starting in ext4_punch_hole() and I would
> expect to see something manipulating the page ref count before
> the ext4_break_layouts() call path gets to the dax_page_unused() test.
> 
> All I see is we go into unmap_mapping_pages() - that would normally
> put back the page references held by PTEs but insert_pfn() has this:
> 
>   if (pfn_t_devmap(pfn))
>   entry = pte_mkdevmap(pfn_t_pte(pfn, prot));
> 
> And:
> 
> static inline pte_t pte_mkdevmap(pte_t pte)
> {
>   return pte_set_flags(pte, _PAGE_SPECIAL|_PAGE_DEVMAP);
> }
> 
> Which interacts with vm_normal_page():
> 
>   if (pte_devmap(pte))
>   return NULL;
> 
> To disable that refcounting?
> 
> So... I have a feeling this will have PTEs pointing to 0 refcount
> pages? Unless FSDAX is !pte_devmap which is not the case, right?
> 
> This seems further confirmed by this comment:
> 
>   /*
>* If we race get_user_pages_fast() here either we'll see the
>* elevated page count in the iteration and wait, or
>* get_user_pages_fast() will see that the page it took a reference
>* against is no longer mapped in the page tables and bail to the
>* get_user_pages() slow path.  The slow path is protected by
>* pte_lock() and pmd_lock(). New references are not taken without
>* holding those locks, and unmap_mapping_pages() will not zero the
>* pte or pmd without holding the respective lock, so we are
>* guaranteed to either see new references or prevent new
>* references from being established.
>*/
> 
> Which seems to explain this scheme relies on unmap_mapping_pages() to
> fence GUP_fast, not on GUP_fast observing 0 refcounts when it should
> stop.
> 
> This seems like it would be properly fixed by using normal page
> refcounting for PTEs - ie stop using special for these pages?
> 
> Does anyone know why devmap is pte_special anyhow?
> 
> > +void free_zone_device_page(struct page *page)
> > +{
> > +   switch (page->pgmap->type) {
> > +   case MEMORY_DEVICE_PRIVATE:
> > +   free_device_page(page);
> > +   return;
> > +   case MEMORY_DEVICE_FS_DAX:
> > +   /* notify page idle */
> > +   wake_up_var(>_refcount);
> > +   return;
> 
> It is not for this series, but I wonder if we should just always call
> ops->page_free and have free_device_page() logic in that callback 

Re: [PATCH v3 6/7] drm/kmb: Enable ADV bridge after modeset

2021-10-14 Thread Sam Ravnborg
Hi Anitha,

On Wed, Oct 13, 2021 at 04:36:31PM -0700, Anitha Chrisanthus wrote:
> On KMB, ADV bridge must be programmed and powered on prior to
> MIPI DSI HW initialization.
> 
> Fixes: 98521f4d4b4c ("drm/kmb: Mipi DSI part of the display driver")
> Signed-off-by: Anitha Chrisanthus 

> ---
>  drivers/gpu/drm/kmb/kmb_dsi.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/gpu/drm/kmb/kmb_dsi.c b/drivers/gpu/drm/kmb/kmb_dsi.c
> index a0669b842ff5..7ab6b7b44cbc 100644
> --- a/drivers/gpu/drm/kmb/kmb_dsi.c
> +++ b/drivers/gpu/drm/kmb/kmb_dsi.c
> @@ -1341,6 +1341,7 @@ static void connect_lcd_to_mipi(struct kmb_dsi *kmb_dsi)
>   return;
>   }
>  
> + drm_bridge_chain_enable(adv_bridge);

Do not use drm_bridge_chain_enable(). If you really need to do this hack
use the atomic variant. The one you use here is about to be deleted.

That said - I expect this to be a workaround for some design issue.
One bridge driver should not need to power on another bridge, the
infrastructure should take care.
I know we did not make kmb_dsi a proper bridge, but this may be the time
to do it so we avoid such a workaround and the driver will then be able
to use the bridge infrastructure as it is supposed to do.

Sam

>   /* DISABLE MIPI->CIF CONNECTION */
>   regmap_write(msscam, MSS_MIPI_CIF_CFG, 0);
>  
> -- 
> 2.25.1


Re: [PATCH 25/25] drm/i915/execlists: Weak parallel submission support for execlists

2021-10-14 Thread John Harrison

On 10/14/2021 10:20, Matthew Brost wrote:

A weak implementation of parallel submission (multi-bb execbuf IOCTL) for
execlists. Doing as little as possible to support this interface for
execlists - basically just passing submit fences between each request
generated and virtual engines are not allowed. This is on par with what
is there for the existing (hopefully soon deprecated) bonding interface.

We perma-pin these execlists contexts to align with GuC implementation.

Signed-off-by: Matthew Brost 
---
  drivers/gpu/drm/i915/gem/i915_gem_context.c   | 10 ++--
  drivers/gpu/drm/i915/gt/intel_context.c   |  4 +-
  .../drm/i915/gt/intel_execlists_submission.c  | 56 ++-
  drivers/gpu/drm/i915/gt/intel_lrc.c   |  2 +
  .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  2 -
  5 files changed, 64 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index fb33d0322960..35e87a7d0ea9 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -570,10 +570,6 @@ set_proto_ctx_engines_parallel_submit(struct 
i915_user_extension __user *base,
struct intel_engine_cs **siblings = NULL;
intel_engine_mask_t prev_mask;
  
-	/* FIXME: This is NIY for execlists */

-   if (!(intel_uc_uses_guc_submission(>gt.uc)))
-   return -ENODEV;
-
if (get_user(slot, >engine_index))
return -EFAULT;
  
@@ -583,6 +579,12 @@ set_proto_ctx_engines_parallel_submit(struct i915_user_extension __user *base,

if (get_user(num_siblings, >num_siblings))
return -EFAULT;
  
+	if (!intel_uc_uses_guc_submission(>gt.uc) && num_siblings != 1) {

+   drm_dbg(>drm, "Only 1 sibling (%d) supported in non-GuC 
mode\n",
+   num_siblings);
+   return -EINVAL;
+   }
+
if (slot >= set->num_engines) {
drm_dbg(>drm, "Invalid placement value, %d >= %d\n",
slot, set->num_engines);
diff --git a/drivers/gpu/drm/i915/gt/intel_context.c 
b/drivers/gpu/drm/i915/gt/intel_context.c
index 5634d14052bc..1bec92e1d8e6 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -79,7 +79,8 @@ static int intel_context_active_acquire(struct intel_context 
*ce)
  
  	__i915_active_acquire(>active);
  
-	if (intel_context_is_barrier(ce) || intel_engine_uses_guc(ce->engine))

+   if (intel_context_is_barrier(ce) || intel_engine_uses_guc(ce->engine) ||
+   intel_context_is_parallel(ce))
return 0;
  
  	/* Preallocate tracking nodes */

@@ -563,7 +564,6 @@ void intel_context_bind_parent_child(struct intel_context 
*parent,
 * Callers responsibility to validate that this function is used
 * correctly but we use GEM_BUG_ON here ensure that they do.
 */
-   GEM_BUG_ON(!intel_engine_uses_guc(parent->engine));
GEM_BUG_ON(intel_context_is_pinned(parent));
GEM_BUG_ON(intel_context_is_child(parent));
GEM_BUG_ON(intel_context_is_pinned(child));
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c 
b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index bedb80057046..8cd986bdf26c 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -927,8 +927,7 @@ static void execlists_submit_ports(struct intel_engine_cs 
*engine)
  
  static bool ctx_single_port_submission(const struct intel_context *ce)

  {
-   return (IS_ENABLED(CONFIG_DRM_I915_GVT) &&
-   intel_context_force_single_submission(ce));
+   return intel_context_force_single_submission(ce);
Does this change not affect all execlist operation rather than just 
parallel submission?



  }
  
  static bool can_merge_ctx(const struct intel_context *prev,

@@ -2598,6 +2597,58 @@ static void execlists_context_cancel_request(struct 
intel_context *ce,
  current->comm);
  }
  
+static struct intel_context *

+execlists_create_parallel(struct intel_engine_cs **engines,
+ unsigned int num_siblings,
+ unsigned int width)
+{
+   struct intel_engine_cs **siblings = NULL;
+   struct intel_context *parent = NULL, *ce, *err;
+   int i, j;
+
+   GEM_BUG_ON(num_siblings != 1);
+
+   siblings = kmalloc_array(num_siblings,
+sizeof(*siblings),
+GFP_KERNEL);
+   if (!siblings)
+   return ERR_PTR(-ENOMEM);
+
+   for (i = 0; i < width; ++i) {
+   for (j = 0; j < num_siblings; ++j)
+   siblings[j] = engines[i * num_siblings + j];
What is the purpose of this array? The only usage that I can see is 
siblings[0] on the line below. The rest of the entries never seem to be 
used. So you could just 

Re: [PATCH v3 5/7] drm/kmb: Corrected typo in handle_lcd_irq

2021-10-14 Thread Sam Ravnborg
Hi Anitha,

On Wed, Oct 13, 2021 at 04:36:30PM -0700, Anitha Chrisanthus wrote:
> Check for Overflow bits for layer3 in the irq handler.
> 
> Fixes: 7f7b96a8a0a1 ("drm/kmb: Add support for KeemBay Display")
> Signed-off-by: Anitha Chrisanthus 

Obvious fix,
Acked-by: Sam Ravnborg 


Re: [PATCH v3 4/7] drm/kmb: Disable change of plane parameters

2021-10-14 Thread Sam Ravnborg
Hi Anitha,

On Wed, Oct 13, 2021 at 04:36:29PM -0700, Anitha Chrisanthus wrote:
> From: Edmund Dea 
> 
> Due to HW limitations, KMB cannot change height, width, or
> pixel format after initial plane configuration.
> 
> v2: removed memset disp_cfg as it is already zero.
> 
> Fixes: 7f7b96a8a0a1 ("drm/kmb: Add support for KeemBay Display")
> Signed-off-by: Edmund Dea 
> Signed-off-by: Anitha Chrisanthus 

Looks fine and the s-o-b order is OK.
Acked-by: Sam Ravnborg 

Sam


Re: [PATCH v3 3/7] drm/kmb: Remove clearing DPHY regs

2021-10-14 Thread Sam Ravnborg
On Wed, Oct 13, 2021 at 04:36:28PM -0700, Anitha Chrisanthus wrote:
> From: Edmund Dea 
> 
> Don't clear the shared DPHY registers common to MIPI Rx and MIPI Tx during
> DSI initialization since this was causing MIPI Rx reset. Rest of the
> writes are bitwise, so will not affect Mipi Rx side.
> 
> Fixes: 98521f4d4b4c ("drm/kmb: Mipi DSI part of the display driver")
> Signed-off-by: Anitha Chrisanthus 
> Signed-off-by: Edmund Dea 

As Edmund is author, he should be listed first. The s-o-b lines
expressed the order.
With this fixed.
Acked-by: Sam Ravnborg 

> ---
>  drivers/gpu/drm/kmb/kmb_dsi.c | 5 -
>  1 file changed, 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/kmb/kmb_dsi.c b/drivers/gpu/drm/kmb/kmb_dsi.c
> index 86e8e7943e89..a0669b842ff5 100644
> --- a/drivers/gpu/drm/kmb/kmb_dsi.c
> +++ b/drivers/gpu/drm/kmb/kmb_dsi.c
> @@ -1393,11 +1393,6 @@ int kmb_dsi_mode_set(struct kmb_dsi *kmb_dsi, struct 
> drm_display_mode *mode,
>   mipi_tx_init_cfg.lane_rate_mbps = data_rate;
>   }
>  
> - kmb_write_mipi(kmb_dsi, DPHY_ENABLE, 0);
> - kmb_write_mipi(kmb_dsi, DPHY_INIT_CTRL0, 0);
> - kmb_write_mipi(kmb_dsi, DPHY_INIT_CTRL1, 0);
> - kmb_write_mipi(kmb_dsi, DPHY_INIT_CTRL2, 0);
> -
>   /* Initialize mipi controller */
>   mipi_tx_init_cntrl(kmb_dsi, _tx_init_cfg);
>  
> -- 
> 2.25.1


Re: [PATCH v3 2/7] drm/kmb: Limit supported mode to 1080p

2021-10-14 Thread Sam Ravnborg
On Wed, Oct 13, 2021 at 04:36:27PM -0700, Anitha Chrisanthus wrote:
> KMB only supports single resolution(1080p), this commit checks for
> 1920x1080x60 or 1920x1080x59 in crtc_mode_valid.
> Also, modes with vfp < 4 are not supported in KMB display. This change
> prunes display modes with vfp < 4.
> 
> v2: added vfp check
> 
> Fixes: 7f7b96a8a0a1 ("drm/kmb: Add support for KeemBay Display")
> Signed-off-by: Anitha Chrisanthus 
> Signed-off-by: Edmund Dea 

This looks wrong. I assume it Edmund also did some development work so
you should also add:
Co-developed-by: Edmund Dea 

One nit below. With these nits fixed:
Acked-by: Sam Ravnborg 

Sam

> ---
>  drivers/gpu/drm/kmb/kmb_crtc.c | 34 ++
>  drivers/gpu/drm/kmb/kmb_drv.h  | 13 ++---
>  2 files changed, 44 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/kmb/kmb_crtc.c b/drivers/gpu/drm/kmb/kmb_crtc.c
> index 44327bc629ca..08a45e813db7 100644
> --- a/drivers/gpu/drm/kmb/kmb_crtc.c
> +++ b/drivers/gpu/drm/kmb/kmb_crtc.c
> @@ -185,11 +185,45 @@ static void kmb_crtc_atomic_flush(struct drm_crtc *crtc,
>   spin_unlock_irq(>dev->event_lock);
>  }
>  
> +static enum drm_mode_status
> + kmb_crtc_mode_valid(struct drm_crtc *crtc,
> + const struct drm_display_mode *mode)
> +{
> + int refresh;
> + struct drm_device *dev = crtc->dev;
> + int vfp = mode->vsync_start - mode->vdisplay;
> +
> + if (mode->vdisplay < KMB_CRTC_MAX_HEIGHT) {
> + drm_dbg(dev, "height = %d less than %d",
> + mode->vdisplay, KMB_CRTC_MAX_HEIGHT);
> + return MODE_BAD_VVALUE;
> + }
> + if (mode->hdisplay < KMB_CRTC_MAX_WIDTH) {
> + drm_dbg(dev, "width = %d less than %d",
> + mode->hdisplay, KMB_CRTC_MAX_WIDTH);
> + return MODE_BAD_HVALUE;
> + }
> + refresh = drm_mode_vrefresh(mode);
> + if (refresh < KMB_MIN_VREFRESH || refresh > KMB_MAX_VREFRESH) {
> + drm_dbg(dev, "refresh = %d less than %d or greater than %d",
> + refresh, KMB_MIN_VREFRESH, KMB_MAX_VREFRESH);
> + return MODE_BAD;
> + }
> +
> + if (vfp < KMB_CRTC_MIN_VFP) {
> + drm_dbg(dev, "vfp = %d less than %d", vfp, KMB_CRTC_MIN_VFP);
> + return MODE_BAD;
> + }
> +
> + return MODE_OK;
> +}
> +
>  static const struct drm_crtc_helper_funcs kmb_crtc_helper_funcs = {
>   .atomic_begin = kmb_crtc_atomic_begin,
>   .atomic_enable = kmb_crtc_atomic_enable,
>   .atomic_disable = kmb_crtc_atomic_disable,
>   .atomic_flush = kmb_crtc_atomic_flush,
> + .mode_valid = kmb_crtc_mode_valid,
>  };
>  
>  int kmb_setup_crtc(struct drm_device *drm)
> diff --git a/drivers/gpu/drm/kmb/kmb_drv.h b/drivers/gpu/drm/kmb/kmb_drv.h
> index 69a62e2d03ff..d297218869e8 100644
> --- a/drivers/gpu/drm/kmb/kmb_drv.h
> +++ b/drivers/gpu/drm/kmb/kmb_drv.h
> @@ -18,13 +18,20 @@
>  
>  #define DRIVER_DATE  "20210223"
>  #define DRIVER_MAJOR 1
> -#define DRIVER_MINOR 1
> -
> +#define DRIVER_MINOR 2
This change looks un-related.

> +
> +/* Platform definitions */
> +#define KMB_CRTC_MIN_VFP 4
> +#define KMB_CRTC_MAX_WIDTH   1920 /* max width in pixels */
> +#define KMB_CRTC_MAX_HEIGHT  1080 /* max height in pixels */
> +#define KMB_CRTC_MIN_WIDTH   1920
> +#define KMB_CRTC_MIN_HEIGHT  1080
>  #define KMB_FB_MAX_WIDTH 1920
>  #define KMB_FB_MAX_HEIGHT1080
>  #define KMB_FB_MIN_WIDTH 1
>  #define KMB_FB_MIN_HEIGHT1
> -
> +#define KMB_MIN_VREFRESH 59/*vertical refresh in Hz */
> +#define KMB_MAX_VREFRESH 60/*vertical refresh in Hz */
>  #define KMB_LCD_DEFAULT_CLK  2
>  #define KMB_SYS_CLK_MHZ  500
>  
> -- 
> 2.25.1


Re: Revert "arm64: dts: qcom: sm8250: remove bus clock from the mdss node for sm8250 target"

2021-10-14 Thread Vladimir Zapolskiy

Hi Dmitry,

On 10/14/21 4:54 PM, Dmitry Baryshkov wrote:

From: Amit Pundir 

This reverts commit 001ce9785c0674d913531345e86222c965fc8bf4.

This upstream commit broke AOSP (post Android 12 merge) build
on RB5. The device either silently crashes into USB crash mode
after android boot animation or we see a blank blue screen
with following dpu errors in dmesg:

[  T444] hw recovery is not complete for ctl:3
[  T444] [drm:dpu_encoder_phys_vid_prepare_for_kickoff:539] [dpu error]enc31 
intf1 ctl 3 reset failure: -22
[  T444] [drm:dpu_encoder_phys_vid_wait_for_commit_done:513] [dpu error]vblank 
timeout
[  T444] [drm:dpu_kms_wait_for_commit_done:454] [dpu error]wait for commit done 
returned -110
[C7] [drm:dpu_encoder_frame_done_timeout:2127] [dpu error]enc31 frame done 
timeout
[  T444] [drm:dpu_encoder_phys_vid_wait_for_commit_done:513] [dpu error]vblank 
timeout
[  T444] [drm:dpu_kms_wait_for_commit_done:454] [dpu error]wait for commit done 
returned -110

Signed-off-by: Amit Pundir 


your sob tag is missing.


---
  arch/arm64/boot/dts/qcom/sm8250.dtsi | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/qcom/sm8250.dtsi 
b/arch/arm64/boot/dts/qcom/sm8250.dtsi
index 8c15d9fed08f..d12e4cbfc852 100644
--- a/arch/arm64/boot/dts/qcom/sm8250.dtsi
+++ b/arch/arm64/boot/dts/qcom/sm8250.dtsi
@@ -2590,9 +2590,10 @@
power-domains = < MDSS_GDSC>;
  
  			clocks = < DISP_CC_MDSS_AHB_CLK>,

+< GCC_DISP_HF_AXI_CLK>,
 < GCC_DISP_SF_AXI_CLK>,
 < DISP_CC_MDSS_MDP_CLK>;
-   clock-names = "iface", "nrt_bus", "core";
+   clock-names = "iface", "bus", "nrt_bus", "core";
  
  			assigned-clocks = < DISP_CC_MDSS_MDP_CLK>;

assigned-clock-rates = <46000>;



--
Best wishes,
Vladimir


Re: [PATCH v3 1/7] drm/kmb: Work around for higher system clock

2021-10-14 Thread Sam Ravnborg
Hi Anitha,

On Wed, Oct 13, 2021 at 04:36:26PM -0700, Anitha Chrisanthus wrote:
> Use a different value for system clock offset in the
> ppl/llp ratio calculations for clocks higher than 500 Mhz.
> 
> Fixes: 98521f4d4b4c ("drm/kmb: Mipi DSI part of the display driver")
> Signed-off-by: Anitha Chrisanthus 

Looks fine.
Acked-by: Sam Ravnborg 

Sam


Re: [PATCH v2 2/2] drm: panel-simple: Add support for the Innolux G070Y2-T02 panel

2021-10-14 Thread Sam Ravnborg
Hi Oleksij,

On Thu, Oct 14, 2021 at 11:52:02AM +0200, Oleksij Rempel wrote:
> Add compatible and timings for the Innolux G070Y2-T02 panel. It is 7"
> WVGA (800x480) TFT LCD panel with TTL interface and a backlight unit.
> 
> Co-Developed-by: Robin van der Gracht 
> Signed-off-by: Robin van der Gracht 
> Signed-off-by: Oleksij Rempel 

applied to drm-misc-next.
Note: The bindings patch was already applied and will show up in -next
soon.

This patch will show up in -next in 1-2 weeks.

Sam


Re: [PATCH] video: omapfb: Fix fall-through warning for Clang

2021-10-14 Thread Gustavo A. R. Silva
On Thu, Oct 14, 2021 at 08:26:52PM +0200, Sam Ravnborg wrote:
> Hi Gustavo,
> On Thu, Oct 14, 2021 at 11:53:20AM -0500, Gustavo A. R. Silva wrote:
> > Fix the following fallthrough warnings:
> > 
> > drivers/video/fbdev/omap/omapfb_main.c:1558:2: warning: unannotated 
> > fall-through between switch labels [-Wimplicit-fallthrough]
> >case 0:
> >^
> >drivers/video/fbdev/omap/omapfb_main.c:1558:2: note: insert 'break;' to 
> > avoid fall-through
> >case 0:
> >^
> >break;
> >1 warning generated.
> > 
> > This helps with the ongoing efforts to globally enable
> > -Wimplicit-fallthrough for Clang.
> > 
> > Link: https://github.com/KSPP/linux/issues/115
> > Link: https://lore.kernel.org/lkml/202110141005.hujaymei-...@intel.com/
> > Reported-by: kernel test robot 
> > Signed-off-by: Gustavo A. R. Silva 
> applied to drm-misc-next. It will show up in -next in 1-2 weeks time.

Great. :)

Thanks, Sam.
--
Gustavo


Re: [PATCH 24/25] drm/i915: Enable multi-bb execbuf

2021-10-14 Thread John Harrison

On 10/14/2021 10:20, Matthew Brost wrote:

Enable multi-bb execbuf by enabling the set_parallel extension.

Signed-off-by: Matthew Brost 

Reviewed-by: John Harrison 


---
  drivers/gpu/drm/i915/gem/i915_gem_context.c | 3 ---
  1 file changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 9a00f11fef46..fb33d0322960 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -570,9 +570,6 @@ set_proto_ctx_engines_parallel_submit(struct 
i915_user_extension __user *base,
struct intel_engine_cs **siblings = NULL;
intel_engine_mask_t prev_mask;
  
-	/* Disabling for now */

-   return -ENODEV;
-
/* FIXME: This is NIY for execlists */
if (!(intel_uc_uses_guc_submission(>gt.uc)))
return -ENODEV;




Re: [PATCH 20/25] drm/i915: Multi-BB execbuf

2021-10-14 Thread John Harrison

On 10/14/2021 10:20, Matthew Brost wrote:

Allow multiple batch buffers to be submitted in a single execbuf IOCTL
after a context has been configured with the 'set_parallel' extension.
The number batches is implicit based on the contexts configuration.

This is implemented with a series of loops. First a loop is used to find
all the batches, a loop to pin all the HW contexts, a loop to create all
the requests, a loop to submit (emit BB start, etc...) all the requests,
a loop to tie the requests to the VMAs they touch, and finally a loop to
commit the requests to the backend.

A composite fence is also created for the generated requests to return
to the user and to stick in dma resv slots.

No behavior from the existing IOCTL should be changed aside from when
throttling because the ring for a context is full. In this situation,
i915 will now wait while holding the object locks. This change was done
because the code is much simpler to wait while holding the locks and we
believe there isn't a huge benefit of dropping these locks. If this
proves false we can restructure the code to drop the locks during the
wait.

IGT: https://patchwork.freedesktop.org/patch/447008/?series=93071=1
media UMD: https://github.com/intel/media-driver/pull/1252

v2:
  (Matthew Brost)
   - Return proper error value if i915_request_create fails
v3:
  (John Harrison)
   - Add comment explaining create / add order loops + locking
   - Update commit message explaining different in IOCTL behavior
   - Line wrap some comments
   - eb_add_request returns void
   - Return -EINVAL rather triggering BUG_ON if cmd parser used
  (Checkpatch)
   - Check eb->batch_len[*current_batch]
v4:
  (CI)
   - Set batch len if passed if via execbuf args
   - Call __i915_request_skip after __i915_request_commit
  (Kernel test robot)
   - Initialize rq to NULL in eb_pin_timeline
v5:
  (John Harrison)
   - Fix typo in comments near bb order loops

Signed-off-by: Matthew Brost 

Reviewed-by: John Harrison 


---
  .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 783 --
  drivers/gpu/drm/i915/gt/intel_context.h   |   8 +-
  drivers/gpu/drm/i915/gt/intel_context_types.h |  10 +
  .../gpu/drm/i915/gt/uc/intel_guc_submission.c |   2 +
  drivers/gpu/drm/i915/i915_request.h   |   9 +
  drivers/gpu/drm/i915/i915_vma.c   |  21 +-
  drivers/gpu/drm/i915/i915_vma.h   |  13 +-
  7 files changed, 595 insertions(+), 251 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index c75afc8784e3..fc30856e81fa 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -246,17 +246,25 @@ struct i915_execbuffer {
struct drm_i915_gem_exec_object2 *exec; /** ioctl execobj[] */
struct eb_vma *vma;
  
-	struct intel_engine_cs *engine; /** engine to queue the request to */

+   struct intel_gt *gt; /* gt for the execbuf */
struct intel_context *context; /* logical state for the request */
struct i915_gem_context *gem_context; /** caller's context */
  
-	struct i915_request *request; /** our request to build */

-   struct eb_vma *batch; /** identity of the batch obj/vma */
+   /** our requests to build */
+   struct i915_request *requests[MAX_ENGINE_INSTANCE + 1];
+   /** identity of the batch obj/vma */
+   struct eb_vma *batches[MAX_ENGINE_INSTANCE + 1];
struct i915_vma *trampoline; /** trampoline used for chaining */
  
+	/** used for excl fence in dma_resv objects when > 1 BB submitted */

+   struct dma_fence *composite_fence;
+
/** actual size of execobj[] as we may extend it for the cmdparser */
unsigned int buffer_count;
  
+	/* number of batches in execbuf IOCTL */

+   unsigned int num_batches;
+
/** list of vma not yet bound during reservation phase */
struct list_head unbound;
  
@@ -283,7 +291,8 @@ struct i915_execbuffer {
  
  	u64 invalid_flags; /** Set of execobj.flags that are invalid */
  
-	u64 batch_len; /** Length of batch within object */

+   /** Length of batch within object */
+   u64 batch_len[MAX_ENGINE_INSTANCE + 1];
u32 batch_start_offset; /** Location within object of batch */
u32 batch_flags; /** Flags composed for emit_bb_start() */
struct intel_gt_buffer_pool_node *batch_pool; /** pool node for batch 
buffer */
@@ -301,14 +310,13 @@ struct i915_execbuffer {
  };
  
  static int eb_parse(struct i915_execbuffer *eb);

-static struct i915_request *eb_pin_engine(struct i915_execbuffer *eb,
- bool throttle);
+static int eb_pin_engine(struct i915_execbuffer *eb, bool throttle);
  static void eb_unpin_engine(struct i915_execbuffer *eb);
  
  static inline bool eb_use_cmdparser(const struct i915_execbuffer *eb)

  {
-   return intel_engine_requires_cmd_parser(eb->engine) ||
-   

Re: [PATCH] video: omapfb: Fix fall-through warning for Clang

2021-10-14 Thread Sam Ravnborg
Hi Gustavo,
On Thu, Oct 14, 2021 at 11:53:20AM -0500, Gustavo A. R. Silva wrote:
> Fix the following fallthrough warnings:
> 
> drivers/video/fbdev/omap/omapfb_main.c:1558:2: warning: unannotated 
> fall-through between switch labels [-Wimplicit-fallthrough]
>case 0:
>^
>drivers/video/fbdev/omap/omapfb_main.c:1558:2: note: insert 'break;' to 
> avoid fall-through
>case 0:
>^
>break;
>1 warning generated.
> 
> This helps with the ongoing efforts to globally enable
> -Wimplicit-fallthrough for Clang.
> 
> Link: https://github.com/KSPP/linux/issues/115
> Link: https://lore.kernel.org/lkml/202110141005.hujaymei-...@intel.com/
> Reported-by: kernel test robot 
> Signed-off-by: Gustavo A. R. Silva 
applied to drm-misc-next. It will show up in -next in 1-2 weeks time.
Thanks,
Sam


Re: [PATCH 16/25] drm/i915/guc: Connect UAPI to GuC multi-lrc interface

2021-10-14 Thread John Harrison

On 10/14/2021 10:19, Matthew Brost wrote:

Introduce 'set parallel submit' extension to connect UAPI to GuC
multi-lrc interface. Kernel doc in new uAPI should explain it all.

IGT: https://patchwork.freedesktop.org/patch/447008/?series=93071=1
media UMD: https://github.com/intel/media-driver/pull/1252

v2:
  (Daniel Vetter)
   - Add IGT link and placeholder for media UMD link
v3:
  (Kernel test robot)
   - Fix warning in unpin engines call
  (John Harrison)
   - Reword a bunch of the kernel doc
v4:
  (John Harrison)
   - Add comment why perma-pin is done after setting gem context
   - Update some comments / docs for proto contexts
v5:
  (John Harrison)
   - Rework perma-pin comment
   - Add BUG_IN if context is pinned when setting gem context

IN?



Cc: Tvrtko Ursulin 
Signed-off-by: Matthew Brost 

Reviewed-by: John Harrison 


---
  drivers/gpu/drm/i915/gem/i915_gem_context.c   | 230 +-
  .../gpu/drm/i915/gem/i915_gem_context_types.h |  16 +-
  drivers/gpu/drm/i915/gt/intel_context_types.h |   9 +-
  drivers/gpu/drm/i915/gt/intel_engine.h|  12 +-
  drivers/gpu/drm/i915/gt/intel_engine_cs.c |   6 +-
  .../drm/i915/gt/intel_execlists_submission.c  |   6 +-
  drivers/gpu/drm/i915/gt/selftest_execlists.c  |  12 +-
  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 114 -
  include/uapi/drm/i915_drm.h   | 131 ++
  9 files changed, 505 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index d225d3dd0b40..9a00f11fef46 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -556,9 +556,150 @@ set_proto_ctx_engines_bond(struct i915_user_extension 
__user *base, void *data)
return 0;
  }
  
+static int

+set_proto_ctx_engines_parallel_submit(struct i915_user_extension __user *base,
+ void *data)
+{
+   struct i915_context_engines_parallel_submit __user *ext =
+   container_of_user(base, typeof(*ext), base);
+   const struct set_proto_ctx_engines *set = data;
+   struct drm_i915_private *i915 = set->i915;
+   u64 flags;
+   int err = 0, n, i, j;
+   u16 slot, width, num_siblings;
+   struct intel_engine_cs **siblings = NULL;
+   intel_engine_mask_t prev_mask;
+
+   /* Disabling for now */
+   return -ENODEV;
+
+   /* FIXME: This is NIY for execlists */
+   if (!(intel_uc_uses_guc_submission(>gt.uc)))
+   return -ENODEV;
+
+   if (get_user(slot, >engine_index))
+   return -EFAULT;
+
+   if (get_user(width, >width))
+   return -EFAULT;
+
+   if (get_user(num_siblings, >num_siblings))
+   return -EFAULT;
+
+   if (slot >= set->num_engines) {
+   drm_dbg(>drm, "Invalid placement value, %d >= %d\n",
+   slot, set->num_engines);
+   return -EINVAL;
+   }
+
+   if (set->engines[slot].type != I915_GEM_ENGINE_TYPE_INVALID) {
+   drm_dbg(>drm,
+   "Invalid placement[%d], already occupied\n", slot);
+   return -EINVAL;
+   }
+
+   if (get_user(flags, >flags))
+   return -EFAULT;
+
+   if (flags) {
+   drm_dbg(>drm, "Unknown flags 0x%02llx", flags);
+   return -EINVAL;
+   }
+
+   for (n = 0; n < ARRAY_SIZE(ext->mbz64); n++) {
+   err = check_user_mbz(>mbz64[n]);
+   if (err)
+   return err;
+   }
+
+   if (width < 2) {
+   drm_dbg(>drm, "Width (%d) < 2\n", width);
+   return -EINVAL;
+   }
+
+   if (num_siblings < 1) {
+   drm_dbg(>drm, "Number siblings (%d) < 1\n",
+   num_siblings);
+   return -EINVAL;
+   }
+
+   siblings = kmalloc_array(num_siblings * width,
+sizeof(*siblings),
+GFP_KERNEL);
+   if (!siblings)
+   return -ENOMEM;
+
+   /* Create contexts / engines */
+   for (i = 0; i < width; ++i) {
+   intel_engine_mask_t current_mask = 0;
+   struct i915_engine_class_instance prev_engine;
+
+   for (j = 0; j < num_siblings; ++j) {
+   struct i915_engine_class_instance ci;
+
+   n = i * num_siblings + j;
+   if (copy_from_user(, >engines[n], sizeof(ci))) {
+   err = -EFAULT;
+   goto out_err;
+   }
+
+   siblings[n] =
+   intel_engine_lookup_user(i915, ci.engine_class,
+ci.engine_instance);
+   if (!siblings[n]) {
+   drm_dbg(>drm,
+   

Re: [PATCH 5/7] drm/meson: encoder_hdmi: switch to bridge DRM_BRIDGE_ATTACH_NO_CONNECTOR

2021-10-14 Thread Sam Ravnborg
Hi Neil,

On Thu, Oct 14, 2021 at 05:26:04PM +0200, Neil Armstrong wrote:
> This implements the necessary change to no more use the embedded
> connector in dw-hdmi and use the dedicated bridge connector driver
> by passing DRM_BRIDGE_ATTACH_NO_CONNECTOR to the bridge attach call.
> 
> The necessary connector properties are added to handle the same
> functionalities as the embedded dw-hdmi connector, i.e. the HDR
> metadata, the CEC notifier & other flags.
> 
> The dw-hdmi output_port is set to 1 in order to look for a connector
> next bridge in order to get DRM_BRIDGE_ATTACH_NO_CONNECTOR working.
> 
> Signed-off-by: Neil Armstrong 
Look fine,
Acked-by: Sam Ravnborg 



Re: [PATCH 08/25] drm/i915/guc: Add multi-lrc context registration

2021-10-14 Thread John Harrison

On 10/14/2021 10:19, Matthew Brost wrote:

Add multi-lrc context registration H2G. In addition a workqueue and
process descriptor are setup during multi-lrc context registration as
these data structures are needed for multi-lrc submission.

v2:
  (John Harrison)
   - Move GuC specific fields into sub-struct
   - Clean up WQ defines
   - Add comment explaining math to derive WQ / PD address
v3:
  (John Harrison)
   - Add PARENT_SCRATCH_SIZE define
   - Update comment explaining multi-lrc register
v4:
  (John Harrison)
   - Move PARENT_SCRATCH_SIZE to common file

Signed-off-by: Matthew Brost 
---
  drivers/gpu/drm/i915/gt/intel_context.h   |   2 +
  drivers/gpu/drm/i915/gt/intel_context_types.h |  12 ++
  drivers/gpu/drm/i915/gt/intel_lrc.c   |   5 +
  .../gpu/drm/i915/gt/uc/abi/guc_actions_abi.h  |   1 +
  drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   |   2 -
  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 115 +-
  6 files changed, 134 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context.h 
b/drivers/gpu/drm/i915/gt/intel_context.h
index b63c10a144af..9f0995150a7a 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.h
+++ b/drivers/gpu/drm/i915/gt/intel_context.h
@@ -44,6 +44,8 @@ void intel_context_free(struct intel_context *ce);
  int intel_context_reconfigure_sseu(struct intel_context *ce,
   const struct intel_sseu sseu);
  
+#define PARENT_SCRATCH_SIZE	PAGE_SIZE
Would have been nice to have a comment. At least something like 'For 
multi-LRC submission, see uc/intel_guc_submission.c for details'. But 
the description is there in the other file for those who want to look. 
So either way:

Reviewed-by: John Harrison 



+
  static inline bool intel_context_is_child(struct intel_context *ce)
  {
return !!ce->parallel.parent;
diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h 
b/drivers/gpu/drm/i915/gt/intel_context_types.h
index 76dfca57cb45..48decb5ee954 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -239,6 +239,18 @@ struct intel_context {
struct intel_context *parent;
/** @number_children: number of children if parent */
u8 number_children;
+   /** @guc: GuC specific members for parallel submission */
+   struct {
+   /** @wqi_head: head pointer in work queue */
+   u16 wqi_head;
+   /** @wqi_tail: tail pointer in work queue */
+   u16 wqi_tail;
+   /**
+* @parent_page: page in context state (ce->state) used
+* by parent for work queue, process descriptor
+*/
+   u8 parent_page;
+   } guc;
} parallel;
  
  #ifdef CONFIG_DRM_I915_SELFTEST

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 3ef9eaf8c50e..56156cf18c41 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -942,6 +942,11 @@ __lrc_alloc_state(struct intel_context *ce, struct 
intel_engine_cs *engine)
context_size += PAGE_SIZE;
}
  
+	if (intel_context_is_parent(ce) && intel_engine_uses_guc(engine)) {

+   ce->parallel.guc.parent_page = context_size / PAGE_SIZE;
+   context_size += PARENT_SCRATCH_SIZE;
+   }
+
obj = i915_gem_object_create_lmem(engine->i915, context_size,
  I915_BO_ALLOC_PM_VOLATILE);
if (IS_ERR(obj))
diff --git a/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_abi.h 
b/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_abi.h
index 8ff58aff..ba10bd374cee 100644
--- a/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_abi.h
+++ b/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_abi.h
@@ -142,6 +142,7 @@ enum intel_guc_action {
INTEL_GUC_ACTION_REGISTER_COMMAND_TRANSPORT_BUFFER = 0x4505,
INTEL_GUC_ACTION_DEREGISTER_COMMAND_TRANSPORT_BUFFER = 0x4506,
INTEL_GUC_ACTION_DEREGISTER_CONTEXT_DONE = 0x4600,
+   INTEL_GUC_ACTION_REGISTER_CONTEXT_MULTI_LRC = 0x4601,
INTEL_GUC_ACTION_RESET_CLIENT = 0x5507,
INTEL_GUC_ACTION_LIMIT
  };
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
index fa4be13c8854..0eeb2a9feeed 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
@@ -52,8 +52,6 @@
  
  #define GUC_DOORBELL_INVALID		256
  
-#define GUC_WQ_SIZE			(PAGE_SIZE * 2)

-
  /* Work queue item header definitions */
  #define WQ_STATUS_ACTIVE  1
  #define WQ_STATUS_SUSPENDED   2
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index fd6594bc1b96..d9f5be00e586 100644
--- 

Re: [PATCH 6/7] drm/meson: rename venc_cvbs to encoder_cvbs

2021-10-14 Thread Sam Ravnborg
On Thu, Oct 14, 2021 at 05:26:05PM +0200, Neil Armstrong wrote:
> Rename the cvbs encoder to match the newly introduced meson_encoder_hdmi.
> 
> Signed-off-by: Neil Armstrong 
Acked-by: Sam Ravnborg 



Re: [PATCH 7/7] drm/meson: encoder_cvbs: switch to bridge with ATTACH_NO_CONNECTOR

2021-10-14 Thread Sam Ravnborg
Hi Neil,

with include order fixed and the comment below considered:
Acked-by: Sam Ravnborg 

Sam


On Thu, Oct 14, 2021 at 05:26:06PM +0200, Neil Armstrong wrote:
> Drop the local connector and move all callback to bridge funcs in order
> to leverage the generic CVBS diplay connector.
> 
> This will also permit adding custom cvbs connectors for ADC based HPD
> detection on some Amlogic SoC based boards.
> 
> Signed-off-by: Neil Armstrong 
> ---
>  drivers/gpu/drm/meson/meson_encoder_cvbs.c | 178 +
>  1 file changed, 79 insertions(+), 99 deletions(-)
> 
> diff --git a/drivers/gpu/drm/meson/meson_encoder_cvbs.c 
> b/drivers/gpu/drm/meson/meson_encoder_cvbs.c
> index 01024c5f610c..0b974667cf55 100644
> --- a/drivers/gpu/drm/meson/meson_encoder_cvbs.c
> +++ b/drivers/gpu/drm/meson/meson_encoder_cvbs.c
> @@ -13,6 +13,9 @@
>  #include 
>  
>  #include 
> +#include 
> +#include 
> +#include 
>  #include 
>  #include 
>  #include 
alphabetic order.

> @@ -30,14 +33,14 @@
>  
>  struct meson_encoder_cvbs {
>   struct drm_encoder  encoder;
> - struct drm_connectorconnector;
> + struct drm_bridge   bridge;
> + struct drm_bridge   *next_bridge;
> + struct drm_connector*connector;
Maybe drop this - see later.

>   struct meson_drm*priv;
>  };
> -#define encoder_to_meson_encoder_cvbs(x) \
> - container_of(x, struct meson_encoder_cvbs, encoder)
>  
> -#define connector_to_meson_encoder_cvbs(x) \
> - container_of(x, struct meson_encoder_cvbs, connector)
> +#define bridge_to_meson_encoder_cvbs(x) \
> + container_of(x, struct meson_encoder_cvbs, bridge)
>  
>  /* Supported Modes */
>  
> @@ -81,21 +84,18 @@ meson_cvbs_get_mode(const struct drm_display_mode 
> *req_mode)
>   return NULL;
>  }
>  
> -/* Connector */
> -
> -static void meson_cvbs_connector_destroy(struct drm_connector *connector)
> +static int meson_encoder_cvbs_attach(struct drm_bridge *bridge,
> +  enum drm_bridge_attach_flags flags)
>  {
> - drm_connector_cleanup(connector);
> -}
> + struct meson_encoder_cvbs *meson_encoder_cvbs =
> + bridge_to_meson_encoder_cvbs(bridge);
>  
> -static enum drm_connector_status
> -meson_cvbs_connector_detect(struct drm_connector *connector, bool force)
> -{
> - /* FIXME: Add load-detect or jack-detect if possible */
> - return connector_status_connected;
> + return drm_bridge_attach(bridge->encoder, 
> meson_encoder_cvbs->next_bridge,
> +  _encoder_cvbs->bridge, flags);
>  }
>  
> -static int meson_cvbs_connector_get_modes(struct drm_connector *connector)
> +static int meson_encoder_cvbs_get_modes(struct drm_bridge *bridge,
> + struct drm_connector *connector)
>  {
>   struct drm_device *dev = connector->dev;
>   struct drm_display_mode *mode;
> @@ -116,40 +116,18 @@ static int meson_cvbs_connector_get_modes(struct 
> drm_connector *connector)
>   return i;
>  }
>  
> -static int meson_cvbs_connector_mode_valid(struct drm_connector *connector,
> -struct drm_display_mode *mode)
> +static int meson_encoder_cvbs_mode_valid(struct drm_bridge *bridge,
> + const struct drm_display_info 
> *display_info,
> + const struct drm_display_mode *mode)
>  {
> - /* Validate the modes added in get_modes */
> - return MODE_OK;
> -}
> -
> -static const struct drm_connector_funcs meson_cvbs_connector_funcs = {
> - .detect = meson_cvbs_connector_detect,
> - .fill_modes = drm_helper_probe_single_connector_modes,
> - .destroy= meson_cvbs_connector_destroy,
> - .reset  = drm_atomic_helper_connector_reset,
> - .atomic_duplicate_state = drm_atomic_helper_connector_duplicate_state,
> - .atomic_destroy_state   = drm_atomic_helper_connector_destroy_state,
> -};
> -
> -static const
> -struct drm_connector_helper_funcs meson_cvbs_connector_helper_funcs = {
> - .get_modes  = meson_cvbs_connector_get_modes,
> - .mode_valid = meson_cvbs_connector_mode_valid,
> -};
> + if (meson_cvbs_get_mode(mode))
> + return MODE_OK;
>  
> -/* Encoder */
> -
> -static void meson_encoder_cvbs_encoder_destroy(struct drm_encoder *encoder)
> -{
> - drm_encoder_cleanup(encoder);
> + return MODE_BAD;
>  }
>  
> -static const struct drm_encoder_funcs meson_encoder_cvbs_encoder_funcs = {
> - .destroy= meson_encoder_cvbs_encoder_destroy,
> -};
> -
> -static int meson_encoder_cvbs_encoder_atomic_check(struct drm_encoder 
> *encoder,
> +static int meson_encoder_cvbs_atomic_check(struct drm_bridge *bridge,
> + struct drm_bridge_state *bridge_state,
>   struct drm_crtc_state *crtc_state,
>   

Re: [PATCH 4/7] drm/bridge: synopsys: dw-hdmi: also allow interlace on bridge

2021-10-14 Thread Sam Ravnborg
On Thu, Oct 14, 2021 at 05:26:03PM +0200, Neil Armstrong wrote:
> Since we allow interlace on the encoder, also allow it on the bridge
> so we can allow interlaced modes when using DRM_BRIDGE_ATTACH_NO_CONNECTOR.
> 
> Signed-off-by: Neil Armstrong 
Acked-by: Sam Ravnborg 
> ---
>  drivers/gpu/drm/bridge/synopsys/dw-hdmi.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c 
> b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
> index f08d0fded61f..62ae63565d3a 100644
> --- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
> +++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
> @@ -3413,6 +3413,7 @@ struct dw_hdmi *dw_hdmi_probe(struct platform_device 
> *pdev,
>   hdmi->bridge.funcs = _hdmi_bridge_funcs;
>   hdmi->bridge.ops = DRM_BRIDGE_OP_DETECT | DRM_BRIDGE_OP_EDID
>| DRM_BRIDGE_OP_HPD;
> + hdmi->bridge.interlace_allowed = true;
>  #ifdef CONFIG_OF
>   hdmi->bridge.of_node = pdev->dev.of_node;
>  #endif
> -- 
> 2.25.1


Re: [PATCH 3/7] drm/meson: split out encoder from meson_dw_hdmi

2021-10-14 Thread Sam Ravnborg
Hi Neil,

I did not verify all the code movements - but it looked correct from a
quick glance.
A few comments below, especially the use of mode_set() should be
addressed as it is deprecated.

Sam

On Thu, Oct 14, 2021 at 05:26:02PM +0200, Neil Armstrong wrote:
> This moves all the non-DW-HDMI code where it should be:
> an encoder in the drm/meson core driver.
> 
> The bridge functions are copied as-is, the encoder init uses the
> simple kms helper and for now the bridge attach flags is 0.
> 
> The meson dw-hdmi glue is slighly fixed to live without the
> encoder in the same driver.
> 
> Signed-off-by: Neil Armstrong 
> ---
>  drivers/gpu/drm/meson/Makefile |   1 +
>  drivers/gpu/drm/meson/meson_drv.c  |   5 +
>  drivers/gpu/drm/meson/meson_dw_hdmi.c  | 341 ++-
>  drivers/gpu/drm/meson/meson_encoder_hdmi.c | 359 +
>  drivers/gpu/drm/meson/meson_encoder_hdmi.h |  12 +
>  5 files changed, 396 insertions(+), 322 deletions(-)
>  create mode 100644 drivers/gpu/drm/meson/meson_encoder_hdmi.c
>  create mode 100644 drivers/gpu/drm/meson/meson_encoder_hdmi.h
> 
> diff --git a/drivers/gpu/drm/meson/Makefile b/drivers/gpu/drm/meson/Makefile
> index 28a519cdf66b..523fce45f16b 100644
> --- a/drivers/gpu/drm/meson/Makefile
> +++ b/drivers/gpu/drm/meson/Makefile
> @@ -2,6 +2,7 @@
>  meson-drm-y := meson_drv.o meson_plane.o meson_crtc.o meson_venc_cvbs.o
>  meson-drm-y += meson_viu.o meson_vpp.o meson_venc.o meson_vclk.o 
> meson_overlay.o
>  meson-drm-y += meson_rdma.o meson_osd_afbcd.o
> +meson-drm-y += meson_encoder_hdmi.o
>  
>  obj-$(CONFIG_DRM_MESON) += meson-drm.o
>  obj-$(CONFIG_DRM_MESON_DW_HDMI) += meson_dw_hdmi.o
> diff --git a/drivers/gpu/drm/meson/meson_drv.c 
> b/drivers/gpu/drm/meson/meson_drv.c
> index b53606d8825f..0a28a8e29d63 100644
> --- a/drivers/gpu/drm/meson/meson_drv.c
> +++ b/drivers/gpu/drm/meson/meson_drv.c
> @@ -32,6 +32,7 @@
>  #include "meson_osd_afbcd.h"
>  #include "meson_registers.h"
>  #include "meson_venc_cvbs.h"
> +#include "meson_encoder_hdmi.h"
>  #include "meson_viu.h"
>  #include "meson_vpp.h"
>  #include "meson_rdma.h"
> @@ -319,6 +320,10 @@ static int meson_drv_bind_master(struct device *dev, 
> bool has_components)
>   }
>   }
>  
> + ret = meson_encoder_hdmi_init(priv);
> + if (ret)
> + goto free_drm;
> +
>   ret = meson_plane_create(priv);
>   if (ret)
>   goto free_drm;
> diff --git a/drivers/gpu/drm/meson/meson_dw_hdmi.c 
> b/drivers/gpu/drm/meson/meson_dw_hdmi.c
> index 2ed87cfdd735..c2480b3335ac 100644
> --- a/drivers/gpu/drm/meson/meson_dw_hdmi.c
> +++ b/drivers/gpu/drm/meson/meson_dw_hdmi.c
> @@ -22,14 +22,11 @@
>  #include 
>  #include 
>  
> -#include 
>  #include 
>  
>  #include "meson_drv.h"
>  #include "meson_dw_hdmi.h"
>  #include "meson_registers.h"
> -#include "meson_vclk.h"
> -#include "meson_venc.h"
>  
>  #define DRIVER_NAME "meson-dw-hdmi"
>  #define DRIVER_DESC "Amlogic Meson HDMI-TX DRM driver"
> @@ -135,8 +132,6 @@ struct meson_dw_hdmi_data {
>  };
>  
>  struct meson_dw_hdmi {
> - struct drm_encoder encoder;
> - struct drm_bridge bridge;
>   struct dw_hdmi_plat_data dw_plat_data;
>   struct meson_drm *priv;
>   struct device *dev;
> @@ -148,12 +143,8 @@ struct meson_dw_hdmi {
>   struct regulator *hdmi_supply;
>   u32 irq_stat;
>   struct dw_hdmi *hdmi;
> - unsigned long output_bus_fmt;
> + struct drm_bridge *bridge;
>  };
> -#define encoder_to_meson_dw_hdmi(x) \
> - container_of(x, struct meson_dw_hdmi, encoder)
> -#define bridge_to_meson_dw_hdmi(x) \
> - container_of(x, struct meson_dw_hdmi, bridge)
>  
>  static inline int dw_hdmi_is_compatible(struct meson_dw_hdmi *dw_hdmi,
>   const char *compat)
> @@ -295,14 +286,14 @@ static inline void dw_hdmi_dwc_write_bits(struct 
> meson_dw_hdmi *dw_hdmi,
>  
>  /* Setup PHY bandwidth modes */
>  static void meson_hdmi_phy_setup_mode(struct meson_dw_hdmi *dw_hdmi,
> -   const struct drm_display_mode *mode)
> +   const struct drm_display_mode *mode,
> +   bool mode_is_420)
>  {
>   struct meson_drm *priv = dw_hdmi->priv;
>   unsigned int pixel_clock = mode->clock;
>  
>   /* For 420, pixel clock is half unlike venc clock */
> - if (dw_hdmi->output_bus_fmt == MEDIA_BUS_FMT_UYYVYY8_0_5X24)
> - pixel_clock /= 2;
> + if (mode_is_420) pixel_clock /= 2;
>  
>   if (dw_hdmi_is_compatible(dw_hdmi, "amlogic,meson-gxl-dw-hdmi") ||
>   dw_hdmi_is_compatible(dw_hdmi, "amlogic,meson-gxm-dw-hdmi")) {
> @@ -374,68 +365,25 @@ static inline void meson_dw_hdmi_phy_reset(struct 
> meson_dw_hdmi *dw_hdmi)
>   mdelay(2);
>  }
>  
> -static void dw_hdmi_set_vclk(struct meson_dw_hdmi *dw_hdmi,
> -  const struct drm_display_mode *mode)
> -{
> - struct meson_drm 

Re: [PATCH v1 2/2] mm: remove extra ZONE_DEVICE struct page refcount

2021-10-14 Thread Jason Gunthorpe
On Thu, Oct 14, 2021 at 10:35:27AM -0700, Ralph Campbell wrote:

> I ran xfstests-dev using the kernel boot option to "fake" a pmem device
> when I first posted this patch. The tests ran OK (or at least the same
> tests passed with and without my patch). 

Hmm. I know nothing of xfstests but 

tests/generic/413

Looks kind of like it might cover this situation?

Did it run for you?

Jason


Re: [PATCH 2/7] drm/meson: remove useless recursive components matching

2021-10-14 Thread Sam Ravnborg
Hi Neil,

one comment below. Other than that
Acked-by: Sam Ravnborg 

Sam

On Thu, Oct 14, 2021 at 05:26:01PM +0200, Neil Armstrong wrote:
> The initial design was recursive to cover all port/endpoints, but only the 
> first layer
> of endpoints should be covered by the components list.
> This also breaks the MIPI-DSI init/bridge attach sequence, thus only parse the
> first endpoints instead of recursing.
> 
> Signed-off-by: Neil Armstrong 
> ---
>  drivers/gpu/drm/meson/meson_drv.c | 62 +++
>  1 file changed, 21 insertions(+), 41 deletions(-)
> 
> diff --git a/drivers/gpu/drm/meson/meson_drv.c 
> b/drivers/gpu/drm/meson/meson_drv.c
> index bc0d60df04ae..b53606d8825f 100644
> --- a/drivers/gpu/drm/meson/meson_drv.c
> +++ b/drivers/gpu/drm/meson/meson_drv.c
> @@ -427,46 +427,6 @@ static int compare_of(struct device *dev, void *data)
>   return dev->of_node == data;
>  }
>  
> -/* Possible connectors nodes to ignore */
> -static const struct of_device_id connectors_match[] = {
> - { .compatible = "composite-video-connector" },
> - { .compatible = "svideo-connector" },
> - { .compatible = "hdmi-connector" },
> - { .compatible = "dvi-connector" },
> - {}
> -};
> -
> -static int meson_probe_remote(struct platform_device *pdev,
> -   struct component_match **match,
> -   struct device_node *parent,
> -   struct device_node *remote)
> -{
> - struct device_node *ep, *remote_node;
> - int count = 1;
> -
> - /* If node is a connector, return and do not add to match table */
> - if (of_match_node(connectors_match, remote))
> - return 1;
> -
> - component_match_add(>dev, match, compare_of, remote);
> -
> - for_each_endpoint_of_node(remote, ep) {
> - remote_node = of_graph_get_remote_port_parent(ep);
> - if (!remote_node ||
> - remote_node == parent || /* Ignore parent endpoint */
> - !of_device_is_available(remote_node)) {
> - of_node_put(remote_node);
> - continue;
> - }
> -
> - count += meson_probe_remote(pdev, match, remote, remote_node);
> -
> - of_node_put(remote_node);
> - }
> -
> - return count;
> -}
> -
>  static void meson_drv_shutdown(struct platform_device *pdev)
>  {
>   struct meson_drm *priv = dev_get_drvdata(>dev);
> @@ -478,6 +438,13 @@ static void meson_drv_shutdown(struct platform_device 
> *pdev)
>   drm_atomic_helper_shutdown(priv->drm);
>  }
>  
> +/* Possible connectors nodes to ignore */
> +static const struct of_device_id connectors_match[] = {
> + { .compatible = "composite-video-connector" },
> + { .compatible = "svideo-connector" },
> + {}
> +};
> +
>  static int meson_drv_probe(struct platform_device *pdev)
>  {
>   struct component_match *match = NULL;
> @@ -492,8 +459,21 @@ static int meson_drv_probe(struct platform_device *pdev)
>   continue;
>   }
>  
> - count += meson_probe_remote(pdev, , np, remote);
> + /* If an analog connector is detected, count it as an output */
> + if (of_match_node(connectors_match, remote)) {
> + ++count;
> + of_node_put(remote);
> + continue;
> + }
> +
> + DRM_DEBUG_DRIVER("parent %pOF remote match add %pOF parent 
> %s\n",
> +   np, remote, dev_name(>dev));
Avoid the deprecated logging functions.
Use drm_dbg() or if there is no drm_device just dev_dbg().

I assume the driver uses DRM_xxx all over, so I understand if you keep
it as-is.

> +
> + component_match_add(>dev, , compare_of, remote);
> +
>   of_node_put(remote);
> +
> + ++count;
>   }
>  
>   if (count && !match)
> -- 
> 2.25.1


[PATCH 13/25] drm/i915/guc: Insert submit fences between requests in parent-child relationship

2021-10-14 Thread Matthew Brost
The GuC must receive requests in the order submitted for contexts in a
parent-child relationship to function correctly. To ensure this, insert
a submit fence between the current request and last request submitted
for requests / contexts in a parent child relationship. This is
conceptually similar to a single timeline.

Signed-off-by: Matthew Brost 
Cc: John Harrison 
Reviewed-by: John Harrison 
---
 drivers/gpu/drm/i915/gt/intel_context.h   |   5 +
 drivers/gpu/drm/i915/gt/intel_context_types.h |   6 +
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c |   5 +-
 drivers/gpu/drm/i915/i915_request.c   | 120 ++
 4 files changed, 108 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context.h 
b/drivers/gpu/drm/i915/gt/intel_context.h
index 9f0995150a7a..edf12caaade3 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.h
+++ b/drivers/gpu/drm/i915/gt/intel_context.h
@@ -77,6 +77,11 @@ intel_context_to_parent(struct intel_context *ce)
}
 }
 
+static inline bool intel_context_is_parallel(struct intel_context *ce)
+{
+   return intel_context_is_child(ce) || intel_context_is_parent(ce);
+}
+
 void intel_context_bind_parent_child(struct intel_context *parent,
 struct intel_context *child);
 
diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h 
b/drivers/gpu/drm/i915/gt/intel_context_types.h
index 48decb5ee954..8309d1141d0a 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -237,6 +237,12 @@ struct intel_context {
};
/** @parent: pointer to parent if child */
struct intel_context *parent;
+   /**
+* @last_rq: last request submitted on a parallel context, used
+* to insert submit fences between requests in the parallel
+* context
+*/
+   struct i915_request *last_rq;
/** @number_children: number of children if parent */
u8 number_children;
/** @guc: GuC specific members for parallel submission */
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 71ae5eb69849..ebb64fb50396 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -684,8 +684,7 @@ static inline int rq_prio(const struct i915_request *rq)
 
 static bool is_multi_lrc_rq(struct i915_request *rq)
 {
-   return intel_context_is_child(rq->context) ||
-   intel_context_is_parent(rq->context);
+   return intel_context_is_parallel(rq->context);
 }
 
 static bool can_merge_rq(struct i915_request *rq,
@@ -2873,6 +2872,8 @@ static void guc_parent_context_unpin(struct intel_context 
*ce)
GEM_BUG_ON(!intel_context_is_parent(ce));
GEM_BUG_ON(!intel_engine_is_virtual(ce->engine));
 
+   if (ce->parallel.last_rq)
+   i915_request_put(ce->parallel.last_rq);
unpin_guc_id(guc, ce);
lrc_unpin(ce);
 }
diff --git a/drivers/gpu/drm/i915/i915_request.c 
b/drivers/gpu/drm/i915/i915_request.c
index c0d27072c28d..8bdf9f2f9b90 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -1525,36 +1525,62 @@ i915_request_await_object(struct i915_request *to,
return ret;
 }
 
+static inline bool is_parallel_rq(struct i915_request *rq)
+{
+   return intel_context_is_parallel(rq->context);
+}
+
+static inline struct intel_context *request_to_parent(struct i915_request *rq)
+{
+   return intel_context_to_parent(rq->context);
+}
+
 static struct i915_request *
-__i915_request_add_to_timeline(struct i915_request *rq)
+__i915_request_ensure_parallel_ordering(struct i915_request *rq,
+   struct intel_timeline *timeline)
 {
-   struct intel_timeline *timeline = i915_request_timeline(rq);
struct i915_request *prev;
 
-   /*
-* Dependency tracking and request ordering along the timeline
-* is special cased so that we can eliminate redundant ordering
-* operations while building the request (we know that the timeline
-* itself is ordered, and here we guarantee it).
-*
-* As we know we will need to emit tracking along the timeline,
-* we embed the hooks into our request struct -- at the cost of
-* having to have specialised no-allocation interfaces (which will
-* be beneficial elsewhere).
-*
-* A second benefit to open-coding i915_request_await_request is
-* that we can apply a slight variant of the rules specialised
-* for timelines that jump between engines (such as virtual engines).
-* If we consider the case of virtual engine, we must emit a dma-fence
-* to prevent scheduling of the second request until the first is
-* 

[PATCH 18/25] drm/i915/guc: Add basic GuC multi-lrc selftest

2021-10-14 Thread Matthew Brost
Add very basic (single submission) multi-lrc selftest.

Signed-off-by: Matthew Brost 
Reviewed-by: John Harrison 
---
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c |   1 +
 .../drm/i915/gt/uc/selftest_guc_multi_lrc.c   | 179 ++
 .../drm/i915/selftests/i915_live_selftests.h  |   1 +
 3 files changed, 181 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/gt/uc/selftest_guc_multi_lrc.c

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 7c12364a017a..57eb5f8bc8bb 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -3954,4 +3954,5 @@ bool intel_guc_virtual_engine_has_heartbeat(const struct 
intel_engine_cs *ve)
 
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
 #include "selftest_guc.c"
+#include "selftest_guc_multi_lrc.c"
 #endif
diff --git a/drivers/gpu/drm/i915/gt/uc/selftest_guc_multi_lrc.c 
b/drivers/gpu/drm/i915/gt/uc/selftest_guc_multi_lrc.c
new file mode 100644
index ..50953c8e8b53
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/uc/selftest_guc_multi_lrc.c
@@ -0,0 +1,179 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright �� 2019 Intel Corporation
+ */
+
+#include "selftests/igt_spinner.h"
+#include "selftests/igt_reset.h"
+#include "selftests/intel_scheduler_helpers.h"
+#include "gt/intel_engine_heartbeat.h"
+#include "gem/selftests/mock_context.h"
+
+static void logical_sort(struct intel_engine_cs **engines, int num_engines)
+{
+   struct intel_engine_cs *sorted[MAX_ENGINE_INSTANCE + 1];
+   int i, j;
+
+   for (i = 0; i < num_engines; ++i)
+   for (j = 0; j < MAX_ENGINE_INSTANCE + 1; ++j) {
+   if (engines[j]->logical_mask & BIT(i)) {
+   sorted[i] = engines[j];
+   break;
+   }
+   }
+
+   memcpy(*engines, *sorted,
+  sizeof(struct intel_engine_cs *) * num_engines);
+}
+
+static struct intel_context *
+multi_lrc_create_parent(struct intel_gt *gt, u8 class,
+   unsigned long flags)
+{
+   struct intel_engine_cs *siblings[MAX_ENGINE_INSTANCE + 1];
+   struct intel_engine_cs *engine;
+   enum intel_engine_id id;
+   int i = 0;
+
+   for_each_engine(engine, gt, id) {
+   if (engine->class != class)
+   continue;
+
+   siblings[i++] = engine;
+   }
+
+   if (i <= 1)
+   return ERR_PTR(0);
+
+   logical_sort(siblings, i);
+
+   return intel_engine_create_parallel(siblings, 1, i);
+}
+
+static void multi_lrc_context_unpin(struct intel_context *ce)
+{
+   struct intel_context *child;
+
+   GEM_BUG_ON(!intel_context_is_parent(ce));
+
+   for_each_child(ce, child)
+   intel_context_unpin(child);
+   intel_context_unpin(ce);
+}
+
+static void multi_lrc_context_put(struct intel_context *ce)
+{
+   GEM_BUG_ON(!intel_context_is_parent(ce));
+
+   /*
+* Only the parent gets the creation ref put in the uAPI, the parent
+* itself is responsible for creation ref put on the children.
+*/
+   intel_context_put(ce);
+}
+
+static struct i915_request *
+multi_lrc_nop_request(struct intel_context *ce)
+{
+   struct intel_context *child;
+   struct i915_request *rq, *child_rq;
+   int i = 0;
+
+   GEM_BUG_ON(!intel_context_is_parent(ce));
+
+   rq = intel_context_create_request(ce);
+   if (IS_ERR(rq))
+   return rq;
+
+   i915_request_get(rq);
+   i915_request_add(rq);
+
+   for_each_child(ce, child) {
+   child_rq = intel_context_create_request(child);
+   if (IS_ERR(child_rq))
+   goto child_error;
+
+   if (++i == ce->parallel.number_children)
+   set_bit(I915_FENCE_FLAG_SUBMIT_PARALLEL,
+   _rq->fence.flags);
+   i915_request_add(child_rq);
+   }
+
+   return rq;
+
+child_error:
+   i915_request_put(rq);
+
+   return ERR_PTR(-ENOMEM);
+}
+
+static int __intel_guc_multi_lrc_basic(struct intel_gt *gt, unsigned int class)
+{
+   struct intel_context *parent;
+   struct i915_request *rq;
+   int ret;
+
+   parent = multi_lrc_create_parent(gt, class, 0);
+   if (IS_ERR(parent)) {
+   pr_err("Failed creating contexts: %ld", PTR_ERR(parent));
+   return PTR_ERR(parent);
+   } else if (!parent) {
+   pr_debug("Not enough engines in class: %d", class);
+   return 0;
+   }
+
+   rq = multi_lrc_nop_request(parent);
+   if (IS_ERR(rq)) {
+   ret = PTR_ERR(rq);
+   pr_err("Failed creating requests: %d", ret);
+   goto out;
+   }
+
+   ret = intel_selftest_wait_for_rq(rq);
+   if (ret)
+   pr_err("Failed waiting on request: %d", 

[PATCH 12/25] drm/i915/guc: Implement multi-lrc submission

2021-10-14 Thread Matthew Brost
Implement multi-lrc submission via a single workqueue entry and single
H2G. The workqueue entry contains an updated tail value for each
request, of all the contexts in the multi-lrc submission, and updates
these values simultaneously. As such, the tasklet and bypass path have
been updated to coalesce requests into a single submission.

v2:
 (John Harrison)
  - s/wqe/wqi
  - Use FIELD_PREP macros
  - Add GEM_BUG_ONs ensures length fits within field
  - Add comment / white space to intel_guc_write_barrier
 (Kernel test robot)
  - Make need_tasklet a static function
v3:
 (Docs)
  - A comment for submission_stall_reason
v4:
 (Kernel test robot)
  - Initialize return value in bypass tasklt submit function
 (John Harrison)
  - Add comment near work queue defs
  - Add BUILD_BUG_ON to ensure WQ_SIZE is a power of 2
  - Update write_barrier comment to talk about work queue
v5:
 (John Harrison)
  - Fix typo in work queue comment

Reviewed-by: John Harrison 
Signed-off-by: Matthew Brost 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc.c|  29 ++
 drivers/gpu/drm/i915/gt/uc/intel_guc.h|  11 +
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c |  24 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   |  30 +-
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 323 +++---
 drivers/gpu/drm/i915/i915_request.h   |   8 +
 6 files changed, 350 insertions(+), 75 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
index 8f8182bf7c11..6e228343e8cb 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
@@ -756,3 +756,32 @@ void intel_guc_load_status(struct intel_guc *guc, struct 
drm_printer *p)
}
}
 }
+
+void intel_guc_write_barrier(struct intel_guc *guc)
+{
+   struct intel_gt *gt = guc_to_gt(guc);
+
+   if (i915_gem_object_is_lmem(guc->ct.vma->obj)) {
+   /*
+* Ensure intel_uncore_write_fw can be used rather than
+* intel_uncore_write.
+*/
+   GEM_BUG_ON(guc->send_regs.fw_domains);
+
+   /*
+* This register is used by the i915 and GuC for MMIO based
+* communication. Once we are in this code CTBs are the only
+* method the i915 uses to communicate with the GuC so it is
+* safe to write to this register (a value of 0 is NOP for MMIO
+* communication). If we ever start mixing CTBs and MMIOs a new
+* register will have to be chosen. This function is also used
+* to enforce ordering of a work queue item write and an update
+* to the process descriptor. When a work queue is being used,
+* CTBs are also the only mechanism of communication.
+*/
+   intel_uncore_write_fw(gt->uncore, GEN11_SOFT_SCRATCH(0), 0);
+   } else {
+   /* wmb() sufficient for a barrier if in smem */
+   wmb();
+   }
+}
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
index 4ca197f400ba..31cf9fb48c7e 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
@@ -46,6 +46,15 @@ struct intel_guc {
 * submitted until the stalled request is processed.
 */
struct i915_request *stalled_request;
+   /**
+* @submission_stall_reason: reason why submission is stalled
+*/
+   enum {
+   STALL_NONE,
+   STALL_REGISTER_CONTEXT,
+   STALL_MOVE_LRC_TAIL,
+   STALL_ADD_REQUEST,
+   } submission_stall_reason;
 
/* intel_guc_recv interrupt related state */
/** @irq_lock: protects GuC irq state */
@@ -367,4 +376,6 @@ void intel_guc_submission_cancel_requests(struct intel_guc 
*guc);
 
 void intel_guc_load_status(struct intel_guc *guc, struct drm_printer *p);
 
+void intel_guc_write_barrier(struct intel_guc *guc);
+
 #endif
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index 0a3504bc0b61..a0cc34be7b56 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -383,28 +383,6 @@ static u32 ct_get_next_fence(struct intel_guc_ct *ct)
return ++ct->requests.last_fence;
 }
 
-static void write_barrier(struct intel_guc_ct *ct)
-{
-   struct intel_guc *guc = ct_to_guc(ct);
-   struct intel_gt *gt = guc_to_gt(guc);
-
-   if (i915_gem_object_is_lmem(guc->ct.vma->obj)) {
-   GEM_BUG_ON(guc->send_regs.fw_domains);
-   /*
-* This register is used by the i915 and GuC for MMIO based
-* communication. Once we are in this code CTBs are the only
-* method the i915 uses to communicate with the GuC so it is
-* safe to write to this register (a value of 0 is NOP for MMIO

[PATCH 16/25] drm/i915/guc: Connect UAPI to GuC multi-lrc interface

2021-10-14 Thread Matthew Brost
Introduce 'set parallel submit' extension to connect UAPI to GuC
multi-lrc interface. Kernel doc in new uAPI should explain it all.

IGT: https://patchwork.freedesktop.org/patch/447008/?series=93071=1
media UMD: https://github.com/intel/media-driver/pull/1252

v2:
 (Daniel Vetter)
  - Add IGT link and placeholder for media UMD link
v3:
 (Kernel test robot)
  - Fix warning in unpin engines call
 (John Harrison)
  - Reword a bunch of the kernel doc
v4:
 (John Harrison)
  - Add comment why perma-pin is done after setting gem context
  - Update some comments / docs for proto contexts
v5:
 (John Harrison)
  - Rework perma-pin comment
  - Add BUG_IN if context is pinned when setting gem context

Cc: Tvrtko Ursulin 
Signed-off-by: Matthew Brost 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 230 +-
 .../gpu/drm/i915/gem/i915_gem_context_types.h |  16 +-
 drivers/gpu/drm/i915/gt/intel_context_types.h |   9 +-
 drivers/gpu/drm/i915/gt/intel_engine.h|  12 +-
 drivers/gpu/drm/i915/gt/intel_engine_cs.c |   6 +-
 .../drm/i915/gt/intel_execlists_submission.c  |   6 +-
 drivers/gpu/drm/i915/gt/selftest_execlists.c  |  12 +-
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 114 -
 include/uapi/drm/i915_drm.h   | 131 ++
 9 files changed, 505 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index d225d3dd0b40..9a00f11fef46 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -556,9 +556,150 @@ set_proto_ctx_engines_bond(struct i915_user_extension 
__user *base, void *data)
return 0;
 }
 
+static int
+set_proto_ctx_engines_parallel_submit(struct i915_user_extension __user *base,
+ void *data)
+{
+   struct i915_context_engines_parallel_submit __user *ext =
+   container_of_user(base, typeof(*ext), base);
+   const struct set_proto_ctx_engines *set = data;
+   struct drm_i915_private *i915 = set->i915;
+   u64 flags;
+   int err = 0, n, i, j;
+   u16 slot, width, num_siblings;
+   struct intel_engine_cs **siblings = NULL;
+   intel_engine_mask_t prev_mask;
+
+   /* Disabling for now */
+   return -ENODEV;
+
+   /* FIXME: This is NIY for execlists */
+   if (!(intel_uc_uses_guc_submission(>gt.uc)))
+   return -ENODEV;
+
+   if (get_user(slot, >engine_index))
+   return -EFAULT;
+
+   if (get_user(width, >width))
+   return -EFAULT;
+
+   if (get_user(num_siblings, >num_siblings))
+   return -EFAULT;
+
+   if (slot >= set->num_engines) {
+   drm_dbg(>drm, "Invalid placement value, %d >= %d\n",
+   slot, set->num_engines);
+   return -EINVAL;
+   }
+
+   if (set->engines[slot].type != I915_GEM_ENGINE_TYPE_INVALID) {
+   drm_dbg(>drm,
+   "Invalid placement[%d], already occupied\n", slot);
+   return -EINVAL;
+   }
+
+   if (get_user(flags, >flags))
+   return -EFAULT;
+
+   if (flags) {
+   drm_dbg(>drm, "Unknown flags 0x%02llx", flags);
+   return -EINVAL;
+   }
+
+   for (n = 0; n < ARRAY_SIZE(ext->mbz64); n++) {
+   err = check_user_mbz(>mbz64[n]);
+   if (err)
+   return err;
+   }
+
+   if (width < 2) {
+   drm_dbg(>drm, "Width (%d) < 2\n", width);
+   return -EINVAL;
+   }
+
+   if (num_siblings < 1) {
+   drm_dbg(>drm, "Number siblings (%d) < 1\n",
+   num_siblings);
+   return -EINVAL;
+   }
+
+   siblings = kmalloc_array(num_siblings * width,
+sizeof(*siblings),
+GFP_KERNEL);
+   if (!siblings)
+   return -ENOMEM;
+
+   /* Create contexts / engines */
+   for (i = 0; i < width; ++i) {
+   intel_engine_mask_t current_mask = 0;
+   struct i915_engine_class_instance prev_engine;
+
+   for (j = 0; j < num_siblings; ++j) {
+   struct i915_engine_class_instance ci;
+
+   n = i * num_siblings + j;
+   if (copy_from_user(, >engines[n], sizeof(ci))) {
+   err = -EFAULT;
+   goto out_err;
+   }
+
+   siblings[n] =
+   intel_engine_lookup_user(i915, ci.engine_class,
+ci.engine_instance);
+   if (!siblings[n]) {
+   drm_dbg(>drm,
+   "Invalid sibling[%d]: { class:%d, 
inst:%d }\n",
+   n, 

[PATCH 14/25] drm/i915/guc: Implement multi-lrc reset

2021-10-14 Thread Matthew Brost
Update context and full GPU reset to work with multi-lrc. The idea is
parent context tracks all the active requests inflight for itself and
its children. The parent context owns the reset replaying / canceling
requests as needed.

v2:
 (John Harrison)
  - Simply loop in find active request
  - Add comments to find ative request / reset loop
v3:
 (John Harrison)
  - s/its'/its/g
  - Fix comment when searching for active request
  - Reorder if state in __guc_reset_context
v4:
 (Kernel test robot)
  - Delete ununsed is_multi_lrc function

Reviewed-by: John Harrison 
Signed-off-by: Matthew Brost 
---
 drivers/gpu/drm/i915/gt/intel_context.c   | 15 -
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 64 +--
 2 files changed, 58 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context.c 
b/drivers/gpu/drm/i915/gt/intel_context.c
index 79f321c6c008..6aab60584ee5 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -529,20 +529,29 @@ struct i915_request *intel_context_create_request(struct 
intel_context *ce)
 
 struct i915_request *intel_context_find_active_request(struct intel_context 
*ce)
 {
+   struct intel_context *parent = intel_context_to_parent(ce);
struct i915_request *rq, *active = NULL;
unsigned long flags;
 
GEM_BUG_ON(!intel_engine_uses_guc(ce->engine));
 
-   spin_lock_irqsave(>guc_state.lock, flags);
-   list_for_each_entry_reverse(rq, >guc_state.requests,
+   /*
+* We search the parent list to find an active request on the submitted
+* context. The parent list contains the requests for all the contexts
+* in the relationship so we have to do a compare of each request's
+* context.
+*/
+   spin_lock_irqsave(>guc_state.lock, flags);
+   list_for_each_entry_reverse(rq, >guc_state.requests,
sched.link) {
+   if (rq->context != ce)
+   continue;
if (i915_request_completed(rq))
break;
 
active = rq;
}
-   spin_unlock_irqrestore(>guc_state.lock, flags);
+   spin_unlock_irqrestore(>guc_state.lock, flags);
 
return active;
 }
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index ebb64fb50396..112b5e6fe39d 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -1217,10 +1217,15 @@ __unwind_incomplete_requests(struct intel_context *ce)
 
 static void __guc_reset_context(struct intel_context *ce, bool stalled)
 {
+   bool local_stalled;
struct i915_request *rq;
unsigned long flags;
u32 head;
+   int i, number_children = ce->parallel.number_children;
bool skip = false;
+   struct intel_context *parent = ce;
+
+   GEM_BUG_ON(intel_context_is_child(ce));
 
intel_context_get(ce);
 
@@ -1246,25 +1251,38 @@ static void __guc_reset_context(struct intel_context 
*ce, bool stalled)
if (unlikely(skip))
goto out_put;
 
-   rq = intel_context_find_active_request(ce);
-   if (!rq) {
-   head = ce->ring->tail;
-   stalled = false;
-   goto out_replay;
-   }
+   /*
+* For each context in the relationship find the hanging request
+* resetting each context / request as needed
+*/
+   for (i = 0; i < number_children + 1; ++i) {
+   if (!intel_context_is_pinned(ce))
+   goto next_context;
+
+   local_stalled = false;
+   rq = intel_context_find_active_request(ce);
+   if (!rq) {
+   head = ce->ring->tail;
+   goto out_replay;
+   }
 
-   if (!i915_request_started(rq))
-   stalled = false;
+   if (i915_request_started(rq))
+   local_stalled = true;
 
-   GEM_BUG_ON(i915_active_is_idle(>active));
-   head = intel_ring_wrap(ce->ring, rq->head);
-   __i915_request_reset(rq, stalled);
+   GEM_BUG_ON(i915_active_is_idle(>active));
+   head = intel_ring_wrap(ce->ring, rq->head);
 
+   __i915_request_reset(rq, local_stalled && stalled);
 out_replay:
-   guc_reset_state(ce, head, stalled);
-   __unwind_incomplete_requests(ce);
+   guc_reset_state(ce, head, local_stalled && stalled);
+next_context:
+   if (i != number_children)
+   ce = list_next_entry(ce, parallel.child_link);
+   }
+
+   __unwind_incomplete_requests(parent);
 out_put:
-   intel_context_put(ce);
+   intel_context_put(parent);
 }
 
 void intel_guc_submission_reset(struct intel_guc *guc, bool stalled)
@@ -1285,7 +1303,8 @@ void intel_guc_submission_reset(struct intel_guc *guc, 

[PATCH 09/25] drm/i915/guc: Ensure GuC schedule operations do not operate on child contexts

2021-10-14 Thread Matthew Brost
In GuC parent-child contexts the parent context controls the scheduling,
ensure only the parent does the scheduling operations.

Signed-off-by: Matthew Brost 
Reviewed-by: John Harrison 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index d9f5be00e586..fbcf2dc2b2de 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -324,6 +324,12 @@ static inline void decr_context_committed_requests(struct 
intel_context *ce)
GEM_BUG_ON(ce->guc_state.number_committed_requests < 0);
 }
 
+static struct intel_context *
+request_to_scheduling_context(struct i915_request *rq)
+{
+   return intel_context_to_parent(rq->context);
+}
+
 static inline bool context_guc_id_invalid(struct intel_context *ce)
 {
return ce->guc_id.id == GUC_INVALID_LRC_ID;
@@ -1711,6 +1717,7 @@ static void __guc_context_sched_disable(struct intel_guc 
*guc,
 
GEM_BUG_ON(guc_id == GUC_INVALID_LRC_ID);
 
+   GEM_BUG_ON(intel_context_is_child(ce));
trace_intel_context_sched_disable(ce);
 
guc_submission_send_busy_loop(guc, action, ARRAY_SIZE(action),
@@ -1936,6 +1943,8 @@ static void guc_context_sched_disable(struct 
intel_context *ce)
intel_wakeref_t wakeref;
u16 guc_id;
 
+   GEM_BUG_ON(intel_context_is_child(ce));
+
spin_lock_irqsave(>guc_state.lock, flags);
 
/*
@@ -2304,6 +2313,8 @@ static void guc_signal_context_fence(struct intel_context 
*ce)
 {
unsigned long flags;
 
+   GEM_BUG_ON(intel_context_is_child(ce));
+
spin_lock_irqsave(>guc_state.lock, flags);
clr_context_wait_for_deregister_to_register(ce);
__guc_signal_context_fence(ce);
@@ -2334,7 +2345,7 @@ static void guc_context_init(struct intel_context *ce)
 
 static int guc_request_alloc(struct i915_request *rq)
 {
-   struct intel_context *ce = rq->context;
+   struct intel_context *ce = request_to_scheduling_context(rq);
struct intel_guc *guc = ce_to_guc(ce);
unsigned long flags;
int ret;
-- 
2.32.0



[PATCH 04/25] drm/i915/guc: Don't call switch_to_kernel_context with GuC submission

2021-10-14 Thread Matthew Brost
Calling switch_to_kernel_context isn't needed if the engine PM reference
is taken while all user contexts are pinned as if don't have PM ref that
guarantees that all user contexts scheduling is disabled. By not calling
switch_to_kernel_context we save on issuing a request to the engine.

v2:
 (Daniel Vetter)
  - Add FIXME comment about pushing switch_to_kernel_context to backend
v3:
 (John Harrison)
  - Update commit message
  - Fix workding comment

Signed-off-by: Matthew Brost 
Reviewed-by: Daniel Vetter 
Reviewed-by: John Harrison 
---
 drivers/gpu/drm/i915/gt/intel_engine_pm.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c 
b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
index dacd62773735..a1334b48dde7 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
@@ -162,6 +162,19 @@ static bool switch_to_kernel_context(struct 
intel_engine_cs *engine)
unsigned long flags;
bool result = true;
 
+   /*
+* This is execlist specific behaviour intended to ensure the GPU is
+* idle by switching to a known 'safe' context. With GuC submission, the
+* same idle guarantee is achieved by other means (disabling
+* scheduling). Further, switching to a 'safe' context has no effect
+* with GuC submission as the scheduler can just switch back again.
+*
+* FIXME: Move this backend scheduler specific behaviour into the
+* scheduler backend.
+*/
+   if (intel_engine_uses_guc(engine))
+   return true;
+
/* GPU is pointing to the void, as good as in the kernel context. */
if (intel_gt_is_wedged(engine->gt))
return true;
-- 
2.32.0



[PATCH 05/25] drm/i915: Add logical engine mapping

2021-10-14 Thread Matthew Brost
Add logical engine mapping. This is required for split-frame, as
workloads need to be placed on engines in a logically contiguous manner.

v2:
 (Daniel Vetter)
  - Add kernel doc for new fields
v3:
 (Tvrtko)
  - Update comment for new logical_mask field
v4:
 (John Harrison)
  - Update comment for new logical_mask field

Signed-off-by: Matthew Brost 
Reviewed-by: John Harrison 
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c | 60 ---
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |  7 +++
 .../drm/i915/gt/intel_execlists_submission.c  |  1 +
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c|  2 +-
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 21 +--
 5 files changed, 62 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 2ae57e4656a3..2eb798ad068b 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -290,7 +290,8 @@ static void nop_irq_handler(struct intel_engine_cs *engine, 
u16 iir)
GEM_DEBUG_WARN_ON(iir);
 }
 
-static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id)
+static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id,
+ u8 logical_instance)
 {
const struct engine_info *info = _engines[id];
struct drm_i915_private *i915 = gt->i915;
@@ -335,6 +336,7 @@ static int intel_engine_setup(struct intel_gt *gt, enum 
intel_engine_id id)
 
engine->class = info->class;
engine->instance = info->instance;
+   engine->logical_mask = BIT(logical_instance);
__sprint_engine_name(engine);
 
engine->props.heartbeat_interval_ms =
@@ -588,6 +590,37 @@ static intel_engine_mask_t init_engine_mask(struct 
intel_gt *gt)
return info->engine_mask;
 }
 
+static void populate_logical_ids(struct intel_gt *gt, u8 *logical_ids,
+u8 class, const u8 *map, u8 num_instances)
+{
+   int i, j;
+   u8 current_logical_id = 0;
+
+   for (j = 0; j < num_instances; ++j) {
+   for (i = 0; i < ARRAY_SIZE(intel_engines); ++i) {
+   if (!HAS_ENGINE(gt, i) ||
+   intel_engines[i].class != class)
+   continue;
+
+   if (intel_engines[i].instance == map[j]) {
+   logical_ids[intel_engines[i].instance] =
+   current_logical_id++;
+   break;
+   }
+   }
+   }
+}
+
+static void setup_logical_ids(struct intel_gt *gt, u8 *logical_ids, u8 class)
+{
+   int i;
+   u8 map[MAX_ENGINE_INSTANCE + 1];
+
+   for (i = 0; i < MAX_ENGINE_INSTANCE + 1; ++i)
+   map[i] = i;
+   populate_logical_ids(gt, logical_ids, class, map, ARRAY_SIZE(map));
+}
+
 /**
  * intel_engines_init_mmio() - allocate and prepare the Engine Command 
Streamers
  * @gt: pointer to struct intel_gt
@@ -599,7 +632,8 @@ int intel_engines_init_mmio(struct intel_gt *gt)
struct drm_i915_private *i915 = gt->i915;
const unsigned int engine_mask = init_engine_mask(gt);
unsigned int mask = 0;
-   unsigned int i;
+   unsigned int i, class;
+   u8 logical_ids[MAX_ENGINE_INSTANCE + 1];
int err;
 
drm_WARN_ON(>drm, engine_mask == 0);
@@ -609,15 +643,23 @@ int intel_engines_init_mmio(struct intel_gt *gt)
if (i915_inject_probe_failure(i915))
return -ENODEV;
 
-   for (i = 0; i < ARRAY_SIZE(intel_engines); i++) {
-   if (!HAS_ENGINE(gt, i))
-   continue;
+   for (class = 0; class < MAX_ENGINE_CLASS + 1; ++class) {
+   setup_logical_ids(gt, logical_ids, class);
 
-   err = intel_engine_setup(gt, i);
-   if (err)
-   goto cleanup;
+   for (i = 0; i < ARRAY_SIZE(intel_engines); ++i) {
+   u8 instance = intel_engines[i].instance;
+
+   if (intel_engines[i].class != class ||
+   !HAS_ENGINE(gt, i))
+   continue;
 
-   mask |= BIT(i);
+   err = intel_engine_setup(gt, i,
+logical_ids[instance]);
+   if (err)
+   goto cleanup;
+
+   mask |= BIT(i);
+   }
}
 
/*
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h 
b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index 9167ce52487c..e0f773585c29 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -269,6 +269,13 @@ struct intel_engine_cs {
unsigned int guc_id;
 
intel_engine_mask_t mask;
+   /**
+* @logical_mask: logical mask of engine, reported 

[PATCH 06/25] drm/i915: Expose logical engine instance to user

2021-10-14 Thread Matthew Brost
Expose logical engine instance to user via query engine info IOCTL. This
is required for split-frame workloads as these needs to be placed on
engines in a logically contiguous order. The logical mapping can change
based on fusing. Rather than having user have knowledge of the fusing we
simply just expose the logical mapping with the existing query engine
info IOCTL.

IGT: https://patchwork.freedesktop.org/patch/445637/?series=92854=1
media UMD: https://github.com/intel/media-driver/pull/1252

v2:
 (Daniel Vetter)
  - Add IGT link, placeholder for media UMD

Cc: Tvrtko Ursulin 
Signed-off-by: Matthew Brost 
Reviewed-by: John Harrison 
---
 drivers/gpu/drm/i915/i915_query.c | 2 ++
 include/uapi/drm/i915_drm.h   | 8 +++-
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_query.c 
b/drivers/gpu/drm/i915/i915_query.c
index 5e2b909827f4..51b368be0fc4 100644
--- a/drivers/gpu/drm/i915/i915_query.c
+++ b/drivers/gpu/drm/i915/i915_query.c
@@ -124,7 +124,9 @@ query_engine_info(struct drm_i915_private *i915,
for_each_uabi_engine(engine, i915) {
info.engine.engine_class = engine->uabi_class;
info.engine.engine_instance = engine->uabi_instance;
+   info.flags = I915_ENGINE_INFO_HAS_LOGICAL_INSTANCE;
info.capabilities = engine->uabi_capabilities;
+   info.logical_instance = ilog2(engine->logical_mask);
 
if (copy_to_user(info_ptr, , sizeof(info)))
return -EFAULT;
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index aa2a7eccfb94..0179f92e0916 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -2775,14 +2775,20 @@ struct drm_i915_engine_info {
 
/** @flags: Engine flags. */
__u64 flags;
+#define I915_ENGINE_INFO_HAS_LOGICAL_INSTANCE  (1 << 0)
 
/** @capabilities: Capabilities of this engine. */
__u64 capabilities;
 #define I915_VIDEO_CLASS_CAPABILITY_HEVC   (1 << 0)
 #define I915_VIDEO_AND_ENHANCE_CLASS_CAPABILITY_SFC(1 << 1)
 
+   /** @logical_instance: Logical instance of engine */
+   __u16 logical_instance;
+
/** @rsvd1: Reserved fields. */
-   __u64 rsvd1[4];
+   __u16 rsvd1[3];
+   /** @rsvd2: Reserved fields. */
+   __u64 rsvd2[3];
 };
 
 /**
-- 
2.32.0



[PATCH 25/25] drm/i915/execlists: Weak parallel submission support for execlists

2021-10-14 Thread Matthew Brost
A weak implementation of parallel submission (multi-bb execbuf IOCTL) for
execlists. Doing as little as possible to support this interface for
execlists - basically just passing submit fences between each request
generated and virtual engines are not allowed. This is on par with what
is there for the existing (hopefully soon deprecated) bonding interface.

We perma-pin these execlists contexts to align with GuC implementation.

Signed-off-by: Matthew Brost 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 10 ++--
 drivers/gpu/drm/i915/gt/intel_context.c   |  4 +-
 .../drm/i915/gt/intel_execlists_submission.c  | 56 ++-
 drivers/gpu/drm/i915/gt/intel_lrc.c   |  2 +
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  2 -
 5 files changed, 64 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index fb33d0322960..35e87a7d0ea9 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -570,10 +570,6 @@ set_proto_ctx_engines_parallel_submit(struct 
i915_user_extension __user *base,
struct intel_engine_cs **siblings = NULL;
intel_engine_mask_t prev_mask;
 
-   /* FIXME: This is NIY for execlists */
-   if (!(intel_uc_uses_guc_submission(>gt.uc)))
-   return -ENODEV;
-
if (get_user(slot, >engine_index))
return -EFAULT;
 
@@ -583,6 +579,12 @@ set_proto_ctx_engines_parallel_submit(struct 
i915_user_extension __user *base,
if (get_user(num_siblings, >num_siblings))
return -EFAULT;
 
+   if (!intel_uc_uses_guc_submission(>gt.uc) && num_siblings != 1) {
+   drm_dbg(>drm, "Only 1 sibling (%d) supported in non-GuC 
mode\n",
+   num_siblings);
+   return -EINVAL;
+   }
+
if (slot >= set->num_engines) {
drm_dbg(>drm, "Invalid placement value, %d >= %d\n",
slot, set->num_engines);
diff --git a/drivers/gpu/drm/i915/gt/intel_context.c 
b/drivers/gpu/drm/i915/gt/intel_context.c
index 5634d14052bc..1bec92e1d8e6 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -79,7 +79,8 @@ static int intel_context_active_acquire(struct intel_context 
*ce)
 
__i915_active_acquire(>active);
 
-   if (intel_context_is_barrier(ce) || intel_engine_uses_guc(ce->engine))
+   if (intel_context_is_barrier(ce) || intel_engine_uses_guc(ce->engine) ||
+   intel_context_is_parallel(ce))
return 0;
 
/* Preallocate tracking nodes */
@@ -563,7 +564,6 @@ void intel_context_bind_parent_child(struct intel_context 
*parent,
 * Callers responsibility to validate that this function is used
 * correctly but we use GEM_BUG_ON here ensure that they do.
 */
-   GEM_BUG_ON(!intel_engine_uses_guc(parent->engine));
GEM_BUG_ON(intel_context_is_pinned(parent));
GEM_BUG_ON(intel_context_is_child(parent));
GEM_BUG_ON(intel_context_is_pinned(child));
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c 
b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index bedb80057046..8cd986bdf26c 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -927,8 +927,7 @@ static void execlists_submit_ports(struct intel_engine_cs 
*engine)
 
 static bool ctx_single_port_submission(const struct intel_context *ce)
 {
-   return (IS_ENABLED(CONFIG_DRM_I915_GVT) &&
-   intel_context_force_single_submission(ce));
+   return intel_context_force_single_submission(ce);
 }
 
 static bool can_merge_ctx(const struct intel_context *prev,
@@ -2598,6 +2597,58 @@ static void execlists_context_cancel_request(struct 
intel_context *ce,
  current->comm);
 }
 
+static struct intel_context *
+execlists_create_parallel(struct intel_engine_cs **engines,
+ unsigned int num_siblings,
+ unsigned int width)
+{
+   struct intel_engine_cs **siblings = NULL;
+   struct intel_context *parent = NULL, *ce, *err;
+   int i, j;
+
+   GEM_BUG_ON(num_siblings != 1);
+
+   siblings = kmalloc_array(num_siblings,
+sizeof(*siblings),
+GFP_KERNEL);
+   if (!siblings)
+   return ERR_PTR(-ENOMEM);
+
+   for (i = 0; i < width; ++i) {
+   for (j = 0; j < num_siblings; ++j)
+   siblings[j] = engines[i * num_siblings + j];
+
+   ce = intel_context_create(siblings[0]);
+   if (!ce) {
+   err = ERR_PTR(-ENOMEM);
+   goto unwind;
+   }
+
+   if (i == 0)
+   parent = ce;
+   else
+   

[PATCH 23/25] drm/i915: Update I915_GEM_BUSY IOCTL to understand composite fences

2021-10-14 Thread Matthew Brost
Parallel submission create composite fences (dma_fence_array) for excl /
shared slots in objects. The I915_GEM_BUSY IOCTL checks these slots to
determine the busyness of the object. Prior to patch it only check if
the fence in the slot was a i915_request. Update the check to understand
composite fences and correctly report the busyness.

v2:
 (Tvrtko)
  - Remove duplicate BUILD_BUG_ON

Reviewed-by: Daniele Ceraolo Spurio 
Signed-off-by: Matthew Brost 
---
 drivers/gpu/drm/i915/gem/i915_gem_busy.c  | 57 +++
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|  5 +-
 drivers/gpu/drm/i915/i915_request.h   |  6 ++
 3 files changed, 55 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_busy.c 
b/drivers/gpu/drm/i915/gem/i915_gem_busy.c
index 6234e17259c1..7358bebef15c 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_busy.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_busy.c
@@ -4,6 +4,8 @@
  * Copyright © 2014-2016 Intel Corporation
  */
 
+#include 
+
 #include "gt/intel_engine.h"
 
 #include "i915_gem_ioctls.h"
@@ -36,7 +38,7 @@ static __always_inline u32 __busy_write_id(u16 id)
 }
 
 static __always_inline unsigned int
-__busy_set_if_active(const struct dma_fence *fence, u32 (*flag)(u16 id))
+__busy_set_if_active(struct dma_fence *fence, u32 (*flag)(u16 id))
 {
const struct i915_request *rq;
 
@@ -46,29 +48,60 @@ __busy_set_if_active(const struct dma_fence *fence, u32 
(*flag)(u16 id))
 * to eventually flush us, but to minimise latency just ask the
 * hardware.
 *
-* Note we only report on the status of native fences.
+* Note we only report on the status of native fences and we currently
+* have two native fences:
+*
+* 1. A composite fence (dma_fence_array) constructed of i915 requests
+* created during a parallel submission. In this case we deconstruct the
+* composite fence into individual i915 requests and check the status of
+* each request.
+*
+* 2. A single i915 request.
 */
-   if (!dma_fence_is_i915(fence))
+   if (dma_fence_is_array(fence)) {
+   struct dma_fence_array *array = to_dma_fence_array(fence);
+   struct dma_fence **child = array->fences;
+   unsigned int nchild = array->num_fences;
+
+   do {
+   struct dma_fence *current_fence = *child++;
+
+   /* Not an i915 fence, can't be busy per above */
+   if (!dma_fence_is_i915(current_fence) ||
+   !test_bit(I915_FENCE_FLAG_COMPOSITE,
+ _fence->flags)) {
+   return 0;
+   }
+
+   rq = to_request(current_fence);
+   if (!i915_request_completed(rq))
+   return flag(rq->engine->uabi_class);
+   } while (--nchild);
+
+   /* All requests in array complete, not busy */
return 0;
+   } else {
+   if (!dma_fence_is_i915(fence))
+   return 0;
 
-   /* opencode to_request() in order to avoid const warnings */
-   rq = container_of(fence, const struct i915_request, fence);
-   if (i915_request_completed(rq))
-   return 0;
+   rq = to_request(fence);
+   if (i915_request_completed(rq))
+   return 0;
 
-   /* Beware type-expansion follies! */
-   BUILD_BUG_ON(!typecheck(u16, rq->engine->uabi_class));
-   return flag(rq->engine->uabi_class);
+   /* Beware type-expansion follies! */
+   BUILD_BUG_ON(!typecheck(u16, rq->engine->uabi_class));
+   return flag(rq->engine->uabi_class);
+   }
 }
 
 static __always_inline unsigned int
-busy_check_reader(const struct dma_fence *fence)
+busy_check_reader(struct dma_fence *fence)
 {
return __busy_set_if_active(fence, __busy_read_flag);
 }
 
 static __always_inline unsigned int
-busy_check_writer(const struct dma_fence *fence)
+busy_check_writer(struct dma_fence *fence)
 {
if (!fence)
return 0;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index fc30856e81fa..1231224728e4 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -3008,8 +3008,11 @@ eb_composite_fence_create(struct i915_execbuffer *eb, 
int out_fence_fd)
if (!fences)
return ERR_PTR(-ENOMEM);
 
-   for_each_batch_create_order(eb, i)
+   for_each_batch_create_order(eb, i) {
fences[i] = >requests[i]->fence;
+   __set_bit(I915_FENCE_FLAG_COMPOSITE,
+ >requests[i]->fence.flags);
+   }
 
fence_array = dma_fence_array_create(eb->num_batches,
 

[PATCH 24/25] drm/i915: Enable multi-bb execbuf

2021-10-14 Thread Matthew Brost
Enable multi-bb execbuf by enabling the set_parallel extension.

Signed-off-by: Matthew Brost 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 9a00f11fef46..fb33d0322960 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -570,9 +570,6 @@ set_proto_ctx_engines_parallel_submit(struct 
i915_user_extension __user *base,
struct intel_engine_cs **siblings = NULL;
intel_engine_mask_t prev_mask;
 
-   /* Disabling for now */
-   return -ENODEV;
-
/* FIXME: This is NIY for execlists */
if (!(intel_uc_uses_guc_submission(>gt.uc)))
return -ENODEV;
-- 
2.32.0



[PATCH 19/25] drm/i915/guc: Implement no mid batch preemption for multi-lrc

2021-10-14 Thread Matthew Brost
For some users of multi-lrc, e.g. split frame, it isn't safe to preempt
mid BB. To safely enable preemption at the BB boundary, a handshake
between parent and child is needed, syncing the set of BBs at the
beginning and end of each batch. This is implemented via custom
emit_bb_start & emit_fini_breadcrumb functions and enabled by default if
a context is configured by set parallel extension.

Lastly, this patch updates the process descriptor to the correct size as
the memory used in the handshake is directly after the process
descriptor.

v2:
 (John Harrison)
  - Fix a few comments wording
  - Add struture for parent page layout
v3:
 (John Harrison)
  - A structure for sync semaphore
  - Use offsetof to calc address
  - Update commit message
v4:
 (John Harrison)
  - Fix typos in comment explaining memory map of scratch page

Signed-off-by: Matthew Brost 
Reviewed-by: John Harrison 
---
 drivers/gpu/drm/i915/gt/intel_context.c   |   2 +-
 drivers/gpu/drm/i915/gt/intel_context_types.h |   2 +
 drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   |   2 +-
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 333 +-
 4 files changed, 326 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context.c 
b/drivers/gpu/drm/i915/gt/intel_context.c
index 6aab60584ee5..5634d14052bc 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -570,7 +570,7 @@ void intel_context_bind_parent_child(struct intel_context 
*parent,
GEM_BUG_ON(intel_context_is_child(child));
GEM_BUG_ON(intel_context_is_parent(child));
 
-   parent->parallel.number_children++;
+   parent->parallel.child_index = parent->parallel.number_children++;
list_add_tail(>parallel.child_link,
  >parallel.child_list);
child->parallel.parent = parent;
diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h 
b/drivers/gpu/drm/i915/gt/intel_context_types.h
index 1d880303a7e4..95a5b94b4ece 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -250,6 +250,8 @@ struct intel_context {
struct i915_request *last_rq;
/** @number_children: number of children if parent */
u8 number_children;
+   /** @child_index: index into child_list if child */
+   u8 child_index;
/** @guc: GuC specific members for parallel submission */
struct {
/** @wqi_head: head pointer in work queue */
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
index 18da67cfcd92..722933e26347 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
@@ -186,7 +186,7 @@ struct guc_process_desc {
u32 wq_status;
u32 engine_presence;
u32 priority;
-   u32 reserved[30];
+   u32 reserved[36];
 } __packed;
 
 #define CONTEXT_REGISTRATION_FLAG_KMD  BIT(0)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 57eb5f8bc8bb..50f0f4eba03b 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -11,6 +11,7 @@
 #include "gt/intel_context.h"
 #include "gt/intel_engine_pm.h"
 #include "gt/intel_engine_heartbeat.h"
+#include "gt/intel_gpu_commands.h"
 #include "gt/intel_gt.h"
 #include "gt/intel_gt_irq.h"
 #include "gt/intel_gt_pm.h"
@@ -368,11 +369,16 @@ static inline struct i915_priolist *to_priolist(struct 
rb_node *rb)
 
 /*
  * When using multi-lrc submission a scratch memory area is reserved in the
- * parent's context state for the process descriptor and work queue. Currently
- * the scratch area is sized to a page.
+ * parent's context state for the process descriptor, work queue, and handshake
+ * between the parent + children contexts to insert safe preemption points
+ * between each of the BBs. Currently the scratch area is sized to a page.
  *
  * The layout of this scratch area is below:
  * 0   guc_process_desc
+ * + sizeof(struct guc_process_desc)   child go
+ * + CACHELINE_BYTES   child join[0]
+ * ...
+ * + CACHELINE_BYTES   child join[n - 1]
  * ... unused
  * PARENT_SCRATCH_SIZE / 2 work queue start
  * ... work queue
@@ -380,7 +386,25 @@ static inline struct i915_priolist *to_priolist(struct 
rb_node *rb)
  */
 #define WQ_SIZE(PARENT_SCRATCH_SIZE / 2)
 #define WQ_OFFSET  (PARENT_SCRATCH_SIZE - WQ_SIZE)
-static u32 __get_process_desc_offset(struct intel_context *ce)
+
+struct sync_semaphore {
+   u32 semaphore;
+   u8 unused[CACHELINE_BYTES - sizeof(u32)];
+};
+
+struct parent_scratch 

[PATCH 21/25] drm/i915/guc: Handle errors in multi-lrc requests

2021-10-14 Thread Matthew Brost
If an error occurs in the front end when multi-lrc requests are getting
generated we need to skip these in the backend but we still need to
emit the breadcrumbs seqno. An issues arises because with multi-lrc
breadcrumbs there is a handshake between the parent and children to make
forward progress. If all the requests are not present this handshake
doesn't work. To work around this, if multi-lrc request has an error we
skip the handshake but still emit the breadcrumbs seqno.

v2:
 (John Harrison)
  - Add comment explaining the skipping of the handshake logic
  - Fix typos in the commit message
v3:
 (John Harrison)
  - Fix up some comments about the math to NOP the ring

Signed-off-by: Matthew Brost 
Reviewed-by: John Harrison 
---
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 69 ++-
 1 file changed, 66 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 361fab2cae99..d7710debcd47 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -4070,8 +4070,8 @@ static int 
emit_bb_start_child_no_preempt_mid_batch(struct i915_request *rq,
 }
 
 static u32 *
-emit_fini_breadcrumb_parent_no_preempt_mid_batch(struct i915_request *rq,
-u32 *cs)
+__emit_fini_breadcrumb_parent_no_preempt_mid_batch(struct i915_request *rq,
+  u32 *cs)
 {
struct intel_context *ce = rq->context;
u8 i;
@@ -4099,6 +4099,45 @@ emit_fini_breadcrumb_parent_no_preempt_mid_batch(struct 
i915_request *rq,
  get_children_go_addr(ce),
  0);
 
+   return cs;
+}
+
+/*
+ * If this true, a submission of multi-lrc requests had an error and the
+ * requests need to be skipped. The front end (execuf IOCTL) should've called
+ * i915_request_skip which squashes the BB but we still need to emit the fini
+ * breadrcrumbs seqno write. At this point we don't know how many of the
+ * requests in the multi-lrc submission were generated so we can't do the
+ * handshake between the parent and children (e.g. if 4 requests should be
+ * generated but 2nd hit an error only 1 would be seen by the GuC backend).
+ * Simply skip the handshake, but still emit the breadcrumbd seqno, if an error
+ * has occurred on any of the requests in submission / relationship.
+ */
+static inline bool skip_handshake(struct i915_request *rq)
+{
+   return test_bit(I915_FENCE_FLAG_SKIP_PARALLEL, >fence.flags);
+}
+
+static u32 *
+emit_fini_breadcrumb_parent_no_preempt_mid_batch(struct i915_request *rq,
+u32 *cs)
+{
+   struct intel_context *ce = rq->context;
+
+   GEM_BUG_ON(!intel_context_is_parent(ce));
+
+   if (unlikely(skip_handshake(rq))) {
+   /*
+* NOP everything in 
__emit_fini_breadcrumb_parent_no_preempt_mid_batch,
+* the -6 comes from the length of the emits below.
+*/
+   memset(cs, 0, sizeof(u32) *
+  (ce->engine->emit_fini_breadcrumb_dw - 6));
+   cs += ce->engine->emit_fini_breadcrumb_dw - 6;
+   } else {
+   cs = __emit_fini_breadcrumb_parent_no_preempt_mid_batch(rq, cs);
+   }
+
/* Emit fini breadcrumb */
cs = gen8_emit_ggtt_write(cs,
  rq->fence.seqno,
@@ -4115,7 +4154,8 @@ emit_fini_breadcrumb_parent_no_preempt_mid_batch(struct 
i915_request *rq,
 }
 
 static u32 *
-emit_fini_breadcrumb_child_no_preempt_mid_batch(struct i915_request *rq, u32 
*cs)
+__emit_fini_breadcrumb_child_no_preempt_mid_batch(struct i915_request *rq,
+ u32 *cs)
 {
struct intel_context *ce = rq->context;
struct intel_context *parent = intel_context_to_parent(ce);
@@ -4142,6 +4182,29 @@ emit_fini_breadcrumb_child_no_preempt_mid_batch(struct 
i915_request *rq, u32 *cs
*cs++ = get_children_go_addr(parent);
*cs++ = 0;
 
+   return cs;
+}
+
+static u32 *
+emit_fini_breadcrumb_child_no_preempt_mid_batch(struct i915_request *rq,
+   u32 *cs)
+{
+   struct intel_context *ce = rq->context;
+
+   GEM_BUG_ON(!intel_context_is_child(ce));
+
+   if (unlikely(skip_handshake(rq))) {
+   /*
+* NOP everything in 
__emit_fini_breadcrumb_child_no_preempt_mid_batch,
+* the -6 comes from the length of the emits below.
+*/
+   memset(cs, 0, sizeof(u32) *
+  (ce->engine->emit_fini_breadcrumb_dw - 6));
+   cs += ce->engine->emit_fini_breadcrumb_dw - 6;
+   } else {
+   cs = __emit_fini_breadcrumb_child_no_preempt_mid_batch(rq, cs);
+   }
+
/* Emit fini breadcrumb */
  

[PATCH 22/25] drm/i915: Make request conflict tracking understand parallel submits

2021-10-14 Thread Matthew Brost
If an object in the excl or shared slot is a composite fence from a
parallel submit and the current request in the conflict tracking is from
the same parallel context there is no need to enforce ordering as the
ordering is already implicit. Make the request conflict tracking
understand this by comparing a parallel submit's parent context and
skipping conflict insertion if the values match.

v2:
 (John Harrison)
  - Reword commit message

Signed-off-by: Matthew Brost 
Reviewed-by: John Harrison 
---
 drivers/gpu/drm/i915/i915_request.c | 43 +++--
 1 file changed, 29 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_request.c 
b/drivers/gpu/drm/i915/i915_request.c
index 8bdf9f2f9b90..820a1f38b271 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -1335,6 +1335,25 @@ i915_request_await_external(struct i915_request *rq, 
struct dma_fence *fence)
return err;
 }
 
+static inline bool is_parallel_rq(struct i915_request *rq)
+{
+   return intel_context_is_parallel(rq->context);
+}
+
+static inline struct intel_context *request_to_parent(struct i915_request *rq)
+{
+   return intel_context_to_parent(rq->context);
+}
+
+static bool is_same_parallel_context(struct i915_request *to,
+struct i915_request *from)
+{
+   if (is_parallel_rq(to))
+   return request_to_parent(to) == request_to_parent(from);
+
+   return false;
+}
+
 int
 i915_request_await_execution(struct i915_request *rq,
 struct dma_fence *fence)
@@ -1366,11 +1385,14 @@ i915_request_await_execution(struct i915_request *rq,
 * want to run our callback in all cases.
 */
 
-   if (dma_fence_is_i915(fence))
+   if (dma_fence_is_i915(fence)) {
+   if (is_same_parallel_context(rq, to_request(fence)))
+   continue;
ret = __i915_request_await_execution(rq,
 to_request(fence));
-   else
+   } else {
ret = i915_request_await_external(rq, fence);
+   }
if (ret < 0)
return ret;
} while (--nchild);
@@ -1471,10 +1493,13 @@ i915_request_await_dma_fence(struct i915_request *rq, 
struct dma_fence *fence)
 fence))
continue;
 
-   if (dma_fence_is_i915(fence))
+   if (dma_fence_is_i915(fence)) {
+   if (is_same_parallel_context(rq, to_request(fence)))
+   continue;
ret = i915_request_await_request(rq, to_request(fence));
-   else
+   } else {
ret = i915_request_await_external(rq, fence);
+   }
if (ret < 0)
return ret;
 
@@ -1525,16 +1550,6 @@ i915_request_await_object(struct i915_request *to,
return ret;
 }
 
-static inline bool is_parallel_rq(struct i915_request *rq)
-{
-   return intel_context_is_parallel(rq->context);
-}
-
-static inline struct intel_context *request_to_parent(struct i915_request *rq)
-{
-   return intel_context_to_parent(rq->context);
-}
-
 static struct i915_request *
 __i915_request_ensure_parallel_ordering(struct i915_request *rq,
struct intel_timeline *timeline)
-- 
2.32.0



[PATCH 11/25] drm/i915/guc: Implement parallel context pin / unpin functions

2021-10-14 Thread Matthew Brost
Parallel contexts are perma-pinned by the upper layers which makes the
backend implementation rather simple. The parent pins the guc_id and
children increment the parent's pin count on pin to ensure all the
contexts are unpinned before we disable scheduling with the GuC / or
deregister the context.

v2:
 (Daniel Vetter)
  - Perma-pin parallel contexts

Signed-off-by: Matthew Brost 
Reviewed-by: John Harrison 
---
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 70 +++
 1 file changed, 70 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 7ba4f1d6f69c..a412593342db 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -2584,6 +2584,76 @@ static const struct intel_context_ops 
virtual_guc_context_ops = {
.get_sibling = guc_virtual_get_sibling,
 };
 
+/* Future patches will use this function */
+__maybe_unused
+static int guc_parent_context_pin(struct intel_context *ce, void *vaddr)
+{
+   struct intel_engine_cs *engine = guc_virtual_get_sibling(ce->engine, 0);
+   struct intel_guc *guc = ce_to_guc(ce);
+   int ret;
+
+   GEM_BUG_ON(!intel_context_is_parent(ce));
+   GEM_BUG_ON(!intel_engine_is_virtual(ce->engine));
+
+   ret = pin_guc_id(guc, ce);
+   if (unlikely(ret < 0))
+   return ret;
+
+   return __guc_context_pin(ce, engine, vaddr);
+}
+
+/* Future patches will use this function */
+__maybe_unused
+static int guc_child_context_pin(struct intel_context *ce, void *vaddr)
+{
+   struct intel_engine_cs *engine = guc_virtual_get_sibling(ce->engine, 0);
+
+   GEM_BUG_ON(!intel_context_is_child(ce));
+   GEM_BUG_ON(!intel_engine_is_virtual(ce->engine));
+
+   __intel_context_pin(ce->parallel.parent);
+   return __guc_context_pin(ce, engine, vaddr);
+}
+
+/* Future patches will use this function */
+__maybe_unused
+static void guc_parent_context_unpin(struct intel_context *ce)
+{
+   struct intel_guc *guc = ce_to_guc(ce);
+
+   GEM_BUG_ON(context_enabled(ce));
+   GEM_BUG_ON(intel_context_is_barrier(ce));
+   GEM_BUG_ON(!intel_context_is_parent(ce));
+   GEM_BUG_ON(!intel_engine_is_virtual(ce->engine));
+
+   unpin_guc_id(guc, ce);
+   lrc_unpin(ce);
+}
+
+/* Future patches will use this function */
+__maybe_unused
+static void guc_child_context_unpin(struct intel_context *ce)
+{
+   GEM_BUG_ON(context_enabled(ce));
+   GEM_BUG_ON(intel_context_is_barrier(ce));
+   GEM_BUG_ON(!intel_context_is_child(ce));
+   GEM_BUG_ON(!intel_engine_is_virtual(ce->engine));
+
+   lrc_unpin(ce);
+}
+
+/* Future patches will use this function */
+__maybe_unused
+static void guc_child_context_post_unpin(struct intel_context *ce)
+{
+   GEM_BUG_ON(!intel_context_is_child(ce));
+   GEM_BUG_ON(!intel_context_is_pinned(ce->parallel.parent));
+   GEM_BUG_ON(!intel_engine_is_virtual(ce->engine));
+
+   lrc_post_unpin(ce);
+   intel_context_unpin(ce->parallel.parent);
+}
+
 static bool
 guc_irq_enable_breadcrumbs(struct intel_breadcrumbs *b)
 {
-- 
2.32.0



[PATCH 17/25] drm/i915/doc: Update parallel submit doc to point to i915_drm.h

2021-10-14 Thread Matthew Brost
Update parallel submit doc to point to i915_drm.h

Signed-off-by: Matthew Brost 
Reviewed-by: John Harrison 
---
 Documentation/gpu/rfc/i915_parallel_execbuf.h | 122 --
 Documentation/gpu/rfc/i915_scheduler.rst  |   4 +-
 2 files changed, 2 insertions(+), 124 deletions(-)
 delete mode 100644 Documentation/gpu/rfc/i915_parallel_execbuf.h

diff --git a/Documentation/gpu/rfc/i915_parallel_execbuf.h 
b/Documentation/gpu/rfc/i915_parallel_execbuf.h
deleted file mode 100644
index 8cbe2c4e0172..
--- a/Documentation/gpu/rfc/i915_parallel_execbuf.h
+++ /dev/null
@@ -1,122 +0,0 @@
-/* SPDX-License-Identifier: MIT */
-/*
- * Copyright © 2021 Intel Corporation
- */
-
-#define I915_CONTEXT_ENGINES_EXT_PARALLEL_SUBMIT 2 /* see 
i915_context_engines_parallel_submit */
-
-/**
- * struct drm_i915_context_engines_parallel_submit - Configure engine for
- * parallel submission.
- *
- * Setup a slot in the context engine map to allow multiple BBs to be submitted
- * in a single execbuf IOCTL. Those BBs will then be scheduled to run on the 
GPU
- * in parallel. Multiple hardware contexts are created internally in the i915
- * run these BBs. Once a slot is configured for N BBs only N BBs can be
- * submitted in each execbuf IOCTL and this is implicit behavior e.g. The user
- * doesn't tell the execbuf IOCTL there are N BBs, the execbuf IOCTL knows how
- * many BBs there are based on the slot's configuration. The N BBs are the last
- * N buffer objects or first N if I915_EXEC_BATCH_FIRST is set.
- *
- * The default placement behavior is to create implicit bonds between each
- * context if each context maps to more than 1 physical engine (e.g. context is
- * a virtual engine). Also we only allow contexts of same engine class and 
these
- * contexts must be in logically contiguous order. Examples of the placement
- * behavior described below. Lastly, the default is to not allow BBs to
- * preempted mid BB rather insert coordinated preemption on all hardware
- * contexts between each set of BBs. Flags may be added in the future to change
- * both of these default behaviors.
- *
- * Returns -EINVAL if hardware context placement configuration is invalid or if
- * the placement configuration isn't supported on the platform / submission
- * interface.
- * Returns -ENODEV if extension isn't supported on the platform / submission
- * interface.
- *
- * .. code-block:: none
- *
- * Example 1 pseudo code:
- * CS[X] = generic engine of same class, logical instance X
- * INVALID = I915_ENGINE_CLASS_INVALID, I915_ENGINE_CLASS_INVALID_NONE
- * set_engines(INVALID)
- * set_parallel(engine_index=0, width=2, num_siblings=1,
- *  engines=CS[0],CS[1])
- *
- * Results in the following valid placement:
- * CS[0], CS[1]
- *
- * Example 2 pseudo code:
- * CS[X] = generic engine of same class, logical instance X
- * INVALID = I915_ENGINE_CLASS_INVALID, I915_ENGINE_CLASS_INVALID_NONE
- * set_engines(INVALID)
- * set_parallel(engine_index=0, width=2, num_siblings=2,
- *  engines=CS[0],CS[2],CS[1],CS[3])
- *
- * Results in the following valid placements:
- * CS[0], CS[1]
- * CS[2], CS[3]
- *
- * This can also be thought of as 2 virtual engines described by 2-D array
- * in the engines the field with bonds placed between each index of the
- * virtual engines. e.g. CS[0] is bonded to CS[1], CS[2] is bonded to
- * CS[3].
- * VE[0] = CS[0], CS[2]
- * VE[1] = CS[1], CS[3]
- *
- * Example 3 pseudo code:
- * CS[X] = generic engine of same class, logical instance X
- * INVALID = I915_ENGINE_CLASS_INVALID, I915_ENGINE_CLASS_INVALID_NONE
- * set_engines(INVALID)
- * set_parallel(engine_index=0, width=2, num_siblings=2,
- *  engines=CS[0],CS[1],CS[1],CS[3])
- *
- * Results in the following valid and invalid placements:
- * CS[0], CS[1]
- * CS[1], CS[3] - Not logical contiguous, return -EINVAL
- */
-struct drm_i915_context_engines_parallel_submit {
-   /**
-* @base: base user extension.
-*/
-   struct i915_user_extension base;
-
-   /**
-* @engine_index: slot for parallel engine
-*/
-   __u16 engine_index;
-
-   /**
-* @width: number of contexts per parallel engine
-*/
-   __u16 width;
-
-   /**
-* @num_siblings: number of siblings per context
-*/
-   __u16 num_siblings;
-
-   /**
-* @mbz16: reserved for future use; must be zero
-*/
-   __u16 mbz16;
-
-   /**
-* @flags: all undefined flags must be zero, currently not defined flags
-*/
-   __u64 flags;
-
-   /**
-* @mbz64: reserved for future use; must be zero
-*/
-   __u64 mbz64[3];
-
-   /**
-* @engines: 2-d array of engine instances to configure parallel engine
-*
-* length = width (i) * num_siblings (j)
-* index = j + i * 

[PATCH 15/25] drm/i915/guc: Update debugfs for GuC multi-lrc

2021-10-14 Thread Matthew Brost
Display the workqueue status in debugfs for GuC contexts that are in
parent-child relationship.

v2:
 (John Harrison)
  - Output number children in debugfs

Signed-off-by: Matthew Brost 
Reviewed-by: John Harrison 
---
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 52 ++-
 1 file changed, 38 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 112b5e6fe39d..938dc34e8d3a 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -3702,6 +3702,25 @@ static inline void guc_log_context_priority(struct 
drm_printer *p,
drm_printf(p, "\n");
 }
 
+static inline void guc_log_context(struct drm_printer *p,
+  struct intel_context *ce)
+{
+   drm_printf(p, "GuC lrc descriptor %u:\n", ce->guc_id.id);
+   drm_printf(p, "\tHW Context Desc: 0x%08x\n", ce->lrc.lrca);
+   drm_printf(p, "\t\tLRC Head: Internal %u, Memory %u\n",
+  ce->ring->head,
+  ce->lrc_reg_state[CTX_RING_HEAD]);
+   drm_printf(p, "\t\tLRC Tail: Internal %u, Memory %u\n",
+  ce->ring->tail,
+  ce->lrc_reg_state[CTX_RING_TAIL]);
+   drm_printf(p, "\t\tContext Pin Count: %u\n",
+  atomic_read(>pin_count));
+   drm_printf(p, "\t\tGuC ID Ref Count: %u\n",
+  atomic_read(>guc_id.ref));
+   drm_printf(p, "\t\tSchedule State: 0x%x\n\n",
+  ce->guc_state.sched_state);
+}
+
 void intel_guc_submission_print_context_info(struct intel_guc *guc,
 struct drm_printer *p)
 {
@@ -3711,22 +3730,27 @@ void intel_guc_submission_print_context_info(struct 
intel_guc *guc,
 
xa_lock_irqsave(>context_lookup, flags);
xa_for_each(>context_lookup, index, ce) {
-   drm_printf(p, "GuC lrc descriptor %u:\n", ce->guc_id.id);
-   drm_printf(p, "\tHW Context Desc: 0x%08x\n", ce->lrc.lrca);
-   drm_printf(p, "\t\tLRC Head: Internal %u, Memory %u\n",
-  ce->ring->head,
-  ce->lrc_reg_state[CTX_RING_HEAD]);
-   drm_printf(p, "\t\tLRC Tail: Internal %u, Memory %u\n",
-  ce->ring->tail,
-  ce->lrc_reg_state[CTX_RING_TAIL]);
-   drm_printf(p, "\t\tContext Pin Count: %u\n",
-  atomic_read(>pin_count));
-   drm_printf(p, "\t\tGuC ID Ref Count: %u\n",
-  atomic_read(>guc_id.ref));
-   drm_printf(p, "\t\tSchedule State: 0x%x\n\n",
-  ce->guc_state.sched_state);
+   GEM_BUG_ON(intel_context_is_child(ce));
 
+   guc_log_context(p, ce);
guc_log_context_priority(p, ce);
+
+   if (intel_context_is_parent(ce)) {
+   struct guc_process_desc *desc = __get_process_desc(ce);
+   struct intel_context *child;
+
+   drm_printf(p, "\t\tNumber children: %u\n",
+  ce->parallel.number_children);
+   drm_printf(p, "\t\tWQI Head: %u\n",
+  READ_ONCE(desc->head));
+   drm_printf(p, "\t\tWQI Tail: %u\n",
+  READ_ONCE(desc->tail));
+   drm_printf(p, "\t\tWQI Status: %u\n\n",
+  READ_ONCE(desc->wq_status));
+
+   for_each_child(ce, child)
+   guc_log_context(p, child);
+   }
}
xa_unlock_irqrestore(>context_lookup, flags);
 }
-- 
2.32.0



[PATCH 08/25] drm/i915/guc: Add multi-lrc context registration

2021-10-14 Thread Matthew Brost
Add multi-lrc context registration H2G. In addition a workqueue and
process descriptor are setup during multi-lrc context registration as
these data structures are needed for multi-lrc submission.

v2:
 (John Harrison)
  - Move GuC specific fields into sub-struct
  - Clean up WQ defines
  - Add comment explaining math to derive WQ / PD address
v3:
 (John Harrison)
  - Add PARENT_SCRATCH_SIZE define
  - Update comment explaining multi-lrc register
v4:
 (John Harrison)
  - Move PARENT_SCRATCH_SIZE to common file

Signed-off-by: Matthew Brost 
---
 drivers/gpu/drm/i915/gt/intel_context.h   |   2 +
 drivers/gpu/drm/i915/gt/intel_context_types.h |  12 ++
 drivers/gpu/drm/i915/gt/intel_lrc.c   |   5 +
 .../gpu/drm/i915/gt/uc/abi/guc_actions_abi.h  |   1 +
 drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   |   2 -
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 115 +-
 6 files changed, 134 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context.h 
b/drivers/gpu/drm/i915/gt/intel_context.h
index b63c10a144af..9f0995150a7a 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.h
+++ b/drivers/gpu/drm/i915/gt/intel_context.h
@@ -44,6 +44,8 @@ void intel_context_free(struct intel_context *ce);
 int intel_context_reconfigure_sseu(struct intel_context *ce,
   const struct intel_sseu sseu);
 
+#define PARENT_SCRATCH_SIZEPAGE_SIZE
+
 static inline bool intel_context_is_child(struct intel_context *ce)
 {
return !!ce->parallel.parent;
diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h 
b/drivers/gpu/drm/i915/gt/intel_context_types.h
index 76dfca57cb45..48decb5ee954 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -239,6 +239,18 @@ struct intel_context {
struct intel_context *parent;
/** @number_children: number of children if parent */
u8 number_children;
+   /** @guc: GuC specific members for parallel submission */
+   struct {
+   /** @wqi_head: head pointer in work queue */
+   u16 wqi_head;
+   /** @wqi_tail: tail pointer in work queue */
+   u16 wqi_tail;
+   /**
+* @parent_page: page in context state (ce->state) used
+* by parent for work queue, process descriptor
+*/
+   u8 parent_page;
+   } guc;
} parallel;
 
 #ifdef CONFIG_DRM_I915_SELFTEST
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 3ef9eaf8c50e..56156cf18c41 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -942,6 +942,11 @@ __lrc_alloc_state(struct intel_context *ce, struct 
intel_engine_cs *engine)
context_size += PAGE_SIZE;
}
 
+   if (intel_context_is_parent(ce) && intel_engine_uses_guc(engine)) {
+   ce->parallel.guc.parent_page = context_size / PAGE_SIZE;
+   context_size += PARENT_SCRATCH_SIZE;
+   }
+
obj = i915_gem_object_create_lmem(engine->i915, context_size,
  I915_BO_ALLOC_PM_VOLATILE);
if (IS_ERR(obj))
diff --git a/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_abi.h 
b/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_abi.h
index 8ff58aff..ba10bd374cee 100644
--- a/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_abi.h
+++ b/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_abi.h
@@ -142,6 +142,7 @@ enum intel_guc_action {
INTEL_GUC_ACTION_REGISTER_COMMAND_TRANSPORT_BUFFER = 0x4505,
INTEL_GUC_ACTION_DEREGISTER_COMMAND_TRANSPORT_BUFFER = 0x4506,
INTEL_GUC_ACTION_DEREGISTER_CONTEXT_DONE = 0x4600,
+   INTEL_GUC_ACTION_REGISTER_CONTEXT_MULTI_LRC = 0x4601,
INTEL_GUC_ACTION_RESET_CLIENT = 0x5507,
INTEL_GUC_ACTION_LIMIT
 };
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
index fa4be13c8854..0eeb2a9feeed 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
@@ -52,8 +52,6 @@
 
 #define GUC_DOORBELL_INVALID   256
 
-#define GUC_WQ_SIZE(PAGE_SIZE * 2)
-
 /* Work queue item header definitions */
 #define WQ_STATUS_ACTIVE   1
 #define WQ_STATUS_SUSPENDED2
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index fd6594bc1b96..d9f5be00e586 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -344,6 +344,46 @@ static inline struct i915_priolist *to_priolist(struct 
rb_node *rb)
return rb_entry(rb, struct i915_priolist, node);
 }
 
+/*
+ * When using multi-lrc submission a scratch memory area is 

[PATCH 20/25] drm/i915: Multi-BB execbuf

2021-10-14 Thread Matthew Brost
Allow multiple batch buffers to be submitted in a single execbuf IOCTL
after a context has been configured with the 'set_parallel' extension.
The number batches is implicit based on the contexts configuration.

This is implemented with a series of loops. First a loop is used to find
all the batches, a loop to pin all the HW contexts, a loop to create all
the requests, a loop to submit (emit BB start, etc...) all the requests,
a loop to tie the requests to the VMAs they touch, and finally a loop to
commit the requests to the backend.

A composite fence is also created for the generated requests to return
to the user and to stick in dma resv slots.

No behavior from the existing IOCTL should be changed aside from when
throttling because the ring for a context is full. In this situation,
i915 will now wait while holding the object locks. This change was done
because the code is much simpler to wait while holding the locks and we
believe there isn't a huge benefit of dropping these locks. If this
proves false we can restructure the code to drop the locks during the
wait.

IGT: https://patchwork.freedesktop.org/patch/447008/?series=93071=1
media UMD: https://github.com/intel/media-driver/pull/1252

v2:
 (Matthew Brost)
  - Return proper error value if i915_request_create fails
v3:
 (John Harrison)
  - Add comment explaining create / add order loops + locking
  - Update commit message explaining different in IOCTL behavior
  - Line wrap some comments
  - eb_add_request returns void
  - Return -EINVAL rather triggering BUG_ON if cmd parser used
 (Checkpatch)
  - Check eb->batch_len[*current_batch]
v4:
 (CI)
  - Set batch len if passed if via execbuf args
  - Call __i915_request_skip after __i915_request_commit
 (Kernel test robot)
  - Initialize rq to NULL in eb_pin_timeline
v5:
 (John Harrison)
  - Fix typo in comments near bb order loops

Signed-off-by: Matthew Brost 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 783 --
 drivers/gpu/drm/i915/gt/intel_context.h   |   8 +-
 drivers/gpu/drm/i915/gt/intel_context_types.h |  10 +
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c |   2 +
 drivers/gpu/drm/i915/i915_request.h   |   9 +
 drivers/gpu/drm/i915/i915_vma.c   |  21 +-
 drivers/gpu/drm/i915/i915_vma.h   |  13 +-
 7 files changed, 595 insertions(+), 251 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index c75afc8784e3..fc30856e81fa 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -246,17 +246,25 @@ struct i915_execbuffer {
struct drm_i915_gem_exec_object2 *exec; /** ioctl execobj[] */
struct eb_vma *vma;
 
-   struct intel_engine_cs *engine; /** engine to queue the request to */
+   struct intel_gt *gt; /* gt for the execbuf */
struct intel_context *context; /* logical state for the request */
struct i915_gem_context *gem_context; /** caller's context */
 
-   struct i915_request *request; /** our request to build */
-   struct eb_vma *batch; /** identity of the batch obj/vma */
+   /** our requests to build */
+   struct i915_request *requests[MAX_ENGINE_INSTANCE + 1];
+   /** identity of the batch obj/vma */
+   struct eb_vma *batches[MAX_ENGINE_INSTANCE + 1];
struct i915_vma *trampoline; /** trampoline used for chaining */
 
+   /** used for excl fence in dma_resv objects when > 1 BB submitted */
+   struct dma_fence *composite_fence;
+
/** actual size of execobj[] as we may extend it for the cmdparser */
unsigned int buffer_count;
 
+   /* number of batches in execbuf IOCTL */
+   unsigned int num_batches;
+
/** list of vma not yet bound during reservation phase */
struct list_head unbound;
 
@@ -283,7 +291,8 @@ struct i915_execbuffer {
 
u64 invalid_flags; /** Set of execobj.flags that are invalid */
 
-   u64 batch_len; /** Length of batch within object */
+   /** Length of batch within object */
+   u64 batch_len[MAX_ENGINE_INSTANCE + 1];
u32 batch_start_offset; /** Location within object of batch */
u32 batch_flags; /** Flags composed for emit_bb_start() */
struct intel_gt_buffer_pool_node *batch_pool; /** pool node for batch 
buffer */
@@ -301,14 +310,13 @@ struct i915_execbuffer {
 };
 
 static int eb_parse(struct i915_execbuffer *eb);
-static struct i915_request *eb_pin_engine(struct i915_execbuffer *eb,
- bool throttle);
+static int eb_pin_engine(struct i915_execbuffer *eb, bool throttle);
 static void eb_unpin_engine(struct i915_execbuffer *eb);
 
 static inline bool eb_use_cmdparser(const struct i915_execbuffer *eb)
 {
-   return intel_engine_requires_cmd_parser(eb->engine) ||
-   (intel_engine_using_cmd_parser(eb->engine) &&
+   return 

[PATCH 10/25] drm/i915/guc: Assign contexts in parent-child relationship consecutive guc_ids

2021-10-14 Thread Matthew Brost
Assign contexts in parent-child relationship consecutive guc_ids. This
is accomplished by partitioning guc_id space between ones that need to
be consecutive (1/16 available guc_ids) and ones that do not (15/16 of
available guc_ids). The consecutive search is implemented via the bitmap
API.

This is a precursor to the full GuC multi-lrc implementation but aligns
to how GuC mutli-lrc interface is defined - guc_ids must be consecutive
when using the GuC multi-lrc interface.

v2:
 (Daniel Vetter)
  - Explicitly state why we assign consecutive guc_ids
v3:
 (John Harrison)
  - Bring back in spin lock

Signed-off-by: Matthew Brost 
Reviewed-by: John Harrison 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc.h|   6 +-
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 104 ++
 2 files changed, 86 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
index 74f071a0b6d5..4ca197f400ba 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
@@ -82,9 +82,13 @@ struct intel_guc {
 */
spinlock_t lock;
/**
-* @guc_ids: used to allocate new guc_ids
+* @guc_ids: used to allocate new guc_ids, single-lrc
 */
struct ida guc_ids;
+   /**
+* @guc_ids_bitmap: used to allocate new guc_ids, multi-lrc
+*/
+   unsigned long *guc_ids_bitmap;
/**
 * @guc_id_list: list of intel_context with valid guc_ids but no
 * refs
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index fbcf2dc2b2de..7ba4f1d6f69c 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -128,6 +128,16 @@ guc_create_virtual(struct intel_engine_cs **siblings, 
unsigned int count);
 
 #define GUC_REQUEST_SIZE 64 /* bytes */
 
+/*
+ * We reserve 1/16 of the guc_ids for multi-lrc as these need to be contiguous
+ * per the GuC submission interface. A different allocation algorithm is used
+ * (bitmap vs. ida) between multi-lrc and single-lrc hence the reason to
+ * partition the guc_id space. We believe the number of multi-lrc contexts in
+ * use should be low and 1/16 should be sufficient. Minimum of 32 guc_ids for
+ * multi-lrc.
+ */
+#define NUMBER_MULTI_LRC_GUC_ID(GUC_MAX_LRC_DESCRIPTORS / 16)
+
 /*
  * Below is a set of functions which control the GuC scheduling state which
  * require a lock.
@@ -1207,6 +1217,11 @@ int intel_guc_submission_init(struct intel_guc *guc)
INIT_WORK(>submission_state.destroyed_worker,
  destroyed_worker_func);
 
+   guc->submission_state.guc_ids_bitmap =
+   bitmap_zalloc(NUMBER_MULTI_LRC_GUC_ID, GFP_KERNEL);
+   if (!guc->submission_state.guc_ids_bitmap)
+   return -ENOMEM;
+
return 0;
 }
 
@@ -1218,6 +1233,7 @@ void intel_guc_submission_fini(struct intel_guc *guc)
guc_flush_destroyed_contexts(guc);
guc_lrc_desc_pool_destroy(guc);
i915_sched_engine_put(guc->sched_engine);
+   bitmap_free(guc->submission_state.guc_ids_bitmap);
 }
 
 static inline void queue_request(struct i915_sched_engine *sched_engine,
@@ -1269,18 +1285,43 @@ static void guc_submit_request(struct i915_request *rq)
spin_unlock_irqrestore(_engine->lock, flags);
 }
 
-static int new_guc_id(struct intel_guc *guc)
+static int new_guc_id(struct intel_guc *guc, struct intel_context *ce)
 {
-   return ida_simple_get(>submission_state.guc_ids, 0,
- GUC_MAX_LRC_DESCRIPTORS, GFP_KERNEL |
- __GFP_RETRY_MAYFAIL | __GFP_NOWARN);
+   int ret;
+
+   GEM_BUG_ON(intel_context_is_child(ce));
+
+   if (intel_context_is_parent(ce))
+   ret = 
bitmap_find_free_region(guc->submission_state.guc_ids_bitmap,
+ NUMBER_MULTI_LRC_GUC_ID,
+ 
order_base_2(ce->parallel.number_children
+  + 1));
+   else
+   ret = ida_simple_get(>submission_state.guc_ids,
+NUMBER_MULTI_LRC_GUC_ID,
+GUC_MAX_LRC_DESCRIPTORS,
+GFP_KERNEL | __GFP_RETRY_MAYFAIL |
+__GFP_NOWARN);
+   if (unlikely(ret < 0))
+   return ret;
+
+   ce->guc_id.id = ret;
+   return 0;
 }
 
 static void __release_guc_id(struct intel_guc *guc, struct intel_context *ce)
 {
+   GEM_BUG_ON(intel_context_is_child(ce));
+
if (!context_guc_id_invalid(ce)) {
-   ida_simple_remove(>submission_state.guc_ids,
- 

[PATCH 07/25] drm/i915/guc: Introduce context parent-child relationship

2021-10-14 Thread Matthew Brost
Introduce context parent-child relationship. Once this relationship is
created all pinning / unpinning operations are directed to the parent
context. The parent context is responsible for pinning all of its
children and itself.

This is a precursor to the full GuC multi-lrc implementation but aligns
to how GuC mutli-lrc interface is defined - a single H2G is used
register / deregister all of the contexts simultaneously.

Subsequent patches in the series will implement the pinning / unpinning
operations for parent / child contexts.

v2:
 (Daniel Vetter)
  - Add kernel doc, add wrapper to access parent to ensure safety
v3:
 (John Harrison)
  - Fix comment explaing GEM_BUG_ON in to_parent()
  - Make variable names generic (non-GuC specific)
v4:
 (John Harrison)
  - s/its'/its/g

Signed-off-by: Matthew Brost 
Reviewed-by: John Harrison 
---
 drivers/gpu/drm/i915/gt/intel_context.c   | 29 +
 drivers/gpu/drm/i915/gt/intel_context.h   | 41 +++
 drivers/gpu/drm/i915/gt/intel_context_types.h | 21 ++
 3 files changed, 91 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_context.c 
b/drivers/gpu/drm/i915/gt/intel_context.c
index f98c9f470ba1..79f321c6c008 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -404,6 +404,8 @@ intel_context_init(struct intel_context *ce, struct 
intel_engine_cs *engine)
 
INIT_LIST_HEAD(>destroyed_link);
 
+   INIT_LIST_HEAD(>parallel.child_list);
+
/*
 * Initialize fence to be complete as this is expected to be complete
 * unless there is a pending schedule disable outstanding.
@@ -418,10 +420,17 @@ intel_context_init(struct intel_context *ce, struct 
intel_engine_cs *engine)
 
 void intel_context_fini(struct intel_context *ce)
 {
+   struct intel_context *child, *next;
+
if (ce->timeline)
intel_timeline_put(ce->timeline);
i915_vm_put(ce->vm);
 
+   /* Need to put the creation ref for the children */
+   if (intel_context_is_parent(ce))
+   for_each_child_safe(ce, child, next)
+   intel_context_put(child);
+
mutex_destroy(>pin_mutex);
i915_active_fini(>active);
i915_sw_fence_fini(>guc_state.blocked);
@@ -538,6 +547,26 @@ struct i915_request 
*intel_context_find_active_request(struct intel_context *ce)
return active;
 }
 
+void intel_context_bind_parent_child(struct intel_context *parent,
+struct intel_context *child)
+{
+   /*
+* Callers responsibility to validate that this function is used
+* correctly but we use GEM_BUG_ON here ensure that they do.
+*/
+   GEM_BUG_ON(!intel_engine_uses_guc(parent->engine));
+   GEM_BUG_ON(intel_context_is_pinned(parent));
+   GEM_BUG_ON(intel_context_is_child(parent));
+   GEM_BUG_ON(intel_context_is_pinned(child));
+   GEM_BUG_ON(intel_context_is_child(child));
+   GEM_BUG_ON(intel_context_is_parent(child));
+
+   parent->parallel.number_children++;
+   list_add_tail(>parallel.child_link,
+ >parallel.child_list);
+   child->parallel.parent = parent;
+}
+
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
 #include "selftest_context.c"
 #endif
diff --git a/drivers/gpu/drm/i915/gt/intel_context.h 
b/drivers/gpu/drm/i915/gt/intel_context.h
index c41098950746..b63c10a144af 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.h
+++ b/drivers/gpu/drm/i915/gt/intel_context.h
@@ -44,6 +44,47 @@ void intel_context_free(struct intel_context *ce);
 int intel_context_reconfigure_sseu(struct intel_context *ce,
   const struct intel_sseu sseu);
 
+static inline bool intel_context_is_child(struct intel_context *ce)
+{
+   return !!ce->parallel.parent;
+}
+
+static inline bool intel_context_is_parent(struct intel_context *ce)
+{
+   return !!ce->parallel.number_children;
+}
+
+static inline bool intel_context_is_pinned(struct intel_context *ce);
+
+static inline struct intel_context *
+intel_context_to_parent(struct intel_context *ce)
+{
+   if (intel_context_is_child(ce)) {
+   /*
+* The parent holds ref count to the child so it is always safe
+* for the parent to access the child, but the child has a
+* pointer to the parent without a ref. To ensure this is safe
+* the child should only access the parent pointer while the
+* parent is pinned.
+*/
+   GEM_BUG_ON(!intel_context_is_pinned(ce->parallel.parent));
+
+   return ce->parallel.parent;
+   } else {
+   return ce;
+   }
+}
+
+void intel_context_bind_parent_child(struct intel_context *parent,
+struct intel_context *child);
+
+#define for_each_child(parent, ce)\
+   list_for_each_entry(ce, &(parent)->parallel.child_list,\
+

Re: [PATCH 1/7] drm/bridge: display-connector: implement bus fmts callbacks

2021-10-14 Thread Sam Ravnborg
Hi Neil,

code looks fine. A few improvement proposals to the comments.
With the include order fixed and the comments considered:
Reviewed-by: Sam Ravnborg 

Sam

On Thu, Oct 14, 2021 at 05:26:00PM +0200, Neil Armstrong wrote:
> Since this bridge is tied to the connector, it acts like a passthrough,
> so concerning the output & input bus formats, either pass the bus formats 
> from the
> previous bridge or return fallback data like done in the bridge function:
> drm_atomic_bridge_chain_select_bus_fmts() & select_bus_fmt_recursive.
> 
> This permits avoiding skipping the negociation if the remaining bridge chain 
> has
> all the bits in place.
> 
> Without this bus fmt negociation breaks on drm/meson HDMI pipeline when 
> attaching
> dw-hdmi with DRM_BRIDGE_ATTACH_NO_CONNECTOR, because the last bridge of the
> display-connector doesn't implement buf fmt callbacks and MEDIA_BUS_FMT_FIXED
> is used leading to select an unsupported default bus format from dw-hdmi.
> 
> Signed-off-by: Neil Armstrong 
> ---
>  drivers/gpu/drm/bridge/display-connector.c | 88 ++
>  1 file changed, 88 insertions(+)
> 
> diff --git a/drivers/gpu/drm/bridge/display-connector.c 
> b/drivers/gpu/drm/bridge/display-connector.c
> index 05eb759da6fc..9697ac173157 100644
> --- a/drivers/gpu/drm/bridge/display-connector.c
> +++ b/drivers/gpu/drm/bridge/display-connector.c
> @@ -14,6 +14,7 @@
>  #include 
>  
>  #include 
> +#include 
>  #include 
Alphabetic order.

>  
>  struct display_connector {
> @@ -87,10 +88,97 @@ static struct edid *display_connector_get_edid(struct 
> drm_bridge *bridge,
>   return drm_get_edid(connector, conn->bridge.ddc);
>  }
>  
> +/*
> + * Since this bridge is tied to the connector, it acts like a passthrough,
> + * so concerning the output bus formats, either pass the bus formats from the
> + * previous bridge or return fallback data like done in the bridge function:
> + * drm_atomic_bridge_chain_select_bus_fmts().
> + * This permits avoiding skipping the negociation if the bridge chain has all
> + * bits in place.
negociation if => negotiation of

Consider the following wording:
This supports negotiation if the bridge chain has all bits in place.

> + */
> +static u32 *display_connector_get_output_bus_fmts(struct drm_bridge *bridge,
> + struct drm_bridge_state *bridge_state,
> + struct drm_crtc_state *crtc_state,
> + struct drm_connector_state *conn_state,
> + unsigned int *num_output_fmts)
> +{
> + struct drm_bridge *prev_bridge = drm_bridge_get_prev_bridge(bridge);
> + struct drm_bridge_state *prev_bridge_state;
> +
> + if (!prev_bridge || !prev_bridge->funcs->atomic_get_output_bus_fmts) {
> + struct drm_connector *conn = conn_state->connector;
> + u32 *out_bus_fmts;
> +
> + *num_output_fmts = 1;
> + out_bus_fmts = kmalloc(sizeof(*out_bus_fmts), GFP_KERNEL);
> + if (!out_bus_fmts)
> + return NULL;
> +
> + if (conn->display_info.num_bus_formats &&
> + conn->display_info.bus_formats)
> + out_bus_fmts[0] = conn->display_info.bus_formats[0];
> + else
> + out_bus_fmts[0] = MEDIA_BUS_FMT_FIXED;
> +
> + return out_bus_fmts;
> + }
> +
> + prev_bridge_state = drm_atomic_get_new_bridge_state(crtc_state->state,
> + prev_bridge);
> +
> + return prev_bridge->funcs->atomic_get_output_bus_fmts(prev_bridge, 
> prev_bridge_state,
> +   crtc_state, 
> conn_state,
> +   num_output_fmts);
> +}
> +
> +/*
> + * Since this bridge is tied to the connector, it acts like a passthrough,
> + * so concerning the input bus formats, either pass the bus formats from the
> + * previous bridge or return fallback data like done in the bridge function:
> + * select_bus_fmt_recursive() when atomic_get_input_bus_fmts is not 
> supported.
Maybe use this this:
from the previous bridge or MEDIA_BUS_FMT_FIXED (like 
select_bus_fmt_recursive())


> + * This permits avoiding skipping the negociation if the bridge chain has all
> + * bits in place.
Like above
> + */
> +static u32 *display_connector_get_input_bus_fmts(struct drm_bridge *bridge,
> + struct drm_bridge_state *bridge_state,
> + struct drm_crtc_state *crtc_state,
> + struct drm_connector_state *conn_state,
> + u32 output_fmt,
> + unsigned int *num_input_fmts)
> +{
> + struct drm_bridge *prev_bridge = drm_bridge_get_prev_bridge(bridge);
> + struct drm_bridge_state 

[PATCH 02/25] drm/i915/guc: Take GT PM ref when deregistering context

2021-10-14 Thread Matthew Brost
Taking a PM reference to prevent intel_gt_wait_for_idle from short
circuiting while a deregister context H2G is in flight. To do this must
issue the deregister H2G from a worker as context can be destroyed from
an atomic context and taking GT PM ref blows up. Previously we took a
runtime PM from this atomic context which worked but will stop working
once runtime pm autosuspend in enabled.

So this patch is two fold, stop intel_gt_wait_for_idle from short
circuting and fix runtime pm autosuspend.

v2:
 (John Harrison)
  - Split structure changes out in different patch
 (Tvrtko)
  - Don't drop lock in deregister_destroyed_contexts
v3:
 (John Harrison)
  - Flush destroyed contexts before destroying context reg pool

Signed-off-by: Matthew Brost 
Reviewed-by: John Harrison 
---
 drivers/gpu/drm/i915/gt/intel_context.c   |   2 +
 drivers/gpu/drm/i915/gt/intel_context_types.h |   7 +
 drivers/gpu/drm/i915/gt/intel_engine_pm.h |   5 +
 drivers/gpu/drm/i915/gt/intel_gt_pm.h |   4 +
 drivers/gpu/drm/i915/gt/uc/intel_guc.h|  11 ++
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 146 +++---
 6 files changed, 121 insertions(+), 54 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context.c 
b/drivers/gpu/drm/i915/gt/intel_context.c
index 35babd02ddfe..d008ef8623ce 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -400,6 +400,8 @@ intel_context_init(struct intel_context *ce, struct 
intel_engine_cs *engine)
ce->guc_id.id = GUC_INVALID_LRC_ID;
INIT_LIST_HEAD(>guc_id.link);
 
+   INIT_LIST_HEAD(>destroyed_link);
+
/*
 * Initialize fence to be complete as this is expected to be complete
 * unless there is a pending schedule disable outstanding.
diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h 
b/drivers/gpu/drm/i915/gt/intel_context_types.h
index e7e3984aab78..4613d027cbc3 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -213,6 +213,13 @@ struct intel_context {
struct list_head link;
} guc_id;
 
+   /**
+* @destroyed_link: link in guc->submission_state.destroyed_contexts, in
+* list when context is pending to be destroyed (deregistered with the
+* GuC), protected by guc->submission_state.lock
+*/
+   struct list_head destroyed_link;
+
 #ifdef CONFIG_DRM_I915_SELFTEST
/**
 * @drop_schedule_enable: Force drop of schedule enable G2H for selftest
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.h 
b/drivers/gpu/drm/i915/gt/intel_engine_pm.h
index 8520c595f5e1..6fdeae668e6e 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_pm.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.h
@@ -16,6 +16,11 @@ intel_engine_pm_is_awake(const struct intel_engine_cs 
*engine)
return intel_wakeref_is_active(>wakeref);
 }
 
+static inline void __intel_engine_pm_get(struct intel_engine_cs *engine)
+{
+   __intel_wakeref_get(>wakeref);
+}
+
 static inline void intel_engine_pm_get(struct intel_engine_cs *engine)
 {
intel_wakeref_get(>wakeref);
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.h 
b/drivers/gpu/drm/i915/gt/intel_gt_pm.h
index d0588d8aaa44..05de6c1af25b 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_pm.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.h
@@ -41,6 +41,10 @@ static inline void intel_gt_pm_put_async(struct intel_gt *gt)
intel_wakeref_put_async(>wakeref);
 }
 
+#define with_intel_gt_pm(gt, tmp) \
+   for (tmp = 1, intel_gt_pm_get(gt); tmp; \
+intel_gt_pm_put(gt), tmp = 0)
+
 static inline int intel_gt_pm_wait_for_idle(struct intel_gt *gt)
 {
return intel_wakeref_wait_for_idle(>wakeref);
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
index 82e248c2290c..74f071a0b6d5 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
@@ -90,6 +90,17 @@ struct intel_guc {
 * refs
 */
struct list_head guc_id_list;
+   /**
+* @destroyed_contexts: list of contexts waiting to be destroyed
+* (deregistered with the GuC)
+*/
+   struct list_head destroyed_contexts;
+   /**
+* @destroyed_worker: worker to deregister contexts, need as we
+* need to take a GT PM reference and can't from destroy
+* function as it might be in an atomic context (no sleeping)
+*/
+   struct work_struct destroyed_worker;
} submission_state;
 
/**
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 48bdcdb823b3..5c53428032b5 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -90,8 +90,8 @@
  

[PATCH 03/25] drm/i915/guc: Take engine PM when a context is pinned with GuC submission

2021-10-14 Thread Matthew Brost
Taking a PM reference to prevent intel_gt_wait_for_idle from short
circuiting while any user context has scheduling enabled. Returning GT
idle when it is not can cause all sorts of issues throughout the stack.

v2:
 (Daniel Vetter)
  - Add might_lock annotations to pin / unpin function
v3:
 (CI)
  - Drop intel_engine_pm_might_put from unpin path as an async put is
used
v4:
 (John Harrison)
  - Make intel_engine_pm_might_get/put work with GuC virtual engines
  - Update commit message
v5:
  - Update commit message again

Signed-off-by: Matthew Brost 
Reviewed-by: John Harrison 
---
 drivers/gpu/drm/i915/gt/intel_context.c   |  2 ++
 drivers/gpu/drm/i915/gt/intel_engine_pm.h | 32 +
 drivers/gpu/drm/i915/gt/intel_gt_pm.h | 10 ++
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 36 +--
 drivers/gpu/drm/i915/intel_wakeref.h  | 12 +++
 5 files changed, 89 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context.c 
b/drivers/gpu/drm/i915/gt/intel_context.c
index d008ef8623ce..f98c9f470ba1 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -240,6 +240,8 @@ int __intel_context_do_pin_ww(struct intel_context *ce,
if (err)
goto err_post_unpin;
 
+   intel_engine_pm_might_get(ce->engine);
+
if (unlikely(intel_context_is_closed(ce))) {
err = -ENOENT;
goto err_unlock;
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.h 
b/drivers/gpu/drm/i915/gt/intel_engine_pm.h
index 6fdeae668e6e..d68675925b79 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_pm.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.h
@@ -6,9 +6,11 @@
 #ifndef INTEL_ENGINE_PM_H
 #define INTEL_ENGINE_PM_H
 
+#include "i915_drv.h"
 #include "i915_request.h"
 #include "intel_engine_types.h"
 #include "intel_wakeref.h"
+#include "intel_gt_pm.h"
 
 static inline bool
 intel_engine_pm_is_awake(const struct intel_engine_cs *engine)
@@ -31,6 +33,21 @@ static inline bool intel_engine_pm_get_if_awake(struct 
intel_engine_cs *engine)
return intel_wakeref_get_if_active(>wakeref);
 }
 
+static inline void intel_engine_pm_might_get(struct intel_engine_cs *engine)
+{
+   if (!intel_engine_is_virtual(engine)) {
+   intel_wakeref_might_get(>wakeref);
+   } else {
+   struct intel_gt *gt = engine->gt;
+   struct intel_engine_cs *tengine;
+   intel_engine_mask_t tmp, mask = engine->mask;
+
+   for_each_engine_masked(tengine, gt, mask, tmp)
+   intel_wakeref_might_get(>wakeref);
+   }
+   intel_gt_pm_might_get(engine->gt);
+}
+
 static inline void intel_engine_pm_put(struct intel_engine_cs *engine)
 {
intel_wakeref_put(>wakeref);
@@ -52,6 +69,21 @@ static inline void intel_engine_pm_flush(struct 
intel_engine_cs *engine)
intel_wakeref_unlock_wait(>wakeref);
 }
 
+static inline void intel_engine_pm_might_put(struct intel_engine_cs *engine)
+{
+   if (!intel_engine_is_virtual(engine)) {
+   intel_wakeref_might_put(>wakeref);
+   } else {
+   struct intel_gt *gt = engine->gt;
+   struct intel_engine_cs *tengine;
+   intel_engine_mask_t tmp, mask = engine->mask;
+
+   for_each_engine_masked(tengine, gt, mask, tmp)
+   intel_wakeref_might_put(>wakeref);
+   }
+   intel_gt_pm_might_put(engine->gt);
+}
+
 static inline struct i915_request *
 intel_engine_create_kernel_request(struct intel_engine_cs *engine)
 {
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.h 
b/drivers/gpu/drm/i915/gt/intel_gt_pm.h
index 05de6c1af25b..bc898df7a48c 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_pm.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.h
@@ -31,6 +31,11 @@ static inline bool intel_gt_pm_get_if_awake(struct intel_gt 
*gt)
return intel_wakeref_get_if_active(>wakeref);
 }
 
+static inline void intel_gt_pm_might_get(struct intel_gt *gt)
+{
+   intel_wakeref_might_get(>wakeref);
+}
+
 static inline void intel_gt_pm_put(struct intel_gt *gt)
 {
intel_wakeref_put(>wakeref);
@@ -41,6 +46,11 @@ static inline void intel_gt_pm_put_async(struct intel_gt *gt)
intel_wakeref_put_async(>wakeref);
 }
 
+static inline void intel_gt_pm_might_put(struct intel_gt *gt)
+{
+   intel_wakeref_might_put(>wakeref);
+}
+
 #define with_intel_gt_pm(gt, tmp) \
for (tmp = 1, intel_gt_pm_get(gt); tmp; \
 intel_gt_pm_put(gt), tmp = 0)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 5c53428032b5..795cc071d05d 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -1571,7 +1571,12 @@ static int guc_context_pre_pin(struct intel_context *ce,
 
 static int guc_context_pin(struct intel_context *ce, void *vaddr)
 {
-   

[PATCH 00/25] Parallel submission aka multi-bb execbuf

2021-10-14 Thread Matthew Brost
As discussed in [1] we are introducing a new parallel submission uAPI
for the i915 which allows more than 1 BB to be submitted in an execbuf
IOCTL. This is the implemenation for both GuC and execlists.

In addition to selftests in the series, an IGT is available implemented
in the first 4 patches [2].

The execbuf IOCTL changes have been done in a single large patch (#21)
as all the changes flow together and I believe a single patch will be
better if some one has to lookup this change in the future. Can split in
a series of smaller patches if desired.

This code is available in a public [3] repo for UMD teams to test there
code on.

v2: Drop complicated state machine to block in kernel if no guc_ids
available, perma-pin parallel contexts, reworker execbuf IOCTL to be a
series of loops inside the IOCTL rather than 1 large one on the outside,
address Daniel Vetter's comments
v3: Address John Harrison's comments, add a couple of patches which fix
bugs found internally
v4: Address John Harrison's latest round of comments
v5: Address John Harrison's latest round of comments, resend for CI

Signed-off-by: Matthew Brost 

[1] https://patchwork.freedesktop.org/series/92028/
[2] https://patchwork.freedesktop.org/series/93071/
[3] 
https://gitlab.freedesktop.org/mbrost/mbrost-drm-intel/-/tree/drm-intel-parallel

Matthew Brost (25):
  drm/i915/guc: Move GuC guc_id allocation under submission state
sub-struct
  drm/i915/guc: Take GT PM ref when deregistering context
  drm/i915/guc: Take engine PM when a context is pinned with GuC
submission
  drm/i915/guc: Don't call switch_to_kernel_context with GuC submission
  drm/i915: Add logical engine mapping
  drm/i915: Expose logical engine instance to user
  drm/i915/guc: Introduce context parent-child relationship
  drm/i915/guc: Add multi-lrc context registration
  drm/i915/guc: Ensure GuC schedule operations do not operate on child
contexts
  drm/i915/guc: Assign contexts in parent-child relationship consecutive
guc_ids
  drm/i915/guc: Implement parallel context pin / unpin functions
  drm/i915/guc: Implement multi-lrc submission
  drm/i915/guc: Insert submit fences between requests in parent-child
relationship
  drm/i915/guc: Implement multi-lrc reset
  drm/i915/guc: Update debugfs for GuC multi-lrc
  drm/i915/guc: Connect UAPI to GuC multi-lrc interface
  drm/i915/doc: Update parallel submit doc to point to i915_drm.h
  drm/i915/guc: Add basic GuC multi-lrc selftest
  drm/i915/guc: Implement no mid batch preemption for multi-lrc
  drm/i915: Multi-BB execbuf
  drm/i915/guc: Handle errors in multi-lrc requests
  drm/i915: Make request conflict tracking understand parallel submits
  drm/i915: Update I915_GEM_BUSY IOCTL to understand composite fences
  drm/i915: Enable multi-bb execbuf
  drm/i915/execlists: Weak parallel submission support for execlists

 Documentation/gpu/rfc/i915_parallel_execbuf.h |  122 --
 Documentation/gpu/rfc/i915_scheduler.rst  |4 +-
 drivers/gpu/drm/i915/gem/i915_gem_busy.c  |   57 +-
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |  229 ++-
 .../gpu/drm/i915/gem/i915_gem_context_types.h |   16 +-
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|  786 ++---
 drivers/gpu/drm/i915/gt/intel_context.c   |   50 +-
 drivers/gpu/drm/i915/gt/intel_context.h   |   56 +-
 drivers/gpu/drm/i915/gt/intel_context_types.h |   73 +-
 drivers/gpu/drm/i915/gt/intel_engine.h|   12 +-
 drivers/gpu/drm/i915/gt/intel_engine_cs.c |   66 +-
 drivers/gpu/drm/i915/gt/intel_engine_pm.c |   13 +
 drivers/gpu/drm/i915/gt/intel_engine_pm.h |   37 +
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |7 +
 .../drm/i915/gt/intel_execlists_submission.c  |   63 +-
 drivers/gpu/drm/i915/gt/intel_gt_pm.h |   14 +
 drivers/gpu/drm/i915/gt/intel_lrc.c   |7 +
 drivers/gpu/drm/i915/gt/selftest_execlists.c  |   12 +-
 .../gpu/drm/i915/gt/uc/abi/guc_actions_abi.h  |1 +
 drivers/gpu/drm/i915/gt/uc/intel_guc.c|   29 +
 drivers/gpu/drm/i915/gt/uc/intel_guc.h|   54 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c|2 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c |   24 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   |   34 +-
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 1444 ++---
 .../drm/i915/gt/uc/selftest_guc_multi_lrc.c   |  179 ++
 drivers/gpu/drm/i915/i915_query.c |2 +
 drivers/gpu/drm/i915/i915_request.c   |  143 +-
 drivers/gpu/drm/i915/i915_request.h   |   23 +
 drivers/gpu/drm/i915/i915_vma.c   |   21 +-
 drivers/gpu/drm/i915/i915_vma.h   |   13 +-
 drivers/gpu/drm/i915/intel_wakeref.h  |   12 +
 .../drm/i915/selftests/i915_live_selftests.h  |1 +
 include/uapi/drm/i915_drm.h   |  139 +-
 34 files changed, 3053 insertions(+), 692 deletions(-)
 delete mode 100644 Documentation/gpu/rfc/i915_parallel_execbuf.h
 create mode 100644 

[PATCH 01/25] drm/i915/guc: Move GuC guc_id allocation under submission state sub-struct

2021-10-14 Thread Matthew Brost
Move guc_id allocation under submission state sub-struct as a future
patch will reuse the spin lock as a global submission state lock. Moving
this into sub-struct makes ownership of fields / lock clear.

v2:
 (Docs)
  - Add comment for submission_state sub-structure
v3:
 (John Harrison)
  - Fixup a few comments
v4:
 (John Harrison)
  - Fix typo

Signed-off-by: Matthew Brost 
Reviewed-by: John Harrison 
---
 drivers/gpu/drm/i915/gt/intel_context_types.h |  6 +--
 drivers/gpu/drm/i915/gt/uc/intel_guc.h| 28 +++
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 46 ++-
 3 files changed, 46 insertions(+), 34 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h 
b/drivers/gpu/drm/i915/gt/intel_context_types.h
index 12252c411159..e7e3984aab78 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -197,18 +197,18 @@ struct intel_context {
struct {
/**
 * @id: handle which is used to uniquely identify this context
-* with the GuC, protected by guc->contexts_lock
+* with the GuC, protected by guc->submission_state.lock
 */
u16 id;
/**
 * @ref: the number of references to the guc_id, when
 * transitioning in and out of zero protected by
-* guc->contexts_lock
+* guc->submission_state.lock
 */
atomic_t ref;
/**
 * @link: in guc->guc_id_list when the guc_id has no refs but is
-* still valid, protected by guc->contexts_lock
+* still valid, protected by guc->submission_state.lock
 */
struct list_head link;
} guc_id;
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
index 5dd174babf7a..82e248c2290c 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
@@ -71,16 +71,26 @@ struct intel_guc {
} interrupts;
 
/**
-* @contexts_lock: protects guc_ids, guc_id_list, ce->guc_id.id, and
-* ce->guc_id.ref when transitioning in and out of zero
+* @submission_state: sub-structure for submission state protected by
+* single lock
 */
-   spinlock_t contexts_lock;
-   /** @guc_ids: used to allocate unique ce->guc_id.id values */
-   struct ida guc_ids;
-   /**
-* @guc_id_list: list of intel_context with valid guc_ids but no refs
-*/
-   struct list_head guc_id_list;
+   struct {
+   /**
+* @lock: protects everything in submission_state,
+* ce->guc_id.id, and ce->guc_id.ref when transitioning in and
+* out of zero
+*/
+   spinlock_t lock;
+   /**
+* @guc_ids: used to allocate new guc_ids
+*/
+   struct ida guc_ids;
+   /**
+* @guc_id_list: list of intel_context with valid guc_ids but no
+* refs
+*/
+   struct list_head guc_id_list;
+   } submission_state;
 
/**
 * @submission_supported: tracks whether we support GuC submission on
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index ba0de35f6323..48bdcdb823b3 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -68,7 +68,7 @@
  * fence is used to stall all requests associated with this guc_id until the
  * corresponding G2H returns indicating the guc_id has been deregistered.
  *
- * guc_ids:
+ * submission_state.guc_ids:
  * Unique number associated with private GuC context data passed in during
  * context registration / submission / deregistration. 64k available. Simple 
ida
  * is used for allocation.
@@ -89,7 +89,7 @@
  * sched_engine can be submitting at a time. Currently only one sched_engine is
  * used for all of GuC submission but that could change in the future.
  *
- * guc->contexts_lock
+ * guc->submission_state.lock
  * Protects guc_id allocation for the given GuC, i.e. only one context can be
  * doing guc_id allocation operations at a time for each GuC in the system.
  *
@@ -103,7 +103,7 @@
  *
  * Lock ordering rules:
  * sched_engine->lock -> ce->guc_state.lock
- * guc->contexts_lock -> ce->guc_state.lock
+ * guc->submission_state.lock -> ce->guc_state.lock
  *
  * Reset races:
  * When a full GT reset is triggered it is assumed that some G2H responses to
@@ -1148,9 +1148,9 @@ int intel_guc_submission_init(struct intel_guc *guc)
 
xa_init_flags(>context_lookup, XA_FLAGS_LOCK_IRQ);
 
-   spin_lock_init(>contexts_lock);
-   INIT_LIST_HEAD(>guc_id_list);
-   ida_init(>guc_ids);
+   

Re: [PATCH 16/25] drm/i915/guc: Connect UAPI to GuC multi-lrc interface

2021-10-14 Thread John Harrison

On 10/14/2021 09:41, Matthew Brost wrote:

On Thu, Oct 14, 2021 at 09:43:36AM -0700, John Harrison wrote:

On 10/14/2021 08:32, Matthew Brost wrote:

On Wed, Oct 13, 2021 at 06:02:42PM -0700, John Harrison wrote:

On 10/13/2021 13:42, Matthew Brost wrote:

Introduce 'set parallel submit' extension to connect UAPI to GuC
multi-lrc interface. Kernel doc in new uAPI should explain it all.

IGT: https://patchwork.freedesktop.org/patch/447008/?series=93071=1
media UMD: https://github.com/intel/media-driver/pull/1252

v2:
(Daniel Vetter)
 - Add IGT link and placeholder for media UMD link
v3:
(Kernel test robot)
 - Fix warning in unpin engines call
(John Harrison)
 - Reword a bunch of the kernel doc
v4:
(John Harrison)
 - Add comment why perma-pin is done after setting gem context
 - Update some comments / docs for proto contexts

Cc: Tvrtko Ursulin 
Signed-off-by: Matthew Brost 
---
drivers/gpu/drm/i915/gem/i915_gem_context.c   | 228 +-
.../gpu/drm/i915/gem/i915_gem_context_types.h |  16 +-
drivers/gpu/drm/i915/gt/intel_context_types.h |   9 +-
drivers/gpu/drm/i915/gt/intel_engine.h|  12 +-
drivers/gpu/drm/i915/gt/intel_engine_cs.c |   6 +-
.../drm/i915/gt/intel_execlists_submission.c  |   6 +-
drivers/gpu/drm/i915/gt/selftest_execlists.c  |  12 +-
.../gpu/drm/i915/gt/uc/intel_guc_submission.c | 114 -
include/uapi/drm/i915_drm.h   | 131 ++
9 files changed, 503 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index d225d3dd0b40..6f23aff6e642 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -556,9 +556,150 @@ set_proto_ctx_engines_bond(struct i915_user_extension 
__user *base, void *data)
return 0;
}
+static int
+set_proto_ctx_engines_parallel_submit(struct i915_user_extension __user *base,
+ void *data)
+{
+   struct i915_context_engines_parallel_submit __user *ext =
+   container_of_user(base, typeof(*ext), base);
+   const struct set_proto_ctx_engines *set = data;
+   struct drm_i915_private *i915 = set->i915;
+   u64 flags;
+   int err = 0, n, i, j;
+   u16 slot, width, num_siblings;
+   struct intel_engine_cs **siblings = NULL;
+   intel_engine_mask_t prev_mask;
+
+   /* Disabling for now */
+   return -ENODEV;
+
+   /* FIXME: This is NIY for execlists */
+   if (!(intel_uc_uses_guc_submission(>gt.uc)))
+   return -ENODEV;
+
+   if (get_user(slot, >engine_index))
+   return -EFAULT;
+
+   if (get_user(width, >width))
+   return -EFAULT;
+
+   if (get_user(num_siblings, >num_siblings))
+   return -EFAULT;
+
+   if (slot >= set->num_engines) {
+   drm_dbg(>drm, "Invalid placement value, %d >= %d\n",
+   slot, set->num_engines);
+   return -EINVAL;
+   }
+
+   if (set->engines[slot].type != I915_GEM_ENGINE_TYPE_INVALID) {
+   drm_dbg(>drm,
+   "Invalid placement[%d], already occupied\n", slot);
+   return -EINVAL;
+   }
+
+   if (get_user(flags, >flags))
+   return -EFAULT;
+
+   if (flags) {
+   drm_dbg(>drm, "Unknown flags 0x%02llx", flags);
+   return -EINVAL;
+   }
+
+   for (n = 0; n < ARRAY_SIZE(ext->mbz64); n++) {
+   err = check_user_mbz(>mbz64[n]);
+   if (err)
+   return err;
+   }
+
+   if (width < 2) {
+   drm_dbg(>drm, "Width (%d) < 2\n", width);
+   return -EINVAL;
+   }
+
+   if (num_siblings < 1) {
+   drm_dbg(>drm, "Number siblings (%d) < 1\n",
+   num_siblings);
+   return -EINVAL;
+   }
+
+   siblings = kmalloc_array(num_siblings * width,
+sizeof(*siblings),
+GFP_KERNEL);
+   if (!siblings)
+   return -ENOMEM;
+
+   /* Create contexts / engines */
+   for (i = 0; i < width; ++i) {
+   intel_engine_mask_t current_mask = 0;
+   struct i915_engine_class_instance prev_engine;
+
+   for (j = 0; j < num_siblings; ++j) {
+   struct i915_engine_class_instance ci;
+
+   n = i * num_siblings + j;
+   if (copy_from_user(, >engines[n], sizeof(ci))) {
+   err = -EFAULT;
+   goto out_err;
+   }
+
+   siblings[n] =
+   intel_engine_lookup_user(i915, ci.engine_class,
+ci.engine_instance);
+   if 

Re: [PATCH v1 2/2] mm: remove extra ZONE_DEVICE struct page refcount

2021-10-14 Thread Ralph Campbell



On 10/14/21 10:06 AM, Jason Gunthorpe wrote:

On Thu, Oct 14, 2021 at 10:39:28AM -0500, Alex Sierra wrote:

From: Ralph Campbell 

ZONE_DEVICE struct pages have an extra reference count that complicates the
code for put_page() and several places in the kernel that need to check the
reference count to see that a page is not being used (gup, compaction,
migration, etc.). Clean up the code so the reference count doesn't need to
be treated specially for ZONE_DEVICE.

Signed-off-by: Ralph Campbell 
Signed-off-by: Alex Sierra 
Reviewed-by: Christoph Hellwig 
---
v2:
AS: merged this patch in linux 5.11 version

v5:
AS: add condition at try_grab_page to check for the zone device type, while
page ref counter is checked less/equal to zero. In case of device zone, pages
ref counter are initialized to zero.

v7:
AS: fix condition at try_grab_page added at v5, is invalid. It supposed
to fix xfstests/generic/413 test, however, there's a known issue on
this test where DAX mapped area DIO to non-DAX expect to fail.
https://patchwork.kernel.org/project/fstests/patch/1489463960-3579-1-git-send-email-xz...@redhat.com
This condition was removed after rebase over patch series
https://lore.kernel.org/r/20210813044133.1536842-4-jhubb...@nvidia.com
---
  arch/powerpc/kvm/book3s_hv_uvmem.c |  2 +-
  drivers/gpu/drm/nouveau/nouveau_dmem.c |  2 +-
  fs/dax.c   |  4 +-
  include/linux/dax.h|  2 +-
  include/linux/memremap.h   |  7 +--
  include/linux/mm.h | 11 
  lib/test_hmm.c |  2 +-
  mm/internal.h  |  8 +++
  mm/memcontrol.c|  6 +--
  mm/memremap.c  | 69 +++---
  mm/migrate.c   |  5 --
  mm/page_alloc.c|  3 ++
  mm/swap.c  | 45 ++---
  13 files changed, 46 insertions(+), 120 deletions(-)

Has anyone tested this with FSDAX? Does get_user_pages() on fsdax
backed memory still work?


I ran xfstests-dev using the kernel boot option to "fake" a pmem device
when I first posted this patch. The tests ran OK (or at least the same
tests passed with and without my patch). However, I could never really
convince myself the changes were "OK" for fsdax since I didn't understand
the code that well. I would still like to see a xfsdax maintainer or
expert ACK this change.

https://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git


What refcount value does the struct pages have when they are installed
in the PTEs? Remember a 0 refcount will make all the get_user_pages()
fail.

I'm looking at the call path starting in ext4_punch_hole() and I would
expect to see something manipulating the page ref count before
the ext4_break_layouts() call path gets to the dax_page_unused() test.

All I see is we go into unmap_mapping_pages() - that would normally
put back the page references held by PTEs but insert_pfn() has this:

if (pfn_t_devmap(pfn))
entry = pte_mkdevmap(pfn_t_pte(pfn, prot));

And:

static inline pte_t pte_mkdevmap(pte_t pte)
{
return pte_set_flags(pte, _PAGE_SPECIAL|_PAGE_DEVMAP);
}

Which interacts with vm_normal_page():

if (pte_devmap(pte))
return NULL;

To disable that refcounting?

So... I have a feeling this will have PTEs pointing to 0 refcount
pages? Unless FSDAX is !pte_devmap which is not the case, right?

This seems further confirmed by this comment:

/*
 * If we race get_user_pages_fast() here either we'll see the
 * elevated page count in the iteration and wait, or
 * get_user_pages_fast() will see that the page it took a reference
 * against is no longer mapped in the page tables and bail to the
 * get_user_pages() slow path.  The slow path is protected by
 * pte_lock() and pmd_lock(). New references are not taken without
 * holding those locks, and unmap_mapping_pages() will not zero the
 * pte or pmd without holding the respective lock, so we are
 * guaranteed to either see new references or prevent new
 * references from being established.
 */

Which seems to explain this scheme relies on unmap_mapping_pages() to
fence GUP_fast, not on GUP_fast observing 0 refcounts when it should
stop.

This seems like it would be properly fixed by using normal page
refcounting for PTEs - ie stop using special for these pages?

Does anyone know why devmap is pte_special anyhow?


+void free_zone_device_page(struct page *page)
+{
+   switch (page->pgmap->type) {
+   case MEMORY_DEVICE_PRIVATE:
+   free_device_page(page);
+   return;
+   case MEMORY_DEVICE_FS_DAX:
+   /* notify page idle */
+   wake_up_var(>_refcount);
+   return;

It is not for this series, but I wonder if we should just always call
ops->page_free and 

  1   2   3   >