On 3/21/24 15:08, Tomi Valkeinen wrote:
> On 21/03/2024 20:01, Sean Anderson wrote:
>> On 3/21/24 13:25, Tomi Valkeinen wrote:
>>> On 21/03/2024 17:52, Sean Anderson wrote:
>>>> On 3/20/24 02:53, Tomi Valkeinen wrote:
>>>>> On 20/03/2024 00:51, Sean Anderson wrote:
>>>>>> Retraining the link can take a while, and might involve waiting for
>>>>>> DPCD reads/writes to complete. This is inappropriate for an IRQ handler.
>>>>>> Just schedule this work for later completion. This is racy, but will be
>>>>>> fixed in the next commit.
>>>>>
>>>>> You should add the locks first, and use them here, rather than first
>>>>> adding a buggy commit and fixing it in the next one.
>>>>
>>>> I didn't think I could add the locks first since I only noticed the IRQ
>>>> was threaded right before sending out this series. So yeah, we could add
>>>> locking, add the workqueue, and then unthread the IRQ.
>>>>
>>>>>> Signed-off-by: Sean Anderson <sean.ander...@linux.dev>
>>>>>> ---
>>>>>> Actually, on second look this IRQ is threaded. So why do we have a
>>>>>> workqueue for HPD events? Maybe we should make it unthreaded?
>>>>>
>>>>> Indeed, there's not much work being done in the IRQ handler. I don't
>>>>> know why it's threaded.
>>>>>
>>>>> We could move the queued work to be inside the threaded irq handler,
>>>>> but with a quick look, the HPD work has lines like "msleep(100)" (and
>>>>> that's inside a for loop...), which is probably not a good thing to do
>>>>> even in a threaded irq handler.
>>>>>
>>>>> Although I'm not sure if that code is good to have anywhere. Why do we
>>>>> even have such code in the HPD work path... We already got the HPD
>>>>> interrupt. What does "It takes some delay (ex, 100 ~ 500 msec) to get
>>>>> the HPD signal with some monitors" even mean...
>>>>
>>>> The documentation for this bit is
>>>>
>>>> | HPD_STATE 0 ro 0x0 Contains the raw state of the HPD pin on
>>>> |                    the DisplayPort connector.
>>>>
>>>> So I think the idea is to perform some debouncing.
>>>
>>> Hmm, it just looks a bit odd to me. It can sleep for a second. And the
>>> wording "It takes some delay (ex, 100 ~ 500 msec) to get the HPD signal
>>> with some monitors" makes it sound like some kind of a hack...
>>>
>>> The docs mention debounce once:
>>>
>>> https://docs.amd.com/r/en-US/pg299-v-dp-txss1/Hot-Plug-Detection
>>
>> Are you sure this is the right document? This seems to be documentation
>> for [1]. Is that instantiated as a hard block on the ZynqMP?
>>
>> [1] https://www.xilinx.com/products/intellectual-property/ef-di-displayport.html
>
> You're right, wrong document. The registers and bitfield names I looked at
> just matched, so I didn't think it through...
>
> The right doc says even less:
>
> https://docs.amd.com/r/en-US/ug1085-zynq-ultrascale-trm/Upon-HPD-Assertion
>
>>> But it's not immediately obvious what the SW must do and what's done by
>>> the HW. Debounce is not mentioned later, e.g. in the HPD Event Handling.
>>> But if debounce is needed, wouldn't it be perhaps in a few milliseconds,
>>> instead of hundreds of milliseconds...
>>
>> Well, the DP spec says
>>
>> | If the HPD is the result of a new device being connected, either
>> | directly to the Source device (signaled by a long HPD), –or– downstream
>> | of a Branch device (indicated by incrementing the DFP_COUNT field value
>> | in the DOWN_STREAM_PORT_COUNT register (DPCD 00007h[3:0]) and signaled
>> | by an IRQ_HPD pulse), the Source device shall read the new DisplayID or
>> | legacy EDID that has been made available to it to ensure that content
>> | being transmitted over the link is able to be properly received and
>> | rendered.
>> |
>> | Informative Note: If the HPD signal toggling (or bouncing) is the
>> |                   result of the Hot Unplug followed by Hot Plug of a
>> |                   cable-connector assembly, the HPD signal is likely
>> |                   to remain unstable during the de-bouncing period,
>> |                   which is in the order of tens of milliseconds.
>> |                   The Source device may either check the HPD signal’s
>> |                   stability before initiating an AUX read transaction,
>> |                   –or– immediately initiate the AUX read transaction
>> |                   after each HPD rising edge.
>>
>> So a 100 ms delay seems plausible for some monitors.
>
> I read the text above as "it may take tens of milliseconds for HPD to
> stabilize". So polling it for a total of 100 ms sounds fine, but we're
> polling it for 1000 ms.
>
> And I think checking for stability is fine, but for detect() I think it
> goes overboard: if the cable is disconnected, every detect call spends a
> second checking for HPD, even if we haven't seen any sign of an HPD =).
>
> And if we're checking the HPD stability, wouldn't we, say, poll the HPD
> for some time, and see if it stays the same? At the moment the code
> proceeds right away if HPD is high, but keeps polling if HPD is low.
>
>> That said, maybe we can just skip this and always read the DPCD.
>
> If the HPD is bouncing, is the AUX line also unstable?
>
> I don't mind an HPD stability check; I think it makes sense as (I think)
> the HW doesn't handle de-bouncing here. I think it could be much, much
> shorter than what it is now, and that it would make sense to observe the
> HPD for a period, instead of just waiting for the HPD to go high.
>
> But this could also be left for later; I don't think it matters in the
> context of this series.
>
>>> zynqmp_dp_bridge_detect() is used for drm_bridge_funcs.detect(), and if
>>> the cable is not connected, it'll sleep for 1 second (probably more)
>>> until returning not connected. It just doesn't sound correct to me.
>>>
>>> Well, it's not part of this patch as such, but related to the amount of
>>> time we spend in the interrupt handler (and also the detect()).
>>>
>>>>> Would it be possible to clean up the work funcs a bit (I haven't
>>>>> looked at the new work func yet), to remove the worst extra sleeps,
>>>>> and just do all that inside the threaded irq handler?
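FWIW, the "observe it for a period" approach could be as simple as
requiring N consecutive identical samples before trusting the line. A
quick userspace sketch of that idea (this is not the driver's code; the
read callback, sample counts, and the idea of sleeping between reads are
all stand-ins for illustration):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Debounce by observing stability: require stable_needed consecutive
 * identical reads of the pin before reporting its state, giving up after
 * max_samples reads. In the driver there would be a short msleep()
 * between reads; read_pin() here is a made-up callback, not the real
 * register access.
 */
static bool hpd_read_stable(bool (*read_pin)(void *ctx), void *ctx,
			    unsigned int stable_needed,
			    unsigned int max_samples, bool *state)
{
	bool last = read_pin(ctx);
	unsigned int stable = 1;

	for (unsigned int i = 1; i < max_samples; i++) {
		bool cur = read_pin(ctx);

		if (cur == last) {
			if (++stable >= stable_needed) {
				*state = cur;
				return true;
			}
		} else {
			/* line changed: start counting again */
			stable = 1;
			last = cur;
		}
	}
	return false;	/* line never settled */
}
```

Unlike the current code, this treats high and low symmetrically: a
bouncing line keeps resetting the counter, and a stable-low line returns
quickly instead of polling for the full second.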
>>>>
>>>> Probably not, since a HPD IRQ results in link retraining, which can
>>>> take a while.
>>>
>>> But is it any different if you have a workqueue? Isn't a threaded
>>> interrupt handler basically the same thing?
>>>
>>> Probably at least the zynqmp_dp_hpd_work_func() could be done in the
>>> threaded irq just fine, if the insane 1s sleep can be dropped.
>>
>> Anything involving AUX shouldn't be in an IRQ handler, since
>> zynqmp_dp_aux_transfer will retry for up to 50 ms by default.
>
> Perhaps. I'm still not sure if that's a problem. If a threaded irq is
> essentially a workqueue dedicated to this device, and we don't need to
> handle other irqs while the work is being done, then... What's the
> difference between a threaded irq and a workqueue?
>
> Oh, but we do need to handle irqs: we have the vblank irq in there. We
> don't want the vblanks to stop if there's an HPD IRQ.
>
> Btw, it looks like zynqmp_dpsub_drm_handle_vblank() can sleep, so we
> can't move to a non-threaded irq.
I don't see that. We have

  zynqmp_dpsub_drm_handle_vblank
    drm_crtc_handle_vblank
      drm_handle_vblank
        spin_lock_irqsave(...)
        ...
        spin_lock_irqsave(...)
        vblank_disable_fn(...)
          spin_lock_irqsave(...)
          ...
          spin_lock_irqrestore(...)

so no sleeping AFAICT.

--Sean

>>>>> Do we need to handle interrupts while either delayed work is being
>>>>> done?
>>>>
>>>> Probably not.
>>>>
>>>>> If we do need a delayed work, would just one work be enough which
>>>>> handles both HPD_EVENT and HPD_IRQ, instead of two?
>>>>
>>>> Maybe, but then we need to determine which pending events we need to
>>>> handle. I think since we have only two events it will be easier to just
>>>> have separate workqueues.
>>>
>>> The less concurrency, the better... Which is why it would be nice to do
>>> it all in the threaded irq.
>>
>> Yeah, but we can use a mutex for this, which means there is not too much
>> interesting going on.
>
> Ok. Yep, if we get (hopefully) a single mutex with clearly defined fields
> that it protects, I'm ok with workqueues.
>
> I'd still prefer just one workqueue, though...

Yeah, but then we need a spinlock or something to tell the workqueue what
it should do.

--Sean
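P.S. To sketch what I mean by that: the "spinlock or something" would just
guard a pending-events bitmask that the irq handler sets and the single
work func snapshots and clears. Userspace sketch, with a C11 atomic
standing in for the spinlocked update; the names and bits are invented,
not what's in the driver:

```c
#include <assert.h>
#include <stdatomic.h>

/* Invented event bits for the two conditions one work func must handle. */
#define EV_HPD_EVENT	(1u << 0)	/* connect/disconnect */
#define EV_HPD_IRQ	(1u << 1)	/* sink requests attention (retrain) */

/*
 * One pending-events word shared between the irq handler and the single
 * work func. In the driver this would be a plain unsigned int guarded by
 * a spinlock (set bits + schedule_work() on one side, snapshot-and-clear
 * on the other); a C11 atomic stands in for that here.
 */
static atomic_uint hpd_pending;

/* irq handler side: record the event; schedule_work() would follow */
static void hpd_post_event(unsigned int ev)
{
	atomic_fetch_or(&hpd_pending, ev);
}

/* work func side: atomically take everything that is pending */
static unsigned int hpd_take_pending(void)
{
	return atomic_exchange(&hpd_pending, 0);
}
```

The work func then acts on each bit it took, and events posted while it
runs are simply picked up by the next scheduled run.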
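P.P.S. And if we ever do unthread the irq, the shape I'd imagine is:
service vblank in the hard handler and wake the thread only for HPD, so
vblanks keep flowing while the thread is off doing AUX transfers. A mock
of just that dispatch decision (the status bits and names below are made
up for illustration, not the real register layout):

```c
#include <assert.h>

/* Made-up status bits standing in for the real interrupt status fields. */
#define INT_VBLANK	(1u << 0)
#define INT_HPD_EVENT	(1u << 1)
#define INT_HPD_IRQ	(1u << 2)

enum irqreturn { IRQ_NONE, IRQ_HANDLED, IRQ_WAKE_THREAD };

struct dp_mock {
	unsigned int vblank_count;	/* work done in hard irq context */
	unsigned int pending;		/* events deferred to the thread */
};

/*
 * Hard handler: do the cheap, non-sleeping work (vblank) immediately,
 * and only wake the thread for HPD, where the sleeping AUX/retraining
 * work would happen.
 */
static enum irqreturn dp_irq(struct dp_mock *dp, unsigned int status)
{
	enum irqreturn ret = IRQ_NONE;

	if (status & INT_VBLANK) {
		dp->vblank_count++;	/* drm_crtc_handle_vblank() here */
		ret = IRQ_HANDLED;
	}

	if (status & (INT_HPD_EVENT | INT_HPD_IRQ)) {
		dp->pending |= status & (INT_HPD_EVENT | INT_HPD_IRQ);
		ret = IRQ_WAKE_THREAD;
	}

	return ret;
}
```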