mistercrunch commented on code in PR #33108:
URL: https://github.com/apache/superset/pull/33108#discussion_r2042933229


##########
superset/dashboards/api.py:
##########
@@ -1307,6 +1308,11 @@ def thumbnail(self, pk: int, digest: str, **kwargs: Any) -> WerkzeugResponse:
                 dashboard_id=dashboard.id,
                 force=False,
             )
+
+        if (
+            cache_payload.should_trigger_task()

Review Comment:
   > COMPUTING should not trigger a new task because (as I understand it) that means it's already running.
   
   The way I read your comment, it seems we need to handle the COMPUTING case as part of `should_trigger_task`. Though maybe I'm missing something around `force` (?)
   
   ---
   
   Let's dig a bit deeper.
   
   Note that both charts and dashboards have essentially two contexts in which they call this. One is "get me the image if it's there; if you don't have it, go compute it, and I'll expect a 202 if it doesn't exist yet". The other is "please compute the thumbnail (I'm not expecting an image here...), with a `force` flag". I think we can assume the second endpoint is called when the caller knows we need a new thumbnail (say the chart definition has changed, or we want a fresher thumb with new data, for instance for a fresh report).
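   The two calling contexts might be sketched roughly like this (all names here, `get_thumbnail`, `compute_thumbnail`, `trigger_thumbnail_task`, are illustrative stand-ins, not Superset's actual API):
   
   ```python
   triggered = []  # records each dispatched task's force flag, for illustration
   
   def trigger_thumbnail_task(force: bool = False) -> None:
       # Stand-in for dispatching the async (Celery) thumbnail job.
       triggered.append(force)
   
   def get_thumbnail(image):
       """Context 1: return the cached image if present; otherwise start
       computing and answer 202 Accepted."""
       if image is not None:
           return 200, image
       trigger_thumbnail_task()
       return 202, None
   
   def compute_thumbnail():
       """Context 2: the caller already knows a fresh thumbnail is needed
       (e.g. the chart definition changed), so it forces recomputation."""
       trigger_thumbnail_task(force=True)
       return 202, None
   ```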
   
   What I'd do or recommend:
   - remove `force` as a function argument, as it's relevant to the caller's context, not to the function itself
   - have the callers call it as `if force or cache_payload.should_trigger_task():`; it's the caller's job to decide whether to overrule `should_trigger_task`
   - handle `COMPUTING` in `should_trigger_task`, assuming that if it's computing, we should NOT trigger a new task. Here `force` can still overrule whatever `should_trigger_task` returns, which is what we want.
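   Taken together, a minimal sketch of that shape (where `CacheState` and `CachePayload` are illustrative, not Superset's actual classes):
   
   ```python
   from enum import Enum
   
   class CacheState(Enum):
       MISS = "miss"            # nothing cached, no task running
       COMPUTING = "computing"  # a task is already building the thumbnail
       HIT = "hit"              # a cached image exists
   
   class CachePayload:
       def __init__(self, state: CacheState) -> None:
           self.state = state
   
       def should_trigger_task(self) -> bool:
           # No `force` argument here: trigger only on a plain miss.
           # COMPUTING means a task is already running, so don't pile on.
           return self.state == CacheState.MISS
   
   def caller_wants_task(cache_payload: CachePayload, force: bool) -> bool:
       # The caller decides whether to overrule should_trigger_task().
       return force or cache_payload.should_trigger_task()
   ```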
   
   Maybe there are some small risks of race conditions (which AFAIK also exist today) around rare cases like:
   - T1: force-compute V1
   - T2: (while V1 is computing) start force-computing V2
   - T3: somehow V2 finishes before V1
   - T4: V1 finishes computing after V2, overwriting the fresher V2
   
   So maybe ideally starting V2 should in some cases kill V1, but I wouldn't worry about that right now; from experience, it's hard to recall a Celery job once it's been dispatched...



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

