vincbeck commented on code in PR #45859:
URL: https://github.com/apache/airflow/pull/45859#discussion_r1931116297


##########
providers/src/airflow/providers/standard/triggers/file.py:
##########
@@ -69,9 +69,11 @@ async def run(self) -> typing.AsyncIterator[TriggerEvent]:
                     mod_time = datetime.datetime.fromtimestamp(mod_time_f).strftime("%Y%m%d%H%M%S")
                     self.log.info("Found File %s last modified: %s", path, mod_time)
                     yield TriggerEvent(True)
+                    await asyncio.sleep(self.poke_interval)
                     return
                 for _, _, files in os.walk(self.filepath):
                     if files:
                         yield TriggerEvent(True)
+                        await asyncio.sleep(self.poke_interval)
                         return
             await asyncio.sleep(self.poke_interval)

Review Comment:
   Exactly. That's something I want to avoid. To me, firing right when the file is detected is important.
   
   However, there is a discussion on [Slack](https://apache-airflow.slack.com/archives/C06K9Q5G2UA/p1737999509008529) on that topic. I expect this specific issue exists in several different triggers; `S3KeyTrigger` is an example. As soon as the file is detected, the trigger exits right away, the next loop kicks in, and it detects that same file again. Instead of modifying each trigger to handle that, why don't we update the logic in the triggerer to add a sleep window between two fires of the same trigger (and only for triggers used for event-driven scheduling)? WDYT?
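   A rough sketch of that idea, with made-up names (this is not an existing Airflow API, just an illustration): a generic async wrapper that enforces a minimum gap between consecutive events from any trigger's stream, so individual triggers like `FileTrigger` or `S3KeyTrigger` would not each need their own `asyncio.sleep` after yielding:

```python
import asyncio
import time
from typing import Any, AsyncIterator


async def with_min_fire_interval(
    events: AsyncIterator[Any], min_interval: float
) -> AsyncIterator[Any]:
    """Re-yield events, enforcing at least `min_interval` seconds between fires.

    Hypothetical helper: the throttling lives in one place (the triggerer)
    instead of inside every trigger implementation.
    """
    last_fire: float | None = None
    async for event in events:
        now = time.monotonic()
        if last_fire is not None and (elapsed := now - last_fire) < min_interval:
            # Too soon since the previous fire: wait out the remainder.
            await asyncio.sleep(min_interval - elapsed)
        last_fire = time.monotonic()
        yield event


async def _demo() -> list[float]:
    async def eager_trigger() -> AsyncIterator[str]:
        # Mimics a trigger that re-detects the same file immediately
        # on every loop iteration.
        while True:
            yield "file detected"

    fire_times: list[float] = []
    async for _ in with_min_fire_interval(eager_trigger(), min_interval=0.1):
        fire_times.append(time.monotonic())
        if len(fire_times) == 3:
            break
    return fire_times


fire_times = asyncio.run(_demo())
```

   Even though `eager_trigger` yields back-to-back, consecutive fires end up at least ~0.1 s apart, which is the behavior the individual `await asyncio.sleep(self.poke_interval)` lines in the diff above are trying to achieve trigger by trigger.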



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
