Lee-W commented on code in PR #59183: URL: https://github.com/apache/airflow/pull/59183#discussion_r2609643241
##########
airflow-core/src/airflow/assets/manager.py:
##########

@@ -54,6 +55,59 @@ log = structlog.get_logger(__name__)

+@contextmanager
+def _aquire_apdr_lock(
+    *,
+    session: Session,
+    dag_id: str,
+    partition_key: str,
+    asset_id: int,
+    max_retries: int = 10,
+    retry_delay: float = 0.05,
+):
+    """
+    Context manager to acquire a lock for AssetPartitionDagRun creation.
+
+    - SQLite: uses AssetPartitionDagRunMutexLock table as row-level lock is not supported
+    - Postgres/MySQL: uses row-level lock on AssetModel.
+    """
+    if get_dialect_name(session) == "sqlite":
+        from airflow.models.asset import AssetPartitionDagRunMutexLock
+
+        for _ in range(max_retries):
+            try:
+                mutex = AssetPartitionDagRunMutexLock(target_dag_id=dag_id, partition_key=partition_key)
+                session.add(mutex)
+                session.flush()
+                try:
+                    yield  # mutex acquired
+                finally:
+                    session.delete(mutex)

Review Comment:
   @potiuk I just had a quick discussion with @dstandish. This happens when the task has finished in the execution API layer but has not yet been scheduled. Even in WAL mode, when two writes happen concurrently, the second process may still read no "APDR" and decide to create a second one; it simply needs to wait for the first write to complete (unless I misunderstood WAL mode). Additionally, the test case does not pass without this lock. Unless we use SQLite differently in production than in the unit tests, this indicates the second write is possible and the lock is necessary. Let me know if you think we should still remove it. Thanks!
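To make the locking strategy under discussion concrete: the SQLite branch relies on inserting a row into a mutex table, where a uniqueness constraint ensures only one process can hold the lock for a given (dag_id, partition_key) pair at a time. The sketch below is a simplified, hypothetical illustration of that pattern using raw `sqlite3` and an invented `mutex_lock` table; it is not the actual Airflow `AssetPartitionDagRunMutexLock` model or session handling.

```python
import sqlite3

# Hypothetical stand-in for the mutex table: the UNIQUE constraint is
# what turns a row insert into a lock-acquisition attempt.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mutex_lock ("
    "  target_dag_id TEXT NOT NULL,"
    "  partition_key TEXT NOT NULL,"
    "  UNIQUE (target_dag_id, partition_key)"
    ")"
)

def try_acquire(conn, dag_id, partition_key):
    """Return True if the mutex row was inserted (lock acquired)."""
    try:
        conn.execute(
            "INSERT INTO mutex_lock (target_dag_id, partition_key) VALUES (?, ?)",
            (dag_id, partition_key),
        )
        return True
    except sqlite3.IntegrityError:
        # Another holder already inserted the row: caller should
        # sleep for retry_delay and retry, up to max_retries times.
        return False

def release(conn, dag_id, partition_key):
    # Deleting the row releases the lock for the next waiter.
    conn.execute(
        "DELETE FROM mutex_lock WHERE target_dag_id = ? AND partition_key = ?",
        (dag_id, partition_key),
    )

# First acquirer wins; a concurrent second attempt must retry.
assert try_acquire(conn, "dag_a", "p1") is True
assert try_acquire(conn, "dag_a", "p1") is False  # lock held elsewhere
release(conn, "dag_a", "p1")
assert try_acquire(conn, "dag_a", "p1") is True   # free again after release
```

This mirrors the retry loop in the PR snippet: the second writer does not fail outright, it backs off and retries until the first writer's mutex row is deleted, which is why plain WAL-mode concurrency alone is not sufficient here.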
