cong-zhu commented on pull request #5499: URL: https://github.com/apache/airflow/pull/5499#issuecomment-657828649
> Would probly be helpful to include an architecture diagram for it. Since a couple other folks are intereseted, maybe you can share a high level number on the impact this change has done inside Airbnb? > > I'm +1 for the current implementation that controls the behavior through configuration. A couple question: > > 1. How would smart sensor cover more operator types other than the two included in this PR? > 2. The `poke_context` seems to be serialized as JSON. Does it align with how we serialize DAGs in the DAG serialization project?( I think that's also using JSON but want to check if there's anything that we can reuse or combine). CC: @cong-zhu to comment on DAG serialization To comment on the serialization part. First, I think it makes sense to reuse the serialize and deserialize functions in the DAG serialization project. Second, I want to point out that, even though the fields in `poke_context` are duplicated in serialized DAGs, we still need to store `poke_context` in meta db because of the following reasons: 1. the DAG serialization is optional, which shouldn't be a dependency of smart sensor. 2. some fields the `poke_context` needs are in table `serialized_dag`, e.g. `conn_id`, others are in table `rendered_task_instance_fields`, e.g. `partition_name`. And their life cycles are managed in different ways. 3. getting data from `serialized_dag` is slow for large DAGs, unless we decouple it with `serialized_tasks` and only query one of the tasks. CC: @YingboWang ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected]
