cong-zhu commented on pull request #5499:
URL: https://github.com/apache/airflow/pull/5499#issuecomment-657828649


   > Would probly be helpful to include an architecture diagram for it. Since a 
couple other folks are intereseted, maybe you can share a high level number on 
the impact this change has done inside Airbnb?
   > 
   > I'm +1 for the current implementation that controls the behavior through 
configuration. A couple question:
   > 
   > 1. How would smart sensor cover more operator types other than the two 
included in this PR?
   > 2. The `poke_context` seems to be serialized as JSON. Does it align with 
how we serialize DAGs in the DAG serialization project?( I think that's also 
using JSON but want to check if there's anything that we can reuse or combine). 
CC: @cong-zhu to comment on DAG serialization
   
   To comment on the serialization part. First, I think it makes sense to reuse 
the serialize and deserialize functions in the DAG serialization project. 
Second, I want to point out that, even though the fields in `poke_context` are 
duplicated in serialized DAGs, we still need to store `poke_context` in meta db 
because of the following reasons: 1. the DAG serialization is optional, which 
shouldn't be a dependency of smart sensor. 2. some fields the `poke_context` 
needs are in table `serialized_dag`, e.g. `conn_id`, others are in table 
`rendered_task_instance_fields`, e.g. `partition_name`. And their life cycles 
are managed in different ways. 3. getting data from `serialized_dag` is slow 
for large DAGs, unless we decouple it with `serialized_tasks` and only query 
one of the tasks.
   
   CC: @YingboWang 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to