dstandish commented on code in PR #24908:
URL: https://github.com/apache/airflow/pull/24908#discussion_r917270799


##########
airflow/models/dataset.py:
##########
@@ -199,3 +199,86 @@ def __repr__(self):
         for attr in [x.name for x in self.__mapper__.primary_key]:
             args.append(f"{attr}={getattr(self, attr)!r}")
         return f"{self.__class__.__name__}({', '.join(args)})"
+
+
+class DatasetEvent(Base):
+    """
+    A table to store datasets events.
+
+    :param dataset_id: reference to Dataset record
+    :param extra: JSON field for arbitrary extra info
+    :param source_task_id: the task_id of the TI which updated the dataset
+    :param source_dag_id: the dag_id of the TI which updated the dataset
+    :param source_run_id: the run_id of the TI which updated the dataset
+    :param source_map_index: the map_index of the TI which updated the dataset
+
+    We use relationships instead of foreign keys so that dataset events are 
not deleted even
+    if the foreign key object is.
+    """
+
+    id = Column(Integer, primary_key=True, autoincrement=True)
+    dataset_id = Column(Integer, nullable=False)
+    extra = Column(ExtendedJSON, nullable=True)
+    source_task_id = Column(StringID(), nullable=True)
+    source_dag_id = Column(StringID(), nullable=True)
+    source_run_id = Column(StringID(), nullable=True)
+    source_map_index = Column(Integer, nullable=True, 
server_default=text("-1"))

Review Comment:
   Yeah, it's a good question.  So with dataset events, there tend to be 
multiple things implicated.  There's the task that updated the dataset.  But 
then there's the also dag or task that depends on the dataset, and will be 
triggered by the the update.  And e.g. with the queue table, we definitely 
needed to make clear we're talking about the target dag id.  Here though, I 
think you're probably right, we probably don't need it because it's only about 
writes.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to