The GitHub Actions job "Tests" on airflow.git has succeeded.
Run started by GitHub user potiuk (triggered by potiuk).

Head commit for run:
d5d6f4b1885a99f5a5e3063dbd54e1f32700c1df / Michael Peteuil 
<[email protected]>
Make Datasets hashable (#37465)

Currently DAGs accept a
[`Collection["Dataset"]`](https://github.com/apache/airflow/blob/0c02ead4d8a527cbf0a916b6344f255c520e637f/airflow/models/dag.py#L171)
as an option for the `schedule`, but that collection cannot be a `set`
because Datasets are not a hashable type. Interestingly,
[the `DatasetModel` is already
hashable](https://github.com/apache/airflow/blob/dec78ab3f140f35e507de825327652ec24d03522/airflow/models/dataset.py#L93-L100),
so this change duplicates that implementation. However, Airflow users
primarily interface with `Dataset`, not `DatasetModel`, so I think it
makes sense for `Dataset` to be hashable. I'm not sure how to reconcile
the duplication, or what `__eq__` and `__hash__` provide for
`DatasetModel`, though.
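
To illustrate the point about hashability, here is a minimal sketch using a
hypothetical `SimpleDataset` class. It is a simplified stand-in, not Airflow's
actual `Dataset` or `DatasetModel`, and it assumes that a dataset's identity is
determined by its `uri` alone. In Python, a class that defines `__eq__` without
also defining `__hash__` is unhashable, so both methods are needed before
instances can be placed in a `set`.

```python
from __future__ import annotations


class SimpleDataset:
    """Hypothetical stand-in for a user-facing dataset class (not Airflow's own)."""

    def __init__(self, uri: str, extra: dict | None = None) -> None:
        self.uri = uri
        self.extra = extra

    def __eq__(self, other: object) -> bool:
        # Assumption: two datasets are considered equal when their URIs match.
        if isinstance(other, SimpleDataset):
            return self.uri == other.uri
        return NotImplemented

    def __hash__(self) -> int:
        # Must be consistent with __eq__: equal objects need equal hashes.
        return hash(self.uri)


# With __hash__ defined, instances can be collected in a set.
schedule = {
    SimpleDataset("s3://bucket/a"),
    SimpleDataset("s3://bucket/b"),
    SimpleDataset("s3://bucket/a"),  # same URI, so it collapses into one entry
}
unique_uris = sorted(d.uri for d in schedule)
```

Removing `__hash__` from the sketch makes the set literal raise `TypeError`,
which is the kind of failure the commit message describes for a `set` passed as
`schedule`.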

There was discussion on the original PR that created the `Dataset`
(https://github.com/apache/airflow/pull/24613) about whether to create
two classes or one. In that discussion @kaxil mentioned:

> I would slightly favour a separate `DatasetModel` and `Dataset` so
`Dataset` becomes an extensible class, and `DatasetModel` just stores
the info about the class. So users don't need to care about SQLAlchemy
stuff when extending it.

That first PR created `Dataset` as both a SQLAlchemy model and a
user-space class, though. It wasn't until later
(https://github.com/apache/airflow/pull/25727) that `DatasetModel` was
split out from `Dataset` and one became two. That provides a bit of
background on why both exist, for anyone reading this who is curious.

Report URL: https://github.com/apache/airflow/actions/runs/7995800768

With regards,
GitHub Actions via GitBox


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
