ashb commented on issue #17984: URL: https://github.com/apache/airflow/issues/17984#issuecomment-926536817
So my understanding of lineage for a data workflow is two things alone:

- What inputs did I consume
- What outputs did I produce

In my opinion you are building more than just lineage tracking -- but something larger, as , so I think these events do not belong in the lineage backend interface.

So let's talk about an alternative interface to get you the stable API you want. Some questions:

1. Where should this run? On the scheduler, or the runner?
2. If there was a problem launching the runner (pod failure, celery farted) and the task is marked as failed early, does that change the answer to question 1?

I'm leaning towards the ability to add/configure global task hook points for this sort of thing, rather than forcing something into the lineage API that only OpenLineage wants.
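
To make the "global task hook points" idea concrete, here is a rough sketch of what such a registry might look like. Everything in it is hypothetical: `TaskHookRegistry`, the event names, and the `where` context key are illustrative only and are not Airflow or OpenLineage APIs. It assumes hooks are plain callables and that a failing hook should not fail the task.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Hypothetical lifecycle event names; not part of any Airflow API.
TASK_EVENTS = ("start", "success", "failure")

# A hook receives the task id and a free-form context dict.
Hook = Callable[[str, dict], None]


class TaskHookRegistry:
    """Process-wide registry of callbacks keyed by task lifecycle event."""

    def __init__(self) -> None:
        self._hooks: Dict[str, List[Hook]] = defaultdict(list)

    def register(self, event: str, hook: Hook) -> None:
        """Attach a hook to one of the known events."""
        if event not in TASK_EVENTS:
            raise ValueError(f"unknown task event: {event!r}")
        self._hooks[event].append(hook)

    def fire(self, event: str, task_id: str, context: dict) -> List[Exception]:
        """Call every hook for `event`, collecting errors instead of raising
        so that a broken hook cannot fail the task itself."""
        errors: List[Exception] = []
        for hook in self._hooks.get(event, []):
            try:
                hook(task_id, context)
            except Exception as exc:
                errors.append(exc)
        return errors


registry = TaskHookRegistry()
seen = []
registry.register("failure", lambda task_id, ctx: seen.append((task_id, ctx["where"])))

# The case from question 2: the runner never started, so whichever
# component marks the task failed (here labelled "scheduler") fires the event.
registry.fire("failure", "extract", {"where": "scheduler"})
```

The sketch leaves question 1 open on purpose: the same `fire` call could be made from the scheduler or from the runner, and the early-failure case in question 2 is exactly where that choice matters.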
