I like the folder structure proposed by Ash and have no objections with it.

"core_and_task_sdk" sounds good to me and justifies what it should do
pretty well.

@Jarek Potiuk <ja...@potiuk.com> a little confused on what you mean there,
I am understanding the direction
but could you elaborate a bit more please?

Naming is REALLY hard!

Thanks & Regards,
Amogh Desai


On Tue, Jul 8, 2025 at 2:52 AM Jarek Potiuk <ja...@potiuk.com> wrote:

> How about splitting it even more and having each shared "thing" named?
> "logging", "config" and sharing them explicitly and separately with the
> right "user" ?
> That sounds way more modular and  we will be able to choose which of the
> shared "utils" we use where.
>
> J.
>
>
> On Mon, Jul 7, 2025 at 11:13 PM Jens Scheffler <j_scheff...@gmx.de.invalid
> >
> wrote:
>
> > I like "core_and_task_sdk the same like core-and-task-sdk - I have no
> > problem and it is a path only.
> >
> > if we get to "dag-parser-scheduler-task-sdk-and-triggerer" which is a
> > bit bulky we then should name it "all-not-api-server" :-D
> >
> > On 07.07.25 22:57, Ash Berlin-Taylor wrote:
> > > In case I did a bad job explaining it, the “core and task sdk” is not
> in
> > the module name/import name, just in the file path.
> > >
> > > Anyone have other ideas?
> > >
> > >> On 7 Jul 2025, at 21:37, Buğra Öztürk <ozturkbugr...@gmail.com>
> wrote:
> > >>
> > >> Thanks Ash! Looks cool! I like the structure. This will enable all
> the
> > >> combinations and structure looks easy to grasp. No strong stance on
> the
> > >> naming other than maybe it is a bit long with `and`, `core_ctl` could
> be
> > >> shorter, since no import path is defined like that, we can give any
> name
> > >> for sure.
> > >>
> > >> Best regards,
> > >>
> > >>> On Mon, 7 Jul 2025, 21:51 Jarek Potiuk, <ja...@potiuk.com> wrote:
> > >>>
> > >>> Looks good but I think we should find some better logical name for
> > >>> core_and_sdk :)
> > >>>
> > >>> pon., 7 lip 2025, 21:44 użytkownik Jens Scheffler
> > >>> <j_scheff...@gmx.de.invalid> napisał:
> > >>>
> > >>>> Cool! Especially the "shared" folder with the ability to have
> > >>>> N-combinations w/o exploding project repo root!
> > >>>>
> > >>>> On 07.07.25 14:43, Ash Berlin-Taylor wrote:
> > >>>>> Oh, and all of this will be explain in shared/README.md
> > >>>>>
> > >>>>>> On 7 Jul 2025, at 13:41, Ash Berlin-Taylor <a...@apache.org>
> wrote:
> > >>>>>>
> > >>>>>> Okay, so it seems we have agreement on the approach here, so I’ll
> > >>>> continue with this, and on the dev call it was mentioned that
> > >>>> “airflow-common” wasn’t a great name, so here is my proposal for the
> > file
> > >>>> structure;
> > >>>>>> ```
> > >>>>>> /
> > >>>>>>   task-sdk/...
> > >>>>>>   airflow-core/...
> > >>>>>>   shared/
> > >>>>>>     kuberenetes/
> > >>>>>>       pyproject.toml
> > >>>>>>       src/
> > >>>>>>         airflow_kube/__init__.py
> > >>>>>>     core-and-tasksdk/
> > >>>>>>       pyproject.toml
> > >>>>>>       src/
> > >>>>>>         airflow_shared/__init__.py
> > >>>>>> ```
> > >>>>>>
> > >>>>>> Things to note here: the “shared” folder has (the possibility) of
> > >>>> having multiple different shared “libraries” in it, in this example
> I
> > am
> > >>>> supposing a hypothetical shared kuberenetes folder a world in which
> we
> > >>>> split the KubePodOperator and the KubeExecutor in to two separate
> > >>>> distributions (example only, not proposing we do that right now, and
> > that
> > >>>> will be a separate discussion)
> > >>>>>> The other things to note here:
> > >>>>>>
> > >>>>>>
> > >>>>>> - the folder name in shared aims to be “self-documenting”, hence
> the
> > >>>> verbose “core-and-tasksdk” to say where the shared library is
> > intended to
> > >>>> be used.
> > >>>>>> - the python module itself should almost always have an `airflow_`
> > (or
> > >>>> maybe `_airflow_`?) prefix so that it does not conflict with
> anything
> > >>> else
> > >>>> we might use. It won’t matter “in production” as those will be
> > vendored
> > >>> in
> > >>>> to be imported as `airflow/_vendor/airflow_shared` etc, but avoiding
> > >>>> conflicts at dev time with the Finder approach is a good safety
> > measure.
> > >>>>>> I will start making a real PR for this proposal now, but I’m open
> to
> > >>>> feedback (either here, or in the PR when I open it)
> > >>>>>> -ash
> > >>>>>>
> > >>>>>>> On 4 Jul 2025, at 16:55, Jarek Potiuk <ja...@potiuk.com> wrote:
> > >>>>>>>
> > >>>>>>> Yeah we have to try it and test - also building packages happens
> > semi
> > >>>>>>> frequently when you run `uv sync` (they use some kind of
> heuristics
> > >>> to
> > >>>>>>> decide when) and you can force it with `--reinstall` or
> > `--refresh`.
> > >>>>>>> Package build also happens every time when you run "ci-image
> build`
> > >>>> now in
> > >>>>>>> breeze so it seems like it will nicely integrate in our
> workflows.
> > >>>>>>>
> > >>>>>>> Looks really cool Ash.
> > >>>>>>>
> > >>>>>>> On Fri, Jul 4, 2025 at 5:14 PM Ash Berlin-Taylor <a...@apache.org
> >
> > >>>> wrote:
> > >>>>>>>> It’s not just release time, but any time we build a package
> which
> > >>>> happens
> > >>>>>>>> on “every” CI run. The normal unit tests will use code from
> > >>>>>>>> airflow-common/src/airflow_common; the kube tests which build an
> > >>>> image will
> > >>>>>>>> build the dists and vendor in the code from that commit.
> > >>>>>>>>
> > >>>>>>>> There is only a single copy of the shared code committed to the
> > >>> repo,
> > >>>> so
> > >>>>>>>> there is never anything to synchronise.
> > >>>>>>>>
> > >>>>>>>>> On 4 Jul 2025, at 15:53, Amogh Desai <amoghdesai....@gmail.com
> >
> > >>>> wrote:
> > >>>>>>>>> Thanks Ash.
> > >>>>>>>>>
> > >>>>>>>>> This is really cool and helpful that you were able to test both
> > >>>> scenarios
> > >>>>>>>>> -- repo checkout
> > >>>>>>>>> and also installing from the vendored package and the
> resolution
> > >>>> worked
> > >>>>>>>>> fine too.
> > >>>>>>>>>
> > >>>>>>>>> I like this idea compared the to relative import one for few
> > >>> reasons:
> > >>>>>>>>> - It feels like it will take some time to adjust to the new
> > coding
> > >>>>>>>> standard
> > >>>>>>>>> that we will lay
> > >>>>>>>>> if we impose relative imports in the shared dist
> > >>>>>>>>> - We can continue using repo wise absolute import standards, it
> > is
> > >>>> also
> > >>>>>>>>> much easier for situations
> > >>>>>>>>> when we do global search in IDE to find + replace, this could
> > mean
> > >>> a
> > >>>>>>>> change
> > >>>>>>>>> there
> > >>>>>>>>> - The vendoring work is a proven and established paradigm
> across
> > >>>> projects
> > >>>>>>>>> and would
> > >>>>>>>>> out of box give us the build tooling we need also
> > >>>>>>>>>
> > >>>>>>>>> Nothing too against the relative import but with the evidence
> > >>>> provided
> > >>>>>>>>> above, vendored approach
> > >>>>>>>>> seems to only do us good.
> > >>>>>>>>>
> > >>>>>>>>> Regarding synchronizing it, release time should be fine as long
> > as
> > >>> we
> > >>>>>>>> have
> > >>>>>>>>> a good CI workflow to probably
> > >>>>>>>>> catch such issues per PR if changes are made in shared dist?
> > >>>> (precommit
> > >>>>>>>>> would make it really slow i guess)
> > >>>>>>>>>
> > >>>>>>>>> If we can run our tests with vendored code we should be mostly
> > >>>> covered.
> > >>>>>>>>> Good effort all!
> > >>>>>>>>>
> > >>>>>>>>> Thanks & Regards,
> > >>>>>>>>> Amogh Desai
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>> On Fri, Jul 4, 2025 at 7:23 PM Ash Berlin-Taylor <
> > a...@apache.org>
> > >>>>>>>> wrote:
> > >>>>>>>>>> Okay, I think I’ve got something that works and I’m happy
> with.
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>
> >
> https://github.com/astronomer/airflow/tree/shared-vendored-lib-tasksdk-and-core
> > >>>>>>>>>> This produces the following from `uv build task-sdk`
> > >>>>>>>>>> -
> > >>>>>>>>>>
> > >>>
> >
> https://github.com/user-attachments/files/21058976/apache_airflow_task_sdk-1.1.0.tar.gz
> > >>>>>>>>>> -
> > >>>>>>>>>>
> > >>>
> >
> https://github.com/user-attachments/files/21058996/apache_airflow_task_sdk-1.1.0-py3-none-any.whl.zip
> > >>>>>>>>>> (`.whl.zip` as GH won't allow .whl upload, but will .zip)
> > >>>>>>>>>>
> > >>>>>>>>>> ```
> > >>>>>>>>>> ❯ unzip -l
> > >>> dist/apache_airflow_task_sdk-1.1.0-py3-none-any.whl.zip |
> > >>>>>>>> grep
> > >>>>>>>>>> _vendor
> > >>>>>>>>>>      50  02-02-2020 00:00   airflow/sdk/_vendor/.gitignore
> > >>>>>>>>>>    2082  02-02-2020 00:00   airflow/sdk/_vendor/__init__.py
> > >>>>>>>>>>      28  02-02-2020 00:00
> >  airflow/sdk/_vendor/airflow_common.pyi
> > >>>>>>>>>>      18  02-02-2020 00:00   airflow/sdk/_vendor/vendor.txt
> > >>>>>>>>>>     785  02-02-2020 00:00
> > >>>>>>>>>> airflow/sdk/_vendor/airflow_common/__init__.py
> > >>>>>>>>>>   10628  02-02-2020 00:00
> > >>>>>>>>>> airflow/sdk/_vendor/airflow_common/timezone.py
> > >>>>>>>>>> ```
> > >>>>>>>>>>
> > >>>>>>>>>> And similarly in the .tar.gz, so our “sdist” is complete too:
> > >>>>>>>>>> ```
> > >>>>>>>>>> ❯ tar -tzf dist/apache_airflow_task_sdk-1.1.0.tar.gz |grep
> > _vendor
> > >>>>>>>>>>
> apache_airflow_task_sdk-1.1.0/src/airflow/sdk/_vendor/.gitignore
> > >>>>>>>>>>
> > apache_airflow_task_sdk-1.1.0/src/airflow/sdk/_vendor/__init__.py
> > >>>>>>>>>>
> > >>>>
> > apache_airflow_task_sdk-1.1.0/src/airflow/sdk/_vendor/airflow_common.pyi
> > >>>>>>>>>>
> apache_airflow_task_sdk-1.1.0/src/airflow/sdk/_vendor/vendor.txt
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>
> >
> apache_airflow_task_sdk-1.1.0/src/airflow/sdk/_vendor/airflow_common/__init__.py
> > >>>
> >
> apache_airflow_task_sdk-1.1.0/src/airflow/sdk/_vendor/airflow_common/timezone.py
> > >>>>>>>>>> ```
> > >>>>>>>>>>
> > >>>>>>>>>> The plugin works at build time by including/copying the libs
> > >>>> specified
> > >>>>>>>> in
> > >>>>>>>>>> vendor.txt into place (and let `vendoring` take care of import
> > >>>>>>>> rewrites.)
> > >>>>>>>>>> For the imports to continue to work at “dev” time/from a repo
> > >>>> checkout,
> > >>>>>>>> I
> > >>>>>>>>>> have added a import finder to `sys.meta_path`, and since it’s
> at
> > >>> the
> > >>>>>>>> end of
> > >>>>>>>>>> the list it will only be used if the normal import can’t find
> > >>>> things.
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>
> >
> https://github.com/astronomer/airflow/blob/996817782be6071b306a87af9f36fe1cf2d3aaa3/task-sdk/src/airflow/sdk/_vendor/__init__.py
> > >>>>>>>>>> This doesn’t quite give us the same runtime effect “import
> > >>>> rewriting”
> > >>>>>>>>>> affect, as in this approach `airflow_common` is directly
> loaded
> > >>>> (i.e.
> > >>>>>>>>>> airflow.sdk._vendor.airflow_common and airflow_common exist in
> > >>>>>>>>>> sys.modules), but it does work for everything that I was able
> to
> > >>>> test..
> > >>>>>>>>>> I tested it with the diff at the end of this message. My test
> > >>>> ipython
> > >>>>>>>>>> shell:
> > >>>>>>>>>>
> > >>>>>>>>>> ```
> > >>>>>>>>>> In [1]: from airflow.sdk._vendor.airflow_common.timezone
> import
> > >>> foo
> > >>>>>>>>>> In [2]: foo
> > >>>>>>>>>> Out[2]: 1
> > >>>>>>>>>>
> > >>>>>>>>>> In [3]: import airflow.sdk._vendor.airflow_common
> > >>>>>>>>>>
> > >>>>>>>>>> In [4]: import airflow.sdk._vendor.airflow_common.timezone
> > >>>>>>>>>>
> > >>>>>>>>>> In [5]: airflow.sdk._vendor.airflow_common.__file__
> > >>>>>>>>>> Out[5]:
> > >>>>>>>>>>
> > >>>
> >
> '/Users/ash/code/airflow/airflow/airflow-common/src/airflow_common/__init__.py'
> > >>>>>>>>>> In [6]: airflow.sdk._vendor.airflow_common.timezone.__file__
> > >>>>>>>>>> Out[6]:
> > >>>>>>>>>>
> > >>>
> >
> '/Users/ash/code/airflow/airflow/airflow-common/src/airflow_common/timezone.py'
> > >>>>>>>>>> ```
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>> And in an standalone environment with the SDK dist I built (it
> > >>>> needed
> > >>>>>>>> the
> > >>>>>>>>>> matching airflow-core right now, but that is nothing to do
> with
> > >>> this
> > >>>>>>>>>> discussion):
> > >>>>>>>>>>
> > >>>>>>>>>> ```
> > >>>>>>>>>> ❯ _AIRFLOW__AS_LIBRARY=1 uvx --python 3.12 --with
> > >>>>>>>>>> dist/apache_airflow_core-3.1.0-py3-none-any.whl --with
> > >>>>>>>>>> dist/apache_airflow_task_sdk-1.1.0-py3-none-any.whl ipython
> > >>>>>>>>>> Python 3.12.7 (main, Oct 16 2024, 07:12:08) [Clang 18.1.8 ]
> > >>>>>>>>>> Type 'copyright', 'credits' or 'license' for more information
> > >>>>>>>>>> IPython 9.4.0 -- An enhanced Interactive Python. Type '?' for
> > >>> help.
> > >>>>>>>>>> Tip: You can use `%hist` to view history, see the options with
> > >>>>>>>> `%history?`
> > >>>>>>>>>> In [1]: import airflow.sdk._vendor.airflow_common.timezone
> > >>>>>>>>>>
> > >>>>>>>>>> In [2]: airflow.sdk._vendor.airflow_common.timezone.__file__
> > >>>>>>>>>> Out[2]:
> > >>>>>>>>>>
> > >>>
> >
> '/Users/ash/.cache/uv/archive-v0/WWq6r65aPto2eJOyPObEH/lib/python3.12/site-packages/airflow/sdk/_vendor/airflow_common/timezone.py’
> > >>>>>>>>>> ``
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>> ```diff
> > >>>>>>>>>> diff --git a/airflow-common/src/airflow_common/__init__.py
> > >>>>>>>>>> b/airflow-common/src/airflow_common/__init__.py
> > >>>>>>>>>> index 13a83393a9..927b7c6b61 100644
> > >>>>>>>>>> --- a/airflow-common/src/airflow_common/__init__.py
> > >>>>>>>>>> +++ b/airflow-common/src/airflow_common/__init__.py
> > >>>>>>>>>> @@ -14,3 +14,5 @@
> > >>>>>>>>>> # KIND, either express or implied.  See the License for the
> > >>>>>>>>>> # specific language governing permissions and limitations
> > >>>>>>>>>> # under the License.
> > >>>>>>>>>> +
> > >>>>>>>>>> +foo = 1
> > >>>>>>>>>> diff --git a/airflow-common/src/airflow_common/timezone.py
> > >>>>>>>>>> b/airflow-common/src/airflow_common/timezone.py
> > >>>>>>>>>> index 340b924c66..58384ef20f 100644
> > >>>>>>>>>> --- a/airflow-common/src/airflow_common/timezone.py
> > >>>>>>>>>> +++ b/airflow-common/src/airflow_common/timezone.py
> > >>>>>>>>>> @@ -36,6 +36,9 @@ _PENDULUM3 =
> > >>>>>>>>>> version.parse(metadata.version("pendulum")).major == 3
> > >>>>>>>>>> # - FixedTimezone(0, "UTC") in pendulum 2
> > >>>>>>>>>> utc = pendulum.UTC
> > >>>>>>>>>>
> > >>>>>>>>>> +
> > >>>>>>>>>> +from airflow_common import foo
> > >>>>>>>>>> +
> > >>>>>>>>>> TIMEZONE: Timezone
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>> ```
> > >>>>>>>>>>
> > >>>>>>>>>>>> On 3 Jul 2025, at 12:43, Jarek Potiuk <ja...@potiuk.com>
> > wrote:
> > >>>>>>>>>>> I think both approaches are doable:
> > >>>>>>>>>>>
> > >>>>>>>>>>> 1) -> We can very easily prevent bad imports by pre-commit
> when
> > >>>>>>>> importing
> > >>>>>>>>>>> from different distributions and make sure we are only doing
> > >>>> relative
> > >>>>>>>>>>> imports in the shared modules. We are doing plenty of this
> > >>>> already. And
> > >>>>>>>>>> yes
> > >>>>>>>>>>> it would require relative links we currently do not allow.
> > >>>>>>>>>>>
> > >>>>>>>>>>> 2) -> has one disadvantage that someone at some point in time
> > >>> will
> > >>>> have
> > >>>>>>>>>> to
> > >>>>>>>>>>> decide to synchronize this and if it happens just before
> > release
> > >>>> (I bet
> > >>>>>>>>>>> this is going to happen) this will lead to solving problems
> > that
> > >>>> would
> > >>>>>>>>>>> normally be solved during PR when you make a change (i.e.
> > >>> symbolic
> > >>>> link
> > >>>>>>>>>> has
> > >>>>>>>>>>> the advantage that whoever modifies shared code will be
> > >>> immediately
> > >>>>>>>>>>> notified in their PR - that they broke something because
> either
> > >>>> static
> > >>>>>>>>>>> checks or mypy or tests fail.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Ash, do you have an idea of a process (who and when) does the
> > >>>>>>>>>>> synchronisation in case of vendoring? Maybe we could solve it
> > if
> > >>>> it is
> > >>>>>>>>>> done
> > >>>>>>>>>>> more frequently and with some regularity? We could
> potentially
> > >>>> force
> > >>>>>>>>>>> re-vendoring at PR time as well any time shared code changes
> > (and
> > >>>>>>>> prevent
> > >>>>>>>>>>> it by pre-commit. And I can't think of some place (other than
> > >>>> releases)
> > >>>>>>>>>> in
> > >>>>>>>>>>> our development workflow and that seems to be a bit too late
> as
> > >>>> puts an
> > >>>>>>>>>>> extra effort on fixing potential incompatibilities introduced
> > on
> > >>>>>>>> release
> > >>>>>>>>>>> manager and delays the release. WDYT?
> > >>>>>>>>>>>
> > >>>>>>>>>>> Re: relative links. I think for a shared library we could
> > >>>> potentially
> > >>>>>>>>>> relax
> > >>>>>>>>>>> this and allow them (and actually disallow absolute links in
> > the
> > >>>> pieces
> > >>>>>>>>>> of
> > >>>>>>>>>>> code that are shared - again, by pre-commit). As I recall,
> the
> > >>> only
> > >>>>>>>>>> reason
> > >>>>>>>>>>> we forbade the relative links is because of how we are (or
> > maybe
> > >>>> were)
> > >>>>>>>>>>> doing DAG parsing and failures resulting from it. So we
> decided
> > >>> to
> > >>>> just
> > >>>>>>>>>> not
> > >>>>>>>>>>> allow it to keep consistency. The way how Dag parsing works
> is
> > >>> that
> > >>>>>>>> when
> > >>>>>>>>>>> you are using importlib to read the Dag from a file, the
> > relative
> > >>>>>>>> imports
> > >>>>>>>>>>> do not work as it does not know what they should be relative
> > to.
> > >>>> But if
> > >>>>>>>>>>> relative import is done from an imported package, it should
> be
> > no
> > >>>>>>>>>> problem,
> > >>>>>>>>>>> I think - otherwise our Dags would not be able to import any
> > >>>> library
> > >>>>>>>> that
> > >>>>>>>>>>> uses relative imports.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Of course consistency might be the reason why we do not want
> to
> > >>>>>>>> introduce
> > >>>>>>>>>>> relative imports. I don't see it as an issue if it is guarded
> > by
> > >>>>>>>>>> pre-commit
> > >>>>>>>>>>> though.
> > >>>>>>>>>>>
> > >>>>>>>>>>> J.
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>> J.
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>> czw., 3 lip 2025, 12:11 użytkownik Ash Berlin-Taylor <
> > >>>> a...@apache.org>
> > >>>>>>>>>>> napisał:
> > >>>>>>>>>>>
> > >>>>>>>>>>>> Oh yes, symlinks will work, with one big caveat: It does
> mean
> > >>> you
> > >>>>>>>> can’t
> > >>>>>>>>>>>> use absolute imports in one common module to another.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> For example
> > >>>>>>>>>>>>
> > >>>
> >
> https://github.com/apache/airflow/blob/4c66ebd06/airflow-core/src/airflow/utils/serve_logs.py#L41
> > >>>>>>>>>>>> where we have
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> ```
> > >>>>>>>>>>>> from airflow.utils.module_loading import import_string
> > >>>>>>>>>>>> ```
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> if we want to move serve_logs into this common lib that is
> > then
> > >>>>>>>>>> symlinked
> > >>>>>>>>>>>> then we wouldn’t be able to have `from
> > >>>> airflow_common.module_loading
> > >>>>>>>>>> import
> > >>>>>>>>>>>> import_string`.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> I can think of two possible solutions here.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> 1) is to allow/require relative imports in this shared lib,
> so
> > >>>> `from
> > >>>>>>>>>>>> .module_loading import import_string`
> > >>>>>>>>>>>> 2) is to use `vendoring`[1] (from the pip maintainers) which
> > >>> will
> > >>>>>>>> handle
> > >>>>>>>>>>>> import-rewriting for us.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> I’d entirely forgot that symlinks in repos was a thing, so I
> > >>>> prepared
> > >>>>>>>> a
> > >>>>>>>>>>>> minimal POC/demo of what vendoring approach could look like
> > here
> > >>>>>>>>>>>>
> > >>>
> >
> https://github.com/apache/airflow/commit/996817782be6071b306a87af9f36fe1cf2d3aaa3
> > >>>>>>>>>>>> Now personally I am more than happy with relative imports,
> but
> > >>>>>>>> generally
> > >>>>>>>>>>>> as a project we have avoided them, so I think that limits
> what
> > >>> we
> > >>>>>>>> could
> > >>>>>>>>>> do
> > >>>>>>>>>>>> with a symlink based approach.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> -ash
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> [1] https://github.com/pradyunsg/vendoring
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>> On 3 Jul 2025, at 10:30, Pavankumar Gopidesu <
> > >>>>>>>> gopidesupa...@gmail.com>
> > >>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>> Thanks Ash
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Yes agree option 2 would be preferred for me. Making sure
> we
> > >>>> have all
> > >>>>>>>>>> the
> > >>>>>>>>>>>>> gaurdriles to protect any unwanted behaviour in code
> sharing
> > >>> and
> > >>>>>>>>>>>> executing
> > >>>>>>>>>>>>> right of tests between the packages.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Agree with others, option 2 would be
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> On Thu, Jul 3, 2025 at 10:02 AM Amogh Desai <
> > >>>>>>>> amoghdesai....@gmail.com>
> > >>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Thanks for starting this discussion, Ash.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> I would prefer option 2 here with proper tooling to handle
> > the
> > >>>> code
> > >>>>>>>>>>>>>> duplication at *release* time.
> > >>>>>>>>>>>>>> It is best to have a dist that has all it needs in itself.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Option 1 could very quickly get out of hand and if we
> decide
> > >>> to
> > >>>>>>>>>> separate
> > >>>>>>>>>>>>>> triggerer /
> > >>>>>>>>>>>>>> dag processor / config etc etc as separate packages, back
> > >>>> compat is
> > >>>>>>>>>>>> going
> > >>>>>>>>>>>>>> to be a nightmare
> > >>>>>>>>>>>>>> and will bite us harder than we anticipate.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Thanks & Regards,
> > >>>>>>>>>>>>>> Amogh Desai
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> On Thu, Jul 3, 2025 at 1:12 AM Kaxil Naik <
> > >>> kaxiln...@gmail.com>
> > >>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>> I prefer Option 2 as well to avoid matrix of dependencies
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> On Thu, 3 Jul 2025 at 01:03, Jens Scheffler
> > >>>>>>>>>> <j_scheff...@gmx.de.invalid
> > >>>>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> I'd also rather prefer option 2 - reason here is it is
> > >>> rather
> > >>>>>>>>>>>> pragmatic
> > >>>>>>>>>>>>>>>> and we no not need to cut another package and have less
> > >>>> package
> > >>>>>>>>>> counts
> > >>>>>>>>>>>>>>>> and dependencies.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> I remember some time ago I was checking (together with
> > >>> Jarek,
> > >>>> I am
> > >>>>>>>>>> not
> > >>>>>>>>>>>>>>>> sure anymore...) if the usage of symlinks would be
> > possible.
> > >>>> To
> > >>>>>>>> keep
> > >>>>>>>>>>>>>> the
> > >>>>>>>>>>>>>>>> source in one package but "symlink" it into another. If
> > then
> > >>>> at
> > >>>>>>>>>> point
> > >>>>>>>>>>>>>> of
> > >>>>>>>>>>>>>>>> packaging/release the files are materialized we have 1
> set
> > >>> of
> > >>>>>>>> code.
> > >>>>>>>>>>>>>>>> Otherwise if not possible still the redundancy could be
> > >>>> solved by
> > >>>>>>>> a
> > >>>>>>>>>>>>>>>> pre-commit hook - and in Git the files are de-duplicated
> > >>>> anyway
> > >>>>>>>>>> based
> > >>>>>>>>>>>>>> on
> > >>>>>>>>>>>>>>>> content hash, so this does not hurt.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> On 02.07.25 18:49, Shahar Epstein wrote:
> > >>>>>>>>>>>>>>>>> I support option 2 with proper automation & CI - the
> > >>>> reasonings
> > >>>>>>>>>>>>>> you've
> > >>>>>>>>>>>>>>>>> shown for that make sense to me.
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Shahar
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> On Wed, Jul 2, 2025 at 3:36 PM Ash Berlin-Taylor <
> > >>>> a...@apache.org
> > >>>>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>>>> Hello everyone,
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> As we work on finishing off the code-level separation
> of
> > >>>> Task
> > >>>>>>>> SDK
> > >>>>>>>>>>>>>> and
> > >>>>>>>>>>>>>>>> Core
> > >>>>>>>>>>>>>>>>>> (scheduler etc) we have come across some situations
> > where
> > >>> we
> > >>>>>>>> would
> > >>>>>>>>>>>>>>> like
> > >>>>>>>>>>>>>>>> to
> > >>>>>>>>>>>>>>>>>> share code between these.
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> However it’s not as straight forward of “just put it
> in
> > a
> > >>>> common
> > >>>>>>>>>>>>>> dist
> > >>>>>>>>>>>>>>>> they
> > >>>>>>>>>>>>>>>>>> both depend upon” because one of the goals of the Task
> > SDK
> > >>>>>>>>>>>>>> separation
> > >>>>>>>>>>>>>>>> was
> > >>>>>>>>>>>>>>>>>> to have 100% complete version independence between the
> > >>> two,
> > >>>>>>>>>> ideally
> > >>>>>>>>>>>>>>>> even if
> > >>>>>>>>>>>>>>>>>> they are built into the same image and venv. Most of
> the
> > >>>> reason
> > >>>>>>>>>> why
> > >>>>>>>>>>>>>>> this
> > >>>>>>>>>>>>>>>>>> isn’t straight forward comes down to backwards
> > >>>> compatibility -
> > >>>>>>>> if
> > >>>>>>>>>> we
> > >>>>>>>>>>>>>>>> make
> > >>>>>>>>>>>>>>>>>> an change to the common/shared distribution
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> We’ve listed the options we have thought about in
> > >>>>>>>>>>>>>>>>>> https://github.com/apache/airflow/issues/51545 (but
> > that
> > >>>> covers
> > >>>>>>>>>>>>>> some
> > >>>>>>>>>>>>>>>> more
> > >>>>>>>>>>>>>>>>>> things that I don’t want to get in to in this
> discussion
> > >>>> such as
> > >>>>>>>>>>>>>>>> possibly
> > >>>>>>>>>>>>>>>>>> separating operators and executors out of a single
> > >>> provider
> > >>>>>>>> dist.)
> > >>>>>>>>>>>>>>>>>> To give a concrete example of some code I would like
> to
> > >>>> share
> > >>>
> >
> https://github.com/apache/airflow/blob/84897570bf7e438afb157ba4700768ea74824295/airflow-core/src/airflow/_logging/structlog.py
> > >>>>>>>>>>>>>>>>>> — logging config. Another thing we will want to share
> > will
> > >>>> be
> > >>>>>>>> the
> > >>>>>>>>>>>>>>>>>> AirflowConfigParser class from airflow.configuration
> > (but
> > >>>>>>>> notably:
> > >>>>>>>>>>>>>>> only
> > >>>>>>>>>>>>>>>> the
> > >>>>>>>>>>>>>>>>>> parser class, _not_ the default config values, again,
> > lets
> > >>>> not
> > >>>>>>>>>> dwell
> > >>>>>>>>>>>>>>> on
> > >>>>>>>>>>>>>>>> the
> > >>>>>>>>>>>>>>>>>> specifics of that)
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> So to bring the options listed in the issue here for
> > >>>> discussion,
> > >>>>>>>>>>>>>>> broadly
> > >>>>>>>>>>>>>>>>>> speaking there are two high-level approaches:
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> 1. A single shared distribution
> > >>>>>>>>>>>>>>>>>> 2. No shared package and copy/duplicate code
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> The advantage of Approach 1 is that we only have the
> > code
> > >>>> in one
> > >>>>>>>>>>>>>>> place.
> > >>>>>>>>>>>>>>>>>> However for me, at least in this specific case of
> > Logging
> > >>>> config
> > >>>>>>>>>> or
> > >>>>>>>>>>>>>>>>>> AirflowConfigParser class is that backwards
> > compatibility
> > >>> is
> > >>>>>>>> much
> > >>>>>>>>>>>>>> much
> > >>>>>>>>>>>>>>>>>> harder.
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> The main advantage of Approach 2 is the the code is
> > >>> released
> > >>>>>>>>>>>>>>>> with/embedded
> > >>>>>>>>>>>>>>>>>> in the dist (i.e. apache-airflow-task-sdk would
> contain
> > >>> the
> > >>>>>>>> right
> > >>>>>>>>>>>>>>>> version
> > >>>>>>>>>>>>>>>>>> of the logging config and ConfigParser etc). The
> > downside
> > >>> is
> > >>>>>>>> that
> > >>>>>>>>>>>>>>> either
> > >>>>>>>>>>>>>>>>>> the code will need to be duplicated in the repo, or
> > better
> > >>>> yet
> > >>>>>>>> it
> > >>>>>>>>>>>>>>> would
> > >>>>>>>>>>>>>>>>>> live in a single place in the repo, but some tooling
> > (TBD)
> > >>>> will
> > >>>>>>>>>>>>>>>>>> automatically handle the duplication, either at commit
> > >>>> time, or
> > >>>>>>>> my
> > >>>>>>>>>>>>>>>>>> preference, at release time.
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> For this kind of shared “utility” code I am very
> > strongly
> > >>>>>>>> leaning
> > >>>>>>>>>>>>>>>> towards
> > >>>>>>>>>>>>>>>>>> option 2 with automation, as otherwise I think the
> > >>> backwards
> > >>>>>>>>>>>>>>>> compatibility
> > >>>>>>>>>>>>>>>>>> requirements would make it unworkable (very quickly
> over
> > >>>> time
> > >>>>>>>> the
> > >>>>>>>>>>>>>>>>>> combinations we would have to test would just be
> > >>>> unreasonable)
> > >>>>>>>>>> and I
> > >>>>>>>>>>>>>>>> don’t
> > >>>>>>>>>>>>>>>>>> feel confident we can have things as stable as we need
> > to
> > >>>> really
> > >>>>>>>>>>>>>>> deliver
> > >>>>>>>>>>>>>>>>>> the version separation/independency I want to delivery
> > >>> with
> > >>>>>>>>>> AIP-72.
> > >>>>>>>>>>>>>>>>>> So unless someone feels very strongly about this, I
> will
> > >>>> come up
> > >>>>>>>>>>>>>> with
> > >>>>>>>>>>>>>>> a
> > >>>>>>>>>>>>>>>>>> draft PR for further discussion that will implement
> code
> > >>>> sharing
> > >>>>>>>>>> via
> > >>>>>>>>>>>>>>>>>> “vendoring” it at build time. I have an idea of how I
> > can
> > >>>>>>>> achieve
> > >>>>>>>>>>>>>> this
> > >>>>>>>>>>>>>>>> so
> > >>>>>>>>>>>>>>>>>> we have a single version in the repo and it’ll work
> > there,
> > >>>> but
> > >>>>>>>> at
> > >>>>>>>>>>>>>>>> runtime
> > >>>>>>>>>>>>>>>>>> we vendor it in to the shipped dist so it lives at
> > >>> something
> > >>>>>>>> like
> > >>>>>>>>>>>>>>>>>> `airflow.sdk._vendor` etc.
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> In terms of repo layout, this likely means we would
> end
> > up
> > >>>> with:
> > >>>>>>>>>>>>>>>>>> airflow-core/pyproject.toml
> > >>>>>>>>>>>>>>>>>> airflow-core/src/
> > >>>>>>>>>>>>>>>>>> airflow-core/tests/
> > >>>>>>>>>>>>>>>>>> task-sdk/pyproject.toml
> > >>>>>>>>>>>>>>>>>> task-sdk/src/
> > >>>>>>>>>>>>>>>>>> task-sdk/tests/
> > >>>>>>>>>>>>>>>>>> airflow-common/src
> > >>>>>>>>>>>>>>>>>> airflow-common/tests/
> > >>>>>>>>>>>>>>>>>> # Possibly no airflow-common/pyproject.toml, as deps
> > would
> > >>>> be
> > >>>>>>>>>>>>>> included
> > >>>>>>>>>>>>>>>> in
> > >>>>>>>>>>>>>>>>>> the downstream projects. TBD.
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> Thoughts and feedback welcomed.
> > >>>>
> ---------------------------------------------------------------------
> > >>>>>>>>>>>>>>>> To unsubscribe, e-mail:
> > dev-unsubscr...@airflow.apache.org
> > >>>>>>>>>>>>>>>> For additional commands, e-mail:
> > >>> dev-h...@airflow.apache.org
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>
> > >>> ---------------------------------------------------------------------
> > >>>>>>>> To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
> > >>>>>>>> For additional commands, e-mail: dev-h...@airflow.apache.org
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>
> > ---------------------------------------------------------------------
> > >>>>>> To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
> > >>>>>> For additional commands, e-mail: dev-h...@airflow.apache.org
> > >>>>>>
> > >>>>>
> ---------------------------------------------------------------------
> > >>>>> To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
> > >>>>> For additional commands, e-mail: dev-h...@airflow.apache.org
> > >>>>>
> > >>>>
> ---------------------------------------------------------------------
> > >>>> To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
> > >>>> For additional commands, e-mail: dev-h...@airflow.apache.org
> > >>>>
> > >>>>
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
> > > For additional commands, e-mail: dev-h...@airflow.apache.org
> > >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
> > For additional commands, e-mail: dev-h...@airflow.apache.org
> >
> >
>

Reply via email to