Thanks Ash! Looks cool! I like the structure. This will enable all the combinations and structure looks easy to grasp. No strong stance on the naming other than maybe it is a bit long with `and`, `core_ctl` could be shorter, since no import path is defined like that, we can give any name for sure.
Best regards, On Mon, 7 Jul 2025, 21:51 Jarek Potiuk, <ja...@potiuk.com> wrote: > Looks good but I think we should find some better logical name for > core_and_sdk :) > > pon., 7 lip 2025, 21:44 użytkownik Jens Scheffler > <j_scheff...@gmx.de.invalid> napisał: > > > Cool! Especially the "shared" folder with the ability to have > > N-combinations w/o exploding project repo root! > > > > On 07.07.25 14:43, Ash Berlin-Taylor wrote: > > > Oh, and all of this will be explain in shared/README.md > > > > > >> On 7 Jul 2025, at 13:41, Ash Berlin-Taylor <a...@apache.org> wrote: > > >> > > >> Okay, so it seems we have agreement on the approach here, so I’ll > > continue with this, and on the dev call it was mentioned that > > “airflow-common” wasn’t a great name, so here is my proposal for the file > > structure; > > >> > > >> ``` > > >> / > > >> task-sdk/... > > >> airflow-core/... > > >> shared/ > > >> kuberenetes/ > > >> pyproject.toml > > >> src/ > > >> airflow_kube/__init__.py > > >> core-and-tasksdk/ > > >> pyproject.toml > > >> src/ > > >> airflow_shared/__init__.py > > >> ``` > > >> > > >> Things to note here: the “shared” folder has (the possibility) of > > having multiple different shared “libraries” in it, in this example I am > > supposing a hypothetical shared kuberenetes folder a world in which we > > split the KubePodOperator and the KubeExecutor in to two separate > > distributions (example only, not proposing we do that right now, and that > > will be a separate discussion) > > >> > > >> The other things to note here: > > >> > > >> > > >> - the folder name in shared aims to be “self-documenting”, hence the > > verbose “core-and-tasksdk” to say where the shared library is intended to > > be used. > > >> - the python module itself should almost always have an `airflow_` (or > > maybe `_airflow_`?) prefix so that it does not conflict with anything > else > > we might use. It won’t matter “in production” as those will be vendored > in > > to be imported as `airflow/_vendor/airflow_shared` etc, but avoiding > > conflicts at dev time with the Finder approach is a good safety measure. > > >> > > >> I will start making a real PR for this proposal now, but I’m open to > > feedback (either here, or in the PR when I open it) > > >> > > >> -ash > > >> > > >>> On 4 Jul 2025, at 16:55, Jarek Potiuk <ja...@potiuk.com> wrote: > > >>> > > >>> Yeah we have to try it and test - also building packages happens semi > > >>> frequently when you run `uv sync` (they use some kind of heuristics > to > > >>> decide when) and you can force it with `--reinstall` or `--refresh`. > > >>> Package build also happens every time when you run "ci-image build` > > now in > > >>> breeze so it seems like it will nicely integrate in our workflows. > > >>> > > >>> Looks really cool Ash. > > >>> > > >>> On Fri, Jul 4, 2025 at 5:14 PM Ash Berlin-Taylor <a...@apache.org> > > wrote: > > >>> > > >>>> It’s not just release time, but any time we build a package which > > happens > > >>>> on “every” CI run. The normal unit tests will use code from > > >>>> airflow-common/src/airflow_common; the kube tests which build an > > image will > > >>>> build the dists and vendor in the code from that commit. > > >>>> > > >>>> There is only a single copy of the shared code committed to the > repo, > > so > > >>>> there is never anything to synchronise. > > >>>> > > >>>>> On 4 Jul 2025, at 15:53, Amogh Desai <amoghdesai....@gmail.com> > > wrote: > > >>>>> > > >>>>> Thanks Ash. > > >>>>> > > >>>>> This is really cool and helpful that you were able to test both > > scenarios > > >>>>> -- repo checkout > > >>>>> and also installing from the vendored package and the resolution > > worked > > >>>>> fine too. > > >>>>> > > >>>>> I like this idea compared the to relative import one for few > reasons: > > >>>>> - It feels like it will take some time to adjust to the new coding > > >>>> standard > > >>>>> that we will lay > > >>>>> if we impose relative imports in the shared dist > > >>>>> - We can continue using repo wise absolute import standards, it is > > also > > >>>>> much easier for situations > > >>>>> when we do global search in IDE to find + replace, this could mean > a > > >>>> change > > >>>>> there > > >>>>> - The vendoring work is a proven and established paradigm across > > projects > > >>>>> and would > > >>>>> out of box give us the build tooling we need also > > >>>>> > > >>>>> Nothing too against the relative import but with the evidence > > provided > > >>>>> above, vendored approach > > >>>>> seems to only do us good. > > >>>>> > > >>>>> Regarding synchronizing it, release time should be fine as long as > we > > >>>> have > > >>>>> a good CI workflow to probably > > >>>>> catch such issues per PR if changes are made in shared dist? > > (precommit > > >>>>> would make it really slow i guess) > > >>>>> > > >>>>> If we can run our tests with vendored code we should be mostly > > covered. > > >>>>> > > >>>>> Good effort all! > > >>>>> > > >>>>> Thanks & Regards, > > >>>>> Amogh Desai > > >>>>> > > >>>>> > > >>>>>> On Fri, Jul 4, 2025 at 7:23 PM Ash Berlin-Taylor <a...@apache.org> > > >>>> wrote: > > >>>>>> Okay, I think I’ve got something that works and I’m happy with. > > >>>>>> > > >>>>>> > > >>>>>> > > >>>> > > > https://github.com/astronomer/airflow/tree/shared-vendored-lib-tasksdk-and-core > > >>>>>> This produces the following from `uv build task-sdk` > > >>>>>> - > > >>>>>> > > >>>> > > > https://github.com/user-attachments/files/21058976/apache_airflow_task_sdk-1.1.0.tar.gz > > >>>>>> - > > >>>>>> > > >>>> > > > https://github.com/user-attachments/files/21058996/apache_airflow_task_sdk-1.1.0-py3-none-any.whl.zip > > >>>>>> (`.whl.zip` as GH won't allow .whl upload, but will .zip) > > >>>>>> > > >>>>>> ``` > > >>>>>> ❯ unzip -l > dist/apache_airflow_task_sdk-1.1.0-py3-none-any.whl.zip | > > >>>> grep > > >>>>>> _vendor > > >>>>>> 50 02-02-2020 00:00 airflow/sdk/_vendor/.gitignore > > >>>>>> 2082 02-02-2020 00:00 airflow/sdk/_vendor/__init__.py > > >>>>>> 28 02-02-2020 00:00 airflow/sdk/_vendor/airflow_common.pyi > > >>>>>> 18 02-02-2020 00:00 airflow/sdk/_vendor/vendor.txt > > >>>>>> 785 02-02-2020 00:00 > > >>>>>> airflow/sdk/_vendor/airflow_common/__init__.py > > >>>>>> 10628 02-02-2020 00:00 > > >>>>>> airflow/sdk/_vendor/airflow_common/timezone.py > > >>>>>> ``` > > >>>>>> > > >>>>>> And similarly in the .tar.gz, so our “sdist” is complete too: > > >>>>>> ``` > > >>>>>> ❯ tar -tzf dist/apache_airflow_task_sdk-1.1.0.tar.gz |grep _vendor > > >>>>>> apache_airflow_task_sdk-1.1.0/src/airflow/sdk/_vendor/.gitignore > > >>>>>> apache_airflow_task_sdk-1.1.0/src/airflow/sdk/_vendor/__init__.py > > >>>>>> > > apache_airflow_task_sdk-1.1.0/src/airflow/sdk/_vendor/airflow_common.pyi > > >>>>>> apache_airflow_task_sdk-1.1.0/src/airflow/sdk/_vendor/vendor.txt > > >>>>>> > > >>>>>> > > >>>> > > > apache_airflow_task_sdk-1.1.0/src/airflow/sdk/_vendor/airflow_common/__init__.py > > >>>>>> > > >>>> > > > apache_airflow_task_sdk-1.1.0/src/airflow/sdk/_vendor/airflow_common/timezone.py > > >>>>>> ``` > > >>>>>> > > >>>>>> The plugin works at build time by including/copying the libs > > specified > > >>>> in > > >>>>>> vendor.txt into place (and let `vendoring` take care of import > > >>>> rewrites.) > > >>>>>> For the imports to continue to work at “dev” time/from a repo > > checkout, > > >>>> I > > >>>>>> have added a import finder to `sys.meta_path`, and since it’s at > the > > >>>> end of > > >>>>>> the list it will only be used if the normal import can’t find > > things. > > >>>>>> > > >>>>>> > > >>>>>> > > >>>> > > > https://github.com/astronomer/airflow/blob/996817782be6071b306a87af9f36fe1cf2d3aaa3/task-sdk/src/airflow/sdk/_vendor/__init__.py > > >>>>>> This doesn’t quite give us the same runtime effect “import > > rewriting” > > >>>>>> affect, as in this approach `airflow_common` is directly loaded > > (i.e. > > >>>>>> airflow.sdk._vendor.airflow_common and airflow_common exist in > > >>>>>> sys.modules), but it does work for everything that I was able to > > test.. > > >>>>>> > > >>>>>> I tested it with the diff at the end of this message. My test > > ipython > > >>>>>> shell: > > >>>>>> > > >>>>>> ``` > > >>>>>> In [1]: from airflow.sdk._vendor.airflow_common.timezone import > foo > > >>>>>> > > >>>>>> In [2]: foo > > >>>>>> Out[2]: 1 > > >>>>>> > > >>>>>> In [3]: import airflow.sdk._vendor.airflow_common > > >>>>>> > > >>>>>> In [4]: import airflow.sdk._vendor.airflow_common.timezone > > >>>>>> > > >>>>>> In [5]: airflow.sdk._vendor.airflow_common.__file__ > > >>>>>> Out[5]: > > >>>>>> > > >>>> > > > '/Users/ash/code/airflow/airflow/airflow-common/src/airflow_common/__init__.py' > > >>>>>> In [6]: airflow.sdk._vendor.airflow_common.timezone.__file__ > > >>>>>> Out[6]: > > >>>>>> > > >>>> > > > '/Users/ash/code/airflow/airflow/airflow-common/src/airflow_common/timezone.py' > > >>>>>> ``` > > >>>>>> > > >>>>>> > > >>>>>> And in an standalone environment with the SDK dist I built (it > > needed > > >>>> the > > >>>>>> matching airflow-core right now, but that is nothing to do with > this > > >>>>>> discussion): > > >>>>>> > > >>>>>> ``` > > >>>>>> ❯ _AIRFLOW__AS_LIBRARY=1 uvx --python 3.12 --with > > >>>>>> dist/apache_airflow_core-3.1.0-py3-none-any.whl --with > > >>>>>> dist/apache_airflow_task_sdk-1.1.0-py3-none-any.whl ipython > > >>>>>> Python 3.12.7 (main, Oct 16 2024, 07:12:08) [Clang 18.1.8 ] > > >>>>>> Type 'copyright', 'credits' or 'license' for more information > > >>>>>> IPython 9.4.0 -- An enhanced Interactive Python. Type '?' for > help. > > >>>>>> Tip: You can use `%hist` to view history, see the options with > > >>>> `%history?` > > >>>>>> In [1]: import airflow.sdk._vendor.airflow_common.timezone > > >>>>>> > > >>>>>> In [2]: airflow.sdk._vendor.airflow_common.timezone.__file__ > > >>>>>> Out[2]: > > >>>>>> > > >>>> > > > '/Users/ash/.cache/uv/archive-v0/WWq6r65aPto2eJOyPObEH/lib/python3.12/site-packages/airflow/sdk/_vendor/airflow_common/timezone.py’ > > >>>>>> `` > > >>>>>> > > >>>>>> > > >>>>>> > > >>>>>> ```diff > > >>>>>> diff --git a/airflow-common/src/airflow_common/__init__.py > > >>>>>> b/airflow-common/src/airflow_common/__init__.py > > >>>>>> index 13a83393a9..927b7c6b61 100644 > > >>>>>> --- a/airflow-common/src/airflow_common/__init__.py > > >>>>>> +++ b/airflow-common/src/airflow_common/__init__.py > > >>>>>> @@ -14,3 +14,5 @@ > > >>>>>> # KIND, either express or implied. See the License for the > > >>>>>> # specific language governing permissions and limitations > > >>>>>> # under the License. > > >>>>>> + > > >>>>>> +foo = 1 > > >>>>>> diff --git a/airflow-common/src/airflow_common/timezone.py > > >>>>>> b/airflow-common/src/airflow_common/timezone.py > > >>>>>> index 340b924c66..58384ef20f 100644 > > >>>>>> --- a/airflow-common/src/airflow_common/timezone.py > > >>>>>> +++ b/airflow-common/src/airflow_common/timezone.py > > >>>>>> @@ -36,6 +36,9 @@ _PENDULUM3 = > > >>>>>> version.parse(metadata.version("pendulum")).major == 3 > > >>>>>> # - FixedTimezone(0, "UTC") in pendulum 2 > > >>>>>> utc = pendulum.UTC > > >>>>>> > > >>>>>> + > > >>>>>> +from airflow_common import foo > > >>>>>> + > > >>>>>> TIMEZONE: Timezone > > >>>>>> > > >>>>>> > > >>>>>> ``` > > >>>>>> > > >>>>>>>> On 3 Jul 2025, at 12:43, Jarek Potiuk <ja...@potiuk.com> wrote: > > >>>>>>> I think both approaches are doable: > > >>>>>>> > > >>>>>>> 1) -> We can very easily prevent bad imports by pre-commit when > > >>>> importing > > >>>>>>> from different distributions and make sure we are only doing > > relative > > >>>>>>> imports in the shared modules. We are doing plenty of this > > already. And > > >>>>>> yes > > >>>>>>> it would require relative links we currently do not allow. > > >>>>>>> > > >>>>>>> 2) -> has one disadvantage that someone at some point in time > will > > have > > >>>>>> to > > >>>>>>> decide to synchronize this and if it happens just before release > > (I bet > > >>>>>>> this is going to happen) this will lead to solving problems that > > would > > >>>>>>> normally be solved during PR when you make a change (i.e. > symbolic > > link > > >>>>>> has > > >>>>>>> the advantage that whoever modifies shared code will be > immediately > > >>>>>>> notified in their PR - that they broke something because either > > static > > >>>>>>> checks or mypy or tests fail. > > >>>>>>> > > >>>>>>> Ash, do you have an idea of a process (who and when) does the > > >>>>>>> synchronisation in case of vendoring? Maybe we could solve it if > > it is > > >>>>>> done > > >>>>>>> more frequently and with some regularity? We could potentially > > force > > >>>>>>> re-vendoring at PR time as well any time shared code changes (and > > >>>> prevent > > >>>>>>> it by pre-commit. And I can't think of some place (other than > > releases) > > >>>>>> in > > >>>>>>> our development workflow and that seems to be a bit too late as > > puts an > > >>>>>>> extra effort on fixing potential incompatibilities introduced on > > >>>> release > > >>>>>>> manager and delays the release. WDYT? > > >>>>>>> > > >>>>>>> Re: relative links. I think for a shared library we could > > potentially > > >>>>>> relax > > >>>>>>> this and allow them (and actually disallow absolute links in the > > pieces > > >>>>>> of > > >>>>>>> code that are shared - again, by pre-commit). As I recall, the > only > > >>>>>> reason > > >>>>>>> we forbade the relative links is because of how we are (or maybe > > were) > > >>>>>>> doing DAG parsing and failures resulting from it. So we decided > to > > just > > >>>>>> not > > >>>>>>> allow it to keep consistency. The way how Dag parsing works is > that > > >>>> when > > >>>>>>> you are using importlib to read the Dag from a file, the relative > > >>>> imports > > >>>>>>> do not work as it does not know what they should be relative to. > > But if > > >>>>>>> relative import is done from an imported package, it should be no > > >>>>>> problem, > > >>>>>>> I think - otherwise our Dags would not be able to import any > > library > > >>>> that > > >>>>>>> uses relative imports. > > >>>>>>> > > >>>>>>> Of course consistency might be the reason why we do not want to > > >>>> introduce > > >>>>>>> relative imports. I don't see it as an issue if it is guarded by > > >>>>>> pre-commit > > >>>>>>> though. > > >>>>>>> > > >>>>>>> J. > > >>>>>>> > > >>>>>>> > > >>>>>>> J. > > >>>>>>> > > >>>>>>> > > >>>>>>> czw., 3 lip 2025, 12:11 użytkownik Ash Berlin-Taylor < > > a...@apache.org> > > >>>>>>> napisał: > > >>>>>>> > > >>>>>>>> Oh yes, symlinks will work, with one big caveat: It does mean > you > > >>>> can’t > > >>>>>>>> use absolute imports in one common module to another. > > >>>>>>>> > > >>>>>>>> For example > > >>>>>>>> > > >>>> > > > https://github.com/apache/airflow/blob/4c66ebd06/airflow-core/src/airflow/utils/serve_logs.py#L41 > > >>>>>>>> where we have > > >>>>>>>> > > >>>>>>>> ``` > > >>>>>>>> from airflow.utils.module_loading import import_string > > >>>>>>>> ``` > > >>>>>>>> > > >>>>>>>> if we want to move serve_logs into this common lib that is then > > >>>>>> symlinked > > >>>>>>>> then we wouldn’t be able to have `from > > airflow_common.module_loading > > >>>>>> import > > >>>>>>>> import_string`. > > >>>>>>>> > > >>>>>>>> I can think of two possible solutions here. > > >>>>>>>> > > >>>>>>>> 1) is to allow/require relative imports in this shared lib, so > > `from > > >>>>>>>> .module_loading import import_string` > > >>>>>>>> 2) is to use `vendoring`[1] (from the pip maintainers) which > will > > >>>> handle > > >>>>>>>> import-rewriting for us. > > >>>>>>>> > > >>>>>>>> I’d entirely forgot that symlinks in repos was a thing, so I > > prepared > > >>>> a > > >>>>>>>> minimal POC/demo of what vendoring approach could look like here > > >>>>>>>> > > >>>> > > > https://github.com/apache/airflow/commit/996817782be6071b306a87af9f36fe1cf2d3aaa3 > > >>>>>>>> Now personally I am more than happy with relative imports, but > > >>>> generally > > >>>>>>>> as a project we have avoided them, so I think that limits what > we > > >>>> could > > >>>>>> do > > >>>>>>>> with a symlink based approach. > > >>>>>>>> > > >>>>>>>> -ash > > >>>>>>>> > > >>>>>>>> [1] https://github.com/pradyunsg/vendoring > > >>>>>>>> > > >>>>>>>>> On 3 Jul 2025, at 10:30, Pavankumar Gopidesu < > > >>>> gopidesupa...@gmail.com> > > >>>>>>>> wrote: > > >>>>>>>>> Thanks Ash > > >>>>>>>>> > > >>>>>>>>> Yes agree option 2 would be preferred for me. Making sure we > > have all > > >>>>>> the > > >>>>>>>>> gaurdriles to protect any unwanted behaviour in code sharing > and > > >>>>>>>> executing > > >>>>>>>>> right of tests between the packages. > > >>>>>>>>> > > >>>>>>>>> Agree with others, option 2 would be > > >>>>>>>>> > > >>>>>>>>> On Thu, Jul 3, 2025 at 10:02 AM Amogh Desai < > > >>>> amoghdesai....@gmail.com> > > >>>>>>>>> wrote: > > >>>>>>>>> > > >>>>>>>>>> Thanks for starting this discussion, Ash. > > >>>>>>>>>> > > >>>>>>>>>> I would prefer option 2 here with proper tooling to handle the > > code > > >>>>>>>>>> duplication at *release* time. > > >>>>>>>>>> It is best to have a dist that has all it needs in itself. > > >>>>>>>>>> > > >>>>>>>>>> Option 1 could very quickly get out of hand and if we decide > to > > >>>>>> separate > > >>>>>>>>>> triggerer / > > >>>>>>>>>> dag processor / config etc etc as separate packages, back > > compat is > > >>>>>>>> going > > >>>>>>>>>> to be a nightmare > > >>>>>>>>>> and will bite us harder than we anticipate. > > >>>>>>>>>> > > >>>>>>>>>> Thanks & Regards, > > >>>>>>>>>> Amogh Desai > > >>>>>>>>>> > > >>>>>>>>>> > > >>>>>>>>>> On Thu, Jul 3, 2025 at 1:12 AM Kaxil Naik < > kaxiln...@gmail.com> > > >>>>>> wrote: > > >>>>>>>>>>> I prefer Option 2 as well to avoid matrix of dependencies > > >>>>>>>>>>> > > >>>>>>>>>>> On Thu, 3 Jul 2025 at 01:03, Jens Scheffler > > >>>>>> <j_scheff...@gmx.de.invalid > > >>>>>>>>>>> wrote: > > >>>>>>>>>>> > > >>>>>>>>>>>> I'd also rather prefer option 2 - reason here is it is > rather > > >>>>>>>> pragmatic > > >>>>>>>>>>>> and we no not need to cut another package and have less > > package > > >>>>>> counts > > >>>>>>>>>>>> and dependencies. > > >>>>>>>>>>>> > > >>>>>>>>>>>> I remember some time ago I was checking (together with > Jarek, > > I am > > >>>>>> not > > >>>>>>>>>>>> sure anymore...) if the usage of symlinks would be possible. > > To > > >>>> keep > > >>>>>>>>>> the > > >>>>>>>>>>>> source in one package but "symlink" it into another. If then > > at > > >>>>>> point > > >>>>>>>>>> of > > >>>>>>>>>>>> packaging/release the files are materialized we have 1 set > of > > >>>> code. > > >>>>>>>>>>>> Otherwise if not possible still the redundancy could be > > solved by > > >>>> a > > >>>>>>>>>>>> pre-commit hook - and in Git the files are de-duplicated > > anyway > > >>>>>> based > > >>>>>>>>>> on > > >>>>>>>>>>>> content hash, so this does not hurt. > > >>>>>>>>>>>> > > >>>>>>>>>>>> On 02.07.25 18:49, Shahar Epstein wrote: > > >>>>>>>>>>>>> I support option 2 with proper automation & CI - the > > reasonings > > >>>>>>>>>> you've > > >>>>>>>>>>>>> shown for that make sense to me. > > >>>>>>>>>>>>> > > >>>>>>>>>>>>> > > >>>>>>>>>>>>> Shahar > > >>>>>>>>>>>>> > > >>>>>>>>>>>>> > > >>>>>>>>>>>>> On Wed, Jul 2, 2025 at 3:36 PM Ash Berlin-Taylor < > > a...@apache.org > > >>>>>>>>>>> wrote: > > >>>>>>>>>>>>>> Hello everyone, > > >>>>>>>>>>>>>> > > >>>>>>>>>>>>>> As we work on finishing off the code-level separation of > > Task > > >>>> SDK > > >>>>>>>>>> and > > >>>>>>>>>>>> Core > > >>>>>>>>>>>>>> (scheduler etc) we have come across some situations where > we > > >>>> would > > >>>>>>>>>>> like > > >>>>>>>>>>>> to > > >>>>>>>>>>>>>> share code between these. > > >>>>>>>>>>>>>> > > >>>>>>>>>>>>>> However it’s not as straight forward of “just put it in a > > common > > >>>>>>>>>> dist > > >>>>>>>>>>>> they > > >>>>>>>>>>>>>> both depend upon” because one of the goals of the Task SDK > > >>>>>>>>>> separation > > >>>>>>>>>>>> was > > >>>>>>>>>>>>>> to have 100% complete version independence between the > two, > > >>>>>> ideally > > >>>>>>>>>>>> even if > > >>>>>>>>>>>>>> they are built into the same image and venv. Most of the > > reason > > >>>>>> why > > >>>>>>>>>>> this > > >>>>>>>>>>>>>> isn’t straight forward comes down to backwards > > compatibility - > > >>>> if > > >>>>>> we > > >>>>>>>>>>>> make > > >>>>>>>>>>>>>> an change to the common/shared distribution > > >>>>>>>>>>>>>> > > >>>>>>>>>>>>>> > > >>>>>>>>>>>>>> We’ve listed the options we have thought about in > > >>>>>>>>>>>>>> https://github.com/apache/airflow/issues/51545 (but that > > covers > > >>>>>>>>>> some > > >>>>>>>>>>>> more > > >>>>>>>>>>>>>> things that I don’t want to get in to in this discussion > > such as > > >>>>>>>>>>>> possibly > > >>>>>>>>>>>>>> separating operators and executors out of a single > provider > > >>>> dist.) > > >>>>>>>>>>>>>> To give a concrete example of some code I would like to > > share > > >>>>>>>>>>>>>> > > >>>> > > > https://github.com/apache/airflow/blob/84897570bf7e438afb157ba4700768ea74824295/airflow-core/src/airflow/_logging/structlog.py > > >>>>>>>>>>>>>> — logging config. Another thing we will want to share will > > be > > >>>> the > > >>>>>>>>>>>>>> AirflowConfigParser class from airflow.configuration (but > > >>>> notably: > > >>>>>>>>>>> only > > >>>>>>>>>>>> the > > >>>>>>>>>>>>>> parser class, _not_ the default config values, again, lets > > not > > >>>>>> dwell > > >>>>>>>>>>> on > > >>>>>>>>>>>> the > > >>>>>>>>>>>>>> specifics of that) > > >>>>>>>>>>>>>> > > >>>>>>>>>>>>>> So to bring the options listed in the issue here for > > discussion, > > >>>>>>>>>>> broadly > > >>>>>>>>>>>>>> speaking there are two high-level approaches: > > >>>>>>>>>>>>>> > > >>>>>>>>>>>>>> 1. A single shared distribution > > >>>>>>>>>>>>>> 2. No shared package and copy/duplicate code > > >>>>>>>>>>>>>> > > >>>>>>>>>>>>>> The advantage of Approach 1 is that we only have the code > > in one > > >>>>>>>>>>> place. > > >>>>>>>>>>>>>> However for me, at least in this specific case of Logging > > config > > >>>>>> or > > >>>>>>>>>>>>>> AirflowConfigParser class is that backwards compatibility > is > > >>>> much > > >>>>>>>>>> much > > >>>>>>>>>>>>>> harder. > > >>>>>>>>>>>>>> > > >>>>>>>>>>>>>> The main advantage of Approach 2 is the the code is > released > > >>>>>>>>>>>> with/embedded > > >>>>>>>>>>>>>> in the dist (i.e. apache-airflow-task-sdk would contain > the > > >>>> right > > >>>>>>>>>>>> version > > >>>>>>>>>>>>>> of the logging config and ConfigParser etc). The downside > is > > >>>> that > > >>>>>>>>>>> either > > >>>>>>>>>>>>>> the code will need to be duplicated in the repo, or better > > yet > > >>>> it > > >>>>>>>>>>> would > > >>>>>>>>>>>>>> live in a single place in the repo, but some tooling (TBD) > > will > > >>>>>>>>>>>>>> automatically handle the duplication, either at commit > > time, or > > >>>> my > > >>>>>>>>>>>>>> preference, at release time. > > >>>>>>>>>>>>>> > > >>>>>>>>>>>>>> For this kind of shared “utility” code I am very strongly > > >>>> leaning > > >>>>>>>>>>>> towards > > >>>>>>>>>>>>>> option 2 with automation, as otherwise I think the > backwards > > >>>>>>>>>>>> compatibility > > >>>>>>>>>>>>>> requirements would make it unworkable (very quickly over > > time > > >>>> the > > >>>>>>>>>>>>>> combinations we would have to test would just be > > unreasonable) > > >>>>>> and I > > >>>>>>>>>>>> don’t > > >>>>>>>>>>>>>> feel confident we can have things as stable as we need to > > really > > >>>>>>>>>>> deliver > > >>>>>>>>>>>>>> the version separation/independency I want to delivery > with > > >>>>>> AIP-72. > > >>>>>>>>>>>>>> So unless someone feels very strongly about this, I will > > come up > > >>>>>>>>>> with > > >>>>>>>>>>> a > > >>>>>>>>>>>>>> draft PR for further discussion that will implement code > > sharing > > >>>>>> via > > >>>>>>>>>>>>>> “vendoring” it at build time. I have an idea of how I can > > >>>> achieve > > >>>>>>>>>> this > > >>>>>>>>>>>> so > > >>>>>>>>>>>>>> we have a single version in the repo and it’ll work there, > > but > > >>>> at > > >>>>>>>>>>>> runtime > > >>>>>>>>>>>>>> we vendor it in to the shipped dist so it lives at > something > > >>>> like > > >>>>>>>>>>>>>> `airflow.sdk._vendor` etc. > > >>>>>>>>>>>>>> > > >>>>>>>>>>>>>> In terms of repo layout, this likely means we would end up > > with: > > >>>>>>>>>>>>>> > > >>>>>>>>>>>>>> airflow-core/pyproject.toml > > >>>>>>>>>>>>>> airflow-core/src/ > > >>>>>>>>>>>>>> airflow-core/tests/ > > >>>>>>>>>>>>>> task-sdk/pyproject.toml > > >>>>>>>>>>>>>> task-sdk/src/ > > >>>>>>>>>>>>>> task-sdk/tests/ > > >>>>>>>>>>>>>> airflow-common/src > > >>>>>>>>>>>>>> airflow-common/tests/ > > >>>>>>>>>>>>>> # Possibly no airflow-common/pyproject.toml, as deps would > > be > > >>>>>>>>>> included > > >>>>>>>>>>>> in > > >>>>>>>>>>>>>> the downstream projects. TBD. > > >>>>>>>>>>>>>> > > >>>>>>>>>>>>>> Thoughts and feedback welcomed. > > >>>>>>>>>>>> > > >>>>>> > > --------------------------------------------------------------------- > > >>>>>>>>>>>> To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org > > >>>>>>>>>>>> For additional commands, e-mail: > dev-h...@airflow.apache.org > > >>>>>>>>>>>> > > >>>>>>>>>>>> > > >>>>>>>> > > >>>>>> > > >>>> > > >>>> > --------------------------------------------------------------------- > > >>>> To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org > > >>>> For additional commands, e-mail: dev-h...@airflow.apache.org > > >>>> > > >>>> > > >> > > >> --------------------------------------------------------------------- > > >> To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org > > >> For additional commands, e-mail: dev-h...@airflow.apache.org > > >> > > > > > > --------------------------------------------------------------------- > > > To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org > > > For additional commands, e-mail: dev-h...@airflow.apache.org > > > > > > > --------------------------------------------------------------------- > > To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org > > For additional commands, e-mail: dev-h...@airflow.apache.org > > > > >