Re: [Discussion] Requesting Feedback on Airflow Survey

2023-10-25 Thread Peter DeJoy
Thanks so much for putting this together, Briana. A few thoughts from my end:

• This isn’t particularly actionable, but right now, the survey feels quite 
long. I wonder if we can consolidate some of the questions that are asking for 
similar pieces of information in an effort to get more folks to complete it.
• Similarly, I feel the pick lists are quite long. Can we flatten the bands 
down for some of the questions? For example, on the question about how many 
production deployments a user is running, we could offer 1, 2-5, 6-20, 21-100, 
and more than 100 as options.
• I’m not sure about the question “how many people at your company directly 
work on data?” I think we’re going to get poor signal there, as there’s an 
argument to be made that every stakeholder at our respective companies works on 
data!
• I’d like to see backfills added to the pick list for improvements in Airflow 
in the “future” section.
• Come to think of it, is there overlap in the “what could be improved” vs 
“what would you like to see net new” questions? For example, I see dataset 
improvement as an option in the net new question but would think I’d find that 
in the improvement section.

I’m sure I’ll have some other thoughts as I dig through; I’ll keep updating 
this thread as things come up!

Thanks again,
Pete
On Oct 25, 2023, 3:32 PM -0400, Jarek Potiuk wrote:
> Agree with Andrey's suggestions - I added a few of mine directly to the doc
> as comments/suggestions. Summary of my comments:
>
> * mention the 2.5 provider compatibility and ask why people are staying
> below 2.5
> * I think suggesting a list of services/tools Airflow might interact with to
> choose from will introduce bias and make it difficult to get a "real"
> impression of what is used. I suggest leaving it "freeform".
> * ask questions about how important security is to them and how they manage
> the security of their deployment (and whether they follow advisories)
> * I think we should also ask a question about "other" orchestration tools and
> what made people choose Airflow or consider others
>
>
> On Wed, Oct 25, 2023 at 6:58 PM Andrey Anshin 
> wrote:
>
> > Hey Briana,
> >
> > Thanks for sharing the questions. Let me share some ideas/improvements, but
> > these are only my thoughts, so they might not be relevant for creating a
> > clear survey.
> >
> > *Which version of Airflow do you currently use*
> > Maybe it's better to keep only these options?
> > - 1.10
> > - <=2.4
> > - 2.5
> > - 2.6
> > - 2.7
> >
> > 1.10 is so legacy now
> > 2.0-2.4 is not so legacy, but the latest providers can't be installed on
> > these versions
> > 2.5.x is the oldest version on which the latest providers can still be
> > installed
> >
> > *Which Metadata Database do you use?*
> >
> > My suggestion is to ask this question, if possible, only for
> > self-hosted Airflow installations. I guess for managed Airflow it would
> > always be Postgres.
> >
> > And split this question into two:
> >
> > First, about the type of DB
> >
> > - MySQL (not MariaDB)
> > - Postgres
> > - Microsoft SQL Server
> > - Other: __
> >
> > Second, about the DB version
> >
> > - Version: __
> > - I don't know
> >
> > Every year a new version of Postgres is released, so if we consider that
> > some users might still use 1.10.x, there is a chance that 9.4 - 9.5 is
> > still in use.
> > I think the more interesting part of this question is whether Postgres has
> > a critical total share (like 80-90% of all respondents), from which we
> > might start thinking about whether we actually need in the future
> > (Airflow 3.x/4.x) anything other than Postgres and SQLite (for testing).
> >
> >
> >
> > 
> > Best Wishes
> > *Andrey Anshin*
> >
> >
> >
> > On Wed, 25 Oct 2023 at 19:08, Briana Okyere
> >  wrote:
> >
> > > Hey All,
> > >
> > > For the last few years, we've sent out surveys to get a sense of the state
> > > of this Airflow community, and this year I've been tasked with distributing
> > > it. I'd love to get your feedback before it's pushed live.
> > >
> > > I've made some minor tweaks to the 2022 survey and added the questions to
> > > this google doc:
> > > <https://docs.google.com/document/d/1FbluXNGq9cI3N9zw1cH4F4cEQfyIQ6tk7tGLo4ZN4AE/edit?usp=sharing>
> > >
> > > Is there anything you think should be added or removed?
> > >
> > > Please note- it is useful to compare data from previous years to this year,
> > > so I believe the majority of the questions should remain similar-ish.
> > > However, there is always room for improvement!
> > >
> > > Here are last year's results for reference:
> > > 
> > >
> > > Also, if anyone is interested in helping with analysis after the results
> > > are in, please let me know.
> > >
> > > --
> > > Briana Okyere
> > > Community Manager
> > > Email: briana.oky...@astronomer.io
> > > Mobile: +1 415.713.9943
> > > Time zone: US Pacific UTC
> > >
> > > 
> > >
> >


Re: [VOTE] Add providers for Pinecone, OpenAI & Cohere to enable first-class LLMOps

2023-10-25 Thread Oliveira, Niko
+1 (binding)

looking forward to having more native LLM capabilities in Airflow!


From: Aritra Basu 
Sent: Wednesday, October 25, 2023 12:10:00 PM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [VOTE] Add providers for Pinecone, 
OpenAI & Cohere to enable first-class LLMOps

+1 (non binding)

--
Regards,
Aritra Basu

On Wed, Oct 25, 2023, 11:02 PM Ferruzzi, Dennis 
wrote:

> +1 (binding)
>
>
>  - ferruzzi
>
>
> 
> From: Jed Cunningham 
> Sent: Wednesday, October 25, 2023 9:54 AM
> To: dev@airflow.apache.org
> Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [VOTE] Add providers for
> Pinecone, OpenAI & Cohere to enable first-class LLMOps
>
> +1 (binding)
>


Re: [Discussion] Requesting Feedback on Airflow Survey

2023-10-25 Thread Jarek Potiuk
Agree with Andrey's suggestions - I added a few of mine directly to the doc
as comments/suggestions. Summary of my comments:

* mention the 2.5 provider compatibility and ask why people are staying
below 2.5
* I think suggesting a list of services/tools Airflow might interact with to
choose from will introduce bias and make it difficult to get a "real"
impression of what is used. I suggest leaving it "freeform".
* ask questions about how important security is to them and how they manage
the security of their deployment (and whether they follow advisories)
* I think we should also ask a question about "other" orchestration tools and
what made people choose Airflow or consider others


On Wed, Oct 25, 2023 at 6:58 PM Andrey Anshin 
wrote:

> Hey Briana,
>
> Thanks for sharing the questions. Let me share some ideas/improvements, but
> these are only my thoughts, so they might not be relevant for creating a
> clear survey.
>
> *Which version of Airflow do you currently use*
> Maybe it's better to keep only these options?
> - 1.10
> - <=2.4
> - 2.5
> - 2.6
> - 2.7
>
> 1.10 is so legacy now
> 2.0-2.4 is not so legacy, but the latest providers can't be installed on
> these versions
> 2.5.x is the oldest version on which the latest providers can still be
> installed
>
> *Which Metadata Database do you use?*
>
> My suggestion is to ask this question, if possible, only for
> self-hosted Airflow installations. I guess for managed Airflow it would
> always be Postgres.
>
> And split this question into two:
>
> First, about the type of DB
>
> - MySQL (not MariaDB)
> - Postgres
> - Microsoft SQL Server
> - Other: __
>
> Second, about the DB version
>
> - Version: __
> - I don't know
>
> Every year a new version of Postgres is released, so if we consider that
> some users might still use 1.10.x, there is a chance that 9.4 - 9.5 is
> still in use.
> I think the more interesting part of this question is whether Postgres has
> a critical total share (like 80-90% of all respondents), from which we
> might start thinking about whether we actually need in the future
> (Airflow 3.x/4.x) anything other than Postgres and SQLite (for testing).
>
>
>
> 
> Best Wishes
> *Andrey Anshin*
>
>
>
> On Wed, 25 Oct 2023 at 19:08, Briana Okyere
>  wrote:
>
> > Hey All,
> >
> > For the last few years, we've sent out surveys to get a sense of the state
> > of this Airflow community, and this year I've been tasked with distributing
> > it. I'd love to get your feedback before it's pushed live.
> >
> > I've made some minor tweaks to the 2022 survey and added the questions to
> > this google doc:
> > <https://docs.google.com/document/d/1FbluXNGq9cI3N9zw1cH4F4cEQfyIQ6tk7tGLo4ZN4AE/edit?usp=sharing>
> >
> > Is there anything you think should be added or removed?
> >
> > Please note- it is useful to compare data from previous years to this year,
> > so I believe the majority of the questions should remain similar-ish.
> > However, there is always room for improvement!
> >
> > Here are last year's results for reference:
> > 
> >
> > Also, if anyone is interested in helping with analysis after the results
> > are in, please let me know.
> >
> > --
> > Briana Okyere
> > Community Manager
> > Email: briana.oky...@astronomer.io
> > Mobile: +1 415.713.9943
> > Time zone: US Pacific UTC
> >
> > 
> >
>


Re: [VOTE] Add providers for Pinecone, OpenAI & Cohere to enable first-class LLMOps

2023-10-25 Thread Aritra Basu
+1 (non binding)

--
Regards,
Aritra Basu

On Wed, Oct 25, 2023, 11:02 PM Ferruzzi, Dennis 
wrote:

> +1 (binding)
>
>
>  - ferruzzi
>
>
> 
> From: Jed Cunningham 
> Sent: Wednesday, October 25, 2023 9:54 AM
> To: dev@airflow.apache.org
> Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [VOTE] Add providers for
> Pinecone, OpenAI & Cohere to enable first-class LLMOps
>
> +1 (binding)
>


Re: Keep Mssql support

2023-10-25 Thread Jarek Potiuk
It can also be that someone migrates the DB in 2.7.3 (to Postgres, for
example) and THEN upgrades that DB to 2.8.

And (at least as I see it) we do not have to release any tools for that. In
fact, I'd strongly discourage writing a tool to do it.

From the experience of the 1.10 -> 2.* migration: if we release a tool, then we
will have to handle (or at least it will be expected from us, the maintainers)
all kinds of problems with configurations, encodings, various
configuration parameters of both Postgres and MsSQL, etc. We had a lot
of experience with migrating 1.10 -> 2.x, where there were some tiny
problems with migration and, since we had released the upgrade check
https://pypi.org/project/apache-airflow-upgrade-check/, people expected it
to work. Even with the most basic problems they could have fixed themselves,
they came back to us with "fix the tool" rather than looking at the specific
problems they had at hand and trying to find solutions. People could have
fine-tuned and modified their DBs and added their own "fixes" (and they might
not even be aware of that, as it was done a year earlier by some admins who
had already left, etc.). Those were precisely the stories we saw when people
migrated to Airflow 2.

I seriously doubt we can write a robust "run me and it will work" migration
tool for this case. And I would be very much against doing it.

My suggestion, if someone is going to do it, is to describe how you can do
such a migration with a low-level set of tools involving manual steps, where
the person doing the migration is knowledgeable about the DBs and
is able to inspect and correct the state of the migration following
clear-text information. It just needs to be done by someone to confirm it
CAN be done with some effort from the side of the DB.

1) Export the MsSQL DB to a file
2) Do a set of manual modifications or search & replace to make it
work for Postgres (or MySQL)
3) Create an empty Airflow DB using "airflow db migrate" in Postgres (or MySQL)
4) Import the export into Postgres (or MySQL)

Step 2 might require (and it should be very clear) manual adjustments for
specific cases (like encodings etc.). The export should be clear-text, with
regular CREATE / INSERT statements, so that in case of any problems the
users (not maintainers) will be able to figure out what other
search/replace they need to do to make it work. Plain SQL. No structure
export/import - just data transfer.
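
To illustrate the kind of search & replace step 2 could involve, here is a
minimal Python sketch. Everything in it is an assumption made for
illustration: the rules, the `RULES` and `convert_line` names and the sample
statement are hypothetical and not part of any Airflow tooling. It is
deliberately naive (it would also rewrite matching text inside data values)
and it does not handle type differences between the two databases, so the
result still has to be inspected by hand.

```python
import re

# Hypothetical rewrite rules for a plain-text MsSQL export.
# A real export will most likely need more (or different) rules.
RULES = [
    # Drop the "[dbo]." schema prefix (assumes the default schema was used).
    (re.compile(r"\[dbo\]\."), ""),
    # [identifier] -> "identifier" (T-SQL brackets to standard SQL quoting).
    (re.compile(r"\[([A-Za-z_][A-Za-z0-9_]*)\]"), r'"\1"'),
    # N'text' -> 'text' (drop the T-SQL unicode literal prefix).
    (re.compile(r"\bN'"), "'"),
]


def convert_line(line: str) -> str:
    """Apply every rule, in order, to one line of the export."""
    for pattern, replacement in RULES:
        line = pattern.sub(replacement, line)
    return line


if __name__ == "__main__":
    sample = (
        "INSERT INTO [dbo].[variable] ([key], [val]) "
        "VALUES (N'env', N'prod');"
    )
    print(convert_line(sample))
```

Keeping the rules in a plain list is intentional: whoever runs the migration
can add or remove a rule when their own export needs it.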

The target DB should be just the standard Airflow DB, created as usual with
`airflow db migrate`.

And the process itself could have limitations. It might take hours and GBs
of disk space. Sometimes, when it fails, it might require starting
the migration from scratch. It might need a run of `airflow db clean` to
limit the amount of data in the file and to make such manual modifications
feasible. It might focus only on migrating some of the data this way, and then
do a manual "airflow CLI" export/import of data like connections, users etc.
And it should involve the necessity of diagnosing and possibly manually fixing
the export/import files if need be, without expecting an imaginary "tool"
to do it automatically.

Our DB is relatively small, and the relations are relatively few. But, looking
at examples from the past, it will be far easier for each individual person
to solve any specific problems they have in their own version of MsSQL than
for us to write a robust and generic tool to fix those problems. And our
discussions/Slack might be a good place where such users could share their
approach so that they can learn from each other - we could
create a dedicated Slack channel. We would help, of course, as usual,
but without promises that there will be a working "one button" solution.

That's at least how I would personally approach this. Just explain and
describe the general approach and prove that it can work, but also set
expectations that it might require an effort to complete it from the side
of the MsSQL database. I think it would be quite unfair to set the
expectation that we can have such a robust one-button migration. I don't
think we can. But maybe that's the experience from seeing some of the
problems of people migrating from 1.10 to 2 several years ago - where all
kinds of things (usually in MySQL) often went wrong.

J.


On Wed, Oct 25, 2023 at 7:12 PM Andrey Anshin 
wrote:

> > Need to have at least a DB schema upgrade supported for MsSQL in order to
> > be able to migrate to a new DB engine in 2.8.0
>
> I think we need to prohibit upgrades for unsupported DBs.
>
> E.g. if the dialect is not in ('postgresql', 'mysql', 'sqlite'), then raise
> an error.
> Keeping DB schema upgrades for an unsupported DB might stop us from
> implementing other stuff around the DB schema, e.g.:
> https://github.com/apache/airflow/pull/34112
> So my assumption is that MsSQL will have support until Airflow 2.7.x
>
> 
> Best Wishes
> *Andrey Anshin*
>
>
>
> On Wed, 25 Oct 2023 at 21:00, Scheffler Jens (XC-DX/ETV5)
>  wrote:
>
> > In response, off topic from Jareks statement: is somebody (already) on

Re: [VOTE] Add providers for Pinecone, OpenAI & Cohere to enable first-class LLMOps

2023-10-25 Thread Ferruzzi, Dennis
+1 (binding)


 - ferruzzi



From: Jed Cunningham 
Sent: Wednesday, October 25, 2023 9:54 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [VOTE] Add providers for Pinecone, 
OpenAI & Cohere to enable first-class LLMOps


+1 (binding)


Re: Keep Mssql support

2023-10-25 Thread Andrey Anshin
> Need to have at least a DB schema upgrade supported for MsSQL in order to
> be able to migrate to a new DB engine in 2.8.0

I think we need to prohibit upgrades for unsupported DBs.

E.g. if the dialect is not in ('postgresql', 'mysql', 'sqlite'), then raise an error.
Keeping DB schema upgrades for an unsupported DB might stop us from implementing
other stuff around the DB schema, e.g.: https://github.com/apache/airflow/pull/34112
So my assumption is that MsSQL will have support until Airflow 2.7.x
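
A minimal sketch of such a guard, in plain Python. The names
`SUPPORTED_DIALECTS`, `UnsupportedDatabaseError` and
`ensure_supported_dialect` are hypothetical and chosen only for this example;
this is not how Airflow implements (or will implement) the check.

```python
# Hypothetical allow-list, taken from the dialect names mentioned above.
SUPPORTED_DIALECTS = frozenset({"postgresql", "mysql", "sqlite"})


class UnsupportedDatabaseError(RuntimeError):
    """Raised when a schema upgrade is attempted on an unsupported backend."""


def ensure_supported_dialect(dialect_name: str) -> None:
    """Raise if dialect_name is not one of the supported dialects."""
    if dialect_name not in SUPPORTED_DIALECTS:
        supported = ", ".join(sorted(SUPPORTED_DIALECTS))
        raise UnsupportedDatabaseError(
            f"Database dialect {dialect_name!r} is not supported for schema "
            f"upgrades; supported dialects: {supported}"
        )


if __name__ == "__main__":
    ensure_supported_dialect("postgresql")  # passes silently
    try:
        ensure_supported_dialect("mssql")
    except UnsupportedDatabaseError as exc:
        print(exc)
```

With SQLAlchemy, the value to check would typically be `engine.dialect.name`;
where exactly such a check would be called in the upgrade path is left open
here.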


Best Wishes
*Andrey Anshin*



On Wed, 25 Oct 2023 at 21:00, Scheffler Jens (XC-DX/ETV5)
 wrote:

> In response, off topic from Jarek's statement: is somebody (already) working
> on the migration plan MsSQL->Postgres, or do we need to find a volunteer so
> as not to delay the 2.8 release?
> In terms of preparation, it would be best if a "tool" were released with
> 2.7.3; otherwise we need to have at least a DB schema upgrade supported for
> MsSQL in order to be able to migrate to a new DB engine in 2.8.0 without
> needing to transform structures for people stuck on 2.7.3 with MsSQL.
>
> From: Jarek Potiuk 
> Sent: Wednesday, October 25, 2023 2:21:29 PM
> To: Amogh Desai 
> Cc: dev@airflow.apache.org 
> Subject: Re: Keep Mssql support
>
> > The alternatives suggested by @Jarek Potiuk are
> > something which is doable. We need to realise that these plugins are for
> > the community and we can only support it if
> > "majority" of the community uses it and is willing to maintain it :)
>
> Just to clarify - it's not about plugins, it's about "core" metastore
> support for MSSQL. The MSSQL provider will remain as is, but Airflow will
> lose the ability to use MsSQL as the backend for
> scheduling/tasks/UI/executors/communication with workers via DB.
>
> The overhead / extra maintenance there is to make sure that all the
> queries, performance, locking and anything else Airflow uses in "core"
> works with MsSQL.
>
> In Airflow 2.8 the only remaining backends will be MySQL 8 (as MySQL 5.7
> reaches end of life in a week), and Postgres 12-16 (in 2 weeks Postgres 11
> reaches EOL and we will drop it).
> Those will be the only choices for Airflow 2.8 - provided that we
> remove MsSQL as planned (the condition for that is that we describe a viable
> migration path for MsSQL users).
>
> J.
>
>
>
> On Wed, Oct 25, 2023 at 1:23 PM Amogh Desai 
> wrote:
>
> > I agree with the comments and where this discussion has led to.
> >
> > The alternatives suggested by @Jarek Potiuk  are
> > something which is doable. We need to realise that these plugins are for
> > the community and we can only support it if
> > "majority" of the community uses it and is willing to maintain it :)
> >
> >
> > Thanks & Regards,
> > Amogh Desai
> >
> > On Wed, Oct 25, 2023 at 4:13 PM Kaxil Naik  wrote:
> >
> >> Yeah agreed, I don’t think it is worth keeping the support of MSSQL given
> >> the amount of usage vs the maintenance effort required.
> >>
> >> On Tue, 24 Oct 2023 at 23:07, Jarek Potiuk  wrote:
> >>
> >> > Yes. Agree with Andrey. I think our experience from the last few years was
> >> > "very" bad. The number of mssql users is very small. And the time that
> >> > maintainers and community members lose on various problems with it is huge.
> >> > Quite often, when we added a new feature requiring some new db
> >> > functionality, quite a lot of overhead was spent by the one adding the new
> >> > feature to solve problems coming just from MSSQL support. It's not
> >> > "existing" issues - it's that it generally slows us down with making
> >> > changes.
> >> >
> >> > I think there are two options for you. It's not "open issues",
> >> > it's the maintenance.
> >> >
> >> > * switch to another backend (recommended). And it's not as complex as you
> >> > think. You can also use a managed DB with all that is needed
> >> > (backup/maintenance), so you do not have to manage it yourself. There are
> >> > some excellent postgres options available.
> >> >
> >> > * have your own fork of airflow, keep the tests running and make your copy
> >> > work for MSSQL if you insist on keeping it. Since you already seem to be
> >> > ready to spend your engineering time on it, that seems doable.
> >> >
> >> > The second option I think might even be a business opportunity - for your
> >> > company or for anyone who would like to do it. Someone could even offer it
> >> > as a service or as a version to support it for others and make a small
> >> > business out of it if you are really so committed to it I guess, including
> >> > support for any mssql problems.
> >> >
> >> > That would actually be awesome if someone does it.
> >> >
> >> > J.
> >> >
> >> >
> >> > On Tue, Oct 24, 2023 at 11:44 PM Andrey Anshin <
> >> andrey.ans...@taragol.is>
> >> > wrote:
> >> >
> >> > > I don’t think there is any possibility left to keep MS SQL Server as DB
> >> > > backend for Airflow.
> >>

Re: Keep Mssql support

2023-10-25 Thread Scheffler Jens (XC-DX/ETV5)
In response, off topic from Jarek's statement: is somebody (already) working on 
the migration plan MsSQL->Postgres, or do we need to find a volunteer so as not 
to delay the 2.8 release?
In terms of preparation, it would be best if a "tool" were released with 2.7.3; 
otherwise we need to have at least a DB schema upgrade supported for MsSQL in 
order to be able to migrate to a new DB engine in 2.8.0 without needing to 
transform structures for people stuck on 2.7.3 with MsSQL.

From: Jarek Potiuk 
Sent: Wednesday, October 25, 2023 2:21:29 PM
To: Amogh Desai 
Cc: dev@airflow.apache.org 
Subject: Re: Keep Mssql support

> The alternatives suggested by @Jarek Potiuk are
> something which is doable. We need to realise that these plugins are for
> the community and we can only support it if
> "majority" of the community uses it and is willing to maintain it :)

Just to clarify - it's not about plugins, it's about "core" metastore
support for MSSQL. The MSSQL provider will remain as is, but Airflow will
lose the ability to use MsSQL as the backend for
scheduling/tasks/UI/executors/communication with workers via DB.

The overhead / extra maintenance there is to make sure that all the
queries, performance, locking and anything else Airflow uses in "core"
works with MsSQL.

In Airflow 2.8 the only remaining backends will be MySQL 8 (as MySQL 5.7
reaches end of life in a week), and Postgres 12-16 (in 2 weeks Postgres 11
reaches EOL and we will drop it).
Those will be the only choices for Airflow 2.8 - provided that we
remove MsSQL as planned (the condition for that is that we describe a viable
migration path for MsSQL users).

J.



On Wed, Oct 25, 2023 at 1:23 PM Amogh Desai 
wrote:

> I agree with the comments and where this discussion has led to.
>
> The alternatives suggested by @Jarek Potiuk  are
> something which is doable. We need to realise that these plugins are for
> the community and we can only support it if
> "majority" of the community uses it and is willing to maintain it :)
>
>
> Thanks & Regards,
> Amogh Desai
>
> On Wed, Oct 25, 2023 at 4:13 PM Kaxil Naik  wrote:
>
>> Yeah agreed, I don’t think it is worth keeping the support of MSSQL given
>> the amount of usage vs the maintenance effort required.
>>
>> On Tue, 24 Oct 2023 at 23:07, Jarek Potiuk  wrote:
>>
>> > Yes. Agree with Andrey. I think our experience from the last few years was
>> > "very" bad. The number of mssql users is very small. And the time that
>> > maintainers and community members lose on various problems with it is huge.
>> > Quite often, when we added a new feature requiring some new db
>> > functionality, quite a lot of overhead was spent by the one adding the new
>> > feature to solve problems coming just from MSSQL support. It's not
>> > "existing" issues - it's that it generally slows us down with making
>> > changes.
>> >
>> > I think there are two options for you. It's not "open issues",
>> > it's the maintenance.
>> >
>> > * switch to another backend (recommended). And it's not as complex as you
>> > think. You can also use a managed DB with all that is needed
>> > (backup/maintenance), so you do not have to manage it yourself. There are
>> > some excellent postgres options available.
>> >
>> > * have your own fork of airflow, keep the tests running and make your copy
>> > work for MSSQL if you insist on keeping it. Since you already seem to be
>> > ready to spend your engineering time on it, that seems doable.
>> >
>> > The second option I think might even be a business opportunity - for your
>> > company or for anyone who would like to do it. Someone could even offer it
>> > as a service or as a version to support it for others and make a small
>> > business out of it if you are really so committed to it I guess, including
>> > support for any mssql problems.
>> >
>> > That would actually be awesome if someone does it.
>> >
>> > J.
>> >
>> >
>> > On Tue, Oct 24, 2023 at 11:44 PM Andrey Anshin <
>> andrey.ans...@taragol.is>
>> > wrote:
>> >
>> > > I don’t think there is any possibility left to keep MS SQL Server as DB
>> > > backend for Airflow.
>> > >
>> > > I add Elad's message from the original discussion:
>> > > https://lists.apache.org/thread/r06j306hldg03g2my1pd4nyjxg78b3h4
>> > > Because it clearly describes what has been happening with MS SQL as a DB
>> > > backend for the last 1.5 years
>> > >
>> > > > During this time we hoped it would become stable and widely adopted.
>> > > > To m

Re: [Discussion] Requesting Feedback on Airflow Survey

2023-10-25 Thread Andrey Anshin
Hey Briana,

Thanks for sharing the questions. Let me share some ideas/improvements, but
these are only my thoughts, so they might not be relevant for creating a
clear survey.

*Which version of Airflow do you currently use*
Maybe it's better to keep only these options?
- 1.10
- <=2.4
- 2.5
- 2.6
- 2.7

1.10 is so legacy now
2.0-2.4 is not so legacy, but the latest providers can't be installed on these versions
2.5.x is the oldest version on which the latest providers can still be installed

*Which Metadata Database do you use?*

My suggestion is to ask this question, if possible, only for
self-hosted Airflow installations. I guess for managed Airflow it would
always be Postgres.

And split this question into two:

First, about the type of DB

- MySQL (not MariaDB)
- Postgres
- Microsoft SQL Server
- Other: __

Second, about the DB version

- Version: __
- I don't know

Every year a new version of Postgres is released, so if we consider that some
users might still use 1.10.x, there is a chance that 9.4 - 9.5 is still in use.
I think the more interesting part of this question is whether Postgres has a
critical total share (like 80-90% of all respondents), from which we might
start thinking about whether we actually need in the future (Airflow 3.x/4.x)
anything other than Postgres and SQLite (for testing).




Best Wishes
*Andrey Anshin*



On Wed, 25 Oct 2023 at 19:08, Briana Okyere
 wrote:

> Hey All,
>
> For the last few years, we've sent out surveys to get a sense of the state
> of this Airflow community, and this year I've been tasked with distributing
> it. I'd love to get your feedback before it's pushed live.
>
> I've made some minor tweaks to the 2022 survey and added the questions to
> this google doc: <
>
> https://docs.google.com/document/d/1FbluXNGq9cI3N9zw1cH4F4cEQfyIQ6tk7tGLo4ZN4AE/edit?usp=sharing
> >
>
> Is there anything you think should be added or removed?
>
> Please note- it is useful to compare data from previous years to this year,
> so I believe the majority of the questions should remain similar-ish.
> However, there is always room for improvement!
>
> Here are last year's results for reference:
> 
>
> Also, if anyone is interested in helping with analysis after the results
> are in, please let me know.
>
> --
> Briana Okyere
> Community Manager
> Email: briana.oky...@astronomer.io
> Mobile: +1 415.713.9943
> Time zone: US Pacific UTC
>
> 
>


Re: [VOTE] Add providers for Pinecone, OpenAI & Cohere to enable first-class LLMOps

2023-10-25 Thread Jed Cunningham
+1 (binding)


Re: Airflow Docs Development Issues

2023-10-25 Thread Jarek Potiuk
+1. I think no one will object to improving the current situation :)

On Wed, Oct 25, 2023 at 5:02 PM utkarsh sharma 
wrote:

> Hey everyone,
>
> If we have a consensus on the suggestions in my previous email, I would
> like to subdivide the task into smaller tickets and distribute them among
> Aritra Basu, Amogh Desai, and myself.
>
> Thanks,
> Utkarsh Sharma
>
> On Tue, Oct 24, 2023 at 10:12 PM Jarek Potiuk  wrote:
>
> > Those look like great ideas.
> >
> > On Tue, Oct 24, 2023 at 4:23 PM utkarsh sharma 
> > wrote:
> >
> > > > Just forgot to mention in my previous mail that I'm suggesting the above
> > > > changes since storage is not the primary concern right now, but I'm
> > > > happy to contribute either way. :)
> > >
> > > On Tue, Oct 24, 2023 at 7:43 PM utkarsh sharma  >
> > > wrote:
> > >
> > > > Hey everyone,
> > > >
> > > > > I have a couple of tasks in mind that might help reduce the effort
> > > > > of working with docs. Right now the tasks listed below are difficult to
> > > > > achieve.
> > > > >
> > > > > 1. Adding a warning based on a specific provider/version of a
> > > > > provider/range of providers. This was also the task that Ryan was
> > > > > working on.
> > > > > 2. Altering a page layout or CSS for a specific provider.
> > > > >
> > > > > The difficulty in achieving the above tasks comes from the
> > > > > pre-prepared static files we get as the final product of building
> > > > > documents with *breeze build-docs* in the folder docs/_build. The files
> > > > > we get are self-sufficient to be hosted, and they are really just used
> > > > > directly, leaving no room for customization of any sort.
> > > > >
> > > > >
> > > > > My proposal would be to break down this process as follows:
> > > > >
> > > > > 1. We can prepare partial documents as part of *breeze build-docs* which
> > > > > are only responsible for providing the HTML to be populated within the
> > > > > Body tag for a specific provider, and not the layout of the entire page.
> > > > > 2. We then copy the partial static files to the Airflow-site repo within
> > > > > landing pages/site/layouts/docs, where the layout of the page will be
> > > > > provided by `single.html` and a listing of all the providers will be
> > > > > provided by `list.html`, which are standard Hugo features. Also, using
> > > > > static files from `sphinx_airflow_theme`, which lives in the same repo,
> > > > > makes changes to the CSS easy.
> > > > > 3. We can then use Hugo to generate static files
> > > > > and push them to the `gh-pages` branch to publish them using GitHub
> > > > > Pages.
> > > > >
> > > > >
> > > > > Doing the above changes will enable us to do the following:
> > > > >
> > > > > 1. It will give us more control to work on a specific
> > > > > provider/provider-version if we want by providing templates -
> > > > > https://gohugo.io/templates/lookup-order/
> > > > > 2. We will have specific code to look at depending on the changes one
> > > > > intends to make; right now, if you don't know the flow, it's a bit
> > > > > difficult to pinpoint the code to change.
> > > > >    1. If we want to make changes to a specific provider's content, we
> > > > >    can do it in Airflow's repo docs//*.rst file.
> > > > >    2. If we have a change that affects multiple providers or versions,
> > > > >    we can do it in the Airflow Website's repo.
> > > >
> > > > Thanks,
> > > > Utkarsh Sharma
> > > >
> > > > On Tue, Oct 24, 2023 at 3:45 PM Jarek Potiuk 
> wrote:
> > > >
> > > >> So it looks like we have some helping hands and we need someone to
> > lead
> > > it
> > > >> :) (just saying).
> > > >>
> > > >> On Tue, Oct 24, 2023 at 8:15 AM Amogh Desai <
> amoghdesai@gmail.com
> > >
> > > >> wrote:
> > > >>
> > > >> > +1 (non binding) from me on the thought of moving the older docs
> > (~18
> > > >> > months seems ok) to an archive instead of the repository.
> > > >> >
> > > >> > Coming to the other problem of copying the built docs into
> > > airflow-site
> > > >> for
> > > >> > releases, maybe we can fix that using a script? Open for thoughts
> > > here.
> > > >> >
> > > >> > I would be very happy to help when we start taking this forward, I
> > > have
> > > >> > some experience in airflow-site and docs side as well. Feel free
> to
> > > >> reach
> > > >> > out over email or slack :)
> > > >> >
> > > >> > Thanks & Regards,
> > > >> > Amogh Desai
> > > >> >
> > > >> > On Mon, Oct 23, 2023 at 3:08 AM Aritra Basu <
> > aritrabasu1...@gmail.com
> > > >
> > > >> > wrote:
> > > >> >
> > > >> > > This definitely sounds like something that needs doing sooner
> > rather
> > > >> than
> > > >> > > later.
> > > >> > >
> > > >> > > While I'd love to help, I'm not too experienced with this area
> so
> > I
> > > >> might
> > > >> > > not be able to actually propose what changes need doing, but if
> > > >> someone
> > > >> > has
> > > >> > > a path forward on this I can definitely

Re: Keep Mssql support

2023-10-25 Thread agateaaa
Thanks for all the responses.

We also have an on-premise offering, which means we would have to bundle an
extra database (Postgres) alongside MSSQL. That adds complexity the customer
has to deal with, which is not ideal from a product point of view. I
understand your input and will take it back to our team to see what we can
do.

Thank you!

On Wed, Oct 25, 2023 at 8:43 AM Amogh Desai 
wrote:

> Wrong choice of words, didn't mean "plugin" but was referring to a more
> generic term.
>
> Thank you!
>
> On Wed, Oct 25, 2023 at 5:51 PM Jarek Potiuk  wrote:
>
> > > The alternatives suggested by @Jarek Potiuk  are
> > something which is doable. We need to realise that these plugins are for
> > the community and we can only support it if
> > "majority" of the community uses it and is willing to maintain it :)
> >
> > Just to clarify - it's not about plugins, it's about "core" metastore
> > support for MSSQL. The MSSQL provider will remain as is, but Airflow will
> > lose the ability to use MsSQL as a backend for
> > scheduling/tasks/UI/executors/communication with workers via DB.
> >
> > The overhead / extra maintenance comes from making sure that all the
> > queries, performance, locking and anything else Airflow uses in "core"
> > work with MsSQL.
> >
> > In Airflow 2.8 the only remaining backends will be MySQL 8 (as MySQL 5.7
> > reaches end of life in a week), and Postgres 12-16 (in 2 weeks Postgres 11
> > reaches EOL and we will drop it).
> > Those will be the only choices for Airflow 2.8 - provided that we
> > remove MsSQL as planned (the condition for that is that we describe a
> > viable migration path for MsSQL users).
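
To illustrate what dropping MsSQL as a metadata backend would mean for a
deployment, here is a minimal sketch that checks a SQLAlchemy-style connection
URI against the backends named above. The helper names and the allow-list are
hypothetical (they are not Airflow APIs), and the version ranges (MySQL 8,
Postgres 12-16) are not checked at all:

```python
from urllib.parse import urlsplit

# Hypothetical allow-list: only the backends named in the message above.
SUPPORTED_BACKENDS = frozenset({"mysql", "postgresql"})


def backend_name(conn_uri: str) -> str:
    """Return the dialect part of a SQLAlchemy-style URI, without the driver.

    For example "postgresql+psycopg2://..." yields "postgresql".
    """
    scheme = urlsplit(conn_uri).scheme
    if not scheme:
        raise ValueError("connection URI has no scheme")
    return scheme.split("+", 1)[0]


def is_supported_backend(conn_uri: str) -> bool:
    """Tell whether the URI points at one of the allow-listed backends."""
    return backend_name(conn_uri) in SUPPORTED_BACKENDS
```

With this sketch, a URI starting with `mssql+pyodbc://` would be reported as
unsupported, while `mysql+mysqldb://` and `postgresql+psycopg2://` would pass.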
> >
> > J.
> >
> >
> >
> > On Wed, Oct 25, 2023 at 1:23 PM Amogh Desai 
> > wrote:
> >
> >> I agree with the comments and where this discussion has led to.
> >>
> >> The alternatives suggested by @Jarek Potiuk  are
> >> something which is doable. We need to realise that these plugins are for
> >> the community and we can only support it if
> >> "majority" of the community uses it and is willing to maintain it :)
> >>
> >>
> >> Thanks & Regards,
> >> Amogh Desai
> >>
> >> On Wed, Oct 25, 2023 at 4:13 PM Kaxil Naik  wrote:
> >>
> >>> Yeah agreed, I don’t think it is worth keeping the support of MSSQL
> given
> >>> the amount of usage vs the maintenance effort required.
> >>>
> >>> On Tue, 24 Oct 2023 at 23:07, Jarek Potiuk  wrote:
> >>>
> >>> > Yes. Agree with Andrey. I think our experience from the last few years
> >>> > was "very" bad. The number of mssql users is very small, and the time
> >>> > that maintainers and community members lose on various problems with it
> >>> > is huge. Almost every time we added a new feature requiring some new db
> >>> > functionality, whoever was adding it spent quite a lot of overhead
> >>> > solving problems that came just from MSSQL support. It's not about
> >>> > "existing" issues - it's that it generally slows us down when making
> >>> > changes.
> >>> >
> >>> > I think there are two options for you. It's not about "open issues",
> >>> > it's about the maintenance.
> >>> >
> >>> > * switch to another backend (recommended). It's not as complex as you
> >>> > think. You can also use a managed DB with all that is needed
> >>> > (backup/maintenance), so you do not have to manage it yourself. There
> >>> > are some excellent postgres options available.
> >>> >
> >>> > * have your own fork of airflow, keep the tests running and make your
> >>> > copy work for MSSQL if you insist on keeping it. Since you already seem
> >>> > to be ready to spend your engineering time on it, that seems doable.
> >>> >
> >>> > The second option might even be a business opportunity - for your
> >>> > company or for anyone who would like to do it. If you are really so
> >>> > committed to it, someone could even offer it as a service, or as a
> >>> > version supported for others, and make a small business out of it,
> >>> > including support for any mssql problems.
> >>> >
> >>> > That would actually be awesome if someone does it.
> >>> >
> >>> > J.
> >>> >
> >>> >
> >>> > On Tue, Oct 24, 2023 at 11:44 PM Andrey Anshin <
> >>> andrey.ans...@taragol.is>
> >>> > wrote:
> >>> >
> >>> > > I don’t think there is any possibility left to keep MS SQL Server as a
> >>> > > DB backend for Airflow.
> >>> > >
> >>> > > I am adding Elad's message from the original discussion:
> >>> > > https://lists.apache.org/thread/r06j306hldg03g2my1pd4nyjxg78b3h4
> >>> > > because it clearly describes what has happened with MS SQL as a DB
> >>> > > backend over the last 1.5 years:
> >>> > >
> >>> > > > During this time we hoped it would become stable and widely adopted.
> >>> > > > To my taste MsSQL as a backend has left a niche and is *not* worth
> >>> > > > the maintenance of it in our CI.
> >>> > >
> >>> > > I also want to note the following points:
> >>> > > - 

Re: Keep Mssql support

2023-10-25 Thread Amogh Desai
Wrong choice of words, didn't mean "plugin" but was referring to a more
generic term.

Thank you!

On Wed, Oct 25, 2023 at 5:51 PM Jarek Potiuk  wrote:

> > The alternatives suggested by @Jarek Potiuk  are
> something which is doable. We need to realise that these plugins are for
> the community and we can only support it if
> "majority" of the community uses it and is willing to maintain it :)
>
> Just to clarify - it's not about plugins, it's about "core" metastore
> support for MSSQL. The MSSQL provider will remain as is, but Airflow will
> lose the ability to use MsSQL as a backend for
> scheduling/tasks/UI/executors/communication with workers via DB.
>
> The overhead / extra maintenance comes from making sure that all the
> queries, performance, locking and anything else Airflow uses in "core"
> work with MsSQL.
>
> In Airflow 2.8 the only remaining backends will be MySQL 8 (as MySQL 5.7
> reaches end of life in a week), and Postgres 12-16 (in 2 weeks Postgres 11
> reaches EOL and we will drop it).
> Those will be the only choices for Airflow 2.8 - provided that we
> remove MsSQL as planned (the condition for that is that we describe a
> viable migration path for MsSQL users).
>
> J.
>
>
>
> On Wed, Oct 25, 2023 at 1:23 PM Amogh Desai 
> wrote:
>
>> I agree with the comments and where this discussion has led to.
>>
>> The alternatives suggested by @Jarek Potiuk  are
>> something which is doable. We need to realise that these plugins are for
>> the community and we can only support it if
>> "majority" of the community uses it and is willing to maintain it :)
>>
>>
>> Thanks & Regards,
>> Amogh Desai
>>
>> On Wed, Oct 25, 2023 at 4:13 PM Kaxil Naik  wrote:
>>
>>> Yeah agreed, I don’t think it is worth keeping the support of MSSQL given
>>> the amount of usage vs the maintenance effort required.
>>>
>>> On Tue, 24 Oct 2023 at 23:07, Jarek Potiuk  wrote:
>>>
>>> > Yes. Agree with Andrey. I think our experience from the last few years
>>> > was "very" bad. The number of mssql users is very small, and the time
>>> > that maintainers and community members lose on various problems with it
>>> > is huge. Almost every time we added a new feature requiring some new db
>>> > functionality, whoever was adding it spent quite a lot of overhead
>>> > solving problems that came just from MSSQL support. It's not about
>>> > "existing" issues - it's that it generally slows us down when making
>>> > changes.
>>> >
>>> > I think there are two options for you. It's not about "open issues",
>>> > it's about the maintenance.
>>> >
>>> > * switch to another backend (recommended). It's not as complex as you
>>> > think. You can also use a managed DB with all that is needed
>>> > (backup/maintenance), so you do not have to manage it yourself. There
>>> > are some excellent postgres options available.
>>> >
>>> > * have your own fork of airflow, keep the tests running and make your
>>> > copy work for MSSQL if you insist on keeping it. Since you already seem
>>> > to be ready to spend your engineering time on it, that seems doable.
>>> >
>>> > The second option might even be a business opportunity - for your
>>> > company or for anyone who would like to do it. If you are really so
>>> > committed to it, someone could even offer it as a service, or as a
>>> > version supported for others, and make a small business out of it,
>>> > including support for any mssql problems.
>>> >
>>> > That would actually be awesome if someone does it.
>>> >
>>> > J.
>>> >
>>> >
>>> > On Tue, Oct 24, 2023 at 11:44 PM Andrey Anshin <
>>> andrey.ans...@taragol.is>
>>> > wrote:
>>> >
>>> > > I don’t think there is any possibility left to keep MS SQL Server as a
>>> > > DB backend for Airflow.
>>> > >
>>> > > I am adding Elad's message from the original discussion:
>>> > > https://lists.apache.org/thread/r06j306hldg03g2my1pd4nyjxg78b3h4
>>> > > because it clearly describes what has happened with MS SQL as a DB
>>> > > backend over the last 1.5 years:
>>> > >
>>> > > > During this time we hoped it would become stable and widely adopted.
>>> > > > To my taste MsSQL as a backend has left a niche and is *not* worth
>>> > > > the maintenance of it in our CI.
>>> > >
>>> > > I also want to note the following points:
>>> > > - MS SQL has unstable tests in Airflow CI, and in some cases we have
>>> > > not even run them for the last couple of months (or even longer)
>>> > > - In addition, it takes 2x the memory of any other backend
>>> > > - Lack of ARM support. This is also quite important because it
>>> > > prevents maintainers from checking some things on their M1/M2 laptops
>>> > > - An additional backend requires extra effort from any contributor who
>>> > > wants to add a new feature that touches the DB
>>> > >
>>> > >
>>> > > This has always been an experimental feature, as described in the
>>> > > Airflow Release Process:
>>> > >
>>> > >
>>> >
>>> https://airflow.ap

Re: [VOTE] Add providers for Pinecone, OpenAI & Cohere to enable first-class LLMOps

2023-10-25 Thread Vikram Koka
+1 (Binding)

On Wed, Oct 25, 2023 at 7:20 AM Wei Lee  wrote:

> +1 (non-binding)
>
> Best,
> Wei
>
>
> On Wed, Oct 25, 2023 at 10:44 PM Vincent Beck  wrote:
>
> > +1 binding
> >
> > On 2023/10/25 13:32:49 Pierre Jeambrun wrote:
> > > +1 (binding)
> > >
> > > Le mer. 25 oct. 2023 à 13:29, Pankaj Singh 
> a
> > > écrit :
> > >
> > > > +1 (binding)
> > > >
> > > > On Wed, Oct 25, 2023 at 4:52 PM Amogh Desai <
> amoghdesai@gmail.com>
> > > > wrote:
> > > >
> > > > > +1 (binding)
> > > > >
> > > > > On Wed, Oct 25, 2023 at 4:41 PM Phani Kumar
> > > > >  wrote:
> > > > >
> > > > > > +1 binding
> > > > > >
> > > > > > On Wed, 25 Oct 2023, 16:39 utkarsh sharma, <
> utkarshar...@gmail.com
> > >
> > > > > wrote:
> > > > > >
> > > > > > > +1 (non-binding)
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Utkarsh Sharma
> > > > > > >
> > > > > > >
> > > > > > > On Wed, 25 Oct 2023 at 4:10 PM, Pankaj Koti
> > > > > > >  wrote:
> > > > > > >
> > > > > > > > +1 (binding)
> > > > > > > >
> > > > > > > > On Wed, 25 Oct 2023, 06:32 Andrey Anshin, <
> > > > andrey.ans...@taragol.is>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > +1 binding
> > > > > > > > >
> > > > > > > > > 
> > > > > > > > > Best Wishes
> > > > > > > > > *Andrey Anshin*
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Wed, 25 Oct 2023 at 14:25, Kaxil Naik <
> > kaxiln...@gmail.com>
> > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hello everyone,
> > > > > > > > > >
> > > > > > > > > > Following the discussion about adding five new providers,
> > I am
> > > > > > > calling
> > > > > > > > > for
> > > > > > > > > > an official vote on adding providers for Pinecone,
> OpenAI &
> > > > > Cohere
> > > > > > to
> > > > > > > > the
> > > > > > > > > > Airflow repo.
> > > > > > > > > >
> > > > > > > > > > Discussion thread:
> > > > > > > > > >
> > > > https://lists.apache.org/thread/0d669fmy4hn29h5c0wj0ottdskd77ktp
> > > > > > > > > >
> > > > > > > > > > Draft PRs:
> > > > > > > > > >
> > > > > > > > > >- https://github.com/apache/airflow/pull/35094
> > (Pinecone)
> > > > > > > > > >- https://github.com/apache/airflow/pull/35023
> (OpenAI)
> > > > > > > > > >- https://github.com/apache/airflow/pull/34921
> (Cohere)
> > > > > > > > > >
> > > > > > > > > > This is my binding +1 vote.
> > > > > > > > > >
> > > > > > > > > > The vote will last until 10:30 GMT/UTC on 1st November,
> > > > > > > > > > and until at least 3 binding votes have been cast.
> > > > > > > > > >
> > > > > > > > > > Please vote accordingly:
> > > > > > > > > >
> > > > > > > > > > [ ] + 1 approve
> > > > > > > > > > [ ] + 0 no opinion
> > > > > > > > > > [ ] - 1 disapprove with the reason
> > > > > > > > > >
> > > > > > > > > > Only votes from PMC members and committers are binding,
> but
> > > > other
> > > > > > > > members
> > > > > > > > > > of the community are encouraged to check the AIP and vote
> > with
> > > > > > > > > > "(non-binding)".
> > > > > > > > > >
> > > > > > > > > > Regards,
> > > > > > > > > > Kaxil
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
> > For additional commands, e-mail: dev-h...@airflow.apache.org
> >
> >
>


[Discussion] Requesting Feedback on Airflow Survey

2023-10-25 Thread Briana Okyere
Hey All,

For the last few years, we've sent out surveys to get a sense of the state
of this Airflow community, and this year I've been tasked with distributing
it. I'd love to get your feedback before it's pushed live.

I've made some minor tweaks to the 2022 survey and added the questions to
this Google doc:
<https://docs.google.com/document/d/1FbluXNGq9cI3N9zw1cH4F4cEQfyIQ6tk7tGLo4ZN4AE/edit?usp=sharing>

Is there anything you think should be added or removed?

Please note: it is useful to compare data from previous years to this year,
so I believe the majority of the questions should remain similar-ish.
However, there is always room for improvement!

Here are last year's results for reference:


Also, if anyone is interested in helping with analysis after the results
are in, please let me know.

-- 
Briana Okyere
Community Manager
Email: briana.oky...@astronomer.io
Mobile: +1 415.713.9943
Time zone: US Pacific UTC




Re: Airflow Docs Development Issues

2023-10-25 Thread utkarsh sharma
Hey everyone,

If we have a consensus on the suggestions in my previous email, I would
like to subdivide the task into smaller tickets and distribute them among
Aritra Basu, Amogh Desai, and myself.

Thanks,
Utkarsh Sharma

On Tue, Oct 24, 2023 at 10:12 PM Jarek Potiuk  wrote:

> Those look like great ideas.
>
> On Tue, Oct 24, 2023 at 4:23 PM utkarsh sharma 
> wrote:
>
> > Just forgot to mention in my previous mail, that I'm suggesting the above
> > changes since the storage is not the primary concern right now but I'm
> > happy to contribute either way. :)
> >
> > On Tue, Oct 24, 2023 at 7:43 PM utkarsh sharma 
> > wrote:
> >
> > > Hey everyone,
> > >
> > > I have a couple of tasks in mind that might help reduce the effort of
> > > working with the docs. Right now the tasks listed below are difficult
> > > to achieve:
> > >
> > > 1. Adding a warning based on a specific provider/version of a
> > > provider/range of providers, which was also the task that Ryan was
> > > working on.
> > > 2. Altering a page layout or CSS for a specific provider.
> > >
> > > These tasks are difficult because of the pre-prepared static files we
> > > get as the final product of building the documents with
> > > *breeze build-docs* in the docs/_build folder. The files we get are
> > > self-sufficient to be hosted, and they are used directly, leaving no
> > > room for customization of any sort.
> > >
> > >
> > > My proposal would be to break down this process as follows:
> > >
> > > 1. We can prepare partial documents as part of *breeze build-docs*
> > > which are only responsible for providing the HTML to be populated
> > > within the body tag for a specific provider, and not the layout of the
> > > entire page.
> > > 2. We then copy the partial static files to the Airflow-site repo within
> > > landing pages/site/layouts/docs, where the layout of the page will be
> > > provided by `single.html` and a listing of all the providers will be
> > > provided by `list.html`, which are standard Hugo features. Also, using
> > > static files from `sphinx_airflow_theme`, which lives in the same repo,
> > > makes CSS changes easy.
> > > 3. We can then use Hugo to generate static files and push them to the
> > > `gh-pages` branch to publish them using GitHub Pages.
> > >
> > >
> > > Doing the above changes will enable us to do the following:
> > >
> > > 1. It will give us more control to work on a specific
> > > provider/provider-version if we want, by providing templates -
> > > https://gohugo.io/templates/lookup-order/
> > > 2. We will have a specific place in the code to look at depending on
> > > the change one intends to make. Right now, if you don't know the flow,
> > > it's a bit difficult to pinpoint the code to change.
> > >    1. If we want to change a specific provider's content, we can do it
> > >    in the Airflow repo's docs//*.rst files.
> > >    2. If we have a change that affects multiple providers or versions,
> > >    we can do it in the Airflow website's repo.
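
One possible way to picture step 1 of the proposal, as a sketch only:
post-process already-built pages and keep just the contents of the body tag,
so that a site generator can wrap them in its own layout. This is not how
*breeze build-docs* works today. The function names and directory arguments
are hypothetical, and a regular expression is used only to keep the example
short; a real implementation would more likely use a proper HTML parser.

```python
import re
from pathlib import Path

# Captures everything between <body ...> and </body>, across multiple lines.
_BODY_RE = re.compile(r"<body\b[^>]*>(.*?)</body\s*>", re.IGNORECASE | re.DOTALL)


def extract_body(html: str) -> str:
    """Return the inner HTML of the first <body> element, stripped."""
    match = _BODY_RE.search(html)
    if match is None:
        raise ValueError("no <body> element found")
    return match.group(1).strip()


def write_partials(build_dir: Path, out_dir: Path) -> list[Path]:
    """Write a body-only copy of every .html file under build_dir.

    The relative layout of build_dir is mirrored under out_dir.
    """
    written = []
    for page in sorted(build_dir.rglob("*.html")):
        target = out_dir / page.relative_to(build_dir)
        target.parent.mkdir(parents=True, exist_ok=True)
        body = extract_body(page.read_text(encoding="utf-8"))
        target.write_text(body, encoding="utf-8")
        written.append(target)
    return written
```

In this picture, `build_dir` would stand for the built docs folder and
`out_dir` for the layouts folder on the website side; both are placeholders
here.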
> > >
> > >
> > > Thanks,
> > > Utkarsh Sharma
> > >
> > > On Tue, Oct 24, 2023 at 3:45 PM Jarek Potiuk  wrote:
> > >
> > >> So it looks like we have some helping hands and we need someone to
> lead
> > it
> > >> :) (just saying).
> > >>
> > >> On Tue, Oct 24, 2023 at 8:15 AM Amogh Desai  >
> > >> wrote:
> > >>
> > >> > +1 (non binding) from me on the thought of moving the older docs
> (~18
> > >> > months seems ok) to an archive instead of the repository.
> > >> >
> > >> > Coming to the other problem of copying the built docs into
> > airflow-site
> > >> for
> > >> > releases, maybe we can fix that using a script? Open for thoughts
> > here.
> > >> >
> > >> > I would be very happy to help when we start taking this forward, I
> > have
> > >> > some experience in airflow-site and docs side as well. Feel free to
> > >> reach
> > >> > out over email or slack :)
> > >> >
> > >> > Thanks & Regards,
> > >> > Amogh Desai
> > >> >
> > >> > On Mon, Oct 23, 2023 at 3:08 AM Aritra Basu <
> aritrabasu1...@gmail.com
> > >
> > >> > wrote:
> > >> >
> > >> > > This definitely sounds like something that needs doing sooner
> rather
> > >> than
> > >> > > later.
> > >> > >
> > >> > > While I'd love to help, I'm not too experienced with this area so
> I
> > >> might
> > >> > > not be able to actually propose what changes need doing, but if
> > >> someone
> > >> > has
> > >> > > a path forward on this I can definitely contribute some time to
> help
> > >> out
> > >> > > given some guidance on what is needed.
> > >> > >
> > >> > > --
> > >> > > Regards,
> > >> > > Aritra Basu
> > >> > >
> > >> > > On Mon, Oct 23, 2023, 2:19 AM Jarek Potiuk 
> > wrote:
> > >> > >
> > >> > > > Some news here.
> > >> > > >
> > >> > > > I caught up with some infra changes that happened while I was
> > >> > travelling
> > >> > > -
> > >> > > > and I have just (with
> 

RE: Limiting (or errorring out) Airflow for Python 3.12 until our dependencies/we catch up

2023-10-25 Thread Damian Shaw
> But it seems to be changing now. Also I think (by listening to some podcast) 
> - Python community starts to be overwhelmed with keeping all the 
> backwards-compatibility and they are changing their approach for future 
> releases to be more aggressive in removals, separate out certain "core" 
> features to outside-libraries and removing them out of the core Python

I appreciate this is a little off topic, but I follow core dev of CPython 
closely and would just like to add that I don't think this take is quite 
accurate. IMO the Python core devs are focused on backwards compatibility in 
general, but I would say a few things have happened:

1. Functions that have been marked as deprecated for many releases of Python 
are finally being cleaned up
2. A group of modules which are completely unmaintained got marked as 
deprecated in PEP 594 (the Steering Council made clear no similar PEPs would be 
accepted and each module will be treated on a case-by-case basis)
3. Several large projects are working on the underlying implementation of 
CPython and they are more willing to propose larger changes to the C-API than 
has happened in the past

There has been discussion of formalizing "soft deprecation", which would mean 
a function is never scheduled for removal but no longer receives any 
development: 
https://discuss.python.org/t/formalize-the-concept-of-soft-deprecation-dont-schedule-removal-in-pep-387-backwards-compatibility-policy/27957

Damian

-Original Message-
From: Jarek Potiuk 
Sent: Wednesday, October 25, 2023 6:15 AM
To: dev@airflow.apache.org
Subject: Re: Limiting (or errorring out) Airflow for Python 3.12 until our 
dependencies/we catch up

> I agree, has there been a new release of Python that has worked with
> Airflow without at least some fixing?

Generally yes. Most of the past bumps were mostly test fixing. I can't recall 
any "serious" changes in the code, except maybe in some single providers.
Generally, Python 3.7 -> 3.11 were extremely backwards compatible. I remember 
maybe 2 cases where we had compatibility issues in Airflow "core", and they 
were caused by security fixes (a change in some edge cases of URL parsing is 
one that I remember). Those even came in patch-level releases, not in a minor 
release.

> I understand that Airflow is both a library and an application and I do
> agree with not putting upper bounds on libraries unless required, but it
> seems like allowing an arbitrary upper bound of Python is never going to be
> practical. If users really need to break this support version and install an
> older version of Airflow with a newer version of Python it isn't difficult to
> build Airflow locally with patches, and they will likely need to add more
> patches to get Airflow to work anyway, so their cost is already built in.

To be honest, it has been practical so far; this is the first time in quite a 
while that this has happened. But it seems to be changing now. Also, I think 
(from listening to some podcast) the Python community is starting to be 
overwhelmed with keeping all the backwards compatibility, and they are 
changing their approach for future releases to be more aggressive in removals, 
separating certain "core" features out into outside libraries and removing 
them from core Python (which is actually quite an inspiration for what we do 
by moving stuff like executors, hopefully FAB in the future, and maybe a few 
others from core to providers). So we can expect more such breakages from now 
on. Having an explicit upper bound will be a good idea.

I think we can do a hybrid solution. We can add an upper bound for 2.8, and 
when we release 2.7.3 we can raise an error if someone tries to use 
Python 3.12. I think that will be the best approach.
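
As a rough sketch of the runtime half of that hybrid approach, assuming the
check is called once at startup: the function name `ensure_supported_python`,
the constant and the error message are hypothetical and not taken from the
Airflow codebase. The packaging-side upper bound would be a separate
`python_requires` setting.

```python
import sys

# First Python minor version treated as unsupported in this sketch.
FIRST_UNSUPPORTED = (3, 12)


def ensure_supported_python(version_info=None):
    """Return the (major, minor) tuple, or raise if the version is too new."""
    if version_info is None:
        version_info = sys.version_info
    current = tuple(version_info[:2])
    if current >= FIRST_UNSUPPORTED:
        raise RuntimeError(
            f"Python {current[0]}.{current[1]} is not supported yet; "
            f"use a version below {FIRST_UNSUPPORTED[0]}.{FIRST_UNSUPPORTED[1]}."
        )
    return current
```

Calling `ensure_supported_python()` with no argument checks the running
interpreter; passing a tuple such as `(3, 11)` keeps the check easy to unit
test.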

J.



On Mon, Oct 23, 2023 at 6:33 PM Damian Shaw 
wrote:

> I agree, has there been a new release of Python that has worked with
> Airflow without at least some fixing?
>
> I understand that Airflow is both a library and an application and I
> do agree with not putting upper bounds on libraries unless required,
> but it seems like allowing an arbitrary upper bound of Python is never
> going to be practical. If users really need to break this support
> version and install an older version of Airflow with a newer version
> of Python it isn't difficult to build Airflow locally with patches,
> and they will likely need to add more patches to get Airflow to work
> anyway, so their cost is already built in.
>
> Even if just limiting to 3.11 or lower for 2.7.3 or higher I think the
> benefits are there, Python 3.12 might be supported soon or there might
> be more hidden issues that mean it isn't supported for 9+ months.
>
> Damian
>
> -Original Message-
> From: Pierre Jeambrun 
> Sent: Monday, October 23, 2023 12:13 PM
> To: dev@airflow.apache.org
> Subject: Re: Limiting (or errorring out) Airflow for Python 3.12 until
> our dependencies/we catch up
>
> I think that limiting to <3.12 makes sense. 2.7.2 is already out so
> 

Re: [VOTE] Add providers for Pinecone, OpenAI & Cohere to enable first-class LLMOps

2023-10-25 Thread Wei Lee
+1 (non-binding)

Best,
Wei


On Wed, Oct 25, 2023 at 10:44 PM Vincent Beck  wrote:

> +1 binding
>
> On 2023/10/25 13:32:49 Pierre Jeambrun wrote:
> > +1 (binding)
> >
> > Le mer. 25 oct. 2023 à 13:29, Pankaj Singh  a
> > écrit :
> >
> > > +1 (binding)
> > >
> > > On Wed, Oct 25, 2023 at 4:52 PM Amogh Desai 
> > > wrote:
> > >
> > > > +1 (binding)
> > > >
> > > > On Wed, Oct 25, 2023 at 4:41 PM Phani Kumar
> > > >  wrote:
> > > >
> > > > > +1 binding
> > > > >
> > > > > On Wed, 25 Oct 2023, 16:39 utkarsh sharma,  >
> > > > wrote:
> > > > >
> > > > > > +1 (non-binding)
> > > > > >
> > > > > > Thanks,
> > > > > > Utkarsh Sharma
> > > > > >
> > > > > >
> > > > > > On Wed, 25 Oct 2023 at 4:10 PM, Pankaj Koti
> > > > > >  wrote:
> > > > > >
> > > > > > > +1 (binding)
> > > > > > >
> > > > > > > On Wed, 25 Oct 2023, 06:32 Andrey Anshin, <
> > > andrey.ans...@taragol.is>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > +1 binding
> > > > > > > >
> > > > > > > > 
> > > > > > > > Best Wishes
> > > > > > > > *Andrey Anshin*
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Wed, 25 Oct 2023 at 14:25, Kaxil Naik <
> kaxiln...@gmail.com>
> > > > > wrote:
> > > > > > > >
> > > > > > > > > Hello everyone,
> > > > > > > > >
> > > > > > > > > Following the discussion about adding five new providers,
> I am
> > > > > > calling
> > > > > > > > for
> > > > > > > > > an official vote on adding providers for Pinecone, OpenAI &
> > > > Cohere
> > > > > to
> > > > > > > the
> > > > > > > > > Airflow repo.
> > > > > > > > >
> > > > > > > > > Discussion thread:
> > > > > > > > >
> > > https://lists.apache.org/thread/0d669fmy4hn29h5c0wj0ottdskd77ktp
> > > > > > > > >
> > > > > > > > > Draft PRs:
> > > > > > > > >
> > > > > > > > >- https://github.com/apache/airflow/pull/35094
> (Pinecone)
> > > > > > > > >- https://github.com/apache/airflow/pull/35023 (OpenAI)
> > > > > > > > >- https://github.com/apache/airflow/pull/34921 (Cohere)
> > > > > > > > >
> > > > > > > > > This is my binding +1 vote.
> > > > > > > > >
> > > > > > > > > The vote will last until 10:30 GMT/UTC on 1st November,
> > > > > > > > > and until at least 3 binding votes have been cast.
> > > > > > > > >
> > > > > > > > > Please vote accordingly:
> > > > > > > > >
> > > > > > > > > [ ] + 1 approve
> > > > > > > > > [ ] + 0 no opinion
> > > > > > > > > [ ] - 1 disapprove with the reason
> > > > > > > > >
> > > > > > > > > Only votes from PMC members and committers are binding, but
> > > other
> > > > > > > members
> > > > > > > > > of the community are encouraged to check the AIP and vote
> with
> > > > > > > > > "(non-binding)".
> > > > > > > > >
> > > > > > > > > Regards,
> > > > > > > > > Kaxil
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: [VOTE] Add providers for Pinecone, OpenAI & Cohere to enable first-class LLMOps

2023-10-25 Thread Vincent Beck
+1 binding

On 2023/10/25 13:32:49 Pierre Jeambrun wrote:
> +1 (binding)
> 
> Le mer. 25 oct. 2023 à 13:29, Pankaj Singh  a
> écrit :
> 
> > +1 (binding)
> >
> > On Wed, Oct 25, 2023 at 4:52 PM Amogh Desai 
> > wrote:
> >
> > > +1 (binding)
> > >
> > > On Wed, Oct 25, 2023 at 4:41 PM Phani Kumar
> > >  wrote:
> > >
> > > > +1 binding
> > > >
> > > > On Wed, 25 Oct 2023, 16:39 utkarsh sharma, 
> > > wrote:
> > > >
> > > > > +1 (non-binding)
> > > > >
> > > > > Thanks,
> > > > > Utkarsh Sharma
> > > > >
> > > > >
> > > > > On Wed, 25 Oct 2023 at 4:10 PM, Pankaj Koti
> > > > >  wrote:
> > > > >
> > > > > > +1 (binding)
> > > > > >
> > > > > > On Wed, 25 Oct 2023, 06:32 Andrey Anshin, <
> > andrey.ans...@taragol.is>
> > > > > > wrote:
> > > > > >
> > > > > > > +1 binding
> > > > > > >
> > > > > > > 
> > > > > > > Best Wishes
> > > > > > > *Andrey Anshin*
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Wed, 25 Oct 2023 at 14:25, Kaxil Naik 
> > > > wrote:
> > > > > > >
> > > > > > > > Hello everyone,
> > > > > > > >
> > > > > > > > Following the discussion about adding five new providers, I am
> > > > > calling
> > > > > > > for
> > > > > > > > an official vote on adding providers for Pinecone, OpenAI &
> > > Cohere
> > > > to
> > > > > > the
> > > > > > > > Airflow repo.
> > > > > > > >
> > > > > > > > Discussion thread:
> > > > > > > >
> > https://lists.apache.org/thread/0d669fmy4hn29h5c0wj0ottdskd77ktp
> > > > > > > >
> > > > > > > > Draft PRs:
> > > > > > > >
> > > > > > > >- https://github.com/apache/airflow/pull/35094 (Pinecone)
> > > > > > > >- https://github.com/apache/airflow/pull/35023 (OpenAI)
> > > > > > > >- https://github.com/apache/airflow/pull/34921 (Cohere)
> > > > > > > >
> > > > > > > > This is my binding +1 vote.
> > > > > > > >
> > > > > > > > The vote will last until 10:30 GMT/UTC on 1st November,
> > > > > > > > and until at least 3 binding votes have been cast.
> > > > > > > >
> > > > > > > > Please vote accordingly:
> > > > > > > >
> > > > > > > > [ ] + 1 approve
> > > > > > > > [ ] + 0 no opinion
> > > > > > > > [ ] - 1 disapprove with the reason
> > > > > > > >
> > > > > > > > Only votes from PMC members and committers are binding, but
> > other
> > > > > > members
> > > > > > > > of the community are encouraged to check the AIP and vote with
> > > > > > > > "(non-binding)".
> > > > > > > >
> > > > > > > > Regards,
> > > > > > > > Kaxil
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> 




Re: [VOTE] Add providers for Pinecone, OpenAI & Cohere to enable first-class LLMOps

2023-10-25 Thread Pierre Jeambrun
+1 (binding)

On Wed, 25 Oct 2023 at 13:29, Pankaj Singh wrote:

> +1 (binding)
>
> On Wed, Oct 25, 2023 at 4:52 PM Amogh Desai 
> wrote:
>
> > +1 (binding)
> >
> > On Wed, Oct 25, 2023 at 4:41 PM Phani Kumar
> >  wrote:
> >
> > > +1 binding
> > >
> > > On Wed, 25 Oct 2023, 16:39 utkarsh sharma, 
> > wrote:
> > >
> > > > +1 (non-binding)
> > > >
> > > > Thanks,
> > > > Utkarsh Sharma
> > > >
> > > >
> > > > On Wed, 25 Oct 2023 at 4:10 PM, Pankaj Koti
> > > >  wrote:
> > > >
> > > > > +1 (binding)
> > > > >
> > > > > On Wed, 25 Oct 2023, 06:32 Andrey Anshin, <
> andrey.ans...@taragol.is>
> > > > > wrote:
> > > > >
> > > > > > +1 binding
> > > > > >
> > > > > > 
> > > > > > Best Wishes
> > > > > > *Andrey Anshin*
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Wed, 25 Oct 2023 at 14:25, Kaxil Naik 
> > > wrote:
> > > > > >
> > > > > > > Hello everyone,
> > > > > > >
> > > > > > > Following the discussion about adding five new providers, I am
> > > > calling
> > > > > > for
> > > > > > > an official vote on adding providers for Pinecone, OpenAI &
> > Cohere
> > > to
> > > > > the
> > > > > > > Airflow repo.
> > > > > > >
> > > > > > > Discussion thread:
> > > > > > >
> https://lists.apache.org/thread/0d669fmy4hn29h5c0wj0ottdskd77ktp
> > > > > > >
> > > > > > > Draft PRs:
> > > > > > >
> > > > > > >- https://github.com/apache/airflow/pull/35094 (Pinecone)
> > > > > > >- https://github.com/apache/airflow/pull/35023 (OpenAI)
> > > > > > >- https://github.com/apache/airflow/pull/34921 (Cohere)
> > > > > > >
> > > > > > > This is my binding +1 vote.
> > > > > > >
> > > > > > > The vote will last until 10:30 GMT/UTC on 1st November,
> > > > > > > and until at least 3 binding votes have been cast.
> > > > > > >
> > > > > > > Please vote accordingly:
> > > > > > >
> > > > > > > [ ] + 1 approve
> > > > > > > [ ] + 0 no opinion
> > > > > > > [ ] - 1 disapprove with the reason
> > > > > > >
> > > > > > > Only votes from PMC members and committers are binding, but
> other
> > > > > members
> > > > > > > of the community are encouraged to check the AIP and vote with
> > > > > > > "(non-binding)".
> > > > > > >
> > > > > > > Regards,
> > > > > > > Kaxil
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: Keep Mssql support

2023-10-25 Thread Jarek Potiuk
> The alternatives suggested by @Jarek Potiuk are
> something which is doable. We need to realise that these plugins are for
> the community and we can only support it if
> "majority" of the community uses it and is willing to maintain it :)

Just to clarify - it's not about plugins, it's about "core" metastore
support for MSSQL. The MSSQL provider will remain as is, but Airflow will
lose the ability to use MSSQL as the backend for
scheduling/tasks/UI/executors/communication with workers via DB.

The overhead / extra maintenance is in making sure that all the
queries, performance, locking and anything else Airflow's "core" relies on
work with MSSQL.

In Airflow 2.8 the only remaining backends will be MySQL 8 (as MySQL 5.7
reaches end of life in a week), and Postgres 12-16 (in 2 weeks Postgres 11
reaches EOL and we will drop it).
Those will be the only choices for Airflow 2.8 - provided that we
remove MSSQL as planned (the condition for that is that we describe a viable
migration path for MSSQL users).

J.



On Wed, Oct 25, 2023 at 1:23 PM Amogh Desai 
wrote:

> I agree with the comments and where this discussion has led to.
>
> The alternatives suggested by @Jarek Potiuk  are
> something which is doable. We need to realise that these plugins are for
> the community and we can only support it if
> "majority" of the community uses it and is willing to maintain it :)
>
>
> Thanks & Regards,
> Amogh Desai
>
> On Wed, Oct 25, 2023 at 4:13 PM Kaxil Naik  wrote:
>
>> Yeah agreed, I don’t think it is worth keeping the support of MSSQL given
>> the amount of usage vs the maintenance effort required.
>>
>> On Tue, 24 Oct 2023 at 23:07, Jarek Potiuk  wrote:
>>
>> > Yes. Agree with Andrey.  I think our experience from the last few years
>> was
>> > "very" bad. The number of mssql users is very small. And the time that
>> > maintainers and community members lose on various problems with it is
>> huge.
>> > Quite often every time we added a new feature requiring some new db
>> > functionality, quite a lot of overhead was spent by the one adding new
>> > features to solve the problems coming just from MSSQL support. It's not
>> > "existing" issues - it's that it generally slows us down with making
>> > changes
>> >
>> > I think there are two options for you when you. It's not "open issues",
>> > it's the maintenance
>> >
>> > * switch to another backend (recommended). And it's not as complex as
>> you
>> > think. You can also use managed DB with all that is needed
>> > (backup/maintenance), you do not have to manage it yourself. There are
>> some
>> > excellent postgres options available.
>> >
>> > * have your own fork airflow and keep the tests running and make your
>> copy
>> > works for MSSQL if you insist on keeping it. Since you already seem to
>> be
>> > ready to spend your engineering time on it, that seems doable.
>> >
>> > The second option I think might even be a business opportunity - for
>> your
>> > company or for anyone who would like to do it. Someone could even offer
>> it
>> > as a service or as a version to support it for others and make a small
>> > business out of it if you are really so committed to it I guess,
>> including
>> > support for any mssql problems.
>> >
>> > That would actually be awesome if someone does it.
>> >
>> > J.
>> >
>> >
>> > On Tue, Oct 24, 2023 at 11:44 PM Andrey Anshin <
>> andrey.ans...@taragol.is>
>> > wrote:
>> >
>> > > I don’t think there is any possibility left to keep MS SQL Server as
>> DB
>> > > backend for Airflow.
>> > >
>> > > I add Elad's message from the original discussion:
>> > > https://lists.apache.org/thread/r06j306hldg03g2my1pd4nyjxg78b3h4
>> > > Because it cleary describe what is happen with MS SQL as DB backend
>> for
>> > the
>> > > last 1.5 years
>> > >
>> > > > During this time we hoped it would become stable and widely adopted.
>> > > > To my taste MsSQL a backend has left a niche and is *not* worth the
>> >
>> > > maintenance
>> > > of it in our CI.
>> > >
>> > > I also want to note the following points
>> > > - MS SQL have unstable tests in Airflow CI, and some cases we even
>> don't
>> > > run them for the last couple months (or even longer)
>> > > - In additional it taking 2x memory than any other backend
>> > > - Lack of ARM support, this is also quite important because it prevent
>> > > maintainers to check some sort of things in their M1/M2 laptops
>> > > - Additional backend required extra effort for any contributors who
>> want
>> > to
>> > > add new feature that touches DB
>> > >
>> > >
>> > > This has always been an experimental feature which are described in
>> > AIrflow
>> > > Release Process:
>> > >
>> > >
>> >
>> https://airflow.apache.org/docs/apache-airflow/stable/release-process.html#experimental-features
>> > > , I would recommend your team focuses on Airflow on Postgres rather
>> than
>> > > hanging on to vague hope that MS SQL keeping in Airflow.
>> > >
>> > > Quite a few companies provide Managed Airflow, see:
>> > > https://ai

Re: [VOTE] Add providers for Pinecone, OpenAI & Cohere to enable first-class LLMOps

2023-10-25 Thread Pankaj Singh
+1 (binding)

On Wed, Oct 25, 2023 at 4:52 PM Amogh Desai 
wrote:

> +1 (binding)
>
> On Wed, Oct 25, 2023 at 4:41 PM Phani Kumar
>  wrote:
>
> > +1 binding
> >
> > On Wed, 25 Oct 2023, 16:39 utkarsh sharma, 
> wrote:
> >
> > > +1 (non-binding)
> > >
> > > Thanks,
> > > Utkarsh Sharma
> > >
> > >
> > > On Wed, 25 Oct 2023 at 4:10 PM, Pankaj Koti
> > >  wrote:
> > >
> > > > +1 (binding)
> > > >
> > > > On Wed, 25 Oct 2023, 06:32 Andrey Anshin, 
> > > > wrote:
> > > >
> > > > > +1 binding
> > > > >
> > > > > 
> > > > > Best Wishes
> > > > > *Andrey Anshin*
> > > > >
> > > > >
> > > > >
> > > > > On Wed, 25 Oct 2023 at 14:25, Kaxil Naik 
> > wrote:
> > > > >
> > > > > > Hello everyone,
> > > > > >
> > > > > > Following the discussion about adding five new providers, I am
> > > calling
> > > > > for
> > > > > > an official vote on adding providers for Pinecone, OpenAI &
> Cohere
> > to
> > > > the
> > > > > > Airflow repo.
> > > > > >
> > > > > > Discussion thread:
> > > > > > https://lists.apache.org/thread/0d669fmy4hn29h5c0wj0ottdskd77ktp
> > > > > >
> > > > > > Draft PRs:
> > > > > >
> > > > > >- https://github.com/apache/airflow/pull/35094 (Pinecone)
> > > > > >- https://github.com/apache/airflow/pull/35023 (OpenAI)
> > > > > >- https://github.com/apache/airflow/pull/34921 (Cohere)
> > > > > >
> > > > > > This is my binding +1 vote.
> > > > > >
> > > > > > The vote will last until 10:30 GMT/UTC on 1st November,
> > > > > > and until at least 3 binding votes have been cast.
> > > > > >
> > > > > > Please vote accordingly:
> > > > > >
> > > > > > [ ] + 1 approve
> > > > > > [ ] + 0 no opinion
> > > > > > [ ] - 1 disapprove with the reason
> > > > > >
> > > > > > Only votes from PMC members and committers are binding, but other
> > > > members
> > > > > > of the community are encouraged to check the AIP and vote with
> > > > > > "(non-binding)".
> > > > > >
> > > > > > Regards,
> > > > > > Kaxil
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: Keep Mssql support

2023-10-25 Thread Amogh Desai
I agree with the comments and where this discussion has led to.

The alternatives suggested by @Jarek Potiuk are doable. We need to realise
that these plugins are for the community, and we can only support them if
the "majority" of the community uses them and is willing to maintain them :)


Thanks & Regards,
Amogh Desai

On Wed, Oct 25, 2023 at 4:13 PM Kaxil Naik  wrote:

> Yeah agreed, I don’t think it is worth keeping the support of MSSQL given
> the amount of usage vs the maintenance effort required.
>
> On Tue, 24 Oct 2023 at 23:07, Jarek Potiuk  wrote:
>
> > Yes. Agree with Andrey.  I think our experience from the last few years
> was
> > "very" bad. The number of mssql users is very small. And the time that
> > maintainers and community members lose on various problems with it is
> huge.
> > Quite often every time we added a new feature requiring some new db
> > functionality, quite a lot of overhead was spent by the one adding new
> > features to solve the problems coming just from MSSQL support. It's not
> > "existing" issues - it's that it generally slows us down with making
> > changes
> >
> > I think there are two options for you when you. It's not "open issues",
> > it's the maintenance
> >
> > * switch to another backend (recommended). And it's not as complex as you
> > think. You can also use managed DB with all that is needed
> > (backup/maintenance), you do not have to manage it yourself. There are
> some
> > excellent postgres options available.
> >
> > * have your own fork airflow and keep the tests running and make your
> copy
> > works for MSSQL if you insist on keeping it. Since you already seem to be
> > ready to spend your engineering time on it, that seems doable.
> >
> > The second option I think might even be a business opportunity - for your
> > company or for anyone who would like to do it. Someone could even offer
> it
> > as a service or as a version to support it for others and make a small
> > business out of it if you are really so committed to it I guess,
> including
> > support for any mssql problems.
> >
> > That would actually be awesome if someone does it.
> >
> > J.
> >
> >
> > On Tue, Oct 24, 2023 at 11:44 PM Andrey Anshin  >
> > wrote:
> >
> > > I don’t think there is any possibility left to keep MS SQL Server as DB
> > > backend for Airflow.
> > >
> > > I add Elad's message from the original discussion:
> > > https://lists.apache.org/thread/r06j306hldg03g2my1pd4nyjxg78b3h4
> > > Because it cleary describe what is happen with MS SQL as DB backend for
> > the
> > > last 1.5 years
> > >
> > > > During this time we hoped it would become stable and widely adopted.
> > > > To my taste MsSQL a backend has left a niche and is *not* worth the >
> > > maintenance
> > > of it in our CI.
> > >
> > > I also want to note the following points
> > > - MS SQL have unstable tests in Airflow CI, and some cases we even
> don't
> > > run them for the last couple months (or even longer)
> > > - In additional it taking 2x memory than any other backend
> > > - Lack of ARM support, this is also quite important because it prevent
> > > maintainers to check some sort of things in their M1/M2 laptops
> > > - Additional backend required extra effort for any contributors who
> want
> > to
> > > add new feature that touches DB
> > >
> > >
> > > This has always been an experimental feature which are described in
> > AIrflow
> > > Release Process:
> > >
> > >
> >
> https://airflow.apache.org/docs/apache-airflow/stable/release-process.html#experimental-features
> > > , I would recommend your team focuses on Airflow on Postgres rather
> than
> > > hanging on to vague hope that MS SQL keeping in Airflow.
> > >
> > > Quite a few companies provide Managed Airflow, see:
> > > https://airflow.apache.org/ecosystem/#airflow-as-a-service (this is
> not
> > > complete list) and AFAIK none of them use any other backend rather than
> > > Postgres, maybe one exception with Google Composer v1 which seems use
> > > MySQL, even on Azure Data Factory Managed Airflow use Postgres as DB
> > > backend, see:
> > >
> > >
> >
> https://learn.microsoft.com/en-us/azure/data-factory/concept-managed-airflow#architecture
> > >
> > > 
> > > Best Wishes
> > > *Andrey Anshin*
> > >
> > >
> > >
> > > On Mon, 23 Oct 2023 at 23:01, agateaaa  wrote:
> > >
> > > > Hi All:
> > > >
> > > > Mssql support was voted to be dropped.
> > > > https://lists.apache.org/thread/r06j306hldg03g2my1pd4nyjxg78b3h4
> > > >
> > > > One of our product requirements is that we can only use the Mssql
> > > database.
> > > > The product that uses airflow is installed with a suite of 8-10 other
> > > > products that all use Mssql database as their database. Preferably we
> > do
> > > > not want our customers to install another database like postgres or
> > MySQL
> > > > since it would involve extra overhead on their DBA team to maintain
> > (with
> > > > backup/restore functions) yet another database
> > > >
> > > > This has been already discus

Re: [VOTE] Add providers for Pinecone, OpenAI & Cohere to enable first-class LLMOps

2023-10-25 Thread Amogh Desai
+1 (binding)

On Wed, Oct 25, 2023 at 4:41 PM Phani Kumar
 wrote:

> +1 binding
>
> On Wed, 25 Oct 2023, 16:39 utkarsh sharma,  wrote:
>
> > +1 (non-binding)
> >
> > Thanks,
> > Utkarsh Sharma
> >
> >
> > On Wed, 25 Oct 2023 at 4:10 PM, Pankaj Koti
> >  wrote:
> >
> > > +1 (binding)
> > >
> > > On Wed, 25 Oct 2023, 06:32 Andrey Anshin, 
> > > wrote:
> > >
> > > > +1 binding
> > > >
> > > > 
> > > > Best Wishes
> > > > *Andrey Anshin*
> > > >
> > > >
> > > >
> > > > On Wed, 25 Oct 2023 at 14:25, Kaxil Naik 
> wrote:
> > > >
> > > > > Hello everyone,
> > > > >
> > > > > Following the discussion about adding five new providers, I am
> > calling
> > > > for
> > > > > an official vote on adding providers for Pinecone, OpenAI & Cohere
> to
> > > the
> > > > > Airflow repo.
> > > > >
> > > > > Discussion thread:
> > > > > https://lists.apache.org/thread/0d669fmy4hn29h5c0wj0ottdskd77ktp
> > > > >
> > > > > Draft PRs:
> > > > >
> > > > >- https://github.com/apache/airflow/pull/35094 (Pinecone)
> > > > >- https://github.com/apache/airflow/pull/35023 (OpenAI)
> > > > >- https://github.com/apache/airflow/pull/34921 (Cohere)
> > > > >
> > > > > This is my binding +1 vote.
> > > > >
> > > > > The vote will last until 10:30 GMT/UTC on 1st November,
> > > > > and until at least 3 binding votes have been cast.
> > > > >
> > > > > Please vote accordingly:
> > > > >
> > > > > [ ] + 1 approve
> > > > > [ ] + 0 no opinion
> > > > > [ ] - 1 disapprove with the reason
> > > > >
> > > > > Only votes from PMC members and committers are binding, but other
> > > members
> > > > > of the community are encouraged to check the AIP and vote with
> > > > > "(non-binding)".
> > > > >
> > > > > Regards,
> > > > > Kaxil
> > > > >
> > > >
> > >
> >
>


Re: [VOTE] Add providers for Pinecone, OpenAI & Cohere to enable first-class LLMOps

2023-10-25 Thread Phani Kumar
+1 binding

On Wed, 25 Oct 2023, 16:39 utkarsh sharma,  wrote:

> +1 (non-binding)
>
> Thanks,
> Utkarsh Sharma
>
>
> On Wed, 25 Oct 2023 at 4:10 PM, Pankaj Koti
>  wrote:
>
> > +1 (binding)
> >
> > On Wed, 25 Oct 2023, 06:32 Andrey Anshin, 
> > wrote:
> >
> > > +1 binding
> > >
> > > 
> > > Best Wishes
> > > *Andrey Anshin*
> > >
> > >
> > >
> > > On Wed, 25 Oct 2023 at 14:25, Kaxil Naik  wrote:
> > >
> > > > Hello everyone,
> > > >
> > > > Following the discussion about adding five new providers, I am
> calling
> > > for
> > > > an official vote on adding providers for Pinecone, OpenAI & Cohere to
> > the
> > > > Airflow repo.
> > > >
> > > > Discussion thread:
> > > > https://lists.apache.org/thread/0d669fmy4hn29h5c0wj0ottdskd77ktp
> > > >
> > > > Draft PRs:
> > > >
> > > >- https://github.com/apache/airflow/pull/35094 (Pinecone)
> > > >- https://github.com/apache/airflow/pull/35023 (OpenAI)
> > > >- https://github.com/apache/airflow/pull/34921 (Cohere)
> > > >
> > > > This is my binding +1 vote.
> > > >
> > > > The vote will last until 10:30 GMT/UTC on 1st November,
> > > > and until at least 3 binding votes have been cast.
> > > >
> > > > Please vote accordingly:
> > > >
> > > > [ ] + 1 approve
> > > > [ ] + 0 no opinion
> > > > [ ] - 1 disapprove with the reason
> > > >
> > > > Only votes from PMC members and committers are binding, but other
> > members
> > > > of the community are encouraged to check the AIP and vote with
> > > > "(non-binding)".
> > > >
> > > > Regards,
> > > > Kaxil
> > > >
> > >
> >
>


Re: [VOTE] Add providers for Pinecone, OpenAI & Cohere to enable first-class LLMOps

2023-10-25 Thread utkarsh sharma
+1 (non-binding)

Thanks,
Utkarsh Sharma


On Wed, 25 Oct 2023 at 4:10 PM, Pankaj Koti
 wrote:

> +1 (binding)
>
> On Wed, 25 Oct 2023, 06:32 Andrey Anshin, 
> wrote:
>
> > +1 binding
> >
> > 
> > Best Wishes
> > *Andrey Anshin*
> >
> >
> >
> > On Wed, 25 Oct 2023 at 14:25, Kaxil Naik  wrote:
> >
> > > Hello everyone,
> > >
> > > Following the discussion about adding five new providers, I am calling
> > for
> > > an official vote on adding providers for Pinecone, OpenAI & Cohere to
> the
> > > Airflow repo.
> > >
> > > Discussion thread:
> > > https://lists.apache.org/thread/0d669fmy4hn29h5c0wj0ottdskd77ktp
> > >
> > > Draft PRs:
> > >
> > >- https://github.com/apache/airflow/pull/35094 (Pinecone)
> > >- https://github.com/apache/airflow/pull/35023 (OpenAI)
> > >- https://github.com/apache/airflow/pull/34921 (Cohere)
> > >
> > > This is my binding +1 vote.
> > >
> > > The vote will last until 10:30 GMT/UTC on 1st November,
> > > and until at least 3 binding votes have been cast.
> > >
> > > Please vote accordingly:
> > >
> > > [ ] + 1 approve
> > > [ ] + 0 no opinion
> > > [ ] - 1 disapprove with the reason
> > >
> > > Only votes from PMC members and committers are binding, but other
> members
> > > of the community are encouraged to check the AIP and vote with
> > > "(non-binding)".
> > >
> > > Regards,
> > > Kaxil
> > >
> >
>


Re: Keep Mssql support

2023-10-25 Thread Kaxil Naik
Yeah agreed, I don’t think it is worth keeping the support of MSSQL given
the amount of usage vs the maintenance effort required.

On Tue, 24 Oct 2023 at 23:07, Jarek Potiuk  wrote:

> Yes. Agree with Andrey. I think our experience from the last few years was
> "very" bad. The number of MSSQL users is very small, and the time that
> maintainers and community members lose on various problems with it is huge.
> Quite often, when we added a new feature requiring some new DB
> functionality, the person adding it spent quite a lot of overhead
> solving problems that came just from MSSQL support. It's not the
> "existing" issues - it's that it generally slows us down when making
> changes.
>
> I think there are two options for you. It's not the "open issues",
> it's the maintenance.
>
> * Switch to another backend (recommended). It's not as complex as you
> think. You can also use a managed DB with all that is needed
> (backup/maintenance), so you do not have to manage it yourself. There are
> some excellent Postgres options available.
>
> * Maintain your own fork of Airflow, keep the tests running and make your
> copy work for MSSQL if you insist on keeping it. Since you already seem to
> be ready to spend your engineering time on it, that seems doable.
>
> The second option might even be a business opportunity - for your
> company or for anyone who would like to do it. If you are really that
> committed to it, someone could even offer it as a service, or as a version
> supported for others, and make a small business out of it, including
> support for any MSSQL problems.
>
> That would actually be awesome if someone did it.
>
> J.
>
>
> On Tue, Oct 24, 2023 at 11:44 PM Andrey Anshin 
> wrote:
>
> > I don’t think there is any possibility left to keep MS SQL Server as a DB
> > backend for Airflow.
> >
> > I am adding Elad's message from the original discussion
> > (https://lists.apache.org/thread/r06j306hldg03g2my1pd4nyjxg78b3h4)
> > because it clearly describes what has happened with MS SQL as a DB
> > backend over the last 1.5 years:
> >
> > > During this time we hoped it would become stable and widely adopted.
> > > To my taste MsSQL as a backend has remained niche and is *not* worth the
> > > maintenance of it in our CI.
> >
> > I also want to note the following points:
> > - MS SQL has unstable tests in Airflow CI, and in some cases we have not
> >   even run them for the last couple of months (or even longer)
> > - In addition, it takes 2x the memory of any other backend
> > - Lack of ARM support, which is also quite important because it prevents
> >   maintainers from checking some things on their M1/M2 laptops
> > - An additional backend requires extra effort from any contributor who
> >   wants to add a new feature that touches the DB
> >
> >
> > This has always been an experimental feature, as described in the
> > Airflow Release Process:
> > https://airflow.apache.org/docs/apache-airflow/stable/release-process.html#experimental-features
> >
> > I would recommend your team focus on Airflow on Postgres rather than
> > hanging on to the vague hope that MS SQL will be kept in Airflow.
> >
> > Quite a few companies provide Managed Airflow, see:
> > https://airflow.apache.org/ecosystem/#airflow-as-a-service (this is not a
> > complete list), and AFAIK none of them use any backend other than
> > Postgres, with maybe one exception, Google Composer v1, which seems to use
> > MySQL. Even Azure Data Factory Managed Airflow uses Postgres as its DB
> > backend, see:
> > https://learn.microsoft.com/en-us/azure/data-factory/concept-managed-airflow#architecture
> >
> > 
> > Best Wishes
> > *Andrey Anshin*
> >
> >
> >
> > On Mon, 23 Oct 2023 at 23:01, agateaaa  wrote:
> >
> > > Hi All:
> > >
> > > Mssql support was voted to be dropped.
> > > https://lists.apache.org/thread/r06j306hldg03g2my1pd4nyjxg78b3h4
> > >
> > > One of our product requirements is that we can only use the MSSQL
> > > database. The product that uses Airflow is installed with a suite of
> > > 8-10 other products that all use MSSQL as their database. Preferably,
> > > we do not want our customers to install another database like Postgres
> > > or MySQL, since it would involve extra overhead for their DBA team to
> > > maintain (with backup/restore functions) yet another database.
> > >
> > > This has already been discussed and voted on, but is there any way we
> > > can keep experimental support if we pitch in to fix any MSSQL-related
> > > issues?
> > >
> > > The list of current MSSQL issues is here:
> > > * https://github.com/apache/airflow/issues?q=is%3Aissue+label%3Abackend-mssql-experimintal+is%3Aopen
> > >
> > > Are there any other outstanding issues, or can you please let us know a
> > > way to identify MSSQL-related bugs/problems that need to be addressed?
> > >
> > > e.g.
> > > * https://github.com/apache/airflow/discussions/35114
> > >
> > >
> > > We are just trying to understand how much effort will be require

Re: [VOTE] Add providers for Pinecone, OpenAI & Cohere to enable first-class LLMOps

2023-10-25 Thread Pankaj Koti
+1 (binding)

On Wed, 25 Oct 2023, 06:32 Andrey Anshin,  wrote:

> +1 binding
>
> 
> Best Wishes
> *Andrey Anshin*
>
>
>
> On Wed, 25 Oct 2023 at 14:25, Kaxil Naik  wrote:
>
> > Hello everyone,
> >
> > Following the discussion about adding five new providers, I am calling
> for
> > an official vote on adding providers for Pinecone, OpenAI & Cohere to the
> > Airflow repo.
> >
> > Discussion thread:
> > https://lists.apache.org/thread/0d669fmy4hn29h5c0wj0ottdskd77ktp
> >
> > Draft PRs:
> >
> >- https://github.com/apache/airflow/pull/35094 (Pinecone)
> >- https://github.com/apache/airflow/pull/35023 (OpenAI)
> >- https://github.com/apache/airflow/pull/34921 (Cohere)
> >
> > This is my binding +1 vote.
> >
> > The vote will last until 10:30 GMT/UTC on 1st November,
> > and until at least 3 binding votes have been cast.
> >
> > Please vote accordingly:
> >
> > [ ] + 1 approve
> > [ ] + 0 no opinion
> > [ ] - 1 disapprove with the reason
> >
> > Only votes from PMC members and committers are binding, but other members
> > of the community are encouraged to check the AIP and vote with
> > "(non-binding)".
> >
> > Regards,
> > Kaxil
> >
>


Re: [VOTE] Add providers for Pinecone, OpenAI & Cohere to enable first-class LLMOps

2023-10-25 Thread Andrey Anshin
+1 binding


Best Wishes
*Andrey Anshin*



On Wed, 25 Oct 2023 at 14:25, Kaxil Naik  wrote:

> Hello everyone,
>
> Following the discussion about adding five new providers, I am calling for
> an official vote on adding providers for Pinecone, OpenAI & Cohere to the
> Airflow repo.
>
> Discussion thread:
> https://lists.apache.org/thread/0d669fmy4hn29h5c0wj0ottdskd77ktp
>
> Draft PRs:
>
>- https://github.com/apache/airflow/pull/35094 (Pinecone)
>- https://github.com/apache/airflow/pull/35023 (OpenAI)
>- https://github.com/apache/airflow/pull/34921 (Cohere)
>
> This is my binding +1 vote.
>
> The vote will last until 10:30 GMT/UTC on 1st November,
> and until at least 3 binding votes have been cast.
>
> Please vote accordingly:
>
> [ ] + 1 approve
> [ ] + 0 no opinion
> [ ] - 1 disapprove with the reason
>
> Only votes from PMC members and committers are binding, but other members
> of the community are encouraged to check the AIP and vote with
> "(non-binding)".
>
> Regards,
> Kaxil
>


Re: [VOTE] Add providers for Pinecone, OpenAI & Cohere to enable first-class LLMOps

2023-10-25 Thread Jarek Potiuk
+1 (binding)

On Wed, Oct 25, 2023 at 12:25 PM Kaxil Naik  wrote:

> Hello everyone,
>
> Following the discussion about adding five new providers, I am calling for
> an official vote on adding providers for Pinecone, OpenAI & Cohere to the
> Airflow repo.
>
> Discussion thread:
> https://lists.apache.org/thread/0d669fmy4hn29h5c0wj0ottdskd77ktp
>
> Draft PRs:
>
>- https://github.com/apache/airflow/pull/35094 (Pinecone)
>- https://github.com/apache/airflow/pull/35023 (OpenAI)
>- https://github.com/apache/airflow/pull/34921 (Cohere)
>
> This is my binding +1 vote.
>
> The vote will last until 10:30 GMT/UTC on 1st November,
> and until at least 3 binding votes have been cast.
>
> Please vote accordingly:
>
> [ ] + 1 approve
> [ ] + 0 no opinion
> [ ] - 1 disapprove with the reason
>
> Only votes from PMC members and committers are binding, but other members
> of the community are encouraged to check the AIP and vote with
> "(non-binding)".
>
> Regards,
> Kaxil
>


[VOTE] Add providers for Pinecone, OpenAI & Cohere to enable first-class LLMOps

2023-10-25 Thread Kaxil Naik
Hello everyone,

Following the discussion about adding five new providers, I am calling for
an official vote on adding providers for Pinecone, OpenAI & Cohere to the
Airflow repo.

Discussion thread:
https://lists.apache.org/thread/0d669fmy4hn29h5c0wj0ottdskd77ktp

Draft PRs:

   - https://github.com/apache/airflow/pull/35094 (Pinecone)
   - https://github.com/apache/airflow/pull/35023 (OpenAI)
   - https://github.com/apache/airflow/pull/34921 (Cohere)

This is my binding +1 vote.

The vote will last until 10:30 GMT/UTC on 1st November,
and until at least 3 binding votes have been cast.

Please vote accordingly:

[ ] + 1 approve
[ ] + 0 no opinion
[ ] - 1 disapprove with the reason

Only votes from PMC members and committers are binding, but other members
of the community are encouraged to check the AIP and vote with
"(non-binding)".

Regards,
Kaxil


[LAZY CONSENSUS] Add PgVector & Weaviate Providers to enable first-class LLMOps

2023-10-25 Thread Kaxil Naik
Hello everyone,

Following the discussion about adding five new providers, I am calling for
a lazy consensus on adding the two open-source providers among them
(PgVector & Weaviate) to the Airflow repo.

Discussion thread:
https://lists.apache.org/thread/0d669fmy4hn29h5c0wj0ottdskd77ktp

The lazy consensus will run for 3 days till October 28 2023 10:30 am GMT.

Regards,
Kaxil


Re: [DISCUSSION] Add 5 new Providers to enable first-class LLMOps

2023-10-25 Thread Kaxil Naik
Thanks everyone for the discussion, I will create a voting thread and lazy
consensus for the open-source ones.

Regards,
Kaxil

On Tue, 24 Oct 2023 at 18:19, Pankaj Koti  wrote:

> Hi Andrey,
>
> We discussed internally (at Astronomer) regarding your
> suggestion on the PRs and we will incorporate the suggestion
> in our PRs to remove the callable approach and include
> examples in the docs to achieve the same using TaskFlow
> and using hook methods in those. Thank you for
> the suggestion, I believe that's easier to maintain and also
> gives the power of the TaskFlow API.
>
> On 2023/10/19 16:03:47 Andrey Anshin wrote:
> > Because 4 out of 5 new providers have a draft PR, I would like to raise a
> > question that relates to all new providers, just to avoid asking the
> > same question in all PRs.
> >
> > Do we actually want to make the new operators kind of like
> > "PythonOperator"? Maybe I am missing something important and can't see
> > why it would work better than running hook methods inside
> > PythonOperator / TaskFlow?
> >
> > For reference:
> > Add Cohere Provider:
> > https://github.com/apache/airflow/pull/34921#discussion_r1358525838
> > Enable pgvector support for Postgres provider:
> > https://github.com/apache/airflow/pull/34891#discussion_r1362910782
> > Add OpenAI Provider:
> > https://github.com/apache/airflow/pull/35023#discussion_r1365235167
> > Add Weaviate Provider:
> > https://github.com/apache/airflow/pull/35060/files#r1365765741
> >
> > 
> > Best Wishes
> > *Andrey Anshin*
> >
> >
> >
> > On Tue, 17 Oct 2023 at 22:42, Kaxil Naik  wrote:
> >
> > > Hey Everyone,
> > >
> > > As a follow-up to my Keynote talk, Building and deploying LLM
> > > applications with Apache Airflow, I am formally proposing the addition
> > > of these 5 providers to the Apache Airflow repo:
> > >
> > >    - PgVector
> > >    - Weaviate
> > >    - Pinecone
> > >    - OpenAI
> > >    - Cohere
> > >
> > >
> > > > Advancements in LLMs are moving at a rapid pace & transforming the way
> > > > we work and our industry. Although LLMs are simple to use in
> > > > prototyping, using LLMs for enterprise applications and for production
> > > > still presents a lot of challenges. These
> > > > <https://speakerdeck.com/kaxil/building-and-deploying-llm-applications-with-apache-airflow?slide=8>
> > > > are some of the same problems that we tackle in Data Engineering, and
> > > > Airflow is a natural fit for them.
> > >
> > > > We at Astronomer would like to add first-class support for the popular
> > > > LLMs (OpenAI & Cohere) and vector DBs (PgVector, Weaviate & Pinecone) so
> > > > that Data Scientists and ML engineers can utilize them natively with
> > > > easy-to-use Operator & Hook abstractions while providing a native (and
> > > > Production-ready) approach for Authentication, retries, logging etc.
> > >
> > > We also think this is vital for the Apache Airflow project as we, the
> > > project, embrace the LLM tide and continue to be a great example of
> > > balancing innovation and maintaining backward-compatibility.
> > >
> > > > The first versions of these providers will enable building one of the
> > > > most common use cases of LLMs, i.e. Question and Answering / Chatbots
> > > > using Retrieval-augmented generation (RAG) done with the help of
> > > > embeddings.
> > >
> > > > Everyone is welcome and encouraged to contribute once the PRs are
> > > > merged. Astronomer is committed to maintaining these providers in the
> > > > Airflow repo, including reviewing PRs, maintaining code quality,
> > > > testing and keeping the APIs up-to-date.
> > >
> > > > Note: PgVector is an open-source project, so we don’t need a formal
> > > > vote for it as per our guidelines
> > > > <https://github.com/apache/airflow/blob/main/PROVIDERS.rst#accepting-new-community-providers>.
> > > > So please consider this email as seeking a Lazy Consensus for it.
> > >
> > > I will open up a VOTING thread after discussing this for a few days.
> > >
> > > Thanks.
> > >
> > > Regards,
> > >
> > > Kaxil
> > >
> >
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
> For additional commands, e-mail: dev-h...@airflow.apache.org
>
>


Re: Limiting (or errorring out) Airflow for Python 3.12 until our dependencies/we catch up

2023-10-25 Thread Jarek Potiuk
> I agree, has there been a new release of Python that has worked with
> Airflow without at least some fixing?

Generally yes. Most of the past bumps were mostly test fixing. I can't
recall any "serious" changes in the code, except maybe in a few individual
providers. Generally, Python 3.7 -> 3.11 were extremely backwards
compatible. There were really maybe 2 cases I remember where we had
compatibility issues in Airflow "core", and they were because of security
fixes (a change in some edge cases of URL parsing is one that I remember),
and those even came in patch-level releases, not in a minor release.

> I understand that Airflow is both a library and an application and I do
> agree with not putting upper bounds on libraries unless required, but it
> seems like allowing an arbitrary upper bound of Python is never going to be
> practical. If users really need to break this support version and install
> an older version of Airflow with a newer version of Python it isn't
> difficult to build Airflow locally with patches, and they will likely need
> to add more patches to get Airflow to work anyway, so their cost is already
> built in.

To be honest, it's been practical so far; this is the first time it has
happened for quite a while. But it seems to be changing now. Also, I think
(from listening to some podcast) the Python community is starting to be
overwhelmed with maintaining all the backwards compatibility, and they are
changing their approach for future releases to be more aggressive in
removals, separating certain "core" features out into external libraries and
removing them from core Python (which is actually quite an inspiration for
what we do by moving stuff, like executors, hopefully FAB in the future and
maybe a few others, from core to providers). So we can expect more such
breakages going forward. Having an explicit upper bound will be a good idea.

I think we can do a hybrid solution. We can add the upper bound for 2.8, but
when we release 2.7.3 we can raise an error if someone tries to use Python
3.12. I think that will be the best approach.
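
A minimal sketch of the runtime-error half of that idea, purely as an
illustration. The function name, the constant and the error message are
hypothetical and are not taken from the Airflow codebase or from the PR:

```python
import sys

# Hypothetical bound for illustration: the first Python version that is
# treated as unsupported.
FIRST_UNSUPPORTED = (3, 12)


def ensure_supported_python(version_info=None):
    """Return (major, minor) if supported, otherwise raise RuntimeError.

    `version_info` defaults to the running interpreter; passing a tuple
    makes the check easy to exercise without switching interpreters.
    """
    if version_info is None:
        version_info = sys.version_info
    major_minor = (version_info[0], version_info[1])
    # Tuples compare element by element, so (3, 12) >= (3, 12) is True
    # and (3, 11) >= (3, 12) is False.
    if major_minor >= FIRST_UNSUPPORTED:
        raise RuntimeError(
            f"Python {major_minor[0]}.{major_minor[1]} is not (yet) supported; "
            f"please use a version below "
            f"{FIRST_UNSUPPORTED[0]}.{FIRST_UNSUPPORTED[1]}."
        )
    return major_minor
```

The install-time upper bound would be declared separately in the package
metadata (for example via `python_requires`); a runtime check like this one
only helps in cases where the package got installed anyway.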

J.



On Mon, Oct 23, 2023 at 6:33 PM Damian Shaw 
wrote:

> I agree, has there been a new release of Python that has worked with
> Airflow without at least some fixing?
>
> I understand that Airflow is both a library and an application and I do
> agree with not putting upper bounds on libraries unless required, but it
> seems like allowing an arbitrary upper bound of Python is never going to be
> practical. If users really need to break this support version and install
> an older version of Airflow with a newer version of Python it isn't
> difficult to build Airflow locally with patches, and they will likely need
> to add more patches to get Airflow to work anyway, so their cost is already
> built in.
>
> Even if just limiting to 3.11 or lower for 2.7.3 or higher, I think the
> benefits are there: Python 3.12 might be supported soon, or there might be
> more hidden issues that mean it isn't supported for 9+ months.
>
> Damian
>
> -----Original Message-----
> From: Pierre Jeambrun 
> Sent: Monday, October 23, 2023 12:13 PM
> To: dev@airflow.apache.org
> Subject: Re: Limiting (or errorring out) Airflow for Python 3.12 until our
> dependencies/we catch up
>
> I think that limiting to <3.12 makes sense. 2.7.2 is already out so I'm
> not sure we can do anything for users trying to install 2.7.2 on python
> 3.12.
>
> I believe there is no such thing as a Python minor release that works well
> with Airflow out of the box. It seems that we always need extra effort to
> bring in a new version due to small (or big) breaking changes. Maybe
> limiting the Python version should be the default (always), and we
> explicitly move that limit when we integrate a new version. Airflow would
> then not be installable on the newest versions that were not explicitly
> tested against.
>
> On Mon, Oct 23, 2023 at 14:19, Aritra Basu wrote:
>
> > I think I'm on the side of giving an error message saying 3.12 is not
> > yet supported in 2.7.3. I would assume anyone seeing that would
> > understand the implication that 2.7.2 doesn't support it either, and
> > thus they wouldn't try installing it.
> >
> > Though I would also think they would have the same understanding that
> > if 2.7.3 doesn't list 3.12 as supported neither would 2.7.2. But I'd
> > rather be explicitly told 2.7.3 and earlier don't work with 3.12.
> >
> > Thanks and Regards,
> > Aritra Basu
> >
> >
> > On Mon, Oct 23, 2023 at 4:54 PM Jarek Potiuk  wrote:
> >
> > > Hey everyone,
> > >
> > > I've opened a PR https://github.com/apache/airflow/pull/35123 to
> > > limit Airflow to Python < 3.12, though I am not sure if this is the
> > > best idea, so I seek devlist wisdom to decide whether we should do
> > > this, or maybe something else like allowing airflow to be installed
> > > but produce a clean error indicating 3.12 is not (yet) supported.
> > >
> > > Currently Airflow does not work for Python 3.12 - mainly because of
> > > "distutils" removal https://peps.py