Re: [VOTE] Add Multiple PR's of the Month to the Airflow Newsletter

2023-10-27 Thread Jed Cunningham
Just to clarify, I didn't mean we should highlight multiple PRs every
month. In that month none of @eumiro's PRs were individually enough to be
highlighted but in bulk they were. That was an unusual situation though.

I think we should be flexible with it. I'm not opposed to having multiple
per month, but if we do I think they should be equally notable. If we have
a clear winner, we should just highlight that one PR. My 2c.

I'll also call out that we (committers, community members, anyone really!)
should toss protm on good candidates as we are reviewing stuff day-to-day.
That bumps up the score in the script and will help ensure the good stuff
bubbles to the top of the script's output! A quick search shows we've done
this less than a dozen times so far.
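
As a rough illustration of how a label can bump a PR's score, here is a
minimal sketch. The `PullRequest` fields, the base score, and the
`PROTM_BONUS` multiplier are assumptions made up for this example; they are
not taken from the actual script, whose heuristic may differ.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class PullRequest:
    """Hypothetical, simplified view of a PR for scoring purposes."""
    number: int
    comments: int = 0
    reactions: int = 0
    labels: Set[str] = field(default_factory=set)


# Assumed multiplier for illustration; the real weighting is not shown here.
PROTM_BONUS = 2.0


def score(pr: PullRequest) -> float:
    """Toy heuristic: interaction count, boosted if the PR carries 'protm'."""
    base = float(pr.comments + pr.reactions)
    return base * PROTM_BONUS if "protm" in pr.labels else base


def top_candidates(prs: List[PullRequest], limit: int = 5) -> List[PullRequest]:
    """Return the highest-scoring PRs, best first."""
    return sorted(prs, key=score, reverse=True)[:limit]
```

With a boost like this, a labelled PR with fewer interactions can still
outrank an unlabelled one, which is the "bubbling up" effect described above.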


Re: [VOTE] Add providers for Pinecone, OpenAI & Cohere to enable first-class LLMOps

2023-10-27 Thread Hussein Awala
+1 (binding) for:
   - https://github.com/apache/airflow/pull/35094 (Pinecone)
   - https://github.com/apache/airflow/pull/34921 (Cohere)

-0 (binding) for:
 - https://github.com/apache/airflow/pull/35023 (OpenAI)
-> One month ago, openai-python announced its v1.0.0 Beta version (
https://github.com/openai/openai-python/discussions/631), which is a total
rewrite of the library and has a lot of changes according to this
announcement.
IMHO, it would be better to wait for the first major version to avoid
bringing breaking changes to this new provider or handling two
completely different versions for backward compatibility.
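
To make the compatibility concern concrete, here is a minimal sketch of the
kind of version branching a provider would need if it had to support both the
pre-1.0 and the rewritten library. The function names and the "legacy"/"v1"
labels are hypothetical; in a real provider the version string would likely
come from something like `importlib.metadata.version("openai")`.

```python
def major_version(version: str) -> int:
    """Extract the major component from a version string such as '1.0.0b1'."""
    return int(version.split(".", 1)[0])


def client_style(version: str) -> str:
    """Pick which (hypothetical) code path a hook would have to use."""
    return "v1" if major_version(version) >= 1 else "legacy"
```

Every hook method would need a branch like this, which is the maintenance
cost that waiting for the stable release would avoid.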

On Fri, Oct 27, 2023 at 9:20 PM Rahul Vats  wrote:

> +1 (non binding)
>
> On Sat, 28 Oct, 2023, 00:47 Ryan Hatter wrote:
>
> > +1 (non-binding)
> >
> > On Thu, Oct 26, 2023 at 9:32 AM Oliveira, Niko wrote:
> >
> > > +1 (binding)
> > >
> > > looking forward to having more native LLM capabilities in Airflow!
> > >
> > > 
> > > From: Aritra Basu 
> > > Sent: Wednesday, October 25, 2023 12:10:00 PM
> > > To: dev@airflow.apache.org
> > > Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [VOTE] Add providers for
> > > Pinecone, OpenAI & Cohere to enable first-class LLMOps
> > >
> > > +1 (non binding)
> > >
> > > --
> > > Regards,
> > > Aritra Basu
> > >
> > > On Wed, Oct 25, 2023, 11:02 PM Ferruzzi, Dennis
> > > 
> > > wrote:
> > >
> > > > +1 (binding)
> > > >
> > > >
> > > >  - ferruzzi
> > > >
> > > >
> > > > 
> > > > From: Jed Cunningham 
> > > > Sent: Wednesday, October 25, 2023 9:54 AM
> > > > To: dev@airflow.apache.org
> > > > Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [VOTE] Add providers for
> > > > Pinecone, OpenAI & Cohere to enable first-class LLMOps
> > > >
> > > > +1 (binding)
> > > >
> > >
> >
>


Re: [VOTE] Add Multiple PR's of the Month to the Airflow Newsletter

2023-10-27 Thread Briana Okyere
Hey All,

Thank you so much for your feedback on this. It sounds like we are good to
move forward with our current script that features the Top 5 PRs of the
Month, and then voting for our top 3 from those.

On Fri, Oct 27, 2023 at 12:51 PM Aritra Basu 
wrote:

> I'm not sure on the exact heuristics used by the script but how about all
> PRs with atleast 1 (or 2 or a specific number) votes from the ones picked
> by the script gets featured.
>
> --
> Regards,
> Aritra Basu
>
> On Sat, Oct 28, 2023, 1:04 AM Jarek Potiuk  wrote:
>
> > Yeah. Top PRs of the month sound good.
> >
> > On Fri, Oct 27, 2023 at 6:16 PM Amogh Desai 
> > wrote:
> >
> > > Hi,
> > >
> > > I also think having multiple PRs under PR of the month would be really
> > > nice.
> > >
> > > One way to approach this is:
> > >
> > > What we can do is, collect votes for all the stakeholders for their
> top 3
> > > PRs in the list as Pierre mentioned and then create score for each PR.
> > >
> > > The top X PRs from this list goes in the newsletter.
> > >
> > > Thanks & Regards,
> > > Amogh Desai
> > >
> > > On Fri, Oct 27, 2023, 21:37 Pierre Jeambrun 
> > wrote:
> > >
> > > > Hello all,
> > > >
> > > > I like the idea of highlighting more than just 1 PR in the "PR of the
> > > month
> > > > section", especially when we have a hard time deciding between a few
> > good
> > > > candidates.
> > > >
> > > > IMHO the script does not always select good candidates or sometimes
> > miss
> > > > some good candidates, (because of the heuristic we use) and should
> > still
> > > be
> > > > reviewed by 'human' and put under a vote.
> > > >
> > > > An idea would be to keep the current process we have, but vote for
> lets
> > > > say, up to 3 PR per person, and adapt the newsletter to handle a PR
> > batch
> > > > of the month.
> > > >
> > > > (We can also improve the heuristic so we can blindly trust the X top
> > > > candidates it outputs, but we are not here yet I believe)
> > > >
> > > > Le ven. 27 oct. 2023 à 17:56, Briana Okyere
> > > >  a écrit :
> > > >
> > > > > Hey All,
> > > > >
> > > > > A couple months back, Jed Cunningham proposed that we batch our
> PRs,
> > > > > instead of including just one in the PR of the Month section of the
> > > > Airflow
> > > > > Newsletter. Thread below:
> > > > >
> > > > > 
> > > > >
> > > > > I'd like to put this up to a vote. Do you prefer we only include 1
> PR
> > > > each
> > > > > month? Or, should we include multiple? Currently, we use a script
> to
> > > pull
> > > > > the top 5. Should all 5 be included in each month's issue?
> > > > >
> > > > > Looking forward to hearing your thoughts prior to the release of
> the
> > > > > October issue next week.
> > > > >
> > > > > --
> > > > > Briana Okyere
> > > > > Community Manager
> > > > > Email: briana.oky...@astronomer.io
> > > > > Mobile: +1 415.713.9943
> > > > > Time zone: US Pacific UTC
> > > > >
> > > > > 
> > > > >
> > > >
> > >
> >
>


Re: [VOTE] Add Multiple PR's of the Month to the Airflow Newsletter

2023-10-27 Thread Aritra Basu
I'm not sure on the exact heuristics used by the script but how about all
PRs with at least 1 (or 2 or a specific number) votes from the ones picked
by the script get featured.
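
A minimal sketch of that threshold idea, assuming each voter submits a list
of PR numbers and that votes for PRs outside the script's candidate list are
ignored (the function name, data shapes, and sample numbers are made up for
illustration):

```python
from collections import Counter
from typing import Iterable, List


def featured_prs(
    candidates: List[int],
    ballots: Iterable[Iterable[int]],
    min_votes: int = 1,
) -> List[int]:
    """Return script-picked candidates that received at least min_votes votes."""
    allowed = set(candidates)
    counts: Counter = Counter()
    for ballot in ballots:
        # A voter counts at most once per PR; unknown PRs are dropped.
        counts.update(set(ballot) & allowed)
    return [pr for pr in candidates if counts[pr] >= min_votes]
```

The result keeps the order in which the script listed the candidates, so the
newsletter ordering would still follow the script's ranking.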

--
Regards,
Aritra Basu

On Sat, Oct 28, 2023, 1:04 AM Jarek Potiuk  wrote:

> Yeah. Top PRs of the month sound good.
>
> On Fri, Oct 27, 2023 at 6:16 PM Amogh Desai 
> wrote:
>
> > Hi,
> >
> > I also think having multiple PRs under PR of the month would be really
> > nice.
> >
> > One way to approach this is:
> >
> > What we can do is, collect votes for all the stakeholders for their top 3
> > PRs in the list as Pierre mentioned and then create score for each PR.
> >
> > The top X PRs from this list goes in the newsletter.
> >
> > Thanks & Regards,
> > Amogh Desai
> >
> > On Fri, Oct 27, 2023, 21:37 Pierre Jeambrun 
> wrote:
> >
> > > Hello all,
> > >
> > > I like the idea of highlighting more than just 1 PR in the "PR of the
> > month
> > > section", especially when we have a hard time deciding between a few
> good
> > > candidates.
> > >
> > > IMHO the script does not always select good candidates or sometimes
> miss
> > > some good candidates, (because of the heuristic we use) and should
> still
> > be
> > > reviewed by 'human' and put under a vote.
> > >
> > > An idea would be to keep the current process we have, but vote for lets
> > > say, up to 3 PR per person, and adapt the newsletter to handle a PR
> batch
> > > of the month.
> > >
> > > (We can also improve the heuristic so we can blindly trust the X top
> > > candidates it outputs, but we are not here yet I believe)
> > >
> > > Le ven. 27 oct. 2023 à 17:56, Briana Okyere
> > >  a écrit :
> > >
> > > > Hey All,
> > > >
> > > > A couple months back, Jed Cunningham proposed that we batch our PRs,
> > > > instead of including just one in the PR of the Month section of the
> > > Airflow
> > > > Newsletter. Thread below:
> > > >
> > > > 
> > > >
> > > > I'd like to put this up to a vote. Do you prefer we only include 1 PR
> > > each
> > > > month? Or, should we include multiple? Currently, we use a script to
> > pull
> > > > the top 5. Should all 5 be included in each month's issue?
> > > >
> > > > Looking forward to hearing your thoughts prior to the release of the
> > > > October issue next week.
> > > >
> > > > --
> > > > Briana Okyere
> > > > Community Manager
> > > > Email: briana.oky...@astronomer.io
> > > > Mobile: +1 415.713.9943
> > > > Time zone: US Pacific UTC
> > > >
> > > > 
> > > >
> > >
> >
>


Re: [VOTE] Add Multiple PR's of the Month to the Airflow Newsletter

2023-10-27 Thread Jarek Potiuk
Yeah. Top PRs of the month sound good.

On Fri, Oct 27, 2023 at 6:16 PM Amogh Desai 
wrote:

> Hi,
>
> I also think having multiple PRs under PR of the month would be really
> nice.
>
> One way to approach this is:
>
> What we can do is, collect votes for all the stakeholders for their top 3
> PRs in the list as Pierre mentioned and then create score for each PR.
>
> The top X PRs from this list goes in the newsletter.
>
> Thanks & Regards,
> Amogh Desai
>
> On Fri, Oct 27, 2023, 21:37 Pierre Jeambrun  wrote:
>
> > Hello all,
> >
> > I like the idea of highlighting more than just 1 PR in the "PR of the
> month
> > section", especially when we have a hard time deciding between a few good
> > candidates.
> >
> > IMHO the script does not always select good candidates or sometimes miss
> > some good candidates, (because of the heuristic we use) and should still
> be
> > reviewed by 'human' and put under a vote.
> >
> > An idea would be to keep the current process we have, but vote for lets
> > say, up to 3 PR per person, and adapt the newsletter to handle a PR batch
> > of the month.
> >
> > (We can also improve the heuristic so we can blindly trust the X top
> > candidates it outputs, but we are not here yet I believe)
> >
> > Le ven. 27 oct. 2023 à 17:56, Briana Okyere
> >  a écrit :
> >
> > > Hey All,
> > >
> > > A couple months back, Jed Cunningham proposed that we batch our PRs,
> > > instead of including just one in the PR of the Month section of the
> > Airflow
> > > Newsletter. Thread below:
> > >
> > > 
> > >
> > > I'd like to put this up to a vote. Do you prefer we only include 1 PR
> > each
> > > month? Or, should we include multiple? Currently, we use a script to
> pull
> > > the top 5. Should all 5 be included in each month's issue?
> > >
> > > Looking forward to hearing your thoughts prior to the release of the
> > > October issue next week.
> > >
> > > --
> > > Briana Okyere
> > > Community Manager
> > > Email: briana.oky...@astronomer.io
> > > Mobile: +1 415.713.9943
> > > Time zone: US Pacific UTC
> > >
> > > 
> > >
> >
>


Re: [VOTE] Add providers for Pinecone, OpenAI & Cohere to enable first-class LLMOps

2023-10-27 Thread Rahul Vats
+1 (non binding)

On Sat, 28 Oct, 2023, 00:47 Ryan Hatter, 
wrote:

> +1 (non-binding)
>
> On Thu, Oct 26, 2023 at 9:32 AM Oliveira, Niko
> wrote:
>
> > +1 (binding)
> >
> > looking forward to having more native LLM capabilities in Airflow!
> >
> > 
> > From: Aritra Basu 
> > Sent: Wednesday, October 25, 2023 12:10:00 PM
> > To: dev@airflow.apache.org
> > Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [VOTE] Add providers for
> > Pinecone, OpenAI & Cohere to enable first-class LLMOps
> >
> > +1 (non binding)
> >
> > --
> > Regards,
> > Aritra Basu
> >
> > On Wed, Oct 25, 2023, 11:02 PM Ferruzzi, Dennis
> > 
> > wrote:
> >
> > > +1 (binding)
> > >
> > >
> > >  - ferruzzi
> > >
> > >
> > > 
> > > From: Jed Cunningham 
> > > Sent: Wednesday, October 25, 2023 9:54 AM
> > > To: dev@airflow.apache.org
> > > Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [VOTE] Add providers for
> > > Pinecone, OpenAI & Cohere to enable first-class LLMOps
> > >
> > > +1 (binding)
> > >
> >
>


Re: [VOTE] Add providers for Pinecone, OpenAI & Cohere to enable first-class LLMOps

2023-10-27 Thread Ryan Hatter
+1 (non-binding)

On Thu, Oct 26, 2023 at 9:32 AM Oliveira, Niko 
wrote:

> +1 (binding)
>
> looking forward to having more native LLM capabilities in Airflow!
>
> 
> From: Aritra Basu 
> Sent: Wednesday, October 25, 2023 12:10:00 PM
> To: dev@airflow.apache.org
> Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [VOTE] Add providers for
> Pinecone, OpenAI & Cohere to enable first-class LLMOps
>
> +1 (non binding)
>
> --
> Regards,
> Aritra Basu
>
> On Wed, Oct 25, 2023, 11:02 PM Ferruzzi, Dennis
> 
> wrote:
>
> > +1 (binding)
> >
> >
> >  - ferruzzi
> >
> >
> > 
> > From: Jed Cunningham 
> > Sent: Wednesday, October 25, 2023 9:54 AM
> > To: dev@airflow.apache.org
> > Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [VOTE] Add providers for
> > Pinecone, OpenAI & Cohere to enable first-class LLMOps
> >
> > +1 (binding)
> >
>


Re: [DISCUSS] Removing Qubole provider (and adding removal process)

2023-10-27 Thread Aritra Basu
Sounds like a good time to set the process up. +1 from me as well.

--
Regards,
Aritra Basu

On Fri, Oct 27, 2023, 6:42 PM Vincent Beck  wrote:

> I like that. I also think it is important to have a process to remove
> provider if needed. +1
>
> On 2023/10/27 09:00:25 Jarek Potiuk wrote:
> > > I think in the case of Qubole it is pretty easy to remove it from the
> > provider codebase. I'm pretty sure that almost no one even noticed this
> > removal.
> >
> > Yeah. Agree. This one is pretty "obvious" that's why I would like to
> create
> > a process for doing it along the way so that in the future if we have
> other
> > non-obvious cases we can just "follow the process".
> >
> > BTW.That would be really great to have as with it, we will have a
> complete
> > "life-cycle" of the providers.
> >
> > * we know how we approve the new ones
> > * we know how we release new versions (Elad is amazing to release them
> > every few weeks)
> > * we know how we maintain back-compatibility (bumping min-version of
> > Airflow that we regularly do )
> > * we know how to involve stakeholders to system-test them and to make
> them
> > work in a stable way when they connect to external services (Amazon
> works,
> > Google in Progres, Open API and others promised by Astronomer)
> > * we know how to involve stakeholders with mixed-governance in case they
> > want older releases (never happened yet but we know how to do it)
> > * we know how to suspend and resume them when they prove to be
> problematic
> > and pass resolution of that to external stakeholders (happened with
> Yandex
> > - both suspend and resume)
> > * we (will finally) know how to retire them when we decide we do not want
> > to maintain them - except security fixes - any more
> >
> > That will pretty much complete our process of "life-cycle" management for
> > providers.
> >
> > J.
> >
> >
> >
> >
> > On Thu, Oct 26, 2023 at 10:00 PM Jarek Potiuk  wrote:
> >
> > >
> > >> I suggest also removing it from pypi for security reasons. If there
> is a
> > >> security issue with it then the issue will remain with us.
> > >>
> > >>
> > > I am quite sure we still have to handle security issues if someone
> finds
> > > them. releasing such a provider will still be possible using the
> tag/branch
> > > and we will be obliged to release a new version IF it is still used
> and a
> > > security issue is found.
> > > Removal of packages from PyPI does not remove our obligation to fix a
> > > security problem. We have  also source packages released via
> > > downloads.apache.org and archives.apache.org - and those we can't
> remove
> > > either.
> > >
> > > I think this is one of the obligations of the Foundation being the
> > > "steward" of software it releases  - that's why there is also the
> attic PMC
> > > to manage projects that PMC is unable to support (projects are moved to
> > > attic when they fail a roll call from the board with less than 2 PMC
> > > members confirming that they are still there and ready to handle
> releases
> > > if needed. It's also being discussed to be more formal for the CRA
> > > regulations right now - "stewards" of the software put on the market
> should
> > > be responsible for handling security issues in a timely manner). The
> act of
> > > release with 3 +1s of PMC is a legal act of the Foundation placing
> software
> > > on the market and we can't make it "unhappen".
> > >
> > >
> > >> B.
> > >>
> > >> Sent from my iPhone
> > >>
> > >> > On 26 Oct 2023, at 20:20, Jarek Potiuk  wrote:
> > >> >
> > >> > Hello Airflow community,
> > >> >
> > >> > How do we feel about removing the Qubole provider completely
> (leaving
> > >> only
> > >> > old releases in PyPI?
> > >> >
> > >> > On September 1 2023 (
> > >> > https://lists.apache.org/thread/p394d7w7gc7lz61g7qdthl96bc9kprxh)
> the
> > >> > Qubole operator ws suspended.
> > >> >
> > >> > Due to the reasons described in the thread (Qubole got acquired and
> the
> > >> > service is generally abandoned) there is pretty much no chance for
> it
> > >> to be
> > >> > resumed.
> > >> >
> > >> > I'd love to remove it completely and introduce a process where we
> can do
> > >> > similar things in the future for other providers if we decide to do
> so.
> > >> >
> > >> > I checked in the Attic project in the ASF (this is where abandoned
> > >> project
> > >> > of the ASF get moved to) and it seems that just removing part of the
> > >> > project that has an active PMC is not going through attic
> > >> > https://issues.apache.org/jira/browse/ATTIC-218 . We are free to
> > >> define our
> > >> > rules for that and I would like to use the opportunity to hash it
> out
> > >> and
> > >> > propose a process (similarly to suspension) and criteria to remove
> > >> > providers from being maintained by us.
> > >> >
> > >> > It's more than suspension. We will completely stop updating the
> related
> > >> > code (right now some automated changes can still be applied and
> > >> suspended
> > >> > providers can be resum

Re: AIP-49 OpenTelemetry call for action/help

2023-10-27 Thread Aritra Basu
Hi,
I can try helping out, though I don't have much experience working with OTel,
so I guess I might be more helpful if there are some implementations already
done to work off of.

--
Regards,
Aritra Basu

On Fri, Oct 27, 2023, 9:35 PM Ferruzzi, Dennis 
wrote:

> Hello friends!  OTel support for metrics reporting has been live for a bit
> now and I am looking forward to the next stage which will be to get Traces
> and Spans implemented.   Howard Yoo has a working proof-of-concept for the
> Traces in the AIP, and I have set up a Project board on GitHub with a
> breakdown of what needs to be done for the Traces, along with links back to
> the relevant parts of the POC.
>
> I'd love to get this done, but honestly work and life have me a bit
> overwhelmed and distracted with other things.  So I'm putting out the call
> for help.   If anyone is looking for a little something to work on, the
> tasks on the board should be fairly "bite-sized" and all include (rough?)
> examples.   The first one or two will be the more difficult as we decide on
> a format and get unit tests implemented, then after that the rest will just
> be "do that again in a different file" and should get much easier.
>
> So this is a call for action, and a call for help.  Many hands make light
> work, as they say, and I'd very much love to see this project through.
>
>
> AIP:
> https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-49+OpenTelemetry+Support+for+Apache+Airflow
> Project Board: https://github.com/orgs/apache/projects/298
> Previous Work [metrics implementation]:
> https://github.com/apache/airflow/pulls?q=is%3Apr+in%3Atitle+%5BAIP-49%5D+
>
>
>
>  - ferruzzi
>


Re: [VOTE] Add Multiple PR's of the Month to the Airflow Newsletter

2023-10-27 Thread Amogh Desai
Hi,

I also think having multiple PRs under PR of the month would be really
nice.

One way to approach this is:

What we can do is collect votes from all the stakeholders for their top 3
PRs in the list, as Pierre mentioned, and then create a score for each PR.

The top X PRs from this list go in the newsletter.
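
A minimal sketch of such a scoring step, assuming each stakeholder submits a
ranked list of up to 3 PR numbers and that ranks are worth 3, 2, and 1 points
(the point values and the tie-break by PR number are assumptions for
illustration, not something agreed in this thread):

```python
from collections import defaultdict
from typing import Dict, List

# Assumed points for 1st, 2nd, and 3rd choice.
RANK_POINTS = (3, 2, 1)


def score_ballots(ballots: Dict[str, List[int]]) -> Dict[int, int]:
    """Sum rank points per PR; choices beyond the third are ignored."""
    scores: Dict[int, int] = defaultdict(int)
    for ranked_prs in ballots.values():
        for points, pr in zip(RANK_POINTS, ranked_prs):
            scores[pr] += points
    return dict(scores)


def top_x(scores: Dict[int, int], x: int) -> List[int]:
    """Highest score first; ties broken by lower PR number."""
    return sorted(scores, key=lambda pr: (-scores[pr], pr))[:x]
```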

Thanks & Regards,
Amogh Desai

On Fri, Oct 27, 2023, 21:37 Pierre Jeambrun  wrote:

> Hello all,
>
> I like the idea of highlighting more than just 1 PR in the "PR of the month
> section", especially when we have a hard time deciding between a few good
> candidates.
>
> IMHO the script does not always select good candidates or sometimes miss
> some good candidates, (because of the heuristic we use) and should still be
> reviewed by 'human' and put under a vote.
>
> An idea would be to keep the current process we have, but vote for lets
> say, up to 3 PR per person, and adapt the newsletter to handle a PR batch
> of the month.
>
> (We can also improve the heuristic so we can blindly trust the X top
> candidates it outputs, but we are not here yet I believe)
>
> Le ven. 27 oct. 2023 à 17:56, Briana Okyere
>  a écrit :
>
> > Hey All,
> >
> > A couple months back, Jed Cunningham proposed that we batch our PRs,
> > instead of including just one in the PR of the Month section of the
> Airflow
> > Newsletter. Thread below:
> >
> > 
> >
> > I'd like to put this up to a vote. Do you prefer we only include 1 PR
> each
> > month? Or, should we include multiple? Currently, we use a script to pull
> > the top 5. Should all 5 be included in each month's issue?
> >
> > Looking forward to hearing your thoughts prior to the release of the
> > October issue next week.
> >
> > --
> > Briana Okyere
> > Community Manager
> > Email: briana.oky...@astronomer.io
> > Mobile: +1 415.713.9943
> > Time zone: US Pacific UTC
> >
> > 
> >
>


Re: [VOTE] Add Multiple PR's of the Month to the Airflow Newsletter

2023-10-27 Thread Pierre Jeambrun
Hello all,

I like the idea of highlighting more than just 1 PR in the "PR of the month"
section, especially when we have a hard time deciding between a few good
candidates.

IMHO the script does not always select good candidates and sometimes misses
some good ones (because of the heuristic we use), so its output should still
be reviewed by a human and put under a vote.

An idea would be to keep the current process we have, but vote for, let's
say, up to 3 PRs per person, and adapt the newsletter to handle a PR batch
of the month.

(We can also improve the heuristic so we can blindly trust the X top
candidates it outputs, but we are not there yet, I believe.)

Le ven. 27 oct. 2023 à 17:56, Briana Okyere
 a écrit :

> Hey All,
>
> A couple months back, Jed Cunningham proposed that we batch our PRs,
> instead of including just one in the PR of the Month section of the Airflow
> Newsletter. Thread below:
>
> 
>
> I'd like to put this up to a vote. Do you prefer we only include 1 PR each
> month? Or, should we include multiple? Currently, we use a script to pull
> the top 5. Should all 5 be included in each month's issue?
>
> Looking forward to hearing your thoughts prior to the release of the
> October issue next week.
>
> --
> Briana Okyere
> Community Manager
> Email: briana.oky...@astronomer.io
> Mobile: +1 415.713.9943
> Time zone: US Pacific UTC
>
> 
>


AIP-49 OpenTelemetry call for action/help

2023-10-27 Thread Ferruzzi, Dennis
Hello friends!  OTel support for metrics reporting has been live for a bit now 
and I am looking forward to the next stage which will be to get Traces and 
Spans implemented.   Howard Yoo has a working proof-of-concept for the Traces 
in the AIP, and I have set up a Project board on GitHub with a breakdown of 
what needs to be done for the Traces, along with links back to the relevant 
parts of the POC.

I'd love to get this done, but honestly work and life have me a bit overwhelmed 
and distracted with other things.  So I'm putting out the call for help.   If 
anyone is looking for a little something to work on, the tasks on the board 
should be fairly "bite-sized" and all include (rough?) examples.   The first 
one or two will be the more difficult as we decide on a format and get unit 
tests implemented, then after that the rest will just be "do that again in a 
different file" and should get much easier.

So this is a call for action, and a call for help.  Many hands make light work, 
as they say, and I'd very much love to see this project through.


AIP: 
https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-49+OpenTelemetry+Support+for+Apache+Airflow
Project Board: https://github.com/orgs/apache/projects/298
Previous Work [metrics implementation]: 
https://github.com/apache/airflow/pulls?q=is%3Apr+in%3Atitle+%5BAIP-49%5D+



 - ferruzzi


[VOTE] Add Multiple PR's of the Month to the Airflow Newsletter

2023-10-27 Thread Briana Okyere
Hey All,

A couple months back, Jed Cunningham proposed that we batch our PRs,
instead of including just one in the PR of the Month section of the Airflow
Newsletter. Thread below:



I'd like to put this up to a vote. Do you prefer we only include 1 PR each
month? Or, should we include multiple? Currently, we use a script to pull
the top 5. Should all 5 be included in each month's issue?

Looking forward to hearing your thoughts prior to the release of the
October issue next week.

-- 
Briana Okyere
Community Manager
Email: briana.oky...@astronomer.io
Mobile: +1 415.713.9943
Time zone: US Pacific UTC




Re: [DISCUSS] Removing Qubole provider (and adding removal process)

2023-10-27 Thread Vincent Beck
I like that. I also think it is important to have a process to remove a
provider if needed. +1

On 2023/10/27 09:00:25 Jarek Potiuk wrote:
> > I think in the case of Qubole it is pretty easy to remove it from the
> provider codebase. I'm pretty sure that almost no one even noticed this
> removal.
> 
> Yeah. Agree. This one is pretty "obvious" that's why I would like to create
> a process for doing it along the way so that in the future if we have other
> non-obvious cases we can just "follow the process".
> 
> BTW.That would be really great to have as with it, we will have a complete
> "life-cycle" of the providers.
> 
> * we know how we approve the new ones
> * we know how we release new versions (Elad is amazing to release them
> every few weeks)
> * we know how we maintain back-compatibility (bumping min-version of
> Airflow that we regularly do )
> * we know how to involve stakeholders to system-test them and to make them
> work in a stable way when they connect to external services (Amazon works,
> Google in Progres, Open API and others promised by Astronomer)
> * we know how to involve stakeholders with mixed-governance in case they
> want older releases (never happened yet but we know how to do it)
> * we know how to suspend and resume them when they prove to be problematic
> and pass resolution of that to external stakeholders (happened with Yandex
> - both suspend and resume)
> * we (will finally) know how to retire them when we decide we do not want
> to maintain them - except security fixes - any more
> 
> That will pretty much complete our process of "life-cycle" management for
> providers.
> 
> J.
> 
> 
> 
> 
> On Thu, Oct 26, 2023 at 10:00 PM Jarek Potiuk  wrote:
> 
> >
> >> I suggest also removing it from pypi for security reasons. If there is a
> >> security issue with it then the issue will remain with us.
> >>
> >>
> > I am quite sure we still have to handle security issues if someone finds
> > them. releasing such a provider will still be possible using the tag/branch
> > and we will be obliged to release a new version IF it is still used and a
> > security issue is found.
> > Removal of packages from PyPI does not remove our obligation to fix a
> > security problem. We have  also source packages released via
> > downloads.apache.org and archives.apache.org - and those we can't remove
> > either.
> >
> > I think this is one of the obligations of the Foundation being the
> > "steward" of software it releases  - that's why there is also the attic PMC
> > to manage projects that PMC is unable to support (projects are moved to
> > attic when they fail a roll call from the board with less than 2 PMC
> > members confirming that they are still there and ready to handle releases
> > if needed. It's also being discussed to be more formal for the CRA
> > regulations right now - "stewards" of the software put on the market should
> > be responsible for handling security issues in a timely manner). The act of
> > release with 3 +1s of PMC is a legal act of the Foundation placing software
> > on the market and we can't make it "unhappen".
> >
> >
> >> B.
> >>
> >> Sent from my iPhone
> >>
> >> > On 26 Oct 2023, at 20:20, Jarek Potiuk  wrote:
> >> >
> >> > Hello Airflow community,
> >> >
> >> > How do we feel about removing the Qubole provider completely (leaving
> >> only
> >> > old releases in PyPI?
> >> >
> >> > On September 1 2023 (
> >> > https://lists.apache.org/thread/p394d7w7gc7lz61g7qdthl96bc9kprxh) the
> >> > Qubole operator ws suspended.
> >> >
> >> > Due to the reasons described in the thread (Qubole got acquired and the
> >> > service is generally abandoned) there is pretty much no chance for it
> >> to be
> >> > resumed.
> >> >
> >> > I'd love to remove it completely and introduce a process where we can do
> >> > similar things in the future for other providers if we decide to do so.
> >> >
> >> > I checked in the Attic project in the ASF (this is where abandoned
> >> project
> >> > of the ASF get moved to) and it seems that just removing part of the
> >> > project that has an active PMC is not going through attic
> >> > https://issues.apache.org/jira/browse/ATTIC-218 . We are free to
> >> define our
> >> > rules for that and I would like to use the opportunity to hash it out
> >> and
> >> > propose a process (similarly to suspension) and criteria to remove
> >> > providers from being maintained by us.
> >> >
> >> > It's more than suspension. We will completely stop updating the related
> >> > code (right now some automated changes can still be applied and
> >> suspended
> >> > providers can be resumed with simple PR). I would like to have the
> >> "next"
> >> > step after "suspending" - removal.
> >> >
> >> > Roughly - we  send PROPOSAL followed by VOTE (or immediately VOTE in
> >> > obvious cases) with justification, PMC members only have the binding
> >> votes
> >> > (similar as for releases).
> >> >
> >> > Only git history will remain - all the rest will be removed (including
> >> > 

Re: Airflow Docs Development Issues

2023-10-27 Thread Amogh Desai
Yeah, excellent team!

Utkarsh, note that I will be on vacation from today till Nov 6. I should be
able to help after that :)

Even during this period I will have Slack on mobile, so I can help
asynchronously if needed.


Thanks & Regards,
Amogh Desai

On Fri, Oct 27, 2023, 14:22 Jarek Potiuk  wrote:

> Whoa. Dream team :) .
>
> And of course - if you need any of my input on how it works or get
> stuck with something - feel absolutely free to ping me on Slack. While I
> have not developed the build process I probably tinkered and touched it in
> the past in many places and reverse engineered some parts of it so I might
> save you some of the head-scratching.
>
> On Fri, Oct 27, 2023 at 6:35 AM Bowrna Prabhakaran 
> wrote:
>
> > I would also like to join in this effort.
> >
> >
> > On Fri, Oct 27, 2023 at 8:19 AM Ryan Hatter
> >  wrote:
> >
> > > I'm happy to work on this alongside Utkarsh, Amogh Desai, and Aritra
> Basu
> > > :)
> > > Some thoughts on Utkarsh's proposal (and what he and I have been
> > > discussing offline):
> > >
> > >1. I think we should start with enabling Hugo in the documentation
> > build
> > >process for new releases
> > >   1. This may need to include a way to serve html from S3, as I
> think
> > >   we'll need to build each version for each package
> (apache-airflow &
> > >   providers). If we do this each time, the amount of docs built
> will
> > > grow
> > >   exponentially and we might find ourselves again in a similar
> > > situation
> > >   2. Once this is done, all new docs will be buildable without
> > storing
> > >   the raw html locally
> > >   3. I think a good example (at least for a lot of this process) is
> > how
> > >   the Apache Iceberg docs repo
> > >    is built.
> > >2. Once that's complete, we can implement a process to archive the
> raw
> > >.rst files for docs older than 18 months to S3 along with a way to
> > > download
> > >and build those in the airflow-site repo.
> > >   1. This will result in temporarily having two separate builds:
> > >  1. One for archived docs like we do now
> > >  2. And the build process for new docs developed in (1)
> > >   2. After 18 months, all of the archived docs will be out of the
> > repo,
> > >   and we can move forward with only the build process developed in
> > (1)
> > >
> > >
> > > On Fri, Oct 27, 2023 at 7:55 AM utkarsh sharma  >
> > > wrote:
> > >
> > > > That sounds good, I'll start with creating smaller tickets for the
> > above
> > > > task, which I intend to do by the end of this week.
> > > >
> > > > Thanks,
> > > > Utkarsh Sharma
> > > >
> > > >
> > > > On Thu, Oct 26, 2023 at 4:16 PM Aritra Basu <
> aritrabasu1...@gmail.com>
> > > > wrote:
> > > >
> > > > > Yup, sounds good to me let's go for it!
> > > > >
> > > > > --
> > > > > Regards,
> > > > > Aritra Basu
> > > > >
> > > > > On Thu, Oct 26, 2023, 1:47 PM Amogh Desai <
> amoghdesai@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > > Go ahead, Utkarsh. It would be nice to work with you on this.
> > > > > >
> > > > > > Thanks,
> > > > > > Amogh Desai
> > > > > >
> > > > > > On Wed, Oct 25, 2023 at 10:02 PM Jarek Potiuk 
> > > > wrote:
> > > > > >
> > > > > > > > +1. I think no-one will object to improving the current situation
> > :)
> > > > > > >
> > > > > > > On Wed, Oct 25, 2023 at 5:02 PM utkarsh sharma <
> > > > utkarshar...@gmail.com
> > > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hey everyone,
> > > > > > > >
> > > > > > > > If we have a consensus on the suggestions in my previous
> > email, I
> > > > > would
> > > > > > > > like to subdivide the task into smaller tickets and
> distribute
> > > them
> > > > > > among
> > > > > > > > Aritra Basu, Amogh Desai, and myself.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Utkarsh Sharma
> > > > > > > >
> > > > > > > > On Tue, Oct 24, 2023 at 10:12 PM Jarek Potiuk <
> > ja...@potiuk.com>
> > > > > > wrote:
> > > > > > > >
> > > > > > > > > Those look like great ideas.
> > > > > > > > >
> > > > > > > > > On Tue, Oct 24, 2023 at 4:23 PM utkarsh sharma <
> > > > > > utkarshar...@gmail.com
> > > > > > > >
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Just forgot to mention in my previous mail, that I'm
> > > suggesting
> > > > > the
> > > > > > > > above
> > > > > > > > > > changes since the storage is not the primary concern
> right
> > > now
> > > > > but
> > > > > > > I'm
> > > > > > > > > > happy to contribute either way. :)
> > > > > > > > > >
> > > > > > > > > > On Tue, Oct 24, 2023 at 7:43 PM utkarsh sharma <
> > > > > > > utkarshar...@gmail.com
> > > > > > > > >
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hey everyone,
> > > > > > > > > > >
> > > > > > > > > > > I have a couple of tasks in mind, that might aid in
> > > reducing
> > > > > the
> > > > > > > > > efforts
> > > > > > > > > > > while working with d

Re: [DISCUSS] Removing Qubole provider (and adding removal process)

2023-10-27 Thread Jarek Potiuk
> I think in the case of Qubole it is pretty easy to remove it from the
provider codebase. I'm pretty sure that almost no one even noticed this
removal.

Yeah. Agree. This one is pretty "obvious"; that's why I would like to create
a process for doing it along the way so that in the future if we have other
non-obvious cases we can just "follow the process".

BTW, that would be really great to have, as with it we will have a complete
"life-cycle" of the providers.

* we know how we approve the new ones
* we know how we release new versions (Elad is amazing to release them
every few weeks)
* we know how we maintain back-compatibility (regularly bumping the
min-version of Airflow)
* we know how to involve stakeholders to system-test them and to make them
work in a stable way when they connect to external services (Amazon works,
Google in progress, Open API and others promised by Astronomer)
* we know how to involve stakeholders with mixed-governance in case they
want older releases (never happened yet but we know how to do it)
* we know how to suspend and resume them when they prove to be problematic
and pass resolution of that to external stakeholders (happened with Yandex
- both suspend and resume)
* we (will finally) know how to retire them when we decide we do not want
to maintain them - except security fixes - any more

That will pretty much complete our process of "life-cycle" management for
providers.

J.




On Thu, Oct 26, 2023 at 10:00 PM Jarek Potiuk  wrote:

>
>> I suggest also removing it from pypi for security reasons. If there is a
>> security issue with it then the issue will remain with us.
>>
>>
> I am quite sure we still have to handle security issues if someone finds
> them. releasing such a provider will still be possible using the tag/branch
> and we will be obliged to release a new version IF it is still used and a
> security issue is found.
> Removal of packages from PyPI does not remove our obligation to fix a
> security problem. We also have source packages released via
> downloads.apache.org and archives.apache.org - and those we can't remove
> either.
>
> I think this is one of the obligations of the Foundation being the
> "steward" of software it releases  - that's why there is also the attic PMC
> to manage projects that PMC is unable to support (projects are moved to
> attic when they fail a roll call from the board with less than 2 PMC
> members confirming that they are still there and ready to handle releases
> if needed. It's also being discussed to be more formal for the CRA
> regulations right now - "stewards" of the software put on the market should
> be responsible for handling security issues in a timely manner). The act of
> release with 3 +1s of PMC is a legal act of the Foundation placing software
> on the market and we can't make it "unhappen".
>
>
>> B.
>>
>> Sent from my iPhone
>>
>> > On 26 Oct 2023, at 20:20, Jarek Potiuk  wrote:
>> >
>> > Hello Airflow community,
>> >
>> > How do we feel about removing the Qubole provider completely (leaving
>> only
>> > old releases in PyPI)?
>> >
>> > On September 1 2023 (
>> > https://lists.apache.org/thread/p394d7w7gc7lz61g7qdthl96bc9kprxh) the
>> > Qubole operator was suspended.
>> >
>> > Due to the reasons described in the thread (Qubole got acquired and the
>> > service is generally abandoned) there is pretty much no chance for it
>> to be
>> > resumed.
>> >
>> > I'd love to remove it completely and introduce a process where we can do
>> > similar things in the future for other providers if we decide to do so.
>> >
>> > I checked in the Attic project in the ASF (this is where abandoned
>> project
>> > of the ASF get moved to) and it seems that just removing part of the
>> > project that has an active PMC is not going through attic
>> > https://issues.apache.org/jira/browse/ATTIC-218 . We are free to
>> define our
>> > rules for that and I would like to use the opportunity to hash it out
>> and
>> > propose a process (similarly to suspension) and criteria to remove
>> > providers from being maintained by us.
>> >
>> > It's more than suspension. We will completely stop updating the related
>> > code (right now some automated changes can still be applied and
>> suspended
>> > providers can be resumed with simple PR). I would like to have the
>> "next"
>> > step after "suspending" - removal.
>> >
>> > Roughly - we  send PROPOSAL followed by VOTE (or immediately VOTE in
>> > obvious cases) with justification, PMC members only have the binding
>> votes
>> > (similar as for releases).
>> >
>> > Only git history will remain - all the rest will be removed (including
>> > extra) - no traces of the provider remain in the next MINOR release
>> (2.8.0
>> > in the case of Qubole). The provider will still be in PyPI and
>> historical
>> > releases will be in https://archive.apache.org . If someone would like
>> to
>> > bring back such a provider, it should go through the same process as a
>> new
>> > provider (voting/consensus). An

Re: Airflow Docs Development Issues

2023-10-27 Thread Jarek Potiuk
Whoa. Dream team :) .

And of course - if you need any of my input on how it works or get
stuck with something - feel absolutely free to ping me on Slack. While I
have not developed the build process I probably tinkered and touched it in
the past in many places and reverse engineered some parts of it so I might
save you some of the head-scratching.

On Fri, Oct 27, 2023 at 6:35 AM Bowrna Prabhakaran 
wrote:

> I would also like to join in this effort.
>
>
> On Fri, Oct 27, 2023 at 8:19 AM Ryan Hatter
>  wrote:
>
> > I'm happy to work on this alongside Utkarsh, Amogh Desai, and Aritra Basu
> > :)
> > Some thoughts on Utkarsh's proposal (and what he and I have been
> > discussing offline):
> >
> >1. I think we should start with enabling Hugo in the documentation
> build
> >process for new releases
> >   1. This may need to include a way to serve html from S3, as I think
> >   we'll need to build each version for each package (apache-airflow &
> >   providers). If we do this each time, the amount of docs built will
> > grow
> >   exponentially and we might find ourselves again in a similar
> > situation
> >   2. Once this is done, all new docs will be buildable without
> storing
> >   the raw html locally
> >   3. I think a good example (at least for a lot of this process) is
> how
> >   the Apache Iceberg docs repo
> >    is built.
> >2. Once that's complete, we can implement a process to archive the raw
> >.rst files for docs older than 18 months to S3 along with a way to
> > download
> >and build those in the airflow-site repo.
> >   1. This will result in temporarily having two separate builds:
> >  1. One for archived docs like we do now
> >  2. And the build process for new docs developed in (1)
> >   2. After 18 months, all of the archived docs will be out of the
> repo,
> >   and we can move forward with only the build process developed in
> (1)
> >
> >
> > On Fri, Oct 27, 2023 at 7:55 AM utkarsh sharma 
> > wrote:
> >
> > > That sounds good, I'll start with creating smaller tickets for the
> above
> > > task, which I intend to do by the end of this week.
> > >
> > > Thanks,
> > > Utkarsh Sharma
> > >
> > >
> > > On Thu, Oct 26, 2023 at 4:16 PM Aritra Basu 
> > > wrote:
> > >
> > > > Yup, sounds good to me let's go for it!
> > > >
> > > > --
> > > > Regards,
> > > > Aritra Basu
> > > >
> > > > On Thu, Oct 26, 2023, 1:47 PM Amogh Desai 
> > > > wrote:
> > > >
> > > > > > Go ahead, Utkarsh. It would be nice to work with you on this.
> > > > >
> > > > > Thanks,
> > > > > Amogh Desai
> > > > >
> > > > > On Wed, Oct 25, 2023 at 10:02 PM Jarek Potiuk 
> > > wrote:
> > > > >
> > > > > > > +1. I think no-one will object to improving the current situation
> :)
> > > > > >
> > > > > > On Wed, Oct 25, 2023 at 5:02 PM utkarsh sharma <
> > > utkarshar...@gmail.com
> > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Hey everyone,
> > > > > > >
> > > > > > > If we have a consensus on the suggestions in my previous
> email, I
> > > > would
> > > > > > > like to subdivide the task into smaller tickets and distribute
> > them
> > > > > among
> > > > > > > Aritra Basu, Amogh Desai, and myself.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Utkarsh Sharma
> > > > > > >
> > > > > > > On Tue, Oct 24, 2023 at 10:12 PM Jarek Potiuk <
> ja...@potiuk.com>
> > > > > wrote:
> > > > > > >
> > > > > > > > Those look like great ideas.
> > > > > > > >
> > > > > > > > On Tue, Oct 24, 2023 at 4:23 PM utkarsh sharma <
> > > > > utkarshar...@gmail.com
> > > > > > >
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Just forgot to mention in my previous mail, that I'm
> > suggesting
> > > > the
> > > > > > > above
> > > > > > > > > changes since the storage is not the primary concern right
> > now
> > > > but
> > > > > > I'm
> > > > > > > > > happy to contribute either way. :)
> > > > > > > > >
> > > > > > > > > On Tue, Oct 24, 2023 at 7:43 PM utkarsh sharma <
> > > > > > utkarshar...@gmail.com
> > > > > > > >
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hey everyone,
> > > > > > > > > >
> > > > > > > > > > I have a couple of tasks in mind, that might aid in
> > reducing
> > > > the
> > > > > > > > efforts
> > > > > > > > > > while working with docs. Right now tasks listed below are
> > > > > difficult
> > > > > > > to
> > > > > > > > > > achieve.
> > > > > > > > > >
> > > > > > > > > > 1. Adding a warning based on a specific provider/version
> > of a
> > > > > > > > > > provider/range of providers. Which was also the task that
> > > Ryan
> > > > > was
> > > > > > > > > working
> > > > > > > > > > on.
> > > > > > > > > > 2. Altering a page layout or CSS for a specific provider.
> > > > > > > > > >
> > > > > > > > > > The issue while trying to achieve the above tasks is
> > because
> > > of
> > > > > the
> > > > > > > > > > pre-prepared static files we get as a final product of
>