A temporary block should be fine as long as people can still create discussions, from which committers can help create issues where reasonable.
Avi

On Wed, Jan 22, 2025 at 12:57 PM Jarek Potiuk <ja...@potiuk.com> wrote:

> - Iceberg dev to not flood them :) (in bcc:)
>
> It looks like the flood had been somehow flood-gated - no similar report for the last 4 hours or so.
>
> I also started to receive confirmation from Github that they are looking at the reports, so likely we do not have to do any action now, but I think we can turn it into deciding about "future" reactions when something like this happens, so that we can potentially react quickly
>
> What do others think ? Should we react and block new users from interacting with Airflow repo if we see it happening again? Maybe temporarily - for a day or two initially - after reporting some initial reports? Does it sound reasonable?
>
> J.
>
> On Wed, Jan 22, 2025 at 11:35 AM Pavankumar Gopidesu <gopidesupa...@gmail.com> wrote:
>
> > +1 from me.
> >
> > It looks started yesterday, I feel we may get many of these tickets when new users starts testing those AI agents.
> >
> > Regards,
> > Pavan Kumar
> >
> > On Wed, Jan 22, 2025, 10:27 Jarek Potiuk <ja...@potiuk.com> wrote:
> >
> > > We continue getting new issues - and more of them are by "new users" - created just an hour or so ago.
> > >
> > > Apparently Github has a way to temporarily limit interactions with the repo for new users - see this screenshot:
> > >
> > > https://ibb.co/WWsr7RB
> > >
> > > And I think I'd be for enabling it - we will need an INFRA ticket for that, because that's not currently configurable via .asf.yaml - and maybe if Iceberg would like to do it as well, we can create a single ticket for that.
> > >
> > > There is a new framework coming to enable faster implementation and testing of .asf.yaml features (this was discussed at the latest roundtable) - and we can contribute a feature to add it in .asf.yaml soon, but temporarily we might want to ask INFRA to help.
> > >
> > > WDYT?
> > > If I hear a few voices for +1 and no strong opposition I will open a JIRA ticket (and would love to hear what Iceberg friends of ours think as well :)
> > >
> > > J.
> > >
> > > On Wed, Jan 22, 2025 at 10:36 AM Jarek Potiuk <ja...@potiuk.com> wrote:
> > >
> > > > Yeah. just closed this one. The pattern where those are coming at the same time as two unrelated issues to both iceberg and airflow is very ... strange
> > > >
> > > > On Wed, Jan 22, 2025 at 10:35 AM Elad Kalif <elad...@apache.org> wrote:
> > > >
> > > >> Another one who also opened issues in Airflow and Iceberg
> > > >> https://github.com/apache/iceberg/issues/12034
> > > >> https://github.com/apache/airflow/issues/45920
> > > >>
> > > >> Same "mistake" with the # Title.
> > > >> All of these seem to come with accounts opened months ago, with some minor traffic to their own forks so they would appear legit to Github
> > > >>
> > > >> On Wed, Jan 22, 2025 at 11:23 AM Jarek Potiuk <ja...@potiuk.com> wrote:
> > > >>
> > > >> > Yeah. Again - my guess is that those are "Agentic AI" trials, where someone is deploying fake "agent" accounts acting as "people in the repo would". That's a bit terrifying if this is not contained.
> > > >> >
> > > >> > On Wed, Jan 22, 2025 at 9:52 AM Fokko Driesprong <fo...@apache.org> wrote:
> > > >> >
> > > >> > > That's quite a few! I also noticed that they sometimes self-close the issue (eg here <https://github.com/apache/iceberg/issues/12032>). Closed after 1 minute, but still flooding my mailbox :D
> > > >> > >
> > > >> > > So you might have more such issues now than you think.
> > > >> > >
> > > >> > >
> > > >> > > Yes, that's probably the case, still going through my mailbox.
> > > >> > >
> > > >> > > On Wed, Jan 22, 2025 at 09:49, Jarek Potiuk <ja...@potiuk.com> wrote:
> > > >> > >
> > > >> > > > Example case:
> > > >> > > >
> > > >> > > > * https://github.com/apache/airflow/issues/45904 - airflow
> > > >> > > > * https://github.com/apache/iceberg/issues/12034 - iceberg
> > > >> > > >
> > > >> > > > Both issues are generic and useless and bring 0 value except noise.
> > > >> > > >
> > > >> > > > Interesting thing is that many of those users, if you look at their history - created a similar number of issues in iceberg and airflow about the same time. So you might have more such issues now than you think.
> > > >> > > >
> > > >> > > > J.
> > > >> > > >
> > > >> > > > On Wed, Jan 22, 2025 at 9:41 AM Jarek Potiuk <ja...@potiuk.com> wrote:
> > > >> > > >
> > > >> > > >> I have not counted all of them. There are quite a bit too many - and other people closed some of them as well. I got a very rudimentary check and applied "AI Spam" label to some of the issues
> > > >> > > >> https://github.com/apache/airflow/issues?q=is%3Aissue%20state%3Aclosed%20AI%20label%3A%22AI%20Spam%22
> > > >> > > >> -> so we have had at least 25 such issues in the last 12 hours.
> > > >> > > >>
> > > >> > > >> > we also want to make sure that we don't accidentally close issues that don't come from a bot, but just a newcomer to the project.
> > > >> > > >>
> > > >> > > >> Those reports and patterns look very, very human-like - they are reported infrequently (per user), the description and text seem legitimate, but they are wordy and just reading and understanding that those are completely useless takes a lot of time. This is part of the problem, that it takes a lot of energy and time to determine if those are valid or not - and with such a rate, it's not sustainable just to analyze whether they are good or bad.
> > > >> > > >>
> > > >> > > >> J.
> > > >> > > >>
> > > >> > > >> On Wed, Jan 22, 2025 at 9:23 AM Fokko Driesprong <fo...@apache.org> wrote:
> > > >> > > >>
> > > >> > > >>> Hey Jarek,
> > > >> > > >>>
> > > >> > > >>> Thanks for bringing this to our attention. When you talk about flooding, how many are we talking about? I see some suspicious issues (eg, here <https://github.com/apache/iceberg/issues/12039>), but not many. I hope this will come to a halt soon because it is all additional work, and we also want to make sure that we don't accidentally close issues that don't come from a bot, but just a newcomer to the project.
> > > >> > > >>>
> > > >> > > >>> Kind regards,
> > > >> > > >>> Fokko
> > > >> > > >>>
> > > >> > > >>> On Wed, Jan 22, 2025 at 09:00, Jarek Potiuk <ja...@potiuk.com> wrote:
> > > >> > > >>>
> > > >> > > >>> > Hey Iceberg community, And Airflow community too.
> > > >> > > >>> >
> > > >> > > >>> > As of yesterday Airflow repo is literally flooded with a number of issues that look almost good, except they are clearly AI generated and make no sense or repeat content from other issues. We noticed that the users who create a lot of the "spam AI" issues that are created in Airflow are also creating similar issues for Iceberg.
> > > >> > > >>> >
> > > >> > > >>> > We got to the point that we are closing and reporting such issues to GitHub and we are blocking all such users without spending too much time on it with messages similar to this:
> > > >> > > >>> >
> > > >> > > >>> > ```
> > > >> > > >>> > This looks totally AI-generated. useless issue report that brings no value and makes no sense. We are generally blocking users that sends a lot of spam AI reports generated by bots.. as of yesterday so we will report your account and block it unless:
> > > >> > > >>> >
> > > >> > > >>> > a) you explain how you generated reports
> > > >> > > >>> > b) prove you are human
> > > >> > > >>> > c) explain why you created the issue
> > > >> > > >>> > ```
> > > >> > > >>> >
> > > >> > > >>> > My guess is that some company released and is testing an "agentic AI" that is "github-targeted" - where people can run the AI agents on their behalf. It does not look like regular bot-spam.
> > > >> > > >>> > I think we should all generally crowd-source reporting it to Github - and hopefully they will find a way to battle those without involving maintainers.
> > > >> > > >>> >
> > > >> > > >>> > I hope it will not last too long.
> > > >> > > >>> >
> > > >> > > >>> > J.
> > > >> > > >>> >
> > > >> > > >>> > ---------- Forwarded message ---------
> > > >> > > >>> > From: Jarek Potiuk <ja...@potiuk.com>
> > > >> > > >>> > Date: Wed, Jan 22, 2025 at 8:12 AM
> > > >> > > >>> > Subject: Re: Very strange (AI generated) issues
> > > >> > > >>> > To: <dev@airflow.apache.org>
> > > >> > > >>> >
> > > >> > > >>> > You can also report it directly from the issue (... at the top and "report content")
> > > >> > > >>> >
> > > >> > > >>> > On Wed, Jan 22, 2025 at 7:46 AM Amogh Desai <amoghdesai....@gmail.com> wrote:
> > > >> > > >>> >
> > > >> > > >>> >> Elad, I just managed to report this user.
> > > >> > > >>> >>
> > > >> > > >>> >> This is how it's done:
> > > >> > > >>> >>
> > > >> > > >>> >> https://docs.github.com/en/communities/maintaining-your-safety-on-github/reporting-abuse-or-spam#reporting-a-user
> > > >> > > >>> >>
> > > >> > > >>> >> Thanks & Regards,
> > > >> > > >>> >> Amogh Desai
> > > >> > > >>> >>
> > > >> > > >>> >> On Wed, Jan 22, 2025 at 12:05 PM Elad Kalif <elad...@apache.org> wrote:
> > > >> > > >>> >>
> > > >> > > >>> >> > There are several reports from this user
> > > >> > > >>> >> >
> > > >> > > >>> >> > https://github.com/atharv9017
> > > >> > > >>> >> >
> > > >> > > >>> >> > I didn't find a way to report the user account to github.
> > > >> > > >>> >> >
> > > >> > > >>> >> > On Wed, Jan 22, 2025, 06:41, Pavankumar Gopidesu <gopidesupa...@gmail.com> wrote:
> > > >> > > >>> >> >
> > > >> > > >>> >> > > Yes, still issues are coming.
> > > >> > > >>> >> > >
> > > >> > > >>> >> > > Regards,
> > > >> > > >>> >> > > Pavan
> > > >> > > >>> >> > >
> > > >> > > >>> >> > > On Wed, Jan 22, 2025 at 4:35 AM Amogh Desai <amoghdesai....@gmail.com> wrote:
> > > >> > > >>> >> > >
> > > >> > > >>> >> > > > I saw a couple of such SPAM issues too.
> > > >> > > >>> >> > > >
> > > >> > > >>> >> > > > I also recall some SPAM comments on pull requests as well, so if any contributor sees any such SPAM message, please report it on Slack so that we can delete it and report it.
> > > >> > > >>> >> > > >
> > > >> > > >>> >> > > > Thanks & Regards,
> > > >> > > >>> >> > > > Amogh Desai
> > > >> > > >>> >> > > >
> > > >> > > >>> >> > > > On Wed, Jan 22, 2025 at 8:45 AM Zhe You Liu <zhu424....@gmail.com> wrote:
> > > >> > > >>> >> > > >
> > > >> > > >>> >> > > > > I came across another strange issue: https://github.com/apache/airflow/issues/45837. It appears to be a copy-paste of https://github.com/apache/airflow/issues/45661 with just the issue title changed.
> > > >> > > >>> >> > > > >
> > > >> > > >>> >> > > > > On Wed, Jan 22, 2025 at 6:50 AM Jarek Potiuk <ja...@potiuk.com> wrote:
> > > >> > > >>> >> > > > >
> > > >> > > >>> >> > > > > > I even got to this stage:
> > > >> > > >>> >> > > > > >
> > > >> > > >>> >> > > > > > > We've received a few new tickets from your account recently. If you'd like to add additional information you can add a comment to an existing ticket, or wait a few minutes before opening a new ticket.
> > > >> > > >>> >> > > > > >
> > > >> > > >>> >> > > > > > On Tue, Jan 21, 2025 at 11:49 PM Jarek Potiuk <ja...@potiuk.com> wrote:
> > > >> > > >>> >> > > > > >
> > > >> > > >>> >> > > > > > > There are few more that I still saw after sending it. There is something going on bypassing GitHub filters. I hope they will manage to do something about it
> > > >> > > >>> >> > > > > > >
> > > >> > > >>> >> > > > > > > Last one is https://github.com/apache/airflow/issues/45867
> > > >> > > >>> >> > > > > > >
> > > >> > > >>> >> > > > > > > On Tue, Jan 21, 2025 at 11:46 PM Vikram Koka <vik...@astronomer.io.invalid> wrote:
> > > >> > > >>> >> > > > > > >
> > > >> > > >>> >> > > > > > >> Agreed.
> > > >> > > >>> >> > > > > > >>
> > > >> > > >>> >> > > > > > >> Thanks for flagging these Jarek!
> > > >> > > >>> >> > > > > > >>
> > > >> > > >>> >> > > > > > >> On Tue, Jan 21, 2025 at 2:34 PM Jarek Potiuk <ja...@potiuk.com> wrote:
> > > >> > > >>> >> > > > > > >>
> > > >> > > >>> >> > > > > > >> > Seems that we have a flood of AI generated feature requests for Airflow, The issues look somewhat legitimate, with somewhat related content, but they are wordy and make no sense when you read them. Some examples:
> > > >> > > >>> >> > > > > > >> >
> > > >> > > >>> >> > > > > > >> > * https://github.com/apache/airflow/issues/45858
> > > >> > > >>> >> > > > > > >> > * https://github.com/apache/airflow/issues/45856
> > > >> > > >>> >> > > > > > >> > * https://github.com/apache/airflow/issues/45854
> > > >> > > >>> >> > > > > > >> >
> > > >> > > >>> >> > > > > > >> > All of them done by accounts with short history in GH and not much activity before
> > > >> > > >>> >> > > > > > >> >
> > > >> > > >>> >> > > > > > >> > There were quite a few more.
> > > >> > > >>> >> > > > > > >> >
> > > >> > > >>> >> > > > > > >> > I suggest we close such issues AND report authors to GitHub - hopefully we can help to battle the AI-generated traffic flood.
> > > >> > > >>> >> > > > > > >> >
> > > >> > > >>> >> > > > > > >> > J.