Re: [PATCH] parser: Unmangle From: headers that have been mangled for DMARC purposes
Jeff King writes: > This might provide an alternate solution (or vice versa). I kind of like > this one better in that it doesn't require the sender to do anything > differently (but it may be less robust, as it assumes the receiver > reliably de-mangling). I share the assessment. I also feel that relying on Reply-To: would make the result a lot less reliable (I do not have much problem with the use of X-Original-Sender, though). ___ Patchwork mailing list Patchwork@lists.ozlabs.org https://lists.ozlabs.org/listinfo/patchwork
Re: [PATCH] parser: Unmangle From: headers that have been mangled for DMARC purposes
On Thu, Oct 10, 2019 at 12:41:32PM -0700, Jonathan Nieder wrote: > > Add support for using the X-Original-Sender or Reply-To headers, as used by > > Google Groups and Mailman respectively, to unmangle the From header when > > necessary. > [...] > Interesting! I'm cc-ing the Git mailing list in case "git am" might > wnat to learn the same support. Neat. There was discussion on a similar issue recently in: https://public-inbox.org/git/305577c2-709a-b632-4056-658277117...@redhat.com/ where a possible solution was to get senders to use in-body From headers even when sending their own patches. This might provide an alternate solution (or vice versa). I kind of like this one better in that it doesn't require the sender to do anything differently (but it may be less robust, as it assumes the receiver reliably de-mangling). -Peff ___ Patchwork mailing list Patchwork@lists.ozlabs.org https://lists.ozlabs.org/listinfo/patchwork
Re: [PATCH] parser: Unmangle From: headers that have been mangled for DMARC purposes
On 11/10/19 3:36 pm, Andrew Donnellan wrote: It would be nice if Mailman could adopt X-Original-Sender too. As it is, (which I have gone ahead and reported as https://gitlab.com/mailman/mailman/issues/641) -- Andrew Donnellan OzLabs, ADL Canberra a...@linux.ibm.com IBM Australia Limited ___ Patchwork mailing list Patchwork@lists.ozlabs.org https://lists.ozlabs.org/listinfo/patchwork
Re: [PATCH] parser: Unmangle From: headers that have been mangled for DMARC purposes
On 11/10/19 3:29 pm, Junio C Hamano wrote: Jeff King writes: This might provide an alternate solution (or vice versa). I kind of like this one better in that it doesn't require the sender to do anything differently (but it may be less robust, as it assumes the receiver reliably de-mangling). I share the assessment. I also feel that relying on Reply-To: would make the result a lot less reliable (I do not have much problem with the use of X-Original-Sender, though). It would be nice if Mailman could adopt X-Original-Sender too. As it is, it adds the original sender to Reply-To, but in some cases (where the list is set as reply-to-list, or has a custom reply-to setting) it adds to Cc instead. (In the patch that started this thread, I match the name from the munged From field against the name in Reply-To/Cc for the case where there's multiple Reply-Tos/Ccs.) For the Patchwork use case, I'm quite okay with accepting the risk of using Reply-To, as the alternative is worse, the corner cases are rare, and ultimately a maintainer can still fix the odd stuff-up before applying the patch. -- Andrew Donnellan OzLabs, ADL Canberra a...@linux.ibm.com IBM Australia Limited ___ Patchwork mailing list Patchwork@lists.ozlabs.org https://lists.ozlabs.org/listinfo/patchwork
Re: [PATCH] parser: Unmangle From: headers that have been mangled for DMARC purposes
On Fri, Oct 11, 2019 at 10:01:23AM +1100, Andrew Donnellan wrote: > > This might provide an alternate solution (or vice versa). I kind of like > > this one better in that it doesn't require the sender to do anything > > differently (but it may be less robust, as it assumes the receiver > > reliably de-mangling). > > Yep, it's less robust - but OTOH there's always a long tail of users stuck > on old versions of git for whatever reason and having some logic to detect > DMARC munging may thus still be useful. I think the two features would work together nicely out of the box: if somebody has an in-body from, we'd respect that before looking at email headers anyway. So senders who do the extra work will be covered, and checking other email headers would just improve the fallback case. -Peff ___ Patchwork mailing list Patchwork@lists.ozlabs.org https://lists.ozlabs.org/listinfo/patchwork
Re: [PATCH] parser: Unmangle From: headers that have been mangled for DMARC purposes
On Fri, 11 Oct 2019 10:16:40 +1100 Daniel Axtens wrote: > > Andrew Donnellan writes: > > > On 11/10/19 6:41 am, Jonathan Nieder wrote: > >> Interesting! I'm cc-ing the Git mailing list in case "git am" might > >> wnat to learn the same support. > > Argh, that reminds me... this patch only rewrites the name and email > > that is recorded as the Patchwork submitter, it doesn't actually rewrite > > the From: header when you fetch the mbox off Patchwork. > > > > Part of me would really like to keep Patchwork mboxes as close as > > possible to the mbox we ingested, but on the other hand it means the > > mangled address is still going to land in the git repo at the end... so > > I should probably just change it? > > Yes, I think change it. If you're worried, stash the original one in the > headers. Or add a From: line to the start of the body (if one is not already there). -- Cheers, Stephen Rothwell pgpATtDfG0Tsi.pgp Description: OpenPGP digital signature ___ Patchwork mailing list Patchwork@lists.ozlabs.org https://lists.ozlabs.org/listinfo/patchwork
Re: [PATCH] parser: Unmangle From: headers that have been mangled for DMARC purposes
Andrew Donnellan writes: > On 11/10/19 6:41 am, Jonathan Nieder wrote: >> Interesting! I'm cc-ing the Git mailing list in case "git am" might >> wnat to learn the same support. > Argh, that reminds me... this patch only rewrites the name and email > that is recorded as the Patchwork submitter, it doesn't actually rewrite > the From: header when you fetch the mbox off Patchwork. > > Part of me would really like to keep Patchwork mboxes as close as > possible to the mbox we ingested, but on the other hand it means the > mangled address is still going to land in the git repo at the end... so > I should probably just change it? Yes, I think change it. If you're worried, stash the original one in the headers. > > -- > Andrew Donnellan OzLabs, ADL Canberra > a...@linux.ibm.com IBM Australia Limited > > ___ > Patchwork mailing list > Patchwork@lists.ozlabs.org > https://lists.ozlabs.org/listinfo/patchwork ___ Patchwork mailing list Patchwork@lists.ozlabs.org https://lists.ozlabs.org/listinfo/patchwork
Re: [PATCH] parser: Unmangle From: headers that have been mangled for DMARC purposes
On 11/10/19 9:54 am, Jeff King wrote: Neat. There was discussion on a similar issue recently in: https://public-inbox.org/git/305577c2-709a-b632-4056-658277117...@redhat.com/ where a possible solution was to get senders to use in-body From headers even when sending their own patches. I think that's a good idea. This might provide an alternate solution (or vice versa). I kind of like this one better in that it doesn't require the sender to do anything differently (but it may be less robust, as it assumes the receiver reliably de-mangling). Yep, it's less robust - but OTOH there's always a long tail of users stuck on old versions of git for whatever reason and having some logic to detect DMARC munging may thus still be useful. -- Andrew Donnellan OzLabs, ADL Canberra a...@linux.ibm.com IBM Australia Limited ___ Patchwork mailing list Patchwork@lists.ozlabs.org https://lists.ozlabs.org/listinfo/patchwork
Re: RFE: use patchwork to submit a patch
Jonathan Nieder wrote: > Eric Wong wrote: > > Konstantin Ryabitsev wrote: > > >> This is actually really fast if you already have a local copy of the > >> repository with most objects. Try this yourself if you have > >> torvalds/linux.git locally: > >> > >> git clone --bare -s torvalds/linux.git test > > > > Yep, -s (--shared) makes cloning really cheap. One of my goals is to get > > > > git clone -s https://example.com/torvalds/linux.git > > > > and > > > > git clone -s https://example.com/torvalds/linux.git/clone.bundle > > > > working. That would make it easier for new contributors to > > setup lightweight clones and pull in history on an as-needed > > basis w/o hacks like shallow cloning. > > Does "git clone --filter=blob:none" do what you're looking for? Oops, haven't seen that new feature :x And haven't tried, pu @ 8d9027fa59b943db96a8a9090ec31d7f0f935596 is broken due to conflicts with hashmap (probably won't have time for a bit to look at it). What I'm hoping to do with "git clone -s" would be client-only and compatible with existing HTTP servers that run "git update-server-info". ___ Patchwork mailing list Patchwork@lists.ozlabs.org https://lists.ozlabs.org/listinfo/patchwork
Re: RFE: use patchwork to submit a patch
Konstantin Ryabitsev wrote: > This is actually really fast if you already have a local copy of the > repository with most objects. Try this yourself if you have > torvalds/linux.git locally: > > git clone --bare -s torvalds/linux.git test Yep, -s (--shared) makes cloning really cheap. One of my goals is to get git clone -s https://example.com/torvalds/linux.git and git clone -s https://example.com/torvalds/linux.git/clone.bundle working. That would make it easier for new contributors to setup lightweight clones and pull in history on an as-needed basis w/o hacks like shallow cloning. ___ Patchwork mailing list Patchwork@lists.ozlabs.org https://lists.ozlabs.org/listinfo/patchwork
Re: RFE: use patchwork to submit a patch
On Thu, 10 Oct 2019 15:07:29 -0300 Mauro Carvalho Chehab wrote: > Em Thu, 10 Oct 2019 10:41:50 -0400 > Konstantin Ryabitsev escreveu: > > > Hi, all: > > > > I would like to propose a new (large) feature to patchwork with the goal > > to make the process of submitting a patch easier for newbies and people > > generally less familiar with patch-based development. This was discussed > > previously on the workflows list: > > https://lore.kernel.org/workflows/20190930202451.GA14403@pure.paranoia.local/ > > > > How I envision this would work: > > > > - user creates an account (which requires a mail confirmation) > > - they choose a "submit patch" option from the menu > > - the patch submission screen has a succession of screens: > > > > 1. a screen with a single field allowing a user to paste a URL to > > their fork of the git repository. > > This will raise the bar, as it will force all developers to have a > public site to host the tree. I guess only a fraction of the 4k kernel > devs have it... In special, the ones that just want to send us a patch > fixing a bug may have serious troubles implementing that. A lot of people are going the way of gitlab/github. I don't think this raises the bar very high. Anyone can trivially get a public git repository somewhere. We can still have email coming in for those die hards, as long as we have someone to proxy it in. Note, I'm closer to one of the die hards, but I see where the world is heading. > > > Once submitted, patchwork does a > > "git ls-remote" to attempt to get a list of refs and to verify that > > this is indeed a valid git repository > > > > 2. next screen asks the user to select the ref to work from using the > > list obtained from the remote. Once submitted, patchwork performs a > > `git clone --reference` to clone the repository locally using a > > local fork of the same repo to minimize object transfer. This part > > requires that: > >a. patchwork project is configured with a path to a local fork, > > if this feature is enabled for a project > >b. that fork is kept current via some mechanism outside of > > patchwork (e.g. with grokmirror) > >c. there is some sanity-checking during the clone process to > > avoid abuse (e.g. a sane timeout, a tmpdir with limited size, > > etc -- other suggestions welcome) > > That would require a high bandwidth at the machine with as patchwork. Which most solutions may not be able to avoid this. > > Also, doesn't sound a good idea to me, as the server may end by having > tons of open sections, most waiting for tens of minutes, in order to > complete git clone. > > > > > 3. next screen asks the user to pick a starting commit from the log. > > Once submitted, patchwork generates the patch from the commit > > provided to the tip of the branch selected by the user earlier, > > using git format-patch. > > > > 4. next screen asks the user to review the patch to make sure this is > > what they want to submit. Once confirmed, patchwork performs two > > admin-defined optional hooks: > > > >a. a hook to generate a list of cc's (e.g. get_maintainer.pl) > >b. a sanity check hook (e.g. checkpatch.pl) > > > > 5. if sanity checking is defined, next screen shows the output of the > > sanity check hook, asking confirmation to proceed. > > > > 6. next screen shows the user three fields: > > > >a. title of the patch/series > >b. cover letter for the patch/series > >c. message-id of the previous patch revision (can be picked from > > the list of previously submitted series by this user -- > > patchwork should have them already) > >d. number of the revision (can be auto-filled if previous field > > is provided) and other tags to include in [] > > > > 7. next screen shows final review of what would be sent out to the > > list (and cc's, if the hook to get cc's is defined and returned any > > results). Once submitted, patchwork sends the patch/series using > > patchwork's envelope-from and the user's own email in the From: > > header. > > > > 8. once sent successfully, cleanups are performed (also needs to be > > done as part of the regular cron job, for any aborted attempts) > > > > I know this is a pretty big RFE, and I would like to hear your thoughts > > about this. If there is general agreement that this is doable/good idea, > > I may be able to come up with funding for this development as part of > > the overall tooling improvement proposal. > > The procedure itself is not bad, but, if implemented, IMHO, it should, > instead, be something that will run at the machine of the one submitting > the patch. For instance, this could be a perl or python script inside > Kernel's ./script directory that would handle everything locally,
Re: RFE: use patchwork to submit a patch
Em Thu, 10 Oct 2019 10:41:50 -0400 Konstantin Ryabitsev escreveu: > Hi, all: > > I would like to propose a new (large) feature to patchwork with the goal > to make the process of submitting a patch easier for newbies and people > generally less familiar with patch-based development. This was discussed > previously on the workflows list: > https://lore.kernel.org/workflows/20190930202451.GA14403@pure.paranoia.local/ > > How I envision this would work: > > - user creates an account (which requires a mail confirmation) > - they choose a "submit patch" option from the menu > - the patch submission screen has a succession of screens: > > 1. a screen with a single field allowing a user to paste a URL to > their fork of the git repository. This will raise the bar, as it will force all developers to have a public site to host the tree. I guess only a fraction of the 4k kernel devs have it... In special, the ones that just want to send us a patch fixing a bug may have serious troubles implementing that. > Once submitted, patchwork does a > "git ls-remote" to attempt to get a list of refs and to verify that > this is indeed a valid git repository > > 2. next screen asks the user to select the ref to work from using the > list obtained from the remote. Once submitted, patchwork performs a > `git clone --reference` to clone the repository locally using a > local fork of the same repo to minimize object transfer. This part > requires that: >a. patchwork project is configured with a path to a local fork, > if this feature is enabled for a project >b. that fork is kept current via some mechanism outside of > patchwork (e.g. with grokmirror) >c. there is some sanity-checking during the clone process to > avoid abuse (e.g. a sane timeout, a tmpdir with limited size, > etc -- other suggestions welcome) That would require a high bandwidth at the machine with as patchwork. Also, doesn't sound a good idea to me, as the server may end by having tons of open sections, most waiting for tens of minutes, in order to complete git clone. > > 3. next screen asks the user to pick a starting commit from the log. > Once submitted, patchwork generates the patch from the commit > provided to the tip of the branch selected by the user earlier, > using git format-patch. > > 4. next screen asks the user to review the patch to make sure this is > what they want to submit. Once confirmed, patchwork performs two > admin-defined optional hooks: > >a. a hook to generate a list of cc's (e.g. get_maintainer.pl) >b. a sanity check hook (e.g. checkpatch.pl) > > 5. if sanity checking is defined, next screen shows the output of the > sanity check hook, asking confirmation to proceed. > > 6. next screen shows the user three fields: > >a. title of the patch/series >b. cover letter for the patch/series >c. message-id of the previous patch revision (can be picked from > the list of previously submitted series by this user -- > patchwork should have them already) >d. number of the revision (can be auto-filled if previous field > is provided) and other tags to include in [] > > 7. next screen shows final review of what would be sent out to the > list (and cc's, if the hook to get cc's is defined and returned any > results). Once submitted, patchwork sends the patch/series using > patchwork's envelope-from and the user's own email in the From: > header. > > 8. once sent successfully, cleanups are performed (also needs to be > done as part of the regular cron job, for any aborted attempts) > > I know this is a pretty big RFE, and I would like to hear your thoughts > about this. If there is general agreement that this is doable/good idea, > I may be able to come up with funding for this development as part of > the overall tooling improvement proposal. The procedure itself is not bad, but, if implemented, IMHO, it should, instead, be something that will run at the machine of the one submitting the patch. For instance, this could be a perl or python script inside Kernel's ./script directory that would handle everything locally, and then submit the patch via patchwork's REST API. By using the REST API, it would avoid the need of having to do special e-mail setups for the casual developers. Thanks, Mauro ___ Patchwork mailing list Patchwork@lists.ozlabs.org https://lists.ozlabs.org/listinfo/patchwork
Re: RFE: use patchwork to submit a patch
On Fri, Oct 11, 2019 at 08:38:49AM +1100, Daniel Axtens wrote: Hi Konstantin, tl;dr: I think a git-to-email bridge is a good step. I'm not sure why patchwork would be the thing to build it on top of, and I'm worried that it would slow us both down. I'm very open to being convinced though. In very broad terms, I chose patchwork because: 1. in people's minds patchwork.kernel.org already deals with patches and mailing lists, so it seems to be a logical place for such tool to live 2. it's already built on top of a powerful web system (django), and we already have it up and running, so this wouldn't require setting up Yet Another Web Framework Service I will readily admit that both of these assertions are pretty tenuous. If I understand correctly, you're using Patchwork as: - user creates an account (which requires a mail confirmation) a) an identity provider, and Well, rather as a tool that already has account management which includes email confirmation at some point. b) a way to integrate with existing concepts of a project and keep metadata about them in 1 place Yes. c) a handy tool for getting previous series by a given user. Yes, it's convenient to already have that user's previously submitted series readily available. d) a 'trusted' source of email. Well, this part isn't really that important. Rather, it's a tool where, to have an account, one must confirm email delivery. Is that right? I just ask because this idea seems a long way from what Patchwork traditionally does. That's not necessarily bad, I just want to make sure I understand, and that if you get funding you're not tying yourself to a platform that doesn't suit your needs. Well, in my mind patchwork: 1. already deals with patches and series, including knowing how to do diff highlights and all the fancy stuff 2. will hopefully gain ability to do interdiffs in the future, so if someone submits a series revision, they can see what actually changed before their previous submission and their new attempt 3. already has a lot of knowledge around git, mboxes, formats, etc. This, of course, is not to say that patchwork is where this *must* happen, but I think it would be *nice* if this is where this happens. :) I'm particularly curious about Patchwork as (a) an identity provider. You wrote: - user creates an account (which requires a mail confirmation) This seems like "optional centralising" on Patchwork - it becomes a central identity provider but it's optional in that you can just send email directly if you prefer. Right, this is a tool to help people allergic to CLI (or who do this infrequently enough that they can never remember all the steps, and oh my god, why isn't there a web tool to hold my hand?). If you're going to do 'lightweight' centralisation like that, why not do it on a platform that already understands git? It's really easy to extract the information you describe in (c) just by querying the patchwork API. You don't need to actually integrate into patchwork, or be logged in, to do that. You lose the ability to load any git remote, but if you have a git remote that isn't github or gitlab, you probably already have a good email flow (e.g. if you repo is on kernel.org). If you really want to use Patchwork as an identity provider, rather than a forge, could we just teach Patchwork how to be an identity broker, and then build things separately, authenticating through Patchwork to confirm a user's identity? That means you could build in whatever language you like and, critically, run on whatever deployment schedule you want. You could also get much better isolation that way, which would be good - I don't want an RCE in the git library to allow someone to wipe out all patchwork data, for example. Well, ve hawe vays of prewenting that (e.g. by transitioning git calls into their own selinux domain which cannot talk to databases). I know this is a pretty big RFE, and I would like to hear your thoughts about this. If there is general agreement that this is doable/good idea, I may be able to come up with funding for this development as part of the overall tooling improvement proposal. As I said up top, I'm not opposed to this per se. I think a git-to-email bridge is a good step. I'm just confused as to why patchwork would be the thing to build it on top of, and I'm worried about how you'd deploy and update this extended Patchwork. I'm very open to being convinced though. Generally, I think patchwork, as a web application that already deals with patches and series is a convenient place for this tool to live, that's basically the extent of my thinking. For sure, it can exist as a separate tool, but then I'd have to set up and maintain that separate tool in addition to patchwork, as opposed to just patchwork. Best, -K ___ Patchwork mailing list Patchwork@lists.ozlabs.org
Re: RFE: use patchwork to submit a patch
Hi Konstantin, tl;dr: I think a git-to-email bridge is a good step. I'm not sure why patchwork would be the thing to build it on top of, and I'm worried that it would slow us both down. I'm very open to being convinced though. > I would like to propose a new (large) feature to patchwork with the goal > to make the process of submitting a patch easier for newbies and people > generally less familiar with patch-based development. This was discussed > previously on the workflows list: > https://lore.kernel.org/workflows/20190930202451.GA14403@pure.paranoia.local/ > > How I envision this would work: > If I understand correctly, you're using Patchwork as: > - user creates an account (which requires a mail confirmation) a) an identity provider, and > - they choose a "submit patch" option from the menu > - the patch submission screen has a succession of screens: > > 1. a screen with a single field allowing a user to paste a URL to > their fork of the git repository. Once submitted, patchwork does a > "git ls-remote" to attempt to get a list of refs and to verify that > this is indeed a valid git repository > > 2. next screen asks the user to select the ref to work from using the > list obtained from the remote. Once submitted, patchwork performs a > `git clone --reference` to clone the repository locally using a > local fork of the same repo to minimize object transfer. This part > requires that: >a. patchwork project is configured with a path to a local fork, > if this feature is enabled for a project b) a way to integrate with existing concepts of a project and keep metadata about them in 1 place >b. that fork is kept current via some mechanism outside of > patchwork (e.g. with grokmirror) >c. there is some sanity-checking during the clone process to > avoid abuse (e.g. a sane timeout, a tmpdir with limited size, > etc -- other suggestions welcome) > > 3. next screen asks the user to pick a starting commit from the log. > Once submitted, patchwork generates the patch from the commit > provided to the tip of the branch selected by the user earlier, > using git format-patch. > > 4. next screen asks the user to review the patch to make sure this is > what they want to submit. Once confirmed, patchwork performs two > admin-defined optional hooks: > >a. a hook to generate a list of cc's (e.g. get_maintainer.pl) >b. a sanity check hook (e.g. checkpatch.pl) > > 5. if sanity checking is defined, next screen shows the output of the > sanity check hook, asking confirmation to proceed. > > 6. next screen shows the user three fields: > >a. title of the patch/series >b. cover letter for the patch/series >c. message-id of the previous patch revision (can be picked from > the list of previously submitted series by this user -- > patchwork should have them already) c) a handy tool for getting previous series by a given user. >d. number of the revision (can be auto-filled if previous field > is provided) and other tags to include in [] > > 7. next screen shows final review of what would be sent out to the > list (and cc's, if the hook to get cc's is defined and returned any > results). Once submitted, patchwork sends the patch/series using > patchwork's envelope-from and the user's own email in the From: > header. d) a 'trusted' source of email. Is that right? I just ask because this idea seems a long way from what Patchwork traditionally does. That's not necessarily bad, I just want to make sure I understand, and that if you get funding you're not tying yourself to a platform that doesn't suit your needs. I'm particularly curious about Patchwork as (a) an identity provider. You wrote: > - user creates an account (which requires a mail confirmation) This seems like "optional centralising" on Patchwork - it becomes a central identity provider but it's optional in that you can just send email directly if you prefer. If you're going to do 'lightweight' centralisation like that, why not do it on a platform that already understands git? It's really easy to extract the information you describe in (c) just by querying the patchwork API. You don't need to actually integrate into patchwork, or be logged in, to do that. You lose the ability to load any git remote, but if you have a git remote that isn't github or gitlab, you probably already have a good email flow (e.g. if you repo is on kernel.org). If you really want to use Patchwork as an identity provider, rather than a forge, could we just teach Patchwork how to be an identity broker, and then build things separately, authenticating through Patchwork to confirm a user's identity? That means you could build in whatever language you like and, critically, run on whatever deployment schedule you
Re: [PATCH] parser: Unmangle From: headers that have been mangled for DMARC purposes
On 11/10/19 6:41 am, Jonathan Nieder wrote: Interesting! I'm cc-ing the Git mailing list in case "git am" might wnat to learn the same support. Argh, that reminds me... this patch only rewrites the name and email that is recorded as the Patchwork submitter, it doesn't actually rewrite the From: header when you fetch the mbox off Patchwork. Part of me would really like to keep Patchwork mboxes as close as possible to the mbox we ingested, but on the other hand it means the mangled address is still going to land in the git repo at the end... so I should probably just change it? -- Andrew Donnellan OzLabs, ADL Canberra a...@linux.ibm.com IBM Australia Limited ___ Patchwork mailing list Patchwork@lists.ozlabs.org https://lists.ozlabs.org/listinfo/patchwork
Re: RFE: use patchwork to submit a patch
Eric Wong wrote: > Konstantin Ryabitsev wrote: >> This is actually really fast if you already have a local copy of the >> repository with most objects. Try this yourself if you have >> torvalds/linux.git locally: >> >> git clone --bare -s torvalds/linux.git test > > Yep, -s (--shared) makes cloning really cheap. One of my goals is to get > > git clone -s https://example.com/torvalds/linux.git > > and > > git clone -s https://example.com/torvalds/linux.git/clone.bundle > > working. That would make it easier for new contributors to > setup lightweight clones and pull in history on an as-needed > basis w/o hacks like shallow cloning. Does "git clone --filter=blob:none" do what you're looking for? Thanks, Jonathan ___ Patchwork mailing list Patchwork@lists.ozlabs.org https://lists.ozlabs.org/listinfo/patchwork
Re: RFE: use patchwork to submit a patch
(+cc: git) Hi, Konstantin Ryabitsev wrote[1]: > How I envision this would work: > > - user creates an account (which requires a mail confirmation) > - they choose a "submit patch" option from the menu > - the patch submission screen has a succession of screens: Interesting! This reminds me a bit of https://gitgitgadget.github.io (except using patchwork instead of github pull requests as substrate). That leads me to wonder: should these kinds of tools share some code? Are there any subtleties that web-to-mail patch submission interfaces have in common or can learn from each other? Thanks, Jonathan [1] https://lore.kernel.org/workflows/20191010144150.hqiosvwolm3lmzp5@chatter.i7.local/ ___ Patchwork mailing list Patchwork@lists.ozlabs.org https://lists.ozlabs.org/listinfo/patchwork
Re: RFE: use patchwork to submit a patch
On Thu, Oct 10, 2019 at 03:07:29PM -0300, Mauro Carvalho Chehab wrote: - the patch submission screen has a succession of screens: 1. a screen with a single field allowing a user to paste a URL to their fork of the git repository. This will raise the bar, as it will force all developers to have a public site to host the tree. I guess only a fraction of the 4k kernel devs have it... In special, the ones that just want to send us a patch fixing a bug may have serious troubles implementing that. I don't think this will raise the bar, as Github/Gitlab allow for very easy forking of https://github.com/torvalds/linux. This is also not at all aimed at "all developers" -- only those that don't want to use the current CLI workflow and are more comfortable with web tools like Github. 2. next screen asks the user to select the ref to work from using the list obtained from the remote. Once submitted, patchwork performs a `git clone --reference` to clone the repository locally using a local fork of the same repo to minimize object transfer. This part requires that: a. patchwork project is configured with a path to a local fork, if this feature is enabled for a project b. that fork is kept current via some mechanism outside of patchwork (e.g. with grokmirror) c. there is some sanity-checking during the clone process to avoid abuse (e.g. a sane timeout, a tmpdir with limited size, etc -- other suggestions welcome) That would require a high bandwidth at the machine with as patchwork. Also, doesn't sound a good idea to me, as the server may end by having tons of open sections, most waiting for tens of minutes, in order to complete git clone. This is actually really fast if you already have a local copy of the repository with most objects. Try this yourself if you have torvalds/linux.git locally: git clone --bare -s torvalds/linux.git test cd test git remote add arm-soc https://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc git fetch arm-soc for-next The whole process takes a second or so and the resulting repo is 328K in size. Of course, this assumes that the remote repository isn't trying to do something nasty, which is why I suggest anti-abuse precautions. I know this is a pretty big RFE, and I would like to hear your thoughts about this. If there is general agreement that this is doable/good idea, I may be able to come up with funding for this development as part of the overall tooling improvement proposal. The procedure itself is not bad, but, if implemented, IMHO, it should, instead, be something that will run at the machine of the one submitting the patch. For instance, this could be a perl or python script inside Kernel's ./script directory that would handle everything locally, and then submit the patch via patchwork's REST API. I think I didn't make clear that this isn't supposed to go straight to patchwork. Patchwork is merely a convenient tool where this happens -- the resulting patch gets mailed out to the mailing list just as the user would have done. -K ___ Patchwork mailing list Patchwork@lists.ozlabs.org https://lists.ozlabs.org/listinfo/patchwork
Re: [PATCH] parser: Unmangle From: headers that have been mangled for DMARC purposes
Hi, Andrew Donnellan wrote: > To avoid triggering spam filters due to failed signature validation, many > mailing lists mangle the From header to change the From address to be the > address of the list, typically where the sender's domain has a strict DMARC > policy enabled. > > In this case, we should try to unmangle the From header. > > Add support for using the X-Original-Sender or Reply-To headers, as used by > Google Groups and Mailman respectively, to unmangle the From header when > necessary. > > Closes: #64 ("Incorrect submitter when using googlegroups") > Reported-by: Alexandre Belloni > Reported-by: Stephen Rothwell > Signed-off-by: Andrew Donnellan > --- > patchwork/parser.py| 75 ++ > patchwork/tests/test_parser.py | 68 -- > 2 files changed, 130 insertions(+), 13 deletions(-) Interesting! I'm cc-ing the Git mailing list in case "git am" might wnat to learn the same support. Thanks, Jonathan (patch left unsnipped for reference) > diff --git a/patchwork/parser.py b/patchwork/parser.py > index 7dc66bc05a5b..79beac5617b1 100644 > --- a/patchwork/parser.py > +++ b/patchwork/parser.py > @@ -321,12 +321,7 @@ def find_series(project, mail, author): > return _find_series_by_markers(project, mail, author) > > > -def get_or_create_author(mail): > -from_header = clean_header(mail.get('From')) > - > -if not from_header: > -raise ValueError("Invalid 'From' header") > - > +def split_from_header(from_header): > name, email = (None, None) > > # tuple of (regex, fn) > @@ -355,6 +350,65 @@ def get_or_create_author(mail): > (name, email) = fn(match.groups()) > break > > +return (name, email) > + > + > +# Unmangle From addresses that have been mangled for DMARC purposes. > +# > +# To avoid triggering spam filters due to failed signature validation, many > +# mailing lists mangle the From header to change the From address to be the > +# address of the list, typically where the sender's domain has a strict > +# DMARC policy enabled. > +# > +# Unfortunately, there's no standardised way of preserving the original > +# From address. > +# > +# Google Groups adds an X-Original-Sender header. If present, we use that. > +# > +# Mailman preserves the original address by adding a Reply-To, except in the > +# case where the list is set to either reply to list, or reply to a specific > +# address, in which case the original From is added to Cc instead. These > corner > +# cases are dumb, but we try and handle things as sensibly as possible by > +# looking for a name in Reply-To/Cc that matches From. It's inexact but > should > +# be good enough for our purposes. > +def get_original_sender(mail, name, email): > +if name and ' via ' in name: > +# Mailman uses the format " via " > +# Google Groups uses "'' via " > +stripped_name = name[:name.rfind(' via ')].strip().strip("'") > + > +original_sender = clean_header(mail.get('X-Original-Sender', '')) > +if original_sender: > +new_email = split_from_header(original_sender)[1].strip()[:255] > +return (stripped_name, new_email) > + > +addrs = [] > +reply_to_headers = mail.get_all('Reply-To') or [] > +cc_headers = mail.get_all('Cc') or [] > +for header in reply_to_headers + cc_headers: > +header = clean_header(header) > +addrs = header.split(",") > +for addr in addrs: > +new_name, new_email = split_from_header(addr) > +if new_name: > +new_name = new_name.strip()[:255] > +if new_email: > +new_email = new_email.strip()[:255] > +if new_name == stripped_name: > +return (stripped_name, new_email) > + > +# If we can't figure out the original sender, just keep it as is > +return (name, email) > + > + > +def get_or_create_author(mail, project=None): > +from_header = clean_header(mail.get('From')) > + > +if not from_header: > +raise ValueError("Invalid 'From' header") > + > +name, email = split_from_header(from_header) > + > if not email: > raise ValueError("Invalid 'From' header") > > @@ -362,6 +416,9 @@ def get_or_create_author(mail): > if name is not None: > name = name.strip()[:255] > > +if project and email.lower() == project.listemail.lower(): > +name, email = get_original_sender(mail, name, email) > + > # this correctly handles the case where we lose the race to create > # the person and another process beats us to it. (If the record > # does not exist, g_o_c invokes _create_object_from_params which > @@ -1004,7 +1061,7 @@ def parse_mail(mail, list_id=None): > > if not is_comment and (diff or pull_url): # patches or pull requests > # we delay the saving until we know we have a patch. > -author = get_or_create_author(mail) > +author =
Re: [PATCH v2 3/3] /api/events: Add 'actor' field to generated JSON
On Wed, 2019-10-09 at 18:44 +0200, Johan Herland wrote: > Before I post v3, I'll wait for some more responses to the discussion > on patch 2/3 about the best way (or rather: least horrible way) to > determine the active user and forward it into the event creation. I'll try do just that tomorrow. If I don't, it'll be next week (I'm at PyCon IE this weekend). Just FYI. Stephen ___ Patchwork mailing list Patchwork@lists.ozlabs.org https://lists.ozlabs.org/listinfo/patchwork
RFE: use patchwork to submit a patch
Hi, all: I would like to propose a new (large) feature to patchwork with the goal to make the process of submitting a patch easier for newbies and people generally less familiar with patch-based development. This was discussed previously on the workflows list: https://lore.kernel.org/workflows/20190930202451.GA14403@pure.paranoia.local/ How I envision this would work: - user creates an account (which requires a mail confirmation) - they choose a "submit patch" option from the menu - the patch submission screen has a succession of screens: 1. a screen with a single field allowing a user to paste a URL to their fork of the git repository. Once submitted, patchwork does a "git ls-remote" to attempt to get a list of refs and to verify that this is indeed a valid git repository 2. next screen asks the user to select the ref to work from using the list obtained from the remote. Once submitted, patchwork performs a `git clone --reference` to clone the repository locally using a local fork of the same repo to minimize object transfer. This part requires that: a. patchwork project is configured with a path to a local fork, if this feature is enabled for a project b. that fork is kept current via some mechanism outside of patchwork (e.g. with grokmirror) c. there is some sanity-checking during the clone process to avoid abuse (e.g. a sane timeout, a tmpdir with limited size, etc -- other suggestions welcome) 3. next screen asks the user to pick a starting commit from the log. Once submitted, patchwork generates the patch from the commit provided to the tip of the branch selected by the user earlier, using git format-patch. 4. next screen asks the user to review the patch to make sure this is what they want to submit. Once confirmed, patchwork performs two admin-defined optional hooks: a. a hook to generate a list of cc's (e.g. get_maintainer.pl) b. a sanity check hook (e.g. checkpatch.pl) 5. if sanity checking is defined, next screen shows the output of the sanity check hook, asking confirmation to proceed. 6. next screen shows the user three fields: a. title of the patch/series b. cover letter for the patch/series c. message-id of the previous patch revision (can be picked from the list of previously submitted series by this user -- patchwork should have them already) d. number of the revision (can be auto-filled if previous field is provided) and other tags to include in [] 7. next screen shows final review of what would be sent out to the list (and cc's, if the hook to get cc's is defined and returned any results). Once submitted, patchwork sends the patch/series using patchwork's envelope-from and the user's own email in the From: header. 8. once sent successfully, cleanups are performed (also needs to be done as part of the regular cron job, for any aborted attempts) I know this is a pretty big RFE, and I would like to hear your thoughts about this. If there is general agreement that this is doable/good idea, I may be able to come up with funding for this development as part of the overall tooling improvement proposal. Best regards, -K ___ Patchwork mailing list Patchwork@lists.ozlabs.org https://lists.ozlabs.org/listinfo/patchwork
[PATCH] parser: Unmangle From: headers that have been mangled for DMARC purposes
To avoid triggering spam filters due to failed signature validation, many mailing lists mangle the From header to change the From address to be the address of the list, typically where the sender's domain has a strict DMARC policy enabled. In this case, we should try to unmangle the From header. Add support for using the X-Original-Sender or Reply-To headers, as used by Google Groups and Mailman respectively, to unmangle the From header when necessary. Closes: #64 ("Incorrect submitter when using googlegroups") Reported-by: Alexandre Belloni Reported-by: Stephen Rothwell Signed-off-by: Andrew Donnellan --- patchwork/parser.py| 75 ++ patchwork/tests/test_parser.py | 68 -- 2 files changed, 130 insertions(+), 13 deletions(-) diff --git a/patchwork/parser.py b/patchwork/parser.py index 7dc66bc05a5b..79beac5617b1 100644 --- a/patchwork/parser.py +++ b/patchwork/parser.py @@ -321,12 +321,7 @@ def find_series(project, mail, author): return _find_series_by_markers(project, mail, author) -def get_or_create_author(mail): -from_header = clean_header(mail.get('From')) - -if not from_header: -raise ValueError("Invalid 'From' header") - +def split_from_header(from_header): name, email = (None, None) # tuple of (regex, fn) @@ -355,6 +350,65 @@ def get_or_create_author(mail): (name, email) = fn(match.groups()) break +return (name, email) + + +# Unmangle From addresses that have been mangled for DMARC purposes. +# +# To avoid triggering spam filters due to failed signature validation, many +# mailing lists mangle the From header to change the From address to be the +# address of the list, typically where the sender's domain has a strict +# DMARC policy enabled. +# +# Unfortunately, there's no standardised way of preserving the original +# From address. +# +# Google Groups adds an X-Original-Sender header. If present, we use that. +# +# Mailman preserves the original address by adding a Reply-To, except in the +# case where the list is set to either reply to list, or reply to a specific +# address, in which case the original From is added to Cc instead. These corner +# cases are dumb, but we try and handle things as sensibly as possible by +# looking for a name in Reply-To/Cc that matches From. It's inexact but should +# be good enough for our purposes. +def get_original_sender(mail, name, email): +if name and ' via ' in name: +# Mailman uses the format " via " +# Google Groups uses "'' via " +stripped_name = name[:name.rfind(' via ')].strip().strip("'") + +original_sender = clean_header(mail.get('X-Original-Sender', '')) +if original_sender: +new_email = split_from_header(original_sender)[1].strip()[:255] +return (stripped_name, new_email) + +addrs = [] +reply_to_headers = mail.get_all('Reply-To') or [] +cc_headers = mail.get_all('Cc') or [] +for header in reply_to_headers + cc_headers: +header = clean_header(header) +addrs = header.split(",") +for addr in addrs: +new_name, new_email = split_from_header(addr) +if new_name: +new_name = new_name.strip()[:255] +if new_email: +new_email = new_email.strip()[:255] +if new_name == stripped_name: +return (stripped_name, new_email) + +# If we can't figure out the original sender, just keep it as is +return (name, email) + + +def get_or_create_author(mail, project=None): +from_header = clean_header(mail.get('From')) + +if not from_header: +raise ValueError("Invalid 'From' header") + +name, email = split_from_header(from_header) + if not email: raise ValueError("Invalid 'From' header") @@ -362,6 +416,9 @@ def get_or_create_author(mail): if name is not None: name = name.strip()[:255] +if project and email.lower() == project.listemail.lower(): +name, email = get_original_sender(mail, name, email) + # this correctly handles the case where we lose the race to create # the person and another process beats us to it. (If the record # does not exist, g_o_c invokes _create_object_from_params which @@ -1004,7 +1061,7 @@ def parse_mail(mail, list_id=None): if not is_comment and (diff or pull_url): # patches or pull requests # we delay the saving until we know we have a patch. -author = get_or_create_author(mail) +author = get_or_create_author(mail, project) delegate = find_delegate_by_header(mail) if not delegate and diff: @@ -1099,7 +1156,7 @@ def parse_mail(mail, list_id=None): is_cover_letter = True if is_cover_letter: -author = get_or_create_author(mail) +author = get_or_create_author(mail, project) # we don't use 'find_series' here as a cover letter