Bug#1028503: UDD: Unknown "yes" value for Forwarded field in patch metadata

2023-02-15 Thread Guillem Jover
Hi!

On Sun, 2023-01-15 at 21:12:53 +0100, Lucas Nussbaum wrote:
> > I'm wondering why diverge from the patch metadata guidelines? If there's
> > a desire to change the field semantics, perhaps it would be better to
> > change the guidelines instead? :)
> >·
> > But given this interchange, perhaps we should try to make it more
> > clear or explicit about some of these aspects there!
>·
> Originally I admit that I misread the DEP3 rules.
> Now the implementation is much closer to DEP3. Unfortunately, DEP3 was
> probably not written with automatic processing in mind.

> The current status of the data is the following[1]:

Thanks for collecting and analyzing these!

> With some comments:
>  no  | No Forwarded, No Bug field 
>   | 100145 |   68.12
> => that sticks to the DEP3 specification. (no forwarded: + no bugs: => 
> implicit no)
>·
>  not-needed  |
>   |  19329 |   13.15
> => that's an explicit "Forwarded: not-needed"
>·
>  yes |
>   |  12217 |8.31
> => Those are the cases where it's clear that the bug has been forwarded
>·
>  no  | Explicit Forwarded=no  
>   |  11448 |7.79
> => that's an explicit "Forwarded: no"

All ack.

> The remaining lines are the ones that could require discussion.
> First, it's worth noting that they only account for 2.63% of the
> patches.
>·
>  invalid | Invalid value for Forwarded field  
>   |   2270 |1.54
> => I looked at the values considered invalid[2], and could not really convince
> myself that they should automatically fit in one of the above
> categories.

I see what you mean. Although from skimming over these values, the two
things that become clear are things that I've already wondered about,
forwarding to email addresses (there are "many" instances of those
there), and how to represent "no" due to no-upstream (which TBH it
seems we are lacking a more general way to mark packages as such,
because there is also the debian/watch and several lintian tags that
trigger but should probably not be considered issues when there is no
upstream).

I guess several of these bogus ones could perhaps be fixed up by the
janitor.

>  invalid | Forwarded=yes, but no Bug field
>   |   1119 |0.76
> => I think that we should strongly encourage leaving a public trace when
> forwarding bugs

While I agree that would be ideal, that has the problem I mentioned
when upstream does not have any such public avenue. Since then I
adopted the suggestion by Paul Wise to use a mail address, but from
the previous entry, I think that makes parsing probably tricky. :/

The other reason I've set Forwarded=yes in the past is when I've
submitted the patches and they have been merged very quickly, and
having to go look for the web archive URL references or similar seems
more work than it's worth, when just adding the commit id in the Origin
field is quicker (but see below.)

>  invalid | No forwarded, Debian bug in Bug: field. Bug-Debian: should 
> be used instead.  |283 |0.19
> => quite obviously invalid

Ack.

>  no  | No Forwarded, Upstream bug, but not a URL  
>   |190 |0.13
> => those are mostly patches misusing the Bug: field for the Debian
> bug.[3]. Since the specification requires a URL here, it could be
> considered invalid as well.

Is this perhaps being too strict? Some of the URLs seem to be valid,
just prefixed with whitespace? The rest seem indeed invalid.

>  invalid | Forwarded=yes, Debian bug in Bug: field. Bug-Debian: 
> should be used instead. |  8 |0.01
> => quite obviously invalid

Ack.

> So, all in all, my implementation[4] sounds sane.

I think the guidelines might need some work to add clearly missing
states or declarative information; and then having this kind of
more strict validator (or ideally automatic fixer) will eventually
make the data more correct. Lintian seems rather lenient in its
parsing, but following the guidelines more closely perhaps.

> > I otherwise do not know how we can mark patches as forwarded when
> > for example you send them directly to upstream via email or to a mailing
> > list that has no public archive or similar.
>·
> That's true. Other than blinding accepting "Forwarded: yes", I don't
> know how this could be addressed.

I still see value in a "faith" based "yes", but I can see how this
can be seen as rather unsatisfactory from a global point of view. But
perhaps this is tied with the following point, and "yes" should only
be accepted when there's "proof" from other fields that it got
forwarded (say because it's merged already).

> > I think the Origin field needs 

Bug#1028503: UDD: Unknown "yes" value for Forwarded field in patch metadata

2023-01-16 Thread Paul Wise
On Sat, 2023-01-14 at 18:41 +0100, Guillem Jover wrote:

> I otherwise do not know how we can mark patches as forwarded when
> for example you send them directly to upstream via email or to a mailing
> list that has no public archive or similar.

Just as you can mark a BTS bug as forwarded to an email address,
presumably you could do something similar for patches? Then people
wanting to know the status could email that address to find out.

-- 
bye,
pabs

https://wiki.debian.org/PaulWise


signature.asc
Description: This is a digitally signed message part


Bug#1028503: UDD: Unknown "yes" value for Forwarded field in patch metadata

2023-01-15 Thread Lucas Nussbaum
Hi Guillem,

> I'm wondering why diverge from the patch metadata guidelines? If there's
> a desire to change the field semantics, perhaps it would be better to
> change the guidelines instead? :)
> 
> But given this interchange, perhaps we should try to make it more
> clear or explicit about some of these aspects there!

Originally I admit that I misread the DEP3 rules.
Now the implementation is much closer to DEP3. Unfortunately, DEP3 was
probably not written with automatic processing in mind.

The current status of the data is the following[1]:

 forwarded_short | explanation  
|  cnt   | percent 
-+--++-
 no  | No Forwarded, No Bug field   
| 100145 |   68.12
 not-needed  |  
|  19329 |   13.15
 yes |  
|  12217 |8.31
 no  | Explicit Forwarded=no
|  11448 |7.79
 invalid | Invalid value for Forwarded field
|   2270 |1.54
 invalid | Forwarded=yes, but no Bug field  
|   1119 |0.76
 invalid | No forwarded, Debian bug in Bug: field. Bug-Debian: should 
be used instead.  |283 |0.19
 no  | No Forwarded, Upstream bug, but not a URL
|190 |0.13
 invalid | Forwarded=yes, Debian bug in Bug: field. Bug-Debian: should 
be used instead. |  8 |0.01

With some comments:
 no  | No Forwarded, No Bug field   
| 100145 |   68.12
=> that sticks to the DEP3 specification. (no forwarded: + no bugs: => implicit 
no)

 not-needed  |  
|  19329 |   13.15
=> that's an explicit "Forwarded: not-needed"

 yes |  
|  12217 |8.31
=> Those are the cases where it's clear that the bug has been forwarded

 no  | Explicit Forwarded=no
|  11448 |7.79
=> that's an explicit "Forwarded: no"


The remaining lines are the ones that could require discussion.
First, it's worth noting that they only account for 2.63% of the
patches.

 invalid | Invalid value for Forwarded field
|   2270 |1.54
=> I looked at the values considered invalid[2], and could not really convince
myself that they should automatically fit in one of the above
categories.

 invalid | Forwarded=yes, but no Bug field  
|   1119 |0.76
=> I think that we should strongly encourage leaving a public trace when
forwarding bugs

 invalid | No forwarded, Debian bug in Bug: field. Bug-Debian: should 
be used instead.  |283 |0.19
=> quite obviously invalid

 no  | No Forwarded, Upstream bug, but not a URL
|190 |0.13
=> those are mostly patches misusing the Bug: field for the Debian
bug.[3]. Since the specification requires a URL here, it could be
considered invalid as well.

 invalid | Forwarded=yes, Debian bug in Bug: field. Bug-Debian: should 
be used instead. |  8 |0.01
=> quite obviously invalid

So, all in all, my implementation[4] sounds sane.

> I otherwise do not know how we can mark patches as forwarded when
> for example you send them directly to upstream via email or to a mailing
> list that has no public archive or similar.

That's true. Other than blinding accepting "Forwarded: yes", I don't
know how this could be addressed.

> I think the Origin field needs to be taken into account too, in
> particular when it has an "upstream" or a "backport" value. As well as
> the Applied-Upstream field.

I would prefer the Forwarded field to be a simple field only allowing
'no', 'not-needed', and maybe 'yes' (with the pointer to where the patch
was forwarded in a separate field).

Or if the Forwarded field's value is supposed to be computed, I think
that we should provide a more detailed specification of the algorithm.

- Lucas

[1] UDD query:
SELECT forwarded_short, invalid_reason, cnt, round(100.0 * cnt / (sum(cnt) OVER 
()),2) as percent
FROM ( select forwarded_short, invalid_reason, count(*) cnt from patches group 
by 1,2 ) foo
order by 3 desc

[2] UDD query:
select forwarded, count(*) from patches where invalid_reason='Invalid value for 
Forwarded field' group by 1 order by 2 desc;

[3] UDD query:
select forwarded, 

Bug#1028503: UDD: Unknown "yes" value for Forwarded field in patch metadata

2023-01-14 Thread Mattia Rizzolo
On Thu, Jan 12, 2023 at 12:05:51PM +0100, Lucas Nussbaum wrote:
> The problem I have is that the 'Bug' header is often misused, and used
> for the Debian bug instead of the upstream bug. But I could special-case
> that.

I wonder if we should have lintian flag this.
Something like warn if Bug points to bugs.debian.org, unless… mh, can't
think of when it should be really. Do we have "native" packages that use
quit?

-- 
regards,
Mattia Rizzolo

GPG Key: 66AE 2B4A FCCF 3F52 DA18  4D18 4B04 3FCD B944 4540  .''`.
More about me:  https://mapreri.org : :'  :
Launchpad user: https://launchpad.net/~mapreri  `. `'`
Debian QA page: https://qa.debian.org/developer.php?login=mattia  `-


signature.asc
Description: PGP signature


Bug#1028503: UDD: Unknown "yes" value for Forwarded field in patch metadata

2023-01-14 Thread Guillem Jover
On Thu, 2023-01-12 at 12:05:51 +0100, Lucas Nussbaum wrote:
> On 12/01/23 at 01:57 +0100, Guillem Jover wrote:
> > I just noticed though that it does
> > not recognize a "yes" value for the Forwarded field, while the
> > "Patch Tagging Guidelines" has this to say about it:
> > 
> >   * Forwarded (optional)
> > 
> > Any value other than "no" or "not-needed" means that the patch has
> > been forwarded upstream. Ideally the value is an URL proving that
> > it has been forwarded and where one can find more information about
> > its inclusion status.
> > 
> > If the field is missing, its implicit value is "yes" if the "Bug"
> > field is present, otherwise it's "no". The field is really required
> > only if the patch is vendor specific, in that case its value should
> > be "not-needed" to indicate that the patch must not be forwarded
> > upstream (whereas "no" simply means that it has not yet been done).
> > 
> > So it says that any value other than "no" or "not-needed" means
> > forwarded, then it says that if the field is missing it means it is an
> > implicit value of "yes", where I've always interpreted as implicitly
> > stating that "yes" is also a valid value.
> 
> If the field is missing, then its implicit value is 'yes' only if the
> Bug field (which points to the upstream bug) is present.

Sure, the point I was trying to make is that when the field is present
it is stated that any value (other than "no" or "not-needed") is
considered to mean it has been forwarded. And as I had the perception
that you might be trying to be strict on the values accepted, I was
trying to argue that because the value "yes" is mentioned (with quotes,
so I interpret it as one more of the "known values" even if referred
as an implicit value), then when the field is present it should also
be accepted with such "known values".

I otherwise do not know how we can mark patches as forwarded when
for example you send them directly to upstream via email or to a mailing
list that has no public archive or similar.

> > (I also recently amended the patch metadata header template generated
> > by dpkg-source and did not have "yes" as a value there, but I've added
> > it locally now, and will probably queue it for dpkg 1.22.x.)
> 
> The logic in the UDD implementation is documented just below the table:
> > The forwarded value in the table is computed and might differ slightly
> > from the original DEP3 specification. It is yes only if an URL is
> > provided as value. It is invalid if none of the other value could be
> > determined with certainty (yes, no, not-needed).

I'm wondering why diverge from the patch metadata guidelines? If there's
a desire to change the field semantics, perhaps it would be better to
change the guidelines instead? :)

But given this interchange, perhaps we should try to make it more
clear or explicit about some of these aspects there!

> But I could do better, and consider the Bug field in the analysis.

I think the Origin field needs to be taken into account too, in
particular when it has an "upstream" or a "backport" value. As well as
the Applied-Upstream field.

> The problem I have is that the 'Bug' header is often misused, and used
> for the Debian bug instead of the upstream bug. But I could special-case
> that.

Sure, if the Bug field is misused I see no problem with flagging it as
erroneous.

Thanks,
Guillem



Bug#1028503: UDD: Unknown "yes" value for Forwarded field in patch metadata

2023-01-12 Thread Lucas Nussbaum
Hi,

On 12/01/23 at 01:57 +0100, Guillem Jover wrote:
> Hi!
> 
> The new patch data is great, thanks!

Thanks!

> I just noticed though that it does
> not recognize a "yes" value for the Forwarded field, while the
> "Patch Tagging Guidelines" has this to say about it:
> 
>   * Forwarded (optional)
> 
> Any value other than "no" or "not-needed" means that the patch has
> been forwarded upstream. Ideally the value is an URL proving that
> it has been forwarded and where one can find more information about
> its inclusion status.
> 
> If the field is missing, its implicit value is "yes" if the "Bug"
> field is present, otherwise it's "no". The field is really required
> only if the patch is vendor specific, in that case its value should
> be "not-needed" to indicate that the patch must not be forwarded
> upstream (whereas "no" simply means that it has not yet been done).
> 
> So it says that any value other than "no" or "not-needed" means
> forwarded, then it says that if the field is missing it means it is an
> implicit value of "yes", where I've always interpreted as implicitly
> stating that "yes" is also a valid value.

If the field is missing, then its implicit value is 'yes' only if the
Bug field (which points to the upstream bug) is present.

> (I also recently amended the patch metadata header template generated
> by dpkg-source and did not have "yes" as a value there, but I've added
> it locally now, and will probably queue it for dpkg 1.22.x.)

The logic in the UDD implementation is documented just below the table:
> The forwarded value in the table is computed and might differ slightly
> from the original DEP3 specification. It is yes only if an URL is
> provided as value. It is invalid if none of the other value could be
> determined with certainty (yes, no, not-needed).

But I could do better, and consider the Bug field in the analysis.

The problem I have is that the 'Bug' header is often misused, and used
for the Debian bug instead of the upstream bug. But I could special-case
that.

Lucas



Bug#1028503: UDD: Unknown "yes" value for Forwarded field in patch metadata

2023-01-11 Thread Guillem Jover
Package: qa.debian.org
Severity: normal
User: qa.debian@packages.debian.org
Usertags: udd

Hi!

The new patch data is great, thanks! I just noticed though that it does
not recognize a "yes" value for the Forwarded field, while the
"Patch Tagging Guidelines" has this to say about it:

  * Forwarded (optional)

Any value other than "no" or "not-needed" means that the patch has
been forwarded upstream. Ideally the value is an URL proving that
it has been forwarded and where one can find more information about
its inclusion status.

If the field is missing, its implicit value is "yes" if the "Bug"
field is present, otherwise it's "no". The field is really required
only if the patch is vendor specific, in that case its value should
be "not-needed" to indicate that the patch must not be forwarded
upstream (whereas "no" simply means that it has not yet been done).

So it says that any value other than "no" or "not-needed" means
forwarded, then it says that if the field is missing it means it is an
implicit value of "yes", where I've always interpreted as implicitly
stating that "yes" is also a valid value.

(I also recently amended the patch metadata header template generated
by dpkg-source and did not have "yes" as a value there, but I've added
it locally now, and will probably queue it for dpkg 1.22.x.)

Thanks,
Guillem