Bug#1126502: dpkg does not accept non-ascii chars in email address

2026-01-29 Thread Guillem Jover
On Thu, 2026-01-29 at 21:01:22 +0530, Nilesh Patra wrote:
> > On 29 Jan 2026, at 1:04 PM, Guillem Jover  wrote:
> >> On Wed, 2026-01-28 at 16:51:42 -0500, Louis-Philippe Véronneau wrote:
> >> I'm not sure I understand what a "nationally encoded email" is and
> >> how using a non-ascii email leads to creating a non-compatible
> >> .changes?
> 
> Basically, if you use email like "Foo Bar ” (which
> guillem mentioned in [1]) the .changes file preserves this.
> Lintian sees this and the UTF-8 parsing fails here, saying this is a
> “nationally” encoded string.

I think this might then be a problem in your environment? If I set a
similar address in, say, pci.ids, in both debian/control and
debian/changelog, and build with dpkg-buildpackage under
LANG=C.UTF-8, I get this output:

  ,---
  $ dpkg-buildpackage --no-sign
  dpkg-buildpackage: info: source package pci.ids
  dpkg-buildpackage: info: source version 0.0~2025.12.16-1
  dpkg-buildpackage: info: source distribution unstable
  dpkg-buildpackage: info: source changed by Güillem Jöver 
  dpkg-buildpackage: info: host architecture amd64
  […]
  dpkg-deb: building package 'pci.ids' in '../pci.ids_0.0~2025.12.16-1_all.deb'.
   dpkg-genbuildinfo -O../pci.ids_0.0~2025.12.16-1_amd64.buildinfo
   dpkg-genchanges -O../pci.ids_0.0~2025.12.16-1_amd64.changes
  dpkg-genchanges: info: including full source code in upload
   dpkg-source --after-build .
  dpkg-buildpackage: info: full upload (original source is included)
   lintian ../pci.ids_0.0~2025.12.16-1_amd64.changes
  E: pci.ids source: bogus-mail-host Maintainer güillem@debiån.org
  E: pci.ids: bogus-mail-host Maintainer güillem@debiån.org
  E: pci.ids changes: bogus-mail-host Changed-By güillem@debiån.org
  E: pci.ids changes: bogus-mail-host Maintainer güillem@debiån.org
  E: pci.ids: bogus-mail-host-in-debian-changelog güillem@debiån.org (for version 0.0~2025.12.16-1) [usr/share/doc/pci.ids/changelog.Debian.gz:1]
  I: pci.ids source: unused-override override_dh_auto_test-does-not-check-DEB_BUILD_OPTIONS [debian/rules:*] [debian/source/lintian-overrides:4]
  N: 0 hints overridden; 1 unused override
  dpkg-buildpackage: error: lintian ../pci.ids_0.0~2025.12.16-1_amd64.changes subprocess failed with exit status 2
  `---

Here lintian does not accept the UTF-8 domain, but it can parse it
correctly as UTF-8. So the checks in lib/Lintian/Check/Files/Encoding.pm
for non-UTF-8 do not trigger, but the is_domain() checks in
lib/Lintian/Check/Debian/Changelog.pm and
lib/Lintian/Check/Fields/MailAddress.pm do trigger.

I think the is_domain() check is not correct, because it checks for
a domain name as it would appear in, say, a DNS query (after encoding
in Punycode), but it does not support IDN, which is what people would
use in this context.
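
To illustrate the distinction between the two forms, here is a minimal
Python sketch. It is not lintian's code (lintian is written in Perl, and
its is_domain() implementation is not reproduced here); the helper names
are hypothetical. It uses Python's built-in "idna" codec, which follows
the older IDNA 2003 rules, to convert the domain from the output above
into the ASCII form a DNS-style check would expect, and back again.

```python
# Hypothetical illustration only; function names are assumptions and this
# is not how lintian or dpkg handle domains internally.

def to_ascii_domain(domain: str) -> str:
    """Return the ASCII-compatible (Punycode-based) form of an IDN."""
    # Python's built-in "idna" codec implements IDNA 2003.
    return domain.encode("idna").decode("ascii")


def from_ascii_domain(domain: str) -> str:
    """Return the Unicode form of an ASCII-compatible domain name."""
    return domain.encode("ascii").decode("idna")


idn = "debi\u00e5n.org"        # the domain "debiån.org" from the lintian output
ace = to_ascii_domain(idn)     # ASCII-only form, with an "xn--" label
print(ace)
print(from_ascii_domain(ace))  # converts back to the Unicode form
```

A check that only accepts the ASCII form would reject the Unicode form
even though both name the same domain, which is the mismatch described
above.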

> > So, unless I'm missing something, I'd still consider closing this one?
> 
> Mh… since lintian had a check for this and dpkg used to allow ansi
> escaped emails in .changes, I don’t understand how this is not a
> regression.

The check seems to have been prompted by the Debian Policy bug #962277,
which has seen neither wording nor seconds. The current Debian Policy
seems to disallow this (§5.6.2).

So, while strictly speaking dpkg not doing any validation before and
doing it now can be considered a regression, I think in the Debian
context this does not really count, because of what Debian Policy says
anyway, and in the dpkg upstream context I think allowing ANSI escape
sequences (while cute) is a really bad idea.

> I’d agree that this is an edge case, though. If you
> however feel like closing this bug report, I don’t have a problem.
> But we ought to mention somewhere that Maintainers/Uploaders Email
> needs to be a standardised address.

I think this is covered already by Debian Policy, and in the proposed
bug reports #401452, #852677 and #962277.

But the dpkg documentation should certainly be updated to match at
least its own current expectations. Thanks, I'll do that with this
report.

Regards,
Guillem



2026-01-29 Thread Nilesh Patra


> On 29 Jan 2026, at 1:04 PM, Guillem Jover  wrote:
>> On Wed, 2026-01-28 at 16:51:42 -0500, Louis-Philippe Véronneau wrote:
>> I'm not sure I understand what a "nationally encoded email" is and
>> how using a non-ascii email leads to creating a non-compatible
>> .changes?

Basically, if you use email like "Foo Bar ” (which guillem
mentioned in [1]) the .changes file preserves this.
Lintian sees this and the UTF-8 parsing fails here, saying this is a
“nationally” encoded string.

> So, unless I'm missing something, I'd still consider closing this one?

Mh… since lintian had a check for this and dpkg used to allow
ANSI-escaped emails in .changes, I don’t understand how this is not a
regression. I’d agree that this is an edge case, though. If you
however feel like closing this bug report, I don’t have a problem.
But we ought to mention somewhere that the Maintainer/Uploaders email
needs to be a standardised address.


[1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1126502#14



2026-01-28 Thread Guillem Jover
Hi!

On Wed, 2026-01-28 at 16:51:42 -0500, Louis-Philippe Véronneau wrote:
> On Wed, 28 Jan 2026 05:57:36 +0530 Nilesh Patra wrote:
> > > On 28 Jan 2026, at 5:27 AM, Guillem Jover wrote:
> > > > On Tue, 2026-01-27 at 22:15:36 +0530, Nilesh Patra wrote:
> > > > > Lintian’s test suite reflects this error:
> > > > >
> > > > > | dpkg-source: warning: --auto-commit is not a valid option for Dpkg::Source::Package::V3::Native
> > > > > | dpkg-source: error: cannot parse Maintainer field value "Colorful <"colorful"@43-1.org>”
> > > > > | dpkg-buildpackage: error: dpkg-source -iNEVER_MATCH_ANYTHING -INEVER_MATCH_ANYTHING --auto-commit --before-build . subprocess failed with exit status 255
> > > > > | make: *** [/tmp/autopkgtest-lxc.18_8mz9y/downtmp/autopkgtest_tmp/build-and-evaluate-test-packages/package-sources/checks/fields/terminal-control/colorful/Makefile:39: colorful_1.0_amd64.changes] Error 255

> > > > Hmm, TBH I don't think I'm comfortable allowing ANSI escapes as part
> > > > of email addresses? This feels unnecessary and a recipe for terminal
> > > > exploits and similar.
> > 
> > This happens to be allowed by RFC 6532 as such.
> 
> I'd be inclined to disregard RFC 6532 for ANSI escapes and have dpkg
> continue to fail when they are used in those fields.
> 
> This would let us remove the `ansi-escape` tag in Lintian.

I think a similar sentiment, about not allowing everything that is
usually permitted in an email context (though I don't think it was
about this specific case), was brought up on the policy report about
updating the email spec. That discussion also favored focusing on
usability and readability, and ignoring any weird encoding that might
be needed to actually send mail via, say, SMTP (which should be left
to the programs that need to perform such tasks).

(In dpkg this is caught by a regex matching on not \cK which I don't see
removing, as that would seem like a bad idea.)
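
As a rough illustration of why such input is worth rejecting, the
following Python sketch flags any string that contains control
characters, which includes the ESC byte that starts an ANSI escape
sequence. This is an assumption-laden stand-in, not dpkg's actual regex
or parser logic; the function name and the sample values are
hypothetical.

```python
import unicodedata


def has_control_chars(value: str) -> bool:
    """Return True if the string contains any Unicode control character.

    Hypothetical helper: Unicode general category "Cc" covers the C0/C1
    control codes, including ESC (0x1B).
    """
    return any(unicodedata.category(ch) == "Cc" for ch in value)


plain = "F\u00f6o B\u00e5r"               # non-ASCII, but no control codes
colorful = "\x1b[31mColorful\x1b[0m"      # name wrapped in ANSI color escapes

print(has_control_chars(plain))
print(has_control_chars(colorful))
```

The point of the sketch is that rejecting control characters does not
require rejecting non-ASCII text: the two properties are independent.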

> > > > > This is a fallout with #962277.
> > > > >
> > > > > dpkg should allow domains with non-ascii chars. For now I’ll try to
> > > > > drop this test from lintian test suite.
> > > >
> > > > Otherwise the parser already supports UTF-8 characters in both the
> > > > local and domain parts:
> > > >
> > > >   $ perl -Mv5.40 -MDpkg::Email::Address -E '\
> > > >   my $e = Dpkg::Email::Address->new("Foo Bar "); \
> > > >   say $e->as_string(); \
> > > > '
> > > >   Foo Bar 
> > > >
> > > > So, I'm inclined to close this one.
> > 
> > Lintian expects the .dsc/.changes file to be UTF encoded. If someone uses 
> > the address, it leads to .changes with a nationally encoded email
> > and lintian would fail to parse that /o\
> > 
> > This is getting way more complicated than I did think originally.
> 
> I'm not sure I understand what a "nationally encoded email" is and
> how using a non-ascii email leads to creating a non-compatible
> .changes?

I think in general "national encoding" means things like iso-8859-
or similar encodings. In this case I'm not sure I understand because
dpkg-dev tools should preserve the input encoding on output, and if
this was a problem for the email address itself it would also be a
problem for the name. So either the user has specified something that
is not UTF-8 or they perhaps are running in a non-UTF-8 environment?
In any case this would already be a problem so I don't see it as a
regression. Say with:

  Föo Bår 

So, unless I'm missing something, I'd still consider closing this one?
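
The encoding mismatch discussed above can be sketched in Python as
follows. This is only an illustration under stated assumptions: the
helper name is hypothetical, and it does not reproduce lintian's actual
encoding check. It encodes the same name once as UTF-8 and once as
ISO-8859-1 (a "national" encoding), then tests whether each byte string
is valid UTF-8.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the bytes decode cleanly as strict UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


name = "F\u00f6o B\u00e5r"                 # "Föo Bår"
as_utf8 = name.encode("utf-8")             # what a UTF-8 environment writes
as_latin1 = name.encode("iso-8859-1")      # what a Latin-1 environment writes

print(is_valid_utf8(as_utf8))
print(is_valid_utf8(as_latin1))
```

If the tools preserve the input bytes unchanged, a field written in a
Latin-1 environment stays Latin-1 in the generated file, so a strict
UTF-8 reader would fail on the name just as it would on the address.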

Thanks,
Guillem



2026-01-28 Thread Louis-Philippe Véronneau
On Wed, 28 Jan 2026 05:57:36 +0530 Nilesh Patra wrote:
> > On 28 Jan 2026, at 5:27 AM, Guillem Jover wrote:
> >
> > Hi!
> >
> > On Tue, 2026-01-27 at 22:15:36 +0530, Nilesh Patra wrote:
> > > Package: dpkg
> > > Severity: important
> > > X-Debbugs-Cc: [email protected] .org
> > > Control: affects -1 + src:lintian
> > > Version: 1.23.5
> > >
> > > Lintian’s test suite reflects this error:
> > >
> > > | dpkg-source: warning: --auto-commit is not a valid option for Dpkg::Source::Package::V3::Native
> > > | dpkg-source: error: cannot parse Maintainer field value "Colorful <"colorful"@43-1.org>”
> > > | dpkg-buildpackage: error: dpkg-source -iNEVER_MATCH_ANYTHING -INEVER_MATCH_ANYTHING --auto-commit --before-build . subprocess failed with exit status 255
> > > | make: *** [/tmp/autopkgtest-lxc.18_8mz9y/downtmp/autopkgtest_tmp/build-and-evaluate-test-packages/package-sources/checks/fields/terminal-control/colorful/Makefile:39: colorful_1.0_amd64.changes] Error 255
> >
> > Hmm, TBH I don't think I'm comfortable allowing ANSI escapes as part
> > of email addresses? This feels unnecessary and a recipe for terminal
> > exploits and similar.
>
> This happens to be allowed by RFC 6532 as such.

I'd be inclined to disregard RFC 6532 for ANSI escapes and have dpkg
continue to fail when they are used in those fields.

This would let us remove the `ansi-escape` tag in Lintian.

> > > This is a fallout with #962277.
> > >
> > > dpkg should allow domains with non-ascii chars. For now I’ll try to
> > > drop this test from lintian test suite.
> >
> > Otherwise the parser already supports UTF-8 characters in both the
> > local and domain parts:
> >
> >   $ perl -Mv5.40 -MDpkg::Email::Address -E '\
> >   my $e = Dpkg::Email::Address->new("Foo Bar "); \
> >   say $e->as_string(); \
> > '
> >   Foo Bar 
> >
> > So, I'm inclined to close this one.
>
> Lintian expects the .dsc/.changes file to be UTF encoded. If someone uses
> the address, it leads to .changes with a nationally encoded email
> and lintian would fail to parse that /o\
>
> This is getting way more complicated than I did think originally.

I'm not sure I understand what a "nationally encoded email" is and how
using a non-ascii email leads to creating a non-compatible .changes?


--
  ⢀⣴⠾⠻⢶⣦⠀
  ⣾⠁⢠⠒⠀⣿⡁  Louis-Philippe Véronneau
  ⢿⡄⠘⠷⠚⠋   [email protected] / veronneau.org
  ⠈⠳⣄



2026-01-27 Thread Nilesh Patra



> On 28 Jan 2026, at 5:27 AM, Guillem Jover  wrote:
> 
> Hi!
> 
> On Tue, 2026-01-27 at 22:15:36 +0530, Nilesh Patra wrote:
>> Package: dpkg
>> Severity: important
>> X-Debbugs-Cc: [email protected] 
>> .org
>> Control: affects -1 + src:lintian
>> Version: 1.23.5
> 
>> Lintian’s test suite reflects this error:
>> 
>> | dpkg-source: warning: --auto-commit is not a valid option for Dpkg::Source::Package::V3::Native
>> | dpkg-source: error: cannot parse Maintainer field value "Colorful <"colorful"@43-1.org>”
>> | dpkg-buildpackage: error: dpkg-source -iNEVER_MATCH_ANYTHING -INEVER_MATCH_ANYTHING --auto-commit --before-build . subprocess failed with exit status 255
>> | make: *** [/tmp/autopkgtest-lxc.18_8mz9y/downtmp/autopkgtest_tmp/build-and-evaluate-test-packages/package-sources/checks/fields/terminal-control/colorful/Makefile:39: colorful_1.0_amd64.changes] Error 255
> 
> Hmm, TBH I don't think I'm comfortable allowing ANSI escapes as part
> of email addresses? This feels unnecessary and a recipe for terminal
> exploits and similar.

This happens to be allowed by RFC 6532 as such.

>> This is a fallout with #962277.
>> 
>> dpkg should allow domains with non-ascii chars. For now I’ll try to
>> drop this test from lintian test suite.
> 
> Otherwise the parser already supports UTF-8 characters in both the
> local and domain parts:
> 
>   $ perl -Mv5.40 -MDpkg::Email::Address -E '\
>   my $e = Dpkg::Email::Address->new("Foo Bar "); \
>   say $e->as_string(); \
> '
>   Foo Bar 
> 
> So, I'm inclined to close this one.

Lintian expects the .dsc/.changes file to be UTF-8 encoded. If someone
uses the address, it leads to a .changes file with a nationally encoded
email, and lintian would fail to parse that /o\

This is getting way more complicated than I originally thought.



2026-01-27 Thread Guillem Jover
Hi!

On Tue, 2026-01-27 at 22:15:36 +0530, Nilesh Patra wrote:
> Package: dpkg
> Severity: important
> X-Debbugs-Cc: [email protected] 
> .org
> Control: affects -1 + src:lintian
> Version: 1.23.5

> Lintian’s test suite reflects this error:
> 
> | dpkg-source: warning: --auto-commit is not a valid option for Dpkg::Source::Package::V3::Native
> | dpkg-source: error: cannot parse Maintainer field value "Colorful <"colorful"@43-1.org>”
> | dpkg-buildpackage: error: dpkg-source -iNEVER_MATCH_ANYTHING -INEVER_MATCH_ANYTHING --auto-commit --before-build . subprocess failed with exit status 255
> | make: *** [/tmp/autopkgtest-lxc.18_8mz9y/downtmp/autopkgtest_tmp/build-and-evaluate-test-packages/package-sources/checks/fields/terminal-control/colorful/Makefile:39: colorful_1.0_amd64.changes] Error 255

Hmm, TBH I don't think I'm comfortable allowing ANSI escapes as part
of email addresses? This feels unnecessary and a recipe for terminal
exploits and similar.

> This is a fallout with #962277.
> 
> dpkg should allow domains with non-ascii chars. For now I’ll try to
> drop this test from lintian test suite.

Otherwise the parser already supports UTF-8 characters in both the
local and domain parts:

  $ perl -Mv5.40 -MDpkg::Email::Address -E '\
  my $e = Dpkg::Email::Address->new("Foo Bar "); \
  say $e->as_string(); \
'
  Foo Bar 

So, I'm inclined to close this one.

Thanks,
Guillem