Control: forcemerge -1 1125088

On Wed, Jul 23, 2025 at 09:54:06AM +0100, Colin Watson wrote:
> Control: tag -1 patch
> 
> On Sun, Apr 20, 2025 at 06:55:38AM +0200, Salvatore Bonaccorso wrote:
> > Running grep-excuses for instance right now against the linux package
> > gives:
> > 
> > linux (6.12.21-1 to 6.12.22-1)
> >    Maintainer: Debian Kernel Team
> >    Depends: linux linux-signed-amd64
> >    Depends: linux linux-signed-arm64
> >    Migration status for linux (6.12.21-1 to 6.12.22-1): Waiting for test 
> > results or another package, or too young (no action required now - check 
> > later)
> >    Issues preventing migration:
> > Wide character in print at /usr/bin/grep-excuses line 364.
> 
> This patch fixes it for me, but my Perl is quite rusty.  (Ideally it might
> use the locale encoding instead of hardcoding UTF-8; but it wouldn't be the
> first script in devscripts to take the latter approach, and the smarter
> approach used in who-permits-upload would cause grep-excuses to gain an
> additional dependency on libencode-locale-perl.)
> 
> diff --git a/scripts/grep-excuses.pl b/scripts/grep-excuses.pl
> index 5f4faeb3..99903e39 100755
> --- a/scripts/grep-excuses.pl
> +++ b/scripts/grep-excuses.pl
> @@ -22,6 +22,7 @@
>  use 5.006;
>  use strict;
>  use warnings;
> +use open ':std', OUT => ':encoding(UTF-8)';
>  use Data::Dumper;
>  use Dpkg::Path qw(find_command);
>  use File::Basename;

Hi,

The above change works for me. Using binmode to set the STDOUT mode to
encoding(UTF-8) should have the same result.
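
For illustration, a minimal sketch of the binmode variant, not tested
against grep-excuses itself. The $excuse value is a made-up stand-in
for a line containing the bullet operator (U+2219), not real output:

```perl
use strict;
use warnings;

# Hypothetical sample line (assumption): decoded text containing
# U+2219, the bullet operator that triggers "Wide character in print".
my $excuse = "Issues preventing migration: \x{2219} too young";

# Push a UTF-8 encoding layer onto STDOUT only, as an alternative to
# "use open ':std', OUT => ':encoding(UTF-8)';".
binmode(STDOUT, ':encoding(UTF-8)')
    or die "cannot set encoding on STDOUT: $!";

print "    $excuse\n";
```

Unlike the "use open" pragma, this only touches STDOUT and takes effect
at run time, at the point where binmode is called.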

I also tried the libencode-locale-perl way, but it seems I am getting
"รข" instead of the bullet operator the PTS is using; maybe I did
something wrong.
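
For comparison, a rough sketch of a locale-based variant that uses only
core modules (I18N::Langinfo, POSIX, Encode) instead of
libencode-locale-perl. This is not what who-permits-upload does, and
the fallback to UTF-8 when Encode does not know the locale codeset is
my own assumption about what would be reasonable here:

```perl
use strict;
use warnings;
use Encode qw(find_encoding);
use I18N::Langinfo qw(langinfo CODESET);
use POSIX qw(setlocale LC_CTYPE);

# Pick up LC_CTYPE from the environment so CODESET reflects the
# user's locale.
setlocale(LC_CTYPE, '');

# Ask the C library for the locale codeset (e.g. "UTF-8").
my $codeset = langinfo(CODESET);

# Assumption: fall back to UTF-8 if Encode does not recognise the name.
my $enc = find_encoding($codeset) || find_encoding('UTF-8');

binmode(STDOUT, ':encoding(' . $enc->name . ')')
    or die "cannot set encoding on STDOUT: $!";

print "Output encoding: ", $enc->name, "\n";
```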

I also tried libtext-unidecode-perl, but did not like the result.

Most messages come from the bullet operator used by the PTS, so if we
only want to keep that quiet, the attached patch may help (it changes
the bullet operator to an "o" character). This is not a full fix, since
maintainer names can contain non-ASCII characters and those would still
trigger the message, but it should at least make those messages much
rarer. If we really do not want to unconditionally hardcode UTF-8 it
may be an option, with the known problems.

While we are at it, I am merging an equivalent bug report.

-- 
Agustin
--- grep-excuses.orig	2026-03-21 21:20:18.221692524 +0100
+++ grep-excuses	2026-03-22 00:56:13.709989639 +0100
@@ -361,6 +361,7 @@
         $excuse                      =~ s@</?[^>]+>@@g;
         $excuse                      =~ s@&lt;@<@g;
         $excuse                      =~ s@&gt;@>@g;
+        $excuse                      =~ s@\x{2219}@o@g; # Bullet operator used by pts.d.o
         print "    $excuse\n";
     }
 }
