Bug#1035734: Does not search for terms in latin1 encoding anymore

2023-05-08 Thread Axel Beckert
Control: tag -1 + moreinfo

Hi Mowgli,

Klaus Ethgen wrote:
> I usually have latin1-Encoding everywhere. The output of translate is
> good but the input does not allow latin1-umlauts like ä, ö, ü, ... I can
> input them but they always return nothing.

Interesting, thanks for the bug report.

> Older versions worked well but I cannot say, when the incompatibility
> was implemented. At least version 0.6 worked well.

Can you check which Debian package revision of 0.6 that was? All recent
(non-comment) UTF-8 related changes I see in the git log seem to have
been in 0.6-6 and 0.6-7, which were actually long ago (2005).

Just a thought: Could this change in Debian's locales package have
caused some unexpected side effect on already configured non-UTF-8
locales?

  locales (2.31-14) unstable; urgency=low

    * Starting with locales 2.31-14, non UTF-8 locales are deprecated and not
      offered anymore in the debconf dialog, except for the ones already
      configured. Nevertheless users of non UTF-8 locales are encouraged to
      switch their system to an UTF-8 locale.

      Please note that iconv still supports conversion to and from non UTF-8
      charset. For instance reading a file using an ISO-8859-15 charset can be
      done with: iconv --from-code=ISO-8859-15 foobar.txt

   -- Aurelien Jarno  Tue, 17 Aug 2021 16:27:59 +0200
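
For illustration, here is a minimal sketch of the kind of conversion
that changelog entry refers to, applied to a single latin1 umlaut. The
helper name latin1_to_utf8_hex is made up for this example and is not
part of translate; it only shows which bytes iconv produces.

```shell
#!/bin/sh
# Hypothetical helper (not from the translate script): convert
# ISO-8859-1 bytes on stdin to UTF-8 and print them as plain hex.
latin1_to_utf8_hex() {
    iconv --from-code=ISO-8859-1 --to-code=UTF-8 | od -An -tx1 | tr -d ' \n'
}

# \344 is the octal escape for byte 0xE4, which is "ä" in ISO-8859-1.
printf '\344' | latin1_to_utf8_hex
echo
```

The single latin1 byte becomes a two-byte UTF-8 sequence, so a search
term that is passed on without conversion will not match UTF-8 data.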

> The strange thing is, that I see no reason for the bug. Even when I do
> `bash -x /usr/bin/translate ...` it DOES iconv my input. But if I just
> do `/usr/bin/translate ...`, it doesn't. This bug is a full riddle for
> me.
[…]
> Locale: LANG=C, LC_CTYPE=C.UTF-8 (charmap=UTF-8), LANGUAGE not set

Or could this LC_CTYPE with C.UTF-8 be the reason? Because this would
likely cause $UTF8 to be set to 1 in the script:

  UTF8=0
  if locale 2>&1 | grep -E -q "UTF-8?$"; then
      UTF8=1
  fi
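
A rough sketch of what I mean, not taken from the script itself: the
function names and the sample `locale` output below are assumptions
(I wrote the sample by hand for LANG=C with LC_CTYPE=C.UTF-8), only the
grep test is the one quoted above.

```shell
#!/bin/sh
# Assumed sample of `locale` output for LANG=C and LC_CTYPE=C.UTF-8
# (hand-written for illustration, shortened to a few lines).
sample_locale_output() {
    cat <<'EOF'
LANG=C
LANGUAGE=
LC_CTYPE=C.UTF-8
LC_NUMERIC="C"
LC_ALL=
EOF
}

# Hypothetical wrapper around the script's test: reads locale-style
# lines from stdin and prints 1 if any line matches, else 0.
detect_utf8() {
    if grep -E -q "UTF-8?$"; then
        echo 1
    else
        echo 0
    fi
}

echo "UTF8=$(sample_locale_output | detect_utf8)"
```

With such an environment the check reports UTF8=1 even though LANG is
plain C, so the script would treat the input as UTF-8.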

Actually I just now noticed a typo in that regexp. It likely should be
"UTF-?8$" and not "UTF-8?$" (i.e. the dash is optional, not the digit
eight). That typo is now fixed in git. (I don't think that this typo
caused your issue, though. It could have caused issues the other way
around: UTF-8 locales not being detected properly.)
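
To make the difference between the two patterns concrete, here is a
small sketch that runs both against a few sample lines. The helper
matches and the sample values are made up for this example; they are
not part of the translate script.

```shell
#!/bin/sh
# Hypothetical helper: print "yes" if the sample line ($2) matches
# the extended regexp ($1), otherwise "no".
matches() {
    if printf '%s\n' "$2" | grep -E -q "$1"; then
        echo yes
    else
        echo no
    fi
}

old='UTF-8?$'   # digit eight is optional (the typo)
new='UTF-?8$'   # dash is optional (the intended pattern)

# Sample lines are assumptions, written in the style of `locale` output.
for sample in 'LC_CTYPE=C.UTF-8' 'LANG=de_DE.UTF8' 'LANG=de_DE.ISO-8859-1'; do
    printf '%-24s old=%s new=%s\n' "$sample" \
        "$(matches "$old" "$sample")" "$(matches "$new" "$sample")"
done
```

Only the dash-less spelling "UTF8" is treated differently: the old
pattern misses it, the new one catches it. A value ending in "UTF-8"
matches both, which is why the typo should not matter for your locale.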

Regards, Axel
-- 
 ,''`.  |  Axel Beckert, https://people.debian.org/~abe/
: :' :  |  Debian Developer, ftp.ch.debian.org Admin
`. `'   |  4096R: 2517 B724 C5F6 CA99 5329  6E61 2FF9 CD59 6126 16B5
  `-|  1024D: F067 EA27 26B9 C3FC 1486  202E C09E 1D89 9593 0EDE



Bug#1035734: Does not search for terms in latin1 encoding anymore

2023-05-08 Thread Klaus Ethgen
Package: translate

Version: 1.1.3
Severity: normal

I usually have latin1-Encoding everywhere. The output of translate is
good but the input does not allow latin1-umlauts like ä, ö, ü, ... I can
input them but they always return nothing.

Older versions worked well but I cannot say, when the incompatibility
was implemented. At least version 0.6 worked well.

The strange thing is, that I see no reason for the bug. Even when I do
`bash -x /usr/bin/translate ...` it DOES iconv my input. But if I just
do `/usr/bin/translate ...`, it doesn't. This bug is a full riddle for
me.

As translate is of major use to me, please re-allow non-UTF-8 input.

-- System Information:
Debian Release: 12.0
  APT prefers experimental
  APT policy: (1, 'experimental')
merged-usr: no
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 6.1.12 (SMP w/8 CPU threads)
Kernel taint flags: TAINT_PROPRIETARY_MODULE, TAINT_CPU_OUT_OF_SPEC, 
TAINT_FIRMWARE_WORKAROUND, TAINT_OOT_MODULE
Locale: LANG=C, LC_CTYPE=C.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /bin/dash
Init: sysvinit (via /sbin/init)

Versions of packages translate depends on:
ii  trans-de-en  1.9-6

translate recommends no packages.

Versions of packages translate suggests:
pn  ding  

-- no debconf information

-- 
Klaus Ethgen   http://www.ethgen.ch/
pub  4096R/4E20AF1C 2011-05-16 Klaus Ethgen
Fingerprint: 85D4 CA42 952C 949B 1753  62B3 79D0 B06F 4E20 AF1C

