Re: new snapshot available: coreutils-9.9.258-49420.tar.xz
On 24/01/2026 14:57, Pádraig Brady wrote:
> On 24/01/2026 14:17, Benno Schulenberg wrote:
>> On 24-01-2026 at 13:45, Pádraig Brady wrote:
>>> On 22/01/2026 16:39, Benno Schulenberg wrote:
>>>> If you come up with a script, you'll have to send it out to each
>>>> of the translators. Don't use me as an intermediate.
>>>
>>> Ok I'll have a look at doing that, and possibly proposing updated
>>> translations. It may not be in place for this release, but should
>>> be in place for the next.
>>
>> Well... what should I do meanwhile with the current POT file? If I
>> announce it to the translators now, there will be some who will do
>> all the work to update the strings, only to find out upon the next
>> release that most of that work could have been a lot easier.
>>
>> How about postponing the 9.10 release for two weeks and work
>> together on a script in these last days of January?
>
> That does sound reasonable.

I asked claude sonnet 4.5 to code a script to split the combined
options in the existing po files to separate translations per option.
It basically one shotted it in about 10 seconds.

At that stage there are only whitespace differences and msgmerge
is able to fuzzy match those fine.

Testing the attached script with the latest pl.po for example:

  $ mkdir -p po/test
  $ cd po/test
  $ wget https://translationproject.org/latest/coreutils/pl.po
  $ ./split.py pl.po | msguniq --no-wrap > pl-split.po

I tested that this merged in well with the latest pot:

  $ msgmerge pl-split.po ../coreutils.pot > pl-new.po

split.py will keep the older single line layout in translations,
but that can be adjusted over time if desired, and the highlighting
code works fine with that layout anyway.

So I presume we could just update all latest po files with this script,
then upload all these "split" po files without involving translators?
Then before release we can download and merge as usual.

cheers,
Padraig

#!/usr/bin/env python3
import sys
import re

def split_po_entries(lines):
    i = 0
    while i < len(lines):
        line = lines[i]
        if line.strip() == 'msgid ""':
            start_i = i
            msgid_lines = []
            i += 1
            while i < len(lines) and lines[i].startswith('"'):
                msgid_lines.append(lines[i])
                i += 1
            if i < len(lines) and lines[i].strip() == 'msgstr ""':
                msgstr_lines = []
                i += 1
                while i < len(lines) and lines[i].startswith('"'):
                    msgstr_lines.append(lines[i])
                    i += 1

                def is_option(line):
                    if line.startswith('" --'):
                        return True
                    if line.startswith('" -'):
                        text = line[4:]
                        if text.startswith('M '):
                            return False
                        if len(text) > 0 and text[0] != ' ':
                            return True
                    if re.match(r'^" \S+ -\S\S \S+ ', line):
                        return True
                    if re.match(r'^" [a-zA-Z0-9_]+=[a-zA-Z0-9_]+ ', line):
                        return True
                    return False

                has_options = any(is_option(line) for line in msgid_lines)
                if has_options:
                    first_non_empty = None
                    for j, line in enumerate(msgid_lines):
                        if line.strip() not in ('""', '"\\n"'):
                            first_non_empty = j
                            break
                    if first_non_empty is not None and is_option(msgid_lines[first_non_empty]):
                        msgid_lines = msgid_lines[first_non_empty:]
                        msgstr_lines = msgstr_lines[first_non_empty:] if first_non_empty < len(msgstr_lines) else msgstr_lines
                    msgid_groups = []
                    msgstr_groups = []
                    msgid_indices = [0]
                    for j in range(1, len(msgid_lines)):
                        if is_option(msgid_lines[j]):
                            msgid_indices.append(j)
                    msgid_indices.append(len(msgid_lines))
                    msgstr_indices = [0]
                    for j in range(1, len(msgstr_lines)):
                        if is_option(msgstr_lines[j]):
                            msgstr_indices.append(j)
                    msgstr_indices.append(len(msgstr_lines))
                    for k in range(len(msgid_indices) - 1):
                        msgid_groups.append(msgid_lines[msgid_indices[k]:msgid_indices[k+1]])
                    for k in range(len(msgstr_indices) - 1):
                        msgstr_groups.append(msgstr_lines[msgstr_indices[k]:msgstr_indices[k+1]])
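
For readers following along, the per-option split described in this message can be illustrated with a small standalone sketch. It is not the attached split.py: the regular expression and the name `split_by_option` are assumptions made for illustration, and it works on plain help text rather than on .po syntax.

```python
import re

# Assumption: an option line is indentation followed by "-x" or "--long".
OPTION_START = re.compile(r"^\s+-{1,2}[A-Za-z0-9]")


def split_by_option(text):
    """Split a multi-line help string into chunks, one per option.

    Text before the first option forms its own leading chunk, and
    continuation lines stay attached to the option above them.
    """
    chunks = []
    for line in text.splitlines(keepends=True):
        if OPTION_START.match(line) or not chunks:
            chunks.append(line)      # start a new chunk
        else:
            chunks[-1] += line       # continuation of the current chunk
    return chunks


if __name__ == "__main__":
    help_text = (
        "  -c, --bytes\n"
        "         print the byte counts\n"
        "  -m, --chars\n"
        "         print the character counts\n"
    )
    for chunk in split_by_option(help_text):
        print(repr(chunk))
```

A description line that itself begins with a dash would be mistaken for a new option here, which is presumably why a real script needs extra special cases.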
Re: new snapshot available: coreutils-9.9.258-49420.tar.xz
On 24/01/2026 14:17, Benno Schulenberg wrote:
> On 24-01-2026 at 13:45, Pádraig Brady wrote:
>> On 22/01/2026 16:39, Benno Schulenberg wrote:
>>> If you come up with a script, you'll have to send it out to each
>>> of the translators. Don't use me as an intermediate.
>>
>> Ok I'll have a look at doing that, and possibly proposing updated
>> translations. It may not be in place for this release, but should
>> be in place for the next.
>
> Well... what should I do meanwhile with the current POT file? If I
> announce it to the translators now, there will be some who will do
> all the work to update the strings, only to find out upon the next
> release that most of that work could have been a lot easier.
>
> How about postponing the 9.10 release for two weeks and work
> together on a script in these last days of January?

That does sound reasonable.

thank you,
Padraig
Re: new snapshot available: coreutils-9.9.258-49420.tar.xz
On 24-01-2026 at 13:45, Pádraig Brady wrote:
> On 22/01/2026 16:39, Benno Schulenberg wrote:
>> If you come up with a script, you'll have to send it out to each
>> of the translators. Don't use me as an intermediate.
>
> Ok I'll have a look at doing that, and possibly proposing updated
> translations. It may not be in place for this release, but should
> be in place for the next.

Well... what should I do meanwhile with the current POT file? If I
announce it to the translators now, there will be some who will do
all the work to update the strings, only to find out upon the next
release that most of that work could have been a lot easier.

How about postponing the 9.10 release for two weeks and work
together on a script in these last days of January?

--
Regards,

Benno
Re: new snapshot available: coreutils-9.9.258-49420.tar.xz
On 22/01/2026 16:39, Benno Schulenberg wrote:
> On 22-01-2026 at 16:57, Pádraig Brady wrote:
>> Well I did CC you on the original proposal 1 month ago.
>
> I read that with less than half an eye, and thought it was just about
> making URLs clickable. When something needs my attention, please email
> me directly -- not in a CC.
>
>> For white space only changes, might tooling help with that?
>
> Yes. It would. But I don't see how a script could manage with both
> the reformatting and the splitting.
>
> If you come up with a script, you'll have to send it out to each of the
> translators. Don't use me as an intermediate.

Ok I'll have a look at doing that, and possibly proposing updated
translations. It may not be in place for this release, but should
be in place for the next.

thanks,
Padraig
Re: new snapshot available: coreutils-9.9.258-49420.tar.xz
Pádraig Brady writes:

> On 22/01/2026 15:44, Benno Schulenberg wrote:
>> On 22-01-2026 at 14:50, Pádraig Brady wrote:
>>
>>> doc: all: use option highlighting and more standard alignment
>> Arggh!!! What happened?! Diffing the POT file with the previous version
>> gives me four thousand eight hundred lines!!
>> Looking at that commit (2dad24adc0), the first change -- in basename.c --
>> does not change any wording but just reformats the lines in two ways:
>> to take up more space (which one does not want in a --help text) and
>> to slice up the text into separate strings (one per option). And from
>> a quick scan, all changes are like this. The horror! The horror! This
>> invalidates _all_ existing translations. What incredible disrespect for
>> the translators.
> Well I did CC you on the original proposal 1 month ago.
> For white space only changes, might tooling help with that?

FWIW, ISTM that the splitting change could be reverted (even though I
consider it a generally better approach for those to be split up
logically), if oputs is slightly adjusted to iterate options in a string
that contains multiple options.

... but, I'd be surprised if a script couldn't process .po files to
perform the same split as in the .pot
--
Arsen Arsenović
Re: new snapshot available: coreutils-9.9.258-49420.tar.xz
On 23/01/2026 10:38, Benno Schulenberg wrote:
> I _strongly_ urge the maintainers to _revert_ these string changes.
> They are a world of hurt -- for very, very little gain, as far as I
> can see.

I mentioned some of the advantages of the new format in:
https://github.com/coreutils/coreutils/commit/cab15fc4e

Given we were churning the strings anyway due to the separation into
one translation per option, I thought it the best time to adjust the
format. Long term I expect this to ease the translation burden, with
translators having to worry a lot less about layout, and with fewer
changes to translations because each one is more isolated.

> These are command-line tools, not the clickety-clickety world of
> the web.

Documentation is primarily consumed on the web these days.

thanks,
Padraig.
Re: new snapshot available: coreutils-9.9.258-49420.tar.xz
On Fri, Jan 23, 2026 at 11:38 AM Benno Schulenberg wrote:
>AttributeError: can't set attribute 'fuzzy'
I don't get this error.
polib is just a single file, I have version 1.2.0, this very exact file:
https://github.com/izimobil/polib/blob/1.2.0/polib.py
Can you please try replacing your polib with this one?
I have python 3.13.7 (Ubuntu 25.10), in case this also makes a difference.
If you can't get it working, I'd be happy to attach the result of my
script, for all languages, in a tarball.
> Commenting out that line makes the script succeed, and it reports:
Well, not marking these autogenerated new strings as fuzzy will have
consequences you're obviously aware of.
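
As an aside on the AttributeError above: one way to avoid depending on a settable `fuzzy` attribute is to add the flag to the entry's flag list directly. The sketch below assumes the entry object exposes a `flags` list of strings, which polib entries are understood to do; `mark_fuzzy` and `FakeEntry` are hypothetical names used only for this illustration, and the stand-in class exists so the example runs without polib installed.

```python
def mark_fuzzy(entry):
    """Add the 'fuzzy' flag to an entry that exposes a `flags` list.

    Assumption: `entry.flags` is a mutable list of flag strings.
    The flag is added at most once.
    """
    if "fuzzy" not in entry.flags:
        entry.flags.append("fuzzy")


class FakeEntry:
    """Hypothetical stand-in for a PO entry object, for demonstration only."""

    def __init__(self):
        self.flags = []


if __name__ == "__main__":
    entry = FakeEntry()
    mark_fuzzy(entry)
    mark_fuzzy(entry)  # second call is a no-op
    print(entry.flags)
```

Whether this behaves identically to `entry.fuzzy = True` in every polib version has not been verified here.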
> Looking at the --help strings for `wc`...
>
>msgid ""
>" -c, --bytes\n"
>" print the byte counts\n"
>msgstr " -t, --text in tekstmodus lezen (standaard)\n"
In these cases, the old translation is no longer present in the .po
file, that's why my script couldn't recover them.
Please see an updated version of the script. It takes three
parameters, in this "chronological" order:
- the old .po file, which you should grab from before the hyperlink
change (e.g. from coreutils 9.9 tarball)
- the current .po file
- the output file
It takes from the second file which translations need to be updated,
but also uses the first file as "inspiration" to find a candidate
translation.
It resurrects 535 Dutch messages for me, including:
#: src/wc.c:187
#, fuzzy
msgid ""
" -c, --bytes\n"
" print the byte counts\n"
msgstr ""
" -t, --text in tekstmodus lezen (standaard)\n"
"\n"
"Met onderstaande opties kunt u kiezen welke aantallen weergeven worden,\n"
"altijd in deze volgorde: regels, woorden, tekens, bytes, maximum regellengte.\n"
"\n"
" -c, --bytes het aantal bytes tonen\n"
" -m, --chars het aantal tekens tonen\n"
" -l, --lines het aantal regels tonen (in feite het aantal LF-tekens)\n"
which, as I've stated in my previous mail (and is expected from this
script), still needs to be cleaned up, but at least the previous
translation is in there.
> What a useful script should do: [...]
> Not easy.
What I'm aiming for is a reasonable compromise between developing that
script vs. easing translator work. I hope I could get pretty close to
that.
Surely I could spend a week or two coming up with the perfect script
perfectly extracting and reformatting all the translations, but I
won't (and I guess no one else will).
e.
#!/usr/bin/python3

# https://github.com/izimobil/polib/
# https://pypi.org/project/polib/
import polib
import sys

if len(sys.argv) != 4:
    print('Usage: coreutils-help-hyperlink-po-fixer-v2.py input-old-po-file input-current-po-file output-po-file\n')
    sys.exit(1)

po0 = polib.pofile(sys.argv[1])
po1 = polib.pofile(sys.argv[2])

# Collapse whitespace in every msgid so reformatted strings still match.
for entry in po0 + po1:
    entry.msgid_squeezed = " ".join(entry.msgid.split())

count = 0
for entry in po1:
    if entry.obsolete:
        continue
    # Only untranslated or fuzzy entries need a candidate translation.
    if entry.msgstr != "" and not entry.fuzzy:
        continue
    # Leftover debugging output for the wc --lines entry.
    if 'print the newline counts' in entry.msgid_squeezed:
        print('yup here')
        print(entry.msgid)
        print(entry.msgstr)
    found = False
    # Use the first longer msgid (old or current) that contains this one.
    for e2 in po0 + po1:
        if len(entry.msgid_squeezed) < len(e2.msgid_squeezed) and entry.msgid_squeezed in e2.msgid_squeezed:
            entry.msgstr += e2.msgstr
            entry.fuzzy = True
            found = True
            break
    if found:
        count += 1

po1.save(sys.argv[3])
print(f'Resurrected {count} hopefully useful fuzzy translations')
Re: new snapshot available: coreutils-9.9.258-49420.tar.xz
On 22-01-2026 at 20:19, Egmont Koblinger wrote:
> The attached script replaces some of the empty translations to fuzzy
> ones. It looks for an obsoleted longer msgid that contains the new
> msgid as a substring, allowing whitespace changes;

Thanks for the attempt. But it's far from useful.

> and if found then this becomes the fuzzy translation.

Running the script here says:

Traceback (most recent call last):
  File "/home/ben/...hyperlink-po-fixer.py", line 29, in
    entry.fuzzy = True
AttributeError: can't set attribute 'fuzzy'

Commenting out that line makes the script succeed, and it reports:

Resurrected 313 hopefully useful fuzzy translations

Looking at the --help strings for `wc`...

msgid ""
" -c, --bytes\n"
" print the byte counts\n"
msgstr " -t, --text in tekstmodus lezen (standaard)\n"

msgid ""
" -m, --chars\n"
" print the character counts\n"
msgstr " [-]echo invoertekens echoën\n"

msgid ""
" -l, --lines\n"
" print the newline counts\n"
msgstr " -l, --login de inlogprocessen tonen\n"

Those are useless, because not in any way translations of the msgid.

What a useful script should do: look for msgids that describe an option
(that is: those that start with " -" or more spaces and "--") and then
find that same option plus its description (folded into a single line)
in the obsolete msgids, and when found, extract the corresponding option
plus its description (everything up to the next option) into the new
msgstr. Not easy.

I _strongly_ urge the maintainers to _revert_ these string changes.
They are a world of hurt -- for very, very little gain, as far as I
can see. These are command-line tools, not the clickety-clickety world
of the web.

> You still have to manually remove the irrelevant options and manually
> reformat, but that's an easy boring mechanical task.

For ten, maybe twenty strings that would be okay. But for three hundred
or more strings... that is torture.

--
Regards,

Benno
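
The extraction step outlined in this message (find the option in the old combined translation, then take everything up to the next option) might look roughly like the sketch below. It is only an illustration under assumptions: options are matched by their long name, option lines are recognised by a simple regular expression, and `extract_option_translation` is a hypothetical name, not part of any script posted in this thread.

```python
import re

# Assumptions: option lines are indented and start with "-x" or "--long";
# options are identified by their long name only.
OPTION_LINE = re.compile(r"^\s+-{1,2}[A-Za-z0-9]")
LONG_FLAG = re.compile(r"--[A-Za-z0-9][A-Za-z0-9-]*")


def extract_option_translation(new_msgid, old_msgstr):
    """Return the part of old_msgstr that describes the option named in
    new_msgid (the option line plus its continuation lines), or None."""
    flag = LONG_FLAG.search(new_msgid)
    if flag is None:
        return None
    wanted = flag.group(0)
    collected = []
    for line in old_msgstr.splitlines(keepends=True):
        if OPTION_LINE.match(line):
            if collected:
                break                      # reached the next option: stop
            if wanted in LONG_FLAG.findall(line):
                collected.append(line)     # found the wanted option line
        elif collected:
            collected.append(line)         # continuation of the wanted option
    return "".join(collected) or None


if __name__ == "__main__":
    old = (
        "  -c, --bytes  het aantal bytes tonen\n"
        "  -m, --chars  het aantal tekens tonen\n"
        "  -l, --lines  het aantal regels tonen\n"
    )
    new = "  -m, --chars\n         print the character counts\n"
    print(extract_option_translation(new, old))
```

Short-only options and translations that reorder or rename options are not handled, so any result would still need to be marked fuzzy and reviewed.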
Re: new snapshot available: coreutils-9.9.258-49420.tar.xz
Hi,
> Arggh!!! What happened?! Diffing the POT file with the previous version
> gives me four thousand eight hundred lines!!
Two main changes happened in the .po files: longer translations were
split into several shorter ones (which will help maintainability in the
long run), and the strings were reformatted.
I've hacked together a quick script that recovers most of the translations.
The attached script replaces some of the empty translations to fuzzy
ones. It looks for an obsoleted longer msgid that contains the new
msgid as a substring, allowing whitespace changes; and if found then
this becomes the fuzzy translation.
You still have to manually remove the irrelevant options and manually
reformat, but that's an easy boring mechanical task. You don't need
to look elsewhere for the old translation, or re-translate.
I hope you find it useful. I have just barely tested, let me know if
it breaks something. For most languages it turns about 300 empty
translations into such fuzzy but usable ones.
Usage: proggyname input-po-file output-po-file
Needs python3-polib, most likely available in your distro.
e.
#!/usr/bin/python3

# https://github.com/izimobil/polib/
# https://pypi.org/project/polib/
import polib
import sys

if len(sys.argv) != 3:
    print('Usage: coreutils-help-hyperlink-po-fixer.py input-po-file output-po-file\n')
    sys.exit(1)

po = polib.pofile(sys.argv[1])

# Collapse whitespace in every msgid so reformatted strings still match.
for entry in po:
    entry.msgid_squeezed = " ".join(entry.msgid.split())

count = 0
for entry in po:
    # Only handle active entries that have no translation at all.
    if entry.msgstr != "" or entry.fuzzy or entry.obsolete:
        continue
    found = False
    # Append the translation of every longer msgid that contains this one.
    for e2 in po:
        if len(entry.msgid_squeezed) < len(e2.msgid_squeezed) and entry.msgid_squeezed in e2.msgid_squeezed:
            entry.msgstr += e2.msgstr
            entry.fuzzy = True
            found = True
            # break??
    if found:
        count += 1

po.save(sys.argv[2])
print(f'Resurrected {count} hopefully useful fuzzy translations')
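
The matching rule used by this script can be shown in isolation, without polib. The following is a minimal sketch of the same idea (whitespace-insensitive substring search against longer msgids); `squeeze` and `find_candidate` are hypothetical helper names, and the entries are plain (msgid, msgstr) tuples rather than polib objects.

```python
def squeeze(text):
    """Collapse every run of whitespace (including newlines) into one space."""
    return " ".join(text.split())


def find_candidate(new_msgid, old_entries):
    """Return the msgstr of the first old entry whose msgid strictly
    contains new_msgid once whitespace is normalized, or None.

    old_entries is an iterable of (msgid, msgstr) pairs.
    """
    needle = squeeze(new_msgid)
    for old_msgid, old_msgstr in old_entries:
        haystack = squeeze(old_msgid)
        if len(needle) < len(haystack) and needle in haystack:
            return old_msgstr
    return None


if __name__ == "__main__":
    old_entries = [
        (
            "  -c, --bytes            print the byte counts\n"
            "  -m, --chars            print the character counts\n",
            "(old combined translation)",
        ),
    ]
    new_msgid = "  -c, --bytes\n         print the byte counts\n"
    print(find_candidate(new_msgid, old_entries))
```

The candidate is the whole old translation, covering every option in the combined string, which is why the result can only ever be a fuzzy starting point.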
Re: new snapshot available: coreutils-9.9.258-49420.tar.xz
On 22-01-2026 at 16:57, Pádraig Brady wrote:
> Well I did CC you on the original proposal 1 month ago.

I read that with less than half an eye, and thought it was just about
making URLs clickable. When something needs my attention, please email
me directly -- not in a CC.

> For white space only changes, might tooling help with that?

Yes. It would. But I don't see how a script could manage with both
the reformatting and the splitting.

If you come up with a script, you'll have to send it out to each of the
translators. Don't use me as an intermediate.

--
Regards,

Benno
Re: new snapshot available: coreutils-9.9.258-49420.tar.xz
On 22/01/2026 15:44, Benno Schulenberg wrote:
> On 22-01-2026 at 14:50, Pádraig Brady wrote:
>> doc: all: use option highlighting and more standard alignment
>
> Arggh!!! What happened?! Diffing the POT file with the previous version
> gives me four thousand eight hundred lines!!
>
> Looking at that commit (2dad24adc0), the first change -- in basename.c --
> does not change any wording but just reformats the lines in two ways:
> to take up more space (which one does not want in a --help text) and
> to slice up the text into separate strings (one per option). And from
> a quick scan, all changes are like this. The horror! The horror! This
> invalidates _all_ existing translations. What incredible disrespect for
> the translators.

Well I did CC you on the original proposal 1 month ago.
For white space only changes, might tooling help with that?

thanks,
Padraig
Re: new snapshot available: coreutils-9.9.258-49420.tar.xz
On 22-01-2026 at 14:50, Pádraig Brady wrote:
> doc: all: use option highlighting and more standard alignment

Arggh!!! What happened?! Diffing the POT file with the previous version
gives me four thousand eight hundred lines!!

Looking at that commit (2dad24adc0), the first change -- in basename.c --
does not change any wording but just reformats the lines in two ways:
to take up more space (which one does not want in a --help text) and
to slice up the text into separate strings (one per option). And from
a quick scan, all changes are like this. The horror! The horror! This
invalidates _all_ existing translations. What incredible disrespect for
the translators.

--
Benno
