On 25/01/2026 15:29, [email protected] wrote:
On 25-01-2026 at 15:41, Egmont Koblinger wrote:
"simply"? It sounds anything but simple to me.
[...]
your only anchors are the "-a", "-b" and "--color" words in this example,
which you presumably locate using some heuristics (needs to begin with a
dash? or needs to begin after no more than a certain amount of leading
spaces? does it stop at the first '=' or '[' sign?), and then there are
exceptions to these rules (like the output of `[ --help` contains many
options not beginning with a dash)...
You apparently haven't looked at the script. Claude did a fine job
creating the heuristics for catching all options -- it even included
the `dd` options that don't start with a dash _and_ excluded the two
"-M " cases that aren't an option.
I don't have too much trust that it can properly split _all_ of them
without producing a single faulty translation.
Okay. Here is the result of the first version of the split.py script
run on the latest coreutils.hu.po (from 2018):
https://translationproject.org/latest/coreutils/HU.po
Compare with the lowercase version to see the changes. Let us know
if you spot any splitting mistake.
Another existing issue I noticed in a few places in hu.po
(and some other translations) is inadequate separation
between the --option and its description.
This would affect man page layout if one were generating
those for various languages. It would also affect
the new option highlighting in --help.
The following sed command adjusts all the po files accordingly.
Benno, is this something you might run when updating the po set?
# Add an extra space if there is only one
sed -i -E 's/^(msgid )?"(  -., | {4,6}--)([^ \]+) ([^ -\])/\1"\2\3  \4/' *.po
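A cautious way to check the expression before using -i is to print
only the lines it would change. This is just a sketch: the sample
entries below are made up, and it assumes help lines start with two
spaces before a short option, or four to six spaces before a long one.
LC_ALL=C is set so that the ' -\' range is interpreted by byte value.

```shell
# Hypothetical sample; real input would be the coreutils *.po files.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
msgid "  -a, --all do not ignore entries starting with .\n"
"      --author with -l, print the author of each file\n"
"  -b, --escape  print C-style escapes\n"
EOF

# Same substitution, but with -n and the p flag instead of -i, so
# nothing is modified and only the lines that would change are shown.
preview=$(LC_ALL=C sed -n -E \
  's/^(msgid )?"(  -., | {4,6}--)([^ \]+) ([^ -\])/\1"\2\3  \4/p' "$tmp")
printf '%s\n' "$preview"
rm -f "$tmp"
```

The first two sample lines have a single space after the option and
are printed with two; the third already has two and is left alone.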
Or maybe this is more appropriate to enforce in coreutils
when importing new translations?
thanks,
Padraig