Hi everyone and Wolfgang in particular ;)

I prepared a code snippet that is able to remove some untranslatable
strings from the manuals.
I tested it locally (as a separate script) on the Debian Edu bullseye
manual, but more rigorous testing is probably advisable before taking it
into production.
I suppose the snippet could be appended to scripts/get_manual, although I
could not test this here.

---- begin code snippet -----

# create $name-stripped.xml,
# which has some non-translatable strings removed
# ---remove untranslatable image names--- #
echo "removing image names"
sed -e 's#<imagedata.*</imageobject>#</imageobject>#g' "$xmlfile" > "${name}-stripped.xml"
# ---remove paragraphs that just have a <ulink> and no other text--- #
echo "removing link paragraphs"
    #---# first copy those paragraphs to a tempfile #---#
    TMPFILE3=$(mktemp)
    sed -n '/^<para><ulink/p' "$xmlfile" | sed -n '/> *$/p' > "$TMPFILE3"
    #---# then replace those links with an empty string #---#
    #---# and keep only the <para> tag to prevent xml from being broken #---#
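    #---# note: each line is used as a sed pattern, so a link #---#
    #---# containing '#' would break the substitution #---#
    while IFS= read -r line ; do
        sed -i "s#$line#<para>#" "${name}-stripped.xml"
    done < "$TMPFILE3"
    rm -f "$TMPFILE3"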
# ---remove FIXME: paragraphs--- #
# ---(currently that colon is missing in some FIXME paragraphs)--- #
echo "removing FIXME: paragraphs"
sed -i '/^FIXME:/d' "${name}-stripped.xml"

---- end code snippet -----
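
The same three steps could probably also be done in a single sed pass,
which avoids the temporary file and the loop that feeds each link line
back into sed as a pattern. This is only a minimal sketch, not tested
inside scripts/get_manual; the function name strip_untranslatable and its
two arguments are hypothetical, and it assumes (as the snippet above does)
that a link-only paragraph sits on one line starting with <para><ulink and
ending with ">".

```shell
# Hypothetical helper: $1 = input XML file, $2 = stripped output file.
strip_untranslatable() {
    # 1) drop <imagedata .../> up to the closing </imageobject> tag
    # 2) reduce lines that hold only "<para><ulink .../>" to "<para>"
    # 3) delete lines that start with "FIXME:"
    sed -e 's#<imagedata.*</imageobject>#</imageobject>#g' \
        -e '/^<para><ulink/{/> *$/s#.*#<para>#;}' \
        -e '/^FIXME:/d' \
        "$1" > "$2"
}
```

It would be called as: strip_untranslatable "$xmlfile" "${name}-stripped.xml"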

For this to be useful, po4a.cfg also needs a small addition.
It should look like this (with the added pot_in line):
[po_directory] .

[type: docbook] debian-edu-bullseye-manual.xml \
        pot_in:debian-edu-bullseye-manual-stripped.xml \
        $lang:$lang.xml \
        add_$lang:?./$lang.add \
        opt:"-o nodefault='<inlinemediaobject> <imagedata>' -o untranslated='<listitem> <inlinemediaobject> <imagedata>' -M UTF-8 -k 15"

With this enabled, the Debian Edu bullseye manual has 1154 translatable
strings instead of the current 1210. More untranslatable strings could be
moved out with some adaptations to the wiki (e.g. rewording paragraphs that
contain a link, so that those links could become separate paragraphs in a
meaningful way).
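
A rough way to compare such string counts before and after the change is
to count the msgid entries in the generated POT file. This is only a
sketch: the function name count_pot_strings is hypothetical, it assumes
the file starts with the usual empty header entry (msgid ""), and it is
not necessarily how the numbers above were obtained.

```shell
# Hypothetical helper: $1 = path to a POT (or PO) file.
count_pot_strings() {
    # Count lines that open a msgid entry; obsolete entries ("#~ msgid")
    # and msgid_plural lines do not match this pattern.
    total=$(grep -c '^msgid "' "$1")
    # Subtract one for the header entry (msgid "").
    echo $((total - 1))
}
```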
-- 
Kind regards,
Frans Spiesschaert

