On 27.11.11 23:47, Karl Pflästerer wrote:
> On 27.11.11 23:23, Yannick Torrès wrote:
>> 2011/11/27 Karl Pflästerer<[email protected]>:
>>> Hi,
>>
>> Hi,
>>
>>> forgive me if I ask something which has already been discussed, but I've
>>> seen nothing in the archives.
>>>
>>> I am trying to help translate some of the docs and saw this box at
>>> https://edit.php.net/:
>>>
>>> Check for errors in /language-snippets.ent
>>>
>>> The content for that box seems to get computed from the class
>>> http://svn.php.net/repository/web/doc-editor/trunk/php/ToolsError.php
>>>
>>> There is a method attributLinkTag()
>>>
>>> To compare the linkend attribute of the <link> tags it uses a regex.
>>>
>>> $reg = '/<link\s*?linkend=("|\')(.*?)("|\')\s*?>/s';
>>>
>>> As you can see, only whitespace is allowed between <link and the linkend
>>> attribute. But in the German translation, for example (and also in the
>>> English documentation), some <link> tags have another attribute between the
>>> element name and "linkend".
>>
>> Could you give me an example of this case, please?
>
> From en/language-snippets.ent
>
> <!ENTITY seealso.array.sorting 'The<link
> xmlns="http://docbook.org/ns/docbook" linkend="array.sorting">comparison of
> array sorting functions</link>'>
>
> <!ENTITY seealso.callback 'information about the<link
> xmlns="http://docbook.org/ns/docbook"
> linkend="language.types.callback">callback</link> type'>
>
> The German translation has more examples (some of them IMHO wrong, since
> they duplicate the xmlns attribute), but I'm not sure whether such a simple
> difference should trigger such an error.
>
>>
>>> An easy fix would be
>>> $reg = '/<link[^<>]+linkend=("|\')(.*?)("|\')[^<>]*>/s';
>>>
>>> But that would solve only half of the problem; the other problem is that
>>> the check script needs the same order of entities in both files and
>>> compares only the position of the found links in both match arrays. So
>>> e.g. one extra link in the translation will give false positives for all
>>> following entries.
>>
>> Yes it is.
>> The goal here is to check each file and warn when there is even a single
>> difference, even if it is only an order problem (this can be a translation
>> error too).
>
> OK. (For a file containing only entity definitions, order shouldn't matter,
> should it?)
>
>>
>>> Does it make sense to rewrite that algorithm, so that it compares each
>>> entity in the english original and the translation so we get better
>>> errors?
>>
>> You mean to avoid the order check?
>> Perhaps we can do this, yes: check the number of these tags, and check
>> that all of them are present, even if the order is not respected.
>
> I was thinking of checking each entity definition: rather than doing a
> simple preg_match_all and comparing $match_en[1] to $match_lang[1], compare
> the linkend attributes of each entity definition in en and $lang.
>
> Then the error could be: Difference in linkend attribute in entity xyz.
To be a little bit more concrete, here is a code example (that's just a POC):
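<?php
// Map each entity name to the list of linkend values found in its definition.
function extract_linkend($s) {
    // Match <link> or <xref> tags; other attributes may surround linkend.
    $rx_linkend = '
        /
          <(?: link | xref)
          [^<>]+
          linkend=(?:"|\') (.*?) (?:"|\')
          [^<>]*
          >
        /xs';
    // Capture each entity definition up to the next <!ENTITY or end of input.
    $rx_entities = '/(<!ENTITY\s+(\S+).+?)(?=(?:<!ENTITY|$))/s';

    preg_match_all($rx_entities, $s, $m_entities, PREG_SET_ORDER);
    $linkend_by_entity = array();
    foreach ($m_entities as $entity) {
        preg_match_all($rx_linkend, $entity[1], $m_linkend);
        if ($m_linkend[1]) {
            $linkend_by_entity[$entity[2]] = $m_linkend[1];
        }
    }
    return $linkend_by_entity;
}

$link_de = extract_linkend(file_get_contents('language-snippets.ent'));
$link_en = extract_linkend(file_get_contents('../en/language-snippets.ent'));

// Keep the EN entities that are missing from the translation or whose
// linkend values are not all present there.
$diff = array_udiff_assoc($link_en, $link_de,
    function ($en, $lang) { return array_diff($en, $lang) ? 1 : 0; });

foreach ($diff as $entity => $linkends) {
    echo "Entity: $entity\n";
    echo 'EN: ' . join('; ', $linkends), "\n";
    // The entity may have no links (or be missing) in the translation.
    $de = isset($link_de[$entity]) ? $link_de[$entity] : array();
    echo 'DE: ' . join('; ', $de), "\n\n";
}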

If I run that (with the de translation), I get:

Entity: ini.php.constants
EN: configuration.changes.modes
DE: ini

Entity: mysqli.available.mysqlnd
EN: book.mysqlnd
DE: mysqli.overview.mysqlnd

That could be helpful (IMHO).
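
One caveat with the POC: array_diff() only looks in one direction, so an extra link that exists only in the translation would probably go unnoticed. A possible refinement, in the spirit of the "check the number of these tags, regardless of order" idea, is to compare the linkend lists of one entity as multisets. This is only a sketch; the helper name linkend_multiset_diff() is made up for illustration and is not part of ToolsError.php.

```php
<?php
// Hypothetical helper (not from ToolsError.php): compare two lists of
// linkend values while ignoring order but respecting how often each occurs.
function linkend_multiset_diff(array $en, array $lang) {
    $count_en   = array_count_values($en);
    $count_lang = array_count_values($lang);
    $diff = array();
    // Visit every linkend value that occurs on either side.
    foreach (array_unique(array_merge($en, $lang)) as $id) {
        $n_en   = isset($count_en[$id]) ? $count_en[$id] : 0;
        $n_lang = isset($count_lang[$id]) ? $count_lang[$id] : 0;
        if ($n_en !== $n_lang) {
            $diff[$id] = array('en' => $n_en, 'lang' => $n_lang);
        }
    }
    return $diff;
}

// Same links in a different order: no difference is reported.
var_dump(linkend_multiset_diff(array('a', 'b'), array('b', 'a')) === array());
// A link that exists only in the translation is reported.
print_r(linkend_multiset_diff(array('a'), array('a', 'c')));
```

An empty result would mean that both versions of the entity point to the same targets the same number of times; anything else could be reported per entity, as in the output above.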
KP
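
P.S. To illustrate the regex issue on its own, here is a small check that runs the current pattern and the proposed one against the seealso.array.sorting entity quoted above (whitespace normalized by me). The variable names are just for this sketch.

```php
<?php
// Entity adapted from en/language-snippets.ent as quoted above.
$entity = '<!ENTITY seealso.array.sorting \'The <link '
        . 'xmlns="http://docbook.org/ns/docbook" linkend="array.sorting">'
        . 'comparison of array sorting functions</link>\'>';

// Current pattern: only whitespace may appear between <link and linkend.
$rx_current  = '/<link\s*?linkend=("|\')(.*?)("|\')\s*?>/s';
// Proposed pattern: other attributes may surround linkend.
$rx_proposed = '/<link[^<>]+linkend=("|\')(.*?)("|\')[^<>]*>/s';

$n_current  = preg_match_all($rx_current, $entity, $m_current);
$n_proposed = preg_match_all($rx_proposed, $entity, $m_proposed);

echo "current:  $n_current match(es)\n";
echo "proposed: $n_proposed match(es)\n";
if ($n_proposed) {
    // Group 2 holds the linkend value.
    echo 'linkend:  ', $m_proposed[2][0], "\n";
}
```

With this input the current pattern finds nothing because of the xmlns attribute, while the proposed one finds the single link with linkend "array.sorting".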