On 27.11.11 23:47, Karl Pflästerer wrote:
> On 27.11.11 23:23, Yannick Torrès wrote:
>>  2011/11/27 Karl Pflästerer <k...@rl.pflaesterer.de>:
>>>   Hi,
>>
>>  Hi,
>>
>>>   Forgive me if I'm asking something that has already been discussed, but
>>>   I've seen nothing in the archives.
>>>
>>>   I'm trying to help translate some of the docs and saw this box at
>>>   https://edit.php.net/:
>>>
>>>   Check for errors in /language-snippets.ent
>>>
>>>   The content of that box seems to be computed by the class
>>>   http://svn.php.net/repository/web/doc-editor/trunk/php/ToolsError.php
>>>
>>>   There is a method attributLinkTag()
>>>
>>>   To compare the linkend attribute of the <link> tags it uses a regex.
>>>
>>>   $reg = '/<link\s*?linkend=("|\')(.*?)("|\')\s*?>/s';
>>>
>>>   As you can see, only whitespace is allowed between <link and the linkend
>>>   attribute. But, for example, in the German translation (and also in the
>>>   English documentation) some <link> tags have another attribute between
>>>   the element name and "linkend".
>>
>>  Could you give me an example of this case, please?
> 
> From en/language-snippets.ent:
> 
> <!ENTITY seealso.array.sorting 'The <link
> xmlns="http://docbook.org/ns/docbook" linkend="array.sorting">comparison of
> array sorting functions</link>'>
> 
> <!ENTITY seealso.callback 'information about the <link
> xmlns="http://docbook.org/ns/docbook"
> linkend="language.types.callback">callback</link> type'>
> 
> The German translation has more examples (some of them IMHO wrong, since
> they duplicate the xmlns attribute), but I'm not sure whether such a simple
> difference should trigger such an error.
> 
>>
>>>   An easy fix would be
>>>   $reg = '/<link[^<>]+linkend=("|\')(.*?)("|\')[^<>]*>/s';
>>>
>>>   But that would solve only half of the problem; the other problem is that
>>>   the check script needs the same order of entities in both files, and it
>>>   compares only the position of the found links in both match arrays. So,
>>>   e.g., one extra link in the translation will give false matches for all
>>>   following entries.
>>
>>  Yes, it is.
>>  The goal here is to check each file and warn about even a single
>>  difference, even if it is an order problem (this can be a translation
>>  error too).
> 
> OK. (For a file containing only entity definitions, order shouldn't matter, should it?)
> 
>>
>>>   Does it make sense to rewrite that algorithm so that it compares each
>>>   entity in the English original and the translation, so we get better
>>>   errors?
>>
>>  You mean to avoid the order check?
>>  Perhaps we can do this, yes: check the number of these tags, and check
>>  that all of them are present, even if the order is not respected.
> 
> I was thinking of checking each entity definition: not doing a simple
> preg_match_all and comparing $match_en[1] to $match_lang[1], but comparing
> the linkend attributes of each entity definition in en and $lang.
> 
> Then the error could be: Difference in linkend attribute in entity xyz.
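
As a small illustration of why the purely positional comparison cascades, here
is a minimal sketch. The linkend values are made up for the example (they are
not taken from the real files), and the loop only imitates an index-by-index
comparison of $match_en[1] and $match_lang[1]; it is not the code from
ToolsError.php.

```php
<?php
// Hypothetical linkend lists in file order; the translation has one extra link.
$en   = array('a.one', 'b.two', 'c.three');
$lang = array('a.one', 'x.extra', 'b.two', 'c.three');

// Positional check: compare index by index.
$positional = array();
foreach ($en as $i => $linkend) {
    if (!isset($lang[$i]) || $lang[$i] !== $linkend) {
        $positional[] = $linkend;
    }
}

// Order-independent check: which linkends are missing or extra?
$missing = array_values(array_diff($en, $lang));
$extra   = array_values(array_diff($lang, $en));

echo 'positional mismatches: ' . join('; ', $positional) . "\n";
echo 'missing in translation: ' . join('; ', $missing) . "\n";
echo 'extra in translation: ' . join('; ', $extra) . "\n";
```

The positional check flags b.two and c.three although both are present in the
translation; the order-independent check reports only the one extra link.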

To be a little bit more concrete, here is a code example (that's just a POC):

<?php

function extract_linkend ($s) {

  $rx_linkend = '
    /
    <(?: link | xref)
     [^<>]+
     linkend=(?:"|\') (.*?) (?:"|\')
     [^<>]*
    >
   /xs';

  $rx_entities = '/(<!ENTITY\s+(\S+).+?)(?=(?:<!ENTITY|$))/s';

  preg_match_all($rx_entities, $s, $m_entities, PREG_SET_ORDER);
  $linkend_by_entity = array();
  foreach ($m_entities as $entity) {
    preg_match_all($rx_linkend, $entity[1], $m_linkend);
    // Record only entities that contain at least one linkend.
    if ($m_linkend[1]) {
      $linkend_by_entity[$entity[2]] = $m_linkend[1];
    }
  }
  return $linkend_by_entity;
}


$link_de = extract_linkend(file_get_contents('language-snippets.ent'));
$link_en = extract_linkend(file_get_contents('../en/language-snippets.ent'));

// Keep the en entities whose linkends are not all present in the translation.
$diff = array_udiff_assoc($link_en, $link_de,
    function ($en, $lang) { return array_diff($en, $lang) ? 1 : 0; });

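foreach ($diff as $entity => $linkends) {
  echo "Entity: $entity\n";
  echo 'EN: ' . join('; ', $linkends), "\n";
  // The entity may be missing from the translation, or have no links there.
  $de = isset($link_de[$entity]) ? $link_de[$entity] : array();
  echo 'DE: ' . join('; ', $de), "\n\n";
}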


If I run that (with the de translation), I get:

Entity: ini.php.constants
EN: configuration.changes.modes
DE: ini

Entity: mysqli.available.mysqlnd
EN: book.mysqlnd
DE: mysqli.overview.mysqlnd

That could be helpful (IMHO).
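
For completeness, here is a quick check of the two patterns quoted above
against the seealso.array.sorting entity from en/language-snippets.ent. It is
only a sketch: the variable names are mine, and the entity text is pasted in as
a string instead of being read from the file.

```php
<?php
// Entity text as it appears in en/language-snippets.ent.
$entity = '<!ENTITY seealso.array.sorting \'The <link '
        . 'xmlns="http://docbook.org/ns/docbook" linkend="array.sorting">'
        . 'comparison of array sorting functions</link>\'>';

// Current pattern: only whitespace allowed between "<link" and "linkend".
$rx_old = '/<link\s*?linkend=("|\')(.*?)("|\')\s*?>/s';
// Proposed pattern: other attributes may surround linkend.
$rx_new = '/<link[^<>]+linkend=("|\')(.*?)("|\')[^<>]*>/s';

$found_old = preg_match($rx_old, $entity, $m_old);
$found_new = preg_match($rx_new, $entity, $m_new);

echo "old pattern matched: $found_old\n";
echo "new pattern matched: $found_new\n";
if ($found_new) {
    // Group 2 holds the linkend value in both patterns.
    echo "linkend: {$m_new[2]}\n";
}
```

The old pattern misses the link because of the xmlns attribute; the relaxed one
finds it and captures array.sorting.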

  KP
