Francis Tyers <fty...@prompsit.com> writes:

> On Wed, 05 Jan 2011 at 09:32 +0100, Kevin Brubeck Unhammer
> wrote:
>> Hi,
>> 
>> Is there a bug in 
>> 
>>         <modify-case>
>>           <clip pos="1" side="tl" part="lemh"/>
>>           <lit v="aa"/>
>>         </modify-case>
>> 
>> when the input is all uppercase, or am I using it wrong?
>> 
>> 
>>     wget http://apertium.codepad.org/GdrOe3nL/raw.txt -O problem.t1x
>>     wget http://apertium.codepad.org/wo597sse/raw.txt -O problem.dix
>>     lt-comp lr problem.dix problem.dix.bin
>>     apertium-preprocess-transfer problem.t1x problem.t1x.bin
>>     echo '^GUOKTE<Num>$' | apertium-transfer problem.t1x problem.t1x.bin problem.dix.bin
>> 
>> 
>> gives
>> 
>> 
>> ^det<det><qnt>{^tO<det><qnt>$}$
>> 
>> 
>> whereas I was expecting to see
>> 
>> 
>> ^det<det><qnt>{^to<det><qnt>$}$
>
> I think the code that deals with this is in transfer.cc 
>
> string
> Transfer::copycase(string const &source_word, string const &target_word)
>
> I'm struggling to make heads or tails of that though. In the en-ca
> rules, you find:
>
>               <modify-case>
>                 <clip pos="1" side="tl" part="lem"/>
>                 <lit v="aa"/>
>               </modify-case>
>
> and in the es-ca rules too. So I guess you are calling it right.
>
> It would seem to be a bug of some description.

With s_word == "aa" and t_word == "TO", the flags computed from s_word
are: firstupper is false, uppercase is false, sizeone is false, so

  if(!uppercase || (sizeone && uppercase))
  {
    result = t_word;
    result[0] = towlower(result[0]);
    //result = StringUtils::tolower(t_word);
  }
  else
  {
    result = StringUtils::toupper(t_word);
  }
  
  if(firstupper)
  {
    result[0] = towupper(result[0]);
  }

gives us "tO" (the first test passes, so only result[0] is lowercased).
If we change the body of the first branch to

  if(!uppercase || (sizeone && uppercase))
  {
    result = t_word;
    //result[0] = towlower(result[0]);
    result = StringUtils::tolower(t_word);
  }

we get the expected "to". Does anyone know why we would want to only
lowercase the first character? 
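
To make the two behaviours easy to compare outside of apertium-transfer,
here is a rough standalone sketch. The flag definitions (firstupper,
uppercase, sizeone) are my guess from the values above, not copied from
transfer.cc, and copycase_sketch, lower_all and upper_all are made-up
names standing in for Transfer::copycase and the StringUtils helpers.

```cpp
#include <cassert>
#include <cwctype>
#include <string>

// Stand-ins for StringUtils::tolower / StringUtils::toupper (hypothetical names).
static std::wstring lower_all(std::wstring s) { for (auto &c : s) c = std::towlower(c); return s; }
static std::wstring upper_all(std::wstring s) { for (auto &c : s) c = std::towupper(c); return s; }

// ASSUMPTION: the three flags are reconstructed from the thread, not quoted
// from transfer.cc. lower_whole == false mimics the current branch body
// (lowercase result[0] only); true mimics the proposed one (lowercase all).
std::wstring copycase_sketch(const std::wstring &s_word,
                             const std::wstring &t_word,
                             bool lower_whole)
{
  if (s_word.empty() || t_word.empty()) return t_word;

  bool firstupper = std::iswupper(s_word.front());
  bool uppercase = firstupper && std::iswupper(s_word.back());
  bool sizeone = s_word.size() == 1;

  std::wstring result;
  if (!uppercase || (sizeone && uppercase))
  {
    if (lower_whole)
    {
      result = lower_all(t_word);
    }
    else
    {
      result = t_word;
      result[0] = std::towlower(result[0]);
    }
  }
  else
  {
    result = upper_all(t_word);
  }

  if (firstupper)
  {
    result[0] = std::towupper(result[0]);
  }

  // Case mapping is done character by character, so the length is unchanged.
  assert(result.size() == t_word.size());
  return result;
}
```

Under those assumptions, copycase_sketch(L"aa", L"TO", false) returns
L"tO" and copycase_sketch(L"aa", L"TO", true) returns L"to", matching
what I see and what I expected.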



On a related note, why is sizeone&&uppercase treated as if it were
lowercase? Isn't it safer to simply ignore sizeone words passed to
modify-case? E.g.

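  result = t_word;  // sizeone source word: leave the target's case alone
  if(!sizeone)
  {
    result = uppercase ? StringUtils::toupper(t_word) : StringUtils::tolower(t_word);
    if(firstupper) { result[0] = towupper(result[0]); }
  }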
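
Spelled out as a standalone function, the idea might look like the sketch
below. Again the flag definitions are assumed rather than taken from
transfer.cc, and copycase_ignore_sizeone is a hypothetical name.

```cpp
#include <cassert>
#include <cwctype>
#include <string>

// ASSUMPTION: flag definitions guessed from the thread, not quoted from
// transfer.cc. A one-letter source word leaves the target untouched.
std::wstring copycase_ignore_sizeone(const std::wstring &s_word,
                                     const std::wstring &t_word)
{
  if (s_word.empty() || t_word.empty()) return t_word;

  bool firstupper = std::iswupper(s_word.front());
  bool uppercase = firstupper && std::iswupper(s_word.back());
  bool sizeone = s_word.size() == 1;

  std::wstring result = t_word;  // default: keep the target's own case
  if (!sizeone)
  {
    // Copy the overall case of the source onto every character...
    for (auto &c : result)
    {
      c = uppercase ? std::towupper(c) : std::towlower(c);
    }
    // ...then restore an initial capital if the source had one.
    if (firstupper)
    {
      result[0] = std::towupper(result[0]);
    }
  }

  // Character-by-character mapping never changes the length.
  assert(result.size() == t_word.size());
  return result;
}
```

One difference worth noting, if the assumed flags are right: for a
one-letter uppercase source such as "A" and target "to", the current logic
would give "To", while this variant returns "to" unchanged.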



-Kevin

_______________________________________________
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
