Travis Vitek wrote:
>
> Martin Sebor wrote:
>> 
>> My only requirement is to get those tests to pass in a reasonable
>> amount of time (i.e., without timing out), and without compromising
>> their effectiveness.
>> 
>>  > Do we want to give up on the locale name matching, or do we want to
>>  > include zh_CN in the list of locales to test? What about matching the
>>  > encoding? Should we ignore all of this and just find one locale for
>>  > each value of MB_CUR_MAX from 1 to MB_LEN_MAX and run the test on them?
>> 
>> Maybe. I'll let you propose what makes the most sense to you :)
>> 
>> Martin
>> 
> 
> Well, the AIX I'm testing on has 683 installed locale files. Of those,
> many are links to locales with different names. For example, we have
> 
>     $ locale -a | grep "_CN" | grep -v "\."
>     ZH_CN
>     Zh_CN
>     zh_CN
>     $ ls -l /usr/lib/nls/loc/ZH_CN
>     lrwxrwxrwx   1 bin    bin      bin              28 Feb  8 2008 /usr/lib/nls/loc/ZH_CN -> /usr/lib/nls/loc/ZH_CN.UTF-8
>     $ ls -l /usr/lib/nls/loc/Zh_CN
>     lrwxrwxrwx   1 bin    bin      bin              28 Feb  8 2008 /usr/lib/nls/loc/ZH_CN -> /usr/lib/nls/loc/Zh_CN.GB18030
>     $ ls -l /usr/lib/nls/loc/zh_CN
>     lrwxrwxrwx   1 bin    bin      bin              28 Feb  8 2008 /usr/lib/nls/loc/ZH_CN -> /usr/lib/nls/loc/zh_CN.IBM-eucCN
> 
> The locales that are mapped to [ZH_CN.UTF-8, Zh_CN.GB18030,
> zh_CN.IBM-eucCN] also appear in the locale list, so we have many
> duplicated locales. So, for an immediate reduction in the number of tested
> locales, we could eliminate these duplicates. How to tell if a locale is a
> duplicate? I'm not sure.
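
One way to spot the duplicates mentioned above might be to resolve each
locale file to its real path and keep only the first name that maps to each
target. This is just a minimal sketch: it assumes the aliases are symbolic
links in a single directory (as in the `ls -l` output above) and that POSIX
`realpath()` is available. The names `resolve_path` and `unique_locales` are
made up for illustration and are not part of the test driver.

```cpp
#include <limits.h>   // PATH_MAX
#include <stdlib.h>   // realpath() (POSIX)

#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Resolve symbolic links in `path`. If the path cannot be resolved
// (e.g., it does not exist), return it unchanged.
std::string resolve_path(const std::string& path)
{
    char buf[PATH_MAX];
    if (::realpath(path.c_str(), buf))
        return std::string(buf);
    return path;
}

// Hypothetical helper: keep the first locale name seen for each distinct
// file under `dir`, dropping names that are links to a file already seen.
std::vector<std::string>
unique_locales(const std::vector<std::string>& names, const std::string& dir)
{
    std::set<std::string>    seen;
    std::vector<std::string> result;

    for (std::size_t i = 0; i < names.size(); ++i) {
        const std::string target = resolve_path(dir + "/" + names[i]);
        if (seen.insert(target).second)   // true only for a new target
            result.push_back(names[i]);
    }
    return result;
}
```

Note that this keeps whichever name comes first in the list, which may be
the alias rather than the full name, and it would not help on platforms
where duplicate locales are separate copies rather than links.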
> 
> Another option would be to ignore all locales that don't match the regular
> expression "[a-z][a-z]_[A-Z][A-Z]([EMAIL PROTECTED])?$" or the fnmatch
> expressions "[a-z][a-z]_[A-Z][A-Z]" and "[EMAIL PROTECTED]". The C/POSIX
> locales don't match this, but we can explicitly allow them.
> 
> This alone cuts the number of locales down significantly, though it does
> affect other platforms. Here is a small table showing the total number of
> locales, and the number of locales that match the above regular
> expression.
> 
>          Matching  Total
> AIX           226    603
> Compaq         33     40
> HP-UX         142    160
> Irix           39     60
> Linux         479    582
> Solaris       223    331
> Another option is to build up a list of all installed locales [their names
> and other properties], and then provide a mechanism to search through, or
> iterate over that list. If you want to run a test on all locales that have
> a name matching some expression, you write a function or function object
> to return true on match. You pass that to the rw_locales_match() routine,
> and it gives you the first match. Call again to get the next match or
> null.
> 
>     for (const rw_locale_entry* e = rw_locales_match(0, fun);
>          e; e = rw_locales_match(e, fun))
>     {
>         // run the test against the locale described by *e
>     }
> 
> If you want to select only locales with an mb_cur_max of 4, you either
> write a filter or explicitly iterate over the list. If we really decide
> that it is necessary to write a SQL-like language for selecting locales,
> then that system can be implemented on top of this.
> 
> Travis
> 

Ah, my primitive scheme above isn't quite good enough. It cut the time to
run the 22.locale.ctype.is test from 28m35s to 6m28s with an 11s build on
AIX, but the test would still have timed out at 5 minutes.

Now that I've seen that, it makes me wonder about the other proposal and the
SQL-like query string idea. If we get a locale from the system, we don't
have access to the original data that was in the ASCII source file. We just
get the data as exposed through the C/C++ locale interfaces. This means that
we have to discover information about the locale [like the mb_cur_max value]
at run time, which may take considerable time.
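
For what it's worth, discovering mb_cur_max for one locale might look
something like the sketch below: switch LC_CTYPE to the locale, read
MB_CUR_MAX, and restore the previous setting. The function name
`locale_mb_cur_max` and the in-memory cache are my own assumptions; the
cache only avoids repeating the work within a single process, so the first
pass over several hundred locales would still pay the full cost.

```cpp
#include <clocale>
#include <cstdlib>   // MB_CUR_MAX
#include <map>
#include <string>

// Hypothetical helper: return MB_CUR_MAX for the named locale, or 0 if
// the locale cannot be set. Results are cached per process.
int locale_mb_cur_max(const std::string& name)
{
    static std::map<std::string, int> cache;

    const std::map<std::string, int>::const_iterator it = cache.find(name);
    if (it != cache.end())
        return it->second;

    // Copy the current LC_CTYPE name; the buffer returned by setlocale()
    // may be overwritten by the next call.
    const char* const cur   = std::setlocale(LC_CTYPE, 0);
    const std::string saved = cur ? cur : "C";

    int result = 0;
    if (std::setlocale(LC_CTYPE, name.c_str())) {
        result = int(MB_CUR_MAX);
        std::setlocale(LC_CTYPE, saved.c_str());   // restore
    }

    cache[name] = result;
    return result;
}
```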

Travis

-- 
View this message in context: 
http://www.nabble.com/low-hanging-fruit-while-cleaning-up-test-failures-tp13634803p14821525.html
Sent from the stdcxx-dev mailing list archive at Nabble.com.
