Re: [rfc] lstrcmpi: order still wrong (was "Re: Regression in lstrcmpiA (occurred in late June, NLS related)" from 2003 year)

2009-07-06 Thread Yuriy Kaminskiy
On 04.07.2009 23:55, Yuriy Kaminskiy wrote:
> Yuriy Kaminskiy wrote:
> I'm wrong - I don't have a working Windows installation at hand and cannot
> check that.
Well, no answer so far; I thought "I should write a test, code is more welcome than
just words", and noticed that such a test is already present, but disabled :-E.
That's wrong. If a test reports breakage, it should not simply be silenced and
forgotten for 6 years.
See the "[rfc] [kernel32/tests] enable sort order test" series in *-patches.





Re: [rfc] lstrcmpi: order still wrong (was "Re: Regression in lstrcmpiA (occurred in late June, NLS related)" from 2003 year)

2009-07-04 Thread Yuriy Kaminskiy
Yuriy Kaminskiy wrote:
> I've stumbled over a problem: lstrcmpi sorting is still wrong. Some
> Japanese game engine uses binary search on a presorted array, and fails
> with "object not found"-style errors.
[...]
> proper order should be "_" < "0" (ok) and "." < "_" (fails with vanilla
> wine).
  Well, after a private email, I think I should stress that while I /believe/
that the sort order in WinXP does not depend on locale, it is also /possible/ that
I'm wrong - I don't have a working Windows installation at hand and cannot check
that.
  [Nevertheless, I've run that game in a Japanese locale [ja_JP.UTF-8], and /if/
the sort order in WinXP depends on locale, the sort order in Wine should be fixed to
depend on locale too: still a bug, but a slightly different one ;-)].





[rfc] lstrcmpi: order still wrong (was "Re: Regression in lstrcmpiA (occurred in late June, NLS related)" from 2003 year)

2009-07-03 Thread Yuriy Kaminskiy
Hello!
   Previous thread on this topic:
http://www.mail-archive.com/wine-devel@winehq.org/msg01080.html
   I've stumbled over a problem: lstrcmpi sorting is still wrong. Some
Japanese game engine uses binary search on a presorted array, and fails
with "object not found"-style errors.
   Judging by the object order in the archive,
=== cut ===
...
conf_p.MGD- (would fail with strcasecmp, ok with wine)
conf01.MGD--/
...
title.MGD-- fails with vanilla wine
title_p.MGD--/
...
=== cut ===
proper order should be "_" < "0" (ok) and "." < "_" (fails with vanilla
wine).
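
[For illustration, the ordering constraint above can be sketched with a toy comparator. The weight classes below (punctuation before digits before letters) are an assumption made up for this sketch, not the real Windows or Wine collation weights, and `sketch_weight` / `sketch_cmpi` are hypothetical names.]

```c
#include <ctype.h>

/* Hypothetical weight classes: punctuation < digits < letters.
 * Only the relative order matters; the numbers are invented for this sketch. */
static int sketch_weight(unsigned char c)
{
    if (isalpha(c))
        return 0x300 + (tolower(c) - 'a');   /* letters, case-insensitive */
    if (isdigit(c))
        return 0x200 + (c - '0');            /* digits sort after punctuation */
    return 0x100 + c;                        /* punctuation keeps ASCII order */
}

/* Case-insensitive compare based on sketch_weight(); returns -1, 0 or 1. */
int sketch_cmpi(const char *a, const char *b)
{
    for (; *a && *b; a++, b++) {
        int wa = sketch_weight((unsigned char)*a);
        int wb = sketch_weight((unsigned char)*b);
        if (wa != wb)
            return wa < wb ? -1 : 1;
    }
    if (*a) return 1;    /* a is longer, so it sorts later */
    if (*b) return -1;   /* b is longer, so a sorts earlier */
    return 0;
}
```

With these weights, sketch_cmpi("conf_p.MGD", "conf01.MGD") and sketch_cmpi("title.MGD", "title_p.MGD") are both negative, which is the order the archive listing implies ("." < "_" < "0").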
   I've replaced the collation weight of '_' with 0x02560111, and now these
games run fine; but that's a dirty hack, of course, and should not be
applied upstream: 1) it modifies a generated file; 2) the weight for "_" was
chosen arbitrarily and can cause conflicts somewhere else (or rather, not
can, but certainly will - there are other symbols with weight
0x0256???); 3) the weights for other "_"-like chars should be modified too.
   Hope you can suggest a better solution.
   FWIW, I've checked the unicode-2.1.9d8 tables mentioned in the previous
thread - same mismatch, they will not work either.
   I think the only proper way is to somehow extract this table from Windows
(either directly by LCMapStringW(LCMAP_SORTKEY), or by sorting an array of
a[i]=i; with CompareStringW and using that order). I'm not a lawyer, but I
really doubt that such a reproduced table can be considered copyrightable
anywhere. How can anyone make a compatible reimplementation without
reproducing this table in some way?
-- 





Re: Regression in lstrcmpiA (occurred in late June, NLS related)

2003-10-07 Thread Troy Rollo
On Fri, 3 Oct 2003 21:38, Dmitry Timoshkov wrote:
> I've asked a question regarding unicode support and sorting on
> microsoft.public.win32.programmer.international (26-28 Jun 2003)
> and have the following answers (UCA == Unicode Collation Algorithm):

Based on the lines of inquiry this opened up, the tables would almost 
certainly be within Feist in the US (and similarly probably OK to copy in 
Canada), but would definitely be within "industrious collection" copyright 
protection in Australia, New Zealand and the UK.

Of course if we can identify a unicode.org version that's much closer to the 
Microsoft tables so that only minor adjustments are necessary, the 
industrious collection copyright can be bypassed.

If that proves not to be possible, then the only choice legally is likely to 
be to use the closest version (or amalgam) of the unicode.org tables, but 
provide a facility to allow people in the US to substitute a Windows version 
of the *.nls files (found, for example in c:\winnt\system32 - sortkey.nls, 
for instance, is simply 65536 entries of four bytes in length with the 
expected format).
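
[A minimal sketch of the substitution facility described above, assuming only what the message states: 65536 entries of four bytes each. The little-endian byte order, the function name `load_weight_table` and the idea of passing an already opened stream are assumptions of this sketch; how the four bytes split into weight fields is not decoded here.]

```c
#include <stdint.h>
#include <stdio.h>

#define SORTKEY_ENTRIES 65536

/* Reads SORTKEY_ENTRIES four-byte entries from fp into table, treating each
 * entry as a little-endian 32-bit value (an assumption of this sketch).
 * Returns 0 on success, -1 if the stream ends early. */
int load_weight_table(FILE *fp, uint32_t table[SORTKEY_ENTRIES])
{
    unsigned char raw[4];
    size_t i;

    for (i = 0; i < SORTKEY_ENTRIES; i++) {
        if (fread(raw, 1, sizeof raw, fp) != sizeof raw)
            return -1;
        table[i] = (uint32_t)raw[0]
                 | (uint32_t)raw[1] << 8
                 | (uint32_t)raw[2] << 16
                 | (uint32_t)raw[3] << 24;
    }
    return 0;
}
```

A caller would open the user-supplied file in binary mode, call load_weight_table(), and fall back to the built-in unicode.org-derived table if it returns -1.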




Re: Regression in lstrcmpiA (occurred in late June, NLS related)

2003-10-03 Thread Dmitry Timoshkov
"Troy Rollo" <[EMAIL PROTECTED]> wrote:

> The 2.1.9d8 file seems after a quick look to be closer to the Crossover 
> version of the table - for example, it has many of the different types of 
> space characters sorted near 0020, which is an aspect of the Crossover table 
> not present in the table based on allkeys.txt (3.1.1), so the theory that 
> Microsoft's results are just based on an earlier version of the standard 
> table is starting to look like it has merit.

I've asked a question regarding unicode support and sorting on
microsoft.public.win32.programmer.international (26-28 Jun 2003)
and have the following answers (UCA == Unicode Collation Algorithm):

"Michael (michka) Kaplan [MS]" <[EMAIL PROTECTED]> wrote:

> Collation on Windows does not use the UCA -- it predates the UCA and it
> supports more languages. It is architecturally prepared to handle more
> languages in the future, and frankly no one wanted to cut the functionality
> enough to make it UCA-compatible. :-)

and another one:

> No, it is not. Unicode's weights have been a part of the UCA,  which was
> first a DRAFT Unicode Technical Report in March of 1997. It did not lose its
> DRAFT status until November of 1999 and not a Unicode Technical Standard
> until August of 1999.
> 
> Windows, on the other hand, has had its architecture in place since NT 3.1
> shipped, over a decade ago. How could it be based on the Unicode sort weight
> tables, which did not exist at that time even in draft form?

-- 
Dmitry.





Re: Regression in lstrcmpiA (occurred in late June, NLS related)

2003-10-03 Thread Shachar Shemesh
Troy Rollo wrote:

> The 2.1.9d8 file seems after a quick look to be closer to the Crossover
> version of the table - for example, it has many of the different types of
> space characters sorted near 0020, which is an aspect of the Crossover table
> not present in the table based on allkeys.txt (3.1.1), so the theory that
> Microsoft's results are just based on an earlier version of the standard
> table is starting to look like it has merit.

Logically, it doesn't make sense that they did anything else. After all 
- why would they?

Even if it's not the case, there may be several possible workarounds for 
this issue. I have a lawyer I can consult about this matter, but let's 
rule out the Unicode 2.0 theory first. I have access to the Unicode 2.0 
(printed) book, if that's any help to anyone.

Shachar

--
Shachar Shemesh
Open Source integration consultant
Home page & resume - http://www.shemesh.biz/




Re: Regression in lstrcmpiA (occurred in late June, NLS related)

2003-10-02 Thread Troy Rollo
On Fri, 3 Oct 2003 14:02, Dimitrie O. Paun wrote:
> This doesn't make any sense.

Well when the High Court of Australia considered it they said it was 
unsatisfactory, which is their way of saying "it sucks, but that's the way it 
is."

> It means that we can _never_ have correct
> behaviour, no matter what we do, even if we magically come up with the
> same table. This is insane.

In some cases it amounts to that.  This is why it's important to try to come 
up with some way of expressing the contents of the table without the table, 
or of finding objective rules that can generate the table.

Having compared a few versions of the allkeys database it seems that there 
have been some changes to the ordering of characters between versions, which 
leads me to wonder if Microsoft were just using an earlier version of the 
table. Microsoft's documentation suggests they adhere to version 2.0 of the 
Unicode standard, whereas the allkeys.txt file immediately accessible on the 
unicode.org web site is version 3.1.1.

Here's the versions I can find: 

2.1.9d8 http://www.unicode.org/reports/tr10/basekeys.txt
2.1.9d8 http://www.unicode.org/reports/tr10/compkeys.txt
3.1.1   http://www.unicode.org/reports/tr10/allkeys-3.1.1.txt
3.1.1d3 http://www.unicode.org/reports/tr10/allkeys-3.1.1d3.txt
4.0.0d5 http://www.unicode.org/reports/tr10/allkeys-4.0.0d5.txt

The 2.1.9d8 file seems after a quick look to be closer to the Crossover 
version of the table - for example, it has many of the different types of 
space characters sorted near 0020, which is an aspect of the Crossover table 
not present in the table based on allkeys.txt (3.1.1), so the theory that 
Microsoft's results are just based on an earlier version of the standard 
table is starting to look like it has merit.




Re: Regression in lstrcmpiA (occurred in late June, NLS related)

2003-10-02 Thread Dimitrie O. Paun
On October 2, 2003 07:30 pm, Troy Rollo wrote:
> Yes, this is a problem for copyright. The result still counts as copied, at
> least in Australia, the UK and New Zealand.

This doesn't make any sense. It means that we can _never_ have correct
behaviour, no matter what we do, even if we magically come up with the
same table. This is insane.

-- 
Dimi.




Re: Regression in lstrcmpiA (occurred in late June, NLS related)

2003-10-02 Thread Troy Rollo
On Fri, 3 Oct 2003 08:47, Dimitrie O. Paun wrote:

> I said to run
> the above on Windows and on Wine (which is based on the unicode.org
> tables). Compare the results, and generate the differences. Use that as a
> 'patch' to future unicode.org table updates.

Yes, this is a problem for copyright. The result still counts as copied, at 
least in Australia, the UK and New Zealand. It's arguable in the United 
States that given Microsoft's position you could bring it within Feist, but 
if you're using a mechanism that relies on the contents of the table and will 
necessarily produce the same table, it counts as copying.

Incidentally, going through the differences, is the value for character code 
0x34 correct in the Crossover version? All the other characters in the Basic 
Latin range that have differences are punctuation characters (in fact all the 
Basic Latin range punctuation characters have differences). 0x34, however is 
the digit '4', and it would seem odd that it would differ in ways the other 
digits don't.




Re: Regression in lstrcmpiA (occurred in late June, NLS related)

2003-10-02 Thread Dimitrie O. Paun
On Fri, 3 Oct 2003, Troy Rollo wrote:

> On Fri, 3 Oct 2003 08:21, Dimitrie O. Paun wrote:
> > Why is that? We're talking here about lstrcmpiA() behaviour, why would a
> > test for
> >
> > For all x,y in Unicode:
> > print x,y,lstrcmpiA(x,y)
> >
> > violate the copyright?
> 
> I think the suggestion was that the regression tests be used to fabricate the 
> table and then include the resulting fabricated table in Wine. If so, the 
> result would still be copied, although by an indirect means.

I don't think the result is still copied; if so, then you would never be
able to run tests. But this is not what I suggested anyway. I said to run
the above on Windows and on Wine (which is based on the unicode.org tables).
Compare the results, and generate the differences. Use that as a 'patch'
to future unicode.org table updates.
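
[A minimal sketch of the "compare and patch" idea, under the assumption that each system's behaviour has already been reduced to one 32-bit weight (or rank) per character code. The names `weight_diff`, `diff_weights` and `apply_weight_diffs` are hypothetical, and the sketch says nothing about the copyright question discussed in this thread.]

```c
#include <stddef.h>
#include <stdint.h>

struct weight_diff {
    uint32_t code;      /* character code whose weight differs */
    uint32_t expected;  /* weight observed on the reference system */
};

/* Records where reference differs from base. Writes at most max_out entries
 * to out and returns the total number of differences found. */
size_t diff_weights(const uint32_t *base, const uint32_t *reference, size_t n,
                    struct weight_diff *out, size_t max_out)
{
    size_t i, found = 0;

    for (i = 0; i < n; i++) {
        if (base[i] == reference[i])
            continue;
        if (found < max_out) {
            out[found].code = (uint32_t)i;
            out[found].expected = reference[i];
        }
        found++;
    }
    return found;
}

/* Re-applies recorded differences on top of a freshly imported base table. */
void apply_weight_diffs(uint32_t *base, size_t n,
                        const struct weight_diff *diffs, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++)
        if (diffs[i].code < n)
            base[diffs[i].code] = diffs[i].expected;
}
```

The differences would be generated once and stored; after each new unicode.org import, apply_weight_diffs() would be run over the regenerated table.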

-- 
Dimi.




Re: Regression in lstrcmpiA (occurred in late June, NLS related)

2003-10-02 Thread Troy Rollo
On Fri, 3 Oct 2003 08:21, Dimitrie O. Paun wrote:
> Why is that? We're talking here about lstrcmpiA() behaviour, why would a
> test for
>
> For all x,y in Unicode:
>   print x,y,lstrcmpiA(x,y)
>
> violate the copyright?

I think the suggestion was that the regression tests be used to fabricate the 
table and then include the resulting fabricated table in Wine. If so, the 
result would still be copied, although by an indirect means.




Re: Regression in lstrcmpiA (occurred in late June, NLS related)

2003-10-02 Thread Troy Rollo
On Fri, 3 Oct 2003 06:00, Shachar Shemesh wrote:
> Dmitry Timoshkov wrote:

> >Exactly. I have something like that here, the only difference is that
> >I'm dumping full unicode range 0-0x, not only first 96 characters.
>
> Isn't the full unicode range significantly larger than 0-0x? What
> about aggregates? CJK etc?

The full unicode range (UCS4) is represented by a 32 bit number. Windows uses 
UTF-16 (not UCS2 as the documentation I think suggests), in which characters 
in the range dc00-dfff are used in two word sequences to represent the UCS4 
characters 0x10000 to 0x10ffff. Thus to deal with the full range of 
characters Windows can theoretically represent you'd have to have a table 
with 0x110000-0x400 = 0x10fc00 entries.
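
[A minimal sketch of the two-word sequences mentioned above, following the UTF-16 definition: the first (lead) word comes from d800-dbff and the second (trail) word from dc00-dfff, and together they encode one character in 0x10000 to 0x10ffff. The function names are hypothetical.]

```c
#include <stdint.h>

/* Combines a lead word (d800-dbff) and a trail word (dc00-dfff) into one
 * code point in 0x10000..0x10ffff. Returns 0 if the pair is not valid. */
uint32_t utf16_decode_pair(uint16_t lead, uint16_t trail)
{
    if (lead < 0xd800 || lead > 0xdbff || trail < 0xdc00 || trail > 0xdfff)
        return 0;
    return 0x10000u + (((uint32_t)(lead - 0xd800) << 10)
                       | (uint32_t)(trail - 0xdc00));
}

/* Splits a code point in 0x10000..0x10ffff into a surrogate pair.
 * Returns 0 on success, -1 if cp is outside that range. */
int utf16_encode_pair(uint32_t cp, uint16_t *lead, uint16_t *trail)
{
    if (cp < 0x10000u || cp > 0x10ffffu)
        return -1;
    cp -= 0x10000u;
    *lead  = (uint16_t)(0xd800 + (cp >> 10));    /* top 10 bits */
    *trail = (uint16_t)(0xdc00 + (cp & 0x3ff));  /* low 10 bits */
    return 0;
}
```

A collation table indexed by single 16-bit words therefore sees these characters only as two separate surrogate words, which is why a 65536-entry table cannot weight them individually.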




Re: Regression in lstrcmpiA (occurred in late June, NLS related)

2003-10-02 Thread Dimitrie O. Paun
On Fri, 3 Oct 2003, Troy Rollo wrote:

> This doesn't help avoid the copyright on the table if you in fact reproduce 
> the table.

Why is that? We're talking here about lstrcmpiA() behaviour, why would a
test for

For all x,y in Unicode:
print x,y,lstrcmpiA(x,y)

violate the copyright?

-- 
Dimi.




Re: Regression in lstrcmpiA (occurred in late June, NLS related)

2003-10-02 Thread Troy Rollo
On Thu, 2 Oct 2003 21:49, Jakob Eriksson wrote:
> Wouldn't the clean-room way be to write regression tests that pass on
> Windows?

This doesn't help avoid the copyright on the table if you in fact reproduce 
the table.




Re: Regression in lstrcmpiA (occurred in late June, NLS related)

2003-10-02 Thread Troy Rollo
On Thu, 2 Oct 2003 19:34, Dmitry Timoshkov wrote:
> > Can we perhaps write a tool that dumps those tables on a running MS
> > system as header files that wine can use? Would this be allowable?
>
> I really hope that we could find a solution without doing that.

Indeed - since doing that would compromise redistribution in Australia. There 
is a seminal case in which a table contained in a computer program was held 
to have copyright separately to the computer program itself. Thus to be 
distributable here (at least), the table either needs to be capable of 
generation or computation from established objective rules (which would tend 
to negate copyright), or a method of reproducing the result without the table 
would need to be devised.




Re: Regression in lstrcmpiA (occurred in late June, NLS related)

2003-10-02 Thread Shachar Shemesh
Dmitry Timoshkov wrote:

> "Jeff Smith" <[EMAIL PROTECTED]> wrote:
>
> > You mean something like:
>
> [skipped]
>
> Exactly. I have something like that here, the only difference is that
> I'm dumping full unicode range 0-0x, not only first 96 characters.

Isn't the full unicode range significantly larger than 0-0x? What 
about aggregates? CJK etc?

Shachar

--
Shachar Shemesh
Open Source integration consultant
Home page & resume - http://www.shemesh.biz/




Re: Regression in lstrcmpiA (occurred in late June, NLS related)

2003-10-02 Thread Dimitrie O. Paun
On October 2, 2003 10:19 am, Dmitry Timoshkov wrote:
> That's the approach we have chosen so far.

So, what's the problem with doing something like so:

For all x,y in Unicode
print x,y,lstrcmpi(x,y)

(It will generate maybe close to 30GB of output, but it's OK)

Run this on Windows and Wine, compare the result, and generate
a sort of patch file to apply to the unicode.org tables. For
added points, we can run this on multiple versions of Windows,
and only look at things that are immutable between versions...
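
[A back-of-the-envelope check of the size estimate above. Both inputs are assumptions of this sketch rather than figures from the message: 65536 single-word characters, and about 7 bytes per output line (as in "x y -1" plus a newline).]

```c
#include <stdint.h>

/* Number of ordered (x, y) pairs for n distinct characters. */
uint64_t pair_count(uint64_t n)
{
    return n * n;
}

/* Estimated dump size in bytes for an assumed average line length. */
uint64_t dump_size_bytes(uint64_t n, uint64_t bytes_per_line)
{
    return pair_count(n) * bytes_per_line;
}
```

With n = 65536 there are 4294967296 pairs, and at 7 bytes per line dump_size_bytes() gives 30064771072 bytes, i.e. roughly the 30GB mentioned.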

-- 
Dimi.




Re: Regression in lstrcmpiA (occurred in late June, NLS related)

2003-10-02 Thread Dmitry Timoshkov
"Jeff Smith" <[EMAIL PROTECTED]> wrote:

> You mean something like:

[skipped]

Exactly. I have something like that here, the only difference is that
I'm dumping full unicode range 0-0x, not only first 96 characters.

-- 
Dmitry.





Re: Regression in lstrcmpiA (occurred in late June, NLS related)

2003-10-02 Thread Jeff Smith
--- Dmitry Timoshkov <[EMAIL PROTECTED]> wrote:
> "Jakob Eriksson" <[EMAIL PROTECTED]> wrote:
> 
> > >Dmitry> The source of all of this is the difference between MS and
> > >Dmitry> unicode.org sort weight tables. There is no easy way to make
> > >Dmitry> unicode.org database look like the MS one unfortunately...
> > >
> > >Can we perhaps write a tool that dumps those tables on a running MS system
> > >as header files that wine can use? Would this be allowable?
> > >  
> > >
> > 
> > Wouldn't the clean-room way be to write regression tests that pass on 
> > Windows?
> 
> That's the approach we have chosen so far.
> 
> -- 
> Dmitry.

You mean something like:

===
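#include <stdio.h>
#include <stdlib.h>
#include <windows.h>

char test_strings[96][2];

/* qsort callback: order two one-character strings with lstrcmpi */
int xyz (const void * y, const void * z)
{
    return lstrcmpi(y, z);
}

int main(int argc, char *argv[])
{
    int i;

    /* one string per character in the range 0x20..0x7f */
    for (i=0; i<96; i++)
        sprintf (test_strings[i], "%c", i+0x20);
    qsort (test_strings, 96, 2, xyz);
    /* characters that compare equal are printed on the same line */
    for (i=0; i<96; i++) {
        printf ("  0x%02x '%s'", (unsigned char)test_strings[i][0], test_strings[i]);
        if ((i == 95) || (lstrcmpi(test_strings[i], test_strings[i+1])))
            printf ("\n");
    }

    return 0;
}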
===
[On Windows 2000 Pro]
  0x7f '⌂'
  0x27 '''
  0x2d '-'
  0x20 ' '
  0x21 '!'
  0x22 '"'
  0x23 '#'
  0x24 '$'
  0x25 '%'
  0x26 '&'
  0x28 '('
  0x29 ')'
  0x2a '*'
  0x2c ','
  0x2e '.'
  0x2f '/'
  0x3a ':'
  0x3b ';'
  0x3f '?'
  0x40 '@'
  0x5b '['
  0x5c '\'
  0x5d ']'
  0x5e '^'
  0x5f '_'
  0x60 '`'
  0x7b '{'
  0x7c '|'
  0x7d '}'
  0x7e '~'
  0x2b '+'
  0x3c '<'
  0x3d '='
  0x3e '>'
  0x30 '0'
  0x31 '1'
  0x32 '2'
  0x33 '3'
  0x34 '4'
  0x35 '5'
  0x36 '6'
  0x37 '7'
  0x38 '8'
  0x39 '9'
  0x61 'a'  0x41 'A'
  0x62 'b'  0x42 'B'
  0x43 'C'  0x63 'c'
  0x44 'D'  0x64 'd'
  0x45 'E'  0x65 'e'
  0x66 'f'  0x46 'F'
  0x47 'G'  0x67 'g'
  0x48 'H'  0x68 'h'
  0x69 'i'  0x49 'I'
  0x4a 'J'  0x6a 'j'
  0x6b 'k'  0x4b 'K'
  0x6c 'l'  0x4c 'L'
  0x6d 'm'  0x4d 'M'
  0x6e 'n'  0x4e 'N'
  0x6f 'o'  0x4f 'O'
  0x50 'P'  0x70 'p'
  0x51 'Q'  0x71 'q'
  0x72 'r'  0x52 'R'
  0x53 'S'  0x73 's'
  0x74 't'  0x54 'T'
  0x75 'u'  0x55 'U'
  0x76 'v'  0x56 'V'
  0x77 'w'  0x57 'W'
  0x58 'X'  0x78 'x'
  0x59 'Y'  0x79 'y'
  0x5a 'Z'  0x7a 'z'
===

 -- Jeff Smith






Re: Regression in lstrcmpiA (occurred in late June, NLS related)

2003-10-02 Thread Dmitry Timoshkov
"Jakob Eriksson" <[EMAIL PROTECTED]> wrote:

> >Dmitry> The source of all of this is the difference between MS and
> >Dmitry> unicode.org sort weight tables. There is no easy way to make
> >Dmitry> unicode.org database look like the MS one unfortunately...
> >
> >Can we perhaps write a tool that dumps those tables on a running MS system
> >as header files that wine can use? Would this be allowable?
> >  
> >
> 
> Wouldn't the clean-room way be to write regression tests that pass on 
> Windows?

That's the approach we have chosen so far.

-- 
Dmitry.





Re: Regression in lstrcmpiA (occurred in late June, NLS related)

2003-10-02 Thread Jakob Eriksson
Uwe Bonnes wrote:

> Dmitry> The source of all of this is the difference between MS and
> Dmitry> unicode.org sort weight tables. There is no easy way to make
> Dmitry> unicode.org database look like the MS one unfortunately...
>
> Can we perhaps write a tool that dumps those tables on a running MS system
> as header files that wine can use? Would this be allowable?

Wouldn't the clean-room way be to write regression tests that pass on 
Windows?

regards,
Jakob




Re: Regression in lstrcmpiA (occurred in late June, NLS related)

2003-10-02 Thread Dmitry Timoshkov
"Uwe Bonnes" <[EMAIL PROTECTED]> wrote:

> Dmitry> The source of all of this is the difference between MS and
> Dmitry> unicode.org sort weight tables. There is no easy way to make
> Dmitry> unicode.org database look like the MS one unfortunately...
> 
> Can we perhaps write a tool that dumps those tables on a running MS system
> as header files that wine can use? Would this be allowable?

I really hope that we could find a solution without doing that.

-- 
Dmitry.





Re: Regression in lstrcmpiA (occurred in late June, NLS related)

2003-10-02 Thread Dmitry Timoshkov
"Troy Rollo" <[EMAIL PROTECTED]> wrote:

> Well right now it's not using any table at all - it's just going through to 
> strncmpiW, which is essentially a word-by-word comparison. Presumably the 
> issue now is copyright on the MS version of the table. Do you have anything 
> written down on the differences that you can give me so I can look for 
> work-arounds?

I'm attaching the current diff between CX Office and WineHQ CVS, edited manually
to remove unrelated parts; ignore the fact that in dlls/kernel/tests/locale.c
some parts missing in the CX Office CVS got removed. The diff is provided
solely to demonstrate exactly what fixes were made and for testing;
it's not yet ready for inclusion into WineHQ for the reasons explained
earlier.

Some areas of interest are the CompareString test suite, the changes to the
unicode collation table, and the changes in the CompareString implementation.

P.S.
Sorry, I compressed the diff since only a few of you might be interested
in looking at the really boring details...

-- 
Dmitry.


compare_string.diff.gz
Description: GNU Zip compressed data


Re: Regression in lstrcmpiA (occurred in late June, NLS related)

2003-10-02 Thread Uwe Bonnes
> "Dmitry" == Dmitry Timoshkov <[EMAIL PROTECTED]> writes:

Dmitry> "Troy Rollo" <[EMAIL PROTECTED]> wrote:
>> Yes, but this also means it worked for ASCII-7. Right now it
>> doesn't even work for that. This creates problems for some
>> applications, such as those that incorrectly use lstrcmpA to do
>> binary searches on internal ordered keyword tables where the keywords
>> can include punctuation characters or underscores. It means they fail
>> to find some of their keywords, the result being spurious error
>> results. Since the ASCII-7 range is the same regardless of character
>> set, this wrong use of lstrcmpA happens to work on Windows if all the
>> keywords in such a table are limited to that range.

Dmitry> The source of all of this is the difference between MS and
Dmitry> unicode.org sort weight tables. There is no easy way to make
Dmitry> unicode.org database look like the MS one unfortunately...

Can we perhaps write a tool that dumps those tables on a running MS system
as header files that wine can use? Would this be allowable?

Bye
-- 
Uwe Bonnes[EMAIL PROTECTED]

Institut fuer Kernphysik  Schlossgartenstrasse 9  64289 Darmstadt
- Tel. 06151 162516  Fax. 06151 164321 --



Re: Regression in lstrcmpiA (occurred in late June, NLS related)

2003-10-01 Thread Troy Rollo
On Thu, 2 Oct 2003 12:47, Dmitry Timoshkov wrote:

> The source of all of this is the difference between MS and unicode.org
> sort weight tables. There is no easy way to make unicode.org database
> look like the MS one unfortunately...

Well right now it's not using any table at all - it's just going through to 
strncmpiW, which is essentially a word-by-word comparison. Presumably the 
issue now is copyright on the MS version of the table. Do you have anything 
written down on the differences that you can give me so I can look for 
work-arounds?




Re: Regression in lstrcmpiA (occurred in late June, NLS related)

2003-10-01 Thread Dmitry Timoshkov
"Troy Rollo" <[EMAIL PROTECTED]> wrote:

> Yes, but this also means it worked for ASCII-7. Right now it doesn't even 
> work for that. This creates problems for some applications, such as those 
> that incorrectly use lstrcmpA to do binary searches on internal ordered 
> keyword tables where the keywords can include punctuation characters or 
> underscores. It means they fail to find some of their keywords, the result 
> being spurious error results. Since the ASCII-7 range is the same regardless 
> of character set, this wrong use of lstrcmpA happens to work on Windows if 
> all the keywords in such a table are limited to that range.

The source of all of this is the difference between MS and unicode.org
sort weight tables. There is no easy way to make unicode.org database
look like the MS one unfortunately...

-- 
Dmitry.





Re: Regression in lstrcmpiA (occurred in late June, NLS related)

2003-10-01 Thread Troy Rollo
On Wed, 1 Oct 2003 18:25, Dmitry Timoshkov wrote:
> > The older behaviour was
> > consistent with Win2k.
>
> ... and only with Latin1 locale, failing with others.

Yes, but this also means it worked for ASCII-7. Right now it doesn't even 
work for that. This creates problems for some applications, such as those 
that incorrectly use lstrcmpA to do binary searches on internal ordered 
keyword tables where the keywords can include punctuation characters or 
underscores. It means they fail to find some of their keywords, the result 
being spurious error results. Since the ASCII-7 range is the same regardless 
of character set, this wrong use of lstrcmpA happens to work on Windows if 
all the keywords in such a table are limited to that range.
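
[A minimal sketch of the failure mode described above: a keyword table stored in one collation order is binary-searched with a comparator that orders the underscore differently. Both comparators are stand-ins invented for this sketch (they are not lstrcmpA), and all names are hypothetical.]

```c
#include <stddef.h>

/* Stand-in rank where the underscore sorts before letters. */
static int rank_underscore_low(unsigned char c)  { return c == '_' ? 0 : c; }
/* Stand-in rank where the underscore sorts after letters. */
static int rank_underscore_high(unsigned char c) { return c == '_' ? 0x100 : c; }

static int cmp_ranked(const char *a, const char *b, int (*rank)(unsigned char))
{
    for (; *a && *b; a++, b++) {
        int ra = rank((unsigned char)*a), rb = rank((unsigned char)*b);
        if (ra != rb)
            return ra < rb ? -1 : 1;
    }
    return (*a != 0) - (*b != 0);   /* the shorter string sorts first */
}

int cmp_underscore_low(const char *a, const char *b)
{
    return cmp_ranked(a, b, rank_underscore_low);
}

int cmp_underscore_high(const char *a, const char *b)
{
    return cmp_ranked(a, b, rank_underscore_high);
}

/* Plain binary search; returns the index of key in table, or -1. */
int find_keyword(const char *const *table, int n, const char *key,
                 int (*cmp)(const char *, const char *))
{
    int lo = 0, hi = n - 1;

    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        int r = cmp(key, table[mid]);
        if (r == 0)
            return mid;
        if (r < 0)
            hi = mid - 1;
        else
            lo = mid + 1;
    }
    return -1;
}
```

For a table { "_end", "begin", "for", "while" } stored in underscore-low order, find_keyword() with cmp_underscore_low finds "_end" at index 0, while the same search with cmp_underscore_high returns -1 even though the keyword is present; "for" is still found either way, which matches the "some of their keywords" symptom.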




Re: Regression in lstrcmpiA (occurred in late June, NLS related)

2003-10-01 Thread Dmitry Timoshkov
"Troy Rollo" <[EMAIL PROTECTED]> wrote:

> When lstrcmpiA was moved from ole2nls.c to locale.c, (around 28th June) the 
> results of comparisons in some cases became reversed. For example, the 
> underscore now returns as greater than alphabetic characters, whereas it used 
> to return as less than alphabetic characters.

Yes, I'm aware of the problem. Current CX Office CVS has fixes for
all the differences we have found so far. Unfortunately a proper
fix requires a change to the unicode sort weight tables generated
automatically from the unicode.org database, and we (Alexandre, me,
other people at Codeweavers) don't know yet how to make it fit with
future imports from unicode.org.

The unicode weight tables from MS and unicode.org have a huge number of
differences in many absolutely unexpected places...

> The older behaviour was 
> consistent with Win2k.

... and only with Latin1 locale, failing with others.

-- 
Dmitry.





Re: Regression in lstrcmpiA (occurred in late June, NLS related)

2003-10-01 Thread Troy Rollo
Further investigation reveals another problem in lstrcmpiA: MSDN documents 
this function as executing what it describes as a "word sort", which results 
in the words "co-op" and "coop" sorting to the same place. This is almost a 
correct description of what happens (if the strings come out to be the same 
after the word sort it appears that it does a regular comparison as well). 
The attached files demonstrate the divergence of wine in this regard as well 
as the original regression.
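
[A minimal sketch of a "word sort" comparison as described above: a first pass that ignores hyphens and apostrophes, followed by a tie-break when the strings are otherwise equal. The tie-break rule used here (the string with fewer ignored characters sorts first) is an assumption chosen only because it reproduces the "coop"/"co-op" results in the first listing attached to this message; it is not taken from MSDN. `word_sort_cmpi` is a hypothetical name, not a Windows or Wine function.]

```c
#include <ctype.h>

/* Characters skipped in the first pass (assumption: hyphen and apostrophe). */
static int is_ignorable(unsigned char c)
{
    return c == '-' || c == '\'';
}

/* Advances past ignorable characters, counting how many were skipped. */
static const char *skip_ignorable(const char *s, int *count)
{
    while (is_ignorable((unsigned char)*s)) {
        s++;
        (*count)++;
    }
    return s;
}

/* Case-insensitive word-sort style compare; returns -1, 0 or 1. */
int word_sort_cmpi(const char *a, const char *b)
{
    int ign_a = 0, ign_b = 0;

    for (;;) {
        a = skip_ignorable(a, &ign_a);
        b = skip_ignorable(b, &ign_b);
        if (!*a || !*b)
            break;
        int ca = tolower((unsigned char)*a);
        int cb = tolower((unsigned char)*b);
        if (ca != cb)
            return ca < cb ? -1 : 1;
        a++;
        b++;
    }
    if (*a) return 1;    /* a has significant characters left */
    if (*b) return -1;   /* b has significant characters left */

    /* Equal after ignoring hyphens/apostrophes: assumed tie-break. */
    if (ign_a != ign_b)
        return ign_a < ign_b ? -1 : 1;
    return 0;
}
```

Under this sketch the six test strings sort as coop, co-op, coop a, co-op a, coop b, co-op b, so hyphenated and unhyphenated forms stay next to each other instead of being split into two groups.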


#include <stdio.h>
#include <windows.h>

char *test_strings1[] =
{
	"_",
	"A",
	"a",
	"z",
	"Z",
	0
};

char *test_strings2[] =
{
	"coop",
	"co-op",
	"co-op a",
	"coop a",
	"co-op b",
	"coop b",
	0
};


void
test_string(char *pch, char **test_strings)
{
	char **ppch = test_strings;

	while (*ppch)
	{
		printf("%s\t%s\t%d\n", pch, *ppch, lstrcmpiA(pch, *ppch));
		++ppch;
	}
}

void
do_test(char **test_strings)
{
	char **ppch = test_strings;

	while (*ppch)
		test_string(*ppch++, test_strings);
}

int
main(int argc, char **argv)
{
	do_test(test_strings1);
	do_test(test_strings2);

	return 0;
}

_   _   0
_   A   -1
_   a   -1
_   z   -1
_   Z   -1
A   _   1
A   A   0
A   a   0
A   z   -1
A   Z   -1
a   _   1
a   A   0
a   a   0
a   z   -1
a   Z   -1
z   _   1
z   A   1
z   a   1
z   z   0
z   Z   0
Z   _   1
Z   A   1
Z   a   1
Z   z   0
Z   Z   0
coop    coop    0
coop    co-op   -1
coop    co-op a -1
coop    coop a  -1
coop    co-op b -1
coop    coop b  -1
co-op   coop    1
co-op   co-op   0
co-op   co-op a -1
co-op   coop a  -1
co-op   co-op b -1
co-op   coop b  -1
co-op a coop    1
co-op a co-op   1
co-op a co-op a 0
co-op a coop a  1
co-op a co-op b -1
co-op a coop b  -1
coop a  coop    1
coop a  co-op   1
coop a  co-op a -1
coop a  coop a  0
coop a  co-op b -1
coop a  coop b  -1
co-op b coop    1
co-op b co-op   1
co-op b co-op a 1
co-op b coop a  1
co-op b co-op b 0
co-op b coop b  1
coop b  coop    1
coop b  co-op   1
coop b  co-op a 1
coop b  coop a  1
coop b  co-op b -1
coop b  coop b  0

_   _   0
_   A   1
_   a   1
_   z   1
_   Z   1
A   _   -1
A   A   0
A   a   0
A   z   -1
A   Z   -1
a   _   -1
a   A   0
a   a   0
a   z   -1
a   Z   -1
z   _   -1
z   A   1
z   a   1
z   z   0
z   Z   0
Z   _   -1
Z   A   1
Z   a   1
Z   z   0
Z   Z   0
coop    coop    0
coop    co-op   1
coop    co-op a 1
coop    coop a  -1
coop    co-op b 1
coop    coop b  -1
co-op   coop    -1
co-op   co-op   0
co-op   co-op a -1
co-op   coop a  -1
co-op   co-op b -1
co-op   coop b  -1
co-op a coop    -1
co-op a co-op   1
co-op a co-op a 0
co-op a coop a  -1
co-op a co-op b -1
co-op a coop b  -1
coop a  coop    1
coop a  co-op   1
coop a  co-op a 1
coop a  coop a  0
coop a  co-op b 1
coop a  coop b  -1
co-op b coop    -1
co-op b co-op   1
co-op b co-op a 1
co-op b coop a  -1
co-op b co-op b 0
co-op b coop b  -1
coop b  coop    1
coop b  co-op   1
coop b  co-op a 1
coop b  coop a  1
coop b  co-op b 1
coop b  coop b  0


Regression in lstrcmpiA (occurred in late June, NLS related)

2003-10-01 Thread Troy Rollo
When lstrcmpiA was moved from ole2nls.c to locale.c, (around 28th June) the 
results of comparisons in some cases became reversed. For example, the 
underscore now returns as greater than alphabetic characters, whereas it used 
to return as less than alphabetic characters. The older behaviour was 
consistent with Win2k.

The output below is from the following source:

---begin test program---
#include <stdio.h>
#include <windows.h>

char *test_strings[] =
{
	"_",
	"A",
	"a",
	"z",
	"Z",
	0
};


/* Compare pch against every entry of test_strings and print each result. */
void
test_string(char *pch)
{
	char **ppch = test_strings;

	while (*ppch)
	{
		printf("%s\t%s\t%d\n", pch, *ppch, lstrcmpiA(pch, *ppch));
		++ppch;
	}
}

int
main(int argc, char **argv)
{
	char **ppch = test_strings;

	while (*ppch)
		test_string(*ppch++);
	return 0;
}
---end test program---

---Wine output from immediately before the change---
_   _   0
_   A   -1
_   a   -1
_   z   -1
_   Z   -1
A   _   1
A   A   0
A   a   0
A   z   -1
A   Z   -1
a   _   1
a   A   0
a   a   0
a   z   -1
a   Z   -1
z   _   1
z   A   1
z   a   1
z   z   0
z   Z   0
Z   _   1
Z   A   1
Z   a   1
Z   z   0
Z   Z   0
---End---

---Wine output from immediately after the change---
_   _   0
_   A   1
_   a   1
_   z   1
_   Z   1
A   _   -1
A   A   0
A   a   0
A   z   -1
A   Z   -1
a   _   -1
a   A   0
a   a   0
a   z   -1
a   Z   -1
z   _   -1
z   A   1
z   a   1
z   z   0
z   Z   0
Z   _   -1
Z   A   1
Z   a   1
Z   z   0
Z   Z   0
---End---