On Tue, 3 May 2022 21:42:35 GMT, Phil Race <p...@openjdk.org> wrote:

>> Toshio Nakamura has updated the pull request incrementally with one 
>> additional commit since the last revision:
>> 
>>   Moved the fix to WFontConfiguration
>
> It looks to me as if, when we specify a Latin font as the text component font, some 
> Windows fallback behaviour insists
> on a minimum size for the Japanese fallback font.
> And the way to avoid that is to specify a locale (Japanese) font instead 
> which is what used to happen.
> 
> -------
> Naoto suggested :
> -sequence.allfonts.UTF-8.ja=alphabetic,japanese,dingbats,symbol
> +sequence.allfonts.UTF-8.ja=japanese,alphabetic,dingbats,symbol
> 
> This didn't work for me because it isn't picking up that line anyway.
> 
> So what I see is that MS Mincho isn't even in the list of names it is 
> considering !
> Because we are finding this :-
> sequence.allfonts=alphabetic/default,dingbats,symbol
> 
> I see Toshio says he saw the UTF-8 entry being used, but I don't see that.
> So I need to understand why not the UTF-8 entry - note that I have set my 
> system locale to JA now.
> The consequence of this is that the fallback sequence is what provides 
> Japanese and
> so it is from the Chinese MingLiu-ExtB font which I do have installed.
> 
> 
> Toshio is right that what matters here for the native text component is what 
> is picked up in
> the logic around WFontConfiguration.getTextComponentFontName()
> 
> The helper method for getTextComponentFontName() is findFontWithCharset()
> 
> That has a bit of a questionable behaviour in that it returns the *last* font 
> in the
> list that matches the charset it wants.
> So *hypothetically*, if the charset we want is DEFAULT_CHARSET and we had
>   MS Mincho,DEFAULT_CHARSET
>   Times New Roman,DEFAULT_CHARSET
> or if we had
>   Times New Roman,DEFAULT_CHARSET
>   MS Mincho,SHIFTJIS_CHARSET
> 
> then in both cases we'd get Times and still have the problem.
> The latter case seems to actually happen - and so even though the font is 
> there, we ignore it.
> Clearly what we want is the "locale" font, and we are using encoding to 
> identify any one
> that matches, but this breaks down in UTF-8.
> Toshio pointed out that code in WFontConfiguration initTables() basically says
> if we found a font tagged as "japanese" then its subsetCharMap entry is 
> SHIFTJIS_CHARSET
> and this used to work because this also mapped windows-31j to 
> SHIFTJIS_CHARSET.
> But what do you map UTF-8 to ? The current code maps it to DEFAULT_CHARSET.
> There needs to be a different way of doing this for UTF-8 locales.
> 
> So this fix is a "band aid" on the problem that in the UTF-8 locale we don't 
> seem to be picking
> up the entry we should. 
> If Toshio confirms for SURE he's seeing the UTF-8 one picked up it would be a 
> useful data point.
> I still need to debug why I am not getting it.
> 
> UPDATE: pilot error on my part - I set lang to jp .. not ja .. 
> 
> So back to just the encoding case .. 
> 
> Regarding what Toshio pointed out that we can't have both
> sequence.allfonts.UTF-8.zh.CN=alphabetic,chinese-ms936,dingbats,symbol,chinese-ms936-extb
> sequence.allfonts.UTF-8.zh.CN=alphabetic,chinese-gb18030,dingbats,symbol,chinese-gb18030-extb
> I think that's just a fact. Once you choose UTF-8 you have to decide which of 
> these you want.

> Hi @prrace
> 
> Yes, my system also picked up "UTF-8.ja" line. "ja" can be specified by 
> locale data, such as "-Duser.language=ja".
> 

Yes, so did I after I fixed my typo.

> However, I was not able to recreate the wrong size issue on a system whose 
> primary language was changed from English to Japanese. There may be 
> differences between a native Japanese Windows and an English Windows whose 
> primary language was changed to Japanese.

I definitely can reproduce this on my "English" Windows by changing the system 
locale.


> 
> Unicode (UTF-8) is language independent. So, we need to use locale data. I 
> created a trial patch to use locale data. If you prefer this way, 

If you mean those changes you have in this latest diff then I was thinking of 
something like that .. although not exactly.

> I'll also adjust fontconfig file and test some environments I can prepare.

I'm not sure what fontconfig changes you are thinking of.

> 
> > sequence.allfonts=alphabetic/default,dingbats,symbol
> 
> "alphabetic/default" assigned to "DEFAULT_CHARSET", but it's only used on 
> this line.
> 
> > sequence.allfonts.UTF-8.ja=alphabetic,japanese,dingbats,symbol
> 
> "alphabetic" assigned to "ANSI_CHARSET". So, if we had "DEFAULT_CHARSET", 
> nothing was matched. Then, the first one was used. (WFontConfiguration.java, 
> l.165)

Yes, I think we have issues there I'd like to look closely at.
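
To make the matching behaviour discussed above concrete, here is a minimal
sketch of "last match wins, else fall back to the first entry". It is not the
WFontConfiguration code: the class name, the method names and the use of plain
"name,CHARSET" strings are all assumptions made for illustration only.

```java
import java.util.List;

// Hypothetical sketch, not the JDK implementation.
final class LastMatchSketch {

    // Returns the LAST native name ending with the wanted charset, or null.
    // Later matches overwrite earlier ones, which is the questionable part.
    static String findLastWithCharset(List<String> nativeNames, String charset) {
        String found = null;
        for (String name : nativeNames) {
            if (name.endsWith(charset)) {
                found = name;
            }
        }
        return found;
    }

    // If nothing matches the wanted charset, fall back to the first entry.
    static String pickTextComponentFont(List<String> nativeNames, String charset) {
        String found = findLastWithCharset(nativeNames, charset);
        return (found != null) ? found : nativeNames.get(0);
    }

    public static void main(String[] args) {
        // Both fonts tagged DEFAULT_CHARSET: the last one (Times) wins.
        System.out.println(pickTextComponentFont(
            List.of("MS Mincho,DEFAULT_CHARSET", "Times New Roman,DEFAULT_CHARSET"),
            "DEFAULT_CHARSET"));
        // Japanese font tagged SHIFTJIS_CHARSET: it is skipped, Times matches.
        System.out.println(pickTextComponentFont(
            List.of("Times New Roman,DEFAULT_CHARSET", "MS Mincho,SHIFTJIS_CHARSET"),
            "DEFAULT_CHARSET"));
        // Nothing tagged DEFAULT_CHARSET: the first entry is used.
        System.out.println(pickTextComponentFont(
            List.of("Times New Roman,ANSI_CHARSET", "MS Mincho,SHIFTJIS_CHARSET"),
            "DEFAULT_CHARSET"));
    }
}
```

In all three cases the sketch selects the Times entry when asked for
DEFAULT_CHARSET, so the Japanese font is only chosen if the wanted charset is
SHIFTJIS_CHARSET.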

Really, I am at the point where I'd like to say "thank you for drawing this to 
our attention, but I'd prefer to do this fix myself"

I can foresee a bunch of follow-up work that we might want to do over time.
This change to UTF-8 seems to be very impactful on this code.

-------------

PR: https://git.openjdk.java.net/jdk/pull/8329
