On Aug 16, 2010, at 9:49:30 PM, Philippe Marschall wrote:

> Hi
> 
> I decided to write ISO-8859-15 and CP-1252 support [1] (mostly for
> selfish reasons so that Seaside on Pharo would support ISO-8859-15 and
> CP-1252).


More converters are always nice :D
Their code seems ok to me.
> 
> A couple of notes:
> - the five unmapped bytes of CP-1252 (not ISO-8859-15, the comment is
> wrong) are mapped to the Unicode replacement character (U+FFFD)
> - a new Latin9 language environment is introduced
> - some minor clean up like removing unused class variables
> 
> I'd appreciate it if somebody knowledgeable in these areas could review
> the changes. I'm especially unsure about the Latin9 language
> environment, but reusing Latin1 or Unicode seemed wrong.

I'm not sure it's too wrong. According to the EncodedCharSet class comment:
"The other confusion comes from the name of "Latin1" class.  It used to mean 
the Latin-1 (ISO-8859-1) character set, but now it primarily means that the 
"Western European languages that are covered by the characters in Latin-1 
character set."
I'd reckon the same holds true for Latin1Environment (Western),
Latin2Environment (Eastern), and Latin7Environment (Greek). I don't think
CP1252/8859-15 warrant environments of their own, as they are basically
alternative encodings to Latin-1 for Western languages.
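
To make the "alternative encodings" point concrete, here is a small sketch in
Python (not Pharo; it relies only on Python's built-in codecs, and the variable
names are made up for illustration). It lists the bytes where ISO-8859-15
decodes differently from ISO-8859-1, and the bytes that CP-1252 leaves
unmapped, which a lenient decoder replaces with U+FFFD:

```python
# Bytes where ISO-8859-15 (Latin-9) decodes differently from ISO-8859-1 (Latin-1).
# Maps byte value -> (Latin-1 character, Latin-9 character).
latin9_diffs = {
    b: (bytes([b]).decode("iso8859_1"), bytes([b]).decode("iso8859_15"))
    for b in range(256)
    if bytes([b]).decode("iso8859_1") != bytes([b]).decode("iso8859_15")
}

# Bytes that Python's cp1252 codec treats as undefined; with errors="replace"
# they decode to the Unicode replacement character U+FFFD.
cp1252_unmapped = [
    b for b in range(256)
    if bytes([b]).decode("cp1252", errors="replace") == "\ufffd"
]

for b, (l1, l9) in sorted(latin9_diffs.items()):
    print(f"0x{b:02X}: Latin-1 {l1!r} vs Latin-9 {l9!r}")
print("CP-1252 unmapped:", [f"0x{b:02X}" for b in cp1252_unmapped])
```

With Python's codecs this reports eight differing bytes for Latin-9 (including
0xA4, which becomes the euro sign) and five unmapped bytes for CP-1252, so the
three encodings cover largely the same Western repertoire.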

Also: 
- leadingChar is used in StrikeFontSet to choose different glyph sets. This
allows for StrikeFonts supporting more than the default Latin-1 glyphs; it
seems to me it would be "wrong" to use the same one for two different encodings.
I'm not sure why this approach was taken rather than allowing additional strike
font sets based on Unicode code point ranges, then using leadingChar only to
differentiate when the visual glyphs for those code points would be different.
I suspect it may have been developed to deal with Han unification first, then
reused to support multiple character sets later.

- LanguageEnvironment seems to have been used in conjunction with translation
(note the entire old translation system was removed in Pharo and replaced by an
external package), maybe to decide which encoding externally stored translation
files should be read in as.
Given that, having environments with overlapping supportedLanguages seems
somewhat weird as well.
Modifying defaultEncodingName/systemConverterClass of Latin1Environment to use
CP1252 for some Windows systems (as per Latin2) may be another approach. It may
or may not lead to unintended consequences elsewhere, though; I did not
investigate all uses.

IMHO, speaking as someone who wasn't involved in its development, the whole
multilingual package could use some cleaning, more class comments, and a
clearer statement of responsibilities.

Cheers,
Henry

TL;DR:
More converters: yay! 
More LanguageEnvironments: o_O, not sure
