Thanks to you all, and I love the bug entry.
Hi Nicolas, I understand your point, but we are not sprinting blindly.
We are the ones who pushed SUnit tests into Squeak back then, and into Pharo
as well.
We are pushing for Quality Rules and a lot more.
Now we are doing a lot of things, because the world is moving
and we have a dream to make come true :)
And if you would commit to Pharo, that would help us too, because we have a
long list of great changes:
- Epicea
- Bootstrap
- Bloc
- Xtreams
-
Stef
On 4/3/16 11:28, Max Leske wrote:
https://pharo.fogbugz.com/f/cases/17751/Remove-TextConverter
On 03 Mar 2016, at 09:32, Max Leske <[email protected]> wrote:
Thank you everyone for the explanations.
How would you feel about opening an issue for Pharo 6 to remove
TextConverter?
On 03 Mar 2016, at 01:04, Nicolas Cellier
<[email protected]> wrote:
In other words, Max: if you correctly invoke
MacRomanTextConverter initializeLatin1MapAndEncodings,
then the old converter is restored to good health.
Now that you know that, you can remove the old converters and keep
the modernized Zn ones ;)
That does indeed fix the problem.
2016-03-03 0:56 GMT+01:00 Nicolas Cellier
<[email protected]>:
2016-03-02 23:16 GMT+01:00 Henrik Sperre Johansen
<[email protected]>:
I'm not sure I'd say Squeak's (5.0 at least) MacRoman conversion
is free of bugs either; at least the "legacy" ByteTextConverter
subclass in Pharo passes the following:
"U+0152, Latin capital ligature OE is codepoint 16rCE in
mac-roman"
((Character value: 16r0152) asString convertToEncoding:
'mac-roman') first
charCode = 16rCE.
"Codepoint 170 in MacRoman is TM sign, U+2122"
((Character value: 170) asString convertFromEncoding:
'mac-roman') first
charCode = 16r2122.
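As a cross-check (not part of the original thread), the same two MacRoman mappings can be verified with Python's built-in 'mac_roman' codec:

```python
# Independent check of the two MacRoman mappings above, using
# Python's built-in 'mac_roman' codec as a reference table.

# U+0152, Latin capital ligature OE, encodes to byte 0xCE (16rCE) in MacRoman.
assert '\u0152'.encode('mac_roman') == b'\xce'

# Byte 170 (0xAA) in MacRoman decodes to the trademark sign, U+2122.
assert bytes([170]).decode('mac_roman') == '\u2122'

print('both MacRoman mappings confirmed')
```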
Cheers,
Henry
Yes, you're right. It's because the Squeak tables did, and still do, use
CP1252 instead of ISO 8859-1, and thus do not match Unicode. That
might have made sense when porting from Mac to Windows while
keeping ByteString, but at least since the switch to Unicode
that's bogus. I guess it's still here because some in-image
fonts would support CP1252, but I am too tired to check it now...
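The CP1252 vs. ISO 8859-1 mismatch is confined to the byte range 0x80-0x9F; a short illustration (again a cross-check with Python's codecs, not code from the thread):

```python
# Where CP1252 and ISO 8859-1 (Latin-1) disagree: bytes 0x80-0x9F.
# Latin-1 maps these bytes to C1 control characters, while CP1252
# reassigns most of them to printable characters, so a conversion
# table built from CP1252 cannot match Unicode's Latin-1 block.
for byte in range(0x80, 0xA0):
    latin1 = bytes([byte]).decode('latin-1')                     # always U+0080..U+009F
    cp1252 = bytes([byte]).decode('cp1252', errors='replace')    # 5 bytes are undefined
    if latin1 != cp1252:
        print(f'0x{byte:02X}: latin-1 -> U+{ord(latin1):04X}, '
              f'cp1252 -> U+{ord(cp1252):04X}')
```

For instance, byte 0x92 is the control character U+0092 in Latin-1 but the right single quotation mark U+2019 in CP1252.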
My mistake: the mapping of Unicode character 216 -> MacRoman was
already wrong in Pharo 1.1.
It was wrong because Pharo picked a bogus table manually
crafted from internet pages (from the Sophie project?).
Then Sven did correct the table by automagically decoding the URL...
But this didn't correct anything, because
initializeLatin1MapAndEncodings was never invoked (it was
already missing in Pharo 1.1).
Unfortunately, those maps are a speed-up cache and will mask the
correction of the table if not updated.
In Squeak, initializeLatin1MapAndEncodings was called from class-side
initialization right from the beginning, but this was
forgotten during the port to Pharo; it would be interesting to
know why...
Ah yes, lazy initialization made it work without the need for
class initialization, but that was a one-shot gun, not robust to
further table changes; that's the drawback of being lazy.
So, most probably, the code was too complex, and that is enough to
explain the mistakes.
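The failure mode described here can be sketched in a few lines (Python, with hypothetical names standing in for the Smalltalk ones): a lazily built lookup cache keeps serving stale data after the underlying table is corrected, unless something explicitly rebuilds it.

```python
# Hypothetical sketch of the lazy-initialization trap described above.
# 'table' stands in for the encoding table; '_cache' for the derived
# speed-up map that initializeLatin1MapAndEncodings would rebuild.

class Converter:
    def __init__(self, table):
        self.table = table      # the authoritative (but maybe buggy) table
        self._cache = None      # derived map, built lazily on first use

    def convert(self, codepoint):
        if self._cache is None:             # lazy init: fires exactly once
            self._cache = dict(self.table)
        return self._cache[codepoint]

    def rebuild_cache(self):
        # the step that was forgotten in the port: without it, fixing
        # self.table changes nothing observable
        self._cache = dict(self.table)

c = Converter({0x0152: 0xCE})
c.convert(0x0152)                   # cache is built from the current table
c.table[0x0152] = 0xFF              # "correct" the table afterwards...
assert c.convert(0x0152) == 0xCE    # ...but the stale cache masks the fix
c.rebuild_cache()
assert c.convert(0x0152) == 0xFF    # only an explicit rebuild picks it up
```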
Why was it too complex?
Because it was an optimization for speed (fast scanning of bytes
NOT NEEDING ANY conversion).
And the initialization was too convoluted, because it reused the
convoluted multilingual API.
My feeling is that it's an effect of the "least possible change
that could possibly extend functionality".
For me, it's never enough to say "the old converters were broken".
There's always something to learn from a mistake, and that's why I'm asking.
My feeling is that the Pharo guys always sprint and never look back.
This is at the risk of repeating some mistake...
--
View this message in context:
http://forum.world.st/TextConverter-is-broken-tp4882039p4882095.html
Sent from the Pharo Smalltalk Developers mailing list
archive at Nabble.com.