Hi,
While working on
https://pharo.fogbugz.com/f/cases/21858/Cleanup-remaining-DeprecatedFileSystem-users
(where we need more help!), I found String>>#zipped to be one of the users of
the deprecated RWBinaryOrTextStream. Although this usage is easy enough to fix,
I think the current #zipped / #unzipped on String is broken.
(note also that there are no real users of these methods)
Right now it seems cool that the following is an identity.
'foobar' zipped unzipped.
However, the result of zipping something is actually something binary (a
collection of opaque bytes). Thinking about it, the input is actually also bytes,
not unencoded characters.
Of course, the current methods are broken, as can be seen from a more complex
(wide) string.
'élèves Françaises à 10 €' zipped unzipped. >>> <something very weird>
The error results from some implicit/wrong character encoding being used.
The right way to do this is to explicitly encode/decode the string.
  (GZipReadStream on: (ByteArray streamContents: [ :out |
    (GZipWriteStream on: out)
      nextPutAll: 'élèves Françaises à 10 €' utf8Encoded;
      close ])) upToEnd utf8Decoded.
From this it would follow that #zipped / #unzipped would make more sense on
ByteArray, so that the above identity would become:
'élèves Françaises à 10 €' utf8Encoded zipped unzipped utf8Decoded.
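As a rough, untested sketch of what that could look like: the two methods below are only the proposal, they do not exist on ByteArray today. They reuse nothing beyond the stream calls from the expression above, and they assume that GZipReadStream answers a ByteArray from #upToEnd when reading from bytes.

```smalltalk
"Proposed ByteArray>>#zipped (hypothetical, not in the image today)"
zipped
	"Answer a ByteArray holding the gzip-compressed form of the receiver."
	^ ByteArray streamContents: [ :out |
		(GZipWriteStream on: out)
			nextPutAll: self;
			close ]

"Proposed ByteArray>>#unzipped (hypothetical, not in the image today)"
unzipped
	"Answer the decompressed bytes of the receiver (assumed to be a ByteArray)."
	^ (GZipReadStream on: self) upToEnd
```

With these in place the caller has to choose the character encoding explicitly, which is the whole point.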
This change of signature would be comparable to what we recently did with
#base64Encoded / #base64Decoded.
What do you think?
Sven