On Tue, 9 Feb 2010, Henrik Johansen wrote:

>
>
> On 09.02.2010 06:14, Levente Uzonyi wrote:
>> On Mon, 8 Feb 2010, Henrik Sperre Johansen wrote:
>>
>>
>>> I believe it's due to stream positioning when crossing buffer boundaries in
>>> basicChunk, I have to debug a bit further for a solution though, sorry...
>>>
>> It seems to be an easy one, though I didn't try the fix, just reviewed
>> the code. So the cause of the issue is in MultiByteFileStream >>
>> #basicChunk, which doesn't care about readLimit. When readLimit is less
>> than "collection size", the end of the returned chunk may be the end of a
>> previous chunk.
>>
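To illustrate the readLimit problem described above, here is a minimal Python sketch (not the Smalltalk code from MultiByteFileStream). It assumes a reused read buffer whose valid region ends at a hypothetical `read_limit` index; the function names and buffer contents are invented for illustration.

```python
def find_terminator_ignoring_limit(buffer, start):
    """Hypothetical buggy scan: searches the whole buffer for '!'."""
    return buffer.find(b"!", start)


def find_terminator_within_limit(buffer, start, read_limit):
    """Scan only the valid region buffer[start:read_limit]."""
    return buffer.find(b"!", start, read_limit)


# A 16-byte buffer that still holds the tail of a previous chunk.
buffer = bytearray(b"old chunk tail!.")

# A short read overwrites only the first bytes; the rest is stale.
fresh = b"new da"
buffer[:len(fresh)] = fresh
read_limit = len(fresh)  # only buffer[0:read_limit] is valid data

stale_hit = find_terminator_ignoring_limit(buffer, 0)
valid_hit = find_terminator_within_limit(buffer, 0, read_limit)

print(stale_hit)  # index of a '!' that belongs to the previous chunk
print(valid_hit)  # -1: no terminator in the valid region yet
```

The unbounded scan finds a terminator beyond `read_limit`, so the returned chunk would end with leftovers from an earlier read. Bounding the search by the limit avoids that.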
> Thanks Levente, in my original-author-blindness, I couldn't see past
> that position was apparently set wrong compared to the original
> approach. When you say it like that, it's obvious :)
>> This method has at least two other flaws (these probably won't hurt
>> anyone in the near future, though both can be avoided):
>> 1. if read buffering is disabled it will raise an error
>>
> Yes, this should be fixed. I assume a collection ifNil: [^nil] is enough?

Yes.

>> 2. it assumes that the encoding of the stream is ascii compatible
>>
>>
> Yes, that was a tradeoff. (Which I hoped was a clear implication from
> the method comment)
> The only Converter with this characteristic currently existing (afaict)
> is QuotedPrintable; I doubt that or another format will ever be used for
> storage of code.

This method fails if the encoding is utf16 (which is unlikely), since ! is 
encoded as 00 21 (in hex) so !! will be 00 21 00 21. #basicChunk will 
recognize these as two chunk endings instead of an escaped chunk 
terminator.
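The utf16 case can be checked with a small Python sketch. This is an illustration only, not the #basicChunk implementation: `count_terminators_bytewise` is a hypothetical scanner that, like any ascii-assuming code, treats two adjacent 0x21 bytes as an escaped terminator and a lone 0x21 as a chunk ending.

```python
def count_terminators_bytewise(data):
    """Count chunk endings, assuming an ascii compatible encoding.

    Two adjacent 0x21 bytes ('!!') are an escaped '!'; a lone 0x21
    ends a chunk.
    """
    count = 0
    i = 0
    while i < len(data):
        if data[i] == 0x21:
            if i + 1 < len(data) and data[i + 1] == 0x21:
                i += 2  # escaped terminator, skip both bytes
                continue
            count += 1  # lone '!' ends a chunk
        i += 1
    return count


text = "a!!b!"  # one escaped '!' followed by one real chunk ending
utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-be")

print("!!".encode("utf-16-be").hex())     # 00210021
print(count_terminators_bytewise(utf8))   # 1
print(count_terminators_bytewise(utf16))  # 3
```

In utf16 every 0x21 is separated from the next by a 0x00 byte, so the escaped pair is never seen as adjacent and each ! is counted as its own chunk ending.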

> The method you wrote in Squeak doesn't have this limitation; on the
> other hand, that does little for encodings other than utf8 (which, of
> course, could be successfully argued to be the important case).

Currently two encodings are used for fileIn/Out: utf8 and macroman. Both
are ascii compatible. Macroman is only supported for reading; writing
is always done as utf8. The performance of other encodings can easily be
improved in Squeak if necessary (but that's very unlikely).

> This way, you also avoid a Stream creation/String copy in the common

This could be added to Squeak, but I found that it doesn't make much 
(measurable) difference.


Levente

> case, as well as not let encodings in addition to streams contain
> special logic to read in code chunks.
>> Levente
>>
>>
> Cheers,
> Henry
>
> _______________________________________________
> Pharo-project mailing list
> [email protected]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>

