Re: [Pharo-dev] 12259: FileSystem memory reads writesusing a binary stream by default

Max Leske Sat, 25 Jan 2014 14:44:52 -0800

So in favor of not modifying ReadStream I implemented your suggestion 
#binaryReadStream in Filesystem (will not work on all streams now but that’s ok 
for me until we get Xstreams).
Please take a look and let me know what you think (ReadStream would have been 2 
methods only vs. ~6 :) ).


Name: 
SLICE-Issue-12259-FileSystem-memory-readswrites-using-a-binary-stream-by-default-MaxLeske.1
Author: MaxLeske
Time: 25 January 2014, 11:38:48.027093 pm
UUID: 1248d0f9-f094-400f-a736-23fd41d12e50
Ancestors: 
Dependencies: FileSystem-Tests-Core-MaxLeske.67, FileSystem-Memory-MaxLeske.46, 
FileSystem-Disk-MaxLeske.72, FileSystem-Core-MaxLeske.139

* added #binaryReadStream and friends to FileSystem (for compatibility)

Cheers,
Max


On 10.12.2013, at 22:19, Nicolas Cellier <[email protected]> 
wrote:

> The more you add, the more you'll have to remove
> 
> 
> 2013/12/10 Damien Cassou <[email protected]>
> 
> On Dec 5, 2013 10:50 PM, "Max Leske" <[email protected]> wrote:
> >
> > There are several different approaches in different places:
> >
> > - FileStream reads strings by default. #binary and #ascii switch between 
> > formats. File streams use an internal buffer which is either a String 
> > (default) or a ByteArray. It’s even possible to switch between binary and 
> > ascii midstream without losing information (if done right) because it only 
> > affects the buffer.
> > - ReadStream and WriteStream cannot change their format. Their behavior is 
> > determined by the underlying collection. Forcing conversions (e.g. by 
> > #asString) can lead to loss of information
> > - RWBinaryOrTextStream (and other subclasses of ReadWriteStream) also 
> > support the #binary #ascii method of switching format. Default is #ascii
> > - SocketStream uses the same #binary / #ascii mechanism. Default is #ascii
> > - ZnLimitedReadStream uses the same #binary / #ascii mechanism. Default is 
> > #binary (implicit); depends on the underlying stream
> >
> > I think the pattern to follow is clear: ReadStream and WriteStream should 
> > allow switching format with #ascii and #binary, default should be #ascii. 
> > However, I suspect there’s a reason that these classes don’t support 
> > switching, namely that switching makes the implementation more complicated 
> > and also slower because more checks need to be made.
> >
> > The easiest solution I see would be to implement something like this:
> >
> > ReadStream>>next
> > ^ self isBinary
> > ifTrue: [ self basicNext asCharacter ]
> > ifFalse: [ self basicNext ]
> >
> > However, #next et al. are implemented in a plugin and the primitive method 
> > looks like this:
> >
> > ReadStream>>next
> > <primitive: 65> 
> > position >= readLimit 
> > ifTrue: [^nil] 
> > ifFalse: [^collection at: (position := position + 1)]
> >
> > This means the collection instance variable has to hold either a binary or 
> > a string collection.
> >
> > I’ve found a solution which would work and I’ve whipped up a working way 
> > (there’s space for improvement…):
> >
> > ReadStream>>binary
> > collection isString ifFalse: [ ^ self ].
> > collection := (ByteArray new: collection size) copyReplaceFrom: 1 to: 
> > collection size with: collection
> >
> > ReadStrem>>ascii
> > collection isString ifTrue: [ ^ self ].
> > collection := (String new: collection size) copyReplaceFrom: 1 to: 
> > collection size with: collection
> >
> > @Damien
> > opposed to what I wrote earlier, #asString does *not* destroy non-printable 
> > characters. Instead, every byte (from 0 to 255) is encoded as a character 
> > and thus the string can be converted back to a ByteArray *without* loss of 
> > information. Sorry about that.
> Thank you very much for your analysis. I'm a bit reluctant to change 
> ReadStream now, but I could be ok with enough unit tests. 
> Another potential solution: What about adding a #binaryReadStream method to 
> the memory file system as a workaround before the introduction of xstreams ? 
> 
> > With this change in place the 12259 would become obsolete.
> >
> > Please let me know what you think. This is a pretty big change that might 
> > have a lot of consequences in the image.
> >
> > Cheers,
> > Max
> >
> > On 04.12.2013, at 13:14, Max Leske <[email protected]> wrote:
> >
> >> Let me see what I can come up with.
> >>
> >>
> >> On 03.12.2013, at 19:36, Damien Cassou <[email protected]> wrote:
> >>
> >>> Thanks Max for the report. Do you have an idea on how we could solve the 
> >>> problem ? The previous behaviour was not acceptable either because the 
> >>> streams that came out of a memory filesystem were the only ones with 
> >>> binary content
> >>>
> >>> On Dec 3, 2013 5:35 PM, "Max Leske" <[email protected]> wrote:
> >>>>
> >>>> Damien, Marcus
> >>>>
> >>>> this change breaks a lot of things in FileSystem-Git. I don’t disagree 
> >>>> with the idea that reading characters should be default (one could argue 
> >>>> about it…) but your change makes it IMPOSSIBLE to read bytes because 
> >>>> unprintable characters are discarded! So if my ByteArray is a NULL 
> >>>> terminated string, for instance, I can not check for the NULL 
> >>>> termination anymore.
> >>>>
> >>>> Cheers,
> >>>> Max
> >>
> >>
> >
>

Re: [Pharo-dev] 12259: FileSystem memory reads writesusing a binary stream by default

Reply via email to