Alistair,

I am replying in-between, please don't take my remarks as being negative, I 
just like to straighten things out as clear as possible.

> On 10 Apr 2018, at 22:14, Alistair Grant <akgrant0...@gmail.com> wrote:
> 
> Hi Sven,
> 
> On 10 April 2018 at 19:36, Sven Van Caekenberghe <s...@stfx.eu> wrote:
>> I have trouble understanding your problem analysis, and how your proposed 
>> solution, would solve it.
> 
> The discussion started with my proposal to modify the Zinc streams to
> check the return value of #next for nil rather than using #atEnd to
> iterate over the underlying stream.  You correctly pointed out that
> the intention behind #atEnd is that it can be used to control
> iterating over streams.

OK

>> Where is it being said that #next and/or #atEnd should be blocking or 
>> non-blocking ?
> 
> There is existing code that assumes that #atEnd is non-blocking and
> that #next is allowed block.  I believe that we should keep those
> conditions.

I fail to see where that is written down, either way. Can you point me to 
comments stating that, I would really like to know ?

>> How is this related to how EOF is signalled ?
> 
> Because, combined with terminal EOF not being known until the user
> explicitly flags it (with Ctrl-D) it means that #atEnd can't be used
> for iterating over input from stdin connected to a terminal.

This seems to me like an exception that only holds for one particular stream in 
one particular scenario (interactive stdin). I might be wrong.

>> It seems to me that there are quite a few classes of streams that are 
>> 'special' in the sense that #next could be blocking and/or #atEnd could be 
>> unclear - socket/network streams, serial streams, maybe stdio (interactive 
>> or not). Without a message like #isDataAvailable you cannot handle those 
>> without blocking.
> 
> Right.  I think this is a distraction (I was trying to explain some
> details, but it's causing more confusion instead of helping).
> 
> The important point is that #atEnd doesn't work for iterating over
> streams with terminal input

Maybe you should also point to the actual code that fails. I mean you showed a 
partial stack trace, but not how you got there, precisely. How does the 
application reading from an interactive stdin do to get into trouble ?

>> Reading from stdin seems like a very rare case for a Smalltalk system (not 
>> that it should not be possible).
> 
> There's been quite a bit of discussion and several projects recently
> related to using pharo for scripting, so it may become more common.
> E.g.
> 
> https://www.quora.com/Can-Smalltalk-be-a-batch-file-scripting-language/answer/Philippe-Back-1?share=c19bfc95
> https://github.com/rajula96reddy/pharo-cli

Still, it is not common at all.

>> I have a feeling that too much functionality is being pushed into too small 
>> an API.
> 
> This is just about how should Zinc streams be iterating over the
> underlying streams.  You didn't like checking the result of #next for
> nil since it isn't general, correctly pointing out that nil is a valid
> value for non-byte oriented streams.  But #atEnd doesn't work for
> stdin from a terminal.
> 
> 
> At this point I think there are three options:
> 
> 1. Modify Zinc to check the return value of #next instead of using #atEnd.
> 
> This is what all existing character / byte oriented streams in Squeak
> and Pharo do.  At that point the Zinc streams can be used on all file
> / stdio input and output.

I agree that such code exists in many places, but there is lots of stream 
reading that does not check for nils. 

> 2. Modify all streams to signal EOF in some other way, i.e. a sentinel
> or notification / exception.
> 
> This is what we were discussing below.  But it is a decent chunk of
> work with significant impact on the existing code base.

Agreed. This would be a future extension.

> 3. Require anyone who wants to read from stdin to code around Zinc's
> inability to handle terminal input.
> 
> I'd prefer to avoid this option if possible.

See higher for a more concrete usage example request. 

> Does that clarify the situation?

Yes, it helps. Thanks. But questions remain.

> Thanks,
> Alistair
> 
> 
> 
>>> On 10 Apr 2018, at 18:30, Alistair Grant <akgrant0...@gmail.com> wrote:
>>> 
>>> First a quick update:
>>> 
>>> After doing some work on primitiveFileAtEnd, #atEnd now answers
>>> correctly for files that don't report their size correctly, e.g.
>>> /dev/urandom and /proc/cpuinfo, whether the files are opened directly or
>>> redirected through stdin.
>>> 
>>> However determining whether stdin from a terminal has reached the end of
>>> file can't be done without making #atEnd blocking since we have to wait
>>> for the user to flag the end of file, e.g. by typing Ctrl-D.  And #atEnd
>>> is assumed to be non-blocking.
>>> 
>>> So currently using ZnCharacterReadStream with stdin from a terminal will
>>> result in a stack dump similar to:
>>> 
>>> MessageNotUnderstood: receiver of "<" is nil
>>> UndefinedObject(Object)>>doesNotUnderstand: #<
>>> ZnUTF8Encoder>>nextCodePointFromStream:
>>> ZnUTF8Encoder(ZnCharacterEncoder)>>nextFromStream:
>>> ZnCharacterReadStream>>nextElement
>>> ZnCharacterReadStream(ZnEncodedReadStream)>>next
>>> UndefinedObject>>DoIt
>>> 
>>> 
>>> Going back through the various suggestions that have been made regarding
>>> using a sentinel object vs. raising a notification / exception, my
>>> (still to be polished) suggestion is to:
>>> 
>>> 1. Add an endOfStream instance variable
>>> 2. When the end of the stream is reached answer the value of the
>>>  instance variable (i.e. the result of sending #value to the variable).
>>> 3. The initial default value would be a block that raises a Deprecation
>>>  warning and then returns nil.  This would allow existing code to
>>>  function for a changeover period.
>>> 4. At the end of the deprecation period the default value would be
>>>  changed to a unique sentinel object which would answer itself as its
>>>  #value.
>>> 
>>> At any time users of the stream can set their own sentinel, including a
>>> block that raises an exception.
>>> 
>>> 
>>> Cheers,
>>> Alistair
>>> 
>>> 
>>> On 4 April 2018 at 19:24, Stephane Ducasse <stepharo.s...@gmail.com> wrote:
>>>> Thanks for this discussion.
>>>> 
>>>> On Wed, Apr 4, 2018 at 1:37 PM, Sven Van Caekenberghe <s...@stfx.eu> wrote:
>>>>> Alistair,
>>>>> 
>>>>> First off, thanks for the discussions and your contributions, I really 
>>>>> appreciate them.
>>>>> 
>>>>> But I want to have a discussion at the high level of the definition and 
>>>>> semantics of the stream API in Pharo.
>>>>> 
>>>>>> On 4 Apr 2018, at 13:20, Alistair Grant <akgrant0...@gmail.com> wrote:
>>>>>> 
>>>>>> On 4 April 2018 at 12:56, Sven Van Caekenberghe <s...@stfx.eu> wrote:
>>>>>>> Playing a bit devil's advocate, the idea is that, in general,
>>>>>>> 
>>>>>>> [ stream atEnd] whileFalse: [ stream next. "..." ].
>>>>>>> 
>>>>>>> is no longer allowed ?
>>>>>> 
>>>>>> It hasn't been allowed "forever" [1].  It's just been misused for
>>>>>> almost as long.
>>>>>> 
>>>>>> [1] Time began when stdio stream support was introduced. :-)
>>>>> 
>>>>> I am still not convinced. Another way to put it would be that the old 
>>>>> #atEnd or #upToEnd do not make sense for these streams and some new loop 
>>>>> is needed, based on a new test (it exists for socket streams already).
>>>>> 
>>>>> [ stream isDataAvailable ] whileTrue: [ stream next ]
>>>>> 
>>>>>>> And you want to replace it with
>>>>>>> 
>>>>>>> [ stream next ifNil: [ false ] ifNotNil: [ :x | "..." true ] whileTrue.
>>>>>>> 
>>>>>>> That is a pretty big change, no ?
>>>>>> 
>>>>>> That's the way quite a bit of code already operates.
>>>>>> 
>>>>>> As Denis pointed out, it's obviously problematic in the general sense,
>>>>>> since nil can be embedded in non-byte oriented streams.  I suspect
>>>>>> that in practice not many people write code that reads streams from
>>>>>> both byte oriented and non-byte oriented streams.
>>>>> 
>>>>> Maybe yes, maybe no. As Denis' example shows there is a clear definition 
>>>>> problem.
>>>>> 
>>>>> And I do use streams of byte arrays or strings all the time, this is 
>>>>> really important. I want my parsers to work on all kinds of streams.
>>>>> 
>>>>>>> I think/feel like a proper EOF exception would be better, more correct.
>>>>>>> 
>>>>>>> [ [ stream next. "..." true ] on: EOF do: [ false ] ] whileTrue.
>>>>>> 
>>>>>> I agree, but the email thread Nicolas pointed to raises some
>>>>>> performance questions about this approach.  It should be
>>>>>> straightforward to do a basic performance comparison which I'll get
>>>>>> around to if other objections aren't raised.
>>>>> 
>>>>> Reading in bigger blocks, using #readInto:startingAt:count: (which is 
>>>>> basically Unix's (2) Read sys call), would solve performance problems, I 
>>>>> think.
>>>>> 
>>>>>>> Will we throw away #atEnd then ? Do we need it if we cannot use it ?
>>>>>> 
>>>>>> Unix file i/o returns EOF if the end of file has been reach OR if an
>>>>>> error occurs.  You should still check #atEnd after reading past the
>>>>>> end of the file to make sure no error occurred.  Another part of the
>>>>>> primitive change I'm proposing is to return additional information
>>>>>> about what went wrong in the event of an error.
>>>>> 
>>>>> I am sorry, but this kind of semantics (the OR) is way too complex at the 
>>>>> general image level, it is too specific and based on certain underlying 
>>>>> implementation details.
>>>>> 
>>>>> Sven
>>>>> 
>>>>>> We could modify the read primitive so that it fails if an error has
>>>>>> occurred, and then #atEnd wouldn't be required.
>>>>>> 
>>>>>> Cheers,
>>>>>> Alistair
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>>> On 4 Apr 2018, at 12:41, Alistair Grant <akgrant0...@gmail.com> wrote:
>>>>>>>> 
>>>>>>>> Hi Nicolas,
>>>>>>>> 
>>>>>>>> On 4 April 2018 at 12:36, Nicolas Cellier
>>>>>>>> <nicolas.cellier.aka.n...@gmail.com> wrote:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 2018-04-04 12:18 GMT+02:00 Alistair Grant <akgrant0...@gmail.com>:
>>>>>>>>>> 
>>>>>>>>>> Hi Sven,
>>>>>>>>>> 
>>>>>>>>>> On Wed, Apr 04, 2018 at 11:32:02AM +0200, Sven Van Caekenberghe 
>>>>>>>>>> wrote:
>>>>>>>>>>> Somehow, somewhere there was a change to the implementation of the
>>>>>>>>>>> primitive called by some streams' #atEnd.
>>>>>>>>>> 
>>>>>>>>>> That's a proposed change by me, but it hasn't been integrated yet.  
>>>>>>>>>> So
>>>>>>>>>> the discussion below should apply to the current stable vm (from 
>>>>>>>>>> August
>>>>>>>>>> last year).
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>> IIRC, someone said it is implemented as 'remaining size being zero'
>>>>>>>>>>> and some virtual unix files like /dev/random are zero sized.
>>>>>>>>>> 
>>>>>>>>>> Currently, for files other than sdio (stdout, stderr, stdin) it is
>>>>>>>>>> effectively defined as:
>>>>>>>>>> 
>>>>>>>>>> atEnd := stream position >= stream size
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> And, as you say, plenty of virtual unix files report size 0.
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>> Now, all kinds of changes are being done image size to work around 
>>>>>>>>>>> this.
>>>>>>>>>> 
>>>>>>>>>> I would phrase this slightly differently :-)
>>>>>>>>>> 
>>>>>>>>>> Some code does the right thing, while other code doesn't.  E.g.:
>>>>>>>>>> 
>>>>>>>>>> MultiByteFileStream>>upToEnd is good, while
>>>>>>>>>> FileStream>>contents is incorrect
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>> I am a strong believer in simple, real (i.e. infinite) streams, but 
>>>>>>>>>>> I
>>>>>>>>>>> am not sure we are doing the right thing here.
>>>>>>>>>>> 
>>>>>>>>>>> Point is, I am not sure #next returning nil is official and 
>>>>>>>>>>> universal.
>>>>>>>>>>> 
>>>>>>>>>>> Consider the comments:
>>>>>>>>>>> 
>>>>>>>>>>> Stream>>#next
>>>>>>>>>>> "Answer the next object accessible by the receiver."
>>>>>>>>>>> 
>>>>>>>>>>> ReadStream>>#next
>>>>>>>>>>> "Primitive. Answer the next object in the Stream represented by the
>>>>>>>>>>> receiver. Fail if the collection of this stream is not an Array or a
>>>>>>>>>>> String.
>>>>>>>>>>> Fail if the stream is positioned at its end, or if the position is 
>>>>>>>>>>> out
>>>>>>>>>>> of
>>>>>>>>>>> bounds in the collection. Optional. See Object documentation
>>>>>>>>>>> whatIsAPrimitive."
>>>>>>>>>>> 
>>>>>>>>>>> Note how there is no talk about returning nil !
>>>>>>>>>>> 
>>>>>>>>>>> I think we should discuss about this first.
>>>>>>>>>>> 
>>>>>>>>>>> Was the low level change really correct and the right thing to do ?
>>>>>>>>>> 
>>>>>>>>>> The primitive change proposed doesn't affect this discussion.  It 
>>>>>>>>>> will
>>>>>>>>>> mean that #atEnd returns false (correctly) sometimes, while 
>>>>>>>>>> currently it
>>>>>>>>>> returns true (incorrectly).  The end result is still incorrect, e.g.
>>>>>>>>>> #contents returns an empty string for /proc/cpuinfo.
>>>>>>>>>> 
>>>>>>>>>> You're correct about no mention of nil, but we have:
>>>>>>>>>> 
>>>>>>>>>> FileStream>>next
>>>>>>>>>> 
>>>>>>>>>>     (position >= readLimit and: [self atEnd])
>>>>>>>>>>             ifTrue: [^nil]
>>>>>>>>>>             ifFalse: [^collection at: (position := position + 1)]
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> which has been around for a long time (I suspect, before Pharo 
>>>>>>>>>> existed).
>>>>>>>>>> 
>>>>>>>>>> Having said that, I think that raising an exception is a better
>>>>>>>>>> solution, but it is a much, much bigger change than the one I 
>>>>>>>>>> proposed
>>>>>>>>>> in https://github.com/pharo-project/pharo/pull/1180.
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> Cheers,
>>>>>>>>>> Alistair
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Hi,
>>>>>>>>> yes, if you are after universal behavior englobing Unix streams, the
>>>>>>>>> Exception might be the best way.
>>>>>>>>> Because on special stream you can't allways say in advance, you have 
>>>>>>>>> to try.
>>>>>>>>> That's the solution adopted by authors of Xtreams.
>>>>>>>>> But there is a runtime penalty associated to it.
>>>>>>>>> 
>>>>>>>>> The penalty once was so high that my proposal to generalize 
>>>>>>>>> EndOfStream
>>>>>>>>> usage was rejected a few years ago by AndreaRaab.
>>>>>>>>> http://forum.world.st/EndOfStream-unused-td68806.html
>>>>>>>> 
>>>>>>>> Thanks for this, I'll definitely take a look.
>>>>>>>> 
>>>>>>>> Do you have a sense of how Denis' suggestion of using an EndOfStream
>>>>>>>> object would compare?
>>>>>>>> 
>>>>>>>> It would keep the same coding style, but avoid the problems with nil.
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> Alistair
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> I have regularly benched Xtreams, but stopped a few years ago.
>>>>>>>>> Maybe i can excavate and pass on newer VM.
>>>>>>>>> 
>>>>>>>>> In the mean time, i had experimented a programmable end of stream 
>>>>>>>>> behavior
>>>>>>>>> (via a block, or any other valuable)
>>>>>>>>> http://www.squeaksource.com/XTream.htm
>>>>>>>>> so as to reconcile performance and universality, but it was a source 
>>>>>>>>> of
>>>>>>>>> complexification at implementation side.
>>>>>>>>> 
>>>>>>>>> Nicolas
>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>> Note also that a Guille introduced something new, #closed which is
>>>>>>>>>>> related to the difference between having no more elements (maybe 
>>>>>>>>>>> right now,
>>>>>>>>>>> like an open network stream) and never ever being able to produce 
>>>>>>>>>>> more data.
>>>>>>>>>>> 
>>>>>>>>>>> Sven
>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>> 
>>> 
>> 
>> 
> <stdio.cs>


Reply via email to