Re: [Pharo-dev] About cr and lf

Thierry Goubier Sun, 06 Aug 2017 02:48:52 -0700

Hi Stef, all,

I'd like to point out that, during the life of an image (prepared on Windows, then used on Linux, then on Mac), the meaning of a new line changes. Should all 'newLines' written to streams while on Windows then be changed when the image restarts on Linux?


My take would be:

- Set a convention internal to Pharo (#cr is then fine for me)
- provide an easy and simple "convert to current os convention"

I suspect, Stef, that your #newLine will end up being a mess for multi-platform apps (and will force them to add code to determine on which platform a stream was written, to be able to convert it back).
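
A minimal sketch of the "internal convention plus explicit conversion" idea, meant for a Playground. It assumes text inside the image only ever uses #cr, and it relies on `OSPlatform current lineEnding` (the expression already used elsewhere in this thread) to find the convention of the platform the image is running on at conversion time. The variable names are illustrative only.

```smalltalk
| internal external |
"Text held inside the image, using #cr as the internal convention."
internal := 'first line', String cr, 'second line', String cr.

"Convert once, at the boundary, to the convention of the platform
 the image is running on right now (assumes no other line endings
 are present in the text)."
external := internal
	copyReplaceAll: String cr
	with: OSPlatform current lineEnding.
external
```

Because the conversion happens only when the text leaves the image, nothing stored in a stream has to be rewritten when the image moves from one platform to another.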


Regards,

Thierry


Le 06/08/2017 à 10:54, Stephane Ducasse a écrit :

Agreed :)

But so far what do we do?


- #cr and #lf just put that one character in the stream.
- #newLine puts the underlying OS line ending.

Then we should revisit all the uses of cr inside the system and use newLine.
Then we should think about the internal usage of cr by default in
Pharo (we should change it).
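
As a rough Playground illustration of that split: #cr and #lf below are the existing stream messages, while #newLine is still only a proposal, so its effect is imitated here by writing `OSPlatform current lineEnding` directly.

```smalltalk
| text |
text := String streamContents: [ :out |
	"Low level: exactly one carriage return character."
	out nextPutAll: 'one'; cr.
	"Low level: exactly one line feed character."
	out nextPutAll: 'two'; lf.
	"What the proposed #newLine would write: the OS line ending."
	out nextPutAll: 'three'; nextPutAll: OSPlatform current lineEnding ].
text
```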

Does it make sense?
Stef


On Sun, Aug 6, 2017 at 10:36 AM, Sven Van Caekenberghe <[email protected]> wrote:

On 6 Aug 2017, at 08:59, Guillermo Polito <[email protected]> wrote:

Can somebody propose an implementation on top of this discussion? I
propose that such an implementation take the form of a stream decorator
instead of changing the base implementation.


YES, a decorator !

We want simpler streams, not the old complex ones. Less API, less functionality.
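
One possible shape for such a decorator, as a sketch only. The class name `NewLineWriteStream`, its package and its selectors are hypothetical and not part of Pharo. The methods are written in the `Class >> selector` notation used in this thread, so each one has to be compiled into the class rather than evaluated as a single script; only the last lines are Playground code.

```smalltalk
"Hypothetical decorator: wraps any write stream and adds #newLine."
Object subclass: #NewLineWriteStream
	instanceVariableNames: 'stream lineEnding'
	classVariableNames: ''
	package: 'NewLine-Sketch'

NewLineWriteStream class >> on: aWriteStream
	^ self new setStream: aWriteStream

NewLineWriteStream >> setStream: aWriteStream
	"Default to the line ending of the current platform."
	stream := aWriteStream.
	lineEnding := OSPlatform current lineEnding

NewLineWriteStream >> lineEnding: aString
	"Override the convention, e.g. String crlf for a Windows file."
	lineEnding := aString

NewLineWriteStream >> nextPutAll: aString
	stream nextPutAll: aString

NewLineWriteStream >> cr
	"Stays literal: exactly one carriage return character."
	stream nextPut: Character cr

NewLineWriteStream >> lf
	"Stays literal: exactly one line feed character."
	stream nextPut: Character lf

NewLineWriteStream >> newLine
	"Logical new line: whatever convention this decorator was given."
	stream nextPutAll: lineEnding

NewLineWriteStream >> contents
	^ stream contents

"Playground usage, once the class and methods above are compiled:"
| out |
out := NewLineWriteStream on: (WriteStream on: String new).
out lineEnding: String crlf.
out nextPutAll: 'a'; newLine; nextPutAll: 'b'; cr.
out contents
```

The wrapped stream keeps its small API, and the line-ending policy lives in one place that callers opt into.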

Guille

On Sun, Aug 6, 2017 at 12:01 AM, Peter Uhnak <[email protected]> wrote:
Hi,

just to (hopefully) clarify my intention, maybe pseudocode would describe my 
thoughts better.

Stream>>beForWindows
         "Consistently use Windows line endings (CRLF) in the stream"
         self convertLineEndings: true.
         lineEnding := String crlf

Stream>>convertLineEndings: aBoolean
         "Automatically convert line endings to the predefined one"
         convertLineEndings := aBoolean

Stream>>nl
         "Write the line ending configured for this stream"
         self nextPutAll: lineEnding

Stream>>cr
         convertLineEndings
                 ifTrue: [
                         self deprecated: 'Using #cr/#lf for generic newlines is deprecated, use #nl instead'.
                         self nl ]
                 ifFalse: [ self nextPutAll: String cr ]


So when "convertLineEndings = true", using anything other than #nl would
warn the user that they should use #nl instead (as they explicitly requested to have the
line endings consistent).

And when "convertLineEndings = false", it would behave pretty much the same way
as now: #cr would write #cr, etc.


With such approach imho

* output of existing code wouldn't be broken
* when switching to the new scheme (#nl) the programmer would be warned where they
missed a change
* it will be easier to keep the newlines consistent

Peter

On Sat, Aug 05, 2017 at 11:30:58AM +0200, Esteban Lorenzano wrote:

On 5 Aug 2017, at 11:17, Peter Uhnak <[email protected]> wrote:

I think there is a consensus we need to keep #cr and #lf as intended


Is there?

My argument was that there's no (obvious) reason to combine different line 
endings in the same document. Therefore if you were to use #cr, #lf, #crlf, you 
would actually mean that you just want to enter newline.


no, sometimes you want to enforce a specific line ending. You will not mix them, but
you will not use the one from the platform either. Also, sometimes you actually use
#cr and #lf differently from their immediate, common use (I’ve seen some weird
exporting formats).


Similar problem arises when you would write a multiline string:

stream nextPutAll: 'first line with enter
second line'.

Stored in method this will most likely contain Pharo's internal representation 
(#cr), even though you just want a new line, and not specifically #cr. (I've 
lost count how many times tests on CI failed because of this.)

Considering the above, my opinion is:

1) by default #cr, #lf, #crLf, #nl (#newLine) will write whatever is globally 
configured for the stream (#beFor*)


No, I strongly disagree.
#cr and #lf are ascii characters and should not be redefined.

2) if one wanted to combine different line endings in the same stream, there should 
be an option to disable autoconversion. (Stream>>noNewLineAutoconversion)

If (1) is much more common than (2), then imho autoconversion should cause no 
issues.
If (1) is NOT that much more common than (2), then autoconversion wouldn't be 
beneficial.

Autoconversion could also make transition easier because existing code will 
suddenly work as intended here without breaking anything (hopefully).


Sorry, I do not see what this approach solves that the other one does not (and
as I see it, this one is a lot more complicated and prone to confusion).

cheers,
Esteban


Peter


On Sat, Aug 05, 2017 at 10:49:02AM +0200, Esteban Lorenzano wrote:

I think there is a consensus that we need to keep #cr and #lf as intended, yet
add some kind of #newLine (which btw is different from EOL :P) vocabulary, isn’t there?

In this, I favour Peter's approach for defining the line ending convention (the way
#newLine will work)… and of course by default it should use the one from the
current platform.

anything against this approach?

Esteban

On 4 Aug 2017, at 23:48, Tudor Girba <[email protected]> wrote:

+1.

We need a basic representation of those characters. Logical ones should be 
derived from the simple ones.

Doru

On Aug 4, 2017, at 3:44 PM, Esteban Lorenzano <[email protected]> wrote:

On 4 Aug 2017, at 15:41, Damien Pollet <[email protected]> wrote:

I agree with Pablo, #cr and #lf should not be clever and just be names for the 
carriage return and linefeed characters/codepoints.

+1


Making #newLine's behavior dependent on the current platform disturbs me, 
though. I'd rather have:

Stream >> newLineFor: platform
   self nextPutAll: platform lineEnding

Stream >> newLineForCurrentPlatform
   self newLineFor: OSPlatform current

Stream >> newLineForWindows "convenience for the most common platforms"
Stream >> newLineForUnix
Stream >> newLineForHistoricReasons

Stream >> newLine
   "delegates to one of the above, I'd argue for unix for convenience, but windows 
is the technically correct combination of cr + lf, and cr only is the historic one"


On 4 August 2017 at 14:25, [email protected] <[email protected]> wrote:
To me it is clear that cr and lf should be in streams. But they should put the 
'cr' or 'lf' character only. And of course the platform independent newline 
should be also.

The first (cr, lf) should be used by code wanting to have absolute control
of what is in the stream. The latter (newline) when you just want a new line.

The two have completely different behaviour, ones are really low level, the 
other is higher level.

On 4 Aug 2017 14:20, "Esteban Lorenzano" <[email protected]> wrote:

On 4 Aug 2017, at 14:06, Stephane Ducasse <[email protected]> wrote:

Well. This is not implemented like that in Pharo.

cr is bad because it is not independent of the platform.
So cr can be redefined as newLine and kept, but not used inside the system.


sometimes you actually want to write a cr (or a lf). So it needs to remain in 
the system, of course.
now, including #newLine can be cool (most of the times you want the “platform 
compatible” new line). Also I would consider including #nl, abbreviated… just 
for convenience :P

Esteban


Stef

On Fri, Aug 4, 2017 at 12:50 PM, Jan Vrany <[email protected]> wrote:

On Fri, 2017-08-04 at 12:03 +0200, Stephane Ducasse wrote:

Hi guys

While writing Pillar code, I ended up using "stream cr" and it worries
me to still expand the usage of a pattern I would like to remove.

Let us imagine that we would like to prepare the migration from cr.
I was thinking that we could replace cr invocations by newLine, so that
afterwards newLine could be redefined as

Stream >> newLine
     self nextPutAll: OSPlatform current lineEnding


what do you think about this approach?


Why not? But please keep #cr.

Section 5.9.4.1 of ANSI reads:

Message: cr

Synopsis
Writes an end-of-line sequence to the receiver.

Definition: <puttableStream>
A sequence of character objects that constitute the implementation-
defined end-of-line sequence is added to the receiver in the same
manner as if the message  #nextPutAll: was sent to the receiver with
an argument string whose elements are the sequence of characters.

Return Value
UNSPECIFIED
Errors
It is erroneous if any element of the end-of-line sequence is an
object that does not conform to the receiver's sequence value type .

my 2c,

Jan


Stef






--
Damien Pollet
type less, do more [ | ] http://people.untyped.org/damien.pollet


--
www.tudorgirba.com
www.feenk.com

"Presenting is storytelling."





--

Guille Polito

Research Engineer
French National Center for Scientific Research - http://www.cnrs.fr


Web: http://guillep.github.io
Phone: +33 06 52 70 66 13
