Next plate is (1a) incidental whitespace.

Having decided that we are content with "fat" delimiters (""") for multi-line 
strings, we have some more choices to make regarding multi-line strings.  
(We're not going to talk about "raw" strings yet; let's finish the multi-line 
course first.)

Multi-line strings are different from single-line strings in a number of ways, 
so let's get clear on what we want "multi-line" to mean.

Line terminators:  When strings span lines, they do so using the line 
terminators present in the source file, which may vary depending on the 
operating system on which the file was authored.  Should this be an aspect of 
multi-line-ness, or should we normalize these to a standard line terminator?  
It seems a little weird to treat string literals quite so literally; the choice 
of line terminator is surely an incidental one.  I think we're all comfortable 
saying "these should be normalized", but it's worth bringing this up because it 
is merely one way in which incidental artifacts of how the string is embedded 
in the source program force us to interpret what the user meant.  Which brings 
us to the next incidental aspect...
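
To make "normalized" concrete, here is a minimal sketch of what normalization 
could mean at the library level.  The class and method names (LineTerminators, 
normalize) are hypothetical and not part of any proposal; the sketch simply 
assumes that both \r\n and a lone \r map to \n.

```java
final class LineTerminators {
    // Hypothetical helper: maps Windows (\r\n) and old Mac (\r) line
    // terminators to \n. Order matters: \r\n must be handled first so
    // that it does not become two newlines.
    static String normalize(String s) {
        return s.replace("\r\n", "\n").replace('\r', '\n');
    }

    public static void main(String[] args) {
        String windows = "line one\r\nline two\r\n";
        String unix = "line one\nline two\n";
        // The same snippet authored on two platforms compares equal
        // once normalized.
        System.out.println(normalize(windows).equals(unix)); // prints "true"
    }
}
```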

Whitespace:  A multi-line string is nestled in the context of a Java source 
program.  It is likely (though not guaranteed) that the indentation of lines 
has been distorted by the desire to make the embedded snippet align with the 
enclosing lines.  Most of the time, there is some combination of incidental 
whitespace and intended whitespace.  There are a number of algorithms by which 
we could try to intuit which the user intended.  Which brings us to ask:

 - Assuming the existence of a reasonable algorithm for re-aligning text, what 
should the _default_ be for the language? Should it assume the user wants 
re-alignment, or make the user explicitly opt in?
 - If the choice is "automatically align", how would we indicate the desire to 
opt out?
 - Should we limit what we do automatically to only what can be done by an 
equivalent library routine?

(Again, let's focus on the requirements and semantics and defaults first, 
before we bikeshed the syntax.)

It's hard to answer the above without a clear understanding of the use cases.  
So, here's a partial catalog of examples; let's play "what was the user 
thinking", and see if we can agree on that.

Examples;

String a = """
           +--------+
           |  text  |
           +--------+
           """; // first characters in first column?

String b = """
               +--------+
               |  text  |
               +--------+
           """; // first characters in first column or indented four spaces?

String c = """
               +--------+
               |  text  |
               +--------+
"""; // first characters in first column or indented several?

String d = """
    +--------+
    |  text  |
    +--------+
"""; // first characters in first column or indented four?

String e =
"""
+--------+
|  text  |
+--------+
"""; // heredoc?

String f = """


               +--------+
               |  text  |
               +--------+


           """; // one or all leading or trailing blank lines stripped?

String g = """
              +--------+
              |  text  |
              +--------+"""; // Last \n dropped

String h = """+--------+
              |  text  |
              +--------+"""; // determine indent of first line using scanner knowledge?

String i = """  "nested"  """; // strip leading/trailing space?

String j = ("""
                 public static void """ + name + """(String... args) {
                     System.out.println(String.join(args));
                 }
           """).align(); // how do we handle expressions with multi-line strings?

String k = """
                 public static void %s(String... args) {
                     System.out.println(String.join(args));
                 }
           """.format(name); // is this the answer to multi-line string expressions?

As we can see, there were a lot of cases where the user _probably_ wanted one 
thing, but _might have_ wanted another.  What control knobs do we have, that we 
could assign meaning to, that would let the user choose either way?  Candidates 
include:

 - The opening line (is it blanks followed by a newline, or are there 
non-whitespace characters?)
 - The position of the close delimiter (is it on its own line, or not?)

Similarly, we have a number of policy choices:

 - Do we allow content on the same lines as the delimiters?
 - Should we always add a final newline?
 - Should we strip blank lines?  Only on the first and last?  All leading and 
trailing?
 - How do we interpret auto-alignment on single-line strings? Strip?
 - Should we right strip lines?
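
As one data point for the blank-line question, here is a sketch of the "strip 
all leading and trailing blank lines" policy written as an ordinary library 
routine.  BlankLines and stripBlankLines are hypothetical names, and keeping a 
single final newline is an assumption made for the sketch, not a decision.

```java
import java.util.Arrays;
import java.util.List;

final class BlankLines {
    // Hypothetical helper: drops every leading and trailing blank line
    // (empty or whitespace-only) and keeps interior blank lines as-is.
    // Assumes line terminators were already normalized to \n.
    static String stripBlankLines(String s) {
        List<String> lines = Arrays.asList(s.split("\n", -1));
        int start = 0;
        int end = lines.size();
        while (start < end && lines.get(start).isBlank()) start++;
        while (end > start && lines.get(end - 1).isBlank()) end--;
        if (start == end) return "";
        // Assumption: the result always ends with exactly one newline.
        return String.join("\n", lines.subList(start, end)) + "\n";
    }

    public static void main(String[] args) {
        String padded = "\n\n+--+\n|  |\n\n\n";
        System.out.print(stripBlankLines(padded));
    }
}
```

The "only the first and last" policy would replace the two while loops with a 
single check at each end.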

And some syntax choices (not to be discussed now):

 - How do we indicate opt-out?

Comments?


Examples narrative.  Don’t peek yet.  Stop and comment first.


Unlike most other Java constructs, multi-line strings force us to look at 
coding style "square on".  Keep in mind that we are often guilty of making 
assumptions about developer coding style.  For instance, we may assume that 
multi-line strings tend to be large elements.  We may also assume that 
developers will declare static final String variables to keep multi-line 
strings from messing up their code.  All very neat and tidy, but...  we know 
from experience that developers will use multi-line strings everywhere, as they 
have with array initialization and large lambda bodies.

From this, we recommend that multi-line string fat delimiters should follow the 
brace pattern used in array initialization, lambdas and other Java constructs. 
The open delimiter should end the current line.  Content follows on separate 
lines, indented one level.  The close delimiter starts a new line, back 
indented one level, followed by the continuation of the enclosing expression.

So as in this brace pattern;

int[] ia = new int[] {
        1,
        2,
        3
};

we have the fat delimiter pattern;

String d = """
    +--------+
    |  text  |
    +--------+
""";

and;

String.format("""
     public static void %s(String... args) {
         System.out.println(String.join(args));
     }
 """, name);

The fat delimiter pattern also significantly helps with future editing in and 
around the multi-line string.  For example, changing the length of the variable 
name in the above "String d =" example doesn't affect the positioning of the 
string content or the close delimiter.

If we adopt this style, some of the answers to the incidentals questions become 
easier or even moot.  Other styles are still valid, but the result of automatic 
incidental handling may be surprising.

Note that fat delimiters can be used on single lines.  What are the semantics 
for auto-alignment in that case?  The question of stripping whitespace and 
newlines is not really about alignment.  It's about what the rules are for 
handling incidental characters in a fat delimiter string.


Continuing with the examples, let's assume some (negotiable) auto-alignment 
basic rules;

1. All content lines are uniformly right stripped. Whitespace at the end of 
lines is not something that is consistently managed by IDEs/editors.
2. End of lines are always translated to \n.
3. If the content after the open delimiter is empty then the first end of line 
is discarded.
4. Content is left justified while preserving relative indentation.
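
The four rules above can be sketched as a plain library routine operating on 
the raw characters between the delimiters.  This is only an illustration under 
stated assumptions: AlignSketch and align are hypothetical names (not the 
proposed String::align), "left justified" is taken to mean removing the 
smallest indentation found on any non-blank line, and blank lines are ignored 
when computing that indentation.

```java
final class AlignSketch {
    // Hypothetical sketch of rules 1-4 applied to the raw text found
    // between the open and close delimiters.
    static String align(String raw) {
        // Rule 2: translate every end of line to \n.
        String[] lines = raw.replace("\r\n", "\n").replace('\r', '\n').split("\n", -1);

        // Rule 3: if nothing but whitespace follows the open delimiter,
        // discard that first (empty) line together with its end of line.
        int first = (lines.length > 1 && lines[0].isBlank()) ? 1 : 0;

        // Rule 1: right strip each line; also find the smallest indentation
        // among non-blank lines (assumption: blank lines do not count).
        int min = Integer.MAX_VALUE;
        for (int i = first; i < lines.length; i++) {
            lines[i] = lines[i].stripTrailing();
            if (!lines[i].isEmpty()) {
                min = Math.min(min, leadingWhitespace(lines[i]));
            }
        }

        // Rule 4: remove the common indentation, preserving relative indentation.
        StringBuilder out = new StringBuilder();
        for (int i = first; i < lines.length; i++) {
            if (i > first) out.append('\n');
            if (!lines[i].isEmpty()) out.append(lines[i].substring(min));
        }
        return out.toString();
    }

    private static int leadingWhitespace(String line) {
        int n = 0;
        while (n < line.length() && Character.isWhitespace(line.charAt(n))) n++;
        return n;
    }

    public static void main(String[] args) {
        // Raw content as it would appear for a string laid out in the
        // fat delimiter pattern, written here as an ordinary literal.
        String raw = "\n    +--------+\n    |  text  |\n    +--------+\n";
        System.out.print(align(raw));
    }
}
```

Because the routine sees only the characters between the delimiters, it has no 
way of knowing the column at which the open delimiter sat, so content that 
starts on the open delimiter line keeps an indentation of zero.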

And as a reminder, in the last round we introduced or attempted to introduce 
the following String methods;

- String::indent(n) - used to change indentation, line by line (in JDK 12)
- String::align() and String::align(n) - used to manage incidental indentation 
(didn't make it)
- String::format as an instance method (resolution issues YTBD)

__________________________________________________________________________________________________
String a = """
           +--------+
           |  text  |
           +--------+
           """; // first characters in first column?

RESULT:
+--------+\n
|  text  |\n
+--------+\n

The problem with this example is that it is not following the fat delimiter 
pattern.  Let's change the variable name "a" to "something".

String something = """
        .......... +--------+
        .......... |  text  |
        .......... +--------+
        .......... """; // first characters in first column?

The "." indicate all the places where we had to add whitespace to maintain the 
pattern used.
__________________________________________________________________________________________________
String b = """
               +--------+
               |  text  |
               +--------+
           """; // first characters in first column or indented four?

RESULT:
+--------+\n
|  text  |\n
+--------+\n

Same maintenance problem as example (a).

Still works, but the question here is, do we give meaning to indentation 
relative to the close delimiter?  Did we want this instead?

    +--------+\n
    |  text  |\n
    +--------+\n

It's a nice trick but we sabotage the fat delimiter pattern.  We would always 
get at least one level of indentation, whether we wanted it or not.  Maybe 
better to code as;

String b = """
    +--------+
    |  text  |
    +--------+
""".indent(4);

So the question here is: should it be possible to specify "extra" indentation 
through the positioning of quotes, or are we better off saying that any extra 
indentation should be done through library calls?  Also noting that the library 
calls might be subject to compile time folding.
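
For reference, this is what the library-call route produces today with 
String::indent, using an ordinary string literal as a stand-in for the 
already-aligned content.  IndentSketch and box are hypothetical names used 
only for this illustration.

```java
final class IndentSketch {
    // Stand-in for the left-justified content of example (b).
    static String box() {
        return "+--------+\n|  text  |\n+--------+\n";
    }

    public static void main(String[] args) {
        // indent(4) prefixes every line with four spaces; a negative
        // argument removes up to that many leading whitespace characters.
        System.out.print(box().indent(4));
    }
}
```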
__________________________________________________________________________________________________
String c = """
               +--------+
               |  text  |
               +--------+
"""; // first characters in first column or indented several?

RESULT:
+--------+\n
|  text  |\n
+--------+\n

The amount of indentation is not a problem, just an aesthetic issue.

__________________________________________________________________________________________________
String d = """
    +--------+
    |  text  |
    +--------+
"""; // first characters in first column or indented four?

RESULT:
+--------+\n
|  text  |\n
+--------+\n

Textbook fat delimiter pattern.
__________________________________________________________________________________________________
String e =
"""
+--------+
|  text  |
+--------+
"""; // heredoc?

RESULT:
+--------+\n
|  text  |\n
+--------+\n

Just an aesthetic issue.
__________________________________________________________________________________________________
String f = """


               +--------+
               |  text  |
               +--------+


           """; // one or all leading or trailing blank lines stripped?

As-is would generate;
\n
\n
+--------+\n
|  text  |\n
+--------+\n
\n
\n

If we stripped away all leading or trailing blank lines, we would then have 
to code this as;

String f = "\n".repeat(2) + """
    +--------+
    |  text  |
    +--------+
""" + "\n".repeat(2);
__________________________________________________________________________________________________
String g = """
              +--------+
              |  text  |
              +--------+"""; // Last \n dropped

RESULT:
+--------+\n
|  text  |\n
+--------+

This one is likely okay. It's not the fat delimiter pattern, but the oddity 
makes it clear we mean something different; we want to drop the last \n.
__________________________________________________________________________________________________
String h = """+--------+
              |  text  |
              +--------+"""; // determine indent of first line using scanner knowledge?

RESULT:
+--------+\n
|  text  |\n
+--------+

We can do this because the compiler's scanner can determine the indentation on 
the open delimiter line.  However, this one is problematic if we require a 
String method to duplicate the compiler's algorithm (String::align).  Tool 
vendors may also find this one problematic.
__________________________________________________________________________________________________
String i = """  "nested"  """; // strip leading/trailing space?

RESULT:
"nested"

This one still follows the rules; left and right stripped.
__________________________________________________________________________________________________
String j = ("""
                 public static void """ + name + """(String... args) {
                     System.out.println(String.join(args));
                 }
           """).align(); // how do we handle expressions with multi-line strings?

Mid-string substitution gets messy fast.  Let's break the example down to the 
following (without align.)

String j = """
                 public static void """ + name + """(String... args) {
                     System.out.println(String.join(args));
                 }
           """;

This is the same as

String j =
"""
    public static void """
+ name +
"""(String... args) {
        System.out.println(String.join(args));
    }
""";

Which works fine if we say no \n when the close delimiter is on the same line. 
The other requirement is that each multi-line string component ends up with a 
common indentation.  The odds of that happening are poor.

Guess we're stuck with parentheses and String::align. Unless...
__________________________________________________________________________________________________
String k = """
                 public static void %s(String... args) {
                     System.out.println(String.join(args));
                 }
           """.format(name); // is this the answer to multi-line string expressions?

RESULT:
public static void methodName(String... args) {
    System.out.println(String.join(args));
}

Maybe a better substitution solution.
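
Since the instance-method form of format is still unresolved, here is the same 
substitution sketched with the existing static String.format and an ordinary 
string literal standing in for the aligned content.  FormatSketch and 
methodSource are hypothetical names; a delimiter argument was added to the 
String.join call in the template so that the generated method would compile.

```java
final class FormatSketch {
    // Builds the source of a method named by the caller. The template is
    // one whole string, so there is no mid-string concatenation and no
    // per-fragment indentation to reconcile.
    static String methodSource(String name) {
        String template =
            "public static void %s(String... args) {\n" +
            "    System.out.println(String.join(\" \", args));\n" +
            "}\n";
        return String.format(template, name);
    }

    public static void main(String[] args) {
        System.out.print(methodSource("methodName"));
    }
}
```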
__________________________________________________________________________________________________
