Time to take stock of where we are with respect to the multi-line aspect of Raw String Literals.  Not surprisingly, this has taken a few iterations, but I think we're in a pretty reasonable place right now.

The main challenge is separating "intended" indentation of multi-line strings from the "incidental" indentation that comes from wanting the embedded snippet to look reasonable in the context of the surrounding code / inserted by IDEs.  This, in turn, has prompted an exploration of "what is the user thinking" (always a dangerous question) and a number of proposed tweaks to allow the user greater control over saying "these four spaces may look like incidental indentation, but they are in fact intended." The earlier attempts at controlling these added complexity, but Jim has found a way to get that effect while rolling back the complexity.

The main transform we've been designing is what we're now calling `align()` (previously `stripIndent()` or `trimIndent()`), which is to remove all seemingly-incidental horizontal and vertical indentation.  This has been simplified as follows:
 - Now removes all leading and trailing blank lines;
 - Left-justifies remaining text based only on indentation of non-blank lines.
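
To make the two steps above concrete, here is a minimal sketch written against ordinary strings. The class `AlignSketch` and its static `align` helper are hypothetical names used for illustration only, not the proposed implementation. Two details are assumptions the text does not settle: interior blank lines are emitted as empty lines, and the result carries no trailing newline.

```java
import java.util.ArrayList;
import java.util.List;

class AlignSketch {
    // Hypothetical helper: a sketch of the two steps described above,
    // not the actual library method.
    static String align(String s) {
        List<String> lines = new ArrayList<>(List.of(s.split("\n", -1)));

        // Step 1: remove all leading and trailing blank lines.
        while (!lines.isEmpty() && lines.get(0).isBlank()) {
            lines.remove(0);
        }
        while (!lines.isEmpty() && lines.get(lines.size() - 1).isBlank()) {
            lines.remove(lines.size() - 1);
        }

        // Step 2: find the smallest indentation among non-blank lines only.
        int min = Integer.MAX_VALUE;
        for (String line : lines) {
            if (!line.isBlank()) {
                min = Math.min(min, leadingWhitespace(line));
            }
        }

        // Left-justify by that amount. Assumption: interior blank lines
        // become empty lines.
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            out.add(line.isBlank() ? "" : line.substring(min));
        }
        // Assumption: no trailing newline is added.
        return String.join("\n", out);
    }

    // Counts the whitespace characters at the start of a line.
    static int leadingWhitespace(String line) {
        int i = 0;
        while (i < line.length() && Character.isWhitespace(line.charAt(i))) {
            i++;
        }
        return i;
    }

    public static void main(String[] args) {
        String raw = "\n"
                + " ".repeat(8) + "blah blah\n"
                + " ".repeat(13) + "blah\n"
                + " ".repeat(7);
        System.out.println(align(raw));
    }
}
```

In this sketch, the same snippet embedded eight columns deeper produces the same result, since only the indentation relative to the least-indented non-blank line survives.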

The observation that led us here is: if the user wants some extra horizontal indentation, it's better to specify this explicitly (say, via an `indent(n)` method) than to rely on significant whitespace of the trailing line.  Similarly, if the user wants extra vertical indentation, that can also be added explicitly.  We think that the current operation now essentially finds Kevin's minimal "rectangular box", and then the user can explicitly add back any extra indentation (horizontal or vertical) that is desired.

There are a few reasons why this operation is important:
 - Most users will not want to have to start their code "undented" relative to the Java code, but instead will want it to embed cleanly both horizontally and vertically;
 - As the code is refactored, incidental indentation will change, and this may cause instability in the output.  Users will want a way to get to stable output.

Raw string literals already normalize end-of-line characters.  We could describe the transformations that `align()` does as further normalizing horizontal whitespace.

In order to make the above argument work, there needs to be an easy way to do relative indentation, which is proposed as:

    String indent(int n)

This indents a multi-line string to the left (negative n) or to the right (positive n) by n whitespace characters.  We can then define, for convenience:

    String align(int n) { return align().indent(n); }

so that users can express normalization + indentation in one go:

    String s = `
                blah blah
                     blah
               `.align(4); // normalized, indented 4 chars

So there are two indentation mechanisms: relative (indent) and absolute (align).  This covers the waterfront.
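
As a rough illustration of the relative mechanism, the sketch below implements an `indent`-style operation on ordinary strings. `IndentSketch` and its static `indent(String, int)` are hypothetical names, and several behaviors are assumptions rather than anything specified above: padding uses spaces, empty lines are left unpadded, and a negative `n` removes at most `|n|` leading whitespace characters without ever eating non-whitespace text.

```java
class IndentSketch {
    // Hypothetical helper: shifts every line right (n > 0) or left (n < 0).
    static String indent(String s, int n) {
        String[] lines = s.split("\n", -1);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < lines.length; i++) {
            if (i > 0) {
                sb.append('\n');
            }
            String line = lines[i];
            if (n >= 0) {
                // Assumption: empty lines are not padded.
                sb.append(line.isEmpty() ? line : " ".repeat(n) + line);
            } else {
                // Remove up to |n| leading whitespace characters, stopping
                // early at the first non-whitespace character.
                int remove = 0;
                while (remove < -n
                        && remove < line.length()
                        && Character.isWhitespace(line.charAt(remove))) {
                    remove++;
                }
                sb.append(line.substring(remove));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // An already left-justified snippet, pushed right by four columns.
        System.out.println(indent("blah blah\n     blah", 4));
    }
}
```

Under these assumptions, applying this to already-normalized text gives the effect of `align(4)` in the example above: the relative shape of the snippet is preserved and the whole box moves right by four columns.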


Assuming we've factored this down to the appropriate primitives, the remaining decision to be made here is: should the language try to auto-align multi-line strings, or is it better to ask users to explicitly use a library method (`string.align()`)?

Arguments in favor of the library approach:
 - Many embedded languages don't care about indentation anyway (HTML, SQL, JSON);
 - The string mangling algorithm is somewhat complicated (though less than it used to be) and subjective, both strikes against pushing it into the language;
 - If auto-alignment doesn't do what the user wants, it may be hard to get back to what the user does want.

Arguments in favor of the language approach:
 - Most usages of this feature will want alignment anyway, and having to explicitly ask for it feels like noise;
 - Failure to normalize leading whitespace will mean that the indentation of output will be perturbed by ordinary code refactoring (which might lead to instabilities in tests);
 - It is easy to explicitly specify additional horizontal or vertical indentation if desired; normalizing the rest of the time makes results more predictable.

Did I miss any?


(Note that arguments about constant pool efficiency or runtime efficiency are mostly red herrings; `align()` can be safely folded at compile time in the library approach, and a principled framework for such transformations is in the works.)
