There is a rule when you design a language: if you can do something in the compiler or in a library, do it in the library :)
I do not think it's a good idea to force the pipe prefix in the spec. From an IDE point of view, you have to do more analysis, but you can recognize the sequence ` ... `.trimMargin() in order to auto-indent things correctly.

regards,
Rémi

----- Original Message -----
> From: "Tagir Valeev" <amae...@gmail.com>
> To: "amber-spec-experts" <amber-spec-experts@openjdk.java.net>
> Sent: Saturday, January 27, 2018 09:23:31
> Subject: [raw-strings] Indentation problem

> Hello!
>
> Every language which implements multiline strings has problems
> with indentation. E.g. consider something like this:
>
> public class Multiline {
>   static String createHtml(String message) {
>     String html = `<html>
>   <head>
>     <title>Message</title>
>   </head>
>   <body>`;
>     if (message != null) {
>       html += `
>     <p>
>       Message: `+message+`
>     </p>`;
>     }
>     html += `
>   </body>
> </html>`;
>     return html;
>   }
> }
>
> Here the indentation of the embedded snippet breaks the indentation of the
> Java program, harming its readability. The overall structure of the
> method is mixed up with the generated HTML structure. This is not just bad
> indentation which could be fixed by the auto-formatting feature of an IDE.
> You cannot fix this without throwing away the multiline string syntax
> or changing the semantics. Some people sacrifice the
> semantics, namely the indentation of the generated output, if the output
> language is indentation agnostic. HTML is mostly so, unless you have a
> <pre> section. So one may "fix" it like this:
>
> public class Multiline {
>   static String createHtml(String message) {
>     String html = `<html>
>       <head>
>         <title>Message</title>
>       </head>
>       <body>`;
>     if (message != null) {
>       html += `
>         <p>
>           Message: `+message+`
>         </p>`;
>     }
>     html += `
>       </body>
>     </html>`;
>     return html;
>   }
> }
>
> Now we have broken formatting in the generated HTML, which ruins the
> idea of multiline strings (why bother to generate \n in the output HTML if
> it looks like a mess anyway?)
> Moreover, the structure of the Java program
> now affects the output. E.g. if you add several more nested "if" or
> "switch" statements, you will need to indent <p> even more.
>
> Many languages provide library methods to handle this. E.g.
> trimIndent() could be provided to remove leading spaces of every line,
> but this would kill the HTML indents entirely. Another possibility is to
> provide a method like trimMargin() in Kotlin [1], which trims all
> spaces before a special character (pipe by default), including the
> special character itself.
>
> Assuming such a method exists in Java, we can rewrite our method in a
> prettier way, preserving both Java and HTML formatting:
>
> public class Multiline {
>   static String createHtml(String message) {
>     String html = `<html>
>                   |  <head>
>                   |    <title>Message</title>
>                   |  </head>
>                   |  <body>`.trimMargin();
>     if (message != null) {
>       html += `
>               |    <p>
>               |      Message: `+message+`
>               |    </p>`.trimMargin();
>     }
>     html += `
>             |  </body>
>             |</html>`.trimMargin();
>     return html;
>   }
> }
>
> This is almost nice. Even without syntax highlighting you can easily
> distinguish between Java code and injected HTML code, you can indent
> Java and HTML independently, and the HTML code does not clash with the Java
> code structure. The only problem is the necessity to call the
> trimMargin() method. This means that the original string is preserved in the
> bytecode and at runtime, and the trimming is performed every time
> the method is called, causing a performance and memory handicap. This
> problem could be minimized by making trimMargin() a javac intrinsic.
> However, even in this case it would be hard to enforce usage of this
> method, and I expect that tons of hard-to-read Java code will appear in
> the wild, even though I believe that Java is about readability.
>
> So I propose to enforce such a (or a similar) format at the language level
> instead of adding a library method like "trimMargin()".
> The syntax could be formalized like this:
>
> - A raw string starts with a back-quote and ends with a back-quote, as written
>   in the draft before
> - When a line terminating sequence is encountered within a raw string,
>   the '\n' character is included in the string, and the literal is
>   interrupted
> - After the interruption any amount of whitespace or comment tokens
>   is allowed and ignored
> - The next meaningful token must be a pipe '|'. It's a compilation
>   error if any other token or EOF appears before '|' except comments or
>   whitespace
> - After '|' the raw-string literal continues and may either end with a
>   back-quote or be interrupted again by the subsequent line
>   terminating sequence.
>
> Note that you don't need to specially escape the pipes within the literals.
>
> I see some advantages with such a syntax:
>
> 1. You can comment (or comment out!) a part of a multiline string
> without terminating it:
>
> String sql = `SELECT * FROM table
>              // Negative entry ID = deleted entry
>              | WHERE entryID >= 0`;
>
> If you want, you can still make this comment a part of the query
> (assuming the DBMS accepts // comments):
>
> String sql = `SELECT * FROM table
>              | // Negative entry ID = deleted entry
>              | WHERE entryID >= 0`;
>
> Commenting out code:
>
> String html = `<div>
>            /* |  <span color='red'>
>               |    Error
>               |  </span>*/ // single-line comments would work as well
>               |  Something wrong happened
>               |</div>`;
>
> 2. Looking at a code fragment out of context (e.g. a diff log) you
> understand that you are inside a multiline literal. E.g. consider
> reviewing a diff like
>
>   | x++;
> + | if (x == 10) break;
>   | foo(x);
>
> Without pipes you could think that it's Java code without any further
> consideration. But now it's clear that it's part of a multiline string
> (probably JavaScript!), so this is not direct Java logic and you
> should check the broader context to understand what this literal is
> for.
>
> 3. You cannot accidentally make a big part of the program a part of a
> multiline raw string just by forgetting to close the back-quote. A
> compilation error will be issued right on the next line, like
> "Multiline string must continue with a pipe token", not some obscure
> message five screens below where the next raw string literal happens
> to start.
>
> 4. IDEs will easily distinguish between in-literal indentation and
> Java indentation and may allow you to adjust one or the other
> independently.
>
> In general this greatly increases readability, clearly telling you
> at every line that you're not in Java, but inside something nested.
> You can easily nest a Java snippet into a Java snippet and use multiline
> raw strings inside and still not get lost!
>
> String javaMethod = `public void dumpHtml() {
>                     |  System.out.println(``<!DOCTYPE html>
>                     |                     |<html>
>                     |                     |  <body>
>                     |                     |    <h1>HelloWorld!</h1>
>                     |                     |  </body>
>                     |                     |</html>``);
>                     |}`
>
> One pipe means one level inside, two pipes mean two levels inside.
>
> The only disadvantage I see in forcing a pipe prefix is the inability to
> just paste a big snippet from somewhere into the middle of a Java program
> in a plain text editor. However, any decent IDE would support automatic
> addition of pipes on paste. If not, a simple search-and-replace with a
> regex like s/^/ |/ through the pasted content will do the thing. Even
> adding pipes manually is not that hard (I did this manually many times
> writing this letter).
>
> What do you think?
>
> [1] https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.text/trim-margin.html
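
For reference, here is a minimal sketch of what a library-level trimMargin() could look like in plain Java. It is only loosely modeled on the Kotlin method [1] and is not an existing JDK API: the class name MarginTrimmer and the marginChar parameter are made up for illustration. Unlike the Kotlin version, which drops a blank first and last line, this sketch keeps every line, so a literal that starts with a line break still contributes its '\n' as in the examples above.

```java
// Hypothetical helper for illustration only; not part of the JDK.
final class MarginTrimmer {
    private MarginTrimmer() {}

    /**
     * For every line, if the first non-blank character is marginChar,
     * drops the leading spaces/tabs and the marginChar itself.
     * Lines without a leading margin character are kept unchanged.
     */
    static String trimMargin(String text, char marginChar) {
        // limit -1 keeps trailing empty lines, so no '\n' is lost
        String[] lines = text.split("\n", -1);
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < lines.length; i++) {
            String line = lines[i];
            int j = 0;
            // skip the indentation that belongs to the Java source
            while (j < line.length()
                    && (line.charAt(j) == ' ' || line.charAt(j) == '\t')) {
                j++;
            }
            if (j < line.length() && line.charAt(j) == marginChar) {
                line = line.substring(j + 1);
            }
            if (i > 0) {
                out.append('\n');
            }
            out.append(line);
        }
        return out.toString();
    }
}
```

Calling MarginTrimmer.trimMargin(html, '|') on the literals from the pipe-prefixed createHtml example would strip the Java-side indentation and the pipes while leaving the HTML indentation after each pipe intact. Pipes that appear later in a line are untouched, since only the first non-blank character is inspected.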