Focus
=====

Instead of ordering everything on the menu and immobilizing ourselves with 
excessive gluttony, let’s focus our attention on the appetizer. If we plan 
correctly, we'll have room for entrees and desserts later.

The appetizer here is simplifying the injection of "foreign" language code into 
Java source. Think tapas. We may well be sated by the time we’re done.


Goal
====

Repurposing the Java String as a "foreign" code literal seems to be the most 
natural and least intrusive contrivance for Java support. In fact, this is 
already the case. Example;

          // <html>
          //   <body style="width: 100vw">
          //       <p>Hello World.</p>
          //   </body>
          //   <script>console.log("\nloaded")</script>
          // </html>

          String html = "<html>\n" +
                        "  <body style=\"width: 100vw\">\n" +
                        "        <p>Hello World.</p>\n" +
                        "  </body>\n" +
                        "  <script>console.log(\"\\nloaded\")</script>\n" +
                        "</html>\n";

The primary reason we are having the string literal discussion is that the 
existing form has a few issues;

        • The existing form is difficult to maintain without support from IDEs 
and is prone to error. The introduction and subsequent editing of foreign code 
requires additional delimiters, newlines, concatenations and escape sequences 
(DNCE).

        • More to the point, the existing form is difficult to read. The 
additional DNCE obscure the underlying content of the string.

Our aim is to come up with a DNCE lexicon that improves foreign code literal 
readability and maintainability without leaving developers in a confused state; 
with emphasis on reducing the E (escape sequences.)


50% solution
============

Where we keep running into trouble is that a choice for one part of the lexicon 
spreads into the the other parts. That is, use of certain characters in the 
delimiter affect which characters require escaping and which characters can be 
used for escaping.

So, let's pick off the lexicon easy bits first. Newlines, concatenations and 
in-between delimiters can be implicit if we just allow strings to span multiple 
lines (see Rust.)

          String html = "<html>
                           <body style=\"width: 100vw\">
                                 <p>Hello World.</p>
                          </body>
                          <script>console.log(\"\\nloaded\")</script>
                         </html>";

That's not so bad. If we did nothing else, we still would be better off than we 
were before.


75% solution, almost
====================

What problems are left?

        • The foreign delimiters (quotes) have to be escaped.

        • The foreign escape sequences also have to be escaped.

        • And to a lesser degree, it's difficult to locate the closing 
delimiter.

Fortunately, we don't have many choices for dealing with escapes;

        • Backslash is Java's escape character.

        • Either escaping is on or is off (raw), so we need a way to flag a 
string as being escaped. We could have an option to turn escaping on/off within 
a string, but it has been hard to come up with examples where this might be 
required.

        • Even with escaping off, we still might have to escape delimiters. 
Repeated backslashes (or repeated delimiters) is the typical out.

How about trying \ as the flag for escapes off;

          String html = \"<html>
                            <body style="width: 100vw">
                                  <p>Hello World.</p>
                            </body>
                            <script>console.log("\nloaded")</script>
                          </html>";

That doesn't work because it looks like the string ends at the first quote. 
Let's try symmetry, either \" or "\ as the closing delimiter. "\ is preferable 
because then it doesn't look like an escape sequence (see Swift.)

          String html = \"<html>
                            <body style="width: 100vw">
                                  <p>Hello World.</p>
                            </body>
                            <script>console.log("\nloaded")</script>
                          </html>"\;

        • The only new string rule added is to allow multi-line strings.

        • Adding backslash before and after the string indicates escaping off.


But wait
========

This looks like the 75% solution;

        • Builds on our cred with existing strings.

        • Escape processing is orthogonal to multi-line.

        • Delimiter can easily be understood to mean “string with escapes."

But wait. "\nloaded" looks like it contains the end delimiter. Rats!!! Captain 
we need more sequences.

And, this is the crux of all the debate around strings. Fixed delimiters imply 
a requirement for escape sequences, otherwise there is content you cannot 
express as a string.

The inverse of this implication is that if you have escape sequences you don't 
need flexible delimiters. This can be reinterpreted as you only need flexible 
delimiters if you want to always avoid escape sequences.

Wasn't avoiding escape sequences the goal?

All this brings us to the central choice we have to make before we get into the 
rest of the meal. Do we go with fixed delimiter(s), structured delimiters or 
nonce delimiters.


Fixed delimiter
===============

If we go with a fixed delimiter then we limit the content that can be expressed 
without escape sequences. This is not totally left field. There are floating 
point values we can not express in Java and types we can express but not 
denote, such as anonymous class types, intersection types or capture types.

Everything is a degree of tradeoff. And, those tradeoffs are okay as long as we 
are explicit about it.

We could get closer to the 85% mark if we had a way to have " in our content 
without escaping. Let's introduce a secondary delimiter, """.

          String html = """<html>
                             <body style="width: 100vw">
                                   <p>Hello World.</p>
                             </body>
                             <script>console.log("\\nloaded")</script>
                           </html>""";

The introduction of """ would allow " with the only restriction that we can not 
use """ in the content without escaping. We could say that """ also means 
escaping off, but then we would have no way to escape """ (\"""). Keeping 
escaping as an orthogonal issue allows the best of both worlds.

          String html = \"""<html>
                              <body style="width: 100vw">
                                    <p>Hello World.</p>
                              </body>
                              <script>console.log("\nloaded")</script>
                            </html>"""\;

Once you take away conflicts with the delimiter, most strings do not require 
escaping.

Also at this point we should note that other combinations of quotes ('''. ```, 
"'") don't bring anything new to the table; Tomato/Tomato, Potato/Potato.

Summary: All strings can be expressed with fixed plus escaping, but can not 
express strings containing the fixed delimiter (""") with escaping off.

Jumping ahead: I think that stating that traditional " strings must be 
single-line will be a popular restriction, even if it not needed. Then they 
will think of """ as meaning multi-line.


Structured delimiter
====================

A structured delimiter contains a repeating pattern that can be expanded to 
suit a scenario. We attempted to introduce this notion with the original 
backtick proposal, but that proposal was withdrawn because a) didn't want to 
burn the backtick, b) developers weren't comfortable with infinitely repeating 
delimiters, and c) non-expressible anomalies such as content with leading or 
trailing backticks.

Using " instead of backtick addresses a).

          String html = """"""<html>
                                <body style="width: 100vw">
                                  <p>Hello World.</p>
                                </body>
                                <script>console.log("\nloaded")</script>
                              </html>"""""";

For b) is there a limit where developers would be comfortable? That is, what 
about a range of fixed delimiters; ", """, """", """"", """""". This is 
slightly different than fixed delimiters in that it increases the combinations 
of content containing delimiters. Example, """"" could allow ", """, """", ..., 
Nx" for N != 5.

Structured delimiters also differ from fixed delimiters in the fact that there 
is pressure to have escaping off when N >= 3. You can always fall back to a 
single ".

Summary: Can express all strings with and without escaping. If the delimiter 
length is limited the there there is still a (smaller) set of strings that can 
not be expressed.


Nonce delimiter
===============

A nonce or custom delimiter allows developers to include a unique character 
sequence in the delimiter. This provides a flexible delimiter without fear of 
going too far. There is also the advantage/distraction of providing commentary.

          String html = \HTML"<html>
                              <body style="width: 100vw">
                                    <p>Hello World.</p>
                              </body>
                              <script>console.log("\nloaded")</script>
                            </html>"HTML\;

Summary: Can express all strings with and without escaping, but nonce can 
affect readability.


Multi-line formatting
=====================

I left this out of the main discussion, but I think we can all agree that 
formatting rules should separate the delimiters from the content. Other details 
can be refined after choice of delimiter(s).

          String html = \"""
                          <html>
                            <body style="width: 100vw">
                              <p>Hello World.</p>
                            </body>
                            <script>console.log("\nloaded")</script>
                          </html>
                        """\;

          String html = """"""
                          <html>
                            <body style="width: 100vw">
                              <p>Hello World.</p>
                            </body>
                            <script>console.log("\nloaded")</script>
                          </html>
                        """""";

          String html = \HTML"
                          <html>
                            <body style="width: 100vw">
                              <p>Hello World.</p>
                            </body>
                            <script>console.log("\nloaded")</script>
                          </html>
                        "HTML/;


Entrees and desserts
====================

If we make good choices now (stay away from the oysters) we can still move on 
to other courses later.

For instance; if we got up from the table with the ", """, \", \""" set of 
delimiters, we could still introduce structured delimiters in the future; 
either with repeated \ (see Swift) or repeated ". We could also follow a 
suggestion John made to use a pseudo nonce like \5" for \\\\\" or \""""".

Point being, we can work with a 85% solution now that we can supplement later 
when we're not so hangry.






> On Feb 10, 2019, at 12:30 PM, James Laskey <james.las...@oracle.com> wrote:
> 
> I should know better than format e-mails. Many a backslash eaten. The summary 
> should be;
> 
>>> For instance; if we got up from the table with the ", """, \", \""" set of 
>>> delimiters, we could still introduce structured delimiters in the future; 
>>> either with repeated  \ (see Swift) or repeated ". We could also follow a 
>>> suggestion John made to use a pseudo nonce like \5" for \\\\\" or \""""".
>>> 
> 
> 
> 
> Sent from my iPhone
> 
> On Feb 10, 2019, at 11:43 AM, Jim Laskey <james.las...@oracle.com> wrote:
> 
>> 
>>> 
>>> 
>>> Focus
>>> 
>>> Instead of ordering everything on the menu and immobilizing ourselves with 
>>> excessive gluttony, let’s focus our attention on the appetizer. If we plan 
>>> correctly, we'll have room for entrees and desserts later.
>>> 
>>> The appetizer here is simplifying the injection of "foreign" language code 
>>> into Java source. Think tapas. We may well be sated by the time we’re done.
>>> 
>>> Goal
>>> 
>>> Repurposing the Java String as a "foreign" code literal seems to be the 
>>> most natural and least intrusive contrivance for Java support. In fact, 
>>> this is already the case. Example;
>>> 
>>>           // <html>
>>>           //   <body style="width: 100vw">
>>>           //       <p>Hello World.</p>
>>>           //   </body>
>>>           //   <script>console.log("\nloaded")</script>
>>>           // </html>
>>> 
>>>           String html = "<html>\n" +
>>>                         "  <body style=\"width: 100vw\">\n" +
>>>                         "        <p>Hello World.</p>\n" +
>>>                         "  </body>\n" +
>>>                         "  <script>console.log(\"\\nloaded\")</script>\n" +
>>>                         "</html>\n";
>>> 
>>> The primary reason we are having the string literal discussion is that the 
>>> existing form has a few issues;
>>> 
>>>     • The existing form is difficult to maintain without support from IDEs 
>>> and is prone to error. The introduction and subsequent editing of foreign 
>>> code requires additional delimiters, newlines, concatenations and escape 
>>> sequences (DNCE).
>>> 
>>>     • More to the point, the existing form is difficult to read. The 
>>> additional DNCE obscure the underlying content of the string.
>>> 
>>> Our aim is to come up with a DNCE lexicon that improves foreign code 
>>> literal readability and maintainability without leaving developers in a 
>>> confused state; with emphasis on reducing the E (escape sequences.)
>>> 
>>> 50% solution
>>> 
>>> Where we keep running into trouble is that a choice for one part of the 
>>> lexicon spreads into the the other parts. That is, use of certain 
>>> characters in the delimiter affect which characters require escaping and 
>>> which characters can be used for escaping.
>>> 
>>> So, let's pick off the lexicon easy bits first. Newlines, concatenations 
>>> and in-between delimiters can be implicit if we just allow strings to span 
>>> multiple lines (see Rust.)
>>> 
>>>           String html = "<html>
>>>                            <body style=\"width: 100vw\">
>>>                                  <p>Hello World.</p>
>>>                           </body>
>>>                           <script>console.log(\"\\nloaded\")</script>
>>>                          </html>";
>>> 
>>> That's not so bad. If we did nothing else, we still would be better off 
>>> than we were before.
>>> 
>>> 75% solution, almost
>>> 
>>> What problems are left?
>>> 
>>>     • The foreign delimiters (quotes) have to be escaped.
>>> 
>>>     • The foreign escape sequences also have to be escaped.
>>> 
>>>     • And to a lesser degree, it's difficult to locate the closing 
>>> delimiter.
>>> 
>>> Fortunately, we don't have many choices for dealing with escapes;
>>> 
>>>     • Backslash is Java's escape character.
>>> 
>>>     • Either escaping is on or is off (raw), so we need a way to flag a 
>>> string as being escaped. We could have an option to turn escaping on/off 
>>> within a string, but it has been hard to come up with examples where this 
>>> might be required.
>>> 
>>>     • Even with escaping off, we still might have to escape delimiters. 
>>> Repeated backslashes (or repeated delimiters) is the typical out.
>>> 
>>> How about trying  as the flag for escapes off;
>>> 
>>>           String html = \"<html>
>>>                             <body style="width: 100vw">
>>>                                   <p>Hello World.</p>
>>>                             </body>
>>>                             <script>console.log("\nloaded")</script>
>>>                           </html>";
>>> 
>>> That doesn't work because it looks like the string ends at the first quote. 
>>> Let's try symmetry, either " or " as the closing delimiter. " is preferable 
>>> because then it doesn't look like an escape sequence (see Swift.)
>>> 
>>>           String html = \"<html>
>>>                             <body style="width: 100vw">
>>>                                   <p>Hello World.</p>
>>>                             </body>
>>>                             <script>console.log("\nloaded")</script>
>>>                           </html>"\;
>>> 
>>>     • The only new string rule added is to allow multi-line strings.
>>> 
>>>     • Adding backslash before and after the string indicates escaping off.
>>> 
>>> But wait
>>> 
>>> This looks like the 75% solution;
>>> 
>>>     • Builds on our cred with existing strings.
>>> 
>>>     • Escape processing is orthogonal to multi-line.
>>> 
>>>     • Delimiter can easily be understood to mean “string with escapes."
>>> 
>>> But wait. "" looks like it contains the end delimiter. Rats!!! Captain we 
>>> need more sequences.
>>> 
>>> And, this is the crux of all the debate around strings. Fixed delimiters 
>>> imply a requirement for escape sequences, otherwise there is content you 
>>> cannot express as a string.
>>> 
>>> The inverse of this implication is that if you have escape sequences you 
>>> don't need flexible delimiters. This can be reinterpreted as you only need 
>>> flexible delimiters if you want to always avoid escape sequences.
>>> 
>>> Wasn't avoiding escape sequences the goal?
>>> 
>>> All this brings us to the central choice we have to make before we get into 
>>> the rest of the meal. Do we go with fixed delimiter(s), structured 
>>> delimiters or nonce delimiters.
>>> 
>>> Fixed delimiter
>>> 
>>> If we go with a fixed delimiter then we limit the content that can be 
>>> expressed without escape sequences. This is not totally left field. There 
>>> are floating point values we can not express in Java and types we can 
>>> express but not denote, such as anonymous class types, intersection types 
>>> or capture types.
>>> 
>>> Everything is a degree of tradeoff. And, those tradeoffs are okay as long 
>>> as we are explicit about it.
>>> 
>>> We could get closer to the 85% mark if we had a way to have " in our 
>>> content without escaping. Let's introduce a secondary delimiter, """.
>>> 
>>>           String html = """<html>
>>>                              <body style="width: 100vw">
>>>                                    <p>Hello World.</p>
>>>                              </body>
>>>                              <script>console.log("\\nloaded")</script>
>>>                            </html>""";
>>> 
>>> The introduction of """ would allow " with the only restriction that we can 
>>> not use """ in the content without escaping. We could say that """ also 
>>> means escaping off, but then we would have no way to escape """ (\"""). 
>>> Keeping escaping as an orthogonal issue allows the best of both worlds.
>>> 
>>>           String html = \"""<html>
>>>                               <body style="width: 100vw">
>>>                                     <p>Hello World.</p>
>>>                               </body>
>>>                               <script>console.log("\nloaded")</script>
>>>                             </html>"""\;
>>> 
>>> Once you take away conflicts with the delimiter, most strings do not 
>>> require escaping.
>>> 
>>> Also at this point we should note that other combinations of quotes ('''. 
>>> ```, "'") don't bring anything new to the table; Tomato/Tomato, 
>>> Potato/Potato.
>>> 
>>> Summary: All strings can be expressed with fixed plus escaping, but can not 
>>> express strings containing the fixed delimiter (""") with escaping off.
>>> 
>>> Jumping ahead: I think that stating that traditional " strings must be 
>>> single-line will be a popular restriction, even if it not needed. Then they 
>>> will think of """ as meaning multi-line.
>>> 
>>> Structured delimiter
>>> 
>>> A structured delimiter contains a repeating pattern that can be expanded to 
>>> suit a scenario. We attempted to introduce this notion with the original 
>>> backtick proposal, but that proposal was withdrawn because a) didn't want 
>>> to burn the backtick, b) developers weren't comfortable with infinitely 
>>> repeating delimiters, and c) non-expressible anomalies such as content with 
>>> leading or trailing backticks.
>>> 
>>> Using " instead of backtick addresses a).
>>> 
>>>           String html = """"""<html>
>>>                                 <body style="width: 100vw">
>>>                                   <p>Hello World.</p>
>>>                                 </body>
>>>                                 <script>console.log("\nloaded")</script>
>>>                               </html>"""""";
>>> 
>>> For b) is there a limit where developers would be comfortable? That is, 
>>> what about a range of fixed delimiters; ", """, """", """"", """""". This 
>>> is slightly different than fixed delimiters in that it increases the 
>>> combinations of content containing delimiters. Example, """"" could allow 
>>> ", """, """", ..., Nx" for N != 5.
>>> 
>>> Structured delimiters also differ from fixed delimiters in the fact that 
>>> there is pressure to have escaping off when N >= 3. You can always fall 
>>> back to a single ".
>>> 
>>> Summary: Can express all strings with and without escaping. If the 
>>> delimiter length is limited the there there is still a (smaller) set of 
>>> strings that can not be expressed.
>>> 
>>> Nonce delimiter
>>> 
>>> A nonce or custom delimiter allows developers to include a unique character 
>>> sequence in the delimiter. This provides a flexible delimiter without fear 
>>> of going too far. There is also the advantage/distraction of providing 
>>> commentary.
>>> 
>>>           String html = \HTML"<html>
>>>                               <body style="width: 100vw">
>>>                                     <p>Hello World.</p>
>>>                               </body>
>>>                               <script>console.log("\nloaded")</script>
>>>                             </html>"HTML\;
>>> 
>>> Summary: Can express all strings with and without escaping, but nonce can 
>>> affect readability.
>>> 
>>> Multi-line formatting
>>> 
>>> I left this out of the main discussion, but I think we can all agree that 
>>> formatting rules should separate the delimiters from the content. Other 
>>> details can be refined after choice of delimiter(s).
>>> 
>>>           String html = \"""
>>>                           <html>
>>>                             <body style="width: 100vw">
>>>                               <p>Hello World.</p>
>>>                             </body>
>>>                             <script>console.log("\nloaded")</script>
>>>                           </html>
>>>                         """\;
>>> 
>>>           String html = """"""
>>>                           <html>
>>>                             <body style="width: 100vw">
>>>                               <p>Hello World.</p>
>>>                             </body>
>>>                             <script>console.log("\nloaded")</script>
>>>                           </html>
>>>                         """""";
>>> 
>>>           String html = \HTML"
>>>                           <html>
>>>                             <body style="width: 100vw">
>>>                               <p>Hello World.</p>
>>>                             </body>
>>>                             <script>console.log("\nloaded")</script>
>>>                           </html>
>>>                         "HTML/;
>>> 
>>> Entrees and desserts
>>> 
>>> If we make good choices now (stay away from the oysters) we can still move 
>>> on to other courses later.
>>> 
>>> For instance; if we got up from the table with the ", """, ", """ set of 
>>> delimiters, we could still introduce structured delimiters in the future; 
>>> either with repeated  (see Swift) or repeated ". We could also follow a 
>>> suggestion John made to use a pseudo nonce like " for \\" or """"".
>>> 
>>> Point being, we can work with a 85% solution now that we can supplement 
>>> later when we're not so hangry.
>>> 

Reply via email to