On 4/10/2019 8:22 AM, Jim Laskey wrote:
Line terminators:  When strings span lines, they do so using the line
terminators present in the source file, which may vary depending on the
operating system on which the file was authored.  Should this be an aspect of
multi-line-ness, or should we normalize these to a standard line
terminator?  It seems a little weird to treat string literals quite so
literally; the choice of line terminator is surely an incidental one.  I
think we're all comfortable saying "these should be normalized", but it's
worth bringing this up because it is merely one way in which incidental
artifacts of how the string is embedded in the source program force us
to interpret what the user meant.
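
For concreteness, here is a minimal sketch of what "normalized" could mean, assuming the standard terminator is LF. The class and method names are hypothetical, and the strings use explicit escapes to stand in for what a compiler would see in source files authored on different operating systems:

```java
public final class LineTerminators {
    private LineTerminators() {}

    // Hypothetical helper: rewrites CRLF and lone CR to LF.
    // CRLF is handled first so it becomes one LF rather than two.
    static String normalize(String s) {
        return s.replace("\r\n", "\n").replace('\r', '\n');
    }

    public static void main(String[] args) {
        // Same logical content, different incidental line terminators.
        String authoredWithCrLf = "SELECT *\r\nFROM my_table";
        String authoredWithLf = "SELECT *\nFROM my_table";

        // After normalization the two strings are indistinguishable.
        System.out.println(normalize(authoredWithCrLf).equals(authoredWithLf));
    }
}
```

With normalization, the string a program observes no longer depends on where the source file happened to be edited.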

No one has commented on this, but it's important because some libraries are going to be surprised by the presence of line terminators, of any kind, in strings denoted by multi-line string literals.

To be clear, I agree with normalizing line terminators. And, I understand that any string could have contained line terminators thanks to escape sequences in traditional string literals. But, it was not common to see a \n except where multi-line-ness was expected or harmless. Going forward, who can guarantee that refactoring the argument of `prepareStatement` from a sequence of concatenations:

  try (PreparedStatement s = connection.prepareStatement(
      "SELECT * "
    + "FROM my_table "
    + "WHERE a = b "
  )) {
      ...
  }

to a multi-line string literal:

  try (PreparedStatement s = connection.prepareStatement(
      """SELECT *
         FROM my_table
         WHERE a = b"""
  )) {
      ...
  }

is behaviorally compatible for `prepareStatement`? It had no reason to expect \n in its string argument before.
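
To make the difference concrete, here is a rough sketch comparing the two strings. It assumes the multi-line literal yields LF terminators with the incidental indentation removed, and it spells that string with explicit \n escapes so the sketch does not depend on any particular multi-line syntax. `collapseLineTerminators` is a hypothetical helper, not something any library provides, and it is not safe for SQL in general (for example, a query containing a quoted value with an embedded line break, or a `--` comment):

```java
public final class SqlLineBreaks {
    private SqlLineBreaks() {}

    // The string produced by the sequence of concatenations.
    static final String CONCATENATED =
          "SELECT * "
        + "FROM my_table "
        + "WHERE a = b ";

    // Assumed content of the multi-line literal: LF between lines,
    // indentation stripped. Written with escapes for portability.
    static final String MULTI_LINE =
        "SELECT *\nFROM my_table\nWHERE a = b";

    // Hypothetical helper: replaces each line terminator, plus any
    // whitespace around it, with a single space, then trims the ends.
    static String collapseLineTerminators(String sql) {
        return sql.replaceAll("\\s*\\R\\s*", " ").trim();
    }

    public static void main(String[] args) {
        // The refactoring changes what the callee receives.
        System.out.println(CONCATENATED.contains("\n"));
        System.out.println(MULTI_LINE.contains("\n"));

        // Only after collapsing do the two queries compare equal.
        System.out.println(
            collapseLineTerminators(MULTI_LINE).equals(CONCATENATED.trim()));
    }
}
```

A caller who wants the old behavior has to do something like this explicitly; the callee cannot tell which form the source used.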

(Hat tip: https://blog.jooq.org/2015/12/29/please-java-do-finally-support-multiline-strings/)

Maybe `prepareStatement` will work fine. But someone somewhere is going to take a program with a sequence of 2000 concatenations and turn them into a huge multi-line string literal, and the inserted line terminators are going to cause memory pressure, and GC is going to take a little longer, and eventually this bug will be filed: "My system runs 5% slower because the source code changed a teeny tiny bit."

In reality, a few libraries will need fixing, and that will happen quickly because developers are very keen to use multi-line string literals. But it's fair to point out that while everyone is worrying about whitespace on the left of the literal, the line terminators to the right are a novel artifact too.

Alex
