Thanks, Maurizio. I find your arguments helpful and persuasive. They
indicate that “autoboxing” is the wrong model, since it would lift
ad hoc strings into places that want only STs.
The poly-expression move, applied only to string literals, is not so
bad, since the only ad hoc strings liftable to STs are those right next
to the API points that demand STs.
But, if we are going to make ST-demanding APIs the lock and STs the
keys, it might be reasonable to demand that all STs look distinctive
(with that extra sigil), which is an argument even against the
poly-expression move.
Guy’s disruptive suggestion, of having both kinds of interpolation
expressions, would play out as two tiers of vetting and security. The
lower tier is inhabited by strings. You have to drive carefully on
those streets, where dodgy APIs accept all kinds of strings, and there
are no $ sigils to indicate vetted inputs. The higher tier would be API
points that demand STs (and do not welcome plain strings). To get into
that safer tier, you pay a cover charge, the $ sigils (or API points
which manufacture STs explicitly).
It might seem wrong to ask a cover charge for a tier we want users to
prefer, but the IDE will surely help pay it as needed. (The $ is
visible in the code, as a reminder that the security is enabled. Like the
wrist band you get when you pay the cover?)
On the other hand, if we try to make everything be one tier (everything
potentially vettable, but with loopholes for raw strings), the security
guarantees get muddier. If everything is equally secure, and there are
loopholes (for string concat and the like) then everything is also
equally insecure, in some hand-wavy sense.
More hand-waving: Having distinct tiers is a more honest design, allowing for
better invariants within the higher tier, and relaxed behavior in the
lower tier. Also, maybe, having the distinct tiers be visibly connected
by syntax encourages folks muddling around with string-concat to lift
their code to work on STs instead of strings. Switch the APIs and add
the dollar signs.
OK, I’ll stop now. I’m past the point where I need to try the API
on some serious project, before I speculate more.
On 13 Mar 2024, at 16:47, Maurizio Cimadamore wrote:
There is a problem/slippery slope with overloads, which I think should
be discussed (and that discussion seems, at least to me, more
important than the discussion on how we spell string literals).
Consider the case of a /new/ API, that perhaps wants to build SQL
queries (or any other kind of injection-sensitive factory):
|Query makeQuery(???) |
What should be the natural parameter type for this query? Well, we
know that String is flawed here. Easy to reach for, but also too easy
to abuse. StringTemplate is a much better type because it allows
user-injectable values and constant parts to be carried in separate parts
of the string template, so that the library has a chance at looking at
what’s going on.
Ok, so let’s say we write the factory as:
|Query makeQuery(StringTemplate) |
As that is clearly the safer option. This obviously works well /as
long as clients are passing templates with arguments/.
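
To make the StringTemplate-only factory concrete, here is a minimal sketch. It does not use the real StringTemplate API: `Template` and `Query` are hypothetical stand-ins, assumed only to keep constant fragments and injected values in separate lists so that the library can turn each value into a bound parameter.

```java
import java.util.List;

final class QuerySketch {
    // Hypothetical stand-in for StringTemplate (an assumption, not the real
    // java.lang type): constant fragments and injected values are kept apart,
    // with exactly one more fragment than values.
    record Template(List<String> fragments, List<Object> values) {
        Template {
            if (fragments.size() != values.size() + 1) {
                throw new IllegalArgumentException("need one more fragment than values");
            }
            // Defensive copies; List.copyOf rejects nulls, which this sketch does not support.
            fragments = List.copyOf(fragments);
            values = List.copyOf(values);
        }
    }

    // Hypothetical query type: SQL text with '?' placeholders plus bound parameters.
    record Query(String sql, List<Object> params) {}

    // Only the constant fragments reach the SQL text; every injected value
    // becomes a bound parameter instead of being pasted into the string.
    static Query makeQuery(Template t) {
        return new Query(String.join("?", t.fragments()), t.values());
    }

    public static void main(String[] args) {
        String userInput = "42 OR 1=1";
        Query q = makeQuery(new Template(
                List.of("SELECT foo FROM bar WHERE foo = ", ""),
                List.of(userInput)));
        System.out.println(q.sql());    // SELECT foo FROM bar WHERE foo = ?
        System.out.println(q.params()); // [42 OR 1=1]
    }
}
```

With this shape the hostile input never becomes part of the SQL text, which is the property a String parameter cannot offer.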
No-argument templates might be a corner case, but, sooner or later
somebody might want to do this:
|makeQuery("SELECT foo FROM bar WHERE foo = 42"); |
Only to discover that this doesn’t compile. What then? There are a
couple of alternatives I can think of. The first is to add a
String-accepting overload:
|Query makeQuery(StringTemplate)
Query makeQuery(String) |
The second is to use some use-site factory call to turn the string
into a degenerate string template:
|makeQuery(StringTemplate.fromString("SELECT foo FROM bar WHERE foo =
42")); |
IMHO, both approaches have problems: they force the user to go from
the safer StringTemplate world, to the more unsafe String world.
It’s sort of like crossing the Rubicon: once you’re in
String-land, it then becomes easier to introduce potentially very
costly mistakes. If we have overloads:
|makeQuery("SELECT " + foo + " FROM " + bar + " WHERE " + condition);
|
This would now compile just fine. Effectively, safety-wise we’d be
back at square one. The factory case is only marginally better -
because using the factory is more convoluted, so it would perhaps be
easier to spot that something fishy is going on. That said, as the
expression gets more complicated, it becomes easier for bugs to sneak in:
|makeQuery(StringTemplate.fromString("SELECT " + foo + "FROM bar WHERE
foo = 42")); |
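
The two escape hatches above can be compared side by side in a small sketch. Again, `Template`, `Query` and this `fromString` are hypothetical stand-ins rather than the real API; the only assumption is that a template built from a plain string has a single fragment and no values.

```java
import java.util.List;

final class LoopholeSketch {
    // Hypothetical stand-in for StringTemplate (not the real java.lang type).
    record Template(List<String> fragments, List<Object> values) {
        // Hypothetical analogue of the fromString factory: one fragment, no values.
        static Template fromString(String s) {
            return new Template(List.of(s), List.of());
        }
    }

    record Query(String sql, List<Object> params) {}

    static Query makeQuery(Template t) {
        return new Query(String.join("?", t.fragments()), t.values());
    }

    // The String overload: convenient, but it accepts any concatenation.
    static Query makeQuery(String raw) {
        return makeQuery(Template.fromString(raw));
    }

    public static void main(String[] args) {
        String condition = "42 OR 1=1"; // imagine this came from a user

        // Both of these compile, and both paste the input into the SQL text.
        Query viaOverload = makeQuery("SELECT foo FROM bar WHERE foo = " + condition);
        Query viaFactory = makeQuery(
                Template.fromString("SELECT foo FROM bar WHERE foo = " + condition));

        // A real template keeps the input out of the SQL text.
        Query viaTemplate = makeQuery(new Template(
                List.of("SELECT foo FROM bar WHERE foo = ", ""), List.of(condition)));

        System.out.println(viaOverload.sql());
        System.out.println(viaFactory.sql());
        System.out.println(viaTemplate.sql() + " " + viaTemplate.params());
    }
}
```

In the first two calls the library sees a single opaque fragment and an empty value list, so it has no way to tell which part of the text was meant to be constant.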
So, at least in my opinion, having a string template literal, or some
kind of compiler-controlled promotion from string /constants/ to
string templates, is not just something we need to type fewer
characters (I honestly couldn’t care less about that, at least not
at this stage). These things are needed to allow developers to remain
in StringTemplate-land.
That is, the best /overall/ outcome is for the library /not/ to have
an overload, /and/ for the client to either say this:
|makeQuery("SELECT foo FROM bar WHERE foo = 42"); // works because of
implicit promotion of constant String -> StringTemplate |
or this:
|makeQuery(<insert your favourite "I'M A TEMPLATE" char here>"SELECT
foo FROM bar WHERE foo = 42"); // works because it's a string template
all along |
Maurizio
On 13/03/2024 22:37, John Rose wrote:
On 13 Mar 2024, at 15:22, John Rose wrote:
… OVERLOADS …
I don’t see (maybe I missed it) a decisive objection to overloading
across ST and String, at least for some processing APIs.
Perhaps it is this: A language processor API that takes STs and
never Strings is making it clear that all inputs should be properly
vetted, nothing taken on trust as a bare string.
Doing that MIGHT require a performance model which permits expensive
vetting operations to be memoized on particular OCCURRENCES of inputs
(not just the input strings viewed in and of themselves).
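
One way to read "memoized on occurrences" is a cache keyed by identity rather than by string content. The sketch below is only an illustration under a loud assumption: that every evaluation of one template occurrence shares the same fragments list object. `Template`, `lookup`, `expensiveVet` and `vetCount` are all hypothetical names.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

final class VettingCacheSketch {
    // Hypothetical stand-in for StringTemplate; no defensive copy, so the
    // identity of the fragments list is preserved.
    record Template(List<String> fragments, List<Object> values) {}

    // Keyed by the identity of the fragments list, not by its contents.
    // Not thread-safe; a real design would need more care.
    private static final Map<List<String>, String> VETTED = new IdentityHashMap<>();
    static int vetCount = 0; // counts how often the expensive path runs

    // Placeholder for an expensive check: reject suspicious constant text.
    private static String expensiveVet(List<String> fragments) {
        for (String f : fragments) {
            if (f.contains(";") || f.contains("'")) {
                throw new IllegalArgumentException("suspicious fragment: " + f);
            }
        }
        return String.join("?", fragments);
    }

    static String vettedSql(Template t) {
        return VETTED.computeIfAbsent(t.fragments(), f -> {
            vetCount++;
            return expensiveVet(f);
        });
    }

    // Assumption: one occurrence in source code maps to one shared fragments list.
    private static final List<String> SITE_FRAGMENTS =
            List.of("SELECT foo FROM bar WHERE foo = ", "");

    static Template lookup(Object value) {
        return new Template(SITE_FRAGMENTS, List.of(value));
    }

    public static void main(String[] args) {
        vettedSql(lookup(1));
        vettedSql(lookup(2)); // same occurrence: served from the cache
        // Equal text but a different list object: vetted again.
        vettedSql(new Template(new ArrayList<>(SITE_FRAGMENTS), List.of(3)));
        System.out.println(vetCount); // 2
    }
}
```

A bare String carries no such occurrence identity, which is one reason an API might insist on STs if it wants this kind of caching.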
If that’s true, then I guess that’s support for Guy’s proposal: That
STs (even trivial ones) should never look identical to strings.
Maybe they should always be preceded by a sigil $, or (per my
suggestion) they should always have at least one occurrence of {
inside, even if it’s a trivial nop.
I kind of like Guy’s offensive-to-everyone suggestion that $ is
required to make a true ST. Then it’s clear how the vetting APIs
mate up with their vetted inputs. And if $ is not placed in front,
we surrender to the string-pasters, but at least the resulting
true-string expressions won’t be accepted by the vetting APIs.