This is a more formal write-up of the discussion started with José here 
<https://elixirforum.com/t/internal-guards/16723>. Interested in all 
feedback, potential use-cases, and syntactic edge-cases!


Proposal

Allow `when conditions...` to be placed anywhere within the patterns used 
in functions, cases, and other such match head constructs, in addition to 
the suffix-only version we support today.

I believe this can be done by the compiler today as a backwards-compatible 
enhancement with no change to the parser.

Synopsis

I'm proposing we allow guard clauses anywhere within match patterns, as 
well as following them, so that the AST for match heads containing guards 
can be composed easily.

To show rather than tell, assuming you have AST for the following:

X = x when x > 0
Y = y when y > 1

Both *X* and *Y* are valid, stand-alone match heads. However, neither is a 
valid match pattern, so there is no easy way to combine them to form a new 
2-arity match head. The naive approach would be:

XY = (x when x > 0, y when y > 1)

While this is not currently a valid match head, one can hoist any internal 
guards into a valid form like:

XY2 = (x, y) when x > 0 and y > 1

This proposal explores allowing *XY* as valid Elixir code and rewriting it 
to *XY2* at compile time.


Terminology

To talk about this precisely I'm going to appropriate some ETS terminology 
to reference Elixir syntax constructs I don't have canonical names for. Let 
me know if you are aware of the correct terms for these!

- a *match specification* is defined as a *match head -> body* pair
- a *match head* is defined as a *match pattern* optionally followed by *guard 
clauses*
- a *match pattern* is defined as a comma delimited series of expressions 
allowed in match contexts
- a *guard clause* is defined as a *when conditions* expression composed 
of functions allowed in guard contexts

Technically, what I'm proposing is to loosen the restriction that guard 
clauses must follow match patterns, by allowing match heads themselves to 
recursively be valid expressions in parameter lists. The compiler can 
extract all guard clauses found within a match head, leaving only a valid 
match pattern, and combine the extracted guards with any existing ones to 
create a new valid match head with equivalent semantics.


Parsing

Currently, both of these (and all other variations I can think of) are 
already valid syntax to the parser:

fn x when x > 0, y when y > 1 -> #... end
def name(x when x > 0, y when y > 1), do: #...

Only the compiler keeps you from running a program with these internal 
guards; it can be parsed into AST with the exact precedence we want without 
a hitch.
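
As a quick check of the parenthesised form, here is a minimal sketch that only runs the parser via `Code.string_to_quoted/1`. The function name `name` and the variables are just the ones from the example above; nothing is compiled or evaluated.

```elixir
# Parse a call whose arguments carry their own guards.
{:ok, ast} = Code.string_to_quoted("name(x when x > 0, y when y > 1)")

# Each argument comes back as its own node of the shape
# {:when, meta, [pattern, guard]}.
{:name, _meta, [first, second]} = ast
{:when, _, [{:x, _, _}, {:>, _, [{:x, _, _}, 0]}]} = first
{:when, _, [{:y, _, _}, {:>, _, [{:y, _, _}, 1]}]} = second

# Render the first argument back to source text for inspection.
IO.puts(Macro.to_string(first))
```

The pattern matches succeed only if the parser attaches each guard to its own parameter, which is the precedence the rewrite relies on.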


Grammar

The only ambiguity I can think of would be the following:

fn x when x > 0, y when y > 1 when y < 3 -> #... end

It is not clear whether the intent was to create an 'or' multi-guard 
around the entire parameter list, or to guard both the last parameter and 
the parameters at large with two separate clauses. The parentheses 
conventionally used in *def*s resolve the ambiguity. I am open to 
ideas on how to handle this situation, though personally I envision a 
compile-time warning and treating it as a multi-guard, as this is most 
consistent with the precedence of *when*. I don't see it coming up too 
often in generated code.

It is worth observing that while the following technically has the same 
ambiguity:

fn x when x > 0, y when y > 1 -> #... end

however you decide to treat the guard after the second parameter, the 
resulting guards post-rewrite will be semantically equivalent.


Rewriting

Since everything up to this point is already valid, I suspect the rewrite 
could be done in a single place in the compiler 
<https://github.com/elixir-lang/elixir/blob/4d2f4865fe44ac05476a6908b43c9783e5ae8064/lib/elixir/src/elixir_utils.erl#L42-L54>
with no further changes to any other code.

The algorithm I have in mind is to simply walk the AST outside-in, removing 
each set of consecutive guards it finds, expanding every combination of 
each set's multi-guards into new trailing clauses, then *and*ing all 
terms together in each new guard clause. In the most common case that 
simply entails prefacing the trailing guards with any interior guards, all 
*and*ed together.

This approach is intentionally naive about what variables are referenced in 
which guards where. Anything that produces a valid guard in the end can 
fly, even if generated code produces them in odd or unexpected places 
within the params list.

Simple extraction:

def name(%{foo: bar when is_integer(bar), fizz: buzz when is_integer(buzz)})
  when bar + buzz > 100

def name(%{foo: bar, fizz: buzz})
  when (is_integer(bar) and is_integer(buzz)) and (bar + buzz > 100)
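
The simple case can be sketched in a few lines of Elixir. This is only an illustration under stated assumptions, not the compiler change itself: `GuardHoist` and `hoist/2` are hypothetical names, the sketch works on quoted argument lists rather than inside the compiler, and it assumes no interior multi-guards (`when ... when ...`).

```elixir
defmodule GuardHoist do
  # Hypothetical helper for illustration; not part of Elixir or its compiler.
  # Assumes every interior guard is a single guard, not a multi-guard.

  # Returns {args_without_guards, combined_guard}. The combined guard is nil
  # when no guard was found anywhere.
  def hoist(args, trailing \\ nil) do
    {clean_args, found} =
      Macro.prewalk(args, [], fn
        # Replace `pattern when guard` with `pattern` and remember the guard.
        {:when, _meta, [pattern, guard]}, acc -> {pattern, [guard | acc]}
        node, acc -> {node, acc}
      end)

    # Interior guards in source order, followed by the trailing guard if any.
    guards = Enum.reverse(found) ++ List.wrap(trailing)
    {clean_args, and_all(guards)}
  end

  defp and_all([]), do: nil

  defp and_all([first | rest]) do
    Enum.reduce(rest, first, fn guard, acc ->
      quote(do: unquote(acc) and unquote(guard))
    end)
  end
end

# Usage: hoist the guards out of `name(x when x > 0, y when y > 1)`.
{:name, _, args} = quote(do: name(x when x > 0, y when y > 1))
{clean_args, guard} = GuardHoist.hoist(args)

rewritten = quote(do: name(unquote_splicing(clean_args)) when unquote(guard))
IO.puts(Macro.to_string(rewritten))
```

Like the proposal, the sketch is deliberately naive about which variables each guard references; it only collects guards in the order it meets them.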


In order to handle multi-guards correctly we'd have to get a little *n^m*, but 
I doubt even the most ambitious metaprogramming would ever need to generate 
code like that.

Matrix of multi-guards with oddly referenced variables:

def name(x, y when y > z when z > x, z)
  when is_integer(z)
  when is_binary(z)

def name(x, y, z)
  when y > z and is_integer(z)
  when y > z and is_binary(z)
  when z > x and is_integer(z)
  when z > x and is_binary(z)
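
The expansion above is a Cartesian product over the guard sets. A minimal sketch follows, assuming the hypothetical names `GuardMatrix`, `alternatives/1` and `expand/1`, and assuming a `when` chain nests to the right in the AST, as multi-guards do today. It uses `is_binary/1` as the string check, since that is the guard Elixir provides for strings.

```elixir
defmodule GuardMatrix do
  # Hypothetical helper for illustration; not part of Elixir or its compiler.

  # Flattens `a when b when c` (nested to the right in the AST) into [a, b, c].
  def alternatives({:when, _meta, [left, right]}), do: [left | alternatives(right)]
  def alternatives(guard), do: [guard]

  # Takes a list of guard sets, each a list of alternatives, and returns one
  # `and`-ed guard per combination of alternatives.
  def expand([]), do: []

  def expand(guard_sets) do
    guard_sets
    |> Enum.reduce([[]], fn set, combos ->
      # Extend every combination built so far with each alternative in this set.
      for combo <- combos, alt <- set, do: [alt | combo]
    end)
    |> Enum.map(fn combo ->
      combo
      |> Enum.reverse()
      |> Enum.reduce(fn guard, acc -> quote(do: unquote(acc) and unquote(guard)) end)
    end)
  end
end

# Usage: the interior multi-guard on `y` and the trailing multi-guard.
interior = GuardMatrix.alternatives(Code.string_to_quoted!("y > z when z > x"))
trailing = GuardMatrix.alternatives(Code.string_to_quoted!("is_integer(z) when is_binary(z)"))

for guard <- GuardMatrix.expand([interior, trailing]) do
  IO.puts("when " <> Macro.to_string(guard))
end
```

With two sets of two alternatives each this yields four guards, and the count grows multiplicatively with every additional multi-guard, which is the blow-up mentioned above.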


Thoughts

The purpose of this is to make function heads more composable in 
meta-programming. Initially it would be released with zero fanfare except 
perhaps a footnote in the guard docs. However, I can see this ability 
catching on with people coming from strongly-specified typed languages, 
allowing them to qualify type expectations in parameter lists inline with 
where variables are defined.

I don't think that would be too problematic since I find the style to be 
pretty readable and more importantly the code still easy to reason about. 
However, this is definitely enabling an alternate way to write pretty basic 
syntax, so if we were really against human beings rather than macros 
employing this, the rewrite process could emit warnings from code not 
marked as generated.

I have no idea how the formatter should handle this syntax. Perhaps its 
behaviour already suffices, since it's pretty good at deconstructing long 
function heads?

I'm interested if there are strong opinions about this one way or the other 
on the mailing list, as well as if there are any implications I've 
overlooked in my suggested implementation.

Thanks for reading!
Chris K
