[elixir-core:11229] Re: Formatting of module attributes

Christopher Keele Wed, 28 Dec 2022 14:49:53 -0800

This feels like a feature request to me: I understand why sigils are 
generally not touched by the formatter without plugins, but I feel like the 
sigil_w included in the standard library should have smarter formatting by 
default.

*NOTE: I'm using Elixir 1.14.2 here to observe the behaviour of the 
formatter; this may be out of date with the main branch.*

*General formatting of sigils*

Conceptually, to the compiler, the contents of a sigil are a potentially 
multi-line string. However, an actual multi-line string literal does get 
forced into the expected format:

@words """
        ONE
        TWO
        THREE
        FOUR
        FIVE
        """
        |> String.split()
        |> Enum.filter(&String.contains?(&1, "F"))

Elixir *literal* multiline strings have special semantics for stripping the 
whitespace on the left, based on the indentation of the closing """. Try 
increasing and decreasing the indentation of that lexeme and watch how the 
formatter reacts.
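
As a minimal sketch of that stripping rule (the module and function names 
here are made up for illustration), the same two lines produce different 
strings depending only on where the closing """ sits:

```elixir
defmodule HeredocDemo do
  # Content is flush with the closing delimiter: nothing is left over.
  def flush do
    """
    ONE
    TWO
    """
  end

  # Content sits two columns to the right of the closing delimiter:
  # exactly those two columns survive as leading spaces.
  def indented do
    """
      ONE
      TWO
    """
  end
end

IO.inspect(HeredocDemo.flush())    # "ONE\nTWO\n"
IO.inspect(HeredocDemo.indented()) # "  ONE\n  TWO\n"
```

Because the compiler defines this rule for literals, the formatter can move 
the whole heredoc left or right without changing the resulting string.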

I'd personally intuitively expect the sigil_w case to do the same, but you 
can see why we cannot apply multiline string semantics to every sigil—the 
sigil macro, called at compile-time, receives the verbatim contents of the 
string, extra whitespace and all. Correctly handling that whitespace, 
including stripping it, is the job of the sigil itself, which may vary 
depending on the intentions of the sigil developer. (Consider: a custom 
sigil for parsing the Whitespace esolang 
<https://en.wikipedia.org/wiki/Whitespace_(programming_language)> or a 
Python program has different multiline-stripping semantics than literal 
multiline strings or sigil_w.)
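
To make that concrete, here is a small sketch. The sigil_x below is a 
hypothetical sigil I am inventing purely for illustration; it returns its 
contents untouched, while the stdlib sigil_w splits the same layout on 
whitespace:

```elixir
defmodule RawSigil do
  # Hypothetical sigil for illustration: returns its contents untouched.
  def sigil_x(string, []), do: string
end

defmodule SigilDemo do
  import RawSigil

  # The custom sigil sees every newline and leading space as written.
  def raw do
    ~x(
      ONE
        TWO
    )
  end

  # sigil_w splits on whitespace, so the same layout yields a clean list.
  def words do
    ~w(
      ONE
        TWO
    )
  end
end

IO.inspect(SigilDemo.raw())
IO.inspect(SigilDemo.words())
```

Re-indenting the body changes what sigil_x returns but not what sigil_w 
returns, which is why only the sigil itself knows whether such a rewrite is 
safe.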

Since the formatter cannot know what whitespace semantics any particular 
sigil expects, it cannot modify the contents of the string with the 
knowledge that it will not impact the program, unlike multiline string 
literals. So it will do absolutely no work on the sigil's contents, leaving 
your awkward indentation in place. The good news is that if you correct the 
indentation manually, knowing how this particular sigil handles whitespace, 
that rewriting will pass formatting and stay unchanged.
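
One way to check this on your own Elixir version is to run the formatter 
over a snippet directly with Code.format_string!/1. This is only a sketch; 
the input string is made up, and I am not asserting the exact output for 
versions other than the one I tried:

```elixir
# A ~w sigil with deliberately uneven indentation inside the delimiters.
source = "~w(\n  ONE\n      TWO\n    )\n"

# Code.format_string!/1 returns iodata, so convert it before printing.
formatted = source |> Code.format_string!() |> IO.iodata_to_binary()

IO.puts(formatted)
```

If the formatter leaves sigil contents alone, the uneven indentation before 
TWO should still be there in the printed result.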

*Formatting stdlib sigils*

That being said, the documentation for extending the formatter 
<https://hexdocs.pm/mix/main/Mix.Tasks.Format.html#module-plugins> is largely 
oriented around special-casing sigils. It should be easy to implement a 
plugin that knows how to normalize the currently untouched:

~w(
  ONE
  TWO
  THREE
    FOUR
  FIVE
    )

However, *I'd really imagine that the stdlib formatter would understand the 
special whitespace semantics of the stdlib sigil_w and format it 
out-of-the-box*. This is the feature request I see here.
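
As a rough sketch of what such normalization might look like, assuming the 
only whitespace rule that matters for sigil_w is "split on whitespace": the 
module and function names below are hypothetical, and this is neither a 
formatter plugin nor how the formatter works today, only the string rewrite 
a plugin or the formatter itself could apply to the sigil body.

```elixir
defmodule WordSigilNormalizer do
  @doc """
  Re-indents the body of a multi-line ~w sigil, one word per line.
  Hypothetical helper: the default indent of 2 is an assumption.
  """
  def normalize(contents, indent \\ 2) do
    pad = String.duplicate(" ", indent)

    # String.split/1 splits on any run of whitespace, so the word list
    # (and therefore the sigil's value) is preserved by construction.
    words = String.split(contents)

    "\n" <> Enum.map_join(words, "\n", &(pad <> &1)) <> "\n"
  end
end

body = "\n  ONE\n  TWO\n  THREE\n    FOUR\n  FIVE\n    "
IO.puts(WordSigilNormalizer.normalize(body))
```

The rewrite is safe only because splitting the normalized body gives the 
same words as splitting the original; a sigil with different whitespace 
semantics would need its own rule.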

I also think that other stdlib sigil formatting could be improved; for 
example I feel like

~D[2022-01-01
]

should automatically be formatted to 

~D[2022-01-01]

without any plugins.

*Formatting module attributes*

> Is there a reason why when I pipe the module attribute that it gets 
> indented differently than when I do not pipe it (compare @other with @xs)? 

I believe this is an emergent behaviour of whatever order the formatter 
calculates rules for determining:

- the indentation of the module attribute's argument
- the indentation of the pipeline
- the indentation of the (list) argument to the pipeline
- the indentation of list items within the list

These determinations add up in an unexpected way that I do not understand. 
Essentially, pipelines want their indentation to be flush with the 
leftmost character of their argument, so that you get:

[1, 2, 3]
|> Enum.map(&(&1 * 2))
|> Enum.reject(&(&1 < 5))
|> length()

[
  1,
  2,
  3
]
|> Enum.map(&(&1 * 2))
|> Enum.reject(&(&1 < 5))
|> length()

Somehow this interacts with how module attributes want to indent things, 
and we get

@nums [
        1,
        2,
        3
      ]
      |> Enum.map(&(&1 * 2))
      |> Enum.reject(&(&1 < 5))
      |> length()

This does not seem like a bug per se, but I also personally think that this 
should format as

@nums [
  1,
  2,
  3
]
|> Enum.map(&(&1 * 2))
|> Enum.reject(&(&1 < 5))
|> length()

This seems like it would be a backwards-compatible enhancement.
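
To see what your own Elixir version does with this shape, the formatter can 
be run on a snippet with Code.format_string!/1. This is a sketch for 
experimenting, with the attribute name taken from the example above; I am 
deliberately not stating the expected output, since it may differ between 
releases:

```elixir
# The layout I would personally like to see preserved.
source = """
@nums [
  1,
  2,
  3
]
|> Enum.map(&(&1 * 2))
|> Enum.reject(&(&1 < 5))
|> length()
"""

# Code.format_string!/1 returns iodata, so convert it before printing.
formatted = source |> Code.format_string!() |> IO.iodata_to_binary()

IO.puts(formatted)
```

Comparing the printed result against the input shows whether the list and 
the pipeline were pushed to the right of the attribute name.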

*Summary*

The formatter does not normalize the contents of sigil_w according to that 
sigil's whitespace semantics. Combined with the current handling of 
multi-line expressions in module attributes, this leads to the particularly 
unexpected appearance seen here.

I feel like improvements to both would be welcome in PRs. It may be worth 
first discussing the impact of releasing changes to the formatter, though. 
Even semantically backwards-compatible changes have the potential to lead 
to a lot of syntactic line diff noise and churn when upgrading Elixir, so 
I'm not certain if there is a more cautious release policy for such 
things—such as only releasing major formatter changes in minor version 
bumps.

On Wednesday, December 28, 2022 at 5:59:17 AM UTC-6 dario.h...@gmail.com 
wrote:

> Running `mix format --check-formatted` passes with success on the 
> following code:
>
> defmodule Example do
>   @xs [
>     1,
>     2,
>     3,
>     4,
>     5,
>     6,
>     7
>   ]
>
>   @other [
>            1,
>            2,
>            3,
>            4,
>            5,
>            6,
>            7
>          ]
>          |> Enum.map(&(&1 * 2))
>          |> Enum.reject(&(&1 < 5))
>          |> length()
>
>   @words ~w(
>   ONE
>   TWO
>   THREE
>   FOUR
>   FIVE
>   )
>          |> Enum.filter(&String.contains?(&1, "F"))
> end
>
> I am wondering whether that is intended or if I should open an issue on 
> GitHub and look into fixing it. 
>
> `@words` does not seem to be formatted in the same way as `@other` which I 
> would kinda expect and the formatting of `@words` looks kinda weird.
>
> Secondly, is there a reason why when I pipe the module attribute that it gets 
> indented differently than when I do not pipe it (compare @other with @xs)? 
>
> Best regards,
> Dario
>

-- 
You received this message because you are subscribed to the Google Groups 
"elixir-lang-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elixir-lang-core+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elixir-lang-core/35cedeec-9456-4d80-9346-09ab1d739229n%40googlegroups.com.
