Thanks to Jon and Jay's testing, we can make the following statement:
Compilers ignore a source file that is physically empty (zero length) or
logically empty (contains only whitespace and/or comments).
(I confirmed this by tweaking Test.java so that the empty case was not
`""` but rather `" /* Comment */ "`. javac still accepted/ignored it.)
In other words, compilers do not observe an ordinary compilation unit if
it has no package, import, or type declarations -- a.k.a. a "vacant"
ordinary compilation unit.
We know this to be true because if compilers did observe a vacant
ordinary compilation unit, then the lack of a package declaration would
cause an error when the empty source file is in a modular location; but
no such error is given.
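As an illustration only (this is not Test.java, and not how javac or ecj are implemented), the "physically empty" versus "logically empty" distinction can be sketched as a small scanner. The class and method names below are hypothetical. The sketch assumes the whitespace characters listed in JLS 3.6 and ignores Unicode escapes and the trailing Ctrl-Z rule.

```java
// Hypothetical helper, for illustration only: decides whether source text
// would present zero tokens to a parser.
class VacantSource {
    /** True if the text has zero length ("physically empty"). */
    public static boolean isPhysicallyEmpty(String src) {
        return src.isEmpty();
    }

    /**
     * True if the text is zero length or holds only whitespace and/or
     * comments ("logically empty"). Assumption: Unicode escapes and the
     * trailing Ctrl-Z rule are ignored.
     */
    public static boolean isVacant(String src) {
        int i = 0;
        int n = src.length();
        while (i < n) {
            char c = src.charAt(i);
            if (c == ' ' || c == '\t' || c == '\f' || c == '\n' || c == '\r') {
                i++; // whitespace or line terminator
            } else if (src.startsWith("//", i)) {
                // End-of-line comment: skip to the line terminator or end of input.
                while (i < n && src.charAt(i) != '\n' && src.charAt(i) != '\r') {
                    i++;
                }
            } else if (src.startsWith("/*", i)) {
                int end = src.indexOf("*/", i + 2);
                if (end < 0) {
                    return false; // unterminated comment: a lexical error, not "empty"
                }
                i = end + 2;
            } else {
                return false; // anything else starts a token
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String[] samples = { "", " /* Comment */ ", "// note\n", "import java.util.List;" };
        for (String s : samples) {
            System.out.println("[" + s.trim() + "] vacant=" + isVacant(s));
        }
    }
}
```

Because any character that is neither whitespace nor part of a comment begins a token, the scanner does not need to understand string literals or any other lexical structure.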
Compilers are free to take this not-observable stance, per 7.3: "The
host system determines which compilation units are observable". It would
be possible to mandate that the host system MUST NOT observe a vacant
ordinary compilation unit, but such a mandate would probably have
unintended consequences. It would also be possible to define a vacant
ordinary compilation unit out of existence, by tweaking 7.3's grammar as
proposed in the quoted mail below, but again, beware unintended
consequences. What the JLS should do is affirm the compilers' decision
to "accept/ignore" a vacant ordinary compilation unit, by clarifying
that a vacant ordinary compilation unit is exempt from the "part of an
unnamed package" rule in 7.4.2. I have filed spec bug JDK-8214743,
proposing: "An ordinary compilation unit that has no package
declaration, but has at least one other kind of declaration, is part of
an unnamed package."
Alex
P.S. In the course of examining 7.3's grammar, I realized that
OrdinaryCompilationUnit is not congruent with how 2.1 defines a
production in a context-free grammar as having "a sequence of one or
more nonterminal and terminal symbols as its right-hand side."
2.1's definition is intended to apply _after_ interpretation of 2.4's
grammar notation. For example, the production `A: [B]` is really two
productions, `A: ` and `A: B`. The first has zero symbols as its RHS, so
the grammar is not context-free -- parsing of an A is possible at any
time, based on considerations other than the terminals in hand.
Similarly, the production `C: {D}` is really an infinite number of
productions `C: ` and `C: D` and `C: D D` and `C: D D D` etc.
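To make the expansion concrete, here is a minimal sketch that spells out the plain right-hand sides behind `A: [B]` and `C: {D}`. The class and method names are hypothetical, and the repetition is cut off at an arbitrary bound because the real set is infinite. It assumes Java 9 or later for `List.of`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of how 2.4's notation unfolds into plain productions.
class NotationExpansion {
    /** `A: [B]` stands for two right-hand sides: the empty one and `B`. */
    public static List<List<String>> expandOptional(String symbol) {
        return List.of(List.<String>of(), List.of(symbol));
    }

    /** `C: {D}` stands for 0, 1, 2, ... copies of D; stop at maxRepeats. */
    public static List<List<String>> expandRepetition(String symbol, int maxRepeats) {
        List<List<String>> result = new ArrayList<>();
        List<String> rhs = new ArrayList<>();
        for (int k = 0; k <= maxRepeats; k++) {
            result.add(new ArrayList<>(rhs)); // snapshot with k copies of the symbol
            rhs.add(symbol);
        }
        return result;
    }

    public static void main(String[] args) {
        for (List<String> rhs : expandOptional("B")) {
            System.out.println("A: " + String.join(" ", rhs));
        }
        for (List<String> rhs : expandRepetition("D", 3)) {
            System.out.println("C: " + String.join(" ", rhs));
        }
    }
}
```

In both expansions the first right-hand side is the empty list, which is the zero-symbol case that 2.1's wording does not cover.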
OrdinaryCompilationUnit is significant for being the only production in
the JLS to allow zero symbols and thus _not_ be context-free. Compilers
provide the context when they lex an empty source file and decide not to
observe an ordinary compilation unit therein.
There's nothing good to be done here. We aren't going to change the
longstanding OrdinaryCompilationUnit production after all, and I don't
want to complicate 2.1 by special-casing its zero-symbols RHS.
On 12/3/2018 8:29 AM, Jayaprakash Artanareeswaran wrote:
Thanks for the test file, Jon. Last week Stephan and I had a discussion,
agreed with the specified behavior, and made some changes to our
compiler.
I can also confirm that both the compilers behave the same way for all
the scenarios included in the test file.
Regards,
Jay
------------------------------------------------------------------------
*From:* Jonathan Gibbons <jonathan.gibb...@oracle.com>
*Sent:* Monday, November 26, 2018 11:22 PM
*To:* Alex Buckley; Jayaprakash Artanareeswaran;
jigsaw-dev@openjdk.java.net; compiler-dev
*Subject:* Re: Where do empty compilation units belong?
On 11/26/2018 01:44 PM, Alex Buckley wrote:
// Adding compiler-dev since the parsing of files into compilation
units is not a Jigsaw issue.
On 11/20/2018 9:14 PM, Jayaprakash Artanareeswaran wrote:
"jigsaw-dev" <jigsaw-dev-boun...@openjdk.java.net>
wrote on 21/11/2018 01:56:42 AM:
> Jon points out that `OrdinaryCompilationUnit` will match an empty stream
> of tokens (I dislike the syntax-driven optionality here, but it's
> longstanding) so the file D.java could be regarded as a compilation unit
> with no package declaration, no import declarations, and no type
> declarations.
>
> Per JLS 7.4.2, such a compilation unit is in an unnamed package, and
> must be associated with an unnamed module.
>
> I would prefer 7.4.2 to say only that a compilation unit with no package
> declarations _and at least one type declaration_ is in an unnamed
> package (and must be associated with an unnamed module; 7.3 should
> enumerate that possibility). A compilation unit with no package
> declarations _and no type declarations_ would be deemed unobservable by
> 7.3, and all these questions about what to do with empty files would
> disappear.
That would be perfect and make things unambiguous. But for now, the
paragraph above is good enough for me.
Unfortunately, import declarations can have side effects (compile-time
errors) so to be sure that the "no package or type decl ===
unobservable" rule is suitable for a file containing just an import
decl, we would have to do a case analysis of how javac and ecj handle
the eight combinations of the three parts allowed in an ordinary
compilation unit. That's overkill for the situation involving empty
files that keeps coming up and that I really want to clarify. I don't
think anyone loves that an ordinary compilation unit matches the empty
stream, so let's define away that scenario. As Jon said, an empty file
doesn't present anything to be checked; there is no compilation unit
there, so let's be unambiguous about that.
We can rule out the empty stream in 7.3 with grammar or with
semantics. Usually a semantic description is clearest (gives everyone
the proper terminology and concepts) but in this case we don't want
the description to wrestle with "consists of one, two, or three parts"
when the grammar allows zero. So, a new grammatical description is
appropriate, and straightforward:
OrdinaryCompilationUnit:
    PackageDeclaration {ImportDeclaration} {TypeDeclaration}
    ImportDeclaration {ImportDeclaration} {TypeDeclaration}
    TypeDeclaration {TypeDeclaration}
The "three parts, each of which is optional" description is still
accurate. The package decl part is optional (as long as you have the
import decls part and/or the type decls part); the import decls part
is optional (as long as you have either the package decl part or ...)
... you get the picture.
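A rough way to check the proposed production against the eight combinations of the three parts is to enumerate them. This is a sketch with hypothetical names. It models each part only as present or absent, and takes the current production to be `[PackageDeclaration] {ImportDeclaration} {TypeDeclaration}`.

```java
// Hypothetical illustration: which shapes of ordinary compilation unit match
// the current production and which alternative of the proposed one.
class CompilationUnitShapes {
    /** Current production: every part is optional, so every shape matches. */
    public static boolean matchesCurrent(boolean hasPackage, boolean hasImports, boolean hasTypes) {
        return true;
    }

    /**
     * Proposed production: returns the number (1, 2 or 3) of the alternative
     * that matches, or 0 if none does.
     */
    public static int proposedAlternative(boolean hasPackage, boolean hasImports, boolean hasTypes) {
        if (hasPackage) return 1; // PackageDeclaration {ImportDeclaration} {TypeDeclaration}
        if (hasImports) return 2; // ImportDeclaration {ImportDeclaration} {TypeDeclaration}
        if (hasTypes) return 3;   // TypeDeclaration {TypeDeclaration}
        return 0;                 // empty token stream: no alternative applies
    }

    public static void main(String[] args) {
        boolean[] flags = { false, true };
        for (boolean p : flags) {
            for (boolean i : flags) {
                for (boolean t : flags) {
                    int alt = proposedAlternative(p, i, t);
                    System.out.printf("package=%-5b import=%-5b type=%-5b current=%b proposed=%s%n",
                            p, i, t, matchesCurrent(p, i, t),
                            alt == 0 ? "no match" : "alternative " + alt);
                }
            }
        }
    }
}
```

Seven of the eight shapes match exactly one alternative. Only the shape with no parts at all goes unmatched, which is the empty-stream case the proposal is meant to exclude.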
I would leave 7.4.2 alone; an ordinary compilation unit with no
package or type decls but with import decls is part of the unnamed
package (and thus unnamed module) as before, and compilers can handle
that, I think.
Any comments?
Alex
That seems good to me.
To summarize the javac behavior ...
* javac accepts/ignores an empty file
* javac treats import-only compilation units as in the unnamed
  package, which is not allowed in a named module
* javac enforces file naming constraints when declaring a public class
* javac uses file naming constraints when looking on the (module)
  source path for a file for a class
Attached is a toy class to generate combinations of package, import and
type declarations. You can use the source-launcher feature to run it.
-- Jon