Scheme _is_ my cup of tea.  That's kind of the point; I don't have a very 
good memory (when I programmed in Perl I had to keep the manual sitting by 
my hand, and when I'm reading other people's C code I pin a precedence 
table to the wall) but the parsing rules for Lisp-like languages are so 
simple I haven't forgotten them in 30 years of on-and-off use.  Go, for all 
its faults, tries to keep the semantics simple enough to hold in a 
programmer's head (and states the actual rules concisely in the online 
manual for when they aren't in the programmer's head).  The problem 
with Ruby isn't that its parser sits in some strange position in the 
Chomsky hierarchy; it's that some programs are "write-only" because the 
syntax is ambiguous and there's no obvious rule for remembering how the 
ambiguities are resolved.

I'm not trying to encourage you to use a parser generator.  I'm just 
honestly telling you that I wouldn't be a good candidate to concisely write 
down what your current parser actually does.  It would make me frustrated, 
and I would just end up posting flamy rants on the newsgroups.

What I am encouraging you to do is to clearly state, in a couple of 
sentences in the online manual, how you parse expressions.  Unfortunately I 
think it is a bit more complicated than Stefan let on.  Here's what I think 
I've figured out, approximately:

Julia, like Go, has implicit semicolons in compound expressions.  If you 
leave out the semicolons then Julia, like Go, will auto-insert them at 
newlines, if the resulting expression "makes sense".  (This is essentially 
the explanation from the Go manual.)  If the semicolons (implicit or 
explicit) are missing then the parser complains.  That is: the parser does 
_not_ try to figure out where to insert semicolons between two expressions 
on the same line.
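
To make that rule concrete, here is a small sketch that asks the parser 
directly.  It assumes a Julia 1.x session, where `Meta.parse` turns a string 
into an expression without evaluating it and throws when the input is 
rejected.  The helper name `parses` is made up for this illustration.

```julia
# Hypothetical helper: true if `src` parses as one complete expression.
# By default Meta.parse throws when the parser rejects the input.
function parses(src)
    try
        Meta.parse(src)
        return true
    catch
        return false
    end
end

parses("begin\n    x = 1\n    y = 2\nend")  # newlines act as separators
parses("begin x = 1; y = 2 end")            # explicit semicolons on one line
parses("begin x = 1 y = 2 end")             # no separator: the parser complains
```

The first two calls should succeed and the third should fail, which matches 
the claim that the parser never guesses where one expression ends and the 
next begins on the same line.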

But Julia, unlike Go, also has an implicit "begin" for compound expressions 
that have a controlling expression (if-else-end and while-end, and perhaps 
others?).  This allows single-line if-else-end and single-line while-end 
expressions, but the cost is a new rule: after an "if" and after a "while" 
(and probably "elseif" and maybe "catch"?) an implicit "begin" is inserted 
at whichever comes first: the next newline, or the end of the longest 
possible expression that can be constructed before the newline.

Perhaps "implicit begin" isn't the right way to explain it.  But it is 
understandable to me.
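
A quick way to check the "implicit begin" reading is to compare the 
one-line and multi-line spellings structurally.  This is only a sketch, 
again assuming Julia 1.x; `strip_parse` is a made-up helper, and 
`Base.remove_linenums!` is used just to discard line-number bookkeeping 
so the two parses can be compared.

```julia
# Hypothetical helper: parse `src` and drop line-number nodes so that
# one-line and multi-line spellings can be compared structurally.
strip_parse(src) = Base.remove_linenums!(Meta.parse(src))

# The condition `x > y` ends where the expression cannot be extended any
# further; everything after that point is treated as the body.
one_line_if   = "if x > y println(\"big\") else println(\"small\") end"
multi_line_if = "if x > y\n    println(\"big\")\nelse\n    println(\"small\")\nend"

one_line_while   = "while i < n i += 1 end"
multi_line_while = "while i < n\n    i += 1\nend"

strip_parse(one_line_if) == strip_parse(multi_line_if)
strip_parse(one_line_while) == strip_parse(multi_line_while)
```

If the rule above is right, both comparisons should hold: the one-line 
forms are just the multi-line forms with the body starting at the end of 
the controlling expression instead of at a newline.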

On Saturday, July 19, 2014 7:22:27 PM UTC-5, Jake Bolewski wrote:
>
> There are some cases when you cannot just consume the tokenized source but 
> have to drop down and do character look ahead to disambiguate at a certain 
> point.  I don't know how common that is in other languages and it 
> happens in a couple of cases when parsing julia (mostly with characters 
> which are overloaded to have meanings in different contexts) but would love 
> to find out.  But as Stefan said it is all look ahead.  I believe there is 
> only one place where you consume a token and backtrack if a condition does 
> not hold. 
