With bison/flex, the lexical analyzer pipelines tokens on demand to the parser as the grammar rules are processed.
It appears that in ANTLR (3), lexical analysis tokenizes the input stream before any grammar rules are activated.
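The contrast can be sketched in plain Python (a toy illustration of the two strategies, not ANTLR's or flex's actual machinery; the token pattern is an assumption):

```python
import re

# Hypothetical token pattern: integers, identifiers, or any single symbol.
TOKEN_RE = re.compile(r"\s*(\d+|[A-Za-z_]\w*|\S)")

def lex_on_demand(text):
    """flex-style: produce the next token only when the parser asks for it."""
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if not m:
            return
        pos = m.end()
        yield m.group(1)

def lex_up_front(text):
    """ANTLR-3-style: tokenize the entire input before parsing begins."""
    return list(lex_on_demand(text))

stream = lex_on_demand("x = 42")
print(next(stream))            # → x   (only one token produced so far)
print(lex_up_front("x = 42"))  # → ['x', '=', '42']
```

With on-demand lexing, a parser action that runs before the next token is requested can still change lexer state; with up-front tokenization the whole token buffer already exists by the time any rule fires.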
By pipelining, it's possible to influence the parse from prece
Searches on both MarkMail and in the downloaded samples have turned up no grammars that use both return-attributes and list matching in tandem.
Does someone out there know the right way to write this idiom?
Thanks in advance!
-Bob
List: http://www.antlr.org/mailman/listinfo/antlr-interest
does this in the generated code.
Jim
> -----Original Message-----
> From: antlr-interest-boun...@antlr.org [mailto:antlr-interest-
> boun...@antlr.org] On Behalf Of Bob
> Sent: Saturday, May 22, 2010 11:33 AM
> To: antlr-interest@antlr.org
> Subject: Re: [antlr-interest] Antlr 3.2 v
the memory usage you are seeing is probably this first and
not the tokens.
Jim
> -----Original Message-----
> From: antlr-interest-boun...@antlr.org [mailto:antlr-interest-
> boun...@antlr.org] On Behalf Of Bob
> Sent: Friday, May 21, 2010 7:47 PM
> To: antlr-interest@antlr.org
>
A tiny grammar was implemented in both ANTLR and Bison+Flex (shown below).
Test files repeating two lines (shown below) were made in six different sizes.
One executable was compiled, with a command-line switch choosing either ANTLR or Bison+Flex.
One run was made with empty actions, and one with actions populated.
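A minimal sketch of that timing methodology in Python, with hypothetical stand-in parse functions (the real tests compile the ANTLR and Bison+Flex parsers into one executable; these stand-ins exist only to make the harness runnable):

```python
import time

def run_benchmark(parse, text, repeats=3):
    """Time one parse function over one input; keep the best of several runs."""
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        parse(text)
        best = min(best, time.perf_counter() - t0)
    return best

# Hypothetical stand-ins for the two builds being compared:
def parse_empty_actions(text):
    """Grammar run with empty actions: recognize only."""
    return sum(1 for _ in text.splitlines())

def parse_with_actions(text):
    """Grammar run with populated actions: build something per line."""
    return [line.split() for line in text.splitlines()]

sample = "int x ;\nx = 1 ;\n" * 5000   # a test file repeating two lines
empty_t = run_benchmark(parse_empty_actions, sample)
full_t = run_benchmark(parse_with_actions, sample)
```

Taking the best of several runs damps out cache and scheduler noise, which matters when comparing two implementations on the same input.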
I'm one day into ANTLR and hope for an answer to this:
With an identifier rule (for example, this one):
SIMPLE_IDENTIFIER : ( 'a'..'z'|'A'..'Z'|'_' ) ( 'a'..'z'|'A'..'Z'|'_'|'0'..'9'|'$' )* ;
Is it possible, when the lexer recognizes the input stream to be a
SIMPLE_IDENTIFIER, to add some e
s not used.
Reason: some source files are 800 MB to 1.4 GB in size, and reading the entire thing into a 32-bit address space doesn't leave much left over.
If it's possible to limit the input buffer size, can you point me in the
right direction?
Thanks,
Bob
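ANTLR 3's buffered streams do read the whole input up front, so one workaround (an assumption about your input format, not an ANTLR API) is to window the file yourself and run the lexer/parser per chunk, which only works if no top-level construct crosses a chunk boundary. A sketch that yields line-aligned chunks:

```python
import io

def iter_line_chunks(f, chunk_size=64 * 1024 * 1024):
    """Yield line-aligned chunks from a binary file object, so no line
    ever straddles two chunks and each window can be parsed independently."""
    leftover = b""
    while True:
        block = f.read(chunk_size)
        if not block:
            if leftover:
                yield leftover
            return
        block = leftover + block
        cut = block.rfind(b"\n") + 1
        if cut == 0:              # no newline in this block; keep accumulating
            leftover = block
            continue
        yield block[:cut]
        leftover = block[cut:]

# Each chunk could then be fed to its own input stream / lexer instance.
chunks = list(iter_line_chunks(io.BytesIO(b"a\nbb\nccc\n"), chunk_size=4))
print(chunks)  # → [b'a\n', b'bb\n', b'ccc\n']
```

The tiny chunk_size in the demo just exercises the boundary logic; in practice you would pick a window far below your address-space budget.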
What's the best way to detect illegal input characters in the lexer -- in my case, characters with a code > 127? [I just had my grammar enter an infinite loop on an arithmetic expression where the minus sign was really an en dash with code == 150, but maybe that's another problem!]
Presumably, some p
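One option is a pre-flight scan of the raw text before the lexer sees it; a minimal sketch (a hand-rolled check, not an ANTLR feature). As an aside, code 150 (0x96) is exactly the en dash in the Windows-1252 code page, which would explain how it crept into the source:

```python
def find_illegal_chars(text, limit=127):
    """Return (line, column, char, code) for every char whose code exceeds limit."""
    hits = []
    line, col = 1, 1
    for ch in text:
        if ord(ch) > limit:
            hits.append((line, col, ch, ord(ch)))
        if ch == "\n":
            line, col = line + 1, 1
        else:
            col += 1
    return hits

source = "a \u2013 b\n"   # en dash where a minus sign was intended
print(find_illegal_chars(source))  # → [(1, 3, '–', 8211)]
```

Reporting line and column up front is usually friendlier than letting a stray character send the lexer into a mismatch loop.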
. This either
leads to the island lexer throwing away input (since it terminates on
EOF and tosses the remaining multi-emit buffer) or throwing an error
(if it runs across input that it cannot understand).
I've included a reproducer (in Python) which can demonstrate this and gives
ng you could invoke because you hadn't written a tree parser). You should probably run through the Python examples (http://www.antlr.org/download/examples-v3.tar.gz) or the ANTLR book if you're going to do that.
Hope this helps.
-Bob
For clarity, here's the whole file
My language has a simple pre-processor that expands text of the form ${} as a first phase of translation; the expanded stream is then input to my ANTLRInputStream, where it proceeds onward to the lexer/parser in the usual fashion. Said another way, neither the lexer nor the parser is aware of
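Python's standard library ships exactly this kind of first-phase expander in string.Template; a minimal sketch of the two-phase idea, assuming the macro syntax is ${name} (the exact form is cut off in the excerpt above) and a hypothetical macro table:

```python
from string import Template

MACROS = {"version": "3.2", "author": "Bob"}  # hypothetical macro table

def preprocess(source, macros):
    """Phase 1: expand ${name} references; the lexer/parser never sees them."""
    return Template(source).substitute(macros)

expanded = preprocess("tool ${version} by ${author}", MACROS)
print(expanded)  # → tool 3.2 by Bob
```

Because expansion happens before the character stream is built, token positions reported by the lexer refer to the expanded text, not the original source, which is worth keeping in mind for error messages.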
g
things for several days. It's a pretty exciting tool, but non-trivial.
Bob Anderson
I'm trying to build an AST that has extraneous tokens removed. But whether or not I use a rewrite rule, ANTLRWorks always shows me an AST with every token in it. Is that normal ANTLRWorks behavior, or am I missing some directive?
Below is a snippet of the grammar...
options
{
language =