Hi there,
I'd like to parse 'large' streams (200 MBytes and more) with only
small chunks of data (tokens/characters) in memory at a time.
The goal is that the parser/lexer should block until more chars are
available from the underlying input stream. I have a few simple
'callbacks' embedded in the grammar which call into the business logic
to process recognized data. But with the standard setup, the callbacks
are only called after the complete input stream has been read.
// uncompressed replication transaction.
transaction
    : { if (callback != null) callback.startTransaction(); }
      x01 (update_type)+
      { if (callback != null) callback.endTransaction(); }
    ;

update_type
    : entityId = entity_id '{' (values = basic_update)+ '}'
      { if (callback != null) callback.updateType(entityId, values); }
    ;

basic_update returns [List<String> values]
@init {
    values = new ArrayList<String>();
}
    : '{' s = value { values.add(s); }
      ('|' (s = value { values.add(s); } )? )* '}'
    ;
There are a few reasons why I'd like to do it this way:
1. The data is received in rather small chunks (< 4k or so) from NIO
sockets.
2. I don't want to buffer the data on the file system (file I/O).
3. I want as small a memory footprint as possible.
4. Many streams may be processed/parsed at the same time.
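To make the first three points concrete, here is a minimal sketch of the kind of hand-off I have in mind, in plain Java. ChunkPipe is a made-up name (not an ANTLR or NIO class), and the UTF-8 decoding and the one-parser-thread-per-stream setup are assumptions for illustration only. The pipe holds at most a fixed number of bytes: the receiving side blocks when it is full, the parsing side blocks when it is empty.

```java
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical glue class: a bounded in-memory pipe between the thread that
// receives socket chunks and the thread that parses them.
final class ChunkPipe {
    private final PipedOutputStream sink = new PipedOutputStream();
    private final PipedInputStream source;

    // capacityBytes bounds how much unparsed data is held in memory.
    ChunkPipe(int capacityBytes) {
        try {
            source = new PipedInputStream(sink, capacityBytes);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Receiver side: hand over one chunk; blocks while the pipe is full.
    void offer(ByteBuffer chunk) {
        byte[] bytes = new byte[chunk.remaining()];
        chunk.get(bytes);
        try {
            sink.write(bytes);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Receiver side: signal end of stream so the parser side sees EOF.
    void close() {
        try {
            sink.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Parser side: read() on this Reader blocks until data or EOF arrives.
    // UTF-8 is an assumption; the real encoding depends on the protocol.
    Reader reader() {
        return new InputStreamReader(source, StandardCharsets.UTF_8);
    }
}
```

With something like this, the socket handler would call offer() for each received chunk and the parser thread would read from reader(). Note that offer() blocks when the parser falls behind, which may not be acceptable inside a selector loop.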
I'm using ANTLR 3.1.3 (Java/Scala).
From what I can see, CommonTokenStream.fillBuffer() is greedy and
loads all tokens at once. Right now I'm using ANTLRInputStream as the
CharStream.
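As a contrast to that greedy behavior, here is a rough sketch of the on-demand reading I'm after, again in plain Java. LazyCharSource is a made-up name and does not implement the ANTLR CharStream interface; it only shows the idea of keeping a single character of lookahead and fetching the next one when the consumer asks for it.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;

// Hypothetical helper, not an ANTLR class: holds exactly one character of
// lookahead and touches the underlying Reader only when asked to.
final class LazyCharSource {
    static final int EOF = -1;

    private final Reader in;
    private int lookahead;     // the single buffered character, or EOF
    private boolean primed;    // true while 'lookahead' holds a fetched value
    private long consumed;     // number of characters consumed so far

    LazyCharSource(Reader in) {
        this.in = in;
    }

    // Returns the next character without consuming it. The first call after
    // a consume() reads from the Reader and may block until data arrives.
    int la() {
        if (!primed) {
            try {
                lookahead = in.read();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            primed = true;
        }
        return lookahead;
    }

    // Consumes the current character. The following character is NOT
    // prefetched; it is read on the next la() call. EOF is never consumed.
    void consume() {
        if (la() != EOF) {
            consumed++;
            primed = false;
        }
    }

    long consumedCount() {
        return consumed;
    }
}
```

Only one character is ever held here, so memory use does not grow with the size of the stream. A token stream on top of it would presumably have to pull tokens in the same lazy way for the callbacks to fire incrementally.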
Is there a (simple) way to accomplish this? What would be the right
approach: a custom TokenStream or rather a different CharStream? BTW,
a lookahead of 1 is sufficient for me.
Thanks for your help.
Cheers,
Horst