Hi,

I'm doing pretty well recognizing LaTeX commands, but now I'm at the 
stage where I want to capture the "text", and I'm having trouble defining 
"everything else".

Basically, I currently define a LaTeX document as a sequence of commands 
(as I define them in the grammar below), possibly separated by WS; 
everything that's not a command should be "text". The problem I keep 
running into is that when I define "text" generously, it starts grabbing 
tokens that belong to commands. Any help would be greatly appreciated!

Thanks in advance,

Pavel

I'm including the grammar I have so far and the document I'm hoping to parse.

grammar PGTeX;

doc : (command WS?)+ EOF;

command : escWord  cWord+ ( sWord+ cWord*)?;

sWord    : '[' word ']';
cWord    : '{' word '}';
escWord : '\\' word;

word : WORD;

WORD:    ('-'|'a'..'z'|'A'..'Z'|'0'..'9'|'\*')+;

WS  :   ( ' ' | '\t'| '\r' | '\n' )+;

COMMENT
     :    '%' (~('\n'|'\r'))*  {$channel = HIDDEN;};


And here's the document:

\documentclass{book}%
\usepackage{amsfonts}
\usepackage{amsmath}%
\newtheorem{summary}[theorem]{Summary}
\begin{document}


\chapter*{Intro}

Book starts here $x^{2}+y^{2}=1$. Here's an interesting faction:
\begin{equation}
\int_{0}^{1}\sin xdx=4
\end{equation}

\end{document}
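
To make the goal concrete, here is a rough sketch in Python (not ANTLR) of the split I'm after for the document above. The names TOKEN and split_latex are placeholders made up for illustration. The pattern assumes that command arguments are never nested and that % always starts a comment, neither of which holds for real LaTeX. It treats anything between commands and comments as "text", so math such as $x^{2}+y^{2}=1$ simply ends up inside a text run.

```python
import re

# Hypothetical pattern: a command is a backslash, a letter-only name, an
# optional '*', then any run of {...} or [...] arguments (no nesting).
# A comment runs from '%' to the end of the line.
TOKEN = re.compile(
    r"(?P<command>\\[A-Za-z]+\*?(?:\{[^{}]*\}|\[[^\[\]]*\])*)"
    r"|(?P<comment>%[^\r\n]*)"
)


def split_latex(source):
    """Return (kind, value) pairs; whatever lies between matches is 'text'."""
    tokens = []
    pos = 0
    for match in TOKEN.finditer(source):
        # Anything skipped since the previous match is a candidate text run.
        gap = source[pos:match.start()].strip()
        if gap:
            tokens.append(("text", gap))
        # lastgroup is the name of the alternative that matched.
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    tail = source[pos:].strip()
    if tail:
        tokens.append(("text", tail))
    return tokens


if __name__ == "__main__":
    sample = "\\chapter*{Intro}\n\nBook starts here.\n\\end{document}"
    for kind, value in split_latex(sample):
        print(kind, value)
```

The point of the sketch is that "text" is never matched by a pattern of its own; it is only what is left over once commands and comments have been recognized. That leftover behaviour is what I don't know how to express in the grammar.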



