>> The source code is the de-facto specification
Fair enough, although it does raise the question of which parser's source code,
there being no shortage of Lucene/Solr/etc. query parsers, parser releases,
and parser versions on GitHub. Anyway, below is my de jure yacc. I think
it covers everything in the 2012 specification and rounds out the special
cases a little.
Your comments are solicited and will be greatly appreciated.
Cheers, Scott
P.S. yacc/bison can generate parsers in programming languages other than C,
including Java.
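For the Java case, a minimal sketch of the grammar-file prologue. It assumes GNU Bison (classic yacc emits only C), and the class name LuceneQueryParser is a placeholder, not something from any existing parser:

```yacc
/* Assumes a recent GNU Bison 3.x release; older releases spell the
   class-name variable differently. */
%language "Java"

/* Hypothetical name for the generated parser class. */
%define api.parser.class {LuceneQueryParser}
```

The grammar rules themselves stay the same; only the prologue and any semantic actions change with the target language.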
query
    : query TOK_AND query
    | query TOK_OR query
    | TOK_NOT query
    | '(' query ')'
    | term
    ;

term
    : TOK_ALPHA
    | TOK_WILD
    | TOK_ALPHA ':' TOK_ALPHA
    | TOK_ALPHA ':' TOK_WILD
    | TOK_ALPHA '~'
    | TOK_ALPHA '~' TOK_NUM
    | TOK_ALPHA '^' TOK_NUM
    | TOK_ALPHA ':' TOK_ALPHA '~'
    | TOK_ALPHA ':' TOK_ALPHA '~' TOK_NUM
    | TOK_ALPHA ':' TOK_ALPHA '^' TOK_NUM
    | '"' TOK_ALPHA TOK_ALPHA '"' '~' TOK_NUM
    | TOK_ALPHA ':' '[' TOK_NUM TOK_TO TOK_NUM ']'
    | TOK_ALPHA ':' '{' TOK_ALPHA TOK_TO TOK_ALPHA '}'
    | '+' TOK_ALPHA
    | '-' TOK_ALPHA
    ;
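
One thing the rules above leave open is how AND, OR, and NOT bind. As written, the three recursive query alternatives are ambiguous, so yacc will report shift/reduce conflicts. A sketch of a declarations section that should resolve them, assuming the conventional Boolean ordering (NOT tightest, then AND, then OR). That ordering is an assumption for illustration; it is not taken from the 2012 specification or from any particular Lucene parser:

```yacc
/* Declarations section: goes above the first %% of the grammar file. */
%token TOK_ALPHA TOK_WILD TOK_NUM TOK_TO

/* Precedence, lowest first. The ordering is an assumption. */
%left  TOK_OR     /* a OR b OR c   groups as (a OR b) OR c   */
%left  TOK_AND    /* a OR b AND c  groups as a OR (b AND c)  */
%right TOK_NOT    /* NOT a AND b   groups as (NOT a) AND b   */
```

With these in place, a query such as NOT a AND b OR c parses as ((NOT a) AND b) OR c. If the target parser is meant to behave differently, reordering the three lines is enough to change the grouping.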