There is no single query parser that covers all of Lucene's API (the various Query class implementations). The existing "query parsers" each cover a subset of the API. Historically they have varied depending on what people needed and what kind of audience was targeted (technical users, the general public), among many other factors.
There is no "single" Lucene syntax that would cover everything. People typically pick what they need and write query parsers that work for them. If you take a look at Elasticsearch, their primary "query" is a structured DSL covering the typical Query classes, not a plain-text representation. Perhaps this is the closest to what a "query parser" for the Lucene API should be.

Dawid

On Wed, Nov 11, 2020 at 9:54 PM Scott Guthery <[email protected]> wrote:
>
> >> The source code is the de-facto specification
>
> Fair enough although it does beg the question of which parser source code,
> there being no shortage of Lucene/Solr/etc. query parsers, parser releases,
> and parser versions at github. Anyway, below is my de jure yacc. I think it
> covers everything in the 2012 specification and rounds out the special cases
> a little.
>
> Your comments are solicited and will be greatly appreciated.
>
> Cheers, Scott
>
> P.S. yacc/bison can generate parsers in programming languages other than C
> including Java.
>
> query : query TOK_AND query
>       | query TOK_OR query
>       | TOK_NOT query
>       | '(' query ')'
>       | term
> term:
>       TOK_ALPHA |
>       TOK_WILD |
>       TOK_ALPHA ':' TOK_ALPHA |
>       TOK_ALPHA ':' TOK_WILD |
>       TOK_ALPHA '~' |
>       TOK_ALPHA '~' TOK_NUM |
>       TOK_ALPHA '^' TOK_NUM |
>       TOK_ALPHA ':' TOK_ALPHA '~' |
>       TOK_ALPHA ':' TOK_ALPHA '~' TOK_NUM |
>       TOK_ALPHA ':' TOK_ALPHA '^' TOK_NUM |
>       '"' TOK_ALPHA TOK_ALPHA '"' '~' TOK_NUM |
>       TOK_ALPHA ':' '[' TOK_NUM TOK_TO TOK_NUM ']' |
>       TOK_ALPHA ':' '{' TOK_ALPHA TOK_TO TOK_ALPHA '}' |
>       '+'TOK_ALPHA |
>       '-'TOK_ALPHA
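To make the "structured representation instead of plain text" point a bit more concrete, here is a minimal Python sketch of a recursive-descent parser for only the boolean layer of the grammar quoted above (AND, OR, NOT, parentheses, plain terms and field:value terms). It is not Lucene's or Elasticsearch's parser. The names `tokenize`, `Parser` and `parse_query` and the nested-dict output shape are hypothetical, chosen for illustration. The quoted grammar does not state operator precedence, so the sketch assumes NOT binds tighter than AND, and AND tighter than OR. Fuzzy, boost, range, phrase and +/- terms are left out.

```python
import re

# Parentheses, a colon, or a run of word/wildcard characters.
# Assumption: anything else (e.g. '~', '^', quotes) is silently skipped.
TOKEN_RE = re.compile(r"[():]|[\w*?]+")
RESERVED = {"(", ")", ":", "AND", "OR", "NOT"}


def tokenize(text):
    """Split the query string into a flat list of tokens."""
    return TOKEN_RE.findall(text)


class Parser:
    """Hypothetical parser; assumed precedence is NOT > AND > OR."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        tok = self.peek()
        if tok is None:
            raise ValueError("unexpected end of query")
        self.pos += 1
        return tok

    def word(self):
        # A term or field name: any token that is not an operator or delimiter.
        tok = self.take()
        if tok in RESERVED:
            raise ValueError(f"unexpected token {tok!r}")
        return tok

    def parse(self):
        node = self.or_expr()
        if self.peek() is not None:
            raise ValueError(f"unexpected token {self.peek()!r}")
        return node

    def or_expr(self):
        parts = [self.and_expr()]
        while self.peek() == "OR":
            self.take()
            parts.append(self.and_expr())
        return parts[0] if len(parts) == 1 else {"or": parts}

    def and_expr(self):
        parts = [self.not_expr()]
        while self.peek() == "AND":
            self.take()
            parts.append(self.not_expr())
        return parts[0] if len(parts) == 1 else {"and": parts}

    def not_expr(self):
        if self.peek() == "NOT":
            self.take()
            return {"not": self.not_expr()}
        if self.peek() == "(":
            self.take()
            node = self.or_expr()
            if self.take() != ")":
                raise ValueError("expected ')'")
            return node
        return self.term()

    def term(self):
        first = self.word()
        if self.peek() == ":":
            self.take()
            return {"term": {"field": first, "value": self.word()}}
        # No explicit field: leave it as None (a default field is up to the caller).
        return {"term": {"field": None, "value": first}}


def parse_query(text):
    """Parse a query string into a nested dict (illustrative shape only)."""
    return Parser(tokenize(text)).parse()
```

For example, `parse_query("title:lucene AND (parser OR NOT yacc)")` returns a nested dict with an `"and"` node at the top. A tree like this can then be mapped onto whichever Query classes an application actually needs, which is roughly the role a structured DSL plays.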
