[ https://issues.apache.org/jira/browse/LUCENE-5014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13699417#comment-13699417 ]
Roman Chyla commented on LUCENE-5014:
-------------------------------------

New addition: a Solr qparser plugin. Unfortunately it is not as easy as one may think, because of the various defaults: e.g. a user may want to specify a different defaultField, whether wildcards are allowed at the beginning, what the maximum range for proximity values is... Some of these should be set only in solrconfig.xml, and some also in query params.

So here is a stab at it. It works, but may require more config options. There is also a new unit test. Unfortunately the Ivy mirrors decided not to work right now (ughhh), so I could not run the Solr unit tests; I hope they pass. Lucene's 'ant test' went fine.

If somebody wants to try it in Solr, please make sure you have antlr-runtime.jar in your Solr libs. This should go inside solrconfig.xml:

{code}
<queryParser name="lucene2" class="AqpLuceneQParserPlugin">
  <lst name="defaults">
    <str name="defaultField">text</str>
  </lst>
</queryParser>
{code}

> ANTLR Lucene query parser
> -------------------------
>
>                 Key: LUCENE-5014
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5014
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/queryparser, modules/queryparser
>    Affects Versions: 4.3
>         Environment: all
>            Reporter: Roman Chyla
>              Labels: antlr, query, queryparser
>         Attachments: LUCENE-5014.txt, LUCENE-5014.txt, LUCENE-5014.txt,
> LUCENE-5014.txt, LUCENE-5014.txt
>
>
> I would like to propose a new way of building query parsers for Lucene.
> Currently, most Lucene parsers are hard to extend because they are either
> written in Java (i.e. the SOLR query parser, or edismax) or the parsing logic
> is 'married' with the query building logic (i.e. the standard Lucene parser,
> generated by JavaCC), which makes any extension really hard.
> A few years back, Lucene got the contrib/modern query parser (later renamed to
> 'flexible'), yet that parser didn't become a star (it must be very confusing
> for many users). However, that parsing framework is very powerful!
> And it is a real pity that there aren't more parsers already using it,
> because it allows us to add/extend/change almost any aspect of the query
> parsing.
> So, if we combine ANTLR + queryparser.flexible, we can get a very powerful
> framework for building almost any query language one can think of. And I hope
> this extension can become useful.
> The details:
>  - every new query syntax is written in EBNF and lives in separate files (it
> can be tested/developed independently, using 'gunit')
>  - ANTLR generates the parsing code (it can generate parsers in several
> languages; the main target is Java, but it can also do Python, which may be
> interesting for pylucene)
>  - the parser generates an AST (abstract syntax tree) which is consumed by a
> 'pipeline' of processors; users can easily modify this pipeline to add the
> desired functionality
>  - the new parser contains a few (very important) debugging functions: it can
> print the results of every stage of the build and generate ASTs as graphical
> charts; ant targets help to build/test/debug grammars
>  - I've tried to reuse the existing queryparser.flexible components as much
> as possible, only adding new processors when necessary
> Assumptions about the grammar:
>  - every grammar must have one top parse rule called 'mainQ'
>  - parsers must generate an AST (Abstract Syntax Tree)
> The structure of the AST is left open. There are components which make
> assumptions about the shape of the AST (i.e. that MODIFIER is the parent of a
> FIELD); however, users are free to choose/write different processors with
> different assumptions about the AST shape.
> More documentation on how to use the parser can be seen here:
> http://29min.wordpress.com/category/antlrqueryparser/
> The parser was created more than a year ago and is used in production
> (http://labs.adsabs.harvard.edu/adsabs/).
> Different dialects of query languages (with proximity operators, functions,
> special logic, etc.) can be seen here:
> https://github.com/romanchyla/montysolr/tree/master/contrib/adsabs
> https://github.com/romanchyla/montysolr/tree/master/contrib/invenio
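
For readers who want a picture of the "AST consumed by a pipeline of processors" idea from the issue description, here is a minimal, self-contained Java sketch. It is not the code from the attached patch and it does not use the queryparser.flexible API: Node, mapTree, defaultField, lowercaseTerms and run are hypothetical names invented for illustration, and the tree shape (MODIFIER as parent of FIELD) only mirrors the assumption mentioned in the description.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.function.UnaryOperator;

public class AstPipelineSketch {

    /** Hypothetical AST node: a label (e.g. MODIFIER, FIELD, TERM), optional text, children. */
    static final class Node {
        final String label;
        final String text;
        final List<Node> children = new ArrayList<>();

        Node(String label, String text, Node... kids) {
            this.label = label;
            this.text = text;
            this.children.addAll(Arrays.asList(kids));
        }

        @Override
        public String toString() {
            StringBuilder sb = new StringBuilder("(").append(label);
            if (text != null) sb.append(' ').append(text);
            for (Node c : children) sb.append(' ').append(c);
            return sb.append(')').toString();
        }
    }

    /** Rebuilds the tree bottom-up, applying f to every node after its children. */
    static Node mapTree(Node n, UnaryOperator<Node> f) {
        Node[] kids = new Node[n.children.size()];
        for (int i = 0; i < kids.length; i++) {
            kids[i] = mapTree(n.children.get(i), f);
        }
        return f.apply(new Node(n.label, n.text, kids));
    }

    /** Processor: gives FIELD nodes without a name the configured default field. */
    static UnaryOperator<Node> defaultField(String field) {
        return root -> mapTree(root, n -> {
            if (n.label.equals("FIELD") && n.text == null) {
                return new Node("FIELD", field, n.children.toArray(new Node[0]));
            }
            return n;
        });
    }

    /** Processor: lowercases the text of TERM nodes. */
    static UnaryOperator<Node> lowercaseTerms() {
        return root -> mapTree(root, n -> {
            if (n.label.equals("TERM") && n.text != null) {
                return new Node("TERM", n.text.toLowerCase(Locale.ROOT),
                        n.children.toArray(new Node[0]));
            }
            return n;
        });
    }

    /** Runs the processors in order, printing the tree after every stage for debugging. */
    static Node run(Node root, List<UnaryOperator<Node>> pipeline) {
        Node current = root;
        for (UnaryOperator<Node> p : pipeline) {
            current = p.apply(current);
            System.out.println(current);
        }
        return current;
    }

    public static void main(String[] args) {
        // Hand-built tree standing in for what a generated parser might emit for "+Foo".
        Node ast = new Node("MODIFIER", "+", new Node("FIELD", null, new Node("TERM", "Foo")));
        run(ast, Arrays.asList(defaultField("text"), lowercaseTerms()));
    }
}
```

Adding, removing or reordering entries in the list passed to run is the sketch's stand-in for "modifying the pipeline"; a processor that expects a different tree shape would simply match on different labels.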