Hi Tommaso,
It will depend on how different your target syntax will be. If you extend
the classic parser (or QueryParserBase), there is a fair amount of overhead
and extras that you might not want or need. On the other hand, the query
syntax and the methods will be familiar to the Lucene community, and there is a
large number of test cases already built for you. On the third hand, if you
need to modify the low-level parsing code, you'll have to be familiar with
JavaCC.
There's the "flexible" family that should allow for easy modifications, and
the "xml" family could offer an easy interface between a custom lexer and a
parser. The SimpleQueryParser offers a model of building something fairly
simple and yet very elegant from scratch.
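To make the "from scratch" option a bit more concrete, here is a minimal sketch in plain Java. It is not the SimpleQueryParser code and it uses no Lucene classes: TinyQueryParser, Clause and Occur are names invented for illustration, and the syntax (+term, -term, "a phrase") is only an assumed example. A real parser would build Query objects and run terms through an Analyzer at the point where this one just records the text.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, Lucene-free sketch of a small hand-written query parser. */
final class TinyQueryParser {

    /** How a clause participates in matching (local enum, not Lucene's). */
    enum Occur { SHOULD, MUST, MUST_NOT }

    /** One parsed clause: its operator, its text, and whether it was quoted. */
    static final class Clause {
        final Occur occur;
        final String text;
        final boolean phrase;

        Clause(Occur occur, String text, boolean phrase) {
            this.occur = occur;
            this.text = text;
            this.phrase = phrase;
        }

        @Override
        public String toString() {
            String prefix = occur == Occur.MUST ? "+" : occur == Occur.MUST_NOT ? "-" : "";
            return prefix + (phrase ? "\"" + text + "\"" : text);
        }
    }

    /** Parses input such as: +apache -solr "query parser" lucene */
    static List<Clause> parse(String input) {
        List<Clause> clauses = new ArrayList<>();
        int i = 0;
        int n = input.length();
        while (i < n) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
                continue;
            }

            // Optional leading operator; a dangling one is silently dropped.
            Occur occur = Occur.SHOULD;
            if (c == '+' || c == '-') {
                occur = (c == '+') ? Occur.MUST : Occur.MUST_NOT;
                i++;
                if (i >= n) {
                    break;
                }
                c = input.charAt(i);
            }

            if (c == '"') {
                // Phrase: read to the closing quote, or to end of input if unterminated.
                int end = input.indexOf('"', i + 1);
                if (end < 0) {
                    end = n;
                }
                String text = input.substring(i + 1, end);
                if (!text.isEmpty()) {
                    clauses.add(new Clause(occur, text, true));
                }
                i = Math.min(end + 1, n);
            } else {
                // Bare term: read up to the next whitespace.
                int start = i;
                while (i < n && !Character.isWhitespace(input.charAt(i))) {
                    i++;
                }
                if (i > start) {
                    clauses.add(new Clause(occur, input.substring(start, i), false));
                }
            }
        }
        return clauses;
    }
}
```

Calling TinyQueryParser.parse on the sample input in the Javadoc comment yields four clauses: a MUST term, a MUST_NOT term, a phrase and a plain SHOULD term. Malformed input (an unterminated quote, a stray operator) is tolerated rather than rejected, which is a design choice you may or may not want for your syntax.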
In deciding where to start, another consideration might include how easy it
will be to integrate at the Solr level. Make sure to include field-based hooks
for processing multiterms, prefix and range queries.
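By "field-based hooks" I mean something along these lines. This is only a sketch with invented names (FieldHookRegistry, FieldHooks, LowercasingHooks) and it returns strings instead of Lucene Query objects so that it stays self-contained. The assumption behind it is that prefix and range terms usually do not go through the full analysis chain, so each field needs a place to plug in its own normalization.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch: per-field hooks a parser consults for special query types. */
final class FieldHookRegistry {

    /** Default hooks: terms are passed through untouched. */
    static class FieldHooks {
        /** Normalizes a raw term before it is used in a multiterm query. */
        String normalize(String term) {
            return term;
        }

        String prefix(String field, String term) {
            return field + ":" + normalize(term) + "*";
        }

        /** Square brackets mark an inclusive bound, curly braces an exclusive one. */
        String range(String field, String lower, String upper,
                     boolean includeLower, boolean includeUpper) {
            return field + ":" + (includeLower ? "[" : "{")
                + normalize(lower) + " TO " + normalize(upper)
                + (includeUpper ? "]" : "}");
        }
    }

    /** Example override for a field whose indexed terms are lowercased. */
    static final class LowercasingHooks extends FieldHooks {
        @Override
        String normalize(String term) {
            return term.toLowerCase(Locale.ROOT);
        }
    }

    private final FieldHooks defaults = new FieldHooks();
    private final Map<String, FieldHooks> byField = new HashMap<>();

    void register(String field, FieldHooks hooks) {
        byField.put(field, hooks);
    }

    /** Returns the hooks registered for a field, or the defaults if none were. */
    FieldHooks hooksFor(String field) {
        return byField.getOrDefault(field, defaults);
    }
}
```

The parser itself then only ever calls hooksFor(field).prefix(...) or hooksFor(field).range(...), and whatever sits at the Solr level can register field-specific behavior without touching the grammar.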
For LUCENE-5205, I eventually chose to subclass QueryParserBase, and I had to
override a fair amount of code because every terminal had to be a SpanQuery -
most of the queryparser infrastructure is built for traditional queries.
So, what features do you want to add for mlt? What capabilities do you need?
Cheers,
Tim
From: Tommaso Teofili [mailto:[email protected]]
Sent: Thursday, March 06, 2014 6:23 AM
To: [email protected]
Subject: Suggestions about writing / extending QueryParsers
Hi all,
I'm thinking about writing/extending a QueryParser for MLT queries. I've never
really looked into that code much; now that I'm doing so, I'm wondering
if anyone has suggestions on how to get started.
Should I write a new grammar for that? Or can I just extend an existing
grammar / class?
Thanks in advance,
Tommaso