Having a query language that can express all the query capabilities
supported by Lucene would let programmers construct a string
representation of a query and then 'compile' it with the query parser.
That way, one doesn't have to learn the internal representation of the
various query classes.

For example, if the search form of the application contains the fields
'any word', 'all words', 'phrase', and 'subject', the application can
collect the values from the user, construct a string that describes the
desired query, and hand that string to the query parser to build the
actual query.
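
Something like the rough sketch below, just to illustrate the idea (the
package names, the static QueryParser.parse() entry point, and the way
the form values are handled here are my own guesses and may not match
what is actually in CVS):

  import java.util.StringTokenizer;
  import org.apache.lucene.analysis.Analyzer;
  import org.apache.lucene.queryParser.QueryParser;
  import org.apache.lucene.search.Query;

  public class FormQueryBuilder {
      /** Turns the values of the search form into one query string
          and 'compiles' it with the query parser. */
      public static Query fromForm(String anyWord, String allWords,
                                   String phrase, String subject,
                                   Analyzer analyzer) throws Exception {
          StringBuffer buf = new StringBuffer();
          // 'any word' terms: optional by default
          for (StringTokenizer t = new StringTokenizer(anyWord); t.hasMoreTokens(); )
              buf.append(t.nextToken()).append(' ');
          // 'all words' terms: each one required
          for (StringTokenizer t = new StringTokenizer(allWords); t.hasMoreTokens(); )
              buf.append('+').append(t.nextToken()).append(' ');
          // 'phrase': quoted, so the parser builds a phrase query
          if (phrase.length() > 0)
              buf.append('"').append(phrase).append("\" ");
          // 'subject': restricted to the subject field
          if (subject.length() > 0)
              buf.append("+subject:").append(subject).append(' ');
          // 'compile' the string with the same analyzer used at indexing time
          return QueryParser.parse(buf.toString().trim(), "body", analyzer);
      }
  }

The application then only deals with one string in one well-defined
syntax, no matter how many fields the form grows to.
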
Tal
> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED]]On Behalf Of Brian Goetz
> Sent: Wednesday, June 13, 2001 1:06 PM
> To: Tal Dayan
> Cc: [EMAIL PROTECTED]
> Subject: Re: [Lucene-dev] New QueryParser
>
>
> > Brian, here is another idea for the query parser: add the ability
> > to mark terms as 'non-analyzed'.
> >
> > For example,
> >
> > +body:xyz +folder:a.b.c.d
> >
> > when 'folder' is a non-tokenized field, will not match if a.b.c.d is
> > tokenized.
> >
> > A possible syntax may be
> >
> > +body:xyz +folder:'a.b.c.d'
>
> I understand the desire for such a feature (someone else suggested the
> same thing.) I am very wary of creating "new syntax" about which
> you'll have to educate your users. I know it sounds like you're only
> asking for one feature, but if you think it'll be the last "special
> case" that someone wants, well, I don't believe you. I can't think of
> any syntax that will clearly and unambiguously indicate "no
> tokenization please."
>
> It's one thing to add a syntax for the boost stuff, which only very
> advanced users will use, but this is something that might be expected
> of relatively beginning users -- "you have to put the author's name in
> single quotes, but the article title in double quotes." No way.
>
> I think the request for this underscores an issue that's been bugging
> me for a while -- since it's so important that you use the same
> analyzer for queries as for indexing, maybe the analyzer should
> actually be stored in the index store.
>
> I could see two ways to address this issue:
>
> 1 (complicated way): When the index store is created, register an
> analyzer for each field (could be the same one.) A serialized copy of
> the analyzer is stored in the index base, and queries on that field
> are automatically processed with it.
>
> 2 (simpler, less complete way): Have a way of telling the query parser
> that "these fields use these analyzers", or at the very least, "these
> fields don't get tokenized with an analyzer."
>
>
> > BTW, it would be great if the syntax of the query parser allowed
> > one to describe any query that is supported by Lucene's standard
> > classes. This would provide a common language for describing queries
> > and an alternative, more intuitive, way to construct queries.
>
> Nice goal, and I'm happy to try for it if practical, but I think a
> more important rule is that the syntax should be simple and hard to
> mess up. I would -1 adding any syntax which will only be used by 5%
> of the users, but which might confuse the other 95%, and the same with
> any syntax which will be widely used but which requires more than a
> sentence or two of explanation to the "average user." Remember, the
> people who create these queries are used to using Google; we should
> support a query language which is familiar (or at least easily
> explained) to those users. Advanced users can still create their own
> with the query classes.
>
>
>
_______________________________________________
Lucene-dev mailing list
[EMAIL PROTECTED]
http://lists.sourceforge.net/lists/listinfo/lucene-dev