Yes, custom-coded Queries allow you to do that. Whatever cannot be done
through Queries can be done using Filters (but Queries are cheaper).
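
To make the Query-then-Filter split concrete, here is a minimal sketch in
plain C++. It does not use the CLucene API: Doc, matchTerm and
filterMinOccurrences are hypothetical names invented for illustration, and
documents are modelled as simple token lists. The "query" step finds
documents containing a term, and the "filter" step narrows those hits to
documents where the term occurs at least N times.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; none of these are CLucene classes.
using Doc = std::vector<std::string>;  // a document modelled as a token list

// "Query" step: ids of the documents that contain the term at least once.
std::vector<std::size_t> matchTerm(const std::vector<Doc>& docs,
                                   const std::string& term) {
    std::vector<std::size_t> hits;
    for (std::size_t id = 0; id < docs.size(); ++id) {
        for (const std::string& tok : docs[id]) {
            if (tok == term) {
                hits.push_back(id);
                break;  // one occurrence is enough for the query step
            }
        }
    }
    return hits;
}

// "Filter" step: keep only the hits in which the term occurs at least
// minCount times.
std::vector<std::size_t> filterMinOccurrences(
        const std::vector<Doc>& docs,
        const std::vector<std::size_t>& hits,
        const std::string& term,
        std::size_t minCount) {
    std::vector<std::size_t> kept;
    for (std::size_t id : hits) {
        assert(id < docs.size());  // hits must refer to existing documents
        std::size_t count = 0;
        for (const std::string& tok : docs[id]) {
            if (tok == term) ++count;
        }
        if (count >= minCount) kept.push_back(id);
    }
    return kept;
}
```

In this toy model the filter re-scans every hit, which is one way to see why
pushing a condition into the query itself tends to be cheaper when that is
possible.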

Implementing your own index input/output classes is possible as well, by
writing your own Directory class with accompanying IndexInput/IndexOutput
classes if necessary. Depending on what exactly you're after, you may be a
bit limited by the core structure. The core could be amended to support edge
cases if that is required. Isidor has done a bit of work in that area; have
a look at his branch (isidor_working) in our git repository, and at the
directory_refactor and mpi branches. Specifically, you might find commits
802e6b5b4b395010a253cd3a42fcb9904dd37db9 and
312185204d54367db77f5b8538e8cf42c5d6d594 useful.
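
As a rough illustration of how the three roles divide up, here is a
self-contained sketch in plain C++. It is not the CLucene interface: Storage,
MemoryStorage, Writer and Reader are hypothetical, heavily simplified
stand-ins for the roles played by Directory, IndexOutput and IndexInput, and
the real base classes declare more methods than shown here, so check the
CLucene headers before deriving from them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins; these are NOT the CLucene classes.
using Bytes = std::vector<std::uint8_t>;

// Role of a Directory: a set of named files.
class Storage {
public:
    virtual ~Storage() = default;
    virtual Bytes& open(const std::string& name) = 0;  // creates if missing
    virtual bool exists(const std::string& name) const = 0;
};

// One possible backend: everything is kept in memory.
class MemoryStorage : public Storage {
    std::map<std::string, Bytes> files_;
public:
    Bytes& open(const std::string& name) override { return files_[name]; }
    bool exists(const std::string& name) const override {
        return files_.count(name) != 0;
    }
};

// Role of an IndexOutput: sequential writes to one file.
class Writer {
    Bytes& data_;
public:
    explicit Writer(Bytes& data) : data_(data) {}
    void writeByte(std::uint8_t b) { data_.push_back(b); }
    void writeUInt32(std::uint32_t v) {  // 4 bytes, big-endian
        for (int shift = 24; shift >= 0; shift -= 8)
            writeByte(static_cast<std::uint8_t>((v >> shift) & 0xFF));
    }
};

// Role of an IndexInput: positioned reads from one file.
class Reader {
    const Bytes& data_;
    std::size_t pos_ = 0;
public:
    explicit Reader(const Bytes& data) : data_(data) {}
    void seek(std::size_t pos) {
        assert(pos <= data_.size());
        pos_ = pos;
    }
    std::uint8_t readByte() {
        assert(pos_ < data_.size());
        return data_[pos_++];
    }
    std::uint32_t readUInt32() {  // inverse of Writer::writeUInt32
        std::uint32_t v = 0;
        for (int i = 0; i < 4; ++i) v = (v << 8) | readByte();
        return v;
    }
};
```

A different backend would only need another Storage subclass; the Writer and
Reader code stays the same, which is the point of putting the storage
mechanism behind its own class.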

A word of advice - don't ever use a DB to store index data. This will
significantly slow down your implementation.

Itamar.

-----Original Message-----
From: Paul J. Lucas [mailto:p...@lucasmail.org] 
Sent: Tuesday, January 05, 2010 10:28 PM
To: clucene-developers@lists.sourceforge.net
Subject: Re: [CLucene-dev] Using CLucene to implement XQuery full-text
search

No, I do not mean parsing a query string.  There is a separate parser that
parses the full XQuery language of which full-text searching is a small
part.  The parser builds an AST.  It will then be my job to walk the AST and
look for full-text query nodes.  I then convert those into a Query object by
using the various query classes, e.g., SpanQuery.  Hence, I would not be
using the built-in CLucene QueryParser.
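
The AST walk described above could look roughly like the following sketch.
Everything in it is hypothetical and invented for illustration: AstNode stands
in for whatever the XQuery parser produces, and QueryNode stands in for the
query objects that would be built from the library's query classes. It
assumes a C++17 compiler.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical types; not part of any XQuery parser or of CLucene.
struct AstNode {
    enum Kind { Other, FtWord, FtAnd, FtOr };
    Kind kind = Other;              // Other = not a full-text node
    std::string text;               // the word, for FtWord nodes
    std::vector<AstNode> children;
};

struct QueryNode {
    enum Kind { Term, And, Or };
    Kind kind = Term;
    std::string term;                                  // for Term nodes
    std::vector<std::unique_ptr<QueryNode>> clauses;   // for And/Or nodes
};

// Convert one full-text subtree into a query tree.
std::unique_ptr<QueryNode> toQuery(const AstNode& n) {
    assert(n.kind != AstNode::Other);
    auto q = std::make_unique<QueryNode>();
    if (n.kind == AstNode::FtWord) {
        q->kind = QueryNode::Term;
        q->term = n.text;
        return q;
    }
    q->kind = (n.kind == AstNode::FtAnd) ? QueryNode::And : QueryNode::Or;
    for (const AstNode& child : n.children)
        q->clauses.push_back(toQuery(child));
    return q;
}

// Walk the whole AST and collect one query per full-text subtree.
void collectQueries(const AstNode& n,
                    std::vector<std::unique_ptr<QueryNode>>& out) {
    if (n.kind != AstNode::Other) {
        out.push_back(toQuery(n));
        return;
    }
    for (const AstNode& child : n.children)
        collectQueries(child, out);
}
```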

What I want to know is if the framework for CLucene is expressive enough to
be able to handle the criteria I listed below.  To elaborate: will I be able
to construct/perform queries that:

+ have a word or phrase that occurs at least/most N times
+ can use or not use stop-words on a query-by-query basis
+ can use wildcards
+ can be case-sensitive or insensitive
+ can be diacritical-mark-sensitive or insensitive
+ can keep track of things like sentences and paragraphs

Yes, I am willing to write my own derived classes.

Also, I thought of another question: can I implement my own index
input/output classes so that CLucene stores/retrieves its index data using a
mechanism of my choice?  I.e., if I didn't want to use the binary index
files that CLucene normally creates but instead wanted to store the data
inside blobs in a SQL database, could I do that by writing my own classes?

- Paul


On Jan 5, 2010, at 11:43 AM, Itamar Syn-Hershko wrote:

> Hi Paul,
> 
> What do you mean by that?
> 
> CLucene should be handed a Query object to perform a search,
> which could then be filtered using a Filter object (this allows for
> more complex searches). Producing a Query object from a plain-text
> string is done by using a QueryParser; a default one comes with
> CLucene itself, but anyone can implement their own should the need
> arise. The same goes for Queries and Filters - you can create your own
> derived class to perform searches the way you want.
> 
> So if by "hand-coded" you mean creating your own classes, then yes, it
> can.
> 
> Itamar. 
> 
> -----Original Message-----
> From: Paul J. Lucas [mailto:p...@lucasmail.org]
> Sent: Monday, January 04, 2010 10:20 PM
> To: clucene-developers@lists.sourceforge.net
> Subject: [CLucene-dev] Using CLucene to implement XQuery full-text 
> search
> 
> Hi -
> 
> I'm looking at CLucene to implement the full-text search feature of
> XQuery:
> 
>       http://www.w3.org/TR/xpath-full-text-10/
> 
> Its query abilities are the most complicated I've seen.  Specifically,
> it allows one to specify the following as part of a query:
> 
> + occurs at {least|most} {N} times
> + {with|without} stop words
> + {with|without} wildcards
> + case sensitive | lowercase | uppercase
> + diacritics insensitive
> + {same|different} {sentence|paragraph}
> + at {start|end} | entire content
> 
> Can CLucene do all that if the queries are hand-coded?  Thanks.
> 
> - Paul

------------------------------------------------------------------------------
This SF.Net email is sponsored by the Verizon Developer Community Take
advantage of Verizon's best-in-class app development support A streamlined,
14 day to market process makes app distribution fast and easy Join now and
get one step closer to millions of Verizon customers
http://p.sf.net/sfu/verizon-dev2dev
_______________________________________________
CLucene-developers mailing list
CLucene-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/clucene-developers


