[ 
http://issues.apache.org/jira/browse/DERBY-472?page=comments#action_12330027 ] 

Daniel John Debrunner commented on DERBY-472:
---------------------------------------------

I'm hoping the work I'm doing in DERBY-571 and beyond will provide the 
framework for integrating Lucene into Derby.
With the query,

SELECT * FROM ARTICLES where CONTAINS(summary, '+Apache +Derby -hat') and 
rank > 0.8 order by rank 

I'm thinking that the CONTAINS function would map to a virtual table 
(an implementation of java.sql.ResultSet) that took the arguments and ran 
them against the Lucene index.
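
As a rough illustration of that idea, here is a minimal, self-contained sketch of what the virtual table behind CONTAINS might compute. Everything in it is an assumption made for illustration: the class ContainsSketch, the Hit row type, and the scoring rule are hypothetical, it scans in-memory strings instead of a Lucene index, and it returns a plain list where the real thing would expose a java.sql.ResultSet. It only shows the shape of the mapping: the query string goes in, and rows carrying a rank come out for the outer query to filter and order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical stand-in for the virtual table behind CONTAINS; not Derby or Lucene API. */
final class ContainsSketch {

    /** One row produced by the virtual table: the matching row's position and its rank. */
    static final class Hit {
        final int rowId;
        final double rank;

        Hit(int rowId, double rank) {
            this.rowId = rowId;
            this.rank = rank;
        }
    }

    /**
     * Evaluates a query such as "+Apache +Derby -hat" against in-memory text.
     * '+' marks a required term, '-' a prohibited term, and a bare term is optional.
     * The rank is a placeholder: the fraction of required and optional terms found.
     * A real index would supply its own relevance score.
     */
    static List<Hit> contains(List<String> rows, String query) {
        Set<String> required = new HashSet<>();
        Set<String> optional = new HashSet<>();
        Set<String> prohibited = new HashSet<>();
        for (String raw : query.trim().split("\\s+")) {
            String term = raw.toLowerCase(Locale.ROOT);
            boolean plus = term.startsWith("+");
            boolean minus = term.startsWith("-");
            String body = (plus || minus) ? term.substring(1) : term;
            if (body.isEmpty()) {
                continue; // skip empty tokens and bare '+' or '-'
            }
            if (plus) {
                required.add(body);
            } else if (minus) {
                prohibited.add(body);
            } else {
                optional.add(body);
            }
        }

        int positive = required.size() + optional.size();
        List<Hit> hits = new ArrayList<>();
        if (positive == 0) {
            return hits; // nothing to match on, so no rows in this sketch
        }
        for (int i = 0; i < rows.size(); i++) {
            // Case-insensitive word set for this row.
            Set<String> words = new HashSet<>(Arrays.asList(
                    rows.get(i).toLowerCase(Locale.ROOT).split("\\W+")));
            if (!words.containsAll(required)) {
                continue; // a required term is missing
            }
            if (!Collections.disjoint(words, prohibited)) {
                continue; // a prohibited term is present
            }
            int matched = required.size();
            for (String term : optional) {
                if (words.contains(term)) {
                    matched++;
                }
            }
            hits.add(new Hit(i, (double) matched / positive));
        }
        return hits;
    }
}
```

With this sketch, the rank predicate and the ordering from the query above would be applied by the SQL layer to the rows that contains(...) returns; the virtual table itself only has to produce the matches and their ranks.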

> Full Text Indexing / Full Text Search
> -------------------------------------
>
>          Key: DERBY-472
>          URL: http://issues.apache.org/jira/browse/DERBY-472
>      Project: Derby
>         Type: New Feature
>   Components: SQL
>     Versions: 10.0.2.0
>  Environment: All environments
>     Reporter: Rick Hillegas

>
> Efficiently support full text search of string datatyped columns. Mag Gam 
> raised this issue on the user's mailing list on 24 July 2005; the email 
> thread is titled 'Full Text Indexing'. Mag wants to see something akin to the 
> functionality in tsearch2 
> (http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/). Dan points out 
> that we may be able to re-use index building technology exposed by the apache 
> Lucene project (http://lucene.apache.org/).
> Presumably we want to build inverted indexes on all string datatyped columns: 
> CHAR, VARCHAR, LONG VARCHAR, CLOB, and their national variants (when they 
> are implemented). We should consider the following additional issues when 
> specifying this feature:
> 1) Do we also want to support text search on XML columns?
> 2) Which human languages do we support initially? Each language has its own 
> rules for lexing words and its own list of "noise" words which should not be 
> indexed. Hopefully, we can plug in some existing packages of lexers and noise 
> filters. We should encourage users to donate additional lexers/filters.
> 3) The CREATE INDEX syntax (for these new inverted indexes)  should let us 
> bind a lexing human language to a string-datatyped column.
> 4) How do we express the search condition? For case-sensitive searches we can 
> get away with boolean expressions built out of standard LIKE clauses. 
> However, in my opinion, case-sensitive searches are an edge case. The more 
> useful situation is a case-insensitive search. Can we get away with 
> introducing a non-standard function here or do we need to push a proposal 
> through the standards committees? Even more useful and non-standard are fuzzy 
> searches, which tolerate bad spellers.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira
