On Jul 11, 2008, at 22:29, Damien Katz wrote:
CouchDB needs to integrate full-text indexing support. We should be
able to support multiple full text engines, but our reference
implementation will be Apache Lucene.
Initially (I'm hoping for 0.9.0) we should be able to index all
documents and their attachments (for types that Lucene can index
anyway) and return queries against that index via. Jan has begun
this work and I think someone has this mostly working now somewhere,
but it's not in trunk?
We have a patch that improves the API here:
https://issues.apache.org/jira/browse/COUCHDB-74
and there is the
http://svn.apache.org/repos/asf/incubator/couchdb/branches/lucene-search/
branch that this patch should be applied to. Further work should
continue there. At this point the only difference between trunk and
the branch is the addition of the /db/_search API call. The branch
might also need to be brought up to date with trunk. It has no current
maintainer, although Paul Davis voiced interest in pushing this
forward. Also, there were attempts at adding other search engines, but
they never surfaced. If I remember correctly, the problem that views
cannot be searched without expanding the view server stopped most
work.
By 1.0, we should also do view intersections with full text
results. At query time, CouchDB gets back a list of matching
documents and then finds the emitted view rows from those documents,
and returns them sorted by relevance score. This will require some
enhancements to the internal view API, but the data and required
index (view keys by doc id) already exist to make this efficient.
I opened a bug report for this.
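
To make the view intersection idea concrete, here is a rough sketch in
Python of the query-time step described above. It is not CouchDB code:
the function name, the (doc_id, score) hit format and the
rows_by_doc_id mapping are all made up for illustration, and the real
thing would live behind the internal view API.

```python
def intersect_view_with_fulltext(hits, rows_by_doc_id):
    """Return the view rows emitted by matching docs, best score first.

    hits           -- iterable of (doc_id, score) pairs from the full-text
                      engine (hypothetical format, assumed for this sketch)
    rows_by_doc_id -- dict mapping doc_id to a list of (key, value) view
                      rows; stands in for the "view keys by doc id" index
    """
    results = []
    for doc_id, score in hits:
        # A matching document may have emitted zero, one, or many view rows.
        for key, value in rows_by_doc_id.get(doc_id, []):
            results.append(
                {"id": doc_id, "key": key, "value": value, "score": score}
            )
    # Python's sort is stable, so rows from the same document keep the
    # order in which they were stored.
    results.sort(key=lambda row: row["score"], reverse=True)
    return results


if __name__ == "__main__":
    hits = [("a", 0.2), ("b", 0.9), ("c", 0.5)]
    rows_by_doc_id = {"a": [("k1", 1)], "b": [("k2", 2), ("k3", 3)]}
    for row in intersect_view_with_fulltext(hits, rows_by_doc_id):
        print(row)
```

Documents that match the full-text query but emitted nothing into the
view (like "c" in the example) simply drop out of the result.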
--
Since I started the work on Lucene, I am, by the usual open source
definition, somewhat responsible for keeping this alive. But I'd rather
not be, at least for the Java side of things. If somebody (heya Paul,
still in?) wants to take this over, that'd be mighty cool.
Cheers
Jan
--
Perhaps not initially, but eventually the integration of the
full-text engine will be done as proper CouchDB HTTP and daemon
plug-ins (once those APIs are established).
On Jul 2, 2008, at 3:08 AM, Jan Lehnardt wrote:
Hello everybody,
this thread is meant to collect missing work items (features and
bugs) for our 1.0 release and to discuss how to split them up
between 0.9 and 1.0.
Take it away: Damien.
Cheers
Jan
--