Hi Nils,

I am currently rewriting the indexing and basing it on the prior discussion here. In that design the indexing is done on views, not on documents (your second suggestion). The views to index are listed in a special design document (_design/fulltextsearch).

My problem is that CouchDB only tells me which database has changed. With that information I can get the changed documents, but now I need to run the views to index on these documents (and preferably these documents only) in order to get what I need to index. I believed I could filter on the document ID (startkey_docid), but that is apparently only supported in conjunction with startkey (which I do not know, as I haven't run the view on the document yet).

If you haven't followed the discussion on the fulltext design document, you can find a summary on the wiki: http://wiki.apache.org/couchdb/FullTextSearch

Have fun
Søren

On Fri, April 11, 2008 01:13, Nils Adermann wrote:
> Hi Søren,
>
> I'm not entirely sure I understand your problem, but to me it looks like
> you assume that every view result is tied to exactly one document. This
> is not the case. Right now there can be multiple results from a single
> document, and once we have reduce there can be view results that depend
> on any number of documents. That's why Jan's basic fulltext support on a
> database level cannot be used to retrieve a subset of a view without
> recomputing the entire view ad hoc. There are only two ways I see right now:
>
> 1. Recompute the views based on the found documents
>    - Works for small result sets, otherwise it's probably too slow
>    - Requires the fulltext indexer to be able to index documents with any
>      number of differing arbitrary structures containing any amount of text
>      values
>
> 2. Index view results
>    - If not limited this would probably create too many and too big
>      indexes, therefore we would need a view setting _fulltext_index that
>      indicates that a view should be indexed. You would use this setting if
>      your application plans to search results of that view.
>    - In order to allow fulltext indexers which require a fixed structure
>      for all documents to index a CouchDB view, you could go even further:
>      Define a structure without data in the view specification that informs
>      the indexer which format it can expect all view results will follow.
>      This could also be used to indicate which resulting values really
>      contain text that needs to be indexed at all.
>
> Cheers
> Nils
>
> Søren Hilmer wrote:
>>>> 2. startkey_docid does not seem to work, the first document in the
>>>> view is always returned.
>>>>
>>> startkey_docid needs to be combined with startkey to work correctly. I
>>> don't think it's even applied when there's no startkey.
>>>
>>
>> Ahh, this is very unfortunate. Say you know the document_id of a changed
>> document, but not necessarily the view-key, then you have no way of
>> getting what the view will return for that specific document.
>>
>> This is the situation for the indexer: CouchDB will notify it with which
>> DB has changed, the indexer knows the previous update-sequence and gets
>> all documents newer, but it needs to index the views specified for
>> indexing, and thus run the view for the changed documents only, but as
>> it has not got the view-key in this situation, it is out of luck.
>>
>> The wiki for HttpViewApi says "For efficient paging use startkey and/or
>> startkey_docid."
>>
>> Are you sure this does not classify as a bug? Is there something I am
>> missing?
>>
>> Have fun
>> Søren
>>
>>> Cheers,
>>> --
>>> Christopher Lenz
>>> cmlenz at gmx.de
>>> http://www.cmlenz.net/

-- 
Søren Hilmer, M.Sc., M.Crypt.
wideTrail               Phone: +45 25481225
Pilevænget 41           Email: [EMAIL PROTECTED]
DK-8961 Allingåbro      Web: www.widetrail.dk
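
For readers following the thread, here is a minimal sketch of one possible client-side fallback for the situation described above. It is not something proposed in the discussion. It assumes the indexer has already fetched the rows of a map view (each row carrying the usual `id`, `key` and `value` fields) and knows the IDs of the changed documents. The function name `rows_for_changed_docs` and both parameter names are hypothetical. Note the cost: the whole view still has to be read, so this only suits small views, and it does not apply to reduce results, which are not tied to a single document.

```python
def rows_for_changed_docs(view_rows, changed_ids):
    """Group map-view rows by source document, keeping only changed documents.

    view_rows:   iterable of dicts shaped like CouchDB map-view rows,
                 i.e. {"id": ..., "key": ..., "value": ...}
    changed_ids: iterable of document IDs reported as changed
    """
    # Start every changed document with an empty list, so a document that
    # no longer emits any row is still reported (its old index entries
    # can then be dropped by the caller).
    grouped = {doc_id: [] for doc_id in changed_ids}
    for row in view_rows:
        doc_id = row["id"]
        if doc_id in grouped:
            # One document may emit several rows, so collect all of them.
            grouped[doc_id].append((row["key"], row["value"]))
    return grouped
```

A document that emits several rows ends up with several (key, value) pairs, which matches Nils's point that a view result is not tied one-to-one to a document.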
