On Mar 29, 2010, at 10:11 AM, mark harwood wrote:

> > Of course, but what about the Lucene doc id doesn't provide that?
>
> The question being how you determine the correct doc id to use in the first
> place (especially when they are known to be volatile) - the current answer is
> to use a stable identifier term which your app holds in the index, AKA a
> primary key.
> To support single-doc updates, app developers currently have to:
> a) allocate keys uniquely
> b) ensure they do not store >1 document with the same key.
>
> My suggestion was that, these being fundamental requirements for supporting
> updates, Lucene could, as a convenience, provide some support for this in its
> API - in the same way a database typically does.

I don't think Lucene needs a primary key. I don't see why this number can't be determined in the usual ways.

>
> Earwin has perhaps extended your (and my) original thinking to incorporate
> set-based updates (a single set of values applied to many documents which
> match a query).
> His proposal (correct me if I'm wrong, Earwin) is that single and set-based
> changes could both be supported by a single
> IndexWriter.updateDocuments(query, changedFields) type method.
> The benefit of this scheme is that we are providing a simple method, re-using
> established concepts (Queries for document selection) but this does not
> change the fact that many users will still need to use primary keys for
> single-doc updates and they have to assume responsibility for a) and b) above.

Hmmm, this sounds like the Parallel Incr. Indexing Busch has put up in a patch.

>
> On reflection, I guess these responsibilities are not too tough.
> a) is catered for by the fact that Lucene is not typically the master data
> store (yet!) and filesystem/webserver/database datasources where document
> content is sourced usually have the responsibility to allocate some form of
> unique identifier in the form of URLs, database keys or filenames which can
> be used. Also, b) is not too hard to handle in app code if you always use the
> IndexWriter.updateDocument(term,doc) method for inserts.
>
>
> Cheers,
> Mark
>
> From: Grant Ingersoll <gsing...@apache.org>
> To: java-dev@lucene.apache.org
> Sent: Mon, 29 March, 2010 13:11:56
> Subject: Re: Incremental Field Updates
>
>
> On Mar 29, 2010, at 2:26 AM, Mark Harwood wrote:
>
>>
>>>
>>>> Of course introducing the idea of updates also introduces the notion of a
>>>> primary key and there's probably an entirely separate discussion to be had
>>>> around user-supplied vs Lucene-generated keys.
>>>
>>> Not sure I see that need. Can you explain your reasoning a bit more?
>>>>
>>
>> If you want to update a document you need a way of expressing *which*
>> document you are updating.
>
> Of course, but what about the Lucene doc id doesn't provide that?
>
>

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem using Solr/Lucene:
http://www.lucidimagination.com/search