>I can delete by lucene-generated docId. 

Which users previously had to look up by first coding a primary-key-term search. 
Delete-by-term removed that step to make life easier.
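
To make the difference concrete, here is a minimal sketch in plain Java. It is a toy model and not Lucene's API: the class DeleteByTermSketch and all of its methods are hypothetical, and a "docId" is simply a position in a list. It shows the two-step route (search on the key term, then delete each docId) next to a single delete-by-term call that wraps both steps.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.Map;

// Toy model, not Lucene code: docId is the position of a document in a list.
class DeleteByTermSketch {
    private final List<Map<String, String>> docs = new ArrayList<>();
    private final BitSet deleted = new BitSet();

    // Adds a document and returns its internally generated docId.
    int add(Map<String, String> doc) {
        docs.add(doc);
        return docs.size() - 1;
    }

    // Step 1 of the old route: search for live documents holding the key term.
    List<Integer> findDocIds(String field, String value) {
        List<Integer> ids = new ArrayList<>();
        for (int id = 0; id < docs.size(); id++) {
            if (!deleted.get(id) && value.equals(docs.get(id).get(field))) {
                ids.add(id);
            }
        }
        return ids;
    }

    // Step 2 of the old route: delete by the internal docId.
    void deleteByDocId(int docId) {
        if (docId < 0 || docId >= docs.size()) {
            throw new IndexOutOfBoundsException("no such docId: " + docId);
        }
        deleted.set(docId);
    }

    // The convenience route: both steps in one call; returns how many were deleted.
    int deleteByTerm(String field, String value) {
        List<Integer> ids = findDocIds(field, value);
        for (int id : ids) {
            deleteByDocId(id);
        }
        return ids.size();
    }

    int liveCount() {
        return docs.size() - deleted.cardinality();
    }

    public static void main(String[] args) {
        DeleteByTermSketch index = new DeleteByTermSketch();
        index.add(Map.of("id", "42", "title", "first"));
        index.add(Map.of("id", "43", "title", "second"));

        // Two-step route: the caller codes the key search, then deletes by docId.
        for (int docId : index.findDocIds("id", "42")) {
            index.deleteByDocId(docId);
        }
        // One-step route: delete by term.
        index.deleteByTerm("id", "43");
        System.out.println("live documents: " + index.liveCount());
    }
}
```

Either way the term is acting as a primary key; delete-by-term just stops every caller from re-coding the lookup.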


>If someone needs this, it can be built over lucene, without
>introducing it as a core feature and needlessly complicating things.

I think with any partial-update feature the *absence* of primary key support 
would "needlessly complicate things":
If Lucene is not capable of performing duplicate detection on insert (because 
it has no notion of a primary key field), we need to be prepared for the 
situation where we have duplicate-key docs in the index.
What then happens when Grant wants to do a "partial update" as opposed to the 
existing full-update semantics which first deletes all documents containing the 
supplied term (always a form of primary key)? 
Which document instance gets "partially updated"? We either:
a) throw a "duplicate" error (which ideally should have happened back at 
duplicate-insert time)
b) choose one of the documents to "partially update" and keep the duplicate(s)
c) choose one of the documents to "partially update" and delete the duplicate(s)
d) "partially update" all of the duplicates
All of these are less than ideal.
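
The four options can be sketched in plain Java as below. This is a toy model under stated assumptions, not Lucene code: a document is a field-to-value map, the index is a list that already contains duplicate keys, and the Policy names are hypothetical labels for a) to d). Note that "choose one" here just means "whichever comes first", which is exactly the arbitrariness being complained about.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model, not Lucene code: the "index" is a list that tolerates duplicate keys.
class DuplicateKeyUpdateSketch {

    // Hypothetical names for options a) to d).
    enum Policy { FAIL, UPDATE_ONE_KEEP_REST, UPDATE_ONE_DROP_REST, UPDATE_ALL }

    static List<Map<String, String>> partialUpdate(
            List<Map<String, String>> index, String keyField, String keyValue,
            String field, String newValue, Policy policy) {
        long matches = index.stream()
                .filter(doc -> keyValue.equals(doc.get(keyField)))
                .count();
        if (policy == Policy.FAIL && matches > 1) {
            // a) the error that ideally would have been raised at insert time
            throw new IllegalStateException(matches + " documents share key " + keyValue);
        }
        List<Map<String, String>> result = new ArrayList<>();
        boolean updatedOne = false;
        for (Map<String, String> doc : index) {
            if (!keyValue.equals(doc.get(keyField))) {
                result.add(doc);               // unrelated document, untouched
            } else if (policy == Policy.UPDATE_ALL || !updatedOne) {
                Map<String, String> copy = new HashMap<>(doc);
                copy.put(field, newValue);     // the "partial update"
                result.add(copy);
                updatedOne = true;
            } else if (policy == Policy.UPDATE_ONE_KEEP_REST) {
                result.add(doc);               // b) stale duplicate survives
            }
            // c) UPDATE_ONE_DROP_REST: remaining duplicates are dropped here
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map<String, String>> index = List.of(
                Map.of("id", "42", "title", "first copy"),
                Map.of("id", "42", "title", "second copy"),
                Map.of("id", "7", "title", "unrelated"));
        for (Policy policy : Policy.values()) {
            try {
                System.out.println(policy + " -> "
                        + partialUpdate(index, "id", "42", "title", "updated", policy));
            } catch (IllegalStateException e) {
                System.out.println(policy + " -> error: " + e.getMessage());
            }
        }
    }
}
```

Every branch either loses data, keeps stale data, or fails late, which is why none of them feels right.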

I know we are schema-averse with Lucene (and I value that) but surely any 
partial update feature has to start with a strongly maintained notion of 
document identity as a foundation?
Rather than "needless complexity" I'd argue this is "needed rigour", and it 
actually simplifies the user's job if Lucene does the duplicate-key-on-insert 
check automatically rather than relying on ropy application code and dealing 
with any failures in that.
Of course primary keys are not mandatory. You only use them when you need this 
behaviour - just like in SQL.
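
As a rough illustration of what I mean, here is a minimal sketch in plain Java. It is not a proposal for Lucene's actual API: KeyedIndexSketch, its constructor argument and its methods are all hypothetical, and the "index" is just an in-memory map. The point is only that once the key is checked on insert, a partial update has exactly one target, and that declaring no key field leaves today's behaviour unchanged.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;

// Toy model, not Lucene code: an in-memory store with an optional primary key field.
class KeyedIndexSketch {
    private final String keyField;  // null means "no primary key declared"
    private final Map<String, Map<String, String>> byKey = new HashMap<>();
    private final List<Map<String, String>> unkeyed = new ArrayList<>();

    KeyedIndexSketch(String keyField) {
        this.keyField = keyField;
    }

    // Insert with the duplicate check done here, not in application code.
    void add(Map<String, String> doc) {
        if (keyField == null) {
            unkeyed.add(new HashMap<>(doc));   // duplicates allowed, as today
            return;
        }
        String key = doc.get(keyField);
        if (key == null) {
            throw new IllegalArgumentException("missing key field: " + keyField);
        }
        if (byKey.containsKey(key)) {
            throw new IllegalStateException("duplicate key: " + key);
        }
        byKey.put(key, new HashMap<>(doc));
    }

    // With identity guaranteed, there is exactly one document to update.
    void partialUpdate(String key, String field, String value) {
        if (keyField == null) {
            throw new IllegalStateException("partial update needs a primary key field");
        }
        if (field.equals(keyField)) {
            throw new IllegalArgumentException("the key field itself cannot be updated");
        }
        Map<String, String> doc = byKey.get(key);
        if (doc == null) {
            throw new NoSuchElementException("no document with key: " + key);
        }
        doc.put(field, value);
    }

    // Returns the document for a key, or null if there is none.
    Map<String, String> get(String key) {
        return byKey.get(key);
    }

    int size() {
        return byKey.size() + unkeyed.size();
    }

    public static void main(String[] args) {
        KeyedIndexSketch index = new KeyedIndexSketch("id");
        index.add(Map.of("id", "42", "title", "old title"));
        try {
            index.add(Map.of("id", "42", "title", "accidental duplicate"));
        } catch (IllegalStateException e) {
            System.out.println("rejected at insert time: " + e.getMessage());
        }
        index.partialUpdate("42", "title", "new title");
        System.out.println(index.get("42"));
    }
}
```

Passing null as the key field gives the schema-free behaviour we have now, so nobody pays for the check unless they ask for it.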

Cheers
Mark




