[ 
https://issues.apache.org/jira/browse/SOLR-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13219202#comment-13219202
 ] 

Per Steffensen edited comment on SOLR-3173 at 2/29/12 1:49 PM:
---------------------------------------------------------------

bq. Optimistic locking as a superset to insert/update:
bq.
bq. What I already had in mind:
bq. - update only a specific version of the document by specifying its exact 
version:  _version_=12345
bq. - add a document only if it doesn't already exist (i.e. insert): 
_version_=-1
bq. - add a document regardless: don't specify a version

I still need a little time to evaluate to what extent _version_ can be used.
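
To make sure I read the three quoted cases correctly, here is a minimal 
in-memory sketch of them. It is not Solr code: the Index class, its add method 
and the VersionConflict exception are names invented for illustration, and the 
only things taken from the quote are the meanings of a positive version, of -1 
and of no version at all.

```python
class VersionConflict(Exception):
    """Raised when the supplied version does not match the stored state."""


class Index:
    """Toy stand-in for an index keyed by unique id (hypothetical, not Solr)."""

    def __init__(self):
        self.docs = {}   # id -> (version, fields)
        self.clock = 0   # versions only increase

    def add(self, doc_id, fields, version=None):
        current = self.docs.get(doc_id)
        if version is not None:
            if version == -1:
                # Insert: the document must not exist yet.
                if current is not None:
                    raise VersionConflict(f"{doc_id} already exists")
            elif version > 0:
                # Update of one exact version only.
                if current is None or current[0] != version:
                    raise VersionConflict(f"{doc_id} is not at version {version}")
            else:
                raise ValueError("version must be -1 or positive in this sketch")
        # No version given: add regardless of what is stored.
        self.clock += 1
        self.docs[doc_id] = (self.clock, fields)
        return self.clock
```

A caller would first add with version=-1, keep the returned version, and pass 
it back on the next update; a stale version then raises VersionConflict.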

bq. So now that I look at it again, it looks like what's missing is your 
"UPDATE" semantics which would only replace the record if it already existed (a 
weaker form of the first case... any positive version is OK).  But I really 
wonder how useful those semantics are (only add a doc if it's overwriting an 
existing doc, regardless of what version or what data it contains?)
bq. If there are usecases, we certainly should be able to do it.

The only-insert-if-not-exists is needed by us. The only-update-if-exists is 
mostly for consistency with what we know from RDBMSs. Basically it simulates 
what happens when you do the following in SQL and you have a unique constraint 
on the id column. 1) will fail with a unique-key constraint error if the 
document already exists and 2) will not create the row/doc if it does not 
already exist.
1) INSERT INTO docs (id, column2, column3,...) VALUES (id-value, value2, 
value3,...)
and
2) UPDATE docs SET column2=value2, column3=value3, ... WHERE id=id-value
RDBMS people are used to an update operation that does not create a 
row/document if it has already been deleted. I will consider not making that 
feature - it is only there to give a consistent experience compared to what 
you are used to from RDBMSs. Actually, seen from a distance, I think an 
"update" operation that creates stuff if it does not exist is not logical (it 
simply does not follow from the word "update").
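
The RDBMS behaviour described above can be checked with a small script. This 
is only a sketch using Python's built-in sqlite3 module as a stand-in for any 
RDBMS; the docs table is reduced to two columns, and the values "doc1" and 
"missing" are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The PRIMARY KEY on id plays the role of the unique constraint.
conn.execute("CREATE TABLE docs (id TEXT PRIMARY KEY, column2 TEXT)")
conn.execute("INSERT INTO docs (id, column2) VALUES (?, ?)", ("doc1", "first"))

# 1) A second INSERT with the same id violates the unique constraint.
try:
    conn.execute("INSERT INTO docs (id, column2) VALUES (?, ?)", ("doc1", "second"))
    insert_failed = False
except sqlite3.IntegrityError:
    insert_failed = True

# 2) An UPDATE for an id that does not exist matches no rows and creates none.
cursor = conn.execute("UPDATE docs SET column2 = ? WHERE id = ?", ("x", "missing"))
rows_updated = cursor.rowcount
row_count = conn.execute("SELECT COUNT(*) FROM docs").fetchone()[0]
```

Note that the failed UPDATE is not an error in SQL; the client only learns 
about it from the affected-row count.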

Right now I believe the solution will be that you will have the following 
URL extensions:
a) .../solr/.../update, the one already existing in Solr with unchanged 
semantics
b) .../solr/.../database/update, that updates if the document already exists 
and does nothing if it does not already exist. When versioning is activated 
(SOLR-3178) it only updates if the correct version is given - it gives a 
VersionConflict error if the document exists but the version is not correct.
c) .../solr/.../database/insert, that creates a new document if document does 
not already exist. Fails with DocumentAlreadyExists error if document already 
exists.
Then you can keep using Solr exactly as you are used to, and you can start 
using the new "database semantics" features if you want that. I might create 
an optional config for DirectUpdateHandler2 where you can deactivate the stuff 
behind a). This can be used when you don't trust clients to use a) correctly 
in a setup where you want to ensure consistency under high concurrent load.
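
As a rough illustration of the intended semantics behind a), b) and c), here 
is an in-memory sketch. Nothing in it exists in Solr: the Core class, its 
method names and the versioning flag are assumptions made for the example, and 
only the error names DocumentAlreadyExists and VersionConflict come from the 
proposal above.

```python
class DocumentAlreadyExists(Exception):
    pass


class VersionConflict(Exception):
    pass


class Core:
    """Hypothetical model of one core; not the DirectUpdateHandler2 code."""

    def __init__(self, versioning=False):
        self.versioning = versioning   # stands in for SOLR-3178 being active
        self.docs = {}                 # id -> (version, fields)
        self.next_version = 1

    def _store(self, doc_id, fields):
        version = self.next_version
        self.next_version += 1
        self.docs[doc_id] = (version, fields)
        return version

    def update(self, doc_id, fields):
        """a) unchanged semantics: add or overwrite."""
        return self._store(doc_id, fields)

    def database_update(self, doc_id, fields, version=None):
        """b) overwrite only if the document exists; otherwise do nothing."""
        current = self.docs.get(doc_id)
        if current is None:
            return None
        if self.versioning and version != current[0]:
            raise VersionConflict(doc_id)
        return self._store(doc_id, fields)

    def database_insert(self, doc_id, fields):
        """c) create only if the document does not exist."""
        if doc_id in self.docs:
            raise DocumentAlreadyExists(doc_id)
        return self._store(doc_id, fields)
```

In this sketch b) signals "nothing happened" by returning None rather than by 
raising, which matches "does nothing" above; whether the real handler should 
report that to the client is still open.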

bq. As far as what \_version\_ is, it's new and used for solrcloud to handle 
reorders of updates to replicas (among other things).
bq. The leader shard decides what the version of a document should be (versions 
only increase), and forwards the doc with the version to the replicas.
bq. If a replica receives the same doc with a lower version, it knows that it 
can safely drop it because it already has a newer version.

Cool. I understand a little better now. So no (Wiki) documentation written yet?
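
If I understand the replica rule quoted above correctly, it amounts to 
something like the following. This is only my sketch of the idea, not the 
SolrCloud code; the Replica class and its apply method are invented names.

```python
class Replica:
    """Hypothetical replica that receives (id, version, fields) from a leader."""

    def __init__(self):
        self.docs = {}  # id -> (version, fields)

    def apply(self, doc_id, version, fields):
        current = self.docs.get(doc_id)
        if current is not None and version < current[0]:
            # A newer version is already stored, so this reordered
            # update can be dropped safely.
            return False
        self.docs[doc_id] = (version, fields)
        return True
```

With this rule, two updates to the same document end in the same state on the 
replica no matter which of them arrives first.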
                
> Database semantics - insert and update
> --------------------------------------
>
>                 Key: SOLR-3173
>                 URL: https://issues.apache.org/jira/browse/SOLR-3173
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>    Affects Versions: 3.5
>         Environment: All
>            Reporter: Per Steffensen
>            Assignee: Per Steffensen
>              Labels: RDBMS, insert, nosql, uniqueKey, update
>             Fix For: 4.0
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> In order to increase the ability of Solr to be used as a NoSql database (lots of 
> concurrent inserts, updates, deletes and queries in the entire lifetime of 
> the index) instead of just a search index (first: everything indexed (in one 
> thread), after: only queries), I would like Solr to support the following 
> features inspired by RDBMSs and other NoSql databases.
> * Given a solr-core with a schema containing a uniqueKey-field "uniqueField" 
> and a document Dold, when trying to INSERT a new document Dnew where 
> Dold.uniqueField is equal to Dnew.uniqueField, then I want a 
> DocumentAlreadyExists error. If no such document Dold exists I want Dnew 
> indexed into the solr-core.
> * Given a solr-core with a schema containing a uniqueKey-field "uniqueField" 
> and a document Dold, when trying to UPDATE a document Dnew where 
> Dold.uniqueField is equal to Dnew.uniqueField I want Dold deleted from and 
> Dnew added to the index (just as it is today). If no such document Dold exists 
> I want nothing to happen (Dnew is not added to the index)
> The essence of this issue is to be able to state your intent (insert or 
> update) and have slightly different semantics (from each other and the 
> existing update) depending on your intent.
> The functionality provided by this issue is only really meaningful when you 
> run with "updateLog" activated.
> This issue might be solved more or less at the same time as SOLR-3178, and 
> only one single SVN patch might be given to cover both issues.
