[ 
https://issues.apache.org/jira/browse/SOLR-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13263514#comment-13263514
 ] 

Per Steffensen commented on SOLR-3178:
--------------------------------------

First of all. Thanks a lot for getting back to me, and taking the time to try 
to understand such a big patch covering several jira issues - Im still sorry 
about that.

{quote} I think I agree it would be nice to have more of the functionality 
consolidated to the update handler and not do quite so much "up the 
callstack"... of course the downside to this would be the context that the 
update processors have (namely they know when a request, which can contain 
multiple updates, starts and finishes). {quote}

I agree and understand about the different context, but failing/succeeding with 
PartialError (VersionConflict, DocumentAlreadyExists, DocumentDoesNotExist or 
WrongUsage) is a per-document-to-be-updated thing. Therefore I believe it does 
not matter in this case, or acutally, that might just be the argument that it 
belongs in UpdateHandler. Certainly it can be done in UpdateHandler.

{quote} Regardless - we are where we are right now... and doing the version 
check in the update handler presents some problems. For cloud, we want to start 
sending to replicas at the same time as indexing locally. {quote}

Im not sure I understand. As I read the code, we are not doing that as it is 
today. In DistributedUpdateProcessor.processAdd you first do the local add (as 
the leader) in versionAdd AND THEN, if you succeed, forwards to all replica (in 
the if (nodes != null) block). I believe it is a good strategy to first do 
update on leader and then, if success on leader, do update on replica 
concurrently (and asynchronously). So stick with that!

So doing version-check etc (when leader) in DirectUpdateHandler2 does not 
represent a problem. Errors occuring as a result of version-check etc should 
just (as any other error) result in update not forwarded to replica.

One problem in my patch is that DirectUpdateHandler2 does not notify (by 
throwing error or making sure dropCmd is true) DistributedUpdateProcessor to 
not send to replica in case of PartialError (VersionConflict, 
DocumentAlreadyExists, DOcumentDoesNotExists or WrongUsage). It will in the 
next upcomming patch. 

{quote} We also want to check the version before sending to replicas since it 
would be easy to get into a scenario where a conditional update fails on some 
replicas and succeeds on others (due to the fact that updates can easily be 
seen by different replicas in a different order). {quote}

Agree about check version on leader before sending to replica. But that can 
equally well be done in "DistributedUpdateProcessor.versionAdd" (inside the 
locks) as you do, as it can be done in "DirectUpdateHandler2.addDoc" as I do. 
So with respect to that it doesnt matter if you take your patch or mine. Of 
course when I change my patch so that version-check etc errors in 
DirectUpdateHandler2 actually ensures that the update is not sent to replica.

The idea is that version-checks and uniqueKey-already-exists-checks etc, 
leading to PartialError, are only performed on leader, so replica will not fail 
with this kind of errors - and they shouldnt. Please note "cmd.isLeaderLogic()" 
in DirectUpdateHandler2.addDoc in the line
{code}
boolean getAndCheckAgainstExisting = 
semanticsMode.needToGetAndCheckAgainstExistingDocument(cmd) && 
cmd.isLeaderLogic();
{code}
Replica are supposed to be "dumb" in the way that they just index (without any 
checking) what was successfully indexed on the leader (and therefore forwarded 
to replica in the first place), and support getting updates in different orders 
by not indexing updates if they are "older" than an update they have already 
indexed. 

{quote} So I think the best way forward is to commit the patch that does the 
version check in the distributed update processor, and then build off of that 
(for instance your more nuanced error messages and the set/get requestVersion 
on the update command). {quote}

I have to say that I do not agree. Believe I have argued why the problems you 
mention with my patch/approach is not true (when I make the small fix 
mentioned). Sure your patch is a small step forward (progress, not perfection), 
but building on top of that will put me in a situation where I have to make 
changes to my patch in order to get all the extra bennefits it provide (much 
more progresss, not perfection :-) ). Of course you can commit your patch, but 
I still believe the code in that patch should be deleted when adding my patch, 
and I still hope that will be the end result.

I will provide patch soon with the following changes
* PartialError in DirectUpdateHandler2 on leader will make 
DistributedUpdateProcessor not forward to replica (as with any other error on 
leader)
* Error propagation will work when client contact server that is not the leader 
of the document to be updated, and therefore forwards to leader, and therefore 
also have to propagate errors from that leader back to client. His has always 
worked for single errors, where you just abort the entire update on the first 
error, leaving no information to the client about what failed and what did not, 
but doesnt work now that you can have partial errors where some documents fail 
and other succeeds (on different servers) and you want to provide a detailed 
picture about it back to the client.
* Cleanup of a few things in the patch that should never have been included 
(survivals from early attempts rolled back on my side)
                
> Versioning - optimistic locking
> -------------------------------
>
>                 Key: SOLR-3178
>                 URL: https://issues.apache.org/jira/browse/SOLR-3178
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>    Affects Versions: 3.5
>         Environment: All
>            Reporter: Per Steffensen
>            Assignee: Per Steffensen
>              Labels: RDBMS, insert, locking, nosql, optimistic, uniqueKey, 
> update, versioning
>             Fix For: 4.0
>
>         Attachments: SOLR-3178.patch, SOLR_3173_3178_3382_plus.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> In order increase the ability of Solr to be used as a NoSql database (lots of 
> concurrent inserts, updates, deletes and queries in the entire lifetime of 
> the index) instead of just a search index (first: everything indexed (in one 
> thread), after: only queries), I would like Solr to support versioning to be 
> used for optimistic locking.
> When my intent (see SOLR-3173) is to update an existing document, I will need 
> to provide a version-number equal to "the version number I got when I fetched 
> the existing document for update" plus one. If this provided version-number 
> does not correspond to "the newest version-number of that document at the 
> time of update" plus one, I will get a VersionConflict error. If it does 
> correspond the document will be updated with the new one, so that "the newest 
> version-number of that document" is NOW one higher than before the update. 
> Correct but efficient concurrency handling.
> When my intent (see SOLR-3173) is to insert a new document, the version 
> number provided will not be used - instead a version-number 0 will be used. 
> According to SOLR-3173 insert will only succeed if a document with the same 
> value on uniqueKey-field does not already exist.
> In general when talking about different versions of "the same document", of 
> course we need to be able to identify when a document "is the same" - that, 
> per definition, is when the values of the uniqueKey-fields are equal. 
> The functionality provided by this issue is only really meaningfull when you 
> run with "updateLog" activated.
> This issue might be solved more or less at the same time as SOLR-3173, and 
> only one single SVN patch might be given to cover both issues.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to