[ https://issues.apache.org/jira/browse/SOLR-9530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amrit Sarkar updated SOLR-9530:
-------------------------------
    Attachment: SOLR-9530.patch

Following Noble's and Ishan's suggestions, I have put together a new patch 
with the following changes:

1. The URP no longer requires any solrconfig parameters.

2. The URP will take inline parameters exactly as Noble mentioned:
{code}processor=Atomic&Atomic.my_newfield=add&Atomic.subject=set&Atomic.count_i=inc{code}

3. Both atomic and conventional updates are accepted as incoming documents 
to the URP.
   a. For atomic updates, the atomic operation in the incoming doc must match 
the operation specified in the processor parameters.
   e.g. {"id":"1","title":{"set":"A"}}  |  processor=Atomic&Atomic.title=set

4. After the conversion to atomic-style, the latest _version_ is added to the 
updated doc. If _version_ is not present, the doc is sent as it is.

5. If the update hits a version conflict, it is retried by fetching the 
latest _version_ from the index and updating the SolrInputDoc. The maximum 
number of retries is hardcoded to 5.

6. If the parameters are not sufficient to convert the incoming document to 
atomic-style, the update is aborted.
e.g. {"id":"1","title":"A"} | processor=Atomic&Atomic.subject=set
There is no point in sending this document for update via the URP.
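
To make items 2, 3 and 6 concrete, here is a rough, self-contained sketch of the parameter parsing and the conversion to atomic-style. It is not the code from the patch: the class AtomicSketch, the methods parseOps and toAtomic, the use of plain Maps instead of SolrInputDocument, and the unique key being passed in as an argument are all assumptions made for illustration. Treating a mismatching atomic operation (item 3a) as an abort is likewise an assumption.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper for illustration only; not the class from the patch.
final class AtomicSketch {
    static final String PREFIX = "Atomic.";

    /** Collects field -> operation from pairs such as Atomic.title=set (no URL decoding). */
    static Map<String, String> parseOps(String query) {
        Map<String, String> ops = new LinkedHashMap<>();
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            // Keep only "Atomic.<field>=<op>" pairs with a non-empty field name.
            if (eq > PREFIX.length() && pair.startsWith(PREFIX)) {
                ops.put(pair.substring(PREFIX.length(), eq), pair.substring(eq + 1));
            }
        }
        return ops;
    }

    /**
     * Returns the atomic-style form of doc, or Optional.empty() when the
     * update should be aborted (a field has no operation, or an existing
     * atomic operation does not match the requested one).
     */
    static Optional<Map<String, Object>> toAtomic(
            Map<String, Object> doc, Map<String, String> ops, String uniqueKey) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : doc.entrySet()) {
            String field = e.getKey();
            Object value = e.getValue();
            if (field.equals(uniqueKey) || field.equals("_version_")) {
                out.put(field, value); // copied unchanged
                continue;
            }
            String op = ops.get(field);
            if (op == null) {
                return Optional.empty(); // item 6: parameters not sufficient
            }
            if (value instanceof Map) {
                // Already atomic-style: its single operation must match (item 3a).
                Map<?, ?> existing = (Map<?, ?>) value;
                if (existing.size() != 1 || !existing.containsKey(op)) {
                    return Optional.empty(); // assumption: mismatch also aborts
                }
                out.put(field, value);
            } else {
                // Conventional value: wrap it as {op: value}.
                out.put(field, Collections.singletonMap(op, value));
            }
        }
        return Optional.of(out);
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", "1");
        doc.put("title", "A");
        // prints Optional[{id=1, title={set=A}}]
        System.out.println(toAtomic(doc, parseOps("processor=Atomic&Atomic.title=set"), "id"));
        // prints Optional.empty
        System.out.println(toAtomic(doc, parseOps("processor=Atomic&Atomic.subject=set"), "id"));
    }
}
```

The second call in main mirrors the example in item 6: title has no matching Atomic.title parameter, so nothing is sent on.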

{noformat}
new file:   solr/core/src/java/org/apache/solr/update/processor/AtomicUpdateProcessorFactory.java
new file:   solr/core/src/test/org/apache/solr/update/processor/AtomicUpdateProcessorFactoryTest.java
{noformat}

I tried to write a test case for multiple threads executing the URP 
simultaneously, but was not able to reproduce the scenario exactly. The test 
method is commented out in the patch.
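
For reference, the retry behaviour from item 5 (which is what concurrent writers would exercise) can be sketched roughly as below. This is only an illustration under stated assumptions: VersionRetry, VersionConflictException, VersionedUpdate and updateWithRetry are hypothetical names, the version lookup is abstracted as a LongSupplier, and "maximum retries 5" is read as one initial attempt plus up to five retries.

```java
import java.util.function.LongSupplier;

// Hypothetical sketch of a bounded retry on version conflict; not the patch code.
final class VersionRetry {
    static final int MAX_RETRIES = 5; // hardcoded, as in the description above

    /** Stand-in for whatever signals a version conflict in the real update path. */
    static class VersionConflictException extends RuntimeException {
        private static final long serialVersionUID = 1L;
    }

    /** Applies the update using the given _version_ value. */
    interface VersionedUpdate {
        void apply(long version);
    }

    /**
     * Re-reads the latest version before every attempt and retries on conflict.
     * Returns the number of retries that were needed; rethrows the conflict
     * once MAX_RETRIES retries have been used up.
     */
    static int updateWithRetry(LongSupplier latestVersion, VersionedUpdate update) {
        int retries = 0;
        while (true) {
            long version = latestVersion.getAsLong(); // fetch latest _version_
            try {
                update.apply(version);
                return retries;
            } catch (VersionConflictException e) {
                if (retries == MAX_RETRIES) {
                    throw e; // give up after the last allowed retry
                }
                retries++;
            }
        }
    }
}
```

A caller would pass a supplier that looks up the current _version_ for the document id and an update that sends the doc with that version attached.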

> Add an Atomic Update Processor 
> -------------------------------
>
>                 Key: SOLR-9530
>                 URL: https://issues.apache.org/jira/browse/SOLR-9530
>             Project: Solr
>          Issue Type: New Feature
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Varun Thacker
>         Attachments: SOLR-9530.patch, SOLR-9530.patch, SOLR-9530.patch, 
> SOLR-9530.patch
>
>
> I'd like to explore the idea of adding a new update processor to help ingest 
> partial updates.
> Example use-case - There are two datasets with a common id field. How can I 
> merge both of them at index time?
> Proposed Solution: 
> {code}
> <updateRequestProcessorChain name="atomic">
>   <processor class="solr.processor.AtomicUpdateProcessorFactory">
>     <str name="my_new_field">add</str>
>   </processor>
>   <processor class="solr.LogUpdateProcessorFactory" />
>   <processor class="solr.RunUpdateProcessorFactory" />
> </updateRequestProcessorChain>
> {code}
> So the first JSON dump could be ingested against 
> {{http://localhost:8983/solr/gettingstarted/update/json}}
> And then the second JSON could be ingested against
> {{http://localhost:8983/solr/gettingstarted/update/json?processor=atomic}}
> The Atomic Update Processor could support all the atomic update operations 
> currently supported.



