[jira] [Comment Edited] (SOLR-9530) Add an Atomic Update Processor

Ishan Chattopadhyaya (JIRA) Tue, 14 Feb 2017 07:17:14 -0800

    [ 
https://issues.apache.org/jira/browse/SOLR-9530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15865923#comment-15865923
 ]


Ishan Chattopadhyaya edited comment on SOLR-9530 at 2/14/17 3:16 PM:
---------------------------------------------------------------------

As far as I understand, this update processor applies only to updates (that is, 
when the document already exists). I see this as useful when we're ingesting CSV 
files that hold disjoint information about the same documents. As an example:

||id||country||
|1| Japan |
|2| Russia |

||id||capital||
|1| Tokyo |
|2| Moscow |

Both files need to be ingested, so if both are ingested through this Update 
Processor, we would end up with 2 documents with 3 fields each (id, country, 
capital).
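
To make the expected outcome concrete, here is a minimal Python sketch of the 
merge described above. It only simulates the result with an in-memory dict keyed 
by id; it is not Solr code and not the processor from the patch. The names 
COUNTRIES, CAPITALS and ingest are hypothetical, and the two CSV files are 
inlined as strings for illustration.

```python
import csv
import io

# The two CSV files from the example, inlined as strings (illustrative only).
COUNTRIES = "id,country\n1,Japan\n2,Russia\n"
CAPITALS = "id,capital\n1,Tokyo\n2,Moscow\n"


def ingest(index, csv_text):
    """Merge each CSV row into the in-memory index, keyed by id.

    This dict merge is a stand-in for the effect of a partial update;
    it does not reproduce how Solr applies atomic updates internally.
    """
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Create the document on first sight, then merge the row's fields in.
        doc = index.setdefault(row["id"], {"id": row["id"]})
        doc.update(row)
    return index


index = {}
ingest(index, COUNTRIES)  # first file: id + country
ingest(index, CAPITALS)   # second file: id + capital, merged by id
for doc in index.values():
    print(doc)
```

Under these assumptions the index ends up holding 2 documents, each with the 
fields id, country and capital.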

Did I understand the motivation correctly?


was (Author: ichattopadhyaya):
As far as I understand, this update processor is only for updates (if document 
pre-exists). I see this as useful in cases where we're ingesting CSV files with 
disjoint information about a document in different files. As an example:

||id||country||
|1| Japan |
|2| Russia |

||id||capital||
|1| Tokyo |
|2| Moscow |

Needs to be both ingested, and hence the if both these are ingested through 
this Update Processor, we would end up with a document with 3 fields (id, 
country, capital).

Did I understand the motivation correctly?

> Add an Atomic Update Processor 
> -------------------------------
>
>                 Key: SOLR-9530
>                 URL: https://issues.apache.org/jira/browse/SOLR-9530
>             Project: Solr
>          Issue Type: New Feature
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Varun Thacker
>         Attachments: SOLR-9530.patch
>
>
> I'd like to explore the idea of adding a new update processor to help ingest 
> partial updates.
> Example use-case - There are two datasets with a common id field. How can I 
> merge both of them at index time?
> Proposed Solution: 
> {code}
> <updateRequestProcessorChain name="atomic">
>   <processor class="solr.processor.AtomicUpdateProcessorFactory">
>     <str name="my_new_field">add</str>
>   </processor>
>   <processor class="solr.LogUpdateProcessorFactory" />
>   <processor class="solr.RunUpdateProcessorFactory" />
> </updateRequestProcessorChain>
> {code}
> So the first JSON dump could be ingested against 
> {{http://localhost:8983/solr/gettingstarted/update/json}}
> And then the second JSON could be ingested against
> {{http://localhost:8983/solr/gettingstarted/update/json?processor=atomic}}
> The Atomic Update Processor could support all the atomic update operations 
> currently supported.
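
One way to picture what the proposed processor might do with the configuration 
above: rewrite a plain incoming document into Solr's atomic update form, where a 
field value is wrapped in a modifier such as {"add": value}. The Python sketch 
below is only an illustration under assumptions. FIELD_OPERATIONS and 
to_atomic_update are hypothetical names, and passing unconfigured fields through 
unchanged is a guess, not behaviour confirmed by the patch.

```python
import json

# Hypothetical mirror of <str name="my_new_field">add</str> in the chain above.
FIELD_OPERATIONS = {"my_new_field": "add"}


def to_atomic_update(doc, field_operations, id_field="id"):
    """Wrap configured fields in their atomic update modifier.

    Assumption: fields without a configured operation are passed through
    unchanged; the actual patch may handle them differently.
    """
    atomic = {id_field: doc[id_field]}
    for name, value in doc.items():
        if name == id_field:
            continue  # the id is never wrapped, it identifies the document
        operation = field_operations.get(name)
        atomic[name] = {operation: value} if operation else value
    return atomic


incoming = {"id": "1", "my_new_field": "Tokyo"}
print(json.dumps(to_atomic_update(incoming, FIELD_OPERATIONS)))
```

The rewritten document is what a client would otherwise have to build by hand 
before posting to the update handler.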



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
