A RequestProcessor to support updates
-------------------------------------

                 Key: SOLR-828
                 URL: https://issues.apache.org/jira/browse/SOLR-828
             Project: Solr
          Issue Type: Improvement
            Reporter: Noble Paul
             Fix For: 1.4


This is the same as SOLR-139. A new issue is opened so that the UpdateProcessor 
approach is highlighted and we can easily focus on that solution. 


The new {{UpdateProcessor}} (called {{UpdateableIndexProcessor}}) must be 
inserted before {{RunUpdateProcessor}}. 

* The {{UpdateProcessor}} must add an update method. 
* The {{AddUpdateCommand}} has a new boolean field {{append}}. If {{append=true}}, 
the new values of multivalued fields are appended to the old ones; otherwise the 
old values are removed and the new ones are added.
* The schema must have a {{<uniquekeyField>}}.
* {{UpdateableIndexProcessor}} registers {{postCommit/postOptimize}} listeners.

h1.Implementation
{{UpdateableIndexProcessor}} maintains two separate Lucene indexes for the 
backup:
 * *temp.backup.index* : This index stores (but does not index) all the fields 
in the document, except the uniquekey, which is both stored and indexed.
 * *backup.index* : This index stores (but does not index) the fields which are 
not stored in the actual schema and the fields which are targets of copyField. 
The uniquekey is both stored and indexed.

h1.Implementation of various methods

h2.{{processAdd()}}
{{UpdateableIndexProcessor}} writes the document to *temp.backup.index* and 
then calls the next {{UpdateProcessor}}.

h2.{{processDelete()}}
{{UpdateableIndexProcessor}} gets the searcher from the core, finds the 
documents which match the query and deletes them from *backup.index*. If it is 
a delete by id, it deletes the document with that id from *temp.backup.index*. 
It then calls the next {{UpdateProcessor}}.

h2.{{processCommit()}}
{{UpdateableIndexProcessor}} calls the next {{UpdateProcessor}}.

h2.on {{postCommit/postOptimize}}
{{UpdateableIndexProcessor}} commits the *temp.backup.index* and then reads all 
the documents from it one by one. If a document is present in the main index, 
it is copied to *backup.index*. Finally it commits the *backup.index*. 
*temp.backup.index* is destroyed after that.

h2.{{processUpdate()}}
{{UpdateableIndexProcessor}} commits the *temp.backup.index* and first looks 
for the document there. If it is present, the document is read from 
*temp.backup.index*. If it is not present, *backup.index* is checked. If the 
document is present there, the searcher is obtained from the main index, all 
the missing fields are read from there, and the backup document is prepared.
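
As a rough sketch of this lookup order, again with in-memory maps in place of the real indexes: {{BackupLookupSketch}}, {{prepareBackupDoc}} and {{mainStored}} (the stored fields readable from the main index) are hypothetical names. Returning {{null}} when the id is in neither backup index is an assumption; the description does not say what should happen in that case.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: maps keyed by uniquekey stand in for the three indexes.
class BackupLookupSketch {

    static Map<String, Object> prepareBackupDoc(String id,
            Map<String, Map<String, Object>> tempBackup,
            Map<String, Map<String, Object>> backup,
            Map<String, Map<String, Object>> mainStored) {
        // 1. temp.backup.index holds every field, so a hit there is complete.
        Map<String, Object> fromTemp = tempBackup.get(id);
        if (fromTemp != null) {
            return new LinkedHashMap<>(fromTemp);
        }
        // 2. Otherwise fall back to backup.index.
        Map<String, Object> fromBackup = backup.get(id);
        if (fromBackup == null) {
            return null; // assumption: behavior for an unknown id is not specified
        }
        // 3. backup.index lacks the fields stored in the main index; fill them in.
        Map<String, Object> doc = new LinkedHashMap<>(fromBackup);
        Map<String, Object> stored = mainStored.get(id);
        if (stored != null) {
            for (Map.Entry<String, Object> field : stored.entrySet()) {
                if (!doc.containsKey(field.getKey())) {
                    doc.put(field.getKey(), field.getValue());
                }
            }
        }
        return doc;
    }
}
```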

The single valued fields are taken from the incoming document (if present); 
the others are filled from the backup doc. If append=true, all the multivalued 
values from the backup document are added to the incoming document; otherwise 
the values from the backup document are not used if the field is also present 
in the incoming document.
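
The merge rule can be illustrated with a small sketch that models a document as a map from field name to a list of values. {{BackupMergeSketch}}, {{merge}} and the {{multiValued}} set are hypothetical; a real processor would presumably ask the schema which fields are multivalued. Placing the backup values after the incoming ones is also just a choice made here, since the description does not fix an order.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: a document is modeled as field name -> list of values.
class BackupMergeSketch {

    static Map<String, List<Object>> merge(Map<String, List<Object>> incoming,
                                           Map<String, List<Object>> backup,
                                           Set<String> multiValued,
                                           boolean append) {
        // Start from a copy of the incoming document; its values always survive.
        Map<String, List<Object>> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<Object>> e : incoming.entrySet()) {
            result.put(e.getKey(), new ArrayList<>(e.getValue()));
        }
        for (Map.Entry<String, List<Object>> e : backup.entrySet()) {
            String field = e.getKey();
            if (!result.containsKey(field)) {
                // Field absent from the incoming document: fill it from the backup.
                result.put(field, new ArrayList<>(e.getValue()));
            } else if (append && multiValued.contains(field)) {
                // append=true: old multivalued values are added to the new ones.
                result.get(field).addAll(e.getValue());
            }
            // Otherwise the incoming value wins and the backup value is dropped.
        }
        return result;
    }
}
```

For example, with an incoming {{tags=[new]}} and a backup {{tags=[old]}}, {{append=true}} yields {{tags=[new, old]}}, while {{append=false}} keeps only {{tags=[new]}}.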

h2. new {{BackupIndexRequestHandler}} registered automatically at {{/backup}}
This exposes the data present in the backup indexes. The user must be able to 
get any document by id by invoking {{/backup?id=<value>}} (multiple id values 
can be sent, e.g. {{id=1&id=2&id=4}}). This helps users query the backup index 
and construct the new doc themselves if they wish to do so. The 
{{BackupIndexRequestHandler}} does a commit on *temp.backup.index* and searches 
the *temp.backup.index* first for the id. If the document is absent, it then 
checks in the *backup.index* and returns the document.
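
A client could build such a request as sketched below. Only the {{/backup}} path and the repeated {{id}} parameter come from the description above; {{BackupUrlSketch}}, {{backupUrl}} and the base URL used in the usage note are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.List;

// Hypothetical client-side helper; not part of Solr.
class BackupUrlSketch {

    /** Builds baseUrl + "/backup" with one URL-encoded id parameter per value. */
    static String backupUrl(String baseUrl, List<String> ids) {
        StringBuilder url = new StringBuilder(baseUrl).append("/backup");
        char separator = '?';
        for (String id : ids) {
            try {
                url.append(separator).append("id=").append(URLEncoder.encode(id, "UTF-8"));
            } catch (UnsupportedEncodingException e) {
                // UTF-8 is required on every Java platform, so this should not happen.
                throw new IllegalStateException(e);
            }
            separator = '&';
        }
        return url.toString();
    }
}
```

Assuming a core served at {{http://localhost:8983/solr}}, calling {{backupUrl}} with the ids 1, 2 and 4 gives {{http://localhost:8983/solr/backup?id=1&id=2&id=4}}.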


