The following page has been changed by HossMan:
http://wiki.apache.org/solr/CollectionBuilding

The comment on the change is:
extracted update info into UpdateXmlMessages

------------------------------------------------------------------------------
   * When launching something new.
   * When a collection has become corrupted to a greater or lesser extent.
   * When redefining an existing field type, that is, changing your schema in a way that requires a rebuild. For example, merely adding fields to the schema does not require a rebuild, but changing some field types from a simple integer to some exotic type of integer does.
-
- [[TableOfContents]]

  == Recommended Procedure for New Index Building ==
@@ -23, +21 @@
   1. Run the script, '''abo''' (Atomic Backup post-Optimize), to optimize the collection. [[BR]] '''Note:''' if you know that a large number of incremental updates are still in process from Step 4, wait until they are done before running abo.
   1. Run the '''rsyncd-start''' script to re-enable collection distribution requests from the slaves. The new collection data will be pulled by the slaves while still serving requests.

- === Alternative Approaches for New Index Building ===
+ == Alternative Approaches for New Index Building ==
   * Create an "offline" Solr port, index from scratch on the offline port, disable snapshot pulling, shut down the master, copy the index from the offline port to the master, enable snapshot pulling.
   * Create an "offline" Solr port, index from scratch on the offline port, disable snapshot pulling, shut down the master, copy the index from the offline port to the master, disable slave boxes one at a time and copy the index to each manually, enable snapshot pulling. (This last one in particular requires a lot more setup time and thought.)

- == The Update Schema ==
-
- (Not to be confused with [:SchemaXml:schema.xml].)
-
- Solr accepts POSTed XML messages that Add/Update, Commit, Delete, and Delete by query, using the url '''/update'''.
Here is the syntax that Solr expects to see:
-
- === add/update ===
-
- Example:
-
- {{{
- <add>
-   <doc>
-     <field name="employeeId">05991</field>
-     <field name="office">Bridgewater</field>
-   </doc>
- </add>
- }}}
-
- ==== Optional attributes for "add" ====
-  * `allowDups = "true" | "false"`: default is "false"
-  * `overwritePending = "true" | "false"`: default is negation of allowDups
-  * `overwriteCommitted = "true" | "false"`: default is negation of allowDups
-
- The defaults for overwritePending and overwriteCommitted are linked to allowDups such that those defaults make more sense:
-  * If allowDups is '''false''' (overwrite any duplicates), it implies that overwritePending and overwriteCommitted are '''true''' by default.
-  * If allowDups is '''true''' (allow addition of duplicates), it implies that overwritePending and overwriteCommitted are '''false''' by default.
-
- ==== Optional attributes on "doc" ====
-  * `boost = <float>`: default is 1.0 (See Lucene docs for definition of boost.)
-
- ==== Optional attributes for "field" ====
-  * `boost = <float>`: default is 1.0 (See Lucene docs for definition of boost.)
-
-
- Example of "add" with optional attributes:
-
- {{{
- <add allowDups="false" overwriteCommitted="true" overwritePending="true">
-   <doc boost="2.5">
-     <field name="employeeId">05991</field>
-     <field name="office" boost="2.0">Bridgewater</field>
-   </doc>
- </add>
- }}}
-
- === "commit" and "optimize" ===
-
- Example:
- {{{
- <commit/>
- <optimize/>
- }}}
-
- ==== Optional attributes for "commit" and "optimize" ====
-
-  * `waitFlush = "true" | "false"`: default is true; block until index changes are flushed to disk
-  * `waitSearcher = "true" | "false"`: default is true; block until a new searcher is opened and registered as the main query searcher, making the changes visible.
-
- Example of "commit" and "optimize" with optional attributes:
- {{{
- <commit waitFlush="false" waitSearcher="false"/>
- <optimize waitFlush="false" waitSearcher="false"/>
- }}}
-
- === "delete" by ID and by Query ===
- Delete by id uses the uniqueKey field declared in the schema (in these examples, employeeId).
- Delete by id is much more efficient than delete by query.
- Example:
- {{{
- <delete><id>05991</id></delete>
- <delete><query>office:Bridgewater</query></delete>
- }}}
-
- ==== Optional attributes for "delete" ====
-
-  * `fromPending = "true" | "false"`: default is "true"
-  * `fromCommitted = "true" | "false"`: default is "true"
-
- Example of "delete" with optional attributes:
-
- {{{
- <delete fromPending="true" fromCommitted="true"><id>05991</id></delete>
- <delete fromPending="true" fromCommitted="true"><query>office:Bridgewater</query></delete>
- }}}
-
- === Updating a Data Record via curl ===
- You can use curl to send any of the above commands. For example:
-
- {{{
- curl http://<hostname>:<port>/update --data-binary '<add allowDups="false" overwriteCommitted="true" overwritePending="true">
- <doc boost="2.5"> <field name="employeeId">05991</field>
- <field name="office" boost="2.0">Bridgewater</field> </doc> </add>'
- }}}
-
- {{{
- curl http://<hostname>:<port>/update --data-binary '<commit waitFlush="false" waitSearcher="false"/>'
- }}}
-
- Until a commit has been issued, you will not see any of the data in searches on either the master or the slave. After a commit has been issued, you will see the results on the master; once the slave has pulled a snapshot, you will see them there as well.
-
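For readers who would rather script these messages than type them into curl, the following is a minimal Python sketch of the same "add" example. It only builds the XML and wraps it in a POST request for '''/update'''; nothing is sent. The helper names `build_add_message` and `build_update_request`, the base URL `http://localhost:8983`, and the `Content-Type` header are assumptions made for illustration and do not come from the page above.

```python
import urllib.request
import xml.etree.ElementTree as ET


def build_add_message(fields, allow_dups=False, doc_boost=None):
    """Build an <add> message for a single document (hypothetical helper).

    fields maps a field name to a value, or to a (value, boost) tuple.
    """
    add = ET.Element("add", {"allowDups": "true" if allow_dups else "false"})
    doc = ET.SubElement(add, "doc")
    if doc_boost is not None:
        doc.set("boost", str(doc_boost))
    for name, value in fields.items():
        boost = None
        if isinstance(value, tuple):
            value, boost = value
        field = ET.SubElement(doc, "field", {"name": name})
        if boost is not None:
            field.set("boost", str(boost))
        field.text = str(value)
    return ET.tostring(add, encoding="unicode")


def build_update_request(base_url, xml_message):
    """Wrap an XML message in a POST request for <base_url>/update (not sent)."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/update",
        data=xml_message.encode("utf-8"),
        # Assumed header; the curl examples above do not set one explicitly.
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


# Same document as the "add" example with optional attributes.
message = build_add_message(
    {"employeeId": "05991", "office": ("Bridgewater", 2.0)},
    doc_boost=2.5,
)
# Host and port are placeholders for <hostname>:<port>.
request = build_update_request("http://localhost:8983", message)
print(message)
print(request.get_method(), request.full_url)
```

To actually submit the message, the request could be passed to `urllib.request.urlopen(request)`, followed by a `<commit/>` message sent the same way, since added documents are not visible in searches until a commit has been issued.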
