[Solr Wiki] Update of "CollectionBuilding" by HossMan

Apache Wiki Sat, 25 Feb 2006 00:54:42 -0800

The following page has been changed by HossMan:
http://wiki.apache.org/solr/CollectionBuilding

The comment on the change is:
extracted update info into UpdateXmlMessages

------------------------------------------------------------------------------
     * When launching something new.
     * When a collection has become corrupted to a greater or lesser extent. 
     * When redefining an existing field type, that is, changing your schema 
in a way that requires a rebuild. For example, merely adding fields to the 
schema does not require a rebuild, but changing some field types from a simple 
integer to some exotic type of integer does. 
- 
- [[TableOfContents]]
  
  == Recommended Procedure for New Index Building ==
  
@@ -23, +21 @@

     1. Run the script, '''abo''' (Atomic Backup post-Optimize), to optimize 
the collection. [[BR]] '''Note:''' if you know that a large number of 
incremental updates are still in process from Step 4, wait until they are done 
before running abo.
     1. Run the '''rsyncd-start''' script to re-enable collection distribution 
requests from the slaves. The new collection data will be pulled by the slaves 
while still serving requests. 
  
- === Alternative Approaches for New Index Building ===
+ == Alternative Approaches for New Index Building ==
  
     * Create an "offline" Solr port, index from scratch on the offline port, 
disable snapshot pulling, shut down the master, copy the index from the offline 
port to the master, enable snapshot pulling.
     * Create an "offline" Solr port, index from scratch on the offline port, 
disable snapshot pulling, shut down the master, copy the index from the offline 
port to the master, disable slave boxes one at a time and copy the index to 
each manually, enable snapshot pulling. (This last one in particular requires 
a lot more setup time and thought.)
  
- == The Update Schema ==
- 
- (Not to be confused with [:SchemaXml:schema.xml].) 
- 
- Solr accepts POSTed XML messages that Add/Update, Commit, Delete, and Delete 
by query, using the URL '''/update'''.  Here is the syntax that Solr expects 
to see: 
- 
- === add/update ===
- 
- Example:
- 
-    {{{
- <add>
-   <doc>
-     <field name="employeeId">05991</field>
-     <field name="office">Bridgewater</field>
-   </doc>
- </add>
- }}}
- 
- ==== Optional attributes for "add" ====
-    * `allowDups = "true" | "false"`: default is "false"
-    * `overwritePending = "true" | "false"`: default is the negation of 
allowDups 
-    * `overwriteCommitted = "true" | "false"`: default is the negation of 
allowDups 
-  
- The defaults for overwritePending and overwriteCommitted are linked to 
allowDups such that those defaults make more sense:
-    * If allowDups is '''false''' (overwrite any duplicates), it implies that 
overwritePending and overwriteCommitted are '''true''' by default.
-    * If allowDups is '''true''' (allow addition of duplicates), it implies 
that overwritePending and overwriteCommitted are '''false''' by default.
-  
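The linkage between these defaults can be sketched in a few lines of Python. This is only an illustration of the rule described above, not Solr's actual code; the function name `resolve_overwrite_defaults` and its parameter names are hypothetical.

```python
def resolve_overwrite_defaults(allow_dups=False,
                               overwrite_pending=None,
                               overwrite_committed=None):
    """Return the effective (allowDups, overwritePending, overwriteCommitted).

    Hypothetical helper: any overwrite flag left unset (None) falls back
    to the negation of allow_dups, as described on this page.
    """
    default = not allow_dups
    if overwrite_pending is None:
        overwrite_pending = default
    if overwrite_committed is None:
        overwrite_committed = default
    return allow_dups, overwrite_pending, overwrite_committed
```

An explicitly supplied flag is left alone; only unset flags pick up the linked default.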
- ==== Optional attributes on "doc" ====
-    * `boost = <float>`: default is 1.0 (See Lucene docs for definition of 
boost.)
-  
- ==== Optional attributes for "field" ====
-    * `boost = <float>`: default is 1.0 (See Lucene docs for definition of 
boost.)
-  
- 
- Example of "add" with optional attributes:
- 
-    {{{
- <add allowDups="false" overwriteCommitted="true" overwritePending="true">
-   <doc boost="2.5">
-     <field name="employeeId">05991</field>
-     <field name="office" boost="2.0">Bridgewater</field>
-   </doc>
- </add>
- }}}
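When field values come from real data, building the message with an XML library avoids escaping mistakes. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the helper name `build_add_message` and its parameters are hypothetical, and only the element and attribute names shown in the examples above are assumed.

```python
import xml.etree.ElementTree as ET


def build_add_message(fields, doc_boost=None, field_boosts=None, **add_attrs):
    """Build an <add> message for a single document (hypothetical helper).

    fields       -- mapping of field name to value
    doc_boost    -- optional boost for the <doc> element
    field_boosts -- optional mapping of field name to boost
    add_attrs    -- optional <add> attributes, e.g. allowDups=False
    """
    # Booleans are rendered as "true"/"false", matching the examples above.
    add = ET.Element("add", {k: str(v).lower() for k, v in add_attrs.items()})
    doc = ET.SubElement(add, "doc")
    if doc_boost is not None:
        doc.set("boost", str(doc_boost))
    for name, value in fields.items():
        field = ET.SubElement(doc, "field", {"name": name})
        if field_boosts and name in field_boosts:
            field.set("boost", str(field_boosts[name]))
        # ElementTree escapes &, < and > in text when serializing.
        field.text = str(value)
    return ET.tostring(add, encoding="unicode")
```

For example, `build_add_message({"employeeId": "05991", "office": "Bridgewater"}, doc_boost=2.5, field_boosts={"office": 2.0}, allowDups=False)` produces a message equivalent to the example above, minus the two overwrite attributes.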
- 
- === "commit" and "optimize" ===
- 
- Example:
-    {{{
- <commit/>
- <optimize/>
- }}}
-  
- ==== Optional attributes for "commit" and "optimize" ====
- 
-    * `waitFlush = "true" | "false"`: default is true; block until index 
changes are flushed to disk  
-    * `waitSearcher = "true" | "false"`: default is true; block until a new 
searcher is opened and registered as the main query searcher, making the 
changes visible.
- 
- Example of "commit" and "optimize" with optional attributes:
-    {{{
- <commit waitFlush="false" waitSearcher="false"/>
- <optimize waitFlush="false" waitSearcher="false"/>
- }}}
- 
- === "delete" by ID and by Query ===
- Delete by id uses the uniqueKey field declared in the schema (in these 
examples, employeeId).
- Delete by id is much more efficient than delete by query.
- Example:
-    {{{
- <delete><id>05991</id></delete>
- <delete><query>office:Bridgewater</query></delete>
- }}}
- 
- ==== Optional attributes for "delete" ====
- 
-    * `fromPending = "true" | "false"`: default is "true" 
-    * `fromCommitted = "true" | "false"`: default is "true"
-  
- Example of "delete" with optional attributes:
- 
-    {{{
- <delete fromPending="true" fromCommitted="true"><id>05991</id></delete>
- <delete fromPending="true" 
fromCommitted="true"><query>office:Bridgewater</query></delete>
- }}}
- 
- === Updating a Data Record via curl ===
- You can use curl to send any of the above commands. For example:
- 
- {{{
- curl http://<hostname>:<port>/update --data-binary '<add allowDups="false" 
overwriteCommitted="true" overwritePending="true">
- <doc boost="2.5"> <field name="employeeId">05991</field>
- <field name="office" boost="2.0">Bridgewater</field> </doc> </add>'
- }}}
- 
- {{{
- curl http://<hostname>:<port>/update --data-binary '<commit waitFlush="false" 
waitSearcher="false"/>'
- }}}
- 
- Until a commit has been issued, you will not see any of the data in searches 
on either the master or the slave. After a commit has been issued, the results 
are visible on the master; once the slave has pulled a snapshot, they are 
visible there as well.
-
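The same messages can be sent from a script instead of curl. Below is a minimal Python sketch using only the standard library; the function names are hypothetical, the base URL in the usage note is a placeholder, and the `Content-Type` header is an assumption rather than something this page specifies.

```python
import urllib.request


def make_update_request(base_url, xml_message):
    """Prepare a POST of xml_message to <base_url>/update (hypothetical helper)."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/update",
        data=xml_message.encode("utf-8"),
        # Assumption: the server accepts XML bodies labeled this way.
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


def post_update(base_url, xml_message, timeout=10):
    """Send the message and return the response body as text."""
    request = make_update_request(base_url, xml_message)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

A typical sequence would be `post_update("http://<hostname>:<port>", "<add>...</add>")` followed by `post_update("http://<hostname>:<port>", "<commit/>")`, since added documents are not searchable until a commit is issued.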
