[GitHub] [couchdb-documentation] wohali commented on a change in pull request #418: Add search index documentation

GitBox Wed, 12 Jun 2019 03:43:34 -0700

wohali commented on a change in pull request #418: Add search index 
documentation
URL: 
https://github.com/apache/couchdb-documentation/pull/418#discussion_r292848213


 ##########
 File path: src/api/ddoc/views.rst
 ##########
 @@ -315,6 +315,1144 @@ including the update sequence of the database from 
which the view was
 generated. The returned value can be compared this to the current update
 sequence exposed in the database information (returned by :get:`/{db}`).
 
+Search
+======
+
+Search indexes enable you to query a database by using `Lucene Query Parser 
Syntax 
<http://lucene.apache.org/core/4_3_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Overview>`_.
 A search index uses one, or multiple, fields from your documents. You can use 
a search index to run queries, find documents based on the content they 
contain, or work with groups, facets, or geographical searches.
+
+To create a search index, you add a JavaScript function to a design document 
in the database. An index builds after processing one search request or after 
the server detects a document update. The ``index`` function takes the 
following parameters: 
+
+1.  Field name - The name of the field you want to use when you query the 
index. If you set this parameter to ``default``, then this field is queried if 
no field is specified in the query syntax.
+2.  Data that you want to index, for example, ``doc.address.country``. 
+3.  (Optional) The third parameter includes the following fields: ``boost``, 
``facet``, ``index``, and ``store``. These fields are described in more detail 
later.   
+
+By default, a search index response returns 25 rows. The number of rows that 
is returned can be changed by using the ``limit`` parameter. However, a result 
set from a search is limited to 200 rows. Each response includes a ``bookmark`` 
field. You can include the value of the ``bookmark`` field in later queries to 
look through the responses.
+
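The paging behaviour described above can be sketched as a small loop. This is only an illustration under stated assumptions: `searchPage` is a hypothetical caller-supplied function (it is not part of CouchDB) that takes a bookmark and resolves to the parsed JSON body of one search response. The loop stops at the first empty `rows` array, which the query parameter table later in this section describes as the end of the result list.

```javascript
// Collect rows from successive search responses by passing each response's
// bookmark to the next request. `searchPage` is a hypothetical function:
// searchPage(bookmark) -> Promise<{ rows: Array, bookmark: string }>.
async function collectAllRows(searchPage, maxPages = 100) {
  const rows = [];
  let bookmark; // undefined on the first request
  for (let page = 0; page < maxPages; page++) {
    const response = await searchPage(bookmark);
    if (response.rows.length === 0) {
      break; // an empty rows array marks the end of the results
    }
    rows.push(...response.rows);
    bookmark = response.bookmark;
  }
  return rows;
}
```

The `maxPages` cap is a defensive choice for the sketch, so that a misbehaving `searchPage` cannot loop forever.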
+*Example design document that defines a search index:*
+
+.. code-block:: javascript
+
+    {
+       "_id": "_design/search_example",
+       "indexes": {
+               "animals": {
+                       "index": "function(doc){ ... }"
+               }
+           }
+    }
+
+Search index partitioning type
+------------------------------
+
+A search index will inherit the partitioning type from the 
``options.partitioned``
+field of the design document that contains it.
+
+Index functions
+---------------
+
+Attempting to index by using a data field that does not exist fails. To avoid this problem, use an appropriate :ref:`guard clause <api/ddoc/view/index_guard_clauses>`.
+
+.. note::
+    Your indexing functions operate in a memory-constrained environment where the
+    document itself forms a part of the memory that is used in that environment.
+    Your code's stack and document must fit inside this memory. In other words, a document
+    must be loaded in order to be indexed. Documents are limited to a maximum size of 64 MB.
+
+.. note::
+    Within a search index, do not index the same field name with more than one data
+    type. If the same field name is indexed with different data types in the same search
+    index function, you might get an error when querying the search index that says the
+    field "was indexed without position data." For example, do not include both of these
+    lines in the same search index function, as they index the ``myfield`` field as two
+    different data types: a string ``"this is a string"`` and a number ``123``.
+
+.. code-block:: javascript
+
+    index("myfield", "this is a string");
+    index("myfield", 123);
+
+The function that is contained in the ``index`` field of the design document is a JavaScript function
+that is called for each document in the database.
+The function takes the document as a parameter,
+extracts some data from it,
+and then calls the ``index`` function to index that data.
+
+The ``index`` function takes three parameters, where the third parameter is 
optional.
+
+The first parameter is the name of the field you intend to use when querying 
the index,
+and which is specified in the Lucene syntax portion of subsequent queries.
+An example appears in the following query:
+
+.. code-block:: text
+
+    query=color:red
+
+The Lucene field name ``color`` is the first parameter of the ``index`` 
function.
+
+The ``query`` parameter can be abbreviated to ``q``,
+so another way of writing the query is as follows:
+
+.. code-block:: text
+
+    q=color:red
+
+If the special value ``"default"`` is used when you define the name,
+you do not have to specify a field name at query time.
+The effect is that the query can be simplified:
+
+.. code-block:: text
+
+    query=red
+
+The second parameter is the data to be indexed. Keep the following information 
in mind when you index your data: 
+
+- This data must be only a string, number, or boolean. Other types will cause 
an error to be thrown by the index function call.
+- If an error is thrown when running your function, for this reason or others, 
the document will not be added to that search index.
+
+The third, optional, parameter is a JavaScript object with the following 
fields:
+
+*Index function (optional parameter)*
+
++--------------+----------------------------------------------------------------------+----------------------------------+-----------------+
+| Option       | Description                                                          | Values                           | Default         |
++==============+======================================================================+==================================+=================+
+| ``boost``    | A number that specifies the relevance in search results.             | A positive floating point number | 1 (no boosting) |
+|              | Content that is indexed with a boost value greater than 1            |                                  |                 |
+|              | is more relevant than content that is indexed without a boost value. |                                  |                 |
+|              | Content with a boost value less than one is not so relevant.         |                                  |                 |
++--------------+----------------------------------------------------------------------+----------------------------------+-----------------+
+| ``facet``    | Creates a faceted index. For more information, see                   | ``true``, ``false``              | ``false``       |
+|              | :ref:`faceting <api/ddoc/view>`.                                     |                                  |                 |
++--------------+----------------------------------------------------------------------+----------------------------------+-----------------+
+| ``index``    | Whether the data is indexed, and if so, how. If set to ``false``,    | ``true``, ``false``              | ``true``        |
+|              | the data cannot be used for searches, but can still be retrieved     |                                  |                 |
+|              | from the index if ``store`` is set to ``true``.                      |                                  |                 |
+|              | For more information, see                                            |                                  |                 |
+|              | :ref:`analyzers <api/ddoc/view/analyzers>`.                          |                                  |                 |
++--------------+----------------------------------------------------------------------+----------------------------------+-----------------+
+| ``store``    | If ``true``, the value is returned in the search result;             | ``true``, ``false``              | ``false``       |
+|              | otherwise, the value is not returned.                                |                                  |                 |
++--------------+----------------------------------------------------------------------+----------------------------------+-----------------+
+
+.. note::
+
+    If you do not set the ``store`` parameter,
+    the index data results for the document are not returned in response to a query.
+
+*Example search index function:*
+
+.. code-block:: javascript
+
+    function(doc) {
+           index("default", doc._id);
+           if (doc.min_length) {
+                   index("min_length", doc.min_length, {"store": true});
+           }
+           if (doc.diet) {
+                   index("diet", doc.diet, {"store": true});
+           }
+           if (doc.latin_name) {
+                   index("latin_name", doc.latin_name, {"store": true});
+           }
+           if (doc.class) {
+                   index("class", doc.class, {"store": true});
+           }
+    }
+
+.. _api/ddoc/view/index_guard_clauses:
+
+Index guard clauses
+^^^^^^^^^^^^^^^^^^^
+
+The ``index`` function requires the name of the data field to index as the 
second parameter.
+However,
+if that data field does not exist for the document,
+an error occurs.
+The solution is to use an appropriate 'guard clause' that checks if the field 
exists,
+and contains the expected type of data,
+*before* any attempt to create the corresponding index.
+
+*Example of a guard clause that checks only for a truthy value, so valid values such as 0 are skipped:*
+
+.. code-block:: javascript
+
+    if (doc.min_length) {
+           index("min_length", doc.min_length, {"store": true});
+    }
+
+You might use the JavaScript ``typeof`` operator to implement the guard clause test.
+If the field exists *and* has the expected type,
+the correct type name is returned,
+so the guard clause test succeeds and it is safe to use the index function.
+If the field does *not* exist,
+you would not get back the expected type of the field,
+therefore you would not attempt to index the field.
+
+JavaScript considers a result to be false if one of the following values is tested:
+
+*      ``undefined``
+*      null
+*      The number +0
+*      The number -0
+*      NaN (not a number)
+*      "" (the empty string)
+
+*Using a guard clause to check whether the required data field exists,
+and holds a number,
+before an attempt to index:*
+
+.. code-block:: javascript
+
+    if (typeof(doc.min_length) === 'number') {
+           index("min_length", doc.min_length, {"store": true});
+    }
+
+Use a generic guard clause test to ensure that the type of the candidate data 
field is defined.
+
+*Example of a 'generic' guard clause:*
+
+.. code-block:: javascript
+
+    if (typeof(doc.min_length) !== 'undefined') {
+           // The field exists, and does have a type, so we can proceed to index using it.
+           ...
+    }
+
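The difference between the two kinds of guard clause can be seen in a small runnable sketch. The `index` function below is only a stand-in that records its arguments (the real one is provided by the search runtime), and `animalIndex` and the sample document are hypothetical.

```javascript
// Stand-in for the `index` function that the search runtime provides;
// here it only records its arguments so the sketch can run anywhere.
const indexed = [];
function index(name, value, options) {
  indexed.push({ name, value, options });
}

// Hypothetical index function that uses both styles of guard clause.
function animalIndex(doc) {
  // Truthiness guard: skips the field when the value is 0, "", null, ...
  if (doc.diet) {
    index("diet", doc.diet, { store: true });
  }
  // typeof guard: accepts any number, including 0, and rejects other types.
  if (typeof doc.min_length === "number") {
    index("min_length", doc.min_length, { store: true });
  }
}

// `diet` is an empty string and `min_length` is 0: both values are falsy.
animalIndex({ _id: "sample", min_length: 0, diet: "" });
console.log(indexed.map((entry) => entry.name));
```

Only `min_length` is recorded: the `typeof` guard lets the value 0 through, while the truthiness guard silently skips the empty `diet` string.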
+.. _api/ddoc/view/analyzers:
+
+Analyzers
+---------
+
+Analyzers are settings that define how to recognize terms within text.
+Analyzers can be helpful if you need to :ref:`index multiple languages <api/ddoc/view/language-specific-analyzers>`.
+
+Here's the list of generic analyzers that are supported by search:
+
++----------------+---------------------------------------------------------------------------------+
+| Analyzer       | Description                                                                     |
++================+=================================================================================+
+| ``classic``    | The standard Lucene analyzer, circa release 3.1.                                |
++----------------+---------------------------------------------------------------------------------+
+| ``email``      | Like the ``standard`` analyzer, but tries harder to match an email              |
+|                | address as a complete token.                                                    |
++----------------+---------------------------------------------------------------------------------+
+| ``keyword``    | Input is not tokenized at all.                                                  |
++----------------+---------------------------------------------------------------------------------+
+| ``simple``     | Divides text at non-letters.                                                    |
++----------------+---------------------------------------------------------------------------------+
+| ``standard``   | The default analyzer. It implements the Word Break rules from the               |
+|                | `Unicode Text Segmentation algorithm <http://www.unicode.org/reports/tr29/>`_.  |
++----------------+---------------------------------------------------------------------------------+
+| ``whitespace`` | Divides text at white space boundaries.                                         |
++----------------+---------------------------------------------------------------------------------+
+
+
+*Example analyzer document:*
+
+.. code-block:: javascript
+
+    {
+           "_id": "_design/analyzer_example",
+           "indexes": {
+                   "INDEX_NAME": {
+                           "index": "function (doc) { ... }",
+                           "analyzer": "$ANALYZER_NAME"
+                   }
+           }
+    }
+
+.. _api/ddoc/view/language-specific-analyzers:
+
+Language-specific analyzers
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+These analyzers omit common words in the specific language,
+and many also `remove prefixes and suffixes 
<http://en.wikipedia.org/wiki/Stemming>`_.
+The name of the language is also the name of the analyzer.
+
+*      ``arabic``
+*      ``armenian``
+*      ``basque``
+*      ``bulgarian``
+*      ``brazilian``
+*      ``catalan``
+*      ``cjk`` (Chinese, Japanese, Korean)
+*      ``chinese`` (`smartcn <http://lucene.apache.org/core/4_2_1/analyzers-smartcn/org/apache/lucene/analysis/cn/smart/SmartChineseAnalyzer.html>`_)
+*      ``czech``
+*      ``danish``
+*      ``dutch``
+*      ``english``
+*      ``finnish``
+*      ``french``
+*      ``german``
+*      ``greek``
+*      ``galician``
+*      ``hindi``
+*      ``hungarian``
+*      ``indonesian``
+*      ``irish``
+*      ``italian``
+*      ``japanese`` (`kuromoji <http://lucene.apache.org/core/4_2_1/analyzers-kuromoji/overview-summary.html>`_)
+*      ``latvian``
+*      ``norwegian``
+*      ``persian``
+*      ``polish`` (`stempel <http://lucene.apache.org/core/4_2_1/analyzers-stempel/overview-summary.html>`_)
+*      ``portuguese``
+*      ``romanian``
+*      ``russian``
+*      ``spanish``
+*      ``swedish``
+*      ``thai``
+*      ``turkish``
+
+.. note::
+
+    Language-specific analyzers are optimized for the specified language. You cannot combine a generic analyzer with a language-specific analyzer. Instead, you might use a :ref:`per-field analyzer <api/ddoc/view/per-field-analyzers>` to select different analyzers for different fields within the documents.
+
+.. _api/ddoc/view/per-field-analyzers:
+
+Per-field analyzers
+^^^^^^^^^^^^^^^^^^^
+
+The ``perfield`` analyzer configures multiple analyzers for different fields.
+
+*Example of defining different analyzers for different fields:*
+
+.. code-block:: javascript
+
+    {
+           "_id": "_design/analyzer_example",
+           "indexes": {
+                   "INDEX_NAME": {
+                           "analyzer": {
+                                   "name": "perfield",
+                                   "default": "english",
+                                   "fields": {
+                                           "spanish": "spanish",
+                                           "german": "german"
+                                   }
+                           },
+                           "index": "function (doc) { ... }"
+                   }
+           }
+    }
+
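As a rough sketch of how such a design document might be assembled in code: the keys under `fields` are taken here to be the field names passed as the first argument to `index`. The helper `buildDesignDoc`, the index name `translations`, and the document fields `doc.spanish` and `doc.german` are all hypothetical names chosen for illustration.

```javascript
// Hypothetical index function. `index` is supplied by the search runtime when
// the design document is evaluated, so it is never called in this sketch.
const indexFn = function (doc) {
  if (typeof doc.spanish === "string") {
    index("spanish", doc.spanish);
  }
  if (typeof doc.german === "string") {
    index("german", doc.german);
  }
};

// Build a design document whose search index uses the given analyzer.
function buildDesignDoc(name, indexName, fn, analyzer) {
  return {
    _id: "_design/" + name,
    indexes: {
      [indexName]: {
        analyzer: analyzer,
        // Design documents are JSON, so the function is stored as source text.
        index: fn.toString(),
      },
    },
  };
}

const designDoc = buildDesignDoc("analyzer_example", "translations", indexFn, {
  name: "perfield",
  default: "english",
  fields: { spanish: "spanish", german: "german" },
});

console.log(JSON.stringify(designDoc, null, 2));
```

Keeping the index function as real code and serializing it with `toString()` lets an editor or linter check it before it is embedded as a string.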
+Stop words
+^^^^^^^^^^
+
+Stop words are words that do not get indexed.
+You define them within a design document by turning the analyzer string into 
an object.
+
+.. note:: 
+
+    The ``keyword``, ``simple``, and ``whitespace`` analyzers do not support stop words.
+
+The default stop words for the ``standard`` analyzer are included below:
+
+.. code-block:: javascript
+
+    "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if", 
+    "in", "into", "is", "it", "no", "not", "of", "on", "or", "such", 
+    "that", "the", "their", "then", "there", "these", "they", "this", 
+    "to", "was", "will", "with" 
+
+
+*Example of defining non-indexed ('stop') words:*
+
+.. code-block:: javascript
+
+    {
+           "_id": "_design/stop_words_example",
+           "indexes": {
+                   "INDEX_NAME": {
+                           "analyzer": {
+                                   "name": "portuguese",
+                                   "stopwords": [
+                                           "foo",
+                                           "bar",
+                                           "baz"
+                                   ]
+                           },
+                           "index": "function (doc) { ... }"
+                   }
+           }
+    }
+
+Testing analyzer tokenization
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+You can test the results of analyzer tokenization by posting sample data to 
the ``_search_analyze`` endpoint.
+
+*Example of using HTTP to test the ``keyword`` analyzer:*
+
+.. code-block:: http
+
+    POST /_search_analyze HTTP/1.1
+    Content-Type: application/json
+
+    {"analyzer":"keyword", "text":"ablanks@renovations.com"}
+
+*Example of using the command line to test the ``keyword`` analyzer:*
+
+.. code-block:: sh
+
+    curl "https://$HOST:5984/_search_analyze" -H 'Content-Type: application/json' \
+           -d '{"analyzer":"keyword", "text":"ablanks@renovations.com"}'
+
+*Result of testing the ``keyword`` analyzer:*
+
+.. code-block:: javascript
+
+    {
+           "tokens": [
+                   "ablanks@renovations.com"
+           ]
+    }
+
+*Example of using HTTP to test the ``standard`` analyzer:*
+
+.. code-block:: http
+
+    POST /_search_analyze HTTP/1.1
+    Content-Type: application/json
+
+    {"analyzer":"standard", "text":"ablanks@renovations.com"}
+
+*Example of using the command line to test the ``standard`` analyzer:*
+
+.. code-block:: sh
+
+    curl "https://$HOST:5984/_search_analyze" -H 'Content-Type: application/json' \
+           -d '{"analyzer":"standard", "text":"ablanks@renovations.com"}'
+
+*Result of testing the ``standard`` analyzer:*
+
+.. code-block:: javascript
+
+    {
+           "tokens": [
+                   "ablanks",
+                   "renovations.com"
+           ]
+    }
+
+Queries
+-------
+
+After you create a search index, you can query it.
+
+- Issue a partition query using: ``GET /$DATABASE/_partition/$PARTITION_KEY/_design/$DDOC/_search/$INDEX_NAME``
+- Issue a global query using: ``GET /$DATABASE/_design/$DDOC/_search/$INDEX_NAME``
+
+Specify your search by using the ``query`` parameter.
+
+*Example of using HTTP to query a partitioned index:*
+
+.. code-block:: http
+
+    GET /$DATABASE/_partition/$PARTITION_KEY/_design/$DDOC/_search/$INDEX_NAME?include_docs=true&query="*:*"&limit=1 HTTP/1.1
+    Content-Type: application/json
+
+*Example of using HTTP to query a global index:*
+
+.. code-block:: http
+
+    GET /$DATABASE/_design/$DDOC/_search/$INDEX_NAME?include_docs=true&query="*:*"&limit=1 HTTP/1.1
+    Content-Type: application/json
+
+*Example of using the command line to query a partitioned index:*
+
+.. code-block:: sh
+
+    curl "https://$HOST:5984/$DATABASE/_partition/$PARTITION_KEY/_design/$DDOC/_search/$INDEX_NAME?include_docs=true&query=*:*&limit=1"
+
+*Example of using the command line to query a global index:*
+
+.. code-block:: sh
+
+    curl "https://$HOST:5984/$DATABASE/_design/$DDOC/_search/$INDEX_NAME?include_docs=true&query=*:*&limit=1"
+
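Because Lucene queries contain characters such as `:` that need escaping, it can help to build the request path programmatically. The following is a minimal sketch, not an official client: `buildSearchPath` and its option names are hypothetical, and it assumes that array or object parameter values (for example `counts`) are sent as JSON text.

```javascript
// Build the path and query string for a search request.
// Pass `partition` to target a partitioned index; omit it for a global query.
function buildSearchPath({ database, ddoc, index, partition, params }) {
  const segments = [database];
  if (partition !== undefined) {
    segments.push("_partition", partition);
  }
  segments.push("_design", ddoc, "_search", index);

  const query = new URLSearchParams();
  for (const [name, value] of Object.entries(params)) {
    // Arrays and objects are serialized as JSON text; everything else as a string.
    const text = typeof value === "object" ? JSON.stringify(value) : String(value);
    query.append(name, text);
  }

  const path = segments.map((segment) => encodeURIComponent(segment)).join("/");
  return "/" + path + "?" + query.toString();
}

console.log(
  buildSearchPath({
    database: "animals",
    ddoc: "search_example",
    index: "animals",
    params: { query: "class:bird", limit: 1 },
  })
);
```

`URLSearchParams` percent-encodes the `:` in the query, so the resulting string can be appended directly to the server's base URL.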
+.. _api/ddoc/view/query_parameters:
+
+Query Parameters
+^^^^^^^^^^^^^^^^
+
+You must enable :ref:`faceting <api/ddoc/view>` before you can use the 
following parameters:
+
+-      ``counts``
+-      ``drilldown``
+
++------------------------+------------------------------------------------------+-------------------+------------------+-----------------------+-------------------+
+| Argument               | Description                                          | Optional          | Type             | Supported values      | Partitioned query |
++========================+======================================================+===================+==================+=======================+===================+
+| ``bookmark``           | A bookmark that was received from a previous search. | yes               | String           |                       | yes               |
+|                        | This parameter enables paging through the results.   |                   |                  |                       |                   |
+|                        | If there are no more results after the bookmark,     |                   |                  |                       |                   |
+|                        | you get a response with an empty rows array and the  |                   |                  |                       |                   |
+|                        | same bookmark, confirming the end of the result      |                   |                  |                       |                   |
+|                        | list.                                                |                   |                  |                       |                   |
++------------------------+------------------------------------------------------+-------------------+------------------+-----------------------+-------------------+
+| ``counts``             | This field defines an array of names of string       | yes               | JSON             | A JSON array of field | no                |
+|                        | fields, for which counts are requested. The response |                   |                  | names.                |                   |
+|                        | contains counts for each unique value of this        |                   |                  |                       |                   |
+|                        | field name among the documents that match the search |                   |                  |                       |                   |
+|                        | query. :ref:`faceting <api/ddoc/view>` must          |                   |                  |                       |                   |
+|                        | be enabled for this parameter to function.           |                   |                  |                       |                   |
++------------------------+------------------------------------------------------+-------------------+------------------+-----------------------+-------------------+
+| ``drilldown``          | This field can be used several times. Each use       | no                | JSON             | A JSON array with two | yes               |
+|                        | defines a pair with a field name and a value.        |                   |                  | elements: the field   |                   |
+|                        | The search matches only documents containing the     |                   |                  | name and the value.   |                   |
+|                        | value that was provided in the named field. It       |                   |                  |                       |                   |
+|                        | differs from using ``"fieldname:value"`` in          |                   |                  |                       |                   |
+|                        | the ``q`` parameter only in that the values are not  |                   |                  |                       |                   |
+|                        | analyzed. :ref:`faceting <api/ddoc/view>` must       |                   |                  |                       |                   |
+|                        | be enabled for this parameter to function.           |                   |                  |                       |                   |
++------------------------+------------------------------------------------------+-------------------+------------------+-----------------------+-------------------+
+| ``group_field``        | Field that groups search matches                     | yes               | String           | A string that         | no                |
+|                        |                                                      |                   |                  | contains the name of  |                   |
+|                        |                                                      |                   |                  | a string field.       |                   |
+|                        |                                                      |                   |                  | Fields containing     |                   |
+|                        |                                                      |                   |                  | other data such as    |                   |
+|                        |                                                      |                   |                  | numbers, objects, or  |                   |
+|                        |                                                      |                   |                  | arrays cannot be      |                   |
+|                        |                                                      |                   |                  | used.                 |                   |
++------------------------+------------------------------------------------------+-------------------+------------------+-----------------------+-------------------+
+| ``group_limit``        | Maximum group count. This field can be used only if  | yes               | Numeric          |                       | no                |
+|                        | ``group_field`` is specified.                        |                   |                  |                       |                   |
++------------------------+------------------------------------------------------+-------------------+------------------+-----------------------+-------------------+
+| ``group_sort``         | This field defines the order of the groups in a      | yes               | JSON             | This field can have   | no                |
+|                        | search that uses ``group_field``. The default sort   |                   |                  | the same values as    |                   |
+|                        | order is relevance.                                  |                   |                  | the sort field, so    |                   |
+|                        |                                                      |                   |                  | single fields and     |                   |
+|                        |                                                      |                   |                  | arrays of fields are  |                   |
+|                        |                                                      |                   |                  | supported.            |                   |
++------------------------+------------------------------------------------------+-------------------+------------------+-----------------------+-------------------+
+| ``highlight_fields``   | Specifies which fields to highlight. If specified,   | yes               | Array of strings |                       | no                |
+|                        | the result object contains a ``highlights`` field    |                   |                  |                       |                   |
+|                        | with an entry for each specified field.              |                   |                  |                       |                   |
++------------------------+------------------------------------------------------+-------------------+------------------+-----------------------+-------------------+
+| ``highlight_pre_tag``  | A string that is inserted before the highlighted     | yes, defaults     | String           |                       | yes               |
+|                        | word in the highlights output.                       | to ``<em>``       |                  |                       |                   |
++------------------------+------------------------------------------------------+-------------------+------------------+-----------------------+-------------------+
+| ``highlight_post_tag`` | A string that is inserted after the highlighted word | yes, defaults     | String           |                       | yes               |
+|                        | in the highlights output.                            | to ``</em>``      |                  |                       |                   |
++------------------------+------------------------------------------------------+-------------------+------------------+-----------------------+-------------------+
+| ``highlight_number``   | Number of fragments that are returned in highlights. | yes, defaults     | Numeric          |                       | yes               |
+|                        | If the search term occurs less often than the number | to 1              |                  |                       |                   |
+|                        | of fragments that are specified, longer fragments    |                   |                  |                       |                   |
+|                        | are returned.                                        |                   |                  |                       |                   |
++------------------------+------------------------------------------------------+-------------------+------------------+-----------------------+-------------------+
+| ``highlight_size``     | Number of characters in each fragment for            | yes, defaults to  | Numeric          |                       | yes               |
+|                        | highlights.                                          | 100 characters    |                  |                       |                   |
++------------------------+------------------------------------------------------+-------------------+------------------+-----------------------+-------------------+
+| ``include_docs``       | Include the full content of the documents in the     | yes               | Boolean          |                       | yes               |
+|                        | response.                                            |                   |                  |                       |                   |
++------------------------+------------------------------------------------------+-------------------+------------------+-----------------------+-------------------+
+| ``include_fields``     | A JSON array of field names to include in search     | yes, the default  | Array of strings |                       | yes               |
+|                        | results. Any fields that are included must be        | is all fields     |                  |                       |                   |
+|                        | indexed with the ``store:true`` option.              |                   |                  |                       |                   |
++------------------------+------------------------------------------------------+-------------------+------------------+-----------------------+-------------------+
+| ``limit``              | Limit the number of the returned documents to the    | yes               | Numeric          | The limit value can   | yes               |
+|                        | specified number. For a grouped search, this         |                   |                  | be any positive       |                   |
+|                        | parameter limits the number of documents per group.  |                   |                  | integer number up to  |                   |
+|                        |                                                      |                   |                  | and including 200.    |                   |
++------------------------+------------------------------------------------------+-------------------+------------------+-----------------------+-------------------+
+| ``q``                  | Abbreviation for ``query``. Runs a Lucene query.     | no                | String or number |                       | yes               |
++------------------------+------------------------------------------------------+-------------------+------------------+-----------------------+-------------------+
+| ``query``              | Runs a Lucene query.                                 | no                | String or number |                       | yes               |
++------------------------+------------------------------------------------------+-------------------+------------------+-----------------------+-------------------+
+| ``ranges``             | This field defines ranges for faceted, numeric       | yes               | JSON             | The value must be an  | no                |
+|                        | search fields. The value is a JSON object where      |                   |                  | object with fields    |                   |
+|                        | the field names are faceted numeric search fields,   |                   |                  | that have objects as  |                   |
+|                        | and the values of the fields are JSON objects. The   |                   |                  | their values. These   |                   |
+|                        | field names of the JSON objects are names for        |                   |                  | objects must have     |                   |
+|                        | ranges. The values are strings that describe the    
 |                   |                  | strings with ranges   |               
    |
+|                        | range, for example ``"[0 TO 10]"``.                 
 |                   |                  | as their field        |               
    |
+|                        |                                                     
 |                   |                  | values.               |               
    |
++------------------------+------------------------------------------------------+-------------------+------------------+-----------------------+-------------------+
+| ``sort``               | Specifies the sort order of the results. In a       
 | yes               | JSON             | A JSON string of the  | yes           
    |
+|                        | grouped search (when ``group_field`` is             
 |                   |                  | form                  |               
    |
+|                        | used), this parameter specifies the sort order      
 |                   |                  | ``"fieldname<type>"`` |               
    |
+|                        | within a group. The default sort order is relevance.
 |                   |                  | or                    |               
    |
+|                        |                                                     
 |                   |                  | ``-fieldname<type>``  |               
    |
+|                        |                                                     
 |                   |                  | for descending order, |               
    |
+|                        |                                                     
 |                   |                  | where ``fieldname``   |               
    |
+|                        |                                                     
 |                   |                  | is the name of a      |               
    |
+|                        |                                                     
 |                   |                  | string or number      |               
    |
+|                        |                                                     
 |                   |                  | field, and ``type``   |               
    |
+|                        |                                                     
 |                   |                  | is either a number, a |               
    |
+|                        |                                                     
 |                   |                  | string, or a JSON     |               
    |
+|                        |                                                     
 |                   |                  | array of strings. The |               
    |
+|                        |                                                     
 |                   |                  | ``type`` part is      |               
    |
+|                        |                                                     
 |                   |                  | optional, and         |               
    |
+|                        |                                                     
 |                   |                  | defaults to           |               
    |
+|                        |                                                     
 |                   |                  | ``number``. Some      |               
    |
+|                        |                                                     
 |                   |                  | examples are          |               
    |
+|                        |                                                     
 |                   |                  | ``"foo"``,            |               
    |
+|                        |                                                     
 |                   |                  | ``"-foo"``,           |               
    |
+|                        |                                                     
 |                   |                  | ``"bar<string>"``,    |               
    |
+|                        |                                                     
 |                   |                  | ``"-foo<number>"``,   |               
    |
+|                        |                                                     
 |                   |                  | and                   |               
    |
+|                        |                                                     
 |                   |                  | ``["-foo<number>",    |               
    |
+|                        |                                                     
 |                   |                  | "bar<string>"]``.     |               
    |
+|                        |                                                     
 |                   |                  | String fields that    |               
    |
+|                        |                                                     
 |                   |                  | are used for sorting  |               
    |
+|                        |                                                     
 |                   |                  | must not be analyzed  |               
    |
+|                        |                                                     
 |                   |                  | fields. Fields that   |               
    |
+|                        |                                                     
 |                   |                  | are used for sorting  |               
    |
+|                        |                                                     
 |                   |                  | must be indexed by    |               
    |
+|                        |                                                     
 |                   |                  | the same indexer that |               
    |
+|                        |                                                     
 |                   |                  | is used for the       |               
    |
+|                        |                                                     
 |                   |                  | search query.         |               
    |
++------------------------+------------------------------------------------------+-------------------+------------------+-----------------------+-------------------+
+| ``stale``              | Do not wait for the index to finish building to     
 | yes               | String           | OK                    | yes           
    |
+|                        | return results.                                     
 |                   |                  |                       |               
    |
++------------------------+------------------------------------------------------+-------------------+------------------+-----------------------+-------------------+
+
+.. note::
+    Do not combine the ``bookmark`` and ``stale`` options. These options 
constrain the choice of shard replicas to use for the response. When used 
together, the options might cause problems when contact is attempted with 
replicas that are slow or not available.
+
+Relevance
+^^^^^^^^^
+
+When more than one result might be returned,
+it is possible for them to be sorted.
+By default,
+the sorting order is determined by 'relevance'.
+
+Relevance is measured according to
+`Apache Lucene Scoring <https://lucene.apache.org/core/3_6_0/scoring.html>`_.
+As an example,
+if you search a simple database for the word ``example``,
+two documents might contain the word.
+If one document mentions the word ``example`` 10 times,
+but the second document mentions it only twice,
+then the first document is considered to be more 'relevant'.
+
+If you do not provide a ``sort`` parameter,
+relevance is used by default.
+The highest scoring matches are returned first.
+
+If you provide a ``sort`` parameter,
+then matches are returned in that order,
+ignoring relevance.
+
+If you want to use a ``sort`` parameter,
+and also include ordering by relevance in your search results,
+use the special fields ``-<score>`` or ``<score>`` within the ``sort`` 
parameter.
+
+POSTing search queries
+^^^^^^^^^^^^^^^^^^^^^^
+
+Instead of using the ``GET`` HTTP method,
+you can also use ``POST``.
+The main advantage of ``POST`` queries is that they can have a request body,
+so you can specify the request as a JSON object.
+Each parameter in the previous table corresponds to a field in the JSON object 
in the request body.
+
+*Example of using HTTP to ``POST`` a search request:*
+
+.. code-block:: http
+
+    POST /db/_design/ddoc/_search/searchname HTTP/1.1
+    Content-Type: application/json
+
+*Example of using the command line to ``POST`` a search request:*
+
+.. code-block:: sh
+
+    curl "https://$HOST:5984/db/_design/ddoc/_search/searchname" -X POST -H 'Content-Type: application/json' -d @search.json
+
+*Example JSON document that contains a search request:*
+
+.. code-block:: javascript
+
+    {
+        "q": "index:my query",
+        "sort": "foo",
+        "limit": 3
+    }
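+
+As a rough illustration only,
+a client could assemble this request body in code before sending it.
+The helper name ``buildSearchBody`` is hypothetical and not part of CouchDB.
+The sketch assumes only the ``limit`` rule from the previous table
+(a positive integer up to and including 200).
+
+.. code-block:: javascript
+
+    // Hypothetical helper, not part of CouchDB: builds a search request body.
+    function buildSearchBody(q, options) {
+        var opts = options || {};
+        var body = { q: q };
+        if (opts.sort !== undefined) {
+            body.sort = opts.sort;
+        }
+        if (opts.limit !== undefined) {
+            // The parameter table allows positive integers up to and including 200.
+            if (!Number.isInteger(opts.limit) || opts.limit < 1 || opts.limit > 200) {
+                throw new RangeError("limit must be an integer from 1 to 200");
+            }
+            body.limit = opts.limit;
+        }
+        return JSON.stringify(body);
+    }
+
+    // Same values as the example JSON document above.
+    var body = buildSearchBody("index:my query", { sort: "foo", limit: 3 });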
+
+Query syntax
+------------
+
+The CouchDB search query syntax is based on the
+`Lucene syntax 
<http://lucene.apache.org/core/4_3_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Overview>`_.
+Search queries take the form of ``name:value`` unless the name is omitted,
+in which case they use the default field,
+as demonstrated in the following examples:
+
+*Example search query expressions:*
+
+.. code-block:: text
+
+    // Birds
+    class:bird
+
+.. code-block:: text
+
+    // Animals that begin with the letter "l"
+    l*
+
+.. code-block:: text
+
+    // Carnivorous birds
+    class:bird AND diet:carnivore
+
+.. code-block:: text
+
+    // Herbivores that start with letter "l"
+    l* AND diet:herbivore
+
+.. code-block:: text
+
+    // Medium-sized herbivores
+    min_length:[1 TO 3] AND diet:herbivore
+
+.. code-block:: text
+
+    // Herbivores that are 2m long or less
+    diet:herbivore AND min_length:[-Infinity TO 2]
+
+.. code-block:: text
+
+    // Mammals that are at least 1.5m long
+    class:mammal AND min_length:[1.5 TO Infinity]
+
+.. code-block:: text
+
+    // Find "Meles meles"
+    latin_name:"Meles meles"
+
+.. code-block:: text
+
+    // Mammals that are herbivores or omnivores
+    diet:(herbivore OR omnivore) AND class:mammal
+
+.. code-block:: text
+
+    // Return all results
+    *:*
+
+Queries over multiple fields can be logically combined,
+and groups and fields can be further grouped.
+The available logical operators are case-sensitive and are ``AND``, ``+``, 
``OR``, ``NOT`` and ``-``.
+Range queries can run over strings or numbers.
+
+If you want a fuzzy search,
+you can run a query with ``~`` to find terms like the search term.
+For instance,
+``look~`` finds the terms ``book`` and ``took``.
+
+.. note::
+    If the lower and upper bounds of a range query are both strings that 
contain only numeric digits, the bounds are treated as numbers not as strings. 
For example, if you search by using the query ``mod_date:["20170101" TO 
"20171231"]``, the results include documents for which ``mod_date`` is between 
the numeric values 20170101 and 20171231, not between the strings "20170101" 
and "20171231".
+
+You can alter the importance of a search term by adding ``^`` and a positive 
number.
+This alteration makes matches containing the term more or less relevant,
+proportional to the power of the boost value.
+The default value is 1,
+which means no increase or decrease in the strength of the match.
+A decimal value between 0 and 1 reduces importance,
+making the match strength weaker.
+A value greater than one increases importance,
+making the match strength stronger.
+
+Wildcard searches are supported,
+for both single (``?``) and multiple (``*``) character searches.
+For example,
+``dat?`` would match ``date`` and ``data``,
+whereas ``dat*`` would match ``date``,
+``data``,
+``database``,
+and ``dates``.
+Wildcards must come after the search term.
+
+Use ``*:*`` to return all results.
+
+Result sets from searches are limited to 200 rows,
+and return 25 rows by default.
+The number of rows that are returned can be changed
+by using the ``limit`` parameter; see :ref:`query-parameters <api/ddoc/view>`.
+
+If the search query does *not* specify the ``"group_field"`` argument,
+the response contains a bookmark.
+If this bookmark is later provided as a URL parameter,
+the response skips the rows that were seen already,
+making it quick and easy to get the next set of results.
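+
+The paging behavior described above can be sketched in client code.
+This is a minimal illustration, not a CouchDB API:
+``runSearch`` is a hypothetical caller-supplied function that sends one search request
+and returns the parsed response,
+and the sketch assumes that an empty ``rows`` array means no results remain.
+It is written synchronously for brevity; a real HTTP client would usually be asynchronous.
+
+.. code-block:: javascript
+
+    // Sketch: collects rows page by page by passing each bookmark back in.
+    // `runSearch` is hypothetical and supplied by the caller.
+    function collectRows(runSearch, query, maxPages) {
+        var rows = [];
+        var bookmark;
+        for (var page = 0; page < maxPages; page++) {
+            var params = { q: query };
+            if (bookmark !== undefined) {
+                params.bookmark = bookmark;
+            }
+            var response = runSearch(params);
+            // Assumption: an empty page means there is nothing left to fetch.
+            if (response.rows.length === 0) {
+                break;
+            }
+            rows = rows.concat(response.rows);
+            bookmark = response.bookmark;
+        }
+        return rows;
+    }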
+
+.. note:: 
+    The response never includes a bookmark if the ``"group_field"`` parameter 
is included in the search query. For more information, see 
:ref:`query-parameters <api/ddoc/view>`. 
+
+.. note:: 
+    The ``group_field``, ``group_limit``, and ``group_sort`` options are only 
available when making global queries.
+
+The following characters require escaping if you want to search on them:
+
+.. code-block:: text
+
+    + - && || ! ( ) { } [ ] ^ " ~ * ? : \ /
+
+
+To escape one of these characters,
+use a preceding backslash character (``\``).
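+
+A client that builds queries from user input might apply this escaping in code.
+The following is a minimal sketch;
+the name ``escapeSearchTerm`` is hypothetical,
+and it assumes that escaping each ``&`` and ``|`` individually is acceptable
+for the two-character operators.
+
+.. code-block:: javascript
+
+    // Hypothetical helper: prefixes each character listed above with a backslash.
+    function escapeSearchTerm(term) {
+        // Assumption: && and || are handled by escaping each & and | separately.
+        return String(term).replace(/[+\-&|!(){}\[\]^"~*?:\\\/]/g, "\\$&");
+    }
+
+    // "1+1:2" becomes 1\+1\:2
+    var escaped = escapeSearchTerm("1+1:2");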
+
+The response to a search query contains an ``order`` field for each of the 
results.
+The ``order`` field is an array where the first element is the field or fields 
that are specified
+in the ``sort`` parameter. See :ref:`query-parameters <api/ddoc/view>`.
+If no ``sort`` parameter is included in the query,
+then the ``order`` field contains the `Lucene relevance score 
<https://lucene.apache.org/core/3_6_0/scoring.html>`_.
+If you use the 'sort by distance' feature as described in 
:ref:`geographical-searches <api/ddoc/view>`,
+then the first element is the distance from a point.
+The distance is measured by using either kilometers or miles.
+
+.. note:: 
+    The second element in the order array can be ignored.
+    It is used for troubleshooting purposes only.
+
+.. _api/ddoc/view/faceting:
+
+Faceting
+^^^^^^^^
+
+CouchDB Search also supports faceted searching,
+enabling discovery of aggregate information about matches quickly and easily.
+You can match all documents by using the special ``?q=*:*`` query syntax,
+and use the returned facets to refine your query.
+To indicate that a field must be indexed for faceted queries,
+set ``{"facet": true}`` in its options.
+
+*Example of a search index function that enables faceting for two fields:*
+
+.. code-block:: javascript
+
+    function(doc) {
+        index("type", doc.type, {"facet": true});
+        index("price", doc.price, {"facet": true});
+    }
+
+To use facets,
+all the documents in the index must include all the fields that have faceting 
enabled.
+If your documents do not include all the fields,
+you receive a ``bad_request`` error with the following reason, "The 
``field_name`` does not exist."
+If each document does not contain all the fields for facets,
+create separate indexes for each field.
+If you do not create separate indexes for each field,
+you must include only documents that contain all the fields.
+Verify that the fields exist in each document by using a single ``if`` 
statement.
+
+*Example ``if`` statement to verify that the required fields exist in each 
document:*
+
+.. code-block:: javascript
+
+    if (typeof doc.town == "string" && typeof doc.name == "string") {
+        index("town", doc.town, {facet: true});
+        index("name", doc.name, {facet: true});
+    }
+
+Counts
+^^^^^^
+
+.. note:: 
+    The ``counts`` option is only available when making global queries.
+
+The ``counts`` facet syntax takes a list of fields,
+and returns the number of query results for each unique value of each named 
field.
+
+.. note::
+    The ``counts`` operation works only if the indexed values are strings.
+    The indexed values cannot be mixed types. For example,
+    if 100 strings are indexed, and one number,
+    then the index cannot be used for ``counts`` operations.
+    You can check the type by using the ``typeof`` operator, and convert it by 
using the ``parseInt``,
+    ``parseFloat``, or ``.toString()`` functions.
+
+*Example of a query using the ``counts`` facet syntax:* 
+
+.. code-block:: http
+
+    ?q=*:*&counts=["type"]
+
+*Example response after use of the ``counts`` facet syntax:*
+
+.. code-block:: javascript
+
+    {
+        "total_rows":100000,
+        "bookmark":"g...",
+        "rows":[...],
+        "counts":{
+            "type":{
+                "sofa": 10,
+                "chair": 100,
+                "lamp": 97
+            }
+        }
+    }
+
+``drilldown``
+^^^^^^^^^^^^^
+
+.. note:: 
+    The ``drilldown`` option is only available when making global queries.
+
+You can restrict results to documents with a dimension equal to the specified 
label.
+Restrict the results by adding ``drilldown=["dimension","label"]`` to a search 
query.
+You can include multiple ``drilldown`` parameters to restrict results along 
multiple dimensions.
+
+Using a ``drilldown`` parameter is similar to using ``key:value`` in the ``q`` 
parameter,
+but the ``drilldown`` parameter returns values that the analyzer might skip.
+
+For example,
+if the analyzer did not index a stop word like ``"a"``,
+using ``drilldown`` returns it when you specify ``drilldown=["key","a"]``.
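+
+Because each ``drilldown`` value is a JSON array,
+it needs URL encoding when it is sent as a query parameter.
+The following client-side sketch shows one way to build such a query string;
+the function name ``buildDrilldownQuery`` and the sample dimensions are hypothetical,
+and it relies on the standard ``URLSearchParams`` class available in browsers and Node.js.
+
+.. code-block:: javascript
+
+    // Sketch: builds a query string with one drilldown parameter per dimension.
+    function buildDrilldownQuery(q, drilldowns) {
+        var params = new URLSearchParams();
+        params.append("q", q);
+        drilldowns.forEach(function (pair) {
+            // Each value is a JSON array of the form ["dimension","label"].
+            params.append("drilldown", JSON.stringify(pair));
+        });
+        return "?" + params.toString();
+    }
+
+    // Hypothetical dimensions and labels, for illustration only.
+    var queryString = buildDrilldownQuery("*:*", [["type", "sofa"], ["town", "a"]]);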
+
+Ranges
+^^^^^^
+
+.. note:: 
+    The ``ranges`` option is only available when making global queries.
+
+The ``ranges`` facet syntax reuses the standard Lucene syntax for ranges
+to return counts of results that fit into each specified category.
+Inclusive range queries are denoted by brackets (``[``, ``]``).
+Exclusive range queries are denoted by curly brackets (``{``, ``}``).
+
+.. note::
+    The ``ranges`` operation works only if the indexed values are numbers.
+    The indexed values cannot be mixed types. For example, if 100 strings are indexed,
+    and one number, then the index cannot be used for ``ranges`` operations.
+    You can check the type by using the ``typeof`` operator, and convert 
+    it by using the ``parseInt``, ``parseFloat``, or ``.toString()`` functions.
+
+*Example of a request that uses faceted search for matching ``ranges``:*
+
+.. code-block:: http
+
+    ?q=*:*&ranges={"price":{"cheap":"[0 TO 100]","expensive":"{100 TO Infinity}"}}
+
+*Example results after a ``ranges`` check on a faceted search:*
+
+.. code-block:: javascript
+
+    {
+        "total_rows":100000,
+        "bookmark":"g...",
+        "rows":[...],
+        "ranges": {
+            "price": {
+                "expensive": 278682,
+                "cheap": 257023
+            }
+        }
+    }
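+
+To make the difference between inclusive and exclusive bounds concrete,
+the following sketch tests a number against a range string such as ``"[0 TO 100]"``.
+It only illustrates the bracket semantics described above;
+the function name ``inRange`` is hypothetical,
+and this is not how CouchDB or Lucene evaluates ranges internally.
+
+.. code-block:: javascript
+
+    // Sketch: tests a number against a range string such as "[0 TO 100]".
+    function inRange(range, value) {
+        var match = /^([\[{])\s*(\S+)\s+TO\s+(\S+?)\s*([\]}])$/.exec(range);
+        if (!match) {
+            throw new SyntaxError("unrecognized range: " + range);
+        }
+        var lower = Number(match[2]);
+        var upper = Number(match[3]);
+        // Square brackets include the bound; curly brackets exclude it.
+        var aboveLower = match[1] === "[" ? value >= lower : value > lower;
+        var belowUpper = match[4] === "]" ? value <= upper : value < upper;
+        return aboveLower && belowUpper;
+    }
+
+    // With the ranges from the request above, a price of exactly 100
+    // falls in "cheap" but not in "expensive".
+    var isCheap = inRange("[0 TO 100]", 100);
+    var isExpensive = inRange("{100 TO Infinity}", 100);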
+
+.. _api/ddoc/view/geographical_searches:
+
+Geographical searches
+---------------------
 
 Review comment:
   Do we need a disclaimer here that this is Lucene's implementation, not the 
fancy geo index Cloudant also has open sourced? It probably only matters to 
geo-heads, but if I understand correctly, there's no map projection going on in 
Lucene, this is just a numerical bounding box.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
