[29/34] cassandra git commit: Finish fixing the CQL doc

slebresne Mon, 27 Jun 2016 11:34:24 -0700
http://git-wip-us.apache.org/repos/asf/cassandra/blob/0e624238/doc/source/cql/dml.rst
----------------------------------------------------------------------
diff --git a/doc/source/cql/dml.rst b/doc/source/cql/dml.rst
index 4cf34eb..58fdb05 100644
--- a/doc/source/cql/dml.rst
+++ b/doc/source/cql/dml.rst
@@ -21,586 +21,478 @@
 Data Manipulation
 -----------------
 
+This section describes the statements supported by CQL to insert, update, 
delete and query data.
+
+.. _select-statement:
+
 SELECT
 ^^^^^^
 
-*Syntax:*
+Querying data from data is done using a ``SELECT`` statement:
 
-| bc(syntax)..
-|  ::= SELECT ( JSON )? 
-|  FROM 
-|  ( WHERE )?
-|  ( ORDER BY )?
-|  ( PER PARTITION LIMIT )?
-|  ( LIMIT )?
-|  ( ALLOW FILTERING )?
+.. productionlist::
+   select_statement: SELECT [ JSON | DISTINCT ] ( `select_clause` | '*' )
+                   : FROM `table_name`
+                   : [ WHERE `where_clause` ]
+                   : [ ORDER BY `ordering_clause` ]
+                   : [ PER PARTITION LIMIT (`integer` | `bind_marker`) ]
+                   : [ LIMIT (`integer` | `bind_marker`) ]
+                   : [ ALLOW FILTERING ]
+   select_clause: `selector` [ AS `identifier` ] ( ',' `selector` [ AS 
`identifier` ] )
+   selector: `identifier`
+           : | `term`
+           : | CAST '(' `selector` AS `cql_type` ')'
+           : | `function_name` '(' [ `selector` ( ',' `selector` )* ] ')'
+           : | COUNT '(' '*' ')'
+   where_clause: `relation` ( AND `relation` )*
+   relation: `identifier` `operator` `term`
+           : '(' `identifier` ( ',' `identifier` )* ')' `operator` 
`tuple_literal`
+           : TOKEN '(' `identifier` ( ',' `identifier` )* ')' `operator` `term`
+   operator: '=' | '<' | '>' | '<=' | '>=' | '!=' | IN | CONTAINS | CONTAINS 
KEY
+   ordering_clause: `identifier` [ ASC | DESC ] ( ',' `identifier` [ ASC | 
DESC ] )*
 
-|  ::= DISTINCT? 
-|  \| COUNT â(â ( â\*â \| â1â ) â)â (AS )?
+For instance::
 
-|  ::= (AS )? ( â,â (AS )? )\*
-|  \| â\*â
+    SELECT name, occupation FROM users WHERE userid IN (199, 200, 207);
+    SELECT JSON name, occupation FROM users WHERE userid = 199;
+    SELECT name AS user_name, occupation AS user_occupation FROM users;
 
-|  ::= 
-|  \| WRITETIME â(â â)â
-|  \| TTL â(â â)â
-|  \| CAST â(â AS â)â
-|  \| â(â ( (â,â )\*)? â)â
+    SELECT time, value
+    FROM events
+    WHERE event_type = 'myEvent'
+      AND time > '2011-02-03'
+      AND time <= '2012-01-01'
 
- ::= ( AND )\*
+    SELECT COUNT (*) AS user_count FROM users;
 
-|  ::= 
-|  \| â(â (â,â )\* â)â 
-|  \| IN â(â ( ( â,â )\* )? â)â
-|  \| â(â (â,â )\* â)â IN â(â ( ( â,â )\* )? â)â
-|  \| TOKEN â(â ( â,â )\* â)â 
+The ``SELECT`` statements reads one or more columns for one or more rows in a 
table. It returns a result-set of the rows
+matching the request, where each row contains the values for the selection 
corresponding to the query. Additionally,
+:ref:`functions <cql-functions>` including :ref:`aggregation 
<aggregate-functions>` ones can be applied to the result.
 
-|  ::= â=â \| â<â \| â>â \| â<=â \| â>=â \| CONTAINS \| 
CONTAINS KEY
-|  ::= ( â,â )\*
-|  ::= ( ASC \| DESC )?
-|  ::= â(â (â,â )\* â)â
-| p.
-| *Sample:*
+A ``SELECT`` statement contains at least a :ref:`selection clause 
<selection-clause>` and the name of the table on which
+the selection is on (note that CQL does **not** joins or sub-queries and thus 
a select statement only apply to a single
+table). In most case, a select will also have a :ref:`where clause 
<where-clause>` and it can optionally have additional
+clauses to :ref:`order <ordering-clause>` or :ref:`limit <limit-clause>` the 
results. Lastly, :ref:`queries that require
+filtering <allow-filtering>` can be allowed if the ``ALLOW FILTERING`` flag is 
provided.
+
+.. _selection-clause:
 
-| bc(sample)..
-| SELECT name, occupation FROM users WHERE userid IN (199, 200, 207);
+Selection clause
+~~~~~~~~~~~~~~~~
 
-SELECT JSON name, occupation FROM users WHERE userid = 199;
+The :token:`select_clause` determines which columns needs to be queried and 
returned in the result-set, as well as any
+transformation to apply to this result before returning. It consists of a 
comma-separated list of *selectors* or,
+alternatively, of the wildcard character (``*``) to select all the columns 
defined in the table.
 
-SELECT name AS user\_name, occupation AS user\_occupation FROM users;
+Selectors
+`````````
 
-| SELECT time, value
-| FROM events
-| WHERE event\_type = âmyEventâ
-|  AND time > â2011-02-03â
-|  AND time <= â2012-01-01â
+A :token:`selector` can be one of:
 
-SELECT COUNT (\*) FROM users;
+- A column name of the table selected, to retrieve the values for that column.
+- A term, which is usually used nested inside other selectors like functions 
(if a term is selected directly, then the
+  corresponding column of the result-set will simply have the value of this 
term for every row returned).
+- A casting, which allows to convert a nested selector to a (compatible) type.
+- A function call, where the arguments are selector themselves. See the 
section on :ref:`functions <cql-functions>` for
+  more details.
+- The special call ``COUNT(*)`` to the :ref:`COUNT function <count-function>`, 
which counts all non-null results.
 
-SELECT COUNT (\*) AS user\_count FROM users;
+Aliases
+```````
 
-The ``SELECT`` statements reads one or more columns for one or more rows
-in a table. It returns a result-set of rows, where each row contains the
-collection of columns corresponding to the query. If the ``JSON``
-keyword is used, the results for each row will contain only a single
-column named âjsonâ. See the section on
-```SELECT JSON`` <#selectJson>`__ for more details.
+Every *top-level* selector can also be aliased (using `AS`). If so, the name 
of the corresponding column in the result
+set will be that of the alias. For instance::
 
-``<select-clause>``
-~~~~~~~~~~~~~~~~~~~
+    // Without alias
+    SELECT intAsBlob(4) FROM t;
+
+    //  intAsBlob(4)
+    // --------------
+    //  0x00000004
+
+    // With alias
+    SELECT intAsBlob(4) AS four FROM t;
+
+    //  four
+    // ------------
+    //  0x00000004
+
+.. note:: Currently, aliases aren't recognized anywhere else in the statement 
where they are used (not in the ``WHERE``
+   clause, not in the ``ORDER BY`` clause, ...). You must use the orignal 
column name instead.
+
+
+``WRITETIME`` and ``TTL`` function
+```````````````````````````````````
+
+Selection supports two special functions (that aren't allowed anywhere else): 
``WRITETIME`` and ``TTL``. Both function
+take only one argument and that argument *must* be a column name (so for 
instance ``TTL(3)`` is invalid).
+
+Those functions allow to retrieve meta-information that are stored internally 
for each column, namely:
+
+- the timestamp of the value of the column for ``WRITETIME``.
+- the remaining time to live (in seconds) for the value of the column if it 
set to expire (and ``null`` otherwise).
+
+.. _where-clause:
+
+The ``WHERE`` clause
+~~~~~~~~~~~~~~~~~~~~
+
+The ``WHERE`` clause specifies which rows must be queried. It is composed of 
relations on the columns that are part of
+the ``PRIMARY KEY`` and/or have a `secondary index <#createIndexStmt>`__ 
defined on them.
+
+Not all relations are allowed in a query. For instance, non-equal relations 
(where ``IN`` is considered as an equal
+relation) on a partition key are not supported (but see the use of the 
``TOKEN`` method below to do non-equal queries on
+the partition key). Moreover, for a given partition key, the clustering 
columns induce an ordering of rows and relations
+on them is restricted to the relations that allow to select a **contiguous** 
(for the ordering) set of rows. For
+instance, given::
+
+    CREATE TABLE posts (
+        userid text,
+        blog_title text,
+        posted_at timestamp,
+        entry_title text,
+        content text,
+        category int,
+        PRIMARY KEY (userid, blog_title, posted_at)
+    )
+
+The following query is allowed::
+
+    SELECT entry_title, content FROM posts
+     WHERE userid = 'john doe'
+       AND blog_title='John''s Blog'
+       AND posted_at >= '2012-01-01' AND posted_at < '2012-01-31'
+
+But the following one is not, as it does not select a contiguous set of rows 
(and we suppose no secondary indexes are
+set)::
+
+    // Needs a blog_title to be set to select ranges of posted_at
+    SELECT entry_title, content FROM posts
+     WHERE userid = 'john doe'
+       AND posted_at >= '2012-01-01' AND posted_at < '2012-01-31'
+
+When specifying relations, the ``TOKEN`` function can be used on the 
``PARTITION KEY`` column to query. In that case,
+rows will be selected based on the token of their ``PARTITION_KEY`` rather 
than on the value. Note that the token of a
+key depends on the partitioner in use, and that in particular the 
RandomPartitioner won't yield a meaningful order. Also
+note that ordering partitioners always order token values by bytes (so even if 
the partition key is of type int,
+``token(-1) > token(0)`` in particular). Example::
+
+    SELECT * FROM posts
+     WHERE token(userid) > token('tom') AND token(userid) < token('bob')
+
+Moreover, the ``IN`` relation is only allowed on the last column of the 
partition key and on the last column of the full
+primary key.
 
-The ``<select-clause>`` determines which columns needs to be queried and
-returned in the result-set. It consists of either the comma-separated
-list of or the wildcard character (``*``) to select all the columns
-defined for the table.
-
-A ``<selector>`` is either a column name to retrieve or a ``<function>``
-of one or more ``<term>``\ s. The function allowed are the same as for
-``<term>`` and are described in the `function section <#functions>`__.
-In addition to these generic functions, the ``WRITETIME`` (resp.
-``TTL``) function allows to select the timestamp of when the column was
-inserted (resp. the time to live (in seconds) for the column (or null if
-the column has no expiration set)) and the ```CAST`` <#castFun>`__
-function can be used to convert one data type to another.
-
-Any ``<selector>`` can be aliased using ``AS`` keyword (see examples).
-Please note that ``<where-clause>`` and ``<order-by>`` clause should
-refer to the columns by their original names and not by their aliases.
-
-The ``COUNT`` keyword can be used with parenthesis enclosing ``*``. If
-so, the query will return a single result: the number of rows matching
-the query. Note that ``COUNT(1)`` is supported as an alias.
-
-``<where-clause>``
+It is also possible to âgroupâ ``CLUSTERING COLUMNS`` together in a 
relation using the tuple notation. For instance::
+
+    SELECT * FROM posts
+     WHERE userid = 'john doe'
+       AND (blog_title, posted_at) > ('John''s Blog', '2012-01-01')
+
+will request all rows that sorts after the one having âJohn's Blogâ as 
``blog_tile`` and '2012-01-01' for ``posted_at``
+in the clustering order. In particular, rows having a ``post_at <= 
'2012-01-01'`` will be returned as long as their
+``blog_title > 'John''s Blog'``, which would not be the case for::
+
+    SELECT * FROM posts
+     WHERE userid = 'john doe'
+       AND blog_title > 'John''s Blog'
+       AND posted_at > '2012-01-01'
+
+The tuple notation may also be used for ``IN`` clauses on clustering columns::
+
+    SELECT * FROM posts
+     WHERE userid = 'john doe'
+       AND (blog_title, posted_at) IN (('John''s Blog', '2012-01-01), 
('Extreme Chess', '2014-06-01'))
+
+The ``CONTAINS`` operator may only be used on collection columns (lists, sets, 
and maps). In the case of maps,
+``CONTAINS`` applies to the map values. The ``CONTAINS KEY`` operator may only 
be used on map columns and applies to the
+map keys.
+
+.. _ordering-clause:
+
+Ordering results
+~~~~~~~~~~~~~~~~
+
+The ``ORDER BY`` clause allows to select the order of the returned results. It 
takes as argument a list of column names
+along with the order for the column (``ASC`` for ascendant and ``DESC`` for 
descendant, omitting the order being
+equivalent to ``ASC``). Currently the possible orderings are limited by the 
:ref:`clustering order <clustering-order>`
+defined on the table:
+
+- if the table has been defined without any specific ``CLUSTERING ORDER``, 
then then allowed orderings are the order
+  induced by the clustering columns and the reverse of that one.
+- otherwise, the orderings allowed are the order of the ``CLUSTERING ORDER`` 
option and the reversed one.
+
+.. _limit-clause:
+
+Limiting results
+~~~~~~~~~~~~~~~~
+
+The ``LIMIT`` option to a ``SELECT`` statement limits the number of rows 
returned by a query, while the ``PER PARTITION
+LIMIT`` option limits the number of rows returned for a given partition by the 
query. Note that both type of limit can
+used in the same statement.
+
+.. _allow-filtering:
+
+Allowing filtering
 ~~~~~~~~~~~~~~~~~~
 
-The ``<where-clause>`` specifies which rows must be queried. It is
-composed of relations on the columns that are part of the
-``PRIMARY KEY`` and/or have a `secondary index <#createIndexStmt>`__
-defined on them.
-
-Not all relations are allowed in a query. For instance, non-equal
-relations (where ``IN`` is considered as an equal relation) on a
-partition key are not supported (but see the use of the ``TOKEN`` method
-below to do non-equal queries on the partition key). Moreover, for a
-given partition key, the clustering columns induce an ordering of rows
-and relations on them is restricted to the relations that allow to
-select a **contiguous** (for the ordering) set of rows. For instance,
-given
-
-| bc(sample).
-| CREATE TABLE posts (
-|  userid text,
-|  blog\_title text,
-|  posted\_at timestamp,
-|  entry\_title text,
-|  content text,
-|  category int,
-|  PRIMARY KEY (userid, blog\_title, posted\_at)
-| )
-
-The following query is allowed:
-
-| bc(sample).
-| SELECT entry\_title, content FROM posts WHERE userid=âjohn doeâ AND
-  blog\_title=âJohnââs Blogâ AND posted\_at >= â2012-01-01â AND
-  posted\_at < â2012-01-31â
-
-But the following one is not, as it does not select a contiguous set of
-rows (and we suppose no secondary indexes are set):
-
-| bc(sample).
-| // Needs a blog\_title to be set to select ranges of posted\_at
-| SELECT entry\_title, content FROM posts WHERE userid=âjohn doeâ AND
-  posted\_at >= â2012-01-01â AND posted\_at < â2012-01-31â
-
-When specifying relations, the ``TOKEN`` function can be used on the
-``PARTITION KEY`` column to query. In that case, rows will be selected
-based on the token of their ``PARTITION_KEY`` rather than on the value.
-Note that the token of a key depends on the partitioner in use, and that
-in particular the RandomPartitioner wonât yield a meaningful order. Also
-note that ordering partitioners always order token values by bytes (so
-even if the partition key is of type int, ``token(-1) > token(0)`` in
-particular). Example:
-
-| bc(sample).
-| SELECT \* FROM posts WHERE token(userid) > token(âtomâ) AND
-  token(userid) < token(âbobâ)
-
-Moreover, the ``IN`` relation is only allowed on the last column of the
-partition key and on the last column of the full primary key.
-
-It is also possible to âgroupâ ``CLUSTERING COLUMNS`` together in a
-relation using the tuple notation. For instance:
-
-| bc(sample).
-| SELECT \* FROM posts WHERE userid=âjohn doeâ AND (blog\_title,
-  posted\_at) > (âJohnââs Blogâ, â2012-01-01â)
-
-will request all rows that sorts after the one having âJohnâs Blogâ as
-``blog_tile`` and â2012-01-01â for ``posted_at`` in the clustering
-order. In particular, rows having a ``post_at <= '2012-01-01'`` will be
-returned as long as their ``blog_title > 'John''s Blog'``, which
-wouldnât be the case for:
-
-| bc(sample).
-| SELECT \* FROM posts WHERE userid=âjohn doeâ AND blog\_title >
-  âJohnââs Blogâ AND posted\_at > â2012-01-01â
-
-The tuple notation may also be used for ``IN`` clauses on
-``CLUSTERING COLUMNS``:
-
-| bc(sample).
-| SELECT \* FROM posts WHERE userid=âjohn doeâ AND (blog\_title,
-  posted\_at) IN ((âJohnââs Blogâ, â2012-01-01), (âExtreme 
Chessâ,
-  â2014-06-01â))
-
-The ``CONTAINS`` operator may only be used on collection columns (lists,
-sets, and maps). In the case of maps, ``CONTAINS`` applies to the map
-values. The ``CONTAINS KEY`` operator may only be used on map columns
-and applies to the map keys.
-
-``<order-by>``
-~~~~~~~~~~~~~~
-
-The ``ORDER BY`` option allows to select the order of the returned
-results. It takes as argument a list of column names along with the
-order for the column (``ASC`` for ascendant and ``DESC`` for descendant,
-omitting the order being equivalent to ``ASC``). Currently the possible
-orderings are limited (which depends on the table
-```CLUSTERING ORDER`` <#createTableOptions>`__ ):
-
--  if the table has been defined without any specific
-   ``CLUSTERING ORDER``, then then allowed orderings are the order
-   induced by the clustering columns and the reverse of that one.
--  otherwise, the orderings allowed are the order of the
-   ``CLUSTERING ORDER`` option and the reversed one.
-
-``LIMIT`` and ``PER PARTITION LIMIT``
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-The ``LIMIT`` option to a ``SELECT`` statement limits the number of rows
-returned by a query, while the ``PER PARTITION LIMIT`` option limits the
-number of rows returned for a given partition by the query. Note that
-both type of limit can used in the same statement.
-
-``ALLOW FILTERING``
-~~~~~~~~~~~~~~~~~~~
+By default, CQL only allows select queries that don't involve âfilteringâ 
server side, i.e. queries where we know that
+all (live) record read will be returned (maybe partly) in the result set. The 
reasoning is that those ânon filteringâ
+queries have predictable performance in the sense that they will execute in a 
time that is proportional to the amount of
+data **returned** by the query (which can be controlled through ``LIMIT``).
+
+The ``ALLOW FILTERING`` option allows to explicitly allow (some) queries that 
require filtering. Please note that a
+query using ``ALLOW FILTERING`` may thus have unpredictable performance (for 
the definition above), i.e. even a query
+that selects a handful of records **may** exhibit performance that depends on 
the total amount of data stored in the
+cluster.
+
+For instance, considering the following table holding user profiles with their 
year of birth (with a secondary index on
+it) and country of residence::
+
+    CREATE TABLE users (
+        username text PRIMARY KEY,
+        firstname text,
+        lastname text,
+        birth_year int,
+        country text
+    )
+
+    CREATE INDEX ON users(birth_year);
+
+Then the following queries are valid::
+
+    SELECT * FROM users;
+    SELECT * FROM users WHERE birth_year = 1981;
+
+because in both case, Cassandra guarantees that these queries performance will 
be proportional to the amount of data
+returned. In particular, if no users are born in 1981, then the second query 
performance will not depend of the number
+of user profile stored in the database (not directly at least: due to 
secondary index implementation consideration, this
+query may still depend on the number of node in the cluster, which indirectly 
depends on the amount of data stored.
+Nevertheless, the number of nodes will always be multiple number of magnitude 
lower than the number of user profile
+stored). Of course, both query may return very large result set in practice, 
but the amount of data returned can always
+be controlled by adding a ``LIMIT``.
 
-By default, CQL only allows select queries that donât involve
-âfilteringâ server side, i.e. queries where we know that all (live)
-record read will be returned (maybe partly) in the result set. The
-reasoning is that those ânon filteringâ queries have predictable
-performance in the sense that they will execute in a time that is
-proportional to the amount of data **returned** by the query (which can
-be controlled through ``LIMIT``).
-
-The ``ALLOW FILTERING`` option allows to explicitly allow (some) queries
-that require filtering. Please note that a query using
-``ALLOW FILTERING`` may thus have unpredictable performance (for the
-definition above), i.e. even a query that selects a handful of records
-**may** exhibit performance that depends on the total amount of data
-stored in the cluster.
-
-For instance, considering the following table holding user profiles with
-their year of birth (with a secondary index on it) and country of
-residence:
-
-| bc(sample)..
-| CREATE TABLE users (
-|  username text PRIMARY KEY,
-|  firstname text,
-|  lastname text,
-|  birth\_year int,
-|  country text
-| )
-
-| CREATE INDEX ON users(birth\_year);
-| p.
-
-Then the following queries are valid:
-
-| bc(sample).
-| SELECT \* FROM users;
-| SELECT firstname, lastname FROM users WHERE birth\_year = 1981;
-
-because in both case, Cassandra guarantees that these queries
-performance will be proportional to the amount of data returned. In
-particular, if no users are born in 1981, then the second query
-performance will not depend of the number of user profile stored in the
-database (not directly at least: due to secondary index implementation
-consideration, this query may still depend on the number of node in the
-cluster, which indirectly depends on the amount of data stored.
-Nevertheless, the number of nodes will always be multiple number of
-magnitude lower than the number of user profile stored). Of course, both
-query may return very large result set in practice, but the amount of
-data returned can always be controlled by adding a ``LIMIT``.
-
-However, the following query will be rejected:
-
-| bc(sample).
-| SELECT firstname, lastname FROM users WHERE birth\_year = 1981 AND
-  country = âFRâ;
-
-because Cassandra cannot guarantee that it wonât have to scan large
-amount of data even if the result to those query is small. Typically, it
-will scan all the index entries for users born in 1981 even if only a
-handful are actually from France. However, if you âknow what you are
-doingâ, you can force the execution of this query by using
-``ALLOW FILTERING`` and so the following query is valid:
-
-| bc(sample).
-| SELECT firstname, lastname FROM users WHERE birth\_year = 1981 AND
-  country = âFRâ ALLOW FILTERING;
+However, the following query will be rejected::
+
+    SELECT * FROM users WHERE birth_year = 1981 AND country = 'FR';
+
+because Cassandra cannot guarantee that it won't have to scan large amount of 
data even if the result to those query is
+small. Typically, it will scan all the index entries for users born in 1981 
even if only a handful are actually from
+France. However, if you âknow what you are doingâ, you can force the 
execution of this query by using ``ALLOW
+FILTERING`` and so the following query is valid::
+
+    SELECT * FROM users WHERE birth_year = 1981 AND country = 'FR' ALLOW 
FILTERING;
+
+.. _insert-statement:
 
 INSERT
 ^^^^^^
 
-*Syntax:*
-
-| bc(syntax)..
-|  ::= INSERT INTO 
-|  ( ( VALUES )
-|  \| ( JSON ))
-|  ( IF NOT EXISTS )?
-|  ( USING ( AND )\* )?
-
- ::= â(â ( â,â )\* â)â
-
- ::= â(â ( â,â )\* â)â
-
-|  ::= 
-|  \| 
-
-|  ::= TIMESTAMP 
-|  \| TTL 
-| p.
-| *Sample:*
-
-| bc(sample)..
-| INSERT INTO NerdMovies (movie, director, main\_actor, year)
-|  VALUES (âSerenityâ, âJoss Whedonâ, âNathan Fillionâ, 2005)
-| USING TTL 86400;
-
-| INSERT INTO NerdMovies JSON â{`movie <>`__ âSerenityâ, `director <>`__
-  âJoss Whedonâ, `year <>`__ 2005}â
-| p.
-| The ``INSERT`` statement writes one or more columns for a given row in
-  a table. Note that since a row is identified by its ``PRIMARY KEY``,
-  at least the columns composing it must be specified. The list of
-  columns to insert to must be supplied when using the ``VALUES``
-  syntax. When using the ``JSON`` syntax, they are optional. See the
-  section on ```INSERT JSON`` <#insertJson>`__ for more details.
-
-Note that unlike in SQL, ``INSERT`` does not check the prior existence
-of the row by default: the row is created if none existed before, and
-updated otherwise. Furthermore, there is no mean to know which of
-creation or update happened.
-
-It is however possible to use the ``IF NOT EXISTS`` condition to only
-insert if the row does not exist prior to the insertion. But please note
-that using ``IF NOT EXISTS`` will incur a non negligible performance
-cost (internally, Paxos will be used) so this should be used sparingly.
+Inserting data for a row is done using an ``INSERT`` statement:
+
+.. productionlist::
+   insert_statement: INSERT INTO `table_name` ( `names_values` | `json_clause` 
)
+                   : [ IF NOT EXISTS ]
+                   : [ USING `update_parameter` ( AND `update_parameter` )* ]
+   names_values: `names` VALUES `tuple_literal`
+   names: '(' `identifier` ( ',' `identifier` )* ')'
+
+For instance::
+
+    INSERT INTO NerdMovies (movie, director, main_actor, year)
+                    VALUES ('Serenity', 'Joss Whedon', 'Nathan Fillion', 2005)
+          USING TTL 86400;
+
+    INSERT INTO NerdMovies JSON '{"movie": "Serenity",
+                                  "director": "Joss Whedon",
+                                  "year": 2005}';
+
+The ``INSERT`` statement writes one or more columns for a given row in a 
table. Note that since a row is identified by
+its ``PRIMARY KEY``, at least the columns composing it must be specified. The 
list of columns to insert to must be
+supplied when using the ``VALUES`` syntax. When using the ``JSON`` syntax, 
they are optional. See the
+section on :ref:`JSON support <cql-json>` for more detail.
+
+Note that unlike in SQL, ``INSERT`` does not check the prior existence of the 
row by default: the row is created if none
+existed before, and updated otherwise. Furthermore, there is no mean to know 
which of creation or update happened.
+
+It is however possible to use the ``IF NOT EXISTS`` condition to only insert 
if the row does not exist prior to the
+insertion. But please note that using ``IF NOT EXISTS`` will incur a non 
negligible performance cost (internally, Paxos
+will be used) so this should be used sparingly.
 
 All updates for an ``INSERT`` are applied atomically and in isolation.
 
-Please refer to the ```UPDATE`` <#updateOptions>`__ section for
-information on the ``<option>`` available and to the
-`collections <#collections>`__ section for use of
-``<collection-literal>``. Also note that ``INSERT`` does not support
-counters, while ``UPDATE`` does.
+Please refer to the :ref:`UPDATE <update-parameters>` section for informations 
on the :token:`update_parameter`.
+
+Also note that ``INSERT`` does not support counters, while ``UPDATE`` does.
+
+.. _update-statement:
 
 UPDATE
 ^^^^^^
 
-*Syntax:*
-
-| bc(syntax)..
-|  ::= UPDATE 
-|  ( USING ( AND )\* )?
-|  SET ( â,â )\*
-|  WHERE 
-|  ( IF ( AND condition )\* )?
-
-|  ::= â=â 
-|  \| â=â (â+â \| â-â) ( \| \| )
-|  \| â=â â+â 
-|  \| â[â â]â â=â 
-|  \| â.â â=â 
-
-|  ::= 
-|  \| IN 
-|  \| â[â â]â 
-|  \| â[â â]â IN 
-|  \| â.â 
-|  \| â.â IN 
-
-|  ::= â<â \| â<=â \| â=â \| â!=â \| â>=â \| â>â
-|  ::= ( \| â(â ( ( â,â )\* )? â)â)
-
- ::= ( AND )\*
-
-|  ::= â=â 
-|  \| â(â (â,â )\* â)â â=â 
-|  \| IN â(â ( ( â,â )\* )? â)â
-|  \| IN 
-|  \| â(â (â,â )\* â)â IN â(â ( ( â,â )\* )? â)â
-|  \| â(â (â,â )\* â)â IN 
-
-|  ::= TIMESTAMP 
-|  \| TTL 
-| p.
-| *Sample:*
-
-| bc(sample)..
-| UPDATE NerdMovies USING TTL 400
-| SET director = âJoss Whedonâ,
-|  main\_actor = âNathan Fillionâ,
-|  year = 2005
-| WHERE movie = âSerenityâ;
-
-| UPDATE UserActions SET total = total + 2 WHERE user =
-  B70DE1D0-9908-4AE3-BE34-5573E5B09F14 AND action = âclickâ;
-| p.
-| The ``UPDATE`` statement writes one or more columns for a given row in
-  a table. The ``<where-clause>`` is used to select the row to update
-  and must include all columns composing the ``PRIMARY KEY``. Other
-  columns values are specified through ``<assignment>`` after the
-  ``SET`` keyword.
-
-Note that unlike in SQL, ``UPDATE`` does not check the prior existence
-of the row by default (except through the use of ``<condition>``, see
-below): the row is created if none existed before, and updated
-otherwise. Furthermore, there are no means to know whether a creation or
-update occurred.
-
-It is however possible to use the conditions on some columns through
-``IF``, in which case the row will not be updated unless the conditions
-are met. But, please note that using ``IF`` conditions will incur a
-non-negligible performance cost (internally, Paxos will be used) so this
-should be used sparingly.
-
-In an ``UPDATE`` statement, all updates within the same partition key
-are applied atomically and in isolation.
-
-The ``c = c + 3`` form of ``<assignment>`` is used to
-increment/decrement counters. The identifier after the â=â sign **must**
-be the same than the one before the â=â sign (Only increment/decrement
-is supported on counters, not the assignment of a specific value).
-
-The ``id = id + <collection-literal>`` and ``id[value1] = value2`` forms
-of ``<assignment>`` are for collections. Please refer to the `relevant
-section <#collections>`__ for more details.
-
-The ``id.field = <term>`` form of ``<assignemt>`` is for setting the
-value of a single field on a non-frozen user-defined types.
-
-``<options>``
-~~~~~~~~~~~~~
-
-The ``UPDATE`` and ``INSERT`` statements support the following options:
-
--  ``TIMESTAMP``: sets the timestamp for the operation. If not
-   specified, the coordinator will use the current time (in
-   microseconds) at the start of statement execution as the timestamp.
-   This is usually a suitable default.
--  ``TTL``: specifies an optional Time To Live (in seconds) for the
-   inserted values. If set, the inserted values are automatically
-   removed from the database after the specified time. Note that the TTL
-   concerns the inserted values, not the columns themselves. This means
-   that any subsequent update of the column will also reset the TTL (to
-   whatever TTL is specified in that update). By default, values never
-   expire. A TTL of 0 is equivalent to no TTL. If the table has a
-   default\_time\_to\_live, a TTL of 0 will remove the TTL for the
-   inserted or updated values.
+Updating a row is done using an ``UPDATE`` statement:
+
+.. productionlist::
+   update_statement: UPDATE `table_name`
+                   : [ USING `update_parameter` ( AND `update_parameter` )* ]
+                   : SET `assignment` ( ',' `assignment` )*
+                   : WHERE `where_clause`
+                   : [ IF ( EXISTS | `condition` ( AND `condition` )*) ]
+   update_parameter: ( TIMESTAMP | TTL ) ( `integer` | `bind_marker` )
+   assignment: `simple_selection` '=' `term`
+             :| `column_name` '=' `column_name` ( '+' | '-' ) `term`
+             :| `column_name` '=' `list_literal` '+' `column_name`
+   simple_selection: `column_name`
+                   :| `column_name` '[' `term` ']'
+                   :| `column_name` '.' `field_name
+   condition: `simple_selection` `operator` `term`
+
+For instance::
+
+    UPDATE NerdMovies USING TTL 400
+       SET director   = 'Joss Whedon',
+           main_actor = 'Nathan Fillion',
+           year       = 2005
+     WHERE movie = 'Serenity';
+
+    UPDATE UserActions
+       SET total = total + 2
+       WHERE user = B70DE1D0-9908-4AE3-BE34-5573E5B09F14
+         AND action = 'click';
+
+The ``UPDATE`` statement writes one or more columns for a given row in a 
table. The :token:`where_clause` is used to
+select the row to update and must include all columns composing the ``PRIMARY 
KEY``. Non primary key columns are then
+set using the ``SET`` keyword.
+
+Note that unlike in SQL, ``UPDATE`` does not check the prior existence of the 
row by default (except through ``IF``, see
+below): the row is created if none existed before, and updated otherwise. 
Furthermore, there are no means to know
+whether a creation or update occurred.
+
+It is however possible to use the conditions on some columns through ``IF``, 
in which case the row will not be updated
+unless the conditions are met. But, please note that using ``IF`` conditions 
will incur a non-negligible performance
+cost (internally, Paxos will be used) so this should be used sparingly.
+
+In an ``UPDATE`` statement, all updates within the same partition key are 
applied atomically and in isolation.
+
+Regarding the :token:`assignment`:
+
+- ``c = c + 3`` is used to increment/decrement counters. The column name after 
the '=' sign **must** be the same than
+  the one before the '=' sign. Note that increment/decrement is only allowed 
on counters, and are the *only* update
+  operations allowed on counters. See the section on :ref:`counters 
<counters>` for details.
+- ``id = id + <some-collection>`` and ``id[value1] = value2`` are for 
collections, see the :ref:`relevant section
+  <collections>` for details.
+- ``id.field = 3`` is for setting the value of a field on a non-frozen 
user-defined types. see the :ref:`relevant section
+  <udts>` for details.
+
+.. _update-parameters:
+
+Update parameters
+~~~~~~~~~~~~~~~~~
+
+The ``UPDATE``, ``INSERT`` (and ``DELETE`` and ``BATCH`` for the 
``TIMESTAMP``) statements support the following
+parameters:
+
+- ``TIMESTAMP``: sets the timestamp for the operation. If not specified, the 
coordinator will use the current time (in
+  microseconds) at the start of statement execution as the timestamp. This is 
usually a suitable default.
+- ``TTL``: specifies an optional Time To Live (in seconds) for the inserted 
values. If set, the inserted values are
+  automatically removed from the database after the specified time. Note that 
the TTL concerns the inserted values, not
+  the columns themselves. This means that any subsequent update of the column 
will also reset the TTL (to whatever TTL
+  is specified in that update). By default, values never expire. A TTL of 0 is 
equivalent to no TTL. If the table has a
+  default_time_to_live, a TTL of 0 will remove the TTL for the inserted or 
updated values.
+
+.. _delete_statement:
 
 DELETE
 ^^^^^^
 
-*Syntax:*
-
-| bc(syntax)..
-|  ::= DELETE ( ( â,â )\* )?
-|  FROM 
-|  ( USING TIMESTAMP )?
-|  WHERE 
-|  ( IF ( EXISTS \| ( ( AND )\*) ) )?
-
-|  ::= 
-|  \| â[â â]â
-|  \| â.â 
-
- ::= ( AND )\*
-
-|  ::= 
-|  \| â(â (â,â )\* â)â 
-|  \| IN â(â ( ( â,â )\* )? â)â
-|  \| IN 
-|  \| â(â (â,â )\* â)â IN â(â ( ( â,â )\* )? â)â
-|  \| â(â (â,â )\* â)â IN 
-
-|  ::= â=â \| â<â \| â>â \| â<=â \| â>=â
-|  ::= ( \| â(â ( ( â,â )\* )? â)â)
-
-|  ::= ( \| â!=â) 
-|  \| IN 
-|  \| â[â â]â ( \| â!=â) 
-|  \| â[â â]â IN 
-|  \| â.â ( \| â!=â) 
-|  \| â.â IN 
-
-*Sample:*
-
-| bc(sample)..
-| DELETE FROM NerdMovies USING TIMESTAMP 1240003134 WHERE movie =
-  âSerenityâ;
-
-| DELETE phone FROM Users WHERE userid IN
-  (C73DE1D3-AF08-40F3-B124-3FF3E5109F22,
-  B70DE1D0-9908-4AE3-BE34-5573E5B09F14);
-| p.
-| The ``DELETE`` statement deletes columns and rows. If column names are
-  provided directly after the ``DELETE`` keyword, only those columns are
-  deleted from the row indicated by the ``<where-clause>``. The
-  ``id[value]`` syntax in ``<selection>`` is for non-frozen collections
-  (please refer to the `collection section <#collections>`__ for more
-  details). The ``id.field`` syntax is for the deletion of non-frozen
-  user-defined types. Otherwise, whole rows are removed. The
-  ``<where-clause>`` specifies which rows are to be deleted. Multiple
-  rows may be deleted with one statement by using an ``IN`` clause. A
-  range of rows may be deleted using an inequality operator (such as
-  ``>=``).
-
-``DELETE`` supports the ``TIMESTAMP`` option with the same semantics as
-the ```UPDATE`` <#updateStmt>`__ statement.
-
-In a ``DELETE`` statement, all deletions within the same partition key
-are applied atomically and in isolation.
-
-A ``DELETE`` operation can be conditional through the use of an ``IF``
-clause, similar to ``UPDATE`` and ``INSERT`` statements. However, as
-with ``INSERT`` and ``UPDATE`` statements, this will incur a
-non-negligible performance cost (internally, Paxos will be used) and so
-should be used sparingly.
+Deleting rows or parts of rows uses the ``DELETE`` statement:
+
+.. productionlist::
+   delete_statement: DELETE [ `simple_selection` ( ',' `simple_selection` ) ]
+                   : FROM `table_name`
+                   : [ USING `update_parameter` ( AND `update_parameter` )* ]
+                   : WHERE `where_clause`
+                   : [ IF ( EXISTS | `condition` ( AND `condition` )*) ]
+
+For instance::
+
+    DELETE FROM NerdMovies USING TIMESTAMP 1240003134
+     WHERE movie = 'Serenity';
+
+    DELETE phone FROM Users
+     WHERE userid IN (C73DE1D3-AF08-40F3-B124-3FF3E5109F22, 
B70DE1D0-9908-4AE3-BE34-5573E5B09F14);
+
+The ``DELETE`` statement deletes columns and rows. If column names are 
provided directly after the ``DELETE`` keyword,
+only those columns are deleted from the row indicated by the ``WHERE`` clause. 
Otherwise, whole rows are removed.
+
+The ``WHERE`` clause specifies which rows are to be deleted. Multiple rows may 
be deleted with one statement by using an
+``IN`` operator. A range of rows may be deleted using an inequality operator 
(such as ``>=``).
+
+``DELETE`` supports the ``TIMESTAMP`` option with the same semantics as in 
:ref:`updates <update-parameters>`.
+
+In a ``DELETE`` statement, all deletions within the same partition key are 
applied atomically and in isolation.
+
+A ``DELETE`` operation can be conditional through the use of an ``IF`` clause, 
similar to ``UPDATE`` and ``INSERT``
+statements. However, as with ``INSERT`` and ``UPDATE`` statements, this will 
incur a non-negligible performance cost
+(internally, Paxos will be used) and so should be used sparingly.
+
+.. _batch_statement:
 
 BATCH
 ^^^^^
 
-*Syntax:*
-
-| bc(syntax)..
-|  ::= BEGIN ( UNLOGGED \| COUNTER ) BATCH
-|  ( USING ( AND )\* )?
-|  ( â;â )\*
-|  APPLY BATCH
-
-|  ::= 
-|  \| 
-|  \| 
-
-|  ::= TIMESTAMP 
-| p.
-| *Sample:*
-
-| bc(sample).
-| BEGIN BATCH
-|  INSERT INTO users (userid, password, name) VALUES (âuser2â,
-  âch@ngem3bâ, âsecond userâ);
-|  UPDATE users SET password = âps22dhdsâ WHERE userid = âuser3â;
-|  INSERT INTO users (userid, password) VALUES (âuser4â, âch@ngem3câ);
-|  DELETE name FROM users WHERE userid = âuser1â;
-| APPLY BATCH;
-
-The ``BATCH`` statement group multiple modification statements
-(insertions/updates and deletions) into a single statement. It serves
-several purposes:
-
-#. It saves network round-trips between the client and the server (and
-   sometimes between the server coordinator and the replicas) when
-   batching multiple updates.
-#. All updates in a ``BATCH`` belonging to a given partition key are
-   performed in isolation.
-#. By default, all operations in the batch are performed as ``LOGGED``,
-   to ensure all mutations eventually complete (or none will). See the
-   notes on ```UNLOGGED`` <#unloggedBatch>`__ for more details.
+Multiple ``INSERT``, ``UPDATE`` and ``DELETE`` can be executed in a single 
statement by grouping them through a
+``BATCH`` statement:
+
+.. productionlist::
+   batch_statement: BEGIN [ UNLOGGED | COUNTER ] BATCH
+                   : [ USING `update_parameter` ( AND `update_parameter` )* ]
+                   : `modification_statement` ( ';' `modification_statement` )*
+                   : APPLY BATCH
+   modification_statement: `insert_statement` | `update_statement` | 
`delete_statement`
+
+For instance::
+
+    BEGIN BATCH
+       INSERT INTO users (userid, password, name) VALUES ('user2', 
'ch@ngem3b', 'second user');
+       UPDATE users SET password = 'ps22dhds' WHERE userid = 'user3';
+       INSERT INTO users (userid, password) VALUES ('user4', 'ch@ngem3c');
+       DELETE name FROM users WHERE userid = 'user1';
+    APPLY BATCH;
+
+The ``BATCH`` statement group multiple modification statements 
(insertions/updates and deletions) into a single
+statement. It serves several purposes:
+
+- It saves network round-trips between the client and the server (and 
sometimes between the server coordinator and the
+  replicas) when batching multiple updates.
+- All updates in a ``BATCH`` belonging to a given partition key are performed 
in isolation.
+- By default, all operations in the batch are performed as *logged*, to ensure 
all mutations eventually complete (or
+  none will). See the notes on :ref:`UNLOGGED batches <unlogged-batches>` for 
more details.
 
 Note that:
 
--  ``BATCH`` statements may only contain ``UPDATE``, ``INSERT`` and
-   ``DELETE`` statements.
--  Batches are *not* a full analogue for SQL transactions.
--  If a timestamp is not specified for each operation, then all
-   operations will be applied with the same timestamp. Due to
-   Cassandraâs conflict resolution procedure in the case of `timestamp
-   ties <http://wiki.apache.org/cassandra/FAQ#clocktie>`__, operations
-   may be applied in an order that is different from the order they are
-   listed in the ``BATCH`` statement. To force a particular operation
-   ordering, you must specify per-operation timestamps.
-
-``UNLOGGED``
-~~~~~~~~~~~~
-
-By default, Cassandra uses a batch log to ensure all operations in a
-batch eventually complete or none will (note however that operations are
-only isolated within a single partition).
-
-There is a performance penalty for batch atomicity when a batch spans
-multiple partitions. If you do not want to incur this penalty, you can
-tell Cassandra to skip the batchlog with the ``UNLOGGED`` option. If the
-``UNLOGGED`` option is used, a failed batch might leave the patch only
-partly applied.
-
-``COUNTER``
-~~~~~~~~~~~
+- ``BATCH`` statements may only contain ``UPDATE``, ``INSERT`` and ``DELETE`` 
statements (not other batches for instance).
+- Batches are *not* a full analogue for SQL transactions.
+- If a timestamp is not specified for each operation, then all operations will 
be applied with the same timestamp
+  (either one generated automatically, or the timestamp provided at the batch 
level). Due to Cassandra's conflict
+  resolution procedure in the case of `timestamp ties 
<http://wiki.apache.org/cassandra/FAQ#clocktie>`__, operations may
+  be applied in an order that is different from the order they are listed in 
the ``BATCH`` statement. To force a
+  particular operation ordering, you must specify per-operation timestamps.
 
-Use the ``COUNTER`` option for batched counter updates. Unlike other
-updates in Cassandra, counter updates are not idempotent.
+.. _unlogged-batches:
+
+``UNLOGGED`` batches
+~~~~~~~~~~~~~~~~~~~~
 
-``<option>``
-~~~~~~~~~~~~
+By default, Cassandra uses a batch log to ensure all operations in a batch 
eventually complete or none will (note
+however that operations are only isolated within a single partition).
 
-``BATCH`` supports both the ``TIMESTAMP`` option, with similar semantic
-to the one described in the ```UPDATE`` <#updateOptions>`__ statement
-(the timestamp applies to all the statement inside the batch). However,
-if used, ``TIMESTAMP`` **must not** be used in the statements within the
-batch.
+There is a performance penalty for batch atomicity when a batch spans multiple 
partitions. If you do not want to incur
+this penalty, you can tell Cassandra to skip the batchlog with the 
``UNLOGGED`` option. If the ``UNLOGGED`` option is
+used, a failed batch might leave the patch only partly applied.
+
+``COUNTER`` batches
+~~~~~~~~~~~~~~~~~~~
+
+Use the ``COUNTER`` option for batched counter updates. Unlike other
+updates in Cassandra, counter updates are not idempotent.
[29/34] cassandra git commit: Finish fixing the CQL doc

Reply via email to