Author: tylerhobbs
Date: Wed Feb 18 18:05:44 2015
New Revision: 1660693
URL: http://svn.apache.org/r1660693
Log:
Add CQL docs for 2.1, make them the default
Added:
cassandra/site/publish/doc/cql3/CQL-2.1.html
Modified:
cassandra/site/publish/doc/cql3/CQL.html
Added: cassandra/site/publish/doc/cql3/CQL-2.1.html
URL:
http://svn.apache.org/viewvc/cassandra/site/publish/doc/cql3/CQL-2.1.html?rev=1660693&view=auto
==============================================================================
--- cassandra/site/publish/doc/cql3/CQL-2.1.html (added)
+++ cassandra/site/publish/doc/cql3/CQL-2.1.html Wed Feb 18 18:05:44 2015
@@ -0,0 +1,399 @@
+<?xml version='1.0' encoding='utf-8' ?><!DOCTYPE html PUBLIC "-//W3C//DTD
XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html
xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type"
content="text/html; charset=utf-8"/><title>CQL</title></head><body><p><link
rel="StyleSheet" href="CQL.css" type="text/css" media="screen"></p><h1
id="CassandraQueryLanguageCQLv3.2.0">Cassandra Query Language (CQL)
v3.2.0</h1><span id="tableOfContents"><ol style="list-style: none;"><li><a
href="CQL.html#CassandraQueryLanguageCQLv3.2.0">Cassandra Query Language (CQL)
v3.2.0</a><ol style="list-style: none;"><li><a href="CQL.html#CQLSyntax">CQL
Syntax</a><ol style="list-style: none;"><li><a
href="CQL.html#Preamble">Preamble</a></li><li><a
href="CQL.html#Conventions">Conventions</a></li><li><a
href="CQL.html#identifiers">Identifiers and keywords</a></li><li><a
href="CQL.html#constants">Constants</a></li><li><a
href="CQL.html#Comments">Comments</a></li>
<li><a href="CQL.html#statements">Statements</a></li><li><a
href="CQL.html#preparedStatement">Prepared Statement</a></li></ol></li><li><a
href="CQL.html#dataDefinition">Data Definition</a><ol style="list-style:
none;"><li><a href="CQL.html#createKeyspaceStmt">CREATE KEYSPACE</a></li><li><a
href="CQL.html#useStmt">USE</a></li><li><a
href="CQL.html#alterKeyspaceStmt">ALTER KEYSPACE</a></li><li><a
href="CQL.html#dropKeyspaceStmt">DROP KEYSPACE</a></li><li><a
href="CQL.html#createTableStmt">CREATE TABLE</a></li><li><a
href="CQL.html#alterTableStmt">ALTER TABLE</a></li><li><a
href="CQL.html#dropTableStmt">DROP TABLE</a></li><li><a
href="CQL.html#truncateStmt">TRUNCATE</a></li><li><a
href="CQL.html#createIndexStmt">CREATE INDEX</a></li><li><a
href="CQL.html#dropIndexStmt">DROP INDEX</a></li><li><a
href="CQL.html#createTypeStmt">CREATE TYPE</a></li><li><a
href="CQL.html#alterTypeStmt">ALTER TYPE</a></li><li><a
href="CQL.html#dropTypeStmt">DROP TYPE</a></li><li><a
href="CQL.html#createTriggerStmt">CREATE TRIGGER</a></li><li><a href="CQL.html#dropTriggerStmt">DROP
TRIGGER</a></li></ol></li><li><a href="CQL.html#dataManipulation">Data
Manipulation</a><ol style="list-style: none;"><li><a
href="CQL.html#insertStmt">INSERT</a></li><li><a
href="CQL.html#updateStmt">UPDATE</a></li><li><a
href="CQL.html#deleteStmt">DELETE</a></li><li><a
href="CQL.html#batchStmt">BATCH</a></li></ol></li><li><a
href="CQL.html#queries">Queries</a><ol style="list-style: none;"><li><a
href="CQL.html#selectStmt">SELECT</a></li></ol></li><li><a
href="CQL.html#types">Data Types</a><ol style="list-style: none;"><li><a
href="CQL.html#usingdates">Working with dates</a></li><li><a
href="CQL.html#counters">Counters</a></li><li><a
href="CQL.html#collections">Working with collections</a></li></ol></li><li><a
href="CQL.html#functions">Functions</a><ol style="list-style: none;"><li><a
href="CQL.html#tokenFun">Token</a></li><li><a
href="CQL.html#uuidFun">Uuid</a></li><li><a href="CQL.html#timeuuidFun">Timeuuid
functions</a></li><li><a href="CQL.html#blobFun">Blob conversion
functions</a></li></ol></li><li><a href="CQL.html#appendixA">Appendix A: CQL
Keywords</a></li><li><a href="CQL.html#appendixB">Appendix B: CQL Reserved
Types</a></li><li><a href="CQL.html#changes">Changes</a><ol style="list-style:
none;"><li><a href="CQL.html#a3.2.0">3.2.0</a></li><li><a
href="CQL.html#a3.1.7">3.1.7</a></li><li><a
href="CQL.html#a3.1.6">3.1.6</a></li><li><a
href="CQL.html#a3.1.5">3.1.5</a></li><li><a
href="CQL.html#a3.1.4">3.1.4</a></li><li><a
href="CQL.html#a3.1.3">3.1.3</a></li><li><a
href="CQL.html#a3.1.2">3.1.2</a></li><li><a
href="CQL.html#a3.1.1">3.1.1</a></li><li><a
href="CQL.html#a3.1.0">3.1.0</a></li><li><a
href="CQL.html#a3.0.5">3.0.5</a></li><li><a
href="CQL.html#a3.0.4">3.0.4</a></li><li><a
href="CQL.html#a3.0.3">3.0.3</a></li><li><a
href="CQL.html#a3.0.2">3.0.2</a></li><li><a
href="CQL.html#a3.0.1">3.0.1</a></li></ol></li><li><a
href="CQL.html#Versioning">Versioning</a></li></ol></li></ol>
</span><h2 id="CQLSyntax">CQL Syntax</h2><h3
id="Preamble">Preamble</h3><p>This document describes the Cassandra Query
Language (CQL) version 3. CQL v3 is not backward compatible with CQL v2 and
differs from it in numerous ways. Note that this document describes the latest
version of the language. The <a href="#changes">changes</a> section
lists the differences between versions of CQL v3.</p><p>CQL v3 offers
a model very close to SQL in the sense that data is put in <em>tables</em>
containing <em>rows</em> of <em>columns</em>. For that reason, when used in
this document, these terms (tables, rows and columns) have the same definition
as they have in SQL. Note, however, that they do
<strong>not</strong> refer to the concept of rows and columns found in the
internal implementation of Cassandra and in the Thrift and CQL v2 APIs.</p><h3
id="Conventions">Conventions</h3><p>To aid in specifying the CQL syntax, we
will use the following conventions in this
document:</p><ul><li>Language rules will be given in a <a
href="http://en.wikipedia.org/wiki/Backus%E2%80%93Naur_Form">BNF</a> -like
notation:</li></ul><pre class="syntax"><pre><start> ::= TERMINAL
<non-terminal1> <non-terminal1>
+</pre></pre><ul><li>Nonterminal symbols will have <code><angle
brackets></code>.</li><li>As additional shortcut notations to BNF, we’ll
use traditional regular expression’s symbols (<code>?</code>,
<code>+</code> and <code>*</code>) to signify that a given symbol is optional
and/or can be repeated. We’ll also allow parentheses to group symbols and
the <code>[<characters>]</code> notation to represent any one of
<code><characters></code>.</li><li>The grammar is provided for documentation
purposes and leaves some minor details out. For instance, the comma after the last column
definition in a <code>CREATE TABLE</code> statement is optional but supported
if present, even though the grammar provided in this document suggests it is not
supported.</li><li>Sample code will be provided in a code block:</li></ul><pre
class="sample"><pre>SELECT sample_usage FROM cql;
+</pre></pre><ul><li>References to keywords or pieces of CQL code in running
text will be shown in a <code>fixed-width font</code>.</li></ul><h3
id="identifiers">Identifiers and keywords</h3><p>The CQL language uses
<em>identifiers</em> (or <em>names</em>) to identify tables, columns and other
objects. An identifier is a token matching the regular expression
<code>[a-zA-Z][a-zA-Z0-9_]*</code>.</p><p>A number of such
identifiers, like <code>SELECT</code> or <code>WITH</code>, are
<em>keywords</em>. They have a fixed meaning for the language and most are
reserved. The list of those keywords can be found in <a
href="#appendixA">Appendix A</a>.</p><p>Identifiers and (unquoted) keywords are
case insensitive. Thus <code>SELECT</code> is the same as <code>select</code>
or <code>sElEcT</code>, and <code>myId</code> is the same as
<code>myid</code> or <code>MYID</code>, for instance. A convention often used
(in particular by the samples of this documentation) is to use
upper case for keywords and lower case for other identifiers.</p><p>There is a
second kind of identifier, called <em>quoted identifiers</em>, defined by
enclosing an arbitrary sequence of characters in double quotes (<code>"</code>).
Quoted identifiers are never keywords. Thus <code>"select"</code> is not a
reserved keyword and can be used to refer to a column, while
<code>select</code> would raise a parse error. Also, contrary to unquoted
identifiers and keywords, quoted identifiers are case sensitive (<code>"My
Quoted Id"</code> is <em>different</em> from <code>"my quoted id"</code>). A
fully lowercase quoted identifier that matches
<code>[a-zA-Z][a-zA-Z0-9_]*</code> is equivalent to the unquoted
identifier obtained by removing the double quotes (so <code>"myid"</code> is
equivalent to <code>myid</code> and to <code>myId</code> but different from
<code>"myId"</code>). Inside a quoted identifier, the double-quote character
can be repeated to escape it, so <code>"foo "" bar"</code> is a valid identifier.</p><h3
id="constants">Constants</h3><p>CQL defines the following kinds of
<em>constants</em>: strings, integers, floats, booleans, uuids and
blobs:</p><ul><li>A string constant is an arbitrary sequence of
characters enclosed in single quotes (<code>'</code>). One can include a
single quote in a string by repeating it, e.g. <code>'It''s raining
today'</code>. Those are not to be confused with quoted identifiers, which use
double quotes.</li><li>An integer constant is defined by
<code>'-'?[0-9]+</code>.</li><li>A float constant is defined by
<code>'-'?[0-9]+('.'[0-9]*)?([eE][+-]?[0-9]+)?</code>. On top of that,
<code>NaN</code> and <code>Infinity</code> are also float constants.</li><li>A
boolean constant is either <code>true</code> or <code>false</code>, up to
case insensitivity (i.e. <code>True</code> is a valid boolean
constant).</li><li>A <a
href="http://en.wikipedia.org/wiki/Universally_unique_identifier">UUID</a>
constant is defined by
<code>hex{8}-hex{4}-hex{4}-hex{4}-hex{12}</code> where <code>hex</code> is
a hexadecimal character, i.e. <code>[0-9a-fA-F]</code>, and <code>{4}</code> is
the number of such characters.</li><li>A blob constant is a hexadecimal number
defined by <code>0[xX](hex)+</code> where <code>hex</code> is a hexadecimal
character, i.e. <code>[0-9a-fA-F]</code>.</li></ul><p>For how these constants
are typed, see the <a href="#types">data types section</a>.</p><h3
id="Comments">Comments</h3><p>A comment in CQL is a line beginning with either
double dashes (<code>--</code>) or a double slash
(<code>//</code>).</p><p>Multi-line comments are also supported through
enclosure within <code>/*</code> and <code>*/</code> (but nesting is not
supported).</p><pre class="sample"><pre>-- This is a comment
+// This is a comment too
+/* This is
+ a multi-line comment */
+</pre></pre><h3 id="statements">Statements</h3><p>CQL consists of statements.
As in SQL, these statements can be divided into 3 categories:</p><ul><li>Data
definition statements, which set and change the way data is
stored.</li><li>Data manipulation statements, which change
data.</li><li>Queries, which look up data.</li></ul><p>All statements end with a
semicolon (<code>;</code>) but that semicolon can be omitted when dealing with
a single statement. The supported statements are described in the following
sections. When describing the grammar of said statements, we will reuse the
non-terminal symbols defined below:</p><pre class="syntax"><pre><identifier>
::= any quoted or unquoted identifier, excluding reserved keywords
+ <tablename> ::= (<identifier> '.')? <identifier>
+
+ <string> ::= a string constant
+ <integer> ::= an integer constant
+ <float> ::= a float constant
+ <number> ::= <integer> | <float>
+ <uuid> ::= a uuid constant
+ <boolean> ::= a boolean constant
+ <hex> ::= a blob constant
+
+ <constant> ::= <string>
+ | <number>
+ | <uuid>
+ | <boolean>
+ | <hex>
+ <variable> ::= '?'
+ | ':' <identifier>
+ <term> ::= <constant>
+ | <collection-literal>
+ | <variable>
+ | <function> '(' (<term> (',' <term>)*)? ')'
+
+ <collection-literal> ::= <map-literal>
+ | <set-literal>
+ | <list-literal>
+ <map-literal> ::= '{' ( <term> ':' <term> ( ',' <term> ':' <term> )* )? '}'
+ <set-literal> ::= '{' ( <term> ( ',' <term> )* )? '}'
+ <list-literal> ::= '[' ( <term> ( ',' <term> )* )? ']'
+
+ <function> ::= <identifier>
+
+ <properties> ::= <property> (AND <property>)*
+ <property> ::= <identifier> '=' ( <identifier> | <constant> | <map-literal> )
+</pre></pre><p><br/>Please note that not every possible production of the
grammar above will be valid in practice. Most notably,
<code><variable></code> and nested <code><collection-literal></code> are
currently not allowed inside <code><collection-literal></code>.</p><p>A
<code><variable></code> can be either anonymous (a question mark
(<code>?</code>)) or named (an identifier preceded by <code>:</code>). Both
declare a bind variable for <a href="#preparedStatement">prepared
statements</a>. The only difference between an anonymous and a named variable is
that a named one will be easier to refer to (how exactly depends on the client
driver used).</p><p>The <code><properties></code> production is used by
statements that create and alter keyspaces and tables. Each
<code><property></code> is either a <em>simple</em> one, in which case it
just has a value, or a <em>map</em> one, in which case its value is a
map grouping sub-options. The following will refer to one
or the other as the <em>kind</em> (<em>simple</em> or <em>map</em>) of the
property.</p><p>A <code><tablename></code> will be used to identify a table.
This is an identifier representing the table name that can be preceded by a
keyspace name. The keyspace name, if provided, allows identifying a table in
a keyspace other than the currently active one (the currently active keyspace
is set through the <a href="#useStmt"><tt>USE</tt></a> statement).</p><p>For
supported <code><function></code>, see the section on <a
href="#functions">functions</a>.</p><h3 id="preparedStatement">Prepared
Statement</h3><p>CQL supports <em>prepared statements</em>. A prepared statement
is an optimization that allows a query to be parsed only once but executed
multiple times with different concrete values.</p><p>In a statement, each time
a column value is expected (in the data manipulation and query statements), a
<code><variable></code> (see above) can be used instead. A statement with
bind variables
must then be <em>prepared</em>. Once it has been prepared, it can be executed by
providing concrete values for the bind variables. The exact procedure to
prepare a statement and execute a prepared statement depends on the CQL driver
used and is beyond the scope of this document.</p><h2 id="dataDefinition">Data
Definition</h2><h3 id="createKeyspaceStmt">CREATE
KEYSPACE</h3><p><i>Syntax:</i></p><pre
class="syntax"><pre><create-keyspace-stmt> ::= CREATE KEYSPACE (IF NOT
EXISTS)? <identifier> WITH <properties>
+</pre></pre><p><br/><i>Sample:</i></p><pre class="sample"><pre>CREATE KEYSPACE
Excelsior
+ WITH replication = {'class': 'SimpleStrategy', 'replication_factor' : 3};
+
+CREATE KEYSPACE Excalibur
+ WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1' : 1, 'DC2' : 3}
+ AND durable_writes = false;
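+
+-- Illustrative addition, reusing the keyspace name from the first sample:
+-- with IF NOT EXISTS, re-running the statement is a no-op rather than an
+-- error when the keyspace already exists.
+CREATE KEYSPACE IF NOT EXISTS Excelsior
+ WITH replication = {'class': 'SimpleStrategy', 'replication_factor' : 3};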
+</pre></pre><p><br/>The <code>CREATE KEYSPACE</code> statement creates a new
top-level <em>keyspace</em>. A keyspace is a namespace that defines a
replication strategy and some options for a set of tables. Valid keyspace
names are identifiers composed exclusively of alphanumerical characters and
whose length is less than or equal to 32. Note that as identifiers, keyspace names
are case insensitive: use a quoted identifier for case sensitive keyspace
names.</p><p>The supported <code><properties></code> for <code>CREATE
KEYSPACE</code> are:</p><table><tr><th>name </th><th>kind
</th><th>mandatory </th><th>default
</th><th>description</th></tr><tr><td><code>replication</code>
</td><td><em>map</em> </td><td>yes </td><td> </td><td>The
replication strategy and options to use for the keyspace.
</td></tr><tr><td><code>durable_writes</code> </td><td><em>simple</em>
</td><td>no </td><td>true </td><td>Whether to use the commit log
for updates on this keyspace (disable this option at your own risk!).
</td></tr></table><p>The <code>replication</code> <code><property></code> is
mandatory. It must at least contain the <code>'class'</code> sub-option, which
defines the replication strategy class to use. The rest of the sub-options
depend on that replication strategy class. By default, Cassandra supports the
following <code>'class'</code> values:</p><ul><li><code>'SimpleStrategy'</code>: A
simple strategy that defines a single replication factor for the whole cluster.
The only sub-option supported is <code>'replication_factor'</code>, which defines
that replication factor and is
mandatory.</li><li><code>'NetworkTopologyStrategy'</code>: A replication
strategy that allows setting the replication factor independently for each
data center. The rest of the sub-options are key-value pairs where each
key is the name of a data center and the value is the replication factor for
that data center.</li><li><code>'OldNetworkTopologyStrategy'</code>:
A legacy replication strategy. You should avoid this strategy for new
keyspaces and prefer
<code>'NetworkTopologyStrategy'</code>.</li></ul><p>Attempting to create an
already existing keyspace will return an error unless the <code>IF NOT
EXISTS</code> option is used. If it is used, the statement will be a no-op if
the keyspace already exists.</p><h3
id="useStmt">USE</h3><p><i>Syntax:</i></p><pre
class="syntax"><pre><use-stmt> ::= USE <identifier>
+</pre></pre><p><i>Sample:</i></p><pre class="sample"><pre>USE myApp;
+</pre></pre><p>The <code>USE</code> statement takes an existing keyspace name
as argument and sets it as the per-connection current working keyspace. All
subsequent keyspace-specific actions will be performed in the context of the
selected keyspace, unless <a href="#statements">otherwise specified</a>, until
another <code>USE</code> statement is issued or the connection terminates.</p><h3
id="alterKeyspaceStmt">ALTER KEYSPACE</h3><p><i>Syntax:</i></p><pre
class="syntax"><pre><alter-keyspace-stmt> ::= ALTER KEYSPACE
<identifier> WITH <properties>
+</pre></pre><p><br/><i>Sample:</i></p><pre class="sample"><pre>ALTER KEYSPACE
Excelsior
+ WITH replication = {'class': 'SimpleStrategy', 'replication_factor' : 4};
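+
+-- Illustrative addition: the other CREATE KEYSPACE properties can be altered
+-- the same way; here durable_writes is set back to its documented default.
+ALTER KEYSPACE Excelsior
+ WITH durable_writes = true;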
+
+</pre></pre><p><br/>The <code>ALTER KEYSPACE</code> statement alters the
properties of an existing keyspace. The supported <code><properties></code>
are the same as for the <a href="#createKeyspaceStmt"><code>CREATE
KEYSPACE</code></a> statement.</p><h3 id="dropKeyspaceStmt">DROP
KEYSPACE</h3><p><i>Syntax:</i></p><pre
class="syntax"><pre><drop-keyspace-stmt> ::= DROP KEYSPACE ( IF EXISTS )?
<identifier>
+</pre></pre><p><i>Sample:</i></p><pre class="sample"><pre>DROP KEYSPACE myApp;
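+
+-- Illustrative addition: with IF EXISTS, dropping a keyspace that does not
+-- exist is a no-op instead of an error.
+DROP KEYSPACE IF EXISTS myApp;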
+</pre></pre><p>A <code>DROP KEYSPACE</code> statement results in the
immediate, irreversible removal of an existing keyspace, including all column
families in it, and all data contained in those column families.</p><p>If the
keyspace does not exist, the statement will return an error, unless <code>IF
EXISTS</code> is used, in which case the operation is a no-op.</p><h3
id="createTableStmt">CREATE TABLE</h3><p><i>Syntax:</i></p><pre
class="syntax"><pre><create-table-stmt> ::= CREATE ( TABLE | COLUMNFAMILY ) ( IF NOT EXISTS )? <tablename>
+ '(' <column-definition> ( ',' <column-definition> )* ')'
+ ( WITH <option> ( AND <option>)* )?
+
+<column-definition> ::= <identifier> <type> ( STATIC )? ( PRIMARY KEY )?
+ | PRIMARY KEY '(' <partition-key> ( ',' <identifier> )* ')'
+
+<partition-key> ::= <identifier>
+ | '(' <identifier> (',' <identifier> )* ')'
+
+<option> ::= <property>
+ | COMPACT STORAGE
+ | CLUSTERING ORDER
+</pre></pre><p><br/><i>Sample:</i></p><pre class="sample"><pre>CREATE TABLE
monkeySpecies (
+ species text PRIMARY KEY,
+ common_name text,
+ population varint,
+ average_size int
+) WITH comment='Important biological records'
+ AND read_repair_chance = 1.0;
+
+CREATE TABLE timeline (
+ userid uuid,
+ posted_month int,
+ posted_time uuid,
+ body text,
+ posted_by text,
+ PRIMARY KEY (userid, posted_month, posted_time)
+) WITH compaction = { 'class' : 'LeveledCompactionStrategy' };
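+
+-- Illustrative addition (table name is hypothetical): a composite partition
+-- key (userid, posted_month) declared with an extra set of parentheses, with
+-- posted_time as the clustering column, stored in descending order on disk.
+CREATE TABLE IF NOT EXISTS timeline_by_month (
+ userid uuid,
+ posted_month int,
+ posted_time timeuuid,
+ body text,
+ PRIMARY KEY ((userid, posted_month), posted_time)
+) WITH CLUSTERING ORDER BY (posted_time DESC);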
+</pre></pre><p><br/>The <code>CREATE TABLE</code> statement creates a new
table. Each such table is a set of <em>rows</em> (usually representing related
entities) for which it defines a number of properties. A table is defined by a
<a href="#createTableName">name</a>; it defines the <a
href="#createTableColumn"><em>columns</em></a> composing rows of the table and
has a number of <a href="#createTableOptions">options</a>. Note that the
<code>CREATE COLUMNFAMILY</code> syntax is supported as an alias for
<code>CREATE TABLE</code> (for historical reasons).</p><p>Attempting to create
an already existing table will return an error unless the <code>IF NOT
EXISTS</code> option is used. If it is used, the statement will be a no-op if
the table already exists.</p><h4
id="createTableName"><code><tablename></code></h4><p>Valid table names are
the same as valid <a href="#createKeyspaceStmt">keyspace names</a> (alphanumerical
identifiers up to 32 characters long). If the table name is provided
alone, the table is created within the current keyspace (see <a
href="#useStmt"><tt>USE</tt></a>), but if it is prefixed by an existing
keyspace name (see <a href="#statements"><code><tablename></code></a>
grammar), it is created in the specified keyspace (but does
<strong>not</strong> change the current keyspace).</p><h4
id="createTableColumn"><code><column-definition></code></h4><p>A
<code>CREATE TABLE</code> statement defines the columns that rows of the table
can have. A <em>column</em> is defined by its name (an identifier) and its type
(see the <a href="#types">data types</a> section for more details on allowed
types and their properties).</p><p>Within a table, a row is uniquely identified
by its <code>PRIMARY KEY</code> (or more simply the key), and hence all table
definitions <strong>must</strong> define a PRIMARY KEY (and only one). A
<code>PRIMARY KEY</code> is composed of one or more of the columns defined in
the table. If the <code>PRIMARY KEY</code> is only one
column, this can be specified directly after the column definition.
Otherwise, it must be specified by following <code>PRIMARY KEY</code> with the
comma-separated list of column names composing the key within parentheses. Note
that:</p><pre class="sample"><pre>CREATE TABLE t (
+ k int PRIMARY KEY,
+ other text
+)
+</pre></pre><p>is equivalent to</p><pre class="sample"><pre>CREATE TABLE t (
+ k int,
+ other text,
+ PRIMARY KEY (k)
+)
+</pre></pre><h4 id="createTablepartitionClustering">Partition key and
clustering columns</h4><p>In CQL, the order in which columns are defined for
the <code>PRIMARY KEY</code> matters. The first column of the key is called the
<i>partition key</i>. It has the property that all the rows sharing the same
partition key (even across tables, in fact) are stored on the same physical node.
Also, insertions/updates/deletions on rows sharing the same partition key for a
given table are performed <i>atomically</i> and in <i>isolation</i>. Note that
it is possible to have a composite partition key, i.e. a partition key formed
of multiple columns, using an extra set of parentheses to define which columns
form the partition key.</p><p>The remaining columns of the <code>PRIMARY
KEY</code> definition, if any, are called <i>clustering columns</i>. On a given
physical node, rows for a given partition key are stored in the order induced
by the clustering columns, making the retrieval of rows in that clustering
order particularly efficient (see <a
href="#selectStmt"><tt>SELECT</tt></a>).</p><h4
id="createTableStatic"><code>STATIC</code> columns</h4><p>Some columns can be
declared as <code>STATIC</code> in a table definition. A column that is static
will be “shared” by all the rows belonging to the same partition
(having the same partition key). For instance, in:</p><pre
class="sample"><pre>CREATE TABLE test (
+ pk int,
+ t int,
+ v text,
+ s text static,
+ PRIMARY KEY (pk, t)
+);
+INSERT INTO test(pk, t, v, s) VALUES (0, 0, 'val0', 'static0');
+INSERT INTO test(pk, t, v, s) VALUES (0, 1, 'val1', 'static1');
+SELECT * FROM test WHERE pk=0 AND t=0;
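+-- Illustrative addition: a row in another partition (pk=1) carries its own
+-- static value, so this insert leaves s unchanged for partition pk=0.
+INSERT INTO test(pk, t, v, s) VALUES (1, 0, 'val2', 'static2');
+SELECT * FROM test WHERE pk=0 AND t=0;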
+</pre></pre><p>the last query will return <code>'static1'</code> as value for
<code>s</code>, since <code>s</code> is static and thus the 2nd insertion
modified this “shared” value. Note however that static columns are
only static within a given partition, and if in the example above both rows
were from different partitions (i.e. if they had different values for
<code>pk</code>), then the 2nd insertion would not have modified the value of
<code>s</code> for the first row.</p><p>A few restrictions apply to when
static columns are allowed:</p><ul><li>tables with the <code>COMPACT
STORAGE</code> option (see below) cannot have them</li><li>a table without
clustering columns cannot have static columns (in a table without clustering
columns, every partition has only one row, and so every column is inherently
static)</li><li>only non <code>PRIMARY KEY</code> columns can be
static</li></ul><h4 id="createTableOptions"><code><option></code></h4><p>The
<code>CREATE TABLE</code>
statement supports a number of options that control the configuration of a
new table. These options can be specified after the <code>WITH</code>
keyword.</p><p>The first of these options is <code>COMPACT STORAGE</code>. This
option is mainly targeted towards backward compatibility for definitions
created before CQL3 (see <a
href="http://www.datastax.com/dev/blog/thrift-to-cql3">www.datastax.com/dev/blog/thrift-to-cql3</a>
for more details). The option also provides a slightly more compact layout of
data on disk but at the price of diminished flexibility and extensibility for
the table. Most notably, <code>COMPACT STORAGE</code> tables cannot have
collections or static columns, and a <code>COMPACT STORAGE</code> table with at
least one clustering column supports exactly one (as in neither 0 nor more than 1)
column not part of the <code>PRIMARY KEY</code> definition (which implies in
particular that you cannot add or remove columns after creation). For those
reasons, <code>COMPACT STORAGE</code>
is not recommended outside of the backward compatibility reason
evoked above.</p><p>Another option is <code>CLUSTERING ORDER</code>. It
defines the ordering of rows on disk. It takes the list of the clustering
column names with, for each of them, the on-disk order (ascending or
descending). Note that this option affects <a href="#selectOrderBy">what
<code>ORDER BY</code> are allowed during <code>SELECT</code></a>.</p><p>Table
creation supports the following other
<code><property></code>:</p><table><tr><th>option
</th><th>kind </th><th>default
</th><th>description</th></tr><tr><td><code>comment</code>
</td><td><em>simple</em> </td><td>none </td><td>A free-form,
human-readable comment.</td></tr><tr><td><code>read_repair_chance</code>
</td><td><em>simple</em> </td><td>0.1 </td><td>The probability with
which to query extra nodes (i.e. more nodes than required by the consistency
level) for the purpose
of read repairs.</td></tr><tr><td><code>dclocal_read_repair_chance</code>
</td><td><em>simple</em> </td><td>0 </td><td>The probability with
which to query extra nodes (i.e. more nodes than required by the consistency
level) belonging to the same data center as the read coordinator for the
purpose of read repairs.</td></tr><tr><td><code>gc_grace_seconds</code>
</td><td><em>simple</em> </td><td>864000 </td><td>Time to wait before
garbage collecting tombstones (deletion
markers).</td></tr><tr><td><code>bloom_filter_fp_chance</code>
</td><td><em>simple</em> </td><td>0.00075 </td><td>The target probability
of false positives of the sstable bloom filters. Said bloom filters will be
sized to provide the requested probability (thus lowering this value impacts the
size of bloom filters in memory and
on disk).</td></tr><tr><td><code>compaction</code>
</td><td><em>map</em> </td><td><em>see below</em> </td><td>The compaction
options to use, see
below.</td></tr><tr><td><code>compression</code>
</td><td><em>map</em> </td><td><em>see below</em> </td><td>Compression
options, see below. </td></tr><tr><td><code>caching</code>
</td><td><em>simple</em> </td><td>keys_only </td><td>Whether to cache keys
(“key cache”) and/or rows (“row cache”) for this table.
Valid values are: <code>all</code>, <code>keys_only</code>,
<code>rows_only</code> and <code>none</code>.
</td></tr><tr><td><code>default_time_to_live</code>
</td><td><em>simple</em> </td><td>0 </td><td>The default expiration
time (“TTL”) in seconds for a table.</td></tr></table><h4
id="compactionOptions"><code>compaction</code> options</h4><p>The
<code>compaction</code> property must at least define the <code>'class'</code>
sub-option, which defines the compaction strategy class to use. The default
supported classes are <code>'SizeTieredCompactionStrategy'</code> and
<code>'LeveledCompactionStrategy'</code>.
A custom strategy can be provided by specifying the full
class name as a <a href="#constants">string constant</a>. The rest of the
sub-options depend on the chosen class. The sub-options supported by the
default classes are:</p><table><tr><th>option
</th><th>supported compaction strategy </th><th>default </th><th>description
</th></tr><tr><td><code>enabled</code>
</td><td><em>all</em> </td><td>true </td><td>A
boolean denoting whether compaction should be enabled or
not.</td></tr><tr><td><code>tombstone_threshold</code>
</td><td><em>all</em> </td><td>0.2 </td><td>A
ratio such that if an sstable has more than this ratio of gcable tombstones over
all contained columns, the sstable will be compacted (with no other sstables)
for the purpose of purging those tombstones.
</td></tr><tr><td><code>tombstone_compaction_interval</code>
</td><td><em>all</em>
</td><td>1 day </td><td>The minimum time to
wait after an sstable creation time before considering it for “tombstone
compaction”, where “tombstone compaction” is the compaction
triggered if the sstable has more gcable tombstones than
<code>tombstone_threshold</code>.
</td></tr><tr><td><code>unchecked_tombstone_compaction</code>
</td><td><em>all</em> </td><td>false
</td><td>Setting this to true enables more aggressive tombstone compactions
– single sstable tombstone compactions will run without checking how
likely it is that they will be successful.
</td></tr><tr><td><code>min_sstable_size</code>
</td><td>SizeTieredCompactionStrategy </td><td>50MB </td><td>The size
tiered strategy groups SSTables to compact in buckets. A bucket groups SSTables
that differ by less than 50% in size. However, for small sizes, this would
result in a bucketing that is too fine grained.
<code>min_sstable_size</code> defines a size threshold (in bytes) below which all
SSTables belong to one unique
bucket.</td></tr><tr><td><code>min_threshold</code>
</td><td>SizeTieredCompactionStrategy </td><td>4 </td><td>Minimum
number of SSTables needed to start a minor
compaction.</td></tr><tr><td><code>max_threshold</code>
</td><td>SizeTieredCompactionStrategy </td><td>32 </td><td>Maximum
number of SSTables processed by one minor
compaction.</td></tr><tr><td><code>bucket_low</code>
</td><td>SizeTieredCompactionStrategy </td><td>0.5 </td><td>The size
tiered strategy considers sstables to be within the same bucket if their size is within
[average_size * <code>bucket_low</code>, average_size *
<code>bucket_high</code>] (i.e. the default groups sstables whose sizes
diverge by at most 50%).</td></tr><tr><td><code>bucket_high</code>
</td><td>SizeTieredCompactionStrategy </td><td>1.5
</td><td>The size
tiered strategy considers sstables to be within the same bucket if their size is
within [average_size * <code>bucket_low</code>, average_size *
<code>bucket_high</code>] (i.e. the default groups sstables whose sizes diverge
by at most 50%).</td></tr><tr><td><code>sstable_size_in_mb</code>
</td><td>LeveledCompactionStrategy </td><td>5MB </td><td>The target
size (in MB) for sstables in the leveled strategy. Note that while sstable
sizes should stay less than or equal to <code>sstable_size_in_mb</code>, it is
possible to exceptionally have a larger sstable, as during compaction data for
a given partition key is never split into 2 sstables.</td></tr></table><p>For
the <code>compression</code> property, the following default sub-options are
available:</p><table><tr><th>option </th><th>default
</th><th>description </th></tr><tr><td><code>sstable_compression</code>
</td><td>LZ4Compressor </td><td>The compression algorithm to use. The default
compressors are: LZ4Compressor, SnappyCompressor and DeflateCompressor. Use an
empty string (<code>''</code>) to disable compression. A custom compressor can
be provided by
specifying the full class name as a <a href="#constants">string
constant</a>.</td></tr><tr><td><code>chunk_length_kb</code> </td><td>64KB
</td><td>On disk SSTables are compressed by block (to allow random
reads). This defines the size (in KB) of said block. Bigger values may improve
the compression rate, but increase the minimum size of data to be read from
disk for a read.</td></tr><tr><td><code>crc_check_chance</code> </td><td>1.0
</td><td>When compression is enabled, each compressed block
includes a checksum of that block for the purpose of detecting disk bitrot and
avoiding the propagation of corruption to other replicas. This option defines
the probability with which those checksums are checked during read. By default
they are always checked. Set to 0 to disable checksum checking and to 0.5, for
instance, to check them every other read.</td></tr></table><h4
id="Otherconsiderations">Other considerations:</h4><ul><li>When <a
href="#insertStmt">inserting</a>/<a href="#updateStmt">updating</a> a given row,
not all columns need to be defined (except for those part of the key), and
missing columns occupy no space on disk. Furthermore, adding new columns (see
<a href="#alterTableStmt"><tt>ALTER TABLE</tt></a>) is a constant time operation.
There is thus no need to try to anticipate future usage (or to cry when you
haven’t) when creating a table.</li></ul><h3 id="alterTableStmt">ALTER
TABLE</h3><p><i>Syntax:</i></p><pre class="syntax"><pre><alter-table-stmt>
::= ALTER (TABLE | COLUMNFAMILY) <tablename> <instruction>
+
+<instruction> ::= ALTER <identifier> TYPE <type>
+ | ADD <identifier> <type>
+ | DROP <identifier>
+ | WITH <option> ( AND <option> )*
+</pre></pre><p><br/><i>Sample:</i></p><pre class="sample"><pre>ALTER TABLE
addamsFamily
+ALTER lastKnownLocation TYPE uuid;
+
+ALTER TABLE addamsFamily
+ADD gravesite varchar;
+
+ALTER TABLE addamsFamily
+WITH comment = 'A most excellent and useful column family'
+ AND read_repair_chance = 0.2;
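+
+// Illustrative sketch, not part of the original samples: the column dropped
+// and the sub-option values below are assumptions. Setting compaction
+// replaces all previous compaction sub-options, so every sub-option to
+// keep is listed again.
+ALTER TABLE addamsFamily
+DROP gravesite;
+
+ALTER TABLE addamsFamily
+WITH compaction = { 'class' : 'SizeTieredCompactionStrategy',
+ 'min_threshold' : 6, 'max_threshold' : 32 };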
+</pre></pre><p><br/>The <code>ALTER</code> statement is used to manipulate
table definitions. It allows for adding new columns, dropping existing ones,
changing the type of existing columns, or updating the table options. As with
table creation, <code>ALTER COLUMNFAMILY</code> is allowed as an alias for
<code>ALTER TABLE</code>.</p><p>The <code><tablename></code> is the table
name optionally preceded by the keyspace name. The
<code><instruction></code> defines the alteration to
perform:</p><ul><li><code>ALTER</code>: Update the type of a given defined
column. Note that the type of the <a
href="#createTablepartitionClustering">clustering columns</a> cannot be
modified as it induces the on-disk ordering of rows. Columns on which a <a
href="#createIndexStmt">secondary index</a> is defined have the same
restriction. Other columns are free from those restrictions (no validation of
existing data is performed), but it is usually a bad idea to change the type to
a non-compatible one,
unless no data have been inserted for that column yet, as this could confuse
CQL drivers/tools.</li><li><code>ADD</code>: Adds a new column to the table.
The <code><identifier></code> for the new column must not conflict with an
existing column. Moreover, columns cannot be added to tables defined with the
<code>COMPACT STORAGE</code> option.</li><li><code>DROP</code>: Removes a
column from the table. Dropped columns will immediately become unavailable in
the queries and will not be included in compacted sstables in the future. If a
column is re-added, queries won’t return values written before the column
was last dropped. It is assumed that timestamps represent actual time, so if
this is not the case for you, you should NOT re-add previously dropped columns. Columns
can’t be dropped from tables defined with the <code>COMPACT
STORAGE</code> option.</li><li><code>WITH</code>: Allows updating the options
of the table. The <a href="#createTableOptions">supported
<code><option></code></a> (and syntax) are the same as for the <code>CREATE TABLE</code>
statement except that <code>COMPACT STORAGE</code> is not supported. Note
that setting any <code>compaction</code> sub-options has the effect of
erasing all previous <code>compaction</code> options, so you need to
re-specify all the sub-options if you want to keep them. The same note
applies to the set of <code>compression</code> sub-options.</li></ul><h3
id="dropTableStmt">DROP TABLE</h3><p><i>Syntax:</i></p><pre
class="syntax"><pre><drop-table-stmt> ::= DROP TABLE ( IF EXISTS )?
<tablename>
+</pre></pre><p><i>Sample:</i></p><pre class="sample"><pre>DROP TABLE
worldSeriesAttendees;
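+
+// Illustrative addition: with IF EXISTS, dropping a missing table is a
+// no-op instead of an error.
+DROP TABLE IF EXISTS worldSeriesAttendees;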
+</pre></pre><p>The <code>DROP TABLE</code> statement results in the immediate,
irreversible removal of a table, including all data contained in it. As for
table creation, <code>DROP COLUMNFAMILY</code> is allowed as an alias for
<code>DROP TABLE</code>.</p><p>If the table does not exist, the statement will
return an error, unless <code>IF EXISTS</code> is used in which case the
operation is a no-op.</p><h3
id="truncateStmt">TRUNCATE</h3><p><i>Syntax:</i></p><pre
class="syntax"><pre><truncate-stmt> ::= TRUNCATE <tablename>
+</pre></pre><p><i>Sample:</i></p><pre class="sample"><pre>TRUNCATE
superImportantData;
+</pre></pre><p>The <code>TRUNCATE</code> statement permanently removes all
data from a table.</p><h3 id="createIndexStmt">CREATE
INDEX</h3><p><i>Syntax:</i></p><pre class="syntax"><pre><create-index-stmt>
::= CREATE ( CUSTOM )? INDEX ( IF NOT EXISTS )? ( <indexname> )?
+ ON <tablename> '(' <index-identifier> ')'
+ ( USING <string> ( WITH OPTIONS =
<map-literal> )? )?
+
+<index-identifier> ::= <identifier>
+ | keys( <identifier> )
+</pre></pre><p><br/><i>Sample:</i></p><pre class="sample"><pre>CREATE INDEX
userIndex ON NerdMovies (user);
+CREATE INDEX ON Mutants (abilityId);
+CREATE INDEX ON users (keys(favs));
+CREATE CUSTOM INDEX ON users (email) USING 'path.to.the.IndexClass';
+CREATE CUSTOM INDEX ON users (email) USING 'path.to.the.IndexClass' WITH
OPTIONS = {'storage': '/mnt/ssd/indexes/'};
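+
+// Illustrative additions (not from the original samples). The first
+// statement is a no-op if userIndex already exists. The query assumes the
+// keys(favs) index above and that favs is a map with text keys.
+CREATE INDEX IF NOT EXISTS userIndex ON NerdMovies (user);
+SELECT * FROM users WHERE favs CONTAINS KEY 'movie';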
+</pre></pre><p>The <code>CREATE INDEX</code> statement is used to create a new
(automatic) secondary index for a given (existing) column in a given table. A
name for the index itself can be specified before the <code>ON</code> keyword,
if desired. If data already exists for the column, it will be indexed
asynchronously. After the index is created, new data for the column is indexed
automatically at insertion time.</p><p>Attempting to create an already existing
index will return an error unless the <code>IF NOT EXISTS</code> option is
used. If it is used, the statement will be a no-op if the index already
exists.</p><h4 id="keysIndex">Indexes on Map Keys</h4><p>When creating an index
on a <a href="#map">map column</a>, you may index either the keys or the
values. If the column identifier is placed within the <code>keys()</code>
function, the index will be on the map keys, allowing you to use <code>CONTAINS
KEY</code> in <code>WHERE</code> clauses. Otherwise, the index will be on the
map values.</p><h3 id="dropIndexStmt">DROP
INDEX</h3><p><i>Syntax:</i></p><pre class="syntax"><pre><drop-index-stmt>
::= DROP INDEX ( IF EXISTS )? ( <keyspace> '.' )? <identifier>
+</pre></pre><p><i>Sample:</i></p><pre class="sample"><pre>DROP INDEX userIndex;
+
+DROP INDEX userkeyspace.address_index;
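+
+// Illustrative addition: a no-op rather than an error if the index is
+// missing.
+DROP INDEX IF EXISTS userIndex;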
+</pre></pre><p><br/>The <code>DROP INDEX</code> statement is used to drop an
existing secondary index. The argument of the statement is the index name,
which may optionally specify the keyspace of the index.</p><p>If the index does
not exist, the statement will return an error, unless <code>IF EXISTS</code>
is used in which case the operation is a no-op.</p><h3
id="createTypeStmt">CREATE TYPE</h3><p><i>Syntax:</i></p><pre
class="syntax"><pre><create-type-stmt> ::= CREATE TYPE ( IF NOT EXISTS )?
<typename>
+ '(' <field-definition> ( ',' <field-definition>
)* ')'
+
+<typename> ::= ( <keyspace-name> '.' )? <identifier>
+
+<field-definition> ::= <identifier> <type>
+
+</pre></pre><p><br/><i>Sample:</i></p><pre class="sample"><pre>CREATE TYPE
address (
+ street_name text,
+ street_number int,
+ city text,
+ state text,
+ zip int
+)
+
+CREATE TYPE work_and_home_addresses (
+ home_address address,
+ work_address address
+)
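+
+// Illustrative sketch: the keyspace name userkeyspace is an assumption.
+// The type is created in that keyspace rather than the current one, and
+// IF NOT EXISTS makes the statement a no-op if the type already exists.
+CREATE TYPE IF NOT EXISTS userkeyspace.address (
+ street_name text,
+ street_number int,
+ city text,
+ state text,
+ zip int
+)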
+</pre></pre><p><br/>The <code>CREATE TYPE</code> statement creates a new
user-defined type. Each type is a set of named, typed fields. Field types may
be any valid type, including collections and other existing user-defined
types.</p><p>Attempting to create an already existing type will result in an
error unless the <code>IF NOT EXISTS</code> option is used. If it is used, the
statement will be a no-op if the type already exists.</p><h4
id="createTypeName"><code><typename></code></h4><p>Valid type names are
identifiers. The names of existing CQL types and <a href="#appendixB">reserved
type names</a> may not be used.</p><p>If the type name is provided alone, the
type is created with the current keyspace (see <a
href="#useStmt"><tt>USE</tt></a>). If it is prefixed by an existing keyspace
name, the type is created within the specified keyspace instead of the current
keyspace.</p><h3 id="alterTypeStmt">ALTER TYPE</h3><p><i>Syntax:</i></p><pre
class="syntax"><pre><alter-type-stmt> ::= ALTER TYPE <typename> <instruction>
+
+<instruction> ::= ALTER <field-name> TYPE <type>
+ | ADD <field-name> <type>
+ | RENAME <field-name> TO <field-name> ( AND
<field-name> TO <field-name> )*
+</pre></pre><p><br/><i>Sample:</i></p><pre class="sample"><pre>ALTER TYPE
address ALTER zip TYPE varint
+
+ALTER TYPE address ADD country text
+
+ALTER TYPE address RENAME zip TO zipcode AND street_name TO street
+</pre></pre><p><br/>The <code>ALTER TYPE</code> statement is used to
manipulate type definitions. It allows for adding new fields, renaming existing
fields, or changing the type of existing fields.</p><p>When altering the type
of a field, the new type must be compatible with the previous type.</p><h3
id="dropTypeStmt">DROP TYPE</h3><p><i>Syntax:</i></p><pre
class="syntax"><pre><drop-type-stmt> ::= DROP TYPE ( IF EXISTS )?
<typename>
+</pre></pre><p><br/>The <code>DROP TYPE</code> statement results in the
immediate, irreversible removal of a type. Attempting to drop a type that is
still in use by another type or a table will result in an error.</p><p>If the
type does not exist, an error will be returned unless <code>IF EXISTS</code> is
used, in which case the operation is a no-op.</p><h3
id="createTriggerStmt">CREATE TRIGGER</h3><p><i>Syntax:</i></p><pre
class="syntax"><pre><create-trigger-stmt> ::= CREATE TRIGGER ( IF NOT EXISTS
)? ( <triggername> )?
+ ON <tablename>
+ USING <string>
+
+</pre></pre><p><br/><i>Sample:</i></p><pre class="sample"><pre>CREATE TRIGGER
myTrigger ON myTable USING 'org.apache.cassandra.triggers.InvertedIndex';
+</pre></pre><p>The actual logic that makes up the trigger can be written in
any Java (JVM) language and exists outside the database. You place the trigger
code in a <code>lib/triggers</code> subdirectory of the Cassandra installation
directory; it is loaded during cluster startup and must exist on every node that
participates in the cluster. The trigger defined on a table fires before a
requested DML statement occurs, which ensures the atomicity of the
transaction.</p><h3 id="dropTriggerStmt">DROP
TRIGGER</h3><p><i>Syntax:</i></p><pre
class="syntax"><pre><drop-trigger-stmt> ::= DROP TRIGGER ( IF EXISTS )? (
<triggername> )?
+ ON <tablename>
+
+</pre></pre><p><br/><i>Sample:</i></p><pre class="sample"><pre>DROP TRIGGER
myTrigger ON myTable;
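+
+// Illustrative addition: per the syntax above, IF EXISTS may be given.
+DROP TRIGGER IF EXISTS myTrigger ON myTable;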
+</pre></pre><p>The <code>DROP TRIGGER</code> statement removes the registration of
a trigger created using <code>CREATE TRIGGER</code>.</p><h2
id="dataManipulation">Data Manipulation</h2><h3
id="insertStmt">INSERT</h3><p><i>Syntax:</i></p><pre
class="syntax"><pre><insert-stmt> ::= INSERT INTO <tablename>
+ '(' <identifier> ( ',' <identifier> )* ')'
+ VALUES '(' <term-or-literal> ( ','
<term-or-literal> )* ')'
+ ( IF NOT EXISTS )?
+ ( USING <option> ( AND <option> )* )?
+
+<term-or-literal> ::= <term>
+ | <collection-literal>
+
+<option> ::= TIMESTAMP <integer>
+ | TTL <integer>
+</pre></pre><p><br/><i>Sample:</i></p><pre class="sample"><pre>INSERT INTO
NerdMovies (movie, director, main_actor, year)
+ VALUES ('Serenity', 'Joss Whedon', 'Nathan Fillion', 2005)
+USING TTL 86400;
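+
+// Illustrative additions (the timestamp value is an arbitrary assumption).
+// Insert only if no row with this primary key exists yet; this goes
+// through Paxos, so use it sparingly.
+INSERT INTO NerdMovies (movie, director, main_actor, year)
+ VALUES ('Serenity', 'Joss Whedon', 'Nathan Fillion', 2005)
+IF NOT EXISTS;
+
+// Supply both a TTL (in seconds) and a write timestamp (in microseconds).
+INSERT INTO NerdMovies (movie, director, main_actor, year)
+ VALUES ('Serenity', 'Joss Whedon', 'Nathan Fillion', 2005)
+USING TTL 86400 AND TIMESTAMP 1424282744000000;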
+</pre></pre><p>The <code>INSERT</code> statement writes one or more columns
for a given row in a table. Note that since a row is identified by its
<code>PRIMARY KEY</code>, at least the columns composing it must be
specified.</p><p>Note that unlike in SQL, <code>INSERT</code> does not check
the prior existence of the row by default: the row is created if none existed
before, and updated otherwise. Furthermore, there is no means of knowing whether
a creation or an update happened.</p><p>It is however possible to use the <code>IF
NOT EXISTS</code> condition to only insert if the row does not exist prior to
the insertion. But please note that using <code>IF NOT EXISTS</code> will incur
a non-negligible performance cost (internally, Paxos will be used) so this
should be used sparingly.</p><p>All updates for an <code>INSERT</code> are
applied atomically and in isolation.</p><p>Please refer to the <a
href="#updateOptions"><code>UPDATE</code></a> section for information on the
<code><option></code> available and to the <a
href="#collections">collections</a> section for
use of <code><collection-literal></code>. Also note that <code>INSERT</code>
does not support counters, while <code>UPDATE</code> does.</p><h3
id="updateStmt">UPDATE</h3><p><i>Syntax:</i></p><pre
class="syntax"><pre><update-stmt> ::= UPDATE <tablename>
+ ( USING <option> ( AND <option> )* )?
+ SET <assignment> ( ',' <assignment> )*
+ WHERE <where-clause>
+ ( IF <condition> ( AND <condition> )* )?
+
+<assignment> ::= <identifier> '=' <term>
+ | <identifier> '=' <identifier> ('+' | '-')
(<int-term> | <set-literal> | <list-literal>)
+ | <identifier> '=' <identifier> '+' <map-literal>
+ | <identifier> '[' <term> ']' '=' <term>
+
+<condition> ::= <identifier> '=' <term>
+ | <identifier> '[' <term> ']' '=' <term>
+
+<where-clause> ::= <relation> ( AND <relation> )*
+
+<relation> ::= <identifier> '=' <term>
+ | <identifier> IN '(' ( <term> ( ',' <term> )* )? ')'
+ | <identifier> IN '?'
+
+<option> ::= TIMESTAMP <integer>
+ | TTL <integer>
+</pre></pre><p><br/><i>Sample:</i></p><pre class="sample"><pre>UPDATE
NerdMovies USING TTL 400
+SET director = 'Joss Whedon',
+ main_actor = 'Nathan Fillion',
+ year = 2005
+WHERE movie = 'Serenity';
+
+UPDATE UserActions SET total = total + 2 WHERE user =
B70DE1D0-9908-4AE3-BE34-5573E5B09F14 AND action = 'click';
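+
+// Illustrative addition: a conditional update, applied only if the
+// current director matches (uses Paxos internally, so use sparingly).
+UPDATE NerdMovies SET year = 2005
+WHERE movie = 'Serenity'
+IF director = 'Joss Whedon';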
+</pre></pre><p><br/>The <code>UPDATE</code> statement writes one or more
columns for a given row in a table. The <code><where-clause></code> is used
to select the row to update and must include all columns composing the
<code>PRIMARY KEY</code> (the <code>IN</code> relation is only supported for
the last column of the partition key). Other column values are specified
through <code><assignment></code> after the <code>SET</code>
keyword.</p><p>Note that unlike in SQL, <code>UPDATE</code> does not check the
prior existence of the row by default: the row is created if none existed
before, and updated otherwise. Furthermore, there is no means of knowing whether
a creation or an update happened.</p><p>It is however possible to use conditions
on some columns through <code>IF</code>, in which case the row will not be
updated unless those conditions are met. But please note that using
<code>IF</code> conditions will incur a non-negligible performance cost
(internally, Paxos will be used) so
this should be used sparingly.</p><p>In an <code>UPDATE</code> statement, all
updates within the same partition key are applied atomically and in
isolation.</p><p>The <code>c = c + 3</code> form of
<code><assignment></code> is used to increment/decrement counters. The
identifier after the ‘=’ sign <strong>must</strong> be the same
as the one before the ‘=’ sign (only increment/decrement is
supported on counters, not the assignment of a specific value).</p><p>The
<code>id = id + <collection-literal></code> and <code>id[value1] =
value2</code> forms of <code><assignment></code> are for collections. Please
refer to the <a href="#collections">relevant section</a> for more
details.</p><h4 id="updateOptions"><code><options></code></h4><p>The
<code>UPDATE</code> and <code>INSERT</code> statements support the
following options:</p><ul><li><code>TIMESTAMP</code>: sets
the timestamp for the operation. If not specified, the coordinator
will use the current time (in microseconds) at the start of statement
execution as the timestamp. This is usually a suitable
default.</li><li><code>TTL</code>: specifies an optional Time To Live
(in seconds) for the inserted values. If set, the inserted values are
automatically removed from the database after the specified time. Note that the
TTL concerns the inserted values, not the columns themselves. This means that
any subsequent update of the column will also reset the TTL (to whatever TTL is
specified in that update). By default, values never expire. A TTL of 0 or a
negative one is equivalent to no TTL.</li></ul><h3
id="deleteStmt">DELETE</h3><p><i>Syntax:</i></p><pre
class="syntax"><pre><delete-stmt> ::= DELETE ( <selection> ( ','
<selection> )* )?
+ FROM <tablename>
+ ( USING TIMESTAMP <integer>)?
+ WHERE <where-clause>
+ ( IF ( EXISTS | ( <condition> ( AND <condition> )*) )
)?
+
+<selection> ::= <identifier> ( '[' <term> ']' )?
+
+<where-clause> ::= <relation> ( AND <relation> )*
+
+<relation> ::= <identifier> '=' <term>
+ | <identifier> IN '(' ( <term> ( ',' <term> )* )? ')'
+ | <identifier> IN '?'
+
+<condition> ::= <identifier> '=' <term>
+ | <identifier> '[' <term> ']' '=' <term>
+</pre></pre><p><br/><i>Sample:</i></p><pre class="sample"><pre>DELETE FROM
NerdMovies USING TIMESTAMP 1240003134 WHERE movie = 'Serenity';
+
+DELETE phone FROM Users WHERE userid IN (C73DE1D3-AF08-40F3-B124-3FF3E5109F22,
B70DE1D0-9908-4AE3-BE34-5573E5B09F14);
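+
+// Illustrative addition: delete the row only if it exists (a conditional
+// delete, which uses Paxos internally).
+DELETE FROM NerdMovies WHERE movie = 'Serenity' IF EXISTS;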
+</pre></pre><p><br/>The <code>DELETE</code> statement deletes columns and
rows. If column names are provided directly after the <code>DELETE</code>
keyword, only those columns are deleted from the row indicated by the
<code><where-clause></code> (the <code>id[value]</code> syntax in
<code><selection></code> is for collections; please refer to the <a
href="#collections">collection section</a> for more details). Otherwise whole
rows are removed. The <code><where-clause></code> specifies the key
for the row(s) to delete (the <code>IN</code> relation is only supported for
the last column of the partition key).</p><p><code>DELETE</code> supports the
<code>TIMESTAMP</code> option with the same semantics as in the <a
href="#updateStmt"><code>UPDATE</code></a> statement.</p><p>In a
<code>DELETE</code> statement, all deletions within the same partition key are
applied atomically and in isolation.</p><p>A <code>DELETE</code> operation
can be made conditional using <code>IF</code>, as for <code>UPDATE</code> and <code>INSERT</code>. But please
note that, as for the latter, this will incur a non-negligible performance cost
(internally, Paxos will be used) and so should be used sparingly.</p><h3
id="batchStmt">BATCH</h3><p><i>Syntax:</i></p><pre
class="syntax"><pre><batch-stmt> ::= BEGIN ( UNLOGGED | COUNTER )? BATCH
+ ( USING <option> ( AND <option> )* )?
+ <modification-stmt> ( ';' <modification-stmt> )*
+ APPLY BATCH
+
+<modification-stmt> ::= <insert-stmt>
+ | <update-stmt>
+ | <delete-stmt>
+
+<option> ::= TIMESTAMP <integer>
+</pre></pre><p><br/><i>Sample:</i></p><pre class="sample"><pre>BEGIN BATCH
+ INSERT INTO users (userid, password, name) VALUES ('user2', 'ch@ngem3b',
'second user');
+ UPDATE users SET password = 'ps22dhds' WHERE userid = 'user3';
+ INSERT INTO users (userid, password) VALUES ('user4', 'ch@ngem3c');
+ DELETE name FROM users WHERE userid = 'user1';
+APPLY BATCH;
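+
+// Illustrative additions, reusing tables from the samples in this document.
+// UNLOGGED skips the batch log: operations are then only atomic within a
+// single partition.
+BEGIN UNLOGGED BATCH
+ INSERT INTO users (userid, password) VALUES ('user5', 'ch@ngem3d');
+ DELETE name FROM users WHERE userid = 'user2';
+APPLY BATCH;
+
+// Counter updates are batched with the COUNTER option.
+BEGIN COUNTER BATCH
+ UPDATE UserActions SET total = total + 2
+ WHERE user = B70DE1D0-9908-4AE3-BE34-5573E5B09F14 AND action = 'click';
+ UPDATE UserActions SET total = total + 1
+ WHERE user = B70DE1D0-9908-4AE3-BE34-5573E5B09F14 AND action = 'view';
+APPLY BATCH;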
+</pre></pre><p>The <code>BATCH</code> statement groups multiple modification
statements (insertions/updates and deletions) into a single statement. It
serves several purposes:</p><ol><li>It saves network round-trips between the
client and the server (and sometimes between the server coordinator and the
replicas) when batching multiple updates.</li><li>All updates in a
<code>BATCH</code> belonging to a given partition key are performed in
isolation.</li><li>By default, all operations in the batch are performed
atomically. See the notes on <a
href="#unloggedBatch"><code>UNLOGGED</code></a> for more
details.</li></ol><p>Note that:</p><ul><li><code>BATCH</code> statements may
only contain <code>UPDATE</code>, <code>INSERT</code> and <code>DELETE</code>
statements.</li><li>Batches are <em>not</em> a full analogue for SQL
transactions.</li><li>If a timestamp is not specified for each operation, then
all operations will be applied with the same timestamp. Due to
Cassandra’s conflict
resolution procedure in the case of <a
href="http://wiki.apache.org/cassandra/FAQ#clocktie">timestamp ties</a>,
operations may be applied in an order that is different from the order they are
listed in the <code>BATCH</code> statement. To force a particular operation
ordering, you must specify per-operation timestamps.</li></ul><h4
id="unloggedBatch"><code>UNLOGGED</code></h4><p>By default, Cassandra uses a
batch log to ensure all operations in a batch are applied atomically. (Note
that the operations are still only isolated within a single
partition.)</p><p>There is a performance penalty for batch atomicity when a
batch spans multiple partitions. If you do not want to incur this penalty, you
can tell Cassandra to skip the batchlog with the <code>UNLOGGED</code> option.
If the <code>UNLOGGED</code> option is used, operations are only atomic within
a single partition.</p><h4 id="counterBatch"><code>COUNTER</code></h4><p>Use
the <code>COUNTER</code> option for batched counter updates.
Unlike other updates in Cassandra, counter updates are not
idempotent.</p><h4
id="batchOptions"><code><option></code></h4><p><code>BATCH</code> supports
the <code>TIMESTAMP</code> option, with semantics similar to those
described in the <a href="#updateOptions"><code>UPDATE</code></a> statement
(the timestamp applies to all the statements inside the batch). However, if
used, <code>TIMESTAMP</code> <strong>must not</strong> be used in the
statements within the batch.</p><h2 id="queries">Queries</h2><h3
id="selectStmt">SELECT</h3><p><i>Syntax:</i></p><pre
class="syntax"><pre><select-stmt> ::= SELECT <select-clause>
+ FROM <tablename>
+ ( WHERE <where-clause> )?
+ ( ORDER BY <order-by> )?
+ ( LIMIT <integer> )?
+ ( ALLOW FILTERING )?
+
+<select-clause> ::= DISTINCT? <selection-list>
+ | COUNT '(' ( '*' | '1' ) ')' (AS <identifier>)?
+
+<selection-list> ::= <selector> (AS <identifier>)? ( ','
<selector> (AS <identifier>)? )*
+ | '*'
+
+<selector> ::= <identifier>
+ | WRITETIME '(' <identifier> ')'
+ | TTL '(' <identifier> ')'
+ | <function> '(' (<selector> (',' <selector>)*)? ')'
+
+<where-clause> ::= <relation> ( AND <relation> )*
+
+<relation> ::= <identifier> <op> <term>
+ | '(' <identifier> (',' <identifier>)* ')' <op>
<term-tuple>
+ | <identifier> IN '(' ( <term> ( ',' <term>)* )? ')'
+ | '(' <identifier> (',' <identifier>)* ')' IN '(' (
<term-tuple> ( ',' <term-tuple>)* )? ')'
+ | TOKEN '(' <identifier> ( ',' <identifier>)* ')' <op>
<term>
+
+<op> ::= '=' | '<' | '>' | '<=' | '>=' | CONTAINS | CONTAINS KEY
+<order-by> ::= <ordering> ( ',' <ordering> )*
+<ordering> ::= <identifier> ( ASC | DESC )?
+<term-tuple> ::= '(' <term> (',' <term>)* ')'
+</pre></pre><p><br/><i>Sample:</i></p><pre class="sample"><pre>SELECT name,
occupation FROM users WHERE userid IN (199, 200, 207);
+
+SELECT name AS user_name, occupation AS user_occupation FROM users;
+
+SELECT time, value
+FROM events
+WHERE event_type = 'myEvent'
+ AND time > '2011-02-03'
+ AND time <= '2012-01-01'
+
+SELECT COUNT(*) FROM users;
+
+SELECT COUNT(*) AS user_count FROM users;
+
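+// Illustrative additions (the userid value and the limit are assumptions).
+// WRITETIME and TTL select the write timestamp and the time to live of
+// a column.
+SELECT name, WRITETIME(occupation), TTL(occupation)
+FROM users
+WHERE userid = 199;
+
+// Assumes event_type is the partition key of events and time is its
+// clustering column, so that ORDER BY on time is allowed.
+SELECT time, value
+FROM events
+WHERE event_type = 'myEvent'
+ORDER BY time DESC
+LIMIT 10;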
+</pre></pre><p><br/>The <code>SELECT</code> statement reads one or more
columns for one or more rows in a table. It returns a result-set of rows, where
each row contains the collection of columns corresponding to the query.</p><h4
id="selectSelection"><code><select-clause></code></h4><p>The
<code><select-clause></code> determines which columns need to be queried
and returned in the result-set. It consists of either a comma-separated list
of <code><selector></code> or the wildcard character (<code>*</code>) to select all the
columns defined for the table.</p><p>A <code><selector></code> is either a
column name to retrieve, or a <code><function></code> of one or more
column names. The functions allowed are the same as for <code><term></code>
and are described in the <a href="#function">function section</a>. In addition
to these generic functions, the <code>WRITETIME</code> (resp. <code>TTL</code>)
function selects the timestamp of when the column was inserted (resp.
the time to live (in seconds) for the column (or null if the column has no
expiration set)).</p><p>Any <code><selector></code> can be aliased using the
<code>AS</code> keyword (see examples). Please note that the
<code><where-clause></code> and <code><order-by></code> clauses should
refer to the columns by their original names and not by their
aliases.</p><p>The <code>COUNT</code> keyword can be used with parentheses
enclosing <code>*</code>. If so, the query will return a single result: the
number of rows matching the query. Note that <code>COUNT(1)</code> is supported
as an alias.</p><h4 id="selectWhere"><code><where-clause></code></h4><p>The
<code><where-clause></code> specifies which rows must be queried. It is
composed of relations on the columns that are part of the <code>PRIMARY
KEY</code> and/or have a <a href="#createIndexStmt">secondary index</a> defined
on them.</p><p>Not all relations are allowed in a query. For instance,
non-equal relations (where <code>IN</code>
is considered as an equal relation) on a partition key are not supported (but
see the use of the <code>TOKEN</code> function below to do non-equal queries on
the partition key). Moreover, for a given partition key, the clustering columns
induce an ordering of rows and relations on them are restricted to those
that select a <strong>contiguous</strong> (for the ordering) set of
rows. For instance, given</p><pre class="sample"><pre>CREATE TABLE posts (
+ userid text,
+ blog_title text,
+ posted_at timestamp,
+ entry_title text,
+ content text,
+ category int,
+ PRIMARY KEY (userid, blog_title, posted_at)
+)
+</pre></pre><p>The following query is allowed:</p><pre
class="sample"><pre>SELECT entry_title, content FROM posts WHERE userid='john
doe' AND blog_title='John''s Blog' AND posted_at >= '2012-01-01' AND posted_at
< '2012-01-31'
+</pre></pre><p>But the following one is not, as it does not select a
contiguous set of rows (and we suppose no secondary indexes are set):</p><pre
class="sample"><pre>// Needs a blog_title to be set to select ranges of
posted_at
+SELECT entry_title, content FROM posts WHERE userid='john doe' AND posted_at
>= '2012-01-01' AND posted_at < '2012-01-31'
+</pre></pre><p>When specifying relations, the <code>TOKEN</code> function can
be used on the <code>PARTITION KEY</code> columns in the query. In that case, rows
will be selected based on the token of their <code>PARTITION KEY</code> rather
than on the value. Note that the token of a key depends on the partitioner in
use, and that in particular the RandomPartitioner won’t yield a
meaningful order. Also note that ordering partitioners always order token
values by bytes (so even if the partition key is of type int, <code>token(-1) >
token(0)</code> in particular). Example:</p><pre class="sample"><pre>SELECT *
FROM posts WHERE token(userid) > token('tom') AND token(userid) <
token('bob')
+</pre></pre><p>Moreover, the <code>IN</code> relation is only allowed on the
last column of the partition key and on the last column of the full primary
key.</p><p>It is also possible to “group” <code>CLUSTERING
COLUMNS</code> together in a relation using the tuple notation. For
instance:</p><pre class="sample"><pre>SELECT * FROM posts WHERE userid='john
doe' AND (blog_title, posted_at) > ('John''s Blog', '2012-01-01')
+</pre></pre><p>will request all rows that sort after the one having
“John's Blog” as <code>blog_title</code> and
‘2012-01-01’ for <code>posted_at</code> in the clustering order. In
particular, rows having a <code>posted_at <= '2012-01-01'</code> will be
returned as long as their <code>blog_title > 'John''s Blog'</code>, which
wouldn’t be the case for:</p><pre class="sample"><pre>SELECT * FROM posts
WHERE userid='john doe' AND blog_title > 'John''s Blog' AND posted_at >
'2012-01-01'
+</pre></pre><p>The tuple notation may also be used for <code>IN</code> clauses
on <code>CLUSTERING COLUMNS</code>:</p><pre class="sample"><pre>SELECT * FROM
posts WHERE userid='john doe' AND (blog_title, posted_at) IN (('John''s Blog',
'2012-01-01'), ('Extreme Chess', '2014-06-01'))
+</pre></pre><p>The <code>CONTAINS</code> operator may only be used on
collection columns (lists, sets, and maps). In the case of maps,
<code>CONTAINS</code> applies to the map values. The <code>CONTAINS KEY</code>
operator may only be used on map columns and applies to the map keys.</p><h4
id="selectOrderBy"><code><order-by></code></h4><p>The <code>ORDER BY</code>
option selects the order of the returned results. It takes as argument
a list of column names along with the order for the column (<code>ASC</code>
for ascending and <code>DESC</code> for descending, omitting the order being
equivalent to <code>ASC</code>). Currently the possible orderings are limited
(which depends on the table <a href="#createTableOptions"><code>CLUSTERING
ORDER</code></a>):</p><ul><li>if the table has been defined without any
specific <code>CLUSTERING ORDER</code>, then the allowed orderings are the
order induced by the clustering columns and the reverse of that
one.</li><li>otherwise, the
orderings allowed are the order of the <code>CLUSTERING ORDER</code> option
and the reversed one.</li></ul><h4
id="selectLimit"><code>LIMIT</code></h4><p>The <code>LIMIT</code> option to a
<code>SELECT</code> statement limits the number of rows returned by a
query.</p><h4 id="selectAllowFiltering"><code>ALLOW FILTERING</code></h4><p>By
default, CQL only allows select queries that don’t involve
“filtering” server side, i.e. queries where we know that all (live)
records read will be returned (maybe partly) in the result set. The reasoning is
that those “non filtering” queries have predictable performance in
the sense that they will execute in a time that is proportional to the amount
of data <strong>returned</strong> by the query (which can be controlled through
<code>LIMIT</code>).</p><p>The <code>ALLOW FILTERING</code> option
explicitly allows (some) queries that require filtering. Please note that a
query using <code>ALLOW FILTERING</code> may
thus have unpredictable performance (for the definition above), i.e. even a
query that selects a handful of records <strong>may</strong> exhibit
performance that depends on the total amount of data stored in the
cluster.</p><p>For instance, considering the following table holding user
profiles with their year of birth (with a secondary index on it) and country of
residence:</p><pre class="sample"><pre>CREATE TABLE users (
+ username text PRIMARY KEY,
+ firstname text,
+ lastname text,
+ birth_year int,
+ country text
+);
+
+CREATE INDEX ON users(birth_year);
+</pre></pre><p></p><p>Then the following queries are valid:</p><pre
class="sample"><pre>SELECT * FROM users;
+SELECT firstname, lastname FROM users WHERE birth_year = 1981;
+</pre></pre><p>because in both cases, Cassandra guarantees that the performance
of these queries will be proportional to the amount of data returned. In particular,
if no users were born in 1981, then the performance of the second query will not depend
on the number of user profiles stored in the database (not directly at least:
due to secondary index implementation considerations, this query may still
depend on the number of nodes in the cluster, which indirectly depends on the
amount of data stored. Nevertheless, the number of nodes will always be
several orders of magnitude lower than the number of user profiles stored). Of
course, both queries may return very large result sets in practice, but the amount
of data returned can always be controlled by adding a
<code>LIMIT</code>.</p><p>However, the following query will be
rejected:</p><pre class="sample"><pre>SELECT firstname, lastname FROM users
WHERE birth_year = 1981 AND country = 'FR';
+</pre></pre><p>because Cassandra cannot guarantee that it won’t have to
scan a large amount of data even if the result of the query is small.
Typically, it will scan all the index entries for users born in 1981 even if
only a handful are actually from France. However, if you “know what you
are doing”, you can force the execution of this query by using
<code>ALLOW FILTERING</code> and so the following query is valid:</p><pre
class="sample"><pre>SELECT firstname, lastname FROM users WHERE birth_year =
1981 AND country = 'FR' ALLOW FILTERING;
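+
+// A hedged sketch, not one of the original examples: LIMIT is written before
+// ALLOW FILTERING. It caps the number of rows returned, but Cassandra may
+// still scan many index entries for 1981 before it finds 10 users from France.
+SELECT firstname, lastname FROM users WHERE birth_year = 1981 AND country = 'FR' LIMIT 10 ALLOW FILTERING;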
+</pre></pre><h2 id="types">Data Types</h2><p>CQL supports a rich set of data
types for columns defined in a table, including collection types. On top of
those native and collection types, users can also provide custom types (through
a JAVA class extending <code>AbstractType</code> loadable by Cassandra). The
syntax of types is thus:</p><pre class="syntax"><pre><type> ::=
<native-type>
+ | <collection-type>
+ | <tuple-type>
+ | <string> // Used for custom types. The fully-qualified
name of a JAVA class
+
+<native-type> ::= ascii
+ | bigint
+ | blob
+ | boolean
+ | counter
+ | decimal
+ | double
+ | float
+ | inet
+ | int
+ | text
+ | timestamp
+ | timeuuid
+ | uuid
+ | varchar
+ | varint
+
+<collection-type> ::= list '<' <native-type> '>'
+ | set '<' <native-type> '>'
+ | map '<' <native-type> ',' <native-type> '>'
+<tuple-type> ::= tuple '<' <type> (',' <type>)* '>'
+</pre></pre><p>Note that the native types are keywords and as such are
case-insensitive. They are, however, not reserved keywords.</p><p>The following table
gives additional information on the native data types, and on which kind of <a
href="#constants">constants</a> each type supports:</p><table><tr><th>type
</th><th>constants
supported</th><th>description</th></tr><tr><td><code>ascii</code> </td><td>
strings </td><td>ASCII character
string</td></tr><tr><td><code>bigint</code> </td><td> integers
</td><td>64-bit signed long</td></tr><tr><td><code>blob</code> </td><td>
blobs </td><td>Arbitrary bytes (no
validation)</td></tr><tr><td><code>boolean</code> </td><td> booleans
</td><td>true or false</td></tr><tr><td><code>counter</code> </td><td>
integers </td><td>Counter column (64-bit signed value). See <a
href="#counters">Counters</a> for details</td></tr><tr><td><code>decimal</code>
</td><td> integers, floats </td>
<td>Variable-precision decimal</td></tr><tr><td><code>double</code>
</td><td> integers, floats </td><td>64-bit IEEE-754 floating
point</td></tr><tr><td><code>float</code> </td><td> integers, floats
</td><td>32-bit IEEE-754 floating point</td></tr><tr><td><code>inet</code>
</td><td> strings </td><td>An IP address. It can be either 4 bytes
long (IPv4) or 16 bytes long (IPv6). There is no <code>inet</code> constant; IP
addresses should be input as strings</td></tr><tr><td><code>int</code>
</td><td> integers </td><td>32-bit signed
int</td></tr><tr><td><code>text</code> </td><td> strings
</td><td>UTF8 encoded string</td></tr><tr><td><code>timestamp</code></td><td>
integers, strings </td><td>A timestamp. String constants allow timestamps to be
input as dates; see <a href="#usingdates">Working with dates</a> below for
more information.</td></tr><tr><td><code>timeuuid</code> </td><td> uuids
</td><td>Type 1 UUID.
This is generally used as a “conflict-free” timestamp. Also see the
<a href="#timeuuidFun">functions on
Timeuuid</a></td></tr><tr><td><code>uuid</code> </td><td> uuids
</td><td>Type 1 or type 4 UUID</td></tr><tr><td><code>varchar</code>
</td><td> strings </td><td>UTF8 encoded
string</td></tr><tr><td><code>varint</code> </td><td> integers
</td><td>Arbitrary-precision integer</td></tr></table><p>For more information
on how to use the collection types, see the <a href="#collections">Working with
collections</a> section below.</p><h3 id="usingdates">Working with
dates</h3><p>Values of the <code>timestamp</code> type are encoded as 64-bit
signed integers representing a number of milliseconds since the standard base
time known as “the epoch”: January 1 1970 at 00:00:00
GMT.</p><p>Timestamps can be input in CQL as simple long integers, giving the
number of milliseconds since the epoch, as defined above.</p><p>They can also
be input as string literals in any of the following ISO 8601 formats, each
representing the date and time February 3, 2011, at 04:05:00 AM,
GMT:</p><ul><li><code>2011-02-03 04:05+0000</code></li><li><code>2011-02-03
04:05:00+0000</code></li><li><code>2011-02-03
04:05:00.000+0000</code></li><li><code>2011-02-03T04:05+0000</code></li><li><code>2011-02-03T04:05:00+0000</code></li><li><code>2011-02-03T04:05:00.000+0000</code></li></ul><p>The
<code>+0000</code> above is an RFC 822 4-digit time zone specification;
<code>+0000</code> refers to GMT. US Pacific Standard Time is
<code>-0800</code>. The time zone may be omitted if desired; in that case the date
will be interpreted as being in the time zone under which the coordinating
Cassandra node is configured.</p><ul><li><code>2011-02-03
04:05</code></li><li><code>2011-02-03 04:05:00</code></li><li><code>2011-02-03
04:05:00.000</code></li><li><code>2011-02-03T04:05</code></li><li><code>2011-02-03T04:05:00</code></li><li><code>2011-02-03T04:05:00.000</code></li></ul>
<p>There are clear difficulties inherent in relying on the time
zone configuration being as expected, though, so it is recommended that the
time zone always be specified for timestamps when feasible.</p><p>The time of
day may also be omitted, if the date is the only piece that
matters:</p><ul><li><code>2011-02-03</code></li><li><code>2011-02-03+0000</code></li></ul><p>In
that case, the time of day will default to 00:00:00, in the specified or
default time zone.</p><h3 id="counters">Counters</h3><p>The
<code>counter</code> type is used to define <em>counter columns</em>. A counter
column is a column whose value is a 64-bit signed integer and on which 2
operations are supported: incrementation and decrementation (see <a
href="#updateStmt"><code>UPDATE</code></a> for syntax). Note the value of a
counter cannot be set. A counter doesn’t exist until first
incremented/decremented, and the first incrementation/decrementation is made as
if the previous value was 0. Deletion
of counter columns is supported but has some limitations (see the <a
href="http://wiki.apache.org/cassandra/Counters">Cassandra Wiki</a> for more
information).</p><p>The use of the counter type is limited in the following
way:</p><ul><li>It cannot be used for a column that is part of the <code>PRIMARY
KEY</code> of a table.</li><li>A table that contains a counter can only contain
counters. In other words, either all the columns of a table outside the
<code>PRIMARY KEY</code> have the counter type, or none of them have
it.</li></ul><h3 id="collections">Working with collections</h3><h4
id="Noteworthycharacteristics">Noteworthy characteristics</h4><p>Collections
are meant for storing/denormalizing relatively small amounts of data. They work
well for things like “the phone numbers of a given user”,
“labels applied to an email”, etc. But when items are expected to
grow unbounded (“all the messages sent by a given user”,
“events registered by a
sensor”, ...), then collections are not appropriate anymore and a
specific table (with clustering columns) should be used. Concretely,
collections have the following limitations:</p><ul><li>Collections are always
read in their entirety (and reading one is not paged
internally).</li><li>Collections cannot have more than 65535 elements. More
precisely, while it may be possible to insert more than 65535 elements, it is
not possible to read more than the first 65535 elements (see <a
href="https://issues.apache.org/jira/browse/CASSANDRA-5428">CASSANDRA-5428</a>
for details).</li><li>While insertion operations on sets and maps never incur a
read-before-write internally, some operations on lists do (see the section on
lists below for details). It is thus advised to prefer sets over lists when
possible.</li></ul><p>Please note that while some of those limitations may or
may not be loosened in the future, the general rule that collections are for
denormalizing small amounts of data is
meant to stay.</p><h4 id="map">Maps</h4><p>A <code>map</code> is a <a
href="#types">typed</a> set of key-value pairs, where keys are unique.
Furthermore, note that maps are internally sorted by their keys and will
thus always be returned in that order. To create a column of type
<code>map</code>, use the <code>map</code> keyword suffixed with
comma-separated key and value types, enclosed in angle brackets. For
example:</p><pre class="sample"><pre>CREATE TABLE users (
+ id text PRIMARY KEY,
+ given text,
+ surname text,
+ favs map<text, text> // A map of text keys, and text values
+)
+</pre></pre><p>Writing <code>map</code> data is accomplished with a
JSON-inspired syntax. To write a record using <code>INSERT</code>, specify the
entire map as a JSON-style associative array. <em>Note: This form will always
replace the entire map.</em></p><pre class="sample"><pre>// Inserting (or
Updating)
+INSERT INTO users (id, given, surname, favs)
+ VALUES ('jsmith', 'John', 'Smith', { 'fruit' : 'apple', 'band' :
'Beatles' })
+</pre></pre><p>Adding or updating key-values of a (potentially) existing map
can be accomplished either by subscripting the map column in an
<code>UPDATE</code> statement or by adding a new map literal:</p><pre
class="sample"><pre>// Updating (or inserting)
+UPDATE users SET favs['author'] = 'Ed Poe' WHERE id = 'jsmith'
+UPDATE users SET favs = favs + { 'movie' : 'Casablanca' } WHERE id = 'jsmith'
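+
+// Sketch (an added illustration): reading the map back. Since maps are sorted
+// by key, the entries written above should come back in the order
+// author, band, fruit, movie.
+SELECT favs FROM users WHERE id = 'jsmith'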
+</pre></pre><p>Note that TTLs are allowed for both <code>INSERT</code> and
<code>UPDATE</code>, but in both cases the TTL set only applies to the newly
inserted/updated <em>values</em>. In other words,</p><pre
class="sample"><pre>// Updating (or inserting)
+UPDATE users USING TTL 10 SET favs['color'] = 'green' WHERE id = 'jsmith'
+</pre></pre><p>will only apply the TTL to the <code>{ 'color' : 'green'
}</code> record, the rest of the map remaining unaffected.</p><p>Deleting a map
record is done with:</p><pre class="sample"><pre>DELETE favs['author'] FROM
users WHERE id = 'jsmith'
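+
+// Sketch (an added illustration): without a subscript, DELETE removes the
+// whole map for that row rather than a single key.
+DELETE favs FROM users WHERE id = 'jsmith'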
+</pre></pre><h4 id="set">Sets</h4><p>A <code>set</code> is a <a
href="#types">typed</a> collection of unique values. Sets are ordered by their
values. To create a column of type <code>set</code>, use the <code>set</code>
keyword suffixed with the value type enclosed in angle brackets. For
example:</p><pre class="sample"><pre>CREATE TABLE images (
+ name text PRIMARY KEY,
+ owner text,
+ date timestamp,
+ tags set<text>
+);
+</pre></pre><p>Writing a <code>set</code> is accomplished by comma separating
the set values, and enclosing them in curly braces. <em>Note: An
<code>INSERT</code> will always replace the entire set.</em></p><pre
class="sample"><pre>INSERT INTO images (name, owner, date, tags)
+ VALUES ('cat.jpg', 'jsmith', 'now', { 'kitten', 'cat', 'pet' });
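+
+// Sketch ('dog.jpg' is a hypothetical row): a timestamp column such as date
+// also accepts an ISO 8601 string with an explicit time zone, in one of the
+// formats listed under "Working with dates".
+INSERT INTO images (name, owner, date, tags)
+  VALUES ('dog.jpg', 'jsmith', '2011-02-03 04:05+0000', { 'dog', 'pet' });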
+</pre></pre><p>Adding and removing values of a set can be accomplished with an
<code>UPDATE</code> by adding/removing new set values to an existing
<code>set</code> column.</p><pre class="sample"><pre>UPDATE images SET tags =
tags + { 'cute', 'cuddly' } WHERE name = 'cat.jpg';
+UPDATE images SET tags = tags - { 'lame' } WHERE name = 'cat.jpg';
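+
+// Sketch (an added illustration): sets are returned sorted by value, so after
+// the statements above tags should read back as cat, cuddly, cute, kitten, pet.
+SELECT tags FROM images WHERE name = 'cat.jpg';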
+</pre></pre><p>As with <a href="#map">maps</a>, TTLs if used only apply to the
newly inserted/updated <em>values</em>.</p><h4 id="list">Lists</h4><p>A
<code>list</code> is a <a href="#types">typed</a> collection of non-unique
values where elements are ordered by their position in the list. To create a
column of type <code>list</code>, use the <code>list</code> keyword suffixed
with the value type enclosed in angle brackets. For example:</p><pre
class="sample"><pre>CREATE TABLE plays (
+ id text PRIMARY KEY,
+ game text,
+ players int,
+ scores list<int>
+)
+</pre></pre><p>Do note that as explained below, lists have some limitations
and performance considerations to take into account, and it is advised to
prefer <a href="#set">sets</a> over lists when this is possible.</p><p>Writing
<code>list</code> data is accomplished with a JSON-style syntax. To write a
record using <code>INSERT</code>, specify the entire list as a JSON array.
<em>Note: An <code>INSERT</code> will always replace the entire
list.</em></p><pre class="sample"><pre>INSERT INTO plays (id, game, players,
scores)
+ VALUES ('123-afde', 'quake', 3, [17, 4, 2]);
+</pre></pre><p>Adding (appending or prepending) values to a list can be
accomplished by adding a new JSON-style array to an existing <code>list</code>
column.</p><pre class="sample"><pre>UPDATE plays SET players = 5, scores =
scores + [ 14, 21 ] WHERE id = '123-afde';
+UPDATE plays SET players = 5, scores = [ 12 ] + scores WHERE id = '123-afde';
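+
+// Sketch (an added illustration): starting from [17, 4, 2], the append and
+// prepend above should leave scores as [12, 17, 4, 2, 14, 21], provided each
+// statement is applied exactly once.
+SELECT scores FROM plays WHERE id = '123-afde';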
+</pre></pre><p>It should be noted that append and prepend are not idempotent
operations. This means that if an append or a prepend operation
times out, it is not always safe to retry it (as this could result in
the record being appended or prepended twice).</p><p>Lists also provide the
following operations: setting an element by its position in the list, removing
an element by its position in the list, and removing all occurrences of a given
value in the list. <em>However, and contrary to all the other collection
operations, these three operations induce an internal read before the update,
and will thus typically have slower performance characteristics</em>. Those
operations have the following syntax:</p><pre class="sample"><pre>UPDATE plays
SET scores[1] = 7 WHERE id = '123-afde'; // sets the 2nd element
of scores to 7 (raises an error if scores has fewer than 2 elements)
+DELETE scores[1] FROM plays WHERE id = '123-afde'; //
deletes the 2nd element of scores (raises an error if scores has fewer than 2
elements)
+UPDATE plays SET scores = scores - [ 12, 21 ] WHERE id = '123-afde'; //
removes all occurrences of 12 and 21 from scores
+</pre></pre><p>As with <a href="#map">maps</a>, TTLs if used only apply to the
newly inserted/updated <em>values</em>.</p><h2
id="functions">Functions</h2><p>CQL3 supports a few functions (more to come).
Currently, it only supports functions on values (functions that transform one or
more column values into a new value) and, in particular, aggregation functions
are not supported. The functions supported are described below:</p><h3
id="tokenFun">Token</h3><p>The <code>token</code> function computes
the token for a given partition key. The exact signature of the token function
depends on the table concerned and on the partitioner used by the
cluster.</p><p>The types of the arguments of <code>token</code> depend on
the types of the partition key columns. The return type depends on the
partitioner in use:</p><ul><li>For Murmur3Partitioner, the return type is
<code>bigint</code>.</li><li>For RandomPartitioner, the return type is
<code>varint</code>.</li><li>For ByteOrderedPartitioner,
the return type is <code>blob</code>.</li></ul><p>For instance, in a
cluster using the default Murmur3Partitioner, if a table is defined by</p><pre
class="sample"><pre>CREATE TABLE users (
+ userid text PRIMARY KEY,
+ username text,
+ ...
+)
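+
+// Sketch (an added illustration; 'tom' is a hypothetical userid): token()
+// applied to the partition key, both in the selection and in a WHERE clause
+// that walks partitions in token order.
+SELECT userid, token(userid) FROM users WHERE token(userid) > token('tom');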
+</pre></pre><p>then the <code>token</code> function will take a single
argument of type <code>text</code> (in that case, the partition key is
<code>userid</code>: there are no clustering columns, so the partition key is the
same as the primary key), and the return type will be
<code>bigint</code>.</p><h3 id="uuidFun">Uuid</h3><p>The <code>uuid</code>
function takes no parameters and generates a random type 4 uuid suitable for
use in INSERT or SET statements.</p><h3 id="timeuuidFun">Timeuuid
functions</h3><h4 id="now"><code>now</code></h4><p>The <code>now</code>
function takes no arguments and generates a new unique timeuuid (at the time
when the statement using it is executed). Note that this method is useful for
insertion but is largely nonsensical in <code>WHERE</code> clauses. For
instance, a query of the form</p><pre class="sample"><pre>SELECT * FROM myTable
WHERE t = now()
+</pre></pre><p>will never return any result by design, since the value
returned by <code>now()</code> is guaranteed to be unique.</p><h4
id="minTimeuuidandmaxTimeuuid"><code>minTimeuuid</code> and
<code>maxTimeuuid</code></h4><p>The <code>minTimeuuid</code> (resp.
<code>maxTimeuuid</code>) function takes a <code>timestamp</code> value
<code>t</code> (which can be <a href="#usingdates">either a timestamp or a date
string</a>) and returns a <em>fake</em> <code>timeuuid</code> corresponding to
the <em>smallest</em> (resp. <em>biggest</em>) possible <code>timeuuid</code>
having <code>t</code> as its timestamp. So for instance:</p> <pre
class="sample"><pre>SELECT * FROM myTable WHERE t > maxTimeuuid('2013-01-01
00:05+0000') AND t < minTimeuuid('2013-02-02 10:00+0000')
[... 3 lines stripped ...]
Modified: cassandra/site/publish/doc/cql3/CQL.html
URL:
http://svn.apache.org/viewvc/cassandra/site/publish/doc/cql3/CQL.html?rev=1660693&r1=1660692&r2=1660693&view=diff
==============================================================================
--- cassandra/site/publish/doc/cql3/CQL.html (original)
+++ cassandra/site/publish/doc/cql3/CQL.html Wed Feb 18 18:05:44 2015
@@ -1 +1 @@
-link CQL-2.0.html
\ No newline at end of file
+link CQL-2.1.html
\ No newline at end of file