http://git-wip-us.apache.org/repos/asf/cassandra/blob/0e624238/doc/source/cql/functions.rst ---------------------------------------------------------------------- diff --git a/doc/source/cql/functions.rst b/doc/source/cql/functions.rst index cf52ace..efcdf32 100644 --- a/doc/source/cql/functions.rst +++ b/doc/source/cql/functions.rst @@ -16,109 +16,113 @@ .. highlight:: sql -.. _cql_functions: +.. _cql-functions: + +.. Need some intro for UDF and native functions in general and point those to it. +.. _udfs: +.. _native-functions: Functions --------- -CQL3 distinguishes between built-in functions (so called ânative -functionsâ) and `user-defined functions <#udfs>`__. CQL3 includes -several native functions, described below: +CQL supports 2 main categories of functions: + +- the :ref:`scalar functions <scalar-functions>`, which simply take a number of values and produce an output with it. +- the :ref:`aggregate functions <aggregate-functions>`, which are used to aggregate multiple rows results from a + ``SELECT`` statement. + +In both cases, CQL provides a number of native "hard-coded" functions as well as the ability to create new user-defined +functions. + +.. note:: By default, the use of user-defined functions is disabled by default for security concerns (even when + enabled, the execution of user-defined functions is sandboxed and a "rogue" function should not be allowed to do + evil, but no sandbox is perfect so using user-defined functions is opt-in). See the ``enable_user_defined_functions`` + in ``cassandra.yaml`` to enable them. + +.. _scalar-functions: Scalar functions ^^^^^^^^^^^^^^^^ +.. _scalar-native-functions: + Native functions ~~~~~~~~~~~~~~~~ Cast ```` -The ``cast`` function can be used to converts one native datatype to -another. - -The following table describes the conversions supported by the ``cast`` -function. Cassandra will silently ignore any cast converting a datatype -into its own datatype. - -+-----------------+-------------------------------------------------------------------------------------------------------------------------+ -| from | to | -+=================+=========================================================================================================================+ -| ``ascii`` | ``text``, ``varchar`` | -+-----------------+-------------------------------------------------------------------------------------------------------------------------+ -| ``bigint`` | ``tinyint``, ``smallint``, ``int``, ``float``, ``double``, ``decimal``, ``varint``, ``text``, ``varchar`` | -+-----------------+-------------------------------------------------------------------------------------------------------------------------+ -| ``boolean`` | ``text``, ``varchar`` | -+-----------------+-------------------------------------------------------------------------------------------------------------------------+ -| ``counter`` | ``tinyint``, ``smallint``, ``int``, ``bigint``, ``float``, ``double``, ``decimal``, ``varint``, ``text``, ``varchar`` | -+-----------------+-------------------------------------------------------------------------------------------------------------------------+ -| ``date`` | ``timestamp`` | -+-----------------+-------------------------------------------------------------------------------------------------------------------------+ -| ``decimal`` | ``tinyint``, ``smallint``, ``int``, ``bigint``, ``float``, ``double``, ``varint``, ``text``, ``varchar`` | -+-----------------+-------------------------------------------------------------------------------------------------------------------------+ -| ``double`` | ``tinyint``, ``smallint``, ``int``, ``bigint``, ``float``, ``decimal``, ``varint``, ``text``, ``varchar`` | -+-----------------+-------------------------------------------------------------------------------------------------------------------------+ -| ``float`` | ``tinyint``, ``smallint``, ``int``, ``bigint``, ``double``, ``decimal``, ``varint``, ``text``, ``varchar`` | -+-----------------+-------------------------------------------------------------------------------------------------------------------------+ -| ``inet`` | ``text``, ``varchar`` | -+-----------------+-------------------------------------------------------------------------------------------------------------------------+ -| ``int`` | ``tinyint``, ``smallint``, ``bigint``, ``float``, ``double``, ``decimal``, ``varint``, ``text``, ``varchar`` | -+-----------------+-------------------------------------------------------------------------------------------------------------------------+ -| ``smallint`` | ``tinyint``, ``int``, ``bigint``, ``float``, ``double``, ``decimal``, ``varint``, ``text``, ``varchar`` | -+-----------------+-------------------------------------------------------------------------------------------------------------------------+ -| ``time`` | ``text``, ``varchar`` | -+-----------------+-------------------------------------------------------------------------------------------------------------------------+ -| ``timestamp`` | ``date``, ``text``, ``varchar`` | -+-----------------+-------------------------------------------------------------------------------------------------------------------------+ -| ``timeuuid`` | ``timestamp``, ``date``, ``text``, ``varchar`` | -+-----------------+-------------------------------------------------------------------------------------------------------------------------+ -| ``tinyint`` | ``tinyint``, ``smallint``, ``int``, ``bigint``, ``float``, ``double``, ``decimal``, ``varint``, ``text``, ``varchar`` | -+-----------------+-------------------------------------------------------------------------------------------------------------------------+ -| ``uuid`` | ``text``, ``varchar`` | -+-----------------+-------------------------------------------------------------------------------------------------------------------------+ -| ``varint`` | ``tinyint``, ``smallint``, ``int``, ``bigint``, ``float``, ``double``, ``decimal``, ``text``, ``varchar`` | -+-----------------+-------------------------------------------------------------------------------------------------------------------------+ - -The conversions rely strictly on Javaâs semantics. For example, the -double value 1 will be converted to the text value â1.0â. - -| bc(sample). -| SELECT avg(cast(count as double)) FROM myTable +The ``cast`` function can be used to converts one native datatype to another. + +The following table describes the conversions supported by the ``cast`` function. Cassandra will silently ignore any +cast converting a datatype into its own datatype. + +=============== ======================================================================================================= + From To +=============== ======================================================================================================= + ``ascii`` ``text``, ``varchar`` + ``bigint`` ``tinyint``, ``smallint``, ``int``, ``float``, ``double``, ``decimal``, ``varint``, ``text``, + ``varchar`` + ``boolean`` ``text``, ``varchar`` + ``counter`` ``tinyint``, ``smallint``, ``int``, ``bigint``, ``float``, ``double``, ``decimal``, ``varint``, + ``text``, ``varchar`` + ``date`` ``timestamp`` + ``decimal`` ``tinyint``, ``smallint``, ``int``, ``bigint``, ``float``, ``double``, ``varint``, ``text``, + ``varchar`` + ``double`` ``tinyint``, ``smallint``, ``int``, ``bigint``, ``float``, ``decimal``, ``varint``, ``text``, + ``varchar`` + ``float`` ``tinyint``, ``smallint``, ``int``, ``bigint``, ``double``, ``decimal``, ``varint``, ``text``, + ``varchar`` + ``inet`` ``text``, ``varchar`` + ``int`` ``tinyint``, ``smallint``, ``bigint``, ``float``, ``double``, ``decimal``, ``varint``, ``text``, + ``varchar`` + ``smallint`` ``tinyint``, ``int``, ``bigint``, ``float``, ``double``, ``decimal``, ``varint``, ``text``, + ``varchar`` + ``time`` ``text``, ``varchar`` + ``timestamp`` ``date``, ``text``, ``varchar`` + ``timeuuid`` ``timestamp``, ``date``, ``text``, ``varchar`` + ``tinyint`` ``tinyint``, ``smallint``, ``int``, ``bigint``, ``float``, ``double``, ``decimal``, ``varint``, + ``text``, ``varchar`` + ``uuid`` ``text``, ``varchar`` + ``varint`` ``tinyint``, ``smallint``, ``int``, ``bigint``, ``float``, ``double``, ``decimal``, ``text``, + ``varchar`` +=============== ======================================================================================================= + +The conversions rely strictly on Java's semantics. For example, the double value 1 will be converted to the text value +'1.0'. For instance:: + + SELECT avg(cast(count as double)) FROM myTable Token ````` -The ``token`` function allows to compute the token for a given partition -key. The exact signature of the token function depends on the table -concerned and of the partitioner used by the cluster. +The ``token`` function allows to compute the token for a given partition key. The exact signature of the token function +depends on the table concerned and of the partitioner used by the cluster. -The type of the arguments of the ``token`` depend on the type of the -partition key columns. The return type depend on the partitioner in use: +The type of the arguments of the ``token`` depend on the type of the partition key columns. The return type depend on +the partitioner in use: -- For Murmur3Partitioner, the return type is ``bigint``. -- For RandomPartitioner, the return type is ``varint``. -- For ByteOrderedPartitioner, the return type is ``blob``. +- For Murmur3Partitioner, the return type is ``bigint``. +- For RandomPartitioner, the return type is ``varint``. +- For ByteOrderedPartitioner, the return type is ``blob``. -For instance, in a cluster using the default Murmur3Partitioner, if a -table is defined by +For instance, in a cluster using the default Murmur3Partitioner, if a table is defined by:: -| bc(sample). -| CREATE TABLE users ( -| userid text PRIMARY KEY, -| username text, -| ⦠-| ) + CREATE TABLE users ( + userid text PRIMARY KEY, + username text, + ) -then the ``token`` function will take a single argument of type ``text`` -(in that case, the partition key is ``userid`` (there is no clustering -columns so the partition key is the same than the primary key)), and the -return type will be ``bigint``. +then the ``token`` function will take a single argument of type ``text`` (in that case, the partition key is ``userid`` +(there is no clustering columns so the partition key is the same than the primary key)), and the return type will be +``bigint``. Uuid ```` +The ``uuid`` function takes no parameters and generates a random type 4 uuid suitable for use in ``INSERT`` or +``UPDATE`` statements. -The ``uuid`` function takes no parameters and generates a random type 4 -uuid suitable for use in INSERT or SET statements. +.. _timeuuid-functions: Timeuuid functions `````````````````` @@ -126,252 +130,199 @@ Timeuuid functions ``now`` ####### -The ``now`` function takes no arguments and generates, on the -coordinator node, a new unique timeuuid (at the time where the statement -using it is executed). Note that this method is useful for insertion but -is largely non-sensical in ``WHERE`` clauses. For instance, a query of -the form +The ``now`` function takes no arguments and generates, on the coordinator node, a new unique timeuuid (at the time where +the statement using it is executed). Note that this method is useful for insertion but is largely non-sensical in +``WHERE`` clauses. For instance, a query of the form:: -| bc(sample). -| SELECT \* FROM myTable WHERE t = now() + SELECT * FROM myTable WHERE t = now() -will never return any result by design, since the value returned by -``now()`` is guaranteed to be unique. +will never return any result by design, since the value returned by ``now()`` is guaranteed to be unique. ``minTimeuuid`` and ``maxTimeuuid`` ################################### -The ``minTimeuuid`` (resp. ``maxTimeuuid``) function takes a -``timestamp`` value ``t`` (which can be `either a timestamp or a date -string <#usingtimestamps>`__ ) and return a *fake* ``timeuuid`` -corresponding to the *smallest* (resp. *biggest*) possible ``timeuuid`` -having for timestamp ``t``. So for instance: - -| bc(sample). -| SELECT \* FROM myTable WHERE t > maxTimeuuid(â2013-01-01 00:05+0000â) - AND t < minTimeuuid(â2013-02-02 10:00+0000â) - -will select all rows where the ``timeuuid`` column ``t`` is strictly -older than â2013-01-01 00:05+0000â but strictly younger than â2013-02-02 -10:00+0000â. Please note that -``t >= maxTimeuuid('2013-01-01 00:05+0000')`` would still *not* select a -``timeuuid`` generated exactly at â2013-01-01 00:05+0000â and is -essentially equivalent to ``t > maxTimeuuid('2013-01-01 00:05+0000')``. - -*Warning*: We called the values generated by ``minTimeuuid`` and -``maxTimeuuid`` *fake* UUID because they do no respect the Time-Based -UUID generation process specified by the `RFC -4122 <http://www.ietf.org/rfc/rfc4122.txt>`__. In particular, the value -returned by these 2 methods will not be unique. This means you should -only use those methods for querying (as in the example above). Inserting -the result of those methods is almost certainly *a bad idea*. +The ``minTimeuuid`` (resp. ``maxTimeuuid``) function takes a ``timestamp`` value ``t`` (which can be `either a timestamp +or a date string <timestamps>`) and return a *fake* ``timeuuid`` corresponding to the *smallest* (resp. *biggest*) +possible ``timeuuid`` having for timestamp ``t``. So for instance:: + + SELECT * FROM myTable + WHERE t > maxTimeuuid('2013-01-01 00:05+0000') + AND t < minTimeuuid('2013-02-02 10:00+0000') + +will select all rows where the ``timeuuid`` column ``t`` is strictly older than ``'2013-01-01 00:05+0000'`` but strictly +younger than ``'2013-02-02 10:00+0000'``. Please note that ``t >= maxTimeuuid('2013-01-01 00:05+0000')`` would still +*not* select a ``timeuuid`` generated exactly at '2013-01-01 00:05+0000' and is essentially equivalent to ``t > +maxTimeuuid('2013-01-01 00:05+0000')``. + +.. note:: We called the values generated by ``minTimeuuid`` and ``maxTimeuuid`` *fake* UUID because they do no respect + the Time-Based UUID generation process specified by the `RFC 4122 <http://www.ietf.org/rfc/rfc4122.txt>`__. In + particular, the value returned by these 2 methods will not be unique. This means you should only use those methods + for querying (as in the example above). Inserting the result of those methods is almost certainly *a bad idea*. Time conversion functions ````````````````````````` -A number of functions are provided to âconvertâ a ``timeuuid``, a -``timestamp`` or a ``date`` into another ``native`` type. - -+-----------------------+-----------------+-------------------------------------------------------------------+ -| function name | input type | description | -+=======================+=================+===================================================================+ -| ``toDate`` | ``timeuuid`` | Converts the ``timeuuid`` argument into a ``date`` type | -+-----------------------+-----------------+-------------------------------------------------------------------+ -| ``toDate`` | ``timestamp`` | Converts the ``timestamp`` argument into a ``date`` type | -+-----------------------+-----------------+-------------------------------------------------------------------+ -| ``toTimestamp`` | ``timeuuid`` | Converts the ``timeuuid`` argument into a ``timestamp`` type | -+-----------------------+-----------------+-------------------------------------------------------------------+ -| ``toTimestamp`` | ``date`` | Converts the ``date`` argument into a ``timestamp`` type | -+-----------------------+-----------------+-------------------------------------------------------------------+ -| ``toUnixTimestamp`` | ``timeuuid`` | Converts the ``timeuuid`` argument into a ``bigInt`` raw value | -+-----------------------+-----------------+-------------------------------------------------------------------+ -| ``toUnixTimestamp`` | ``timestamp`` | Converts the ``timestamp`` argument into a ``bigInt`` raw value | -+-----------------------+-----------------+-------------------------------------------------------------------+ -| ``toUnixTimestamp`` | ``date`` | Converts the ``date`` argument into a ``bigInt`` raw value | -+-----------------------+-----------------+-------------------------------------------------------------------+ -| ``dateOf`` | ``timeuuid`` | Similar to ``toTimestamp(timeuuid)`` (DEPRECATED) | -+-----------------------+-----------------+-------------------------------------------------------------------+ -| ``unixTimestampOf`` | ``timeuuid`` | Similar to ``toUnixTimestamp(timeuuid)`` (DEPRECATED) | -+-----------------------+-----------------+-------------------------------------------------------------------+ +A number of functions are provided to âconvertâ a ``timeuuid``, a ``timestamp`` or a ``date`` into another ``native`` +type. + +===================== =============== =================================================================== + Function name Input type Description +===================== =============== =================================================================== + ``toDate`` ``timeuuid`` Converts the ``timeuuid`` argument into a ``date`` type + ``toDate`` ``timestamp`` Converts the ``timestamp`` argument into a ``date`` type + ``toTimestamp`` ``timeuuid`` Converts the ``timeuuid`` argument into a ``timestamp`` type + ``toTimestamp`` ``date`` Converts the ``date`` argument into a ``timestamp`` type + ``toUnixTimestamp`` ``timeuuid`` Converts the ``timeuuid`` argument into a ``bigInt`` raw value + ``toUnixTimestamp`` ``timestamp`` Converts the ``timestamp`` argument into a ``bigInt`` raw value + ``toUnixTimestamp`` ``date`` Converts the ``date`` argument into a ``bigInt`` raw value + ``dateOf`` ``timeuuid`` Similar to ``toTimestamp(timeuuid)`` (DEPRECATED) + ``unixTimestampOf`` ``timeuuid`` Similar to ``toUnixTimestamp(timeuuid)`` (DEPRECATED) +===================== =============== =================================================================== Blob conversion functions ````````````````````````` +A number of functions are provided to âconvertâ the native types into binary data (``blob``). For every +``<native-type>`` ``type`` supported by CQL (a notable exceptions is ``blob``, for obvious reasons), the function +``typeAsBlob`` takes a argument of type ``type`` and return it as a ``blob``. Conversely, the function ``blobAsType`` +takes a 64-bit ``blob`` argument and convert it to a ``bigint`` value. And so for instance, ``bigintAsBlob(3)`` is +``0x0000000000000003`` and ``blobAsBigint(0x0000000000000003)`` is ``3``. -A number of functions are provided to âconvertâ the native types into -binary data (``blob``). For every ``<native-type>`` ``type`` supported -by CQL3 (a notable exceptions is ``blob``, for obvious reasons), the -function ``typeAsBlob`` takes a argument of type ``type`` and return it -as a ``blob``. Conversely, the function ``blobAsType`` takes a 64-bit -``blob`` argument and convert it to a ``bigint`` value. And so for -instance, ``bigintAsBlob(3)`` is ``0x0000000000000003`` and -``blobAsBigint(0x0000000000000003)`` is ``3``. +.. _user-defined-scalar-functions: User-defined functions ~~~~~~~~~~~~~~~~~~~~~~ -User-defined functions allow execution of user-provided code in -Cassandra. By default, Cassandra supports defining functions in *Java* -and *JavaScript*. Support for other JSR 223 compliant scripting -languages (such as Python, Ruby, and Scala) can be added by adding a JAR -to the classpath. - -UDFs are part of the Cassandra schema. As such, they are automatically -propagated to all nodes in the cluster. - -UDFs can be *overloaded* - i.e. multiple UDFs with different argument -types but the same function name. Example: - -| bc(sample). -| CREATE FUNCTION sample ( arg int ) â¦; -| CREATE FUNCTION sample ( arg text ) â¦; - -User-defined functions are susceptible to all of the normal problems -with the chosen programming language. Accordingly, implementations -should be safe against null pointer exceptions, illegal arguments, or -any other potential source of exceptions. An exception during function -execution will result in the entire statement failing. - -It is valid to use *complex* types like collections, tuple types and -user-defined types as argument and return types. Tuple types and -user-defined types are handled by the conversion functions of the -DataStax Java Driver. Please see the documentation of the Java Driver -for details on handling tuple types and user-defined types. - -Arguments for functions can be literals or terms. Prepared statement -placeholders can be used, too. - -Note that you can use the double-quoted string syntax to enclose the UDF -source code. For example: - -| bc(sample).. -| CREATE FUNCTION some\_function ( arg int ) -| RETURNS NULL ON NULL INPUT -| RETURNS int -| LANGUAGE java -| AS $$ return arg; $$; - -| SELECT some\_function(column) FROM atable â¦; -| UPDATE atable SET col = some\_function(?) â¦; -| p. - -| bc(sample). -| CREATE TYPE custom\_type (txt text, i int); -| CREATE FUNCTION fct\_using\_udt ( udtarg frozen ) -| RETURNS NULL ON NULL INPUT -| RETURNS text -| LANGUAGE java -| AS $$ return udtarg.getString(âtxtâ); $$; - -User-defined functions can be used in ```SELECT`` <#selectStmt>`__, -```INSERT`` <#insertStmt>`__ and ```UPDATE`` <#updateStmt>`__ -statements. - -The implicitly available ``udfContext`` field (or binding for script -UDFs) provides the neccessary functionality to create new UDT and tuple -values. - -| bc(sample). -| CREATE TYPE custom\_type (txt text, i int); -| CREATE FUNCTION fct\_using\_udt ( somearg int ) -| RETURNS NULL ON NULL INPUT -| RETURNS custom\_type -| LANGUAGE java -| AS $$ -| UDTValue udt = udfContext.newReturnUDTValue(); -| udt.setString(âtxtâ, âsome stringâ); -| udt.setInt(âiâ, 42); -| return udt; -| $$; - -The definition of the ``UDFContext`` interface can be found in the -Apache Cassandra source code for +User-defined functions allow execution of user-provided code in Cassandra. By default, Cassandra supports defining +functions in *Java* and *JavaScript*. Support for other JSR 223 compliant scripting languages (such as Python, Ruby, and +Scala) can be added by adding a JAR to the classpath. + +UDFs are part of the Cassandra schema. As such, they are automatically propagated to all nodes in the cluster. + +UDFs can be *overloaded* - i.e. multiple UDFs with different argument types but the same function name. Example:: + + CREATE FUNCTION sample ( arg int ) ...; + CREATE FUNCTION sample ( arg text ) ...; + +User-defined functions are susceptible to all of the normal problems with the chosen programming language. Accordingly, +implementations should be safe against null pointer exceptions, illegal arguments, or any other potential source of +exceptions. An exception during function execution will result in the entire statement failing. + +It is valid to use *complex* types like collections, tuple types and user-defined types as argument and return types. +Tuple types and user-defined types are handled by the conversion functions of the DataStax Java Driver. Please see the +documentation of the Java Driver for details on handling tuple types and user-defined types. + +Arguments for functions can be literals or terms. Prepared statement placeholders can be used, too. + +Note that you can use the double-quoted string syntax to enclose the UDF source code. For example:: + + CREATE FUNCTION some_function ( arg int ) + RETURNS NULL ON NULL INPUT + RETURNS int + LANGUAGE java + AS $$ return arg; $$; + + SELECT some_function(column) FROM atable ...; + UPDATE atable SET col = some_function(?) ...; + + CREATE TYPE custom_type (txt text, i int); + CREATE FUNCTION fct_using_udt ( udtarg frozen ) + RETURNS NULL ON NULL INPUT + RETURNS text + LANGUAGE java + AS $$ return udtarg.getString("txt"); $$; + +User-defined functions can be used in ``SELECT``, ``INSERT`` and ``UPDATE`` statements. + +The implicitly available ``udfContext`` field (or binding for script UDFs) provides the necessary functionality to +create new UDT and tuple values:: + + CREATE TYPE custom\_type (txt text, i int); + CREATE FUNCTION fct\_using\_udt ( somearg int ) + RETURNS NULL ON NULL INPUT + RETURNS custom\_type + LANGUAGE java + AS $$ + UDTValue udt = udfContext.newReturnUDTValue(); + udt.setString(âtxtâ, âsome stringâ); + udt.setInt(âiâ, 42); + return udt; + $$; + +The definition of the ``UDFContext`` interface can be found in the Apache Cassandra source code for ``org.apache.cassandra.cql3.functions.UDFContext``. -| bc(sample). -| public interface UDFContext -| { -| UDTValue newArgUDTValue(String argName); -| UDTValue newArgUDTValue(int argNum); -| UDTValue newReturnUDTValue(); -| UDTValue newUDTValue(String udtName); -| TupleValue newArgTupleValue(String argName); -| TupleValue newArgTupleValue(int argNum); -| TupleValue newReturnTupleValue(); -| TupleValue newTupleValue(String cqlDefinition); -| } - -| Java UDFs already have some imports for common interfaces and classes - defined. These imports are: -| Please note, that these convenience imports are not available for - script UDFs. - -| bc(sample). -| import java.nio.ByteBuffer; -| import java.util.List; -| import java.util.Map; -| import java.util.Set; -| import org.apache.cassandra.cql3.functions.UDFContext; -| import com.datastax.driver.core.TypeCodec; -| import com.datastax.driver.core.TupleValue; -| import com.datastax.driver.core.UDTValue; - -See ```CREATE FUNCTION`` <#createFunctionStmt>`__ and -```DROP FUNCTION`` <#dropFunctionStmt>`__. +.. code-block:: java -CREATE FUNCTION -``````````````` + public interface UDFContext + { + UDTValue newArgUDTValue(String argName); + UDTValue newArgUDTValue(int argNum); + UDTValue newReturnUDTValue(); + UDTValue newUDTValue(String udtName); + TupleValue newArgTupleValue(String argName); + TupleValue newArgTupleValue(int argNum); + TupleValue newReturnTupleValue(); + TupleValue newTupleValue(String cqlDefinition); + } -*Syntax:* - -| bc(syntax).. -| ::= CREATE ( OR REPLACE )? -| FUNCTION ( IF NOT EXISTS )? -| ( â.â )? -| â(â ( â,â )\* â)â -| ( CALLED \| RETURNS NULL ) ON NULL INPUT -| RETURNS -| LANGUAGE -| AS -| p. -| *Sample:* - -| bc(sample). -| CREATE OR REPLACE FUNCTION somefunction -| ( somearg int, anotherarg text, complexarg frozen, listarg list ) -| RETURNS NULL ON NULL INPUT -| RETURNS text -| LANGUAGE java -| AS $$ -| // some Java code -| $$; -| CREATE FUNCTION akeyspace.fname IF NOT EXISTS -| ( someArg int ) -| CALLED ON NULL INPUT -| RETURNS text -| LANGUAGE java -| AS $$ -| // some Java code -| $$; - -``CREATE FUNCTION`` creates or replaces a user-defined function. +Java UDFs already have some imports for common interfaces and classes defined. These imports are: -Function Signature -################## +.. code-block:: java -Signatures are used to distinguish individual functions. The signature -consists of: + import java.nio.ByteBuffer; + import java.util.List; + import java.util.Map; + import java.util.Set; + import org.apache.cassandra.cql3.functions.UDFContext; + import com.datastax.driver.core.TypeCodec; + import com.datastax.driver.core.TupleValue; + import com.datastax.driver.core.UDTValue; -#. The fully qualified function name - i.e *keyspace* plus - *function-name* -#. The concatenated list of all argument types +Please note, that these convenience imports are not available for script UDFs. -Note that keyspace names, function names and argument types are subject -to the default naming conventions and case-sensitivity rules. +.. _create-function-statement: -``CREATE FUNCTION`` with the optional ``OR REPLACE`` keywords either -creates a function or replaces an existing one with the same signature. -A ``CREATE FUNCTION`` without ``OR REPLACE`` fails if a function with -the same signature already exists. +CREATE FUNCTION +``````````````` + +Creating a new user-defined function uses the ``CREATE FUNCTION`` statement: + +.. productionlist:: + create_function_statement: CREATE [ OR REPLACE ] FUNCTION [ IF NOT EXISTS] + : `function_name` '(' `arguments_declaration` ')' + : [ CALLED | RETURNS NULL ] ON NULL INPUT + : RETURNS `cql_type` + : LANGUAGE `identifier` + : AS `string` + arguments_declaration: `identifier` `cql_type` ( ',' `identifier` `cql_type` )* + +For instance:: + + CREATE OR REPLACE FUNCTION somefunction(somearg int, anotherarg text, complexarg frozen<someUDT>, listarg list) + RETURNS NULL ON NULL INPUT + RETURNS text + LANGUAGE java + AS $$ + // some Java code + $$; + + CREATE FUNCTION IF NOT EXISTS akeyspace.fname(someArg int) + CALLED ON NULL INPUT + RETURNS text + LANGUAGE java + AS $$ + // some Java code + $$; + +``CREATE FUNCTION`` with the optional ``OR REPLACE`` keywords either creates a function or replaces an existing one with +the same signature. A ``CREATE FUNCTION`` without ``OR REPLACE`` fails if a function with the same signature already +exists. + +If the optional ``IF NOT EXISTS`` keywords are used, the function will +only be created if another function with the same signature does not +exist. + +``OR REPLACE`` and ``IF NOT EXISTS`` cannot be used together. Behavior on invocation with ``null`` values must be defined for each function. There are two options: @@ -381,281 +332,222 @@ function. There are two options: #. ``CALLED ON NULL INPUT`` declares that the function will always be executed. -If the optional ``IF NOT EXISTS`` keywords are used, the function will -only be created if another function with the same signature does not -exist. +Function Signature +################## -``OR REPLACE`` and ``IF NOT EXISTS`` cannot be used together. +Signatures are used to distinguish individual functions. The signature consists of: + +#. The fully qualified function name - i.e *keyspace* plus *function-name* +#. The concatenated list of all argument types + +Note that keyspace names, function names and argument types are subject to the default naming conventions and +case-sensitivity rules. -Functions belong to a keyspace. If no keyspace is specified in -``<function-name>``, the current keyspace is used (i.e. the keyspace -specified using the ```USE`` <#useStmt>`__ statement). It is not -possible to create a user-defined function in one of the system -keyspaces. +Functions belong to a keyspace. If no keyspace is specified in ``<function-name>``, the current keyspace is used (i.e. +the keyspace specified using the ``USE`` statement). It is not possible to create a user-defined function in one of the +system keyspaces. -See the section on `user-defined functions <#udfs>`__ for more -information. +.. _drop-function-statement: DROP FUNCTION ````````````` -*Syntax:* +Dropping a function uses the ``DROP FUNCTION`` statement: -| bc(syntax).. -| ::= DROP FUNCTION ( IF EXISTS )? -| ( â.â )? -| ( â(â ( â,â )\* â)â )? +.. productionlist:: + drop_function_statement: DROP FUNCTION [ IF EXISTS ] `function_name` [ '(' `arguments_signature` ')' ] + arguments_signature: `cql_type` ( ',' `cql_type` )* -*Sample:* +For instance:: -| bc(sample). -| DROP FUNCTION myfunction; -| DROP FUNCTION mykeyspace.afunction; -| DROP FUNCTION afunction ( int ); -| DROP FUNCTION afunction ( text ); + DROP FUNCTION myfunction; + DROP FUNCTION mykeyspace.afunction; + DROP FUNCTION afunction ( int ); + DROP FUNCTION afunction ( text ); -| ``DROP FUNCTION`` statement removes a function created using - ``CREATE FUNCTION``. -| You must specify the argument types - (`signature <#functionSignature>`__ ) of the function to drop if there - are multiple functions with the same name but a different signature - (overloaded functions). +You must specify the argument types (:token:`arguments_signature`) of the function to drop if there are multiple +functions with the same name but a different signature (overloaded functions). -``DROP FUNCTION`` with the optional ``IF EXISTS`` keywords drops a -function if it exists. +``DROP FUNCTION`` with the optional ``IF EXISTS`` keywords drops a function if it exists, but does not throw an error if +it doesn't + +.. _aggregate-functions: Aggregate functions ^^^^^^^^^^^^^^^^^^^ -| Aggregate functions work on a set of rows. They receive values for - each row and returns one value for the whole set. -| If ``normal`` columns, ``scalar functions``, ``UDT`` fields, - ``writetime`` or ``ttl`` are selected together with aggregate - functions, the values returned for them will be the ones of the first - row matching the query. +Aggregate functions work on a set of rows. They receive values for each row and returns one value for the whole set. -CQL3 distinguishes between built-in aggregates (so called ânative -aggregatesâ) and `user-defined aggregates <#udas>`__. CQL3 includes -several native aggregates, described below: +If ``normal`` columns, ``scalar functions``, ``UDT`` fields, ``writetime`` or ``ttl`` are selected together with +aggregate functions, the values returned for them will be the ones of the first row matching the query. Native aggregates ~~~~~~~~~~~~~~~~~ +.. _count-function: + Count ````` -The ``count`` function can be used to count the rows returned by a -query. Example: +The ``count`` function can be used to count the rows returned by a query. Example:: -| bc(sample). -| SELECT COUNT (\*) FROM plays; -| SELECT COUNT (1) FROM plays; + SELECT COUNT (*) FROM plays; + SELECT COUNT (1) FROM plays; -It also can be used to count the non null value of a given column. -Example: +It also can be used to count the non null value of a given column:: -| bc(sample). -| SELECT COUNT (scores) FROM plays; + SELECT COUNT (scores) FROM plays; Max and Min ``````````` -The ``max`` and ``min`` functions can be used to compute the maximum and -the minimum value returned by a query for a given column. +The ``max`` and ``min`` functions can be used to compute the maximum and the minimum value returned by a query for a +given column. For instance:: -| bc(sample). -| SELECT MIN (players), MAX (players) FROM plays WHERE game = âquakeâ; + SELECT MIN (players), MAX (players) FROM plays WHERE game = 'quake'; Sum ``` -The ``sum`` function can be used to sum up all the values returned by a -query for a given column. +The ``sum`` function can be used to sum up all the values returned by a query for a given column. For instance:: -| bc(sample). -| SELECT SUM (players) FROM plays; + SELECT SUM (players) FROM plays; Avg ``` -The ``avg`` function can be used to compute the average of all the -values returned by a query for a given column. +The ``avg`` function can be used to compute the average of all the values returned by a query for a given column. For +instance:: + + SELECT AVG (players) FROM plays; -| bc(sample). -| SELECT AVG (players) FROM plays; +.. _user-defined-aggregates-functions: User-Defined Aggregates ~~~~~~~~~~~~~~~~~~~~~~~ -User-defined aggregates allow creation of custom aggregate functions -using `UDFs <#udfs>`__. Common examples of aggregate functions are +User-defined aggregates allow the creation of custom aggregate functions. Common examples of aggregate functions are *count*, *min*, and *max*. -Each aggregate requires an *initial state* (``INITCOND``, which defaults -to ``null``) of type ``STYPE``. The first argument of the state function -must have type ``STYPE``. The remaining arguments of the state function -must match the types of the user-defined aggregate arguments. The state -function is called once for each row, and the value returned by the -state function becomes the new state. After all rows are processed, the -optional ``FINALFUNC`` is executed with last state value as its -argument. - -``STYPE`` is mandatory in order to be able to distinguish possibly -overloaded versions of the state and/or final function (since the -overload can appear after creation of the aggregate). - -User-defined aggregates can be used in ```SELECT`` <#selectStmt>`__ -statement. - -A complete working example for user-defined aggregates (assuming that a -keyspace has been selected using the ```USE`` <#useStmt>`__ statement): - -| bc(sample).. -| CREATE OR REPLACE FUNCTION averageState ( state tuple, val int ) -| CALLED ON NULL INPUT -| RETURNS tuple -| LANGUAGE java -| AS â - if (val != null) { - state.setInt(0, state.getInt(0)+1); - state.setLong(1, state.getLong(1)+val.intValue()); - } - return state; - â; - -| CREATE OR REPLACE FUNCTION averageFinal ( state tuple ) -| CALLED ON NULL INPUT -| RETURNS double -| LANGUAGE java -| AS â - double r = 0; - if (state.getInt(0) == 0) return null; - r = state.getLong(1); - r /= state.getInt(0); - return Double.valueOf®; - â; - -| CREATE OR REPLACE AGGREGATE average ( int ) -| SFUNC averageState -| STYPE tuple -| FINALFUNC averageFinal -| INITCOND (0, 0); - -| CREATE TABLE atable ( -| pk int PRIMARY KEY, -| val int); -| INSERT INTO atable (pk, val) VALUES (1,1); -| INSERT INTO atable (pk, val) VALUES (2,2); -| INSERT INTO atable (pk, val) VALUES (3,3); -| INSERT INTO atable (pk, val) VALUES (4,4); -| SELECT average(val) FROM atable; -| p. - -See ```CREATE AGGREGATE`` <#createAggregateStmt>`__ and -```DROP AGGREGATE`` <#dropAggregateStmt>`__. +Each aggregate requires an *initial state* (``INITCOND``, which defaults to ``null``) of type ``STYPE``. The first +argument of the state function must have type ``STYPE``. The remaining arguments of the state function must match the +types of the user-defined aggregate arguments. The state function is called once for each row, and the value returned by +the state function becomes the new state. After all rows are processed, the optional ``FINALFUNC`` is executed with last +state value as its argument. + +``STYPE`` is mandatory in order to be able to distinguish possibly overloaded versions of the state and/or final +function (since the overload can appear after creation of the aggregate). + +User-defined aggregates can be used in ``SELECT`` statement. + +A complete working example for user-defined aggregates (assuming that a keyspace has been selected using the ``USE`` +statement):: + + CREATE OR REPLACE FUNCTION averageState(state tuple<int,bigint>, val int) + CALLED ON NULL INPUT + RETURNS tuple + LANGUAGE java + AS ' + if (val != null) { + state.setInt(0, state.getInt(0)+1); + state.setLong(1, state.getLong(1)+val.intValue()); + } + return state; + '; + + CREATE OR REPLACE FUNCTION averageFinal (state tuple<int,bigint>) + CALLED ON NULL INPUT + RETURNS double + LANGUAGE java + AS ' + double r = 0; + if (state.getInt(0) == 0) return null; + r = state.getLong(1); + r /= state.getInt(0); + return Double.valueOf®; + '; + + CREATE OR REPLACE AGGREGATE average(int) + SFUNC averageState + STYPE tuple + FINALFUNC averageFinal + INITCOND (0, 0); + + CREATE TABLE atable ( + pk int PRIMARY KEY, + val int + ); + + INSERT INTO atable (pk, val) VALUES (1,1); + INSERT INTO atable (pk, val) VALUES (2,2); + INSERT INTO atable (pk, val) VALUES (3,3); + INSERT INTO atable (pk, val) VALUES (4,4); + + SELECT average(val) FROM atable; + +.. _create-aggregate-statement: CREATE AGGREGATE ```````````````` -*Syntax:* +Creating (or replacing) a user-defined aggregate function uses the ``CREATE AGGREGATE`` statement: -| bc(syntax).. -| ::= CREATE ( OR REPLACE )? -| AGGREGATE ( IF NOT EXISTS )? -| ( â.â )? -| â(â ( â,â )\* â)â -| SFUNC -| STYPE -| ( FINALFUNC )? -| ( INITCOND )? -| p. -| *Sample:* +.. productionlist:: + create_aggregate_statement: CREATE [ OR REPLACE ] AGGREGATE [ IF NOT EXISTS ] + : `function_name` '(' `arguments_signature` ')' + : SFUNC `function_name` + : STYPE `cql_type` + : [ FINALFUNC `function_name` ] + : [ INITCOND `term` ] -| bc(sample). -| CREATE AGGREGATE myaggregate ( val text ) -| SFUNC myaggregate\_state -| STYPE text -| FINALFUNC myaggregate\_final -| INITCOND âfooâ; +See above for a complete example. -See the section on `user-defined aggregates <#udas>`__ for a complete -example. +``CREATE AGGREGATE`` with the optional ``OR REPLACE`` keywords either creates an aggregate or replaces an existing one +with the same signature. A ``CREATE AGGREGATE`` without ``OR REPLACE`` fails if an aggregate with the same signature +already exists. -``CREATE AGGREGATE`` creates or replaces a user-defined aggregate. - -``CREATE AGGREGATE`` with the optional ``OR REPLACE`` keywords either -creates an aggregate or replaces an existing one with the same -signature. A ``CREATE AGGREGATE`` without ``OR REPLACE`` fails if an -aggregate with the same signature already exists. - -``CREATE AGGREGATE`` with the optional ``IF NOT EXISTS`` keywords either -creates an aggregate if it does not already exist. +``CREATE AGGREGATE`` with the optional ``IF NOT EXISTS`` keywords either creates an aggregate if it does not already +exist. ``OR REPLACE`` and ``IF NOT EXISTS`` cannot be used together. -Aggregates belong to a keyspace. If no keyspace is specified in -``<aggregate-name>``, the current keyspace is used (i.e. the keyspace -specified using the ```USE`` <#useStmt>`__ statement). It is not -possible to create a user-defined aggregate in one of the system -keyspaces. - -Signatures for user-defined aggregates follow the `same -rules <#functionSignature>`__ as for user-defined functions. - ``STYPE`` defines the type of the state value and must be specified. -The optional ``INITCOND`` defines the initial state value for the -aggregate. It defaults to ``null``. A non-\ ``null`` ``INITCOND`` must -be specified for state functions that are declared with -``RETURNS NULL ON NULL INPUT``. +The optional ``INITCOND`` defines the initial state value for the aggregate. It defaults to ``null``. A non-\ ``null`` +``INITCOND`` must be specified for state functions that are declared with ``RETURNS NULL ON NULL INPUT``. -``SFUNC`` references an existing function to be used as the state -modifying function. The type of first argument of the state function -must match ``STYPE``. The remaining argument types of the state function -must match the argument types of the aggregate function. State is not -updated for state functions declared with ``RETURNS NULL ON NULL INPUT`` -and called with ``null``. +``SFUNC`` references an existing function to be used as the state modifying function. The type of first argument of the +state function must match ``STYPE``. The remaining argument types of the state function must match the argument types of +the aggregate function. State is not updated for state functions declared with ``RETURNS NULL ON NULL INPUT`` and called +with ``null``. -The optional ``FINALFUNC`` is called just before the aggregate result is -returned. It must take only one argument with type ``STYPE``. The return -type of the ``FINALFUNC`` may be a different type. A final function -declared with ``RETURNS NULL ON NULL INPUT`` means that the aggregateâs -return value will be ``null``, if the last state is ``null``. +The optional ``FINALFUNC`` is called just before the aggregate result is returned. It must take only one argument with +type ``STYPE``. The return type of the ``FINALFUNC`` may be a different type. A final function declared with ``RETURNS +NULL ON NULL INPUT`` means that the aggregate's return value will be ``null``, if the last state is ``null``. -If no ``FINALFUNC`` is defined, the overall return type of the aggregate -function is ``STYPE``. If a ``FINALFUNC`` is defined, it is the return -type of that function. +If no ``FINALFUNC`` is defined, the overall return type of the aggregate function is ``STYPE``. If a ``FINALFUNC`` is +defined, it is the return type of that function. -See the section on `user-defined aggregates <#udas>`__ for more -information. +.. _drop-aggregate-statement: DROP AGGREGATE `````````````` -*Syntax:* - -| bc(syntax).. -| ::= DROP AGGREGATE ( IF EXISTS )? -| ( â.â )? -| ( â(â ( â,â )\* â)â )? -| p. +Dropping an user-defined aggregate function uses the ``DROP AGGREGATE`` statement: -*Sample:* +.. productionlist:: + drop_aggregate_statement: DROP AGGREGATE [ IF EXISTS ] `function_name` [ '(' `arguments_signature` ')' ] -| bc(sample). -| DROP AGGREGATE myAggregate; -| DROP AGGREGATE myKeyspace.anAggregate; -| DROP AGGREGATE someAggregate ( int ); -| DROP AGGREGATE someAggregate ( text ); +For instance:: -The ``DROP AGGREGATE`` statement removes an aggregate created using -``CREATE AGGREGATE``. You must specify the argument types of the -aggregate to drop if there are multiple aggregates with the same name -but a different signature (overloaded aggregates). + DROP AGGREGATE myAggregate; + DROP AGGREGATE myKeyspace.anAggregate; + DROP AGGREGATE someAggregate ( int ); + DROP AGGREGATE someAggregate ( text ); -``DROP AGGREGATE`` with the optional ``IF EXISTS`` keywords drops an -aggregate if it exists, and does nothing if a function with the -signature does not exist. +The ``DROP AGGREGATE`` statement removes an aggregate created using ``CREATE AGGREGATE``. You must specify the argument +types of the aggregate to drop if there are multiple aggregates with the same name but a different signature (overloaded +aggregates). -Signatures for user-defined aggregates follow the `same -rules <#functionSignature>`__ as for user-defined functions. +``DROP AGGREGATE`` with the optional ``IF EXISTS`` keywords drops an aggregate if it exists, and does nothing if a +function with the signature does not exist.
http://git-wip-us.apache.org/repos/asf/cassandra/blob/0e624238/doc/source/cql/index.rst ---------------------------------------------------------------------- diff --git a/doc/source/cql/index.rst b/doc/source/cql/index.rst index c736beb..718959c 100644 --- a/doc/source/cql/index.rst +++ b/doc/source/cql/index.rst @@ -14,8 +14,6 @@ .. See the License for the specific language governing permissions and .. limitations under the License. -.. _UUID: https://en.wikipedia.org/wiki/Universally_unique_identifier - The Cassandra Query Language (CQL) ================================== http://git-wip-us.apache.org/repos/asf/cassandra/blob/0e624238/doc/source/cql/indexes.rst ---------------------------------------------------------------------- diff --git a/doc/source/cql/indexes.rst b/doc/source/cql/indexes.rst index 5791022..40b9f49 100644 --- a/doc/source/cql/indexes.rst +++ b/doc/source/cql/indexes.rst @@ -16,69 +16,60 @@ .. highlight:: sql -Indexes -------- +.. _secondary-indexes: + +Secondary Indexes +----------------- + +.. _create-index-statement: CREATE INDEX ^^^^^^^^^^^^ -*Syntax:* - -| bc(syntax).. -| ::= CREATE ( CUSTOM )? INDEX ( IF NOT EXISTS )? ( )? -| ON â(â â)â -| ( USING ( WITH OPTIONS = )? )? - -| ::= -| \| keys( ) -| p. -| *Sample:* - -| bc(sample). -| CREATE INDEX userIndex ON NerdMovies (user); -| CREATE INDEX ON Mutants (abilityId); -| CREATE INDEX ON users (keys(favs)); -| CREATE CUSTOM INDEX ON users (email) USING âpath.to.the.IndexClassâ; -| CREATE CUSTOM INDEX ON users (email) USING âpath.to.the.IndexClassâ - WITH OPTIONS = {âstorageâ: â/mnt/ssd/indexes/â}; - -The ``CREATE INDEX`` statement is used to create a new (automatic) -secondary index for a given (existing) column in a given table. A name -for the index itself can be specified before the ``ON`` keyword, if -desired. If data already exists for the column, it will be indexed -asynchronously. After the index is created, new data for the column is -indexed automatically at insertion time. - -Attempting to create an already existing index will return an error -unless the ``IF NOT EXISTS`` option is used. If it is used, the -statement will be a no-op if the index already exists. +Creating a secondary index on a table uses the ``CREATE INDEX`` statement: + +.. productionlist:: + create_index_statement: CREATE [ CUSTOM ] INDEX [ IF NOT EXISTS ] [ `index_name` ] + : ON `table_name` '(' `index_identifier` ')' + : [ USING `string` [ WITH OPTIONS = `map_literal` ] ] + index_identifier: `column_name` + :| ( KEYS | VALUES | ENTRIES | FULL ) '(' `column_name` ')' + +For instance:: + + CREATE INDEX userIndex ON NerdMovies (user); + CREATE INDEX ON Mutants (abilityId); + CREATE INDEX ON users (keys(favs)); + CREATE CUSTOM INDEX ON users (email) USING 'path.to.the.IndexClass'; + CREATE CUSTOM INDEX ON users (email) USING 'path.to.the.IndexClass' WITH OPTIONS = {'storage': '/mnt/ssd/indexes/'}; + +The ``CREATE INDEX`` statement is used to create a new (automatic) secondary index for a given (existing) column in a +given table. A name for the index itself can be specified before the ``ON`` keyword, if desired. If data already exists +for the column, it will be indexed asynchronously. After the index is created, new data for the column is indexed +automatically at insertion time. + +Attempting to create an already existing index will return an error unless the ``IF NOT EXISTS`` option is used. If it +is used, the statement will be a no-op if the index already exists. Indexes on Map Keys ~~~~~~~~~~~~~~~~~~~ -When creating an index on a `map column <#map>`__, you may index either -the keys or the values. If the column identifier is placed within the -``keys()`` function, the index will be on the map keys, allowing you to -use ``CONTAINS KEY`` in ``WHERE`` clauses. Otherwise, the index will be -on the map values. +When creating an index on a :ref:`maps <maps>`, you may index either the keys or the values. If the column identifier is +placed within the ``keys()`` function, the index will be on the map keys, allowing you to use ``CONTAINS KEY`` in +``WHERE`` clauses. Otherwise, the index will be on the map values. + +.. _drop-index-statement: DROP INDEX ^^^^^^^^^^ -*Syntax:* - -bc(syntax). ::= DROP INDEX ( IF EXISTS )? ( â.â )? - -*Sample:* +Dropping a secondary index uses the ``DROP INDEX`` statement: -| bc(sample).. -| DROP INDEX userIndex; +.. productionlist:: + drop_index_statement: DROP INDEX [ IF EXISTS ] `index_name` -| DROP INDEX userkeyspace.address\_index; -| p. -| The ``DROP INDEX`` statement is used to drop an existing secondary - index. The argument of the statement is the index name, which may - optionally specify the keyspace of the index. +The ``DROP INDEX`` statement is used to drop an existing secondary index. The argument of the statement is the index +name, which may optionally specify the keyspace of the index. -If the index does not exists, the statement will return an error, unless -``IF EXISTS`` is used in which case the operation is a no-op. +If the index does not exists, the statement will return an error, unless ``IF EXISTS`` is used in which case the +operation is a no-op. http://git-wip-us.apache.org/repos/asf/cassandra/blob/0e624238/doc/source/cql/json.rst ---------------------------------------------------------------------- diff --git a/doc/source/cql/json.rst b/doc/source/cql/json.rst index 4f442b7..6482fd6 100644 --- a/doc/source/cql/json.rst +++ b/doc/source/cql/json.rst @@ -16,131 +16,97 @@ .. highlight:: sql +.. _cql-json: + JSON Support ------------ -Cassandra 2.2 introduces JSON support to ```SELECT`` <#selectStmt>`__ -and ```INSERT`` <#insertStmt>`__ statements. This support does not -fundamentally alter the CQL API (for example, the schema is still -enforced), it simply provides a convenient way to work with JSON -documents. +Cassandra 2.2 introduces JSON support to :ref:`SELECT <select-statement>` and :ref:`INSERT <insert-statement>` +statements. This support does not fundamentally alter the CQL API (for example, the schema is still enforced), it simply +provides a convenient way to work with JSON documents. SELECT JSON ^^^^^^^^^^^ -With ``SELECT`` statements, the new ``JSON`` keyword can be used to -return each row as a single ``JSON`` encoded map. The remainder of the -``SELECT`` statment behavior is the same. +With ``SELECT`` statements, the ``JSON`` keyword can be used to return each row as a single ``JSON`` encoded map. The +remainder of the ``SELECT`` statement behavior is the same. -The result map keys are the same as the column names in a normal result -set. For example, a statement like â``SELECT JSON a, ttl(b) FROM ...``â -would result in a map with keys ``"a"`` and ``"ttl(b)"``. However, this -is one notable exception: for symmetry with ``INSERT JSON`` behavior, -case-sensitive column names with upper-case letters will be surrounded -with double quotes. For example, â``SELECT JSON myColumn FROM ...``â -would result in a map key ``"\"myColumn\""`` (note the escaped quotes). +The result map keys are the same as the column names in a normal result set. For example, a statement like ``SELECT JSON +a, ttl(b) FROM ...`` would result in a map with keys ``"a"`` and ``"ttl(b)"``. However, this is one notable exception: +for symmetry with ``INSERT JSON`` behavior, case-sensitive column names with upper-case letters will be surrounded with +double quotes. For example, ``SELECT JSON myColumn FROM ...`` would result in a map key ``"\"myColumn\""`` (note the +escaped quotes). -The map values will ``JSON``-encoded representations (as described -below) of the result set values. +The map values will ``JSON``-encoded representations (as described below) of the result set values. INSERT JSON ^^^^^^^^^^^ -With ``INSERT`` statements, the new ``JSON`` keyword can be used to -enable inserting a ``JSON`` encoded map as a single row. The format of -the ``JSON`` map should generally match that returned by a -``SELECT JSON`` statement on the same table. In particular, -case-sensitive column names should be surrounded with double quotes. For -example, to insert into a table with two columns named âmyKeyâ and -âvalueâ, you would do the following: +With ``INSERT`` statements, the new ``JSON`` keyword can be used to enable inserting a ``JSON`` encoded map as a single +row. The format of the ``JSON`` map should generally match that returned by a ``SELECT JSON`` statement on the same +table. In particular, case-sensitive column names should be surrounded with double quotes. For example, to insert into a +table with two columns named "myKey" and "value", you would do the following:: -| bc(sample). -| INSERT INTO mytable JSON â{â\\âmyKey\\ââ: 0, `value <>`__ 0}â + INSERT INTO mytable JSON '{ "\"myKey\"": 0, "value": 0}' -Any columns which are ommitted from the ``JSON`` map will be defaulted -to a ``NULL`` value (which will result in a tombstone being created). +Any columns which are omitted from the ``JSON`` map will be defaulted to a ``NULL`` value (which will result in a +tombstone being created). JSON Encoding of Cassandra Data Types ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -Where possible, Cassandra will represent and accept data types in their -native ``JSON`` representation. Cassandra will also accept string -representations matching the CQL literal format for all single-field -types. For example, floats, ints, UUIDs, and dates can be represented by -CQL literal strings. However, compound types, such as collections, -tuples, and user-defined types must be represented by native ``JSON`` -collections (maps and lists) or a JSON-encoded string representation of -the collection. - -The following table describes the encodings that Cassandra will accept -in ``INSERT JSON`` values (and ``fromJson()`` arguments) as well as the -format Cassandra will use when returning data for ``SELECT JSON`` -statements (and ``fromJson()``): - -+-----------------+--------------------------+-----------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| type | formats accepted | return format | notes | -+=================+==========================+=================+===============================================================================================================================================================================================================+ -| ``ascii`` | string | string | Uses JSONâs ``\u`` character escape | -+-----------------+--------------------------+-----------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| ``bigint`` | integer, string | integer | String must be valid 64 bit integer | -+-----------------+--------------------------+-----------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| ``blob`` | string | string | String should be 0x followed by an even number of hex digits | -+-----------------+--------------------------+-----------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| ``boolean`` | boolean, string | boolean | String must be âtrueâ or âfalseâ | -+-----------------+--------------------------+-----------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| ``date`` | string | string | Date in format ``YYYY-MM-DD``, timezone UTC | -+-----------------+--------------------------+-----------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| ``decimal`` | integer, float, string | float | May exceed 32 or 64-bit IEEE-754 floating point precision in client-side decoder | -+-----------------+--------------------------+-----------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| ``double`` | integer, float, string | float | String must be valid integer or float | -+-----------------+--------------------------+-----------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| ``float`` | integer, float, string | float | String must be valid integer or float | -+-----------------+--------------------------+-----------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| ``inet`` | string | string | IPv4 or IPv6 address | -+-----------------+--------------------------+-----------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| ``int`` | integer, string | integer | String must be valid 32 bit integer | -+-----------------+--------------------------+-----------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| ``list`` | list, string | list | Uses JSONâs native list representation | -+-----------------+--------------------------+-----------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| ``map`` | map, string | map | Uses JSONâs native map representation | -+-----------------+--------------------------+-----------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| ``smallint`` | integer, string | integer | String must be valid 16 bit integer | -+-----------------+--------------------------+-----------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| ``set`` | list, string | list | Uses JSONâs native list representation | -+-----------------+--------------------------+-----------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| ``text`` | string | string | Uses JSONâs ``\u`` character escape | -+-----------------+--------------------------+-----------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| ``time`` | string | string | Time of day in format ``HH-MM-SS[.fffffffff]`` | -+-----------------+--------------------------+-----------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| ``timestamp`` | integer, string | string | A timestamp. Strings constant are allow to input timestamps as dates, see `Working with dates <#usingdates>`__ below for more information. Datestamps with format ``YYYY-MM-DD HH:MM:SS.SSS`` are returned. | -+-----------------+--------------------------+-----------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| ``timeuuid`` | string | string | Type 1 UUID. See `Constants <#constants>`__ for the UUID format | -+-----------------+--------------------------+-----------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| ``tinyint`` | integer, string | integer | String must be valid 8 bit integer | -+-----------------+--------------------------+-----------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| ``tuple`` | list, string | list | Uses JSONâs native list representation | -+-----------------+--------------------------+-----------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| ``UDT`` | map, string | map | Uses JSONâs native map representation with field names as keys | -+-----------------+--------------------------+-----------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| ``uuid`` | string | string | See `Constants <#constants>`__ for the UUID format | -+-----------------+--------------------------+-----------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| ``varchar`` | string | string | Uses JSONâs ``\u`` character escape | -+-----------------+--------------------------+-----------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| ``varint`` | integer, string | integer | Variable length; may overflow 32 or 64 bit integers in client-side decoder | -+-----------------+--------------------------+-----------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ +Where possible, Cassandra will represent and accept data types in their native ``JSON`` representation. Cassandra will +also accept string representations matching the CQL literal format for all single-field types. For example, floats, +ints, UUIDs, and dates can be represented by CQL literal strings. However, compound types, such as collections, tuples, +and user-defined types must be represented by native ``JSON`` collections (maps and lists) or a JSON-encoded string +representation of the collection. + +The following table describes the encodings that Cassandra will accept in ``INSERT JSON`` values (and ``fromJson()`` +arguments) as well as the format Cassandra will use when returning data for ``SELECT JSON`` statements (and +``fromJson()``): + +=============== ======================== =============== ============================================================== + Type Formats accepted Return format Notes +=============== ======================== =============== ============================================================== + ``ascii`` string string Uses JSON's ``\u`` character escape + ``bigint`` integer, string integer String must be valid 64 bit integer + ``blob`` string string String should be 0x followed by an even number of hex digits + ``boolean`` boolean, string boolean String must be "true" or "false" + ``date`` string string Date in format ``YYYY-MM-DD``, timezone UTC + ``decimal`` integer, float, string float May exceed 32 or 64-bit IEEE-754 floating point precision in + client-side decoder + ``double`` integer, float, string float String must be valid integer or float + ``float`` integer, float, string float String must be valid integer or float + ``inet`` string string IPv4 or IPv6 address + ``int`` integer, string integer String must be valid 32 bit integer + ``list`` list, string list Uses JSON's native list representation + ``map`` map, string map Uses JSON's native map representation + ``smallint`` integer, string integer String must be valid 16 bit integer + ``set`` list, string list Uses JSON's native list representation + ``text`` string string Uses JSON's ``\u`` character escape + ``time`` string string Time of day in format ``HH-MM-SS[.fffffffff]`` + ``timestamp`` integer, string string A timestamp. Strings constant allows to input :ref:`timestamps + as dates <timestamps>`. Datestamps with format ``YYYY-MM-DD + HH:MM:SS.SSS`` are returned. + ``timeuuid`` string string Type 1 UUID. See :token:`constant` for the UUID format + ``tinyint`` integer, string integer String must be valid 8 bit integer + ``tuple`` list, string list Uses JSON's native list representation + ``UDT`` map, string map Uses JSON's native map representation with field names as keys + ``uuid`` string string See :token:`constant` for the UUID format + ``varchar`` string string Uses JSON's ``\u`` character escape + ``varint`` integer, string integer Variable length; may overflow 32 or 64 bit integers in + client-side decoder +=============== ======================== =============== ============================================================== The fromJson() Function ^^^^^^^^^^^^^^^^^^^^^^^ -The ``fromJson()`` function may be used similarly to ``INSERT JSON``, -but for a single column value. It may only be used in the ``VALUES`` -clause of an ``INSERT`` statement or as one of the column values in an -``UPDATE``, ``DELETE``, or ``SELECT`` statement. For example, it cannot -be used in the selection clause of a ``SELECT`` statement. +The ``fromJson()`` function may be used similarly to ``INSERT JSON``, but for a single column value. It may only be used +in the ``VALUES`` clause of an ``INSERT`` statement or as one of the column values in an ``UPDATE``, ``DELETE``, or +``SELECT`` statement. For example, it cannot be used in the selection clause of a ``SELECT`` statement. The toJson() Function ^^^^^^^^^^^^^^^^^^^^^ -The ``toJson()`` function may be used similarly to ``SELECT JSON``, but -for a single column value. It may only be used in the selection clause -of a ``SELECT`` statement. +The ``toJson()`` function may be used similarly to ``SELECT JSON``, but for a single column value. It may only be used +in the selection clause of a ``SELECT`` statement. http://git-wip-us.apache.org/repos/asf/cassandra/blob/0e624238/doc/source/cql/mvs.rst ---------------------------------------------------------------------- diff --git a/doc/source/cql/mvs.rst b/doc/source/cql/mvs.rst index 41f601b..c6e3ef7 100644 --- a/doc/source/cql/mvs.rst +++ b/doc/source/cql/mvs.rst @@ -16,80 +16,145 @@ .. highlight:: sql +.. _materialized-views: + Materialized Views ------------------ +.. _create-materialized-view-statement: + CREATE MATERIALIZED VIEW ^^^^^^^^^^^^^^^^^^^^^^^^ -*Syntax:* - -| bc(syntax).. -| ::= CREATE MATERIALIZED VIEW ( IF NOT EXISTS )? AS -| SELECT ( â(â ( â,â ) \* â)â \| â\*â ) -| FROM -| ( WHERE )? -| PRIMARY KEY â(â ( â,â )\* â)â -| ( WITH ( AND )\* )? -| p. -| *Sample:* - -| bc(sample).. -| CREATE MATERIALIZED VIEW monkeySpecies\_by\_population AS -| SELECT \* -| FROM monkeySpecies -| WHERE population IS NOT NULL AND species IS NOT NULL -| PRIMARY KEY (population, species) -| WITH comment=âAllow query by population instead of speciesâ; -| p. -| The ``CREATE MATERIALIZED VIEW`` statement creates a new materialized - view. Each such view is a set of *rows* which corresponds to rows - which are present in the underlying, or base, table specified in the - ``SELECT`` statement. A materialized view cannot be directly updated, - but updates to the base table will cause corresponding updates in the - view. - -Attempting to create an already existing materialized view will return -an error unless the ``IF NOT EXISTS`` option is used. If it is used, the -statement will be a no-op if the materialized view already exists. - -``WHERE`` Clause -~~~~~~~~~~~~~~~~ - -The ``<where-clause>`` is similar to the `where clause of a ``SELECT`` -statement <#selectWhere>`__, with a few differences. First, the where -clause must contain an expression that disallows ``NULL`` values in -columns in the viewâs primary key. If no other restriction is desired, -this can be accomplished with an ``IS NOT NULL`` expression. Second, -only columns which are in the base tableâs primary key may be restricted -with expressions other than ``IS NOT NULL``. (Note that this second -restriction may be lifted in the future.) +You can create a materialized view on a table using a ``CREATE MATERIALIZED VIEW`` statement: + +.. productionlist:: + create_materialized_view_statement: CREATE MATERIALIZED VIEW [ IF NOT EXISTS ] `view_name` AS + : `select_statement` + : PRIMARY KEY '(' `primary_key` ')' + : WITH `table_options` + +For instance:: + + CREATE MATERIALIZED VIEW monkeySpecies_by_population AS + SELECT * FROM monkeySpecies + WHERE population IS NOT NULL AND species IS NOT NULL + PRIMARY KEY (population, species) + WITH comment=âAllow query by population instead of speciesâ; + +The ``CREATE MATERIALIZED VIEW`` statement creates a new materialized view. Each such view is a set of *rows* which +corresponds to rows which are present in the underlying, or base, table specified in the ``SELECT`` statement. A +materialized view cannot be directly updated, but updates to the base table will cause corresponding updates in the +view. + +Creating a materialized view has 3 main parts: + +- The :ref:`select statement <mv-select>` that restrict the data included in the view. +- The :ref:`primary key <mv-primary-key>` definition for the view. +- The :ref:`options <mv-options>` for the view. + +Attempting to create an already existing materialized view will return an error unless the ``IF NOT EXISTS`` option is +used. If it is used, the statement will be a no-op if the materialized view already exists. + +.. _mv-select: + +MV select statement +``````````````````` + +The select statement of a materialized view creation defines which of the base table is included in the view. That +statement is limited in a number of ways: + +- the :ref:`selection <selection-clause>` is limited to those that only select columns of the base table. In other + words, you can't use any function (aggregate or not), casting, term, etc. Aliases are also not supported. You can + however use `*` as a shortcut of selecting all columns. Further, :ref:`static columns <static-columns>` cannot be + included in a materialized view (which means ``SELECT *`` isn't allowed if the base table has static columns). +- the ``WHERE`` clause have the following restrictions: + + - it cannot include any :token:`bind_marker`. + - the columns that are not part of the *base table* primary key can only be restricted by an ``IS NOT NULL`` + restriction. No other restriction is allowed. + - as the columns that are part of the *view* primary key cannot be null, they must always be at least restricted by a + ``IS NOT NULL`` restriction (or any other restriction, but they must have one). + +- it cannot have neither an :ref:`ordering clause <ordering-clause>`, nor a :ref:`limit <limit-clause>`, nor :ref:`ALLOW + FILTERING <allow-filtering>`. + +.. _mv-primary-key: + +MV primary key +`````````````` + +A view must have a primary key and that primary key must conform to the following restrictions: + +- it must contain all the primary key columns of the base table. This ensures that every row of the view correspond to + exactly one row of the base table. +- it can only contain a single column that is not a primary key column in the base table. + +So for instance, give the following base table definition:: + + CREATE TABLE t ( + k int, + c1 int, + c2 int, + v1 int, + v2 int, + PRIMARY KEY (k, c1, c2) + ) + +then the following view definitions are allowed:: + + CREATE MATERIALIZED VIEW mv1 AS + SELECT * FROM t WHERE k IS NOT NULL AND c1 IS NOT NULL AND c2 IS NOT NULL + PRIMARY KEY (c1, k, c2) + + CREATE MATERIALIZED VIEW mv1 AS + SELECT * FROM t WHERE k IS NOT NULL AND c1 IS NOT NULL AND c2 IS NOT NULL + PRIMARY KEY (v1, k, c1, c2) + +but the following ones are **not** allowed:: + + // Error: cannot include both v1 and v2 in the primary key as both are not in the base table primary key + CREATE MATERIALIZED VIEW mv1 AS + SELECT * FROM t WHERE k IS NOT NULL AND c1 IS NOT NULL AND c2 IS NOT NULL AND v1 IS NOT NULL + PRIMARY KEY (v1, v2, k, c1, c2) + + // Error: must include k in the primary as it's a base table primary key column + CREATE MATERIALIZED VIEW mv1 AS + SELECT * FROM t WHERE c1 IS NOT NULL AND c2 IS NOT NULL + PRIMARY KEY (c1, c2) + + +.. _mv-options: + +MV options +`````````` + +A materialized view is internally implemented by a table and as such, creating a MV allows the :ref:`same options than +creating a table <create-table-options>`. + + +.. _alter-materialized-view-statement: ALTER MATERIALIZED VIEW ^^^^^^^^^^^^^^^^^^^^^^^ -*Syntax:* +After creation, you can alter the options of a materialized view using the ``ALTER MATERIALIZED VIEW`` statement: -| bc(syntax). ::= ALTER MATERIALIZED VIEW -| WITH ( AND )\* +.. productionlist:: + alter_materialized_view_statement: ALTER MATERIALIZED VIEW `view_name` WITH `table_options` -The ``ALTER MATERIALIZED VIEW`` statement allows options to be update; -these options are the same as \ ``CREATE TABLE``\ âs options. +The options that can be updated are the same than at creation time and thus the :ref:`same than for tables +<create-table-options>`. + +.. _drop-materialized-view-statement: DROP MATERIALIZED VIEW ^^^^^^^^^^^^^^^^^^^^^^ -*Syntax:* - -bc(syntax). ::= DROP MATERIALIZED VIEW ( IF EXISTS )? - -*Sample:* - -bc(sample). DROP MATERIALIZED VIEW monkeySpecies\_by\_population; +Dropping a materialized view users the ``DROP MATERIALIZED VIEW`` statement: -The ``DROP MATERIALIZED VIEW`` statement is used to drop an existing -materialized view. +.. productionlist:: + drop_materialized_view_statement: DROP MATERIALIZED VIEW [ IF EXISTS ] `view_name`; -If the materialized view does not exists, the statement will return an -error, unless ``IF EXISTS`` is used in which case the operation is a -no-op. +If the materialized view does not exists, the statement will return an error, unless ``IF EXISTS`` is used in which case +the operation is a no-op.