This is an automated email from the ASF dual-hosted git repository.

adelapena pushed a commit to branch cassandra-5.0
in repository https://gitbox.apache.org/repos/asf/cassandra.git


The following commit(s) were added to refs/heads/cassandra-5.0 by this push:
     new 5817799605 Restore accidentally deleted DDM doc
5817799605 is described below

commit 58177996058a57e4909b11bcc6e754c8a6e38f6d
Author: Andrés de la Peña <a.penya.gar...@gmail.com>
AuthorDate: Fri Aug 18 11:01:45 2023 +0100

    Restore accidentally deleted DDM doc
    
    patch by Andrés de la Peña; reviewed by Brandon Williams and Berenguer 
Blasi for CASSANDRA-18776
---
 .../pages/developing/cql/dynamic_data_masking.adoc | 178 +++++++++++++++++++++
 .../cassandra/pages/developing/cql/index.adoc      |   2 +-
 2 files changed, 179 insertions(+), 1 deletion(-)

diff --git 
a/doc/modules/cassandra/pages/developing/cql/dynamic_data_masking.adoc 
b/doc/modules/cassandra/pages/developing/cql/dynamic_data_masking.adoc
new file mode 100644
index 0000000000..8f523ac2ed
--- /dev/null
+++ b/doc/modules/cassandra/pages/developing/cql/dynamic_data_masking.adoc
@@ -0,0 +1,178 @@
+= Dynamic Data Masking
+
+Dynamic data masking (DDM) allows to obscure sensitive information while still 
allowing access to the masked columns.
+DDM doesn't change the stored data. Instead, it just presents the data on 
their obscured form during `SELECT` queries.
+This aims to provide some degree of protection against accidental data 
exposure. However, it's important to know that
+anyone with direct access to the sstable files will be able to read the clear 
data.
+
+== Masking functions
+
+DDM is based on a set of CQL native functions that obscure sensitive 
information. The available functions are:
+
+include::partial$masking_functions.adoc[]
+
+Those functions can be discretionarily used on `SELECT` queries to get an 
obscured view of the data. For example:
+
+[source,cql]
+----
+include::example$CQL/select_with_mask_functions.cql[]
+----
+
+== Attaching masking functions to table columns
+
+The masking functions can be permanently attached to the columns of a table.
+In that case, `SELECT` queries will always return the column values in their 
masked form.
+The masking will be transparent for the users running `SELECT` queries,
+so their only way to know that a column is masked will be consulting the table 
definition.
+
+This is an optional feature that should be enabled with the 
`dynamic_data_masking_enabled` property in `cassandra.yaml`,
+since it's disabled by default.
+
+The masks of the columns of a table can be defined on `CREATE TABLE` queries:
+
+[source,cql]
+----
+include::example$CQL/ddm_create_table.cql[]
+----
+
+Note that in the example above we are referencing the `mask_inner` function 
with two arguments.
+However, that CQL function actually has three arguments when explicitely used 
on `SELECT` queries.
+The first argument is always ommitted when attaching the function to a schema 
column.
+The value of that first argument is always interpreted as the value of the 
masked column, in this case a `text` column.
+For the same reason the call to `mask_default` attached to the column doesn't 
have any argument,
+even when that function requires one argument when explicitely used on 
`SELECT` queries.
+
+Data can be inserted into the masked table as usual. For example:
+
+[source,cql]
+----
+include::example$CQL/ddm_insert_data.cql[]
+----
+
+The attached column masks will make `SELECT` queries automatically return 
masked data,
+without the need of including the masking function on the query:
+
+[source,cql]
+----
+include::example$CQL/ddm_select_with_masked_columns.cql[]
+----
+
+The masking function attached to a column can be changed with an `ALTER TABLE` 
query:
+
+[source,cql]
+----
+include::example$CQL/ddm_alter_mask.cql[]
+----
+
+In a similar way, a masking function can be dettached from a column with an 
`ALTER TABLE` query:
+
+[source,cql]
+----
+include::example$CQL/ddm_drop_mask.cql[]
+----
+
+== Permissions
+
+The `UNMASK` permission allows users to retrieve the unmasked values of masked 
columns.
+The masks will only be applied to the results of a `SELECT` query if the user 
doesn't have the `UNMASK` permission.
+Ordinary users are created without the `UNMASK` permission, whereas superusers 
do have it.
+
+As an example, suppose that we have a table with masked columns:
+
+[source,cql]
+----
+include::example$CQL/ddm_create_table.cql[]
+----
+
+And we insert some data into the table:
+
+[source,cql]
+----
+include::example$CQL/ddm_insert_data.cql[]
+----
+
+[source,cql]
+----
+include::example$CQL/ddm_select_without_unmask_permission.cql[]
+----
+
+Then we create two users with `SELECT` permission for the table, but we only 
grant the `UNMASK` permission to one of
+the users:
+
+[source,cql]
+----
+include::example$CQL/ddm_create_users.cql[]
+----
+
+We can now see that the user with the `UNMASK` permission can see the clear 
data, without any masking:
+
+[source,cql]
+----
+include::example$CQL/ddm_select_with_unmask_permission.cql[]
+----
+
+However, the user without the `UNMASK` permission can only see the masked data:
+
+[source,cql]
+----
+include::example$CQL/ddm_select_without_unmask_permission.cql[]
+----
+
+The `UNMASK` permission works as any other permission. Thus, it can be revoked 
in any moment:
+
+[source,cql]
+----
+include::example$CQL/ddm_revoke_unmask.cql[]
+----
+
+Please note that the anonymous user that is used when authentication is 
disabled has all the permissions.
+Since it includes the `UNMASK` permission, that anonymous user will always see 
the clear data.
+In other words, attaching data masking functions to columns only makes sense 
if authentication is enabled.
+
+Users without the `UNMASK` permission are not allowed to use masked columns in 
the `WHERE` clause of a `SELECT` query.
+This prevents malicious users from figuring out the clear data by running 
exhaustive queries. For instance:
+
+[source,cql]
+----
+include::example$CQL/ddm_select_without_select_masked.cql[]
+----
+
+However, there are some use cases where trusted database users just need a 
useful way to produce masked data
+that will be served to untrusted external users.
+For example, a trusted app can connect to the database and extract masked data 
that will be served to its end users.
+In that case the trusted user (the app) can be given the `SELECT_MASKED` 
permission.
+That permission allows to use masked columns in the `WHERE` clause of a 
`SELECT` query,
+while still seeing the masked data in the query results. For instance:
+
+[source,cql]
+----
+include::example$CQL/ddm_select_with_select_masked.cql[]
+----
+
+== Custom functions
+
+xref:developing/cql/functions.adoc#user-defined-scalar-functions[User-defined 
functions (UDFs)] can be attached to a table column.
+The UDFs used for masking should belong to the same keyspace as the masked 
table.
+The column value to mask will be passed as the first argument of the attached 
UDF.
+Thus, the UDFs attached to a column should have at least one argument,
+and that argument should have the same type as the masked column.
+Also, the attached UDF should return values of the same type as the maked 
column. For instance:
+
+[source,cql]
+----
+include::example$CQL/ddm_create_table_with_udf.cql[]
+----
+
+This creates a dependency between the table schema and the functions.
+Any attempt to drop the function will be rejected while this dependency exists.
+Thus, to drop the function you should first drop the mask.
+This can be done with:
+
+[source,cql]
+----
+include::example$CQL/ddm_drop_mask.cql[]
+----
+
+Dropping the column, or its containing table, or its containing keyspace would 
also remove the dependency.
+
+xref:developing/cql/functions.adoc#aggregate-functions[Aggregate functions] 
cannot be used as masking functions.
diff --git a/doc/modules/cassandra/pages/developing/cql/index.adoc 
b/doc/modules/cassandra/pages/developing/cql/index.adoc
index 84e84a9a21..4c7014269b 100644
--- a/doc/modules/cassandra/pages/developing/cql/index.adoc
+++ b/doc/modules/cassandra/pages/developing/cql/index.adoc
@@ -19,7 +19,7 @@ For that reason, when used in this document, these terms 
(tables, rows and colum
 * xref:cql/functions.adoc[Functions]
 * xref:cql/json.adoc[JSON]
 * xref:cql/security.adoc[CQL security]
-* xref:cql/dynamic_data_masking.adoc[Dynamic data masking]
+* xref:developing/cql/dynamic_data_masking.adoc[Dynamic data masking]
 * xref:cql/triggers.adoc[Triggers]
 * xref:cql/appendices.adoc[Appendices]
 * xref:cql/changes.adoc[Changes]


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

Reply via email to