blerer commented on code in PR #2110:
URL: https://github.com/apache/cassandra/pull/2110#discussion_r1101337334


##########
test/distributed/org/apache/cassandra/distributed/upgrade/MixedModeColumnMaskingTest.java:
##########
@@ -0,0 +1,113 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.cassandra.distributed.upgrade;
+
+import org.junit.Test;
+
+import org.apache.cassandra.distributed.UpgradeableCluster;
+import org.apache.cassandra.distributed.api.ICoordinator;
+import org.assertj.core.api.Assertions;
+
+import static org.apache.cassandra.distributed.api.ConsistencyLevel.ALL;
+import static org.apache.cassandra.distributed.api.Feature.GOSSIP;
+import static org.apache.cassandra.distributed.shared.AssertUtils.assertRows;
+import static org.apache.cassandra.distributed.shared.AssertUtils.row;
+
+/**
+ * Tests that dynamic data masking (DDM) functions can be attached to table 
columns during a rolling upgrade involving
+ * nodes that don't include DDM.

Review Comment:
   can/cannot



##########
src/java/org/apache/cassandra/cql3/functions/masking/ColumnMask.java:
##########
@@ -0,0 +1,252 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.cassandra.cql3.functions.masking;
+
+import java.nio.ByteBuffer;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Objects;
+import java.util.concurrent.TimeUnit;
+
+import com.google.common.base.Joiner;
+import com.google.common.collect.ImmutableList;
+
+import org.apache.commons.lang3.StringUtils;
+
+import org.apache.cassandra.config.DatabaseDescriptor;
+import org.apache.cassandra.cql3.AssignmentTestable;
+import org.apache.cassandra.cql3.CQL3Type;
+import org.apache.cassandra.cql3.ColumnIdentifier;
+import org.apache.cassandra.cql3.CqlBuilder;
+import org.apache.cassandra.cql3.Term;
+import org.apache.cassandra.cql3.Terms;
+import org.apache.cassandra.cql3.functions.Function;
+import org.apache.cassandra.cql3.functions.FunctionName;
+import org.apache.cassandra.cql3.functions.FunctionResolver;
+import org.apache.cassandra.cql3.functions.ScalarFunction;
+import org.apache.cassandra.db.marshal.AbstractType;
+import org.apache.cassandra.db.marshal.ReversedType;
+import org.apache.cassandra.gms.Gossiper;
+import org.apache.cassandra.transport.ProtocolVersion;
+import org.apache.cassandra.utils.CassandraVersion;
+
+import static 
org.apache.cassandra.cql3.statements.RequestValidations.invalidRequest;
+
+/**
+ * Dynamic data mask that can be applied to a schema column.
+ * <p>
+ * It consists on a partial application of a certain {@link MaskingFunction} 
to the values of a column, with the
+ * precondition that the type of any masked column is compatible with the type 
of the first argument of the function.
+ * <p>
+ * This partial application is meant to be associated to specific columns in 
the schema, acting as a mask for the values
+ * of those columns. It's associated to queries such as:
+ * <pre>
+ *    CREATE TABLE %t (k int PRIMARY KEY, v int MASKED WITH mask_inner(1, 1));
+ *    ALTER TABLE t ALTER v MASKED WITH mask_inner(2, 1);
+ *    ALTER TABLE t ALTER v DROP MASKED;
+ * </pre>
+ * Note that in the example above we are referencing the {@code mask_inner} 
function with two arguments. However, that
+ * CQL function actually has three arguments. The first argument is always 
ommitted when attaching the function to a
+ * schema column. The value of that first argument is always the value of the 
masked column, in this case an int.
+ */
+public class ColumnMask
+{
+    /** The CQL function used for masking. */
+    public final ScalarFunction function;
+
+    /** The values of the arguments of the partially applied masking function. 
*/
+    public final List<ByteBuffer> partialArgumentValues;
+
+    public ColumnMask(ScalarFunction function, List<ByteBuffer> 
partialArgumentValues)
+    {
+        assert function.argTypes().size() == partialArgumentValues.size() + 1;
+        this.function = function;
+        this.partialArgumentValues = partialArgumentValues;
+    }
+
+    /**
+     * @return The types of the arguments of the partially applied masking 
function.
+     */
+    public List<AbstractType<?>> partialArgumentTypes()
+    {
+        List<AbstractType<?>> argTypes = function.argTypes();
+        return argTypes.size() == 1
+               ? Collections.emptyList()
+               : argTypes.subList(1, argTypes.size());
+    }
+
+    /**
+     * @return A copy of this mask for a version of its masked column that has 
its type reversed.
+     */
+    public ColumnMask withReversedType()
+    {
+        AbstractType<?> reversed = 
ReversedType.getInstance(function.argTypes().get(0));
+        List<AbstractType<?>> args = ImmutableList.<AbstractType<?>>builder()
+                                                  .add(reversed)
+                                                  
.addAll(partialArgumentTypes())
+                                                  .build();
+        Function newFunction = FunctionResolver.get(function.name().keyspace, 
function.name(), args, null, null, null);
+        assert newFunction != null;
+        return new ColumnMask((ScalarFunction) newFunction, 
partialArgumentValues);
+    }
+
+    /**
+     * @param protocolVersion the used version of the transport protocol
+     * @param value           a column value to be masked
+     * @return the specified value after having been masked by the masked 
function
+     */
+    public ByteBuffer mask(ProtocolVersion protocolVersion, ByteBuffer value)
+    {
+        List<ByteBuffer> args = new ArrayList<>(partialArgumentValues.size() + 
1);
+        args.add(value);
+        args.addAll(partialArgumentValues);
+        return function.execute(protocolVersion, args);
+    }
+
+    @Override
+    public boolean equals(Object o)
+    {
+        if (this == o)
+            return true;
+        if (o == null || getClass() != o.getClass())
+            return false;
+        ColumnMask mask = (ColumnMask) o;
+        return function.name().equals(mask.function.name())
+               && partialArgumentValues.equals(mask.partialArgumentValues);
+    }
+
+    @Override
+    public int hashCode()
+    {
+        return Objects.hash(function.name(), partialArgumentValues);
+    }
+
+    @Override
+    public String toString()
+    {
+        List<AbstractType<?>> types = partialArgumentTypes();
+        List<String> arguments = new ArrayList<>(types.size());
+        for (int i = 0; i < types.size(); i++)
+        {
+            CQL3Type type = types.get(i).asCQL3Type();
+            ByteBuffer value = partialArgumentValues.get(i);
+            arguments.add(type.toCQLLiteral(value, ProtocolVersion.CURRENT));
+        }
+        return String.format("%s(%s)", function.name(), 
StringUtils.join(arguments, ", "));
+    }
+
+    public void appendCqlTo(CqlBuilder builder)
+    {
+        builder.append(" MASKED WITH ").append(toString());
+    }
+
+    /**
+     * @return {@code true} if we know that the current cluster supports 
masked columns, or {@code false} if it either
+     * doesn't support it due to the presence of not-upgrades nodes, or we 
don't know if such old nodes exist.
+     */
+    public static boolean clusterSupportsMaskedColumns()
+    {
+        if (!Gossiper.instance.isEnabled())
+            return false;
+
+        long timeout = 
DatabaseDescriptor.getWriteRpcTimeout(TimeUnit.MILLISECONDS);
+        CassandraVersion minVersion = Gossiper.instance.getMinVersion(timeout, 
TimeUnit.MILLISECONDS);
+        return minVersion != null && 
minVersion.familyLowerBound.get().compareTo(CassandraVersion.CASSANDRA_4_1) > 0;
+    }
+
+    /**
+     * A parsed but not prepared column mask.
+     */
+    public final static class Raw
+    {
+        private static final Joiner JOINER = Joiner.on(',');
+
+        public final FunctionName name;
+        public final List<Term.Raw> rawPartialArguments;
+
+        public Raw(FunctionName name, List<Term.Raw> rawPartialArguments)
+        {
+            this.name = name;
+            this.rawPartialArguments = rawPartialArguments;
+        }
+
+        public ColumnMask prepare(String keyspace, String table, 
ColumnIdentifier column, AbstractType<?> type)
+        {
+            ScalarFunction function = findMaskingFunction(keyspace, table, 
column, type);
+
+            List<ByteBuffer> partialArguments = 
preparePartialArguments(keyspace, function);
+
+            return new ColumnMask(function, partialArguments);
+        }
+
+        private ScalarFunction findMaskingFunction(String keyspace, String 
table, ColumnIdentifier column, AbstractType<?> type)
+        {
+            List<AssignmentTestable> args = new 
ArrayList<>(rawPartialArguments.size() + 1);
+            args.add(type);
+            args.addAll(rawPartialArguments);
+
+            Function function = FunctionResolver.get(keyspace, name, args, 
keyspace, table, type);
+
+            if (function == null)
+                throw invalidRequest("Unable to find masking function for %s, 
" +
+                                     "no declared function matches the 
signature %s",
+                                     column, this);
+
+            if (function.isAggregate())
+                throw invalidRequest("Aggregate function %s cannot be used for 
masking table columns", this);
+
+            if (!function.isNative())
+                throw invalidRequest("User defined function %s cannot be used 
for masking table columns", this);
+
+            if (!(function instanceof MaskingFunction))
+                throw invalidRequest("Not-masking function %s cannot be used 
for masking table columns", this);
+
+            if (!function.returnType().equals(type))
+                throw invalidRequest("Masking function %s return type is %s. " 
+
+                                     "This is different to the type of the 
masked column %s of type %s. " +
+                                     "Masking functions can only be attached 
to table columns " +
+                                     "if they return the same data type as the 
masked column.",
+                                     this, function.returnType().asCQL3Type(), 
column, type.asCQL3Type());
+
+            return (ScalarFunction) function;
+        }
+
+        private List<ByteBuffer> preparePartialArguments(String keyspace, 
ScalarFunction function)
+        {
+            // Note that there could be null arguments
+            List<ByteBuffer> arguments = new 
ArrayList<>(rawPartialArguments.size());
+
+            for (int i = 0; i < rawPartialArguments.size(); i++)
+            {
+                String term = rawPartialArguments.get(i).toString();
+                AbstractType<?> type = function.argTypes().get(i + 1);
+                arguments.add(Terms.asBytes(keyspace, term, type));
+            }
+
+            return arguments;
+        }
+
+        @Override
+        public String toString()
+        {
+            return name.toString() + '(' + JOINER.join(rawPartialArguments) + 
')';

Review Comment:
   nit: Do we really need to optimize by having a static JOINER here? It seems 
more confusing that anything else the method is not called on production and is 
not really optimized either.



##########
src/java/org/apache/cassandra/cql3/selection/Selection.java:
##########
@@ -187,7 +187,9 @@ public static Selection fromSelectors(TableMetadata table,
                                                                             
factories,
                                                                             
isJson);
 
-        return (processesSelection(selectables) || selectables.size() != 
selectedColumns.size() || hasGroupBy)
+        boolean hasMaskedColumns = 
selectedColumns.stream().anyMatch(ColumnMetadata::isMasked);

Review Comment:
   For masked column. `ColumnMetadata.processSelection` should return `true`. 
That will ensure that  `processesSelection(selectables)` returns `true` for 
masked columns and avoid the need for `hasMaskedColumns`.
   
   Regarding performance my suggestion would be to use a profiler and see if 
the use of streams is visible or not. Otherwise it will end up as a long and 
inefficient discussion.
   
   



##########
src/java/org/apache/cassandra/cql3/statements/SelectStatement.java:
##########
@@ -1153,10 +1153,14 @@ private Selection prepareSelection(TableMetadata table,
             if (hasGroupBy)
                 Guardrails.groupByEnabled.ensureEnabled(state);
 
+            boolean isJson = parameters.isJson;
+            boolean returnStaticContentOnPartitionWithNoRows = 
restrictions.returnStaticContentOnPartitionWithNoRows();
+
             if (selectables.isEmpty()) // wildcard query
             {
-                return hasGroupBy ? Selection.wildcardWithGroupBy(table, 
boundNames, parameters.isJson, 
restrictions.returnStaticContentOnPartitionWithNoRows())
-                                  : Selection.wildcard(table, 
parameters.isJson, restrictions.returnStaticContentOnPartitionWithNoRows());
+                return hasGroupBy || 
table.columns().stream().anyMatch(ColumnMetadata::isMasked)

Review Comment:
   I would have expected to have a `table.hasMaskedColumn() ` method to avoid 
code duplication each time people need that information and to make the code 
more readable.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to