Bringing this discussion to dev@ rather than Slack as we try to figure out 
CASSANDRA-20313 and CASSANDRA-19461.

In the type system, we have 2 different (but related) methods:

AbstractType#allowsEmpty             - if the user gives empty bytes
(new byte[0]), will the type accept them?
AbstractType#isEmptyValueMeaningless - if the user gives empty bytes, should
they be handled like null?

In practice, there are 2 cases that matter:

allowsEmpty = true AND isEmptyValueMeaningless = false - types like text and bytes
allowsEmpty = true AND isEmptyValueMeaningless = true  - many types, for example "int"
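
To make the two cases concrete, here is a minimal sketch of how the two flags
combine. ToyType, its fields, and classify are hypothetical names made up for
illustration; this is not the real AbstractType code, only a toy model of the
flag values described above.

```java
import java.nio.ByteBuffer;

// Hypothetical toy model of the two flags; not Cassandra's AbstractType.
enum ToyType
{
    TEXT(true, false),   // empty is a real value (the empty string)
    BYTES(true, false),  // empty is a real value (zero-length blob)
    INT(true, true);     // empty is accepted but carries no meaning

    final boolean allowsEmpty;
    final boolean emptyValueMeaningless;

    ToyType(boolean allowsEmpty, boolean emptyValueMeaningless)
    {
        this.allowsEmpty = allowsEmpty;
        this.emptyValueMeaningless = emptyValueMeaningless;
    }

    /** Classifies a written value under this toy model. */
    String classify(ByteBuffer value)
    {
        if (value == null)
            return "NULL";
        if (value.hasRemaining())
            return "VALUE";
        // From here on the value is empty bytes.
        if (!allowsEmpty)
            return "REJECTED";
        return emptyValueMeaningless ? "NULL_LIKE" : "VALUE";
    }
}
```

Under these assumptions, empty bytes written to INT are classified as
NULL_LIKE, while the same bytes written to TEXT or BYTES are an ordinary VALUE.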

What this means is that users can write empty bytes to these types, but this
adds complexity in the filter path, and we are still trying to work out the
“correct” semantics for SAI.

Simple example:

{code}
@Test
public void test() throws IOException
{
    try (Cluster cluster = Cluster.build(1).start())
    {
        init(cluster);
        cluster.schemaChange(withKeyspace("CREATE TABLE %s.tbl (pk int primary key, v int)"));
        IInvokableInstance node = cluster.get(1);
        for (int i = 0; i < 10; i++)
            node.executeInternal(withKeyspace("INSERT INTO %s.tbl (pk, v) VALUES (?, ?)"), i, ByteBufferUtil.EMPTY_BYTE_BUFFER);

        var qr = node.executeInternalWithResult(withKeyspace("SELECT * FROM %s.tbl WHERE v=? ALLOW FILTERING"), ByteBufferUtil.EMPTY_BYTE_BUFFER);
        StringBuilder sb = new StringBuilder();
        sb.append(qr.names());
        while (qr.hasNext())
        {
            var next = qr.next();
            sb.append('\n').append(next);
        }
        System.out.println(sb);
    }
}
{code}

“Should” this return 10 rows or 0?  In this case the type is int, and int
defines empty as meaningless, so the empty value should act as a null; yet this
query returns 10 rows, which violates CQL semantics, where foo = null
evaluates to false.
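
The two candidate answers can be sketched side by side. This is a toy model
only: EmptyFilterSketch and both method names are hypothetical, and neither
method is the actual filtering or SAI code. The first compares raw bytes, which
matches the 10-row behaviour seen above; the second treats a meaningless empty
value as null on either side of the comparison, which would give 0 rows.

```java
import java.nio.ByteBuffer;
import java.util.List;

// Hypothetical sketch of two possible "v = ?" semantics for empty bytes.
final class EmptyFilterSketch
{
    /** Raw byte equality: an empty operand matches an empty stored value. */
    static long countRawEquals(List<ByteBuffer> stored, ByteBuffer operand)
    {
        return stored.stream()
                     .filter(v -> v != null && v.equals(operand))
                     .count();
    }

    /** Null-like equality: a meaningless empty value never matches anything. */
    static long countNullLikeEquals(List<ByteBuffer> stored, ByteBuffer operand, boolean emptyIsMeaningless)
    {
        // An operand that is null (or null-like) can never be equal to a value.
        if (operand == null || (emptyIsMeaningless && !operand.hasRemaining()))
            return 0;
        return stored.stream()
                     .filter(v -> v != null)
                     .filter(v -> !(emptyIsMeaningless && !v.hasRemaining()))
                     .filter(v -> v.equals(operand))
                     .count();
    }
}
```

With ten stored empty values and an empty operand, countRawEquals counts all
ten, while countNullLikeEquals counts none when emptyIsMeaningless is true (the
int case) and all ten when it is false (the text and bytes case).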

Right now there really isn’t a way to query for NULL (CASSANDRA-10715 is still 
open), but if we did add such a thing we would also need to figure out the 
semantics with regard to these cases.
