We're using Cassandra 2.2. This document lists a number of CQL limits, and I'm particularly interested in the collection limits for Set and List. If I've interpreted it correctly, the document states that values in a Set are limited to 65535 bytes. As far as I know, this limit exists because set identity is implemented by storing each element as part of the composite column name of the storage engine's cell (similar to the clustering column value limit), which CQL restricts to that many bytes. (Is this correct?)

Consider a table like

    CREATE TABLE test.bounds (
        someid text,
        someorder text,
        words set<text>,
        PRIMARY KEY (someid, someorder)
    );

with

    PreparedStatement ps = session.prepare("INSERT INTO bounds (someid, someorder, words) VALUES (?, ?, ?)");
    BoundStatement bs = ps.bind("id", "order", ImmutableSet.of(StringUtils.repeat('a', 66000)));
    session.execute(bs);
This throws the expected exception:

    Caused by: com.datastax.driver.core.exceptions.InvalidQueryException: The sum of all clustering columns is too long (66024 > 65535)

Now if I change the table to use a List instead of a Set

    CREATE TABLE test.bounds (
        someid text,
        someorder text,
        words list<text>,
        PRIMARY KEY (someid, someorder)
    );

and use

    BoundStatement bs = ps.bind("id", "order", ImmutableList.of(StringUtils.repeat('a', 66000)));

I do not receive an exception. The document, however, states that List value sizes are also limited to 65535 bytes. Is the document incorrect, or am I misinterpreting it? I assumed List values are implemented as simple column values in the underlying storage, with the order maintained through their timestamps.
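
For reference, here is a minimal sketch of the client-side check I would expect to correspond to the documented limit, assuming the limit applies to the UTF-8 encoded size of each element. The class name `CollectionBounds` and its helper methods are my own, not part of the driver. The server's error message reports 66024 bytes for the 66000-character string, so the server-side count evidently includes some overhead that this sketch does not model.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;

public class CollectionBounds {
    // Limit quoted in the CQL limits document (2^16 - 1).
    public static final int MAX_ELEMENT_BYTES = 65535;

    // CQL text values are UTF-8 encoded, so measure the encoded length.
    public static int encodedSize(String value) {
        return value.getBytes(StandardCharsets.UTF_8).length;
    }

    // True if every element fits within the documented per-element limit.
    // Assumption: any server-side overhead per element is ignored here.
    public static boolean fitsLimit(Collection<String> values) {
        for (String v : values) {
            if (encodedSize(v) > MAX_ELEMENT_BYTES) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Same payload as in the question: 66000 'a' characters.
        char[] chars = new char[66000];
        Arrays.fill(chars, 'a');
        String big = new String(chars);

        System.out.println(encodedSize(big));                          // 66000
        System.out.println(fitsLimit(Collections.singleton(big)));     // false
        System.out.println(fitsLimit(Collections.singleton("short"))); // true
    }
}
```

By this check the 66000-byte element exceeds the limit regardless of whether it is bound as a Set or a List, which is why the missing exception in the List case surprises me.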