Nicolas Favre-Felix created CASSANDRA-8254:
----------------------------------------------
Summary: Query parameters (and more) are limited to 65,536 entries
Key: CASSANDRA-8254
URL: https://issues.apache.org/jira/browse/CASSANDRA-8254
Project: Cassandra
Issue Type: Bug
Components: API
Reporter: Nicolas Favre-Felix
Parameterized queries are sent over the wire as a string followed by a list of
arguments. This list is decoded in QueryOptions.Codec by
CBUtil.readValueList(body), which in turn reads a 16-bit short value from the
wire as the number of values to deserialize.
Sending more than 65,535 values silently overflows this count; the driver
sometimes reports a protocol error because the remaining values are then
deserialized incorrectly.
64k sounds like a lot, but tables with a large number of clustering dimensions
can hit this limit when fetching only a few thousand CQL rows with an IN query,
e.g.
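To illustrate the overflow, here is a minimal sketch of what happens to a count
that is squeezed into a 16-bit field and read back as an unsigned short. The
class and method names are hypothetical and this is not Cassandra or driver
code; it only assumes the sender truncates the count to 16 bits when writing.

```java
import java.nio.ByteBuffer;

// Hypothetical demo class, not part of Cassandra or any driver.
class ShortCountOverflow {
    // Writes the count into a 2-byte field, then reads it back as an
    // unsigned 16-bit value, as a reader of a "short" count would.
    static int roundTripCount(int valueCount) {
        ByteBuffer buf = ByteBuffer.allocate(2);
        buf.putShort((short) valueCount); // narrowing cast keeps only the low 16 bits
        buf.flip();
        return buf.getShort() & 0xFFFF;   // reinterpret the short as unsigned
    }

    public static void main(String[] args) {
        System.out.println(roundTripCount(65535)); // largest count that survives
        System.out.println(roundTripCount(70000)); // wraps to 70000 - 65536
    }
}
```

Under that assumption, a reader would see a much smaller count than was sent
and would then try to interpret the leftover bytes as something else.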
{code}
SELECT * FROM sensor_data WHERE a=? and (b,c,d,e,f,g,h,i) IN
((?,?,?,?,?,?,?,?), (?,?,?,?,?,?,?,?), (?,?,?,?,?,?,?,?), (?,?,?,?,?,?,?,?) ...
)
{code}
Here, having 8 dimensions in the clustering key plus 1 in the partitioning key
restricts the read to 8,191 CQL rows.
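The arithmetic behind that figure can be sketched as follows, assuming one bind
marker for the partition key and one per clustering column per row. The class
and method names are made up for illustration.

```java
// Hypothetical helper, for illustration only.
class InQueryRowLimit {
    // Largest count that fits in an unsigned 16-bit short.
    static final int MAX_VALUES = 0xFFFF;

    // Rows that fit once the partition key markers are accounted for;
    // integer division drops any partial row.
    static int maxRows(int partitionKeyMarkers, int clusteringColumns) {
        return (MAX_VALUES - partitionKeyMarkers) / clusteringColumns;
    }

    public static void main(String[] args) {
        System.out.println(maxRows(1, 8)); // (65535 - 1) / 8
    }
}
```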
Some other parts of Cassandra still use 16-bit sizes, which for example
prevents users from fetching all elements of a large collection
(CASSANDRA-6428). The suggestion at the time was "we'll fix it in the next
iteration of the binary protocol", so I'd like to suggest switching to
variable-length integers, as this would solve such issues while keeping
messages short.
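As a rough sketch of the idea, one possible variable-length encoding stores 7
bits per byte and uses the high bit as a continuation flag, so small counts
stay at one byte while larger ones are no longer capped at 16 bits. This
particular scheme, and the class and method names, are my assumptions for
illustration; the protocol could just as well pick a different format.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;

// Hypothetical sketch of an unsigned varint codec; not an existing Cassandra class.
class UnsignedVarInt {
    // Encodes a non-negative int, least significant 7-bit group first.
    static byte[] encode(int value) {
        if (value < 0) {
            throw new IllegalArgumentException("negative value: " + value);
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80); // high bit set: more bytes follow
            value >>>= 7;
        }
        out.write(value);                     // final byte, high bit clear
        return out.toByteArray();
    }

    // Decodes a value written by encode(); reads at most 5 bytes.
    static int decode(ByteBuffer in) {
        int result = 0;
        for (int shift = 0; shift < 35; shift += 7) {
            byte b = in.get();
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
        }
        throw new IllegalArgumentException("varint longer than 5 bytes");
    }

    public static void main(String[] args) {
        byte[] encoded = encode(70000);
        System.out.println(encoded.length + " bytes");
        System.out.println(decode(ByteBuffer.wrap(encoded)));
    }
}
```

With this scheme a count below 128 takes a single byte, one byte less than the
current short, and a count such as 70,000 takes three bytes.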