Stefano Lottini created CASSANDRA-19404:
-------------------------------------------
Summary: (5.0-beta-1) Unexpected NullPointerException in ANN+WHERE
when adding rows in another partition
Key: CASSANDRA-19404
URL: https://issues.apache.org/jira/browse/CASSANDRA-19404
Project: Cassandra
Issue Type: Bug
Components: Feature/Vector Search
Reporter: Stefano Lottini
* *Bug observed on the Docker image 5.0-beta1*
* *Bug also observed on latest head of Cassandra repo (as of 2024-02-15)*
* _*(working fine on vsearch branch of datastax/cassandra, commit hash
80c2f8b9ad5b89efee0645977a5ca53943717c0d)*_
Summary: A query combining _ANN ordering + a WHERE clause on a map + a WHERE
clause on the partition key_ starts failing once the table contains other
partitions. The minimal repro code below contains three SELECT statements;
the third is the one that triggers the error.
{code:java}
// reproduced with Dockerized Cassandra 5.0-beta1 on 2024-02-15
/////////
// SCHEMA
/////////
CREATE TABLE ks.v_table (
    pk int,
    row_v vector<float, 2>,
    metadata map<text, text>,
    PRIMARY KEY (pk)
);
CREATE CUSTOM INDEX v_md
ON ks.v_table (entries(metadata))
USING 'StorageAttachedIndex';
CREATE CUSTOM INDEX v_idx
ON ks.v_table (row_v)
USING 'StorageAttachedIndex';
/////////////////////////////
// SELECT WORKS (empty table)
/////////////////////////////
SELECT * FROM ks.v_table
WHERE metadata['map_k'] = 'map_v'
AND pk = 0
ORDER BY row_v ANN OF [0.1, 0.2]
LIMIT 4;
//////////////
// ADD ONE ROW
//////////////
INSERT INTO ks.v_table (pk, metadata, row_v)
VALUES
(0, {'map_k': 'map_v'}, [0.11, 0.19]);
/////////////////////////////////////////////
// SELECT WORKS (table has queried partition)
/////////////////////////////////////////////
SELECT * FROM ks.v_table
WHERE metadata['map_k'] = 'map_v'
AND pk = 0
ORDER BY row_v ANN OF [0.1, 0.2]
LIMIT 4;
//////////////////////////////////
// ADD ONE ROW (another partition)
//////////////////////////////////
INSERT INTO ks.v_table (pk, metadata, row_v)
VALUES
(10, {'map_k': 'map_v'}, [0.11, 0.19]);
/////////////////////////////////////////////////
// SELECT BREAKS (table gained another partition)
/////////////////////////////////////////////////
SELECT * FROM ks.v_table
WHERE metadata['map_k'] = 'map_v'
AND pk = 0
ORDER BY row_v ANN OF [0.1, 0.2]
LIMIT 4;
{code}
The error appears as follows in the CQL console:
{code:java}
ReadFailure: Error from server: code=1300 [Replica(s) failed to execute read]
message="Operation failed - received 0 responses and 1 failures: UNKNOWN from
/172.17.0.2:7000" info={'consistency': 'ONE', 'required_responses': 1,
'received_responses': 0, 'failures': 1, 'error_code_map': {'172.17.0.2':
'0x0000'}}
{code}
And the Cassandra logs have this to say:
{code:java}
java.lang.NullPointerException: Cannot invoke
"org.apache.cassandra.index.sai.iterators.KeyRangeIterator.skipTo(org.apache.cassandra.index.sai.utils.PrimaryKey)"
because "this.nextIterator" is null
{code}
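The repro steps above could also be scripted so that they can be re-run against different builds. The following is only a sketch under stated assumptions, not part of the original report: it assumes the DataStax Python driver (`cassandra-driver`) is installed in a version able to decode the vector column, that a node is reachable at 127.0.0.1, and that the keyspace `ks` already exists and does not yet contain `v_table`. The names `SCHEMA`, `STEPS`, `insert_row`, `run` and `main` are illustrative only. The CQL statements are the ones from the report, collapsed onto single lines.

```python
# Sketch of a scripted repro. Helper names are hypothetical; the CQL is from the report.

SCHEMA = [
    "CREATE TABLE ks.v_table (pk int, row_v vector<float, 2>, "
    "metadata map<text, text>, PRIMARY KEY (pk))",
    "CREATE CUSTOM INDEX v_md ON ks.v_table (entries(metadata)) "
    "USING 'StorageAttachedIndex'",
    "CREATE CUSTOM INDEX v_idx ON ks.v_table (row_v) "
    "USING 'StorageAttachedIndex'",
]

# The same SELECT is issued three times in the report.
SELECT = (
    "SELECT * FROM ks.v_table WHERE metadata['map_k'] = 'map_v' "
    "AND pk = 0 ORDER BY row_v ANN OF [0.1, 0.2] LIMIT 4"
)


def insert_row(pk):
    """Build the INSERT from the report for the given partition key."""
    return (
        "INSERT INTO ks.v_table (pk, metadata, row_v) "
        f"VALUES ({pk}, {{'map_k': 'map_v'}}, [0.11, 0.19])"
    )


STEPS = [
    ("select on empty table", SELECT),
    ("insert pk=0", insert_row(0)),
    ("select with queried partition present", SELECT),
    ("insert pk=10 (another partition)", insert_row(10)),
    ("select after another partition was added", SELECT),
]


def run(session):
    """Create the schema, run each step, and record 'ok' or the exception type."""
    for statement in SCHEMA:
        session.execute(statement)
    outcomes = []
    for label, statement in STEPS:
        try:
            session.execute(statement)
            outcomes.append((label, "ok"))
        except Exception as exc:  # record the failure instead of aborting
            outcomes.append((label, type(exc).__name__))
    return outcomes


def main():
    # Imported here so the helpers above load without the driver installed.
    from cassandra.cluster import Cluster

    cluster = Cluster(["127.0.0.1"])  # assumed contact point
    try:
        session = cluster.connect()
        for label, outcome in run(session):
            print(f"{label}: {outcome}")
    finally:
        cluster.shutdown()
```

Calling `main()` against an affected build would be expected to report a failure only for the last step, matching the behaviour described above. On a build without the bug, all five steps should report `ok`.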