[ https://issues.apache.org/jira/browse/CASSANDRA-19661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17925726#comment-17925726 ]

Joseph Vasco edited comment on CASSANDRA-19661 at 2/11/25 8:46 PM:
-------------------------------------------------------------------

I'm reopening this ticket as I've reproduced exactly the same issue in 5.0.2.

Like the reporter, I run a cluster of 3 nodes with an index on a
'vector<float, 1024>' column, and my nodes are also unable to restart after
data is written into that index.

One table is loaded with a large number of vectors (millions). This seems to
make Cassandra unstable: eventually a node stops and is unable to restart,
presenting the same logs as above. The only solution is then to reset it.

I haven't investigated Cassandra's code in detail, but in practice, if I don't
write into the vector-indexed column, my cluster is very stable and restarts
without any problem.

I've tried several memtable implementations and compaction strategies, none of
which made a difference.

Unfortunately, I'm also unable to provide data as it's non-public, but I'll try
to send some logs as soon as I can.
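
Since the real data cannot be shared, a synthetic stand-in may help others
attempt a reproduction. The sketch below only builds CQL INSERT statements for
the gpt.docs table from the issue description, filling the vector column with
random floats. The function names, the 'synthetic' body text, and the partition
layout are hypothetical, and it is an assumption (not something confirmed in
this ticket) that random vectors trigger the same failure as the real ones.

```python
import random

DIMENSIONS = 1024  # matches vector<float, 1024> in the gpt.docs schema


def synthetic_insert(partition: int, row: int, rng: random.Random) -> str:
    """Build one CQL INSERT for gpt.docs with a random 1024-float vector."""
    vector = ", ".join(f"{rng.uniform(-1.0, 1.0):.6f}" for _ in range(DIMENSIONS))
    return (
        "INSERT INTO gpt.docs (partition_id, row_id, body_blob, vector) "
        f"VALUES ('p{partition}', 'r{row}', 'synthetic', [{vector}]);"
    )


def synthetic_inserts(rows: int, rows_per_partition: int = 100, seed: int = 0):
    """Yield `rows` INSERT statements, spread evenly over partitions."""
    rng = random.Random(seed)  # fixed seed so a run can be repeated exactly
    for i in range(rows):
        yield synthetic_insert(i // rows_per_partition, i % rows_per_partition, rng)


if __name__ == "__main__":
    # Print a small batch; raise `rows` to approximate a heavier load.
    for statement in synthetic_inserts(rows=3):
        print(statement)
```

The printed statements can be saved to a file and executed with cqlsh, or the
generator can be adapted to feed a driver session directly.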


> Cannot restart Cassandra 5 after creating a vector table and index
> ------------------------------------------------------------------
>
>                 Key: CASSANDRA-19661
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19661
>             Project: Apache Cassandra
>          Issue Type: Bug
>          Components: Feature/SAI
>            Reporter: Sergio Rua
>            Priority: Normal
>             Fix For: 5.0-rc1, 5.0, 5.1
>
>         Attachments: 5.0.2_fail_memtableflush_vector_full.txt, upload_content.py
>
>
> I'm using llama-index and llama3 to train a model. I'm using some very simple 
> code that reads *.txt files from local disk, uploads them to Cassandra, and 
> then creates the index:
>  
> {code:python}
> # Create the index from documents
> index = VectorStoreIndex.from_documents(
>     documents,
>     service_context=vector_store.service_context,
>     storage_context=storage_context,
>     show_progress=True,
>     ) {code}
> This works well and I'm able to use a Chat app to get responses from the 
> Cassandra data. However, right afterwards, I cannot restart Cassandra. It 
> fails with the following error:
>  
> {code:java}
> INFO  [PerDiskMemtableFlushWriter_0:7] 2024-05-23 08:23:20,102 Flushing.java:179 - Completed flushing /data/cassandra/data/gpt/docs_20240523-10c8eaa018d811ef8dadf75182f3e2b4/da-6-bti-Data.db (124.236MiB) for commitlog position CommitLogPosition(segmentId=1716452305636, position=15336)
> [...]
> WARN  [MemtableFlushWriter:1] 2024-05-23 08:28:29,575 MemtableIndexWriter.java:92 - [gpt.docs.idx_vector_docs] Aborting index memtable flush for /data/cassandra/data/gpt/docs-aea77a80184b11ef8dadf75182f3e2b4/da-3-bti...{code}
> {code:java}
> java.lang.IllegalStateException: null
>         at com.google.common.base.Preconditions.checkState(Preconditions.java:496)
>         at org.apache.cassandra.index.sai.disk.v1.vector.VectorPostings.computeRowIds(VectorPostings.java:76)
>         at org.apache.cassandra.index.sai.disk.v1.vector.OnHeapGraph.writeData(OnHeapGraph.java:313)
>         at org.apache.cassandra.index.sai.memory.VectorMemoryIndex.writeDirect(VectorMemoryIndex.java:272)
>         at org.apache.cassandra.index.sai.memory.MemtableIndex.writeDirect(MemtableIndex.java:110)
>         at org.apache.cassandra.index.sai.disk.v1.MemtableIndexWriter.flushVectorIndex(MemtableIndexWriter.java:192)
>         at org.apache.cassandra.index.sai.disk.v1.MemtableIndexWriter.complete(MemtableIndexWriter.java:117)
>         at org.apache.cassandra.index.sai.disk.StorageAttachedIndexWriter.complete(StorageAttachedIndexWriter.java:185)
>         at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
>         at java.base/java.util.Collections$UnmodifiableCollection.forEach(Collections.java:1085)
>         at org.apache.cassandra.io.sstable.format.SSTableWriter.commit(SSTableWriter.java:289)
>         at org.apache.cassandra.db.compaction.unified.ShardedMultiWriter.commit(ShardedMultiWriter.java:219)
>         at org.apache.cassandra.db.ColumnFamilyStore$Flush.flushMemtable(ColumnFamilyStore.java:1323)
>         at org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1222)
>         at org.apache.cassandra.concurrent.ExecutionFailure$1.run(ExecutionFailure.java:133)
>         at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>         at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>         at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>         at java.base/java.lang.Thread.run(Thread.java:829) {code}
> The table created by the script is as follows:
>  
> {noformat}
> CREATE TABLE gpt.docs (
>     partition_id text,
>     row_id text,
>     attributes_blob text,
>     body_blob text,
>     vector vector<float, 1024>,
>     metadata_s map<text, text>,
>     PRIMARY KEY (partition_id, row_id)
> ) WITH CLUSTERING ORDER BY (row_id ASC)
>     AND additional_write_policy = '99p'
>     AND allow_auto_snapshot = true
>     AND bloom_filter_fp_chance = 0.01
>     AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
>     AND cdc = false
>     AND comment = ''
>     AND compaction = {'class': 'org.apache.cassandra.db.compaction.UnifiedCompactionStrategy', 'scaling_parameters': 'T4', 'target_sstable_size': '1GiB'}
>     AND compression = {'chunk_length_in_kb': '16', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
>     AND memtable = 'default'
>     AND crc_check_chance = 1.0
>     AND default_time_to_live = 0
>     AND extensions = {}
>     AND gc_grace_seconds = 864000
>     AND incremental_backups = true
>     AND max_index_interval = 2048
>     AND memtable_flush_period_in_ms = 0
>     AND min_index_interval = 128
>     AND read_repair = 'BLOCKING'
>     AND speculative_retry = '99p';
> CREATE CUSTOM INDEX eidx_metadata_s_docs ON gpt.docs (entries(metadata_s)) USING 'org.apache.cassandra.index.sai.StorageAttachedIndex';
> CREATE CUSTOM INDEX idx_vector_docs ON gpt.docs (vector) USING 'org.apache.cassandra.index.sai.StorageAttachedIndex';{noformat}
> Thank you
>  


