[ 
https://issues.apache.org/jira/browse/CASSANDRA-20012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17891605#comment-17891605
 ] 

Caleb Rackliffe commented on CASSANDRA-20012:
---------------------------------------------

If all you want to do is run equality queries on small blob data, this could 
work reasonably well, especially if any prefix compression on the blobs is 
possible. They aren't currently supported because we worry that indexing 
arbitrary blobs could get out of hand, and it's usually more appropriate to 
index more clearly typed data. Frozen collections are supported (for equality 
queries only), but those have some structure.

tl;dr: It would be possible to support this, but we'd have to think through 
some appropriate guardrails (see {{sai_frozen_term_size_[warn|fail]_threshold}}).
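
As a rough illustration only, a warn/fail size guardrail for blob terms might 
behave something like the sketch below. The function name, the enum, and the 
threshold values are all hypothetical and chosen for the example; this is not 
Cassandra's implementation, just the general shape of a two-level check.

```python
from enum import Enum


class GuardrailResult(Enum):
    OK = "ok"
    WARN = "warn"
    FAIL = "fail"


# Hypothetical thresholds in bytes, picked only for this example.
BLOB_TERM_SIZE_WARN_THRESHOLD = 1024
BLOB_TERM_SIZE_FAIL_THRESHOLD = 8192


def check_blob_term_size(
    term: bytes,
    warn_threshold: int = BLOB_TERM_SIZE_WARN_THRESHOLD,
    fail_threshold: int = BLOB_TERM_SIZE_FAIL_THRESHOLD,
) -> GuardrailResult:
    """Classify an indexed blob term by its size in bytes."""
    size = len(term)
    if size > fail_threshold:
        # Too large to index: the write would be rejected.
        return GuardrailResult.FAIL
    if size > warn_threshold:
        # Still indexed, but the client would get a warning.
        return GuardrailResult.WARN
    return GuardrailResult.OK
```

Under these assumed thresholds, a small key such as 0xdeadbeef passes 
untouched, while a multi-kilobyte blob would trigger a warning or be rejected.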

> SAI (maybe) could support the blob CQL type
> -------------------------------------------
>
>                 Key: CASSANDRA-20012
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20012
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Feature/SAI
>            Reporter: Vincent Rischmann
>            Priority: Normal
>
> Hello,
> we're currently exploring SAI for a use case. We have a table like this:
> ```
> CREATE TABLE ucp.profile (
>     project_key blob,
>     profile_id blob,
>     age int,
>     PRIMARY KEY ((project_key, profile_id))
> )
> ```
> we tried using a SAI index but we get an error:
> ``` 
> cqlsh:ucp> create custom index ucp_profile_project_key on ucp.profile 
> (project_key) using 'sai';
> InvalidRequest: Error from server: code=2200 [Invalid query] 
> message="Unsupported type: blob"
> ```
> The documentation is clear: it lists all the types that are supported. But it 
> doesn't say _why_ the `blob` type is not supported, which seemed a little 
> weird to me because in my mind it should be the easiest type to support. I 
> have no clue if that is _true_, but I started to investigate anyway.
> I took a look at the code and stumbled upon [this 
> line](https://github.com/apache/cassandra/blob/6bbe85ef51d2a5a52b9a585e7c1d3031a7fb0c80/src/java/org/apache/cassandra/index/sai/StorageAttachedIndex.java#L163-L168)
>  which lists all the supported types. I thought "why not just add the BLOB 
> type and see what happens?"
> So that's basically what I did, and to my surprise it _seems_ to work in my 
> limited testing:
> ```
> cqlsh:ucp> CREATE TABLE ucp.profile (    project_key blob,    profile_id 
> blob,   age int,    PRIMARY KEY ((project_key, profile_id)));
> cqlsh:ucp> create custom index ucp_profile_profile_id on ucp.profile 
> (profile_id) using 'sai';
> cqlsh:ucp> insert into profile(project_key, profile_id, age) values(0xcafe, 
> 0xdeadbeef, 200);
> cqlsh:ucp> select * from profile_id = 0xdeadbeef;
> cqlsh:ucp> select * from profile where profile_id = 0xdeadbeef;
>  project_key | profile_id | age
> -------------+------------+-----
>       0xcafe | 0xdeadbeef | 200
> (1 rows)
> ```
> I didn't test further, but I'm left wondering whether it is simply an 
> oversight that blobs are not supported with SAI today. If it is _not_ an 
> oversight, I'd love to know the reasons.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
