Why does Cassandra need a 2B column limit? Why can't it be unlimited?

Kant Kodali Wed, 12 Oct 2016 00:59:18 -0700

Hi All,

I understand a Cassandra partition can hold a maximum of 2B cells (rows
times columns), but in practice some people suggest the magic number is
100K. Why not create another partition/rowkey automatically (whenever we
reach a limit that we consider safe and efficient), with an
auto-incremented bigint appended as a suffix to the new rowkey? The driver
could then return the new rowkey, indicating that there is a new partition,
and so on. I understand this would involve allowing partial row key
searches, which Cassandra currently doesn't do (but I believe HBase does),
and thinking about token ranges and potentially many other things.


My current problem is this:

I have a row key followed by a bunch of columns (this is not time series
data), and these columns can grow to any number. Since I have a 100K limit
(or whatever the number is; say some limit), I want to break the partition
into levels/pages:

rowkey1, page1->col1, col2, col3......
rowkey1, page2->col1, col2, col3......
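
To make the layout above concrete, here is a minimal sketch of how a client might derive the (rowkey, page) pair for a given column. It assumes columns are numbered from 0 in insertion order and that pages are numbered from 1 as in the layout above; PAGE_SIZE and page_key are hypothetical names for illustration, not part of Cassandra or any driver.

```python
from typing import Tuple

# Assumed per-partition comfort limit (the "100K" figure above); tune as needed.
PAGE_SIZE = 100_000


def page_key(rowkey: str, col_index: int, page_size: int = PAGE_SIZE) -> Tuple[str, int]:
    """Map a 0-based logical column index to a (rowkey, page) partition key.

    Pages are 1-based, so columns 0..page_size-1 land on page 1,
    the next page_size columns on page 2, and so on.
    """
    if col_index < 0:
        raise ValueError("col_index must be non-negative")
    return (rowkey, col_index // page_size + 1)
```

With this scheme the page becomes part of the partition key, so each (rowkey, page) pair is a separate partition that can be placed on different nodes.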

Now say my Cassandra DB is populated with data, my application has just
booted up, and I want the most recent value of a certain partition, but I
don't know which page it belongs to. How do I solve this in the most
efficient way possible in Cassandra today? I understand I can create MVs or
other tables that hold auxiliary data, such as the number of pages per
partition and so on, but that involves the maintenance cost of another
table, which I really cannot afford because I already have MVs and
secondary indexes for other good reasons. So it would be great if someone
could explain the best way possible as of today with Cassandra. By best way
I mean: is it possible with one request? If yes, then how? If not, then
what is the next best way to solve this?
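
For illustration only, one fallback that needs no auxiliary table is to probe for the last page. The sketch below assumes pages are created contiguously (if page k exists, so do pages 1 through k-1). The page_exists callback is hypothetical; in a real setup it would presumably be a single-row read against the (rowkey, page) partition. This is not a single request: it takes a number of probes that grows logarithmically with the page count.

```python
from typing import Callable


def find_last_page(page_exists: Callable[[int], bool]) -> int:
    """Return the highest existing 1-based page number, or 0 if there is none.

    Assumes pages are contiguous: if page k exists, pages 1..k-1 exist too.
    """
    if not page_exists(1):
        return 0

    # Exponential search: double the probe until a missing page is found.
    lo = 1  # highest page known to exist
    hi = 2  # candidate upper bound
    while page_exists(hi):
        lo = hi
        hi *= 2

    # Binary search between lo (exists) and hi (does not exist).
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if page_exists(mid):
            lo = mid
        else:
            hi = mid
    return lo


if __name__ == "__main__":
    # Simulated store in which pages 1..5 exist for some rowkey.
    existing_pages = 5
    print(find_last_page(lambda page: page <= existing_pages))
```

Each probe is a separate round trip, so this trades the maintenance cost of a page-count table for extra read latency at startup.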

Thanks,
kant
