Thank you DuyHai. I was in two minds about large partitions for my app. I thought upgrading to 3.x would be a good and easy option, but now I'm going to work on refactoring my data model :)
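For anyone refactoring along the same lines, the rule-of-thumb arithmetic from this thread (~100 MB per partition, ~1 KB per row, so on the order of 100K rows) maps naturally onto the synthetic-bucket-suffix idea raised below. This is a minimal sketch in plain Python; the names are illustrative, not from any Cassandra driver API:

```python
# Rule-of-thumb sizing from the thread: ~100 MB per partition, ~1 KB per row.
TARGET_PARTITION_BYTES = 100 * 1024 * 1024
AVG_ROW_BYTES = 1024

# Roughly how many rows fit before the soft limit (about 100K, as discussed).
rows_per_partition = TARGET_PARTITION_BYTES // AVG_ROW_BYTES

def bucket_for(seq_no: int, rows_per_bucket: int = rows_per_partition) -> int:
    """Map a monotonically increasing row number to a synthetic bucket id,
    so the physical partition key becomes (logical_key, bucket)."""
    return seq_no // rows_per_bucket

print(rows_per_partition)   # 102400 -- i.e. on the order of 100K rows
print(bucket_for(250_000))  # 2 -- this row lands in the third bucket
```

The application (not Cassandra) owns the bucketing: writes compute the bucket from a sequence number, and reads must know (or discover) which bucket to query.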
2016-10-15 20:38 GMT+09:00 DuyHai Doan <doanduy...@gmail.com>:

> Yes, more or less. The 100Mb is a rule of thumb. No one will blame you for
> storing 200Mb, for example. The figure is just given as an example of the
> order of magnitude.
>
> On Sat, Oct 15, 2016 at 1:37 PM, Kant Kodali <k...@peernova.com> wrote:
>
>> You mean 100MB (megabytes)? Also, the data in each of my columns is about
>> 1KB, so in that case the optimal size is 100K columns (since 100K * 1KB =
>> 100MB), right?
>>
>> On Sat, Oct 15, 2016 at 4:26 AM, DuyHai Doan <doanduy...@gmail.com> wrote:
>>
>>> "2) so what is the optimal limit in terms of data size?"
>>>
>>> --> Usual recommendations for Cassandra 2.1 are:
>>>
>>> a. max 100Mb per partition
>>> b. or up to 10,000,000 physical columns per partition (including
>>> clustering columns etc.)
>>>
>>> Recently, with the work of Robert Stupp (CASSANDRA-11206) and also the
>>> huge enhancement from Michael Kjellman (CASSANDRA-9754), it will be
>>> easier to handle huge partitions in memory, especially with a reduced
>>> memory footprint with regard to the JVM heap.
>>>
>>> However, as long as we don't have repair and streaming processes that
>>> can be "resumed" in the middle of a partition, the operational pains
>>> will still be there. Same for compaction.
>>>
>>> On Sat, Oct 15, 2016 at 12:00 PM, Kant Kodali <k...@peernova.com> wrote:
>>>
>>>> 1) It would be great if someone could confirm that there is no limit.
>>>> 2) So what is the optimal limit in terms of data size?
>>>>
>>>> Finally, thanks a lot for pointing out all the operational issues!
>>>>
>>>> On Sat, Oct 15, 2016 at 2:39 AM, DuyHai Doan <doanduy...@gmail.com> wrote:
>>>>
>>>>> "But is there still a 2B columns limit in the Cassandra code?"
>>>>>
>>>>> --> I remember one of the committers saying that this 2B columns
>>>>> limitation comes from the Thrift era, where you're limited to a max of
>>>>> 2B columns returned to the client for each request.
>>>>> It also applies to the max size of each "page" of data.
>>>>>
>>>>> Since the introduction of the binary protocol and the paging feature,
>>>>> this limitation does not make sense anymore.
>>>>>
>>>>> By the way, if your partition is too wide, you'll face other
>>>>> operational issues way before reaching the 2B columns limit:
>>>>>
>>>>> - compaction taking a looooong time --> heap pressure --> long GC
>>>>> pauses --> nodes flapping
>>>>> - repair & over-streaming: a repair session failing in the middle
>>>>> forces you to re-send the whole big partition --> the receiving node
>>>>> has a bunch of duplicate data --> pressure on compaction
>>>>> - bootstrapping of new nodes: a failure to stream a partition in the
>>>>> middle will force re-sending the whole partition from the beginning
>>>>> --> the receiving node has a bunch of duplicate data --> pressure on
>>>>> compaction
>>>>>
>>>>> On Sat, Oct 15, 2016 at 9:15 AM, Kant Kodali <k...@peernova.com> wrote:
>>>>>
>>>>>> Compacting 10 SSTables, each of them having a 15GB partition, in what
>>>>>> duration?
>>>>>>
>>>>>> On Fri, Oct 14, 2016 at 11:45 PM, Matope Ono <matope....@gmail.com> wrote:
>>>>>>
>>>>>>> Please forget that part of my sentence. To be more precise, maybe I
>>>>>>> should have said "He could compact 10 SSTables, each of which has a
>>>>>>> 15GB partition."
>>>>>>> What I wanted to say is that we can store many more rows (and
>>>>>>> columns) in a partition than before 3.6.
>>>>>>>
>>>>>>> 2016-10-15 15:34 GMT+09:00 Kant Kodali <k...@peernova.com>:
>>>>>>>
>>>>>>>> "Robert said he could treat safely 10 15GB partitions at his
>>>>>>>> presentation" -- this sounds like there is a row limit too, not
>>>>>>>> only columns??
>>>>>>>>
>>>>>>>> If I am reading this correctly, 10 15GB partitions means 10
>>>>>>>> partitions (like 10 row keys; that's too small), with each
>>>>>>>> partition of size 15GB.
>>>>>>>> (That's like 15 million columns, where each column can have data
>>>>>>>> of size 1KB.)
>>>>>>>>
>>>>>>>> On Fri, Oct 14, 2016 at 11:30 PM, Kant Kodali <k...@peernova.com> wrote:
>>>>>>>>
>>>>>>>>> "Robert said he could treat safely 10 15GB partitions at his
>>>>>>>>> presentation" -- this sounds like there is a row limit too, not
>>>>>>>>> only columns??
>>>>>>>>>
>>>>>>>>> If I am reading this correctly, 10 15GB partitions means 10
>>>>>>>>> partitions (like 10 row keys; that's too small), with each
>>>>>>>>> partition of size 15GB. (That's like 10 million columns, where
>>>>>>>>> each column can have data of size 1KB.)
>>>>>>>>>
>>>>>>>>> On Fri, Oct 14, 2016 at 9:54 PM, Matope Ono <matope....@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Thanks to CASSANDRA-11206, I think we can have much larger
>>>>>>>>>> partitions than before 3.6.
>>>>>>>>>> (Robert said he could treat safely 10 15GB partitions at his
>>>>>>>>>> presentation: https://www.youtube.com/watch?v=N3mGxgnUiRY)
>>>>>>>>>>
>>>>>>>>>> But is there still a 2B columns limit in the Cassandra code?
>>>>>>>>>> If so, out of curiosity, I'd like to know where the bottleneck
>>>>>>>>>> is. Could anyone let me know about it?
>>>>>>>>>>
>>>>>>>>>> Thanks, Yasuharu.
>>>>>>>>>>
>>>>>>>>>> 2016-10-13 1:11 GMT+09:00 Edward Capriolo <edlinuxg...@gmail.com>:
>>>>>>>>>>
>>>>>>>>>>> The "2 billion column limit" is press-clipping "puffery". This
>>>>>>>>>>> statement seemingly became popular because of a highly
>>>>>>>>>>> trafficked story in which a tech reporter embellished a
>>>>>>>>>>> statement to make a splashy article.
>>>>>>>>>>>
>>>>>>>>>>> The effect is something like this:
>>>>>>>>>>> http://www.healthnewsreview.org/2012/08/iced-tea-kidney-stones-and-the-study-that-never-existed/
>>>>>>>>>>>
>>>>>>>>>>> Iced tea does not cause kidney stones!
>>>>>>>>>>> Cassandra does not store rows with 2 billion columns! It is
>>>>>>>>>>> just not true.
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Oct 12, 2016 at 4:57 AM, Kant Kodali <k...@peernova.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Well, 1) I have not sent it to the PostgreSQL mailing lists,
>>>>>>>>>>>> and 2) I thought this was an open-ended question, as it can
>>>>>>>>>>>> involve ideas from everywhere, including the Cassandra Java
>>>>>>>>>>>> driver mailing lists, so sorry if that bothered you for some
>>>>>>>>>>>> reason.
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Oct 12, 2016 at 1:41 AM, Dorian Hoxha <dorian.ho...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Also, I'm not sure, but I don't think it's "cool" to write to
>>>>>>>>>>>>> multiple lists in the same message (based on the PostgreSQL
>>>>>>>>>>>>> mailing list rules). For example, I'm not subscribed to
>>>>>>>>>>>>> those, and now the messages are separated.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Oct 12, 2016 at 10:37 AM, Dorian Hoxha <dorian.ho...@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> There are some issues working with larger partitions. HBase
>>>>>>>>>>>>>> doesn't do what you say! You also have to be careful in
>>>>>>>>>>>>>> HBase not to create large rows! But since they are globally
>>>>>>>>>>>>>> sorted, you can easily sort between them and create small
>>>>>>>>>>>>>> rows.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In my opinion, the Cassandra people are wrong in that they
>>>>>>>>>>>>>> say "globally sorted is the devil!" while fb/google/etc.
>>>>>>>>>>>>>> actually use globally sorted most of the time! You have to
>>>>>>>>>>>>>> be careful, though (just like with random partitioning).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Can you tell us what rowkey1, page1, col(x) actually are?
>>>>>>>>>>>>>> Maybe there is a way. The most "recent" -- does that mean
>>>>>>>>>>>>>> there's a timestamp in there?
>>>>>>>>>>>>>> On Wed, Oct 12, 2016 at 9:58 AM, Kant Kodali <k...@peernova.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi All,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I understand Cassandra can have a maximum of 2B rows per
>>>>>>>>>>>>>>> partition, but in practice some people seem to suggest the
>>>>>>>>>>>>>>> magic number is 100K. Why not create another
>>>>>>>>>>>>>>> partition/rowkey automatically (whenever we reach a safe
>>>>>>>>>>>>>>> limit that we consider efficient), with an auto-increment
>>>>>>>>>>>>>>> bigint as a suffix appended to the new rowkey, so that the
>>>>>>>>>>>>>>> driver can return the new rowkey indicating that there is a
>>>>>>>>>>>>>>> new partition, and so on? Now I understand this would
>>>>>>>>>>>>>>> involve allowing partial row key searches, which Cassandra
>>>>>>>>>>>>>>> currently doesn't do (but I believe HBase does), and
>>>>>>>>>>>>>>> thinking about token ranges and potentially many other
>>>>>>>>>>>>>>> things.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> My current problem is this:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I have a row key followed by a bunch of columns (this is
>>>>>>>>>>>>>>> not time-series data), and these columns can grow to any
>>>>>>>>>>>>>>> number. Since I have a 100K limit (or whatever the number
>>>>>>>>>>>>>>> is; say some limit), I want to break the partition into
>>>>>>>>>>>>>>> levels/pages:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> rowkey1, page1 -> col1, col2, col3, ...
>>>>>>>>>>>>>>> rowkey1, page2 -> col1, col2, col3, ...
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Now say my Cassandra db is populated with data, my
>>>>>>>>>>>>>>> application has just booted up, and I want the most recent
>>>>>>>>>>>>>>> value of a certain partition, but I don't know which page
>>>>>>>>>>>>>>> it belongs to since my application just got booted up.
>>>>>>>>>>>>>>> How do I solve this in the most efficient way possible in
>>>>>>>>>>>>>>> Cassandra today? I understand I can create MVs or other
>>>>>>>>>>>>>>> tables that hold some auxiliary data, such as the number of
>>>>>>>>>>>>>>> pages per partition, and so on, but that involves the
>>>>>>>>>>>>>>> maintenance cost of that other table, which I really cannot
>>>>>>>>>>>>>>> afford because I already have MVs and secondary indexes for
>>>>>>>>>>>>>>> other good reasons. So it would be great if someone could
>>>>>>>>>>>>>>> explain the best way possible as of today with Cassandra.
>>>>>>>>>>>>>>> By best way I mean: is it possible with one request? If
>>>>>>>>>>>>>>> yes, then how? If not, then what is the next best way to
>>>>>>>>>>>>>>> solve this?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> kant
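On the "which page is most recent after a cold start" question: as far as I know it cannot be done in a single request without denormalizing somewhere, but one common pattern is a tiny "head pointer" row per logical key recording the current page, making a cold start exactly two small reads (pointer, then newest page). A minimal sketch of the read path, modeled with plain dicts instead of a live Cassandra session; all names are hypothetical:

```python
# "Head pointer" pattern for paged partitions, modeled in-memory.
# pages: (logical_key, page_no) -> columns   (the wide, paged data)
# head:  logical_key -> current page number  (one tiny row per key)
pages = {
    ("rowkey1", 1): ["col1", "col2", "col3"],
    ("rowkey1", 2): ["col4", "col5"],  # newest page
}
head = {"rowkey1": 2}

def latest_columns(logical_key):
    """Two-read lookup: fetch the head pointer, then the newest page."""
    page = head.get(logical_key)          # read 1: pointer
    if page is None:
        return []
    return pages.get((logical_key, page), [])  # read 2: the page itself

print(latest_columns("rowkey1"))  # ['col4', 'col5']
```

Writes update the pointer whenever a page rolls over, so the pointer row is small and append-cheap; the trade-off is the same extra-write cost the thread discusses, just confined to one tiny row rather than a full auxiliary table.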