Thanks Robert for the answer. It makes sense. If that happens then it means that your design or use case needs some rework ;)
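On the question quoted further down (the same column showing up in two SSTable files), here is a minimal sketch of how the copies could be reconciled by timestamp, for example in the reducer of an MR job. The cell layout, the `reconcile` helper and the timestamps are assumptions for illustration, not Cassandra's actual on-disk format or API, and tombstones and timestamp ties are ignored.

```python
from typing import Dict, Iterable, Tuple

# (value, write_timestamp) for one column; this layout is illustrative only.
Cell = Tuple[str, int]
Row = Dict[str, Cell]


def reconcile(versions: Iterable[Row]) -> Row:
    """Merge copies of one row read from several SSTable files.

    For each column, keep the cell with the highest write timestamp.
    Deletes (tombstones) and equal timestamps are not handled here.
    """
    merged: Row = {}
    for row in versions:
        for column, (value, ts) in row.items():
            if column not in merged or ts > merged[column][1]:
                merged[column] = (value, ts)
    return merged


# Row key 1 as it might appear in two SSTable files (timestamps are made up).
sstable_1 = {"maker": ("honda", 100)}
sstable_2 = {"maker": ("honda", 100), "color": ("blue", 200)}

latest = reconcile([sstable_1, sstable_2])
```

With this approach it does not matter which file a mapper reads first, since every copy of a row key ends up in the same reduce call and only the newest cell per column survives.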
Regards,
Shahab

On Tue, Sep 17, 2013 at 2:37 PM, java8964 java8964 <java8...@hotmail.com> wrote:

> Another question related to the SSTable files generated in the incremental
> backup: they are not really ONLY the incremental delta, right? They will
> include more than the delta.
>
> I will use an example to show my question:
>
> first, we have this data in SSTable file 1:
>
> rowkey(1), columns (maker=honda).
>
> later, if we add one column to the same key:
>
> rowkey(1), columns (maker=honda, color=blue)
>
> The data above is flushed to another SSTable, file 2. In this case, file 2
> will be part of the incremental backup at this time. But in fact, it will
> contain both the old data (maker=honda) and the new change (color=blue).
>
> So in fact, incremental backup in Cassandra just hard-links all the new
> SSTable files generated during the incremental backup period. They
> could contain any data, not just the data updated/inserted/deleted in
> this period, correct?
>
> Thanks
>
> Yong
>
> > From: dean.hil...@nrel.gov
> > To: user@cassandra.apache.org
> > Date: Tue, 17 Sep 2013 08:11:36 -0600
> > Subject: Re: questions related to the SSTable file
> >
> > Netflix created file streaming in astyanax into cassandra specifically
> > because writing too big a column cell is a bad thing. The limit is really
> > dependent on use case... do you have servers writing 1000's of 200Meg
> > files at the same time... if so, astyanax streaming may be a better way
> > to go there, where it divides up the file amongst cells and rows.
> >
> > I know the limit of a row size is really your hard disk space, and the
> > column count, if I remember, goes into billions, though realistically I
> > think beyond 10 million might slow down a bit... all I know is we tested
> > up to 10 million columns with no issues in our use-case.
> >
> > So you mean at this time, I could get 2 SSTable files, both containing
> > column "Blue" for the same row key, right?
> >
> > Yes
> >
> > In this case, I should be fine as the value of the "Blue" column contains
> > the timestamp to help me find out which is the last change, right?
> >
> > Yes
> >
> > In the MR world, each file COULD be processed by a different Mapper, but
> > both will be sent to the same reducer as the data shares the same key.
> >
> > If that is the way you are writing it, then yes
> >
> > Dean
> >
> > From: Shahab Yunus <shahab.yu...@gmail.com>
> > Reply-To: user@cassandra.apache.org
> > Date: Tuesday, September 17, 2013 7:54 AM
> > To: user@cassandra.apache.org
> > Subject: Re: questions related to the SSTable file
> >
> > derstand if following changes apply to the same row key as above
> > example, additional SSTable file could be generated. That is >