> Just curious why do you think row key will take 300 byte?
That's what I thought it said earlier in the email thread.
> If the row key is Long type, doesn't it take 8 bytes?
Yes, 8 bytes on disk.

> In his case, the rowCache was 500M with 1.6M rows, so the row data is 300B.
> Did I miss something?
Did that take into account the token, the row key, the row payload, and the
Java memory overhead?

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 16/11/2012, at 9:35 AM, Wei Zhu <wz1...@yahoo.com> wrote:

> Just curious why do you think row key will take 300 byte? If the row key
> is Long type, doesn't it take 8 bytes?
> In his case, the rowCache was 500M with 1.6M rows, so the row data is 300B.
> Did I miss something?
>
> Thanks.
> -Wei
>
> From: aaron morton <aa...@thelastpickle.com>
> To: user@cassandra.apache.org
> Sent: Thursday, November 15, 2012 12:15 PM
> Subject: Re: unable to read saved rowcache from disk
>
> For a row cache of 1,650,000 entries:
>
> 16 byte token
> 300 byte row key ?
> and row data ?
> multiplied by a Java fudge factor of 5 or 10.
>
> Try deleting the saved cache and restarting.
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 15/11/2012, at 8:20 PM, Wz1975 <wz1...@yahoo.com> wrote:
>
>> Before shutdown, you saw the row cache had 500M with 1.6M rows, each row
>> averaging 300B, so 700K rows should be a little over 200M, unless it is
>> reading more (maybe tombstones?). Or the rows on disk have grown for some
>> reason, but the row cache was not updated? Something else could also be
>> eating up the memory. You may want to profile memory and see what
>> consumes it.
>>
>> Thanks.
>> -Wei
>>
>> Sent from my Samsung smartphone on AT&T
>>
>> -------- Original message --------
>> Subject: Re: unable to read saved rowcache from disk
>> From: Manu Zhang <owenzhang1...@gmail.com>
>> To: user@cassandra.apache.org
>> CC:
>>
>> 3G; other JVM parameters are unchanged.
>>
>> On Thu, Nov 15, 2012 at 2:40 PM, Wz1975 <wz1...@yahoo.com> wrote:
>> How big is your heap? Did you change the JVM parameters?
>>
>> Thanks.
>> -Wei
>>
>> Sent from my Samsung smartphone on AT&T
>>
>> -------- Original message --------
>> Subject: Re: unable to read saved rowcache from disk
>> From: Manu Zhang <owenzhang1...@gmail.com>
>> To: user@cassandra.apache.org
>> CC:
>>
>> I added a counter and printed it out myself.
>>
>> On Thu, Nov 15, 2012 at 1:51 PM, Wz1975 <wz1...@yahoo.com> wrote:
>> Curious, where did you see this?
>>
>> Thanks.
>> -Wei
>>
>> Sent from my Samsung smartphone on AT&T
>>
>> -------- Original message --------
>> Subject: Re: unable to read saved rowcache from disk
>> From: Manu Zhang <owenzhang1...@gmail.com>
>> To: user@cassandra.apache.org
>> CC:
>>
>> OOM at deserializing the 747321st row.
>>
>> On Thu, Nov 15, 2012 at 9:08 AM, Manu Zhang <owenzhang1...@gmail.com> wrote:
>> Oh, as for the number of rows, it's 1650000. How long would you expect it
>> to take to read back?
>>
>> On Thu, Nov 15, 2012 at 3:57 AM, Wei Zhu <wz1...@yahoo.com> wrote:
>> Good information, Edward.
>> In my case, we have a good amount of RAM (76G) and the heap is 8G, so I
>> set the row cache to 800M as recommended. Our columns are kind of big, so
>> the hit ratio for the row cache is around 20%; according to DataStax, we
>> might just turn off the row cache altogether.
>> Anyway, on restart it took about 2 minutes to load the row cache:
>>
>> INFO [main] 2012-11-14 11:43:29,810 AutoSavingCache.java (line 108)
>> reading saved cache /var/lib/cassandra/saved_caches/XXX-f2-RowCache
>> INFO [main] 2012-11-14 11:45:12,612 ColumnFamilyStore.java (line 451)
>> completed loading (102801 ms; 21125 keys) row cache for XXX.f2
>>
>> Just for comparison, our key is a Long and the disk usage for the row
>> cache is 253K. (Only the keys are stored when the row cache is saved to
>> disk, so 253KB / 8 bytes = 31625 keys.) It's about right...
>> So for 15MB there could be a lot of "narrow" rows (if the key is a Long,
>> it could be more than 1M rows).
>>
>> Thanks.
>> -Wei
>>
>> From: Edward Capriolo <edlinuxg...@gmail.com>
>> To: user@cassandra.apache.org
>> Sent: Tuesday, November 13, 2012 11:13 PM
>> Subject: Re: unable to read saved rowcache from disk
>>
>> http://wiki.apache.org/cassandra/LargeDataSetConsiderations
>>
>> A negative side-effect of a large row cache is start-up time. The
>> periodic saving of the row cache only saves the keys that are cached;
>> the data has to be pre-fetched on start-up. On a large data set, this is
>> probably going to be seek-bound, and the time it takes to warm up the
>> row cache will be linear with respect to the row cache size (assuming
>> sufficiently large amounts of data that the seek-bound I/O is not
>> subject to optimization by disks).
>>
>> Assuming a 15MB row cache and an average row of 300 bytes, that could be
>> 50,000 entries. 4 hours seems like a long time to read back 50K entries,
>> unless the source table is very large and you can only do a small number
>> of reads/sec.
>>
>> On Tue, Nov 13, 2012 at 9:47 PM, Manu Zhang <owenzhang1...@gmail.com> wrote:
>> > "Incorrect"... what do you mean? I think it's only 15MB, which is not
>> > big.
>> >
>> > On Wed, Nov 14, 2012 at 10:38 AM, Edward Capriolo <edlinuxg...@gmail.com>
>> > wrote:
>> >>
>> >> Yes, the row cache "could be" incorrect, so on startup Cassandra
>> >> verifies the saved row cache by re-reading it. That takes a long
>> >> time, so do not save a big row cache.
>> >>
>> >> On Tuesday, November 13, 2012, Manu Zhang <owenzhang1...@gmail.com> wrote:
>> >> > I have a row cache provided by SerializingCacheProvider.
>> >> > The data that has been read into it is about 500MB, as claimed by
>> >> > jconsole. After saving the cache, it is around 15MB on disk. Hence,
>> >> > I suppose the size from jconsole is before serializing.
>> >> > Now, while restarting Cassandra, it's unable to read the saved row
>> >> > cache back. By "unable", I mean it ran for around 4 hours and I had
>> >> > to abort it and remove the cache so as not to hold up other tasks.
>> >> > Since the data isn't huge, why can't Cassandra read it back?
>> >> > My Cassandra is 1.2.0-beta2.
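
Putting numbers to Aaron's back-of-envelope estimate: a minimal sketch in
Java, using only figures from the thread (a 16-byte token, an 8-byte Long
key, and ~300B of row data from 500MB / 1.6M rows) plus his 5-10x JVM
overhead multiplier, which is a rough guess rather than a measured constant.

    public class RowCacheEstimate {
        public static void main(String[] args) {
            long entries = 1650000L;   // row cache entry count from the thread
            long tokenBytes = 16;      // Aaron's figure for the token
            long keyBytes = 8;         // a Long row key
            long rowDataBytes = 300;   // ~500MB / 1.6M rows, per Wei
            long perEntry = tokenBytes + keyBytes + rowDataBytes;
            for (int fudge : new int[] { 5, 10 }) {
                long totalMb = entries * perEntry * fudge / (1024L * 1024L);
                System.out.printf("fudge x%d: ~%d MB of memory%n", fudge, totalMb);
            }
        }
    }

Even at the 5x multiplier this comes to roughly 2.5GB, which is already
tight against Manu's 3G heap and goes some way toward explaining an OOM
partway through the reload.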
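
Edward's seek-bound point can be sanity-checked the same way. A sketch
assuming one random seek per pre-fetched row and about 100 seeks/sec for a
single spinning disk; the seek rate is an assumption, not a figure reported
in the thread.

    public class WarmupEstimate {
        public static void main(String[] args) {
            long rows = 1650000L;        // entries to pre-fetch on start-up
            double seeksPerSec = 100.0;  // assumption: ~10ms per random seek
            double hours = rows / seeksPerSec / 3600.0;
            System.out.printf("~%.1f hours to pre-fetch %d rows%n", hours, rows);
        }
    }

That works out to roughly 4.6 hours, which is in line with the ~4 hours
Manu waited before aborting.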
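
For anyone wanting to reproduce Manu's "add a counter" instrumentation: the
real change belongs in the saved-cache read path (AutoSavingCache in his
version), which is not shown in the thread. Below is a self-contained sketch
of the same idea that assumes, for illustration only, a file of raw 8-byte
Long keys; the actual saved-cache format is more involved.

    import java.io.BufferedInputStream;
    import java.io.DataInputStream;
    import java.io.EOFException;
    import java.io.FileInputStream;
    import java.io.IOException;

    public class CountSavedKeys {
        public static void main(String[] args) throws IOException {
            DataInputStream in = new DataInputStream(
                    new BufferedInputStream(new FileInputStream(args[0])));
            long count = 0;
            try {
                while (true) {
                    in.readLong();               // assume raw 8-byte Long keys
                    if (++count % 100000 == 0)   // periodic progress report
                        System.out.println("read " + count + " keys");
                }
            } catch (EOFException done) {
                System.out.println("total keys: " + count);
            } finally {
                in.close();
            }
        }
    }

Counting the keys in the saved file gives the number of rows the reload
will try to pre-fetch; the in-process counter Manu added shows how far it
actually got before the OOM.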