Re: Cassandra 1.1.5 - SerializingCacheProvider - possible memory leak?
Size and Capacity are in bytes. The RAM is consumed right after Cassandra starts (3GB of heap) - the reason for this could be the 400,000,000 rows on a single node; the serialized bloom filters take 1.2 GB of HDD space.

On Mon, Dec 3, 2012 at 10:14 AM, Maciej Miklas mac.mik...@gmail.com wrote:
Hi, I have the following Cassandra setup on a server with 24GB RAM:

*cassandra-env.sh*
MAX_HEAP_SIZE=6G
HEAP_NEWSIZE=500M

*cassandra.yaml*
key_cache_save_period: 0
row_cache_save_period: 0
key_cache_size_in_mb: 512
row_cache_size_in_mb: 10240
row_cache_provider: SerializingCacheProvider

I'm getting OutOfMemory errors, and VisualVM shows that Old Gen takes nearly the whole heap. These are the Cassandra log messages:

INFO CLibrary JNA mlockall successful
INFO DatabaseDescriptor DiskAccessMode 'auto' determined to be mmap, indexAccessMode is mmap
INFO DatabaseDescriptor Global memtable threshold is enabled at 1981 MB
INFO CacheService Initializing key cache with capacity of 512 MBs.
INFO CacheService Scheduling key cache save to each 0 seconds (going to save all keys).
INFO CacheService Initializing row cache with capacity of 10240 MBs and provider org.apache.cassandra.cache.SerializingCacheProvider
INFO CacheService Scheduling row cache save to each 0 seconds (going to save all keys).
...
INFO GCInspector GC for ConcurrentMarkSweep: 1106 ms for 1 collections, 5445489440 used; max is 6232735744
INFO StatusLogger Cache Type Size Capacity KeysToSave Provider
INFO StatusLogger KeyCache 831782 831782 all
INFO StatusLogger RowCache 196404489 196404688 all org.apache.cassandra.cache.SerializingCacheProvider
...
INFO StatusLogger ColumnFamily Memtable ops, data
INFO StatusLogger MyCF1 192828,66056113
INFO StatusLogger MyCF2 59913,19535021
INFO StatusLogger MyCF3 124953,59082091
WARN [ScheduledTasks:1] GCInspector.java Heap is 0.8623632454134093 full. You may need to reduce memtable and/or cache sizes. Cassandra will now flush up to the two largest memtables to free up memory. Adjust flush_largest_memtables_at threshold in cassandra.yaml if you don't want Cassandra to do this automatically

1) I've set the row cache size to 10GB. A single row needs 300-500 bytes in serialized form, which would allow a maximum of 20 million row key entries. SerializingCacheProvider reports a size of 196 million - how can I interpret this number?
2) I am using default settings besides the changes described above. Since the key cache is small and the off-heap cache is active, what is taking space in Old Gen?

Thanks, Maciej
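A rough sanity check for question 1), assuming the reported numbers are bytes and 300-500 bytes per serialized row (both stated above):

    reported row cache size:  196,404,489 bytes ≈ 187 MB
    at ~400 bytes per row:    196,404,489 / 400 ≈ 490,000 cached rows
    configured capacity:      10,240 MB ≈ 10,737,418,240 bytes
    at ~500 bytes per row:    10,737,418,240 / 500 ≈ 21.5 million rows maximum

So the "196 million" is a byte count, not an entry count - the cache currently holds on the order of half a million rows.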
Cassandra as session store under heavy load
Hi *,

I would like to use Cassandra to store session-related information. I do not have a real HTTP session - it's a different protocol, but the same concept. Memcached would be fine, but I would like to additionally persist the data.

Cassandra setup:
- non-replicated keyspace
- single column family, where the key is the session ID and each column within the row stores a single key/value - (Map<String, Set<String,String>>)
- column TTL = 10 minutes
- write CL = ONE
- read CL = ONE
- 2,000 writes/s
- 5,000 reads/s

Data example:

session1: { // CF row key
  {prop1:val1, TTL:10 min},
  {prop2:val2, TTL:10 min},
  ...
  {propXXX:val3, TTL:10 min}
},
session2: { // CF row key
  {prop1:val1, TTL:10 min},
  {prop2:val2, TTL:10 min},
},
...
session: { // CF row key
  {prop1:val1, TTL:10 min},
  {prop2:val2, TTL:10 min},
}

In this case consistency is not a problem, but performance could be, especially disk IO. Since the data in my session lives for a short time, I would like to avoid storing it on the hard drive - except for the commit log.

I have some questions (see the write sketch after this list):
1. If a column expires in the Memtable before it is flushed to an SSTable, will Cassandra store such a column in the SSTable anyway (flush it to HDD)?
2. Replication is disabled for my keyspace; in this case storing such an expired column in an SSTable would not be necessary, right?
3. Each row has max 10 columns. In such a case I would enable the row cache and disable the key cache. But I am expecting my data to still be available in the Memtable, so I could disable the whole cache, right?
4. Any Cassandra configuration hints for such a session-store use case would be really appreciated :)

Thank you, Maciej
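For reference, such a TTL'd session write could look roughly like this with Hector (the client used elsewhere in these threads); the keyspace and column family names (SessionKS, SessionCF) are made up:

    import me.prettyprint.cassandra.serializers.StringSerializer;
    import me.prettyprint.hector.api.Cluster;
    import me.prettyprint.hector.api.Keyspace;
    import me.prettyprint.hector.api.beans.HColumn;
    import me.prettyprint.hector.api.factory.HFactory;
    import me.prettyprint.hector.api.mutation.Mutator;

    public class SessionWriter {
        private static final StringSerializer SS = StringSerializer.get();

        public static void main(String[] args) {
            Cluster cluster = HFactory.getOrCreateCluster("test", "localhost:9160");
            Keyspace keyspace = HFactory.createKeyspace("SessionKS", cluster);

            // one column per session property, expiring after 10 minutes (600 seconds)
            HColumn<String, String> col = HFactory.createColumn("prop1", "val1", 600, SS, SS);

            Mutator<String> mutator = HFactory.createMutator(keyspace, SS);
            mutator.addInsertion("session1", "SessionCF", col);
            mutator.execute();
        }
    }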
Re: Cassandra as session store under heavy load
- RF is 1. We have a few keyspaces; only this one is not replicated - this data is not that important. In case of an error the customer will have to execute the process again. But again, I would like to persist it.
- Serializing data is not an option, because I would like to have the possibility to access the data using the console.
- I will keep the row cache - you are right, there is no guarantee that my data is still in the Memtable.

I will get my hardware soon (3 servers) and we will see ;) In the worst case I will switch my session storage to memcached and leave all other data in Cassandra (no TTL, or a very long one).

More questions:
- Is using Cassandra to build something like an HTTP session store with a short TTL an anti-pattern?
- Is there really no way to tell Cassandra that a particular keyspace should be stored mostly in RAM, with only asynchronous backup to HDD (JMS has something like that)?

Thanks, Maciej
Re: Cassandra as session store under heavy load
durable_writes sounds great - thank you! I really do not need the commit log here. Another question: is it possible to configure the lifetime of tombstones?

Regards, Maciej
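For reference: tombstone lifetime is controlled per column family by gc_grace_seconds - tombstones become eligible for removal during compaction once that grace period has passed. A cassandra-cli sketch with made-up keyspace/CF names (durable_writes disables the commit log for the whole keyspace; gc_grace is given in seconds):

    create keyspace SessionStore
      with placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy'
      and strategy_options = {replication_factor: 1}
      and durable_writes = false;

    use SessionStore;
    update column family Sessions with gc_grace = 600;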
Row Cache Heap Requirements (Cassandra 1.0)
Hi all,

I've tested the row cache and found out that it requires a large amount of heap - I would like to verify this theory. This is my test key space:

{ TestCF: {
  row_key_1: {
    { clientKey: MyTestCluientKey },
    { tokenSecret: kd94hf93k423kf44 },
    { verifier: hfdp7dh39dks9884 },
    { callbackUrl: http%3A%2F%2Fprinter.test.com%2Fready },
    { accountId: 234567876545 },
    { mytestResourceId: ADB112 },
    { dataTimestamp: 1308903420400 },
    { dataType: ACCESS_PERMANENT }
  },
  row_key_2: {
    { clientKey: MyTestCluientKey },
    { tokenSecret: qdqergvhetyhvetyh },
    { verifier: wtrgvebyjnrnuiucewrqxcc },
    { callbackUrl: http%3A%2F%2Fprinter.test.com%2Fready },
    { accountId: 23456789746534 },
    { mytestResourceId: DQERGCWRTHB },
    { dataTimestamp: 130890342333200 },
    { dataType: ACCESS_LIMITED }
  },
  ...
  row_key_x: { }
} }

Each row in CF TestCF contains 8 columns. The row cache is enabled, the key cache is disabled. The row hit rate is 0.99 - this is a read-only test. My test loads 1,500,000 rows into the cache - and this allocates about 3.5GB of heap - that is about 2KB per single row - this is a lot.

Is it possible that a single row (8 columns) can allocate about 2KB of heap?

Thank you, Maciej
Re: Row Cache Heap Requirements (Cassandra 1.0)
This is how I tested it:
1) load the cache with 1,500,000 entries
2) execute full GC
3) measure the heap size (using VisualVM)
4) flush the row cache over the CLI
5) execute full GC
6) again measure the heap usage

The difference between 6) and 3) is the heap size used by the cache.

On Fri, Oct 28, 2011 at 3:26 PM, Peter Schuller peter.schul...@infidyne.com wrote:
> Is it possible that a single row (8 columns) can allocate about 2KB of heap?
It sounds a bit much, though not extremely so (depending on how much overhead there is per-column relative to per-row). Are you definitely looking at the live size of the heap (for example, trigger a full GC and look at the results) and not just how much data there happens to be on the heap after your insertion? In any case, if you are looking for better memory efficiency I advise looking at the off-heap row cache ( http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-improved-memory-and-disk-space-management ). It is supposed to be enabled by default. If it's not, do you have JNA installed? The reason I say that is that the off-heap cache stores serialized information off the heap rather than the full tree of Java objects. If off-heap caching is enabled, 2 kb/row would be far far more than expected (unless I'm missing something, I've actually yet to measure it myself ;)).
-- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)
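If you want to cross-check the VisualVM numbers, the stock JDK tools can do the same measurement - a sketch (the PID is of course machine-specific):

    # print a live-object histogram; ":live" forces a full GC first
    jmap -histo:live <cassandra-pid> | head -30

    # watch old-gen utilization while loading / flushing the cache
    jstat -gcutil <cassandra-pid> 1000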
Cassandra 1.x and proper JNA setup
Hi all,

is there any documentation about proper JNA configuration? I do not understand a few things:
1) Does JNA use the JVM heap settings?
2) Do I need to decrease the max heap size while using JNA?
3) How do I limit the RAM allocated by JNA?
4) Where can I see / monitor the row cache size?
5) I've configured JNA just for a test on my dev computer, and so far I've noticed serious performance issues (high CPU usage under heavy write load), so I must be doing something wrong. I've just copied the JNA jars into Cassandra/lib, without installing any native libs. This should not work at all, right?

Thanks, Maciej
Re: Cassandra 1.x and proper JNA setup
I've just found that JNA will not be used as of the 1.1 release - https://issues.apache.org/jira/browse/CASSANDRA-3271. It would also be nice to know the reason for this decision.

Regards, Maciej

On Wed, Nov 2, 2011 at 1:34 PM, Viktor Jevdokimov viktor.jevdoki...@adform.com wrote:
Up, also interested in answers to the questions below.
Best regards / Pagarbiai, Viktor Jevdokimov, Senior Developer, J. Jasinskio 16C, LT-01112 Vilnius, Lithuania
Re: Cassandra 1.x and proper JNA setup
According to the source code, JNA is being used to call malloc and free. In this case each cached row is serialized into RAM. We must be really careful when defining the cache size - too large a size would cause out-of-memory errors. Previous Cassandra releases had logic that would decrease the cache size when the heap ran low; currently each row is serialized without any memory limit checks - assuming that I understood it right.

These properties:

reduce_cache_sizes_at: 0.85
reduce_cache_capacity_to: 0.6

are not used anymore - at least not when JNA is enabled, which is the default from Cassandra 1.0.

On Wed, Nov 2, 2011 at 1:53 PM, Maciej Miklas mac.mik...@googlemail.com wrote:
I've just found that JNA will not be used as of the 1.1 release - https://issues.apache.org/jira/browse/CASSANDRA-3271. It would also be nice to know the reason for this decision.
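For illustration, this is roughly what "malloc through JNA" looks like at the Java level - a standalone sketch, not the actual Cassandra cache code:

    import com.sun.jna.Memory;

    public class OffHeapSketch {
        public static void main(String[] args) {
            byte[] serializedRow = "row-data".getBytes(); // pretend this is a serialized row

            // JNA's Memory wraps malloc(); the buffer lives outside the JVM heap,
            // so it is invisible to the garbage collector and to -Xmx accounting
            Memory buffer = new Memory(serializedRow.length);
            buffer.write(0, serializedRow, 0, serializedRow.length);

            // read it back into an on-heap array
            byte[] copy = buffer.getByteArray(0, serializedRow.length);
            System.out.println(new String(copy));

            // the native block is released via free() by Memory's finalizer
            // when the wrapper object itself is garbage collected
        }
    }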
Re: Cassandra 1.x and proper JNA setup
Super - thank you for the help :)

On Thu, Nov 3, 2011 at 6:55 PM, Jonathan Ellis jbel...@gmail.com wrote:
Relying on that was always a terrible idea, because you could easily OOM before it could help. There's no substitute for "don't make the caches too large in the first place". We're working on https://issues.apache.org/jira/browse/CASSANDRA-3143 to make cache sizing easier.
-- Jonathan Ellis, Project Chair, Apache Cassandra, co-founder of DataStax, the source for professional Cassandra support, http://www.datastax.com
Re: Off-heap caching through ByteBuffer.allocateDirect when JNA not available ?
I would like to know this too - actually it should be similar, plus there are no dependencies on sun.misc packages.

Regards, Maciej

On Thu, Nov 10, 2011 at 1:46 PM, Benoit Perroud ben...@noisette.ch wrote:
Thanks for the answer. I saw the move to sun.misc. In what sense is allocateDirect broken?
Thanks, Benoit.

2011/11/9 Jonathan Ellis jbel...@gmail.com:
allocateDirect is broken for this purpose, but we removed the JNA dependency using sun.misc.Unsafe instead: https://issues.apache.org/jira/browse/CASSANDRA-3271

On Wed, Nov 9, 2011 at 5:54 AM, Benoit Perroud ben...@noisette.ch wrote:
Hi, I wonder if you have already discussed the ByteBuffer.allocateDirect alternative to JNA memory allocation? If so, would someone mind sending me a pointer? Thanks! Benoit.

-- Jonathan Ellis, Project Chair, Apache Cassandra, co-founder of DataStax, the source for professional Cassandra support, http://www.datastax.com
-- sent from my Nokia 3210
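For illustration, this is roughly what the Unsafe-based replacement looks like - a standalone sketch, not Cassandra's actual code. Unlike ByteBuffer.allocateDirect, this memory can be freed deterministically instead of waiting for the buffer object to be garbage collected:

    import java.lang.reflect.Field;
    import sun.misc.Unsafe;

    public class UnsafeSketch {
        private static final Unsafe UNSAFE = loadUnsafe();

        private static Unsafe loadUnsafe() {
            try {
                // the singleton is private; grab it via reflection
                Field f = Unsafe.class.getDeclaredField("theUnsafe");
                f.setAccessible(true);
                return (Unsafe) f.get(null);
            } catch (Exception e) {
                throw new AssertionError(e);
            }
        }

        public static void main(String[] args) {
            byte[] row = "serialized-row".getBytes();

            // allocate native memory outside the JVM heap
            long address = UNSAFE.allocateMemory(row.length);
            try {
                // copy the serialized row into native memory
                for (int i = 0; i < row.length; i++) {
                    UNSAFE.putByte(address + i, row[i]);
                }
                System.out.println((char) UNSAFE.getByte(address));
            } finally {
                // deterministic release - the part allocateDirect cannot offer
                UNSAFE.freeMemory(address);
            }
        }
    }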
Data Model Design for Login Service
Hello all,

I need your help to design a structure for a simple login service. It contains about 100,000,000 customers, and each one can have about 10 different logins - this results in 1,000,000,000 different logins.

Each customer record contains the following data:
- one to many login names as strings, max 20 UTF-8 characters long
- ID as long - one customer has only one ID
- gender
- birth date
- name
- password as MD5

The login process needs to find the user by login name. Data in Cassandra is replicated - this is necessary to obtain all required login data in a single call. We usually expect low write traffic and heavy read traffic - round trips for reading data should be avoided.

Below I've described two possible Cassandra data models based on an example: we have two users, where the first user has three logins and the second user has two.

A) Skinny rows
- the row key contains the login name - this is the main search criterion
- login data is replicated - each possible login is stored as a single row which contains all user data
- 10 logins for a single customer create 10 rows, where each row has a different key and the same content

// the first 3 rows have different keys and the same replicated data
alfred.tes...@xyz.de {
  id: 1122, gender: MALE, birthdate: 1987.11.09, name: Alfred Tester,
  pwd: e72c504dc16c8fcd2fe8c74bb492affa
},
alf...@aad.de {
  id: 1122, gender: MALE, birthdate: 1987.11.09, name: Alfred Tester,
  pwd: e72c504dc16c8fcd2fe8c74bb492affa
},
a...@dd.de {
  id: 1122, gender: MALE, birthdate: 1987.11.09, name: Alfred Tester,
  pwd: e72c504dc16c8fcd2fe8c74bb492affa
},
// the two following rows again have the same data for the second customer
manf...@xyz.de {
  id: 1133, gender: MALE, birthdate: 1997.02.01, name: Manfredus Maximus,
  pwd: e44c504ff16c8fcd2fe8c74bb492adda
},
rober...@xyz.de {
  id: 1133, gender: MALE, birthdate: 1997.02.01, name: Manfredus Maximus,
  pwd: e44c504ff16c8fcd2fe8c74bb492adda
}

B) Rows grouped by alphabetical prefix
- the number of rows is limited - for example by the first letter of the login name
- each row contains all logins which begin with the row key - the row with key 'a' contains all logins which begin with 'a'
- data might be unbalanced, but we avoid skinny rows - this might have a positive performance impact (??)
- to avoid super columns, each row contains columns directly, where the column name is the user login and the column value is the corresponding data in a kind of serialized form (I would like to have it human readable)

a {
  alfred.tes...@xyz.de: 1122;MALE;1987.11.09;Alfred Tester;e72c504dc16c8fcd2fe8c74bb492affa,
  alf...@aad.de: 1122;MALE;1987.11.09;Alfred Tester;e72c504dc16c8fcd2fe8c74bb492affa,
  a...@dd.de: 1122;MALE;1987.11.09;Alfred Tester;e72c504dc16c8fcd2fe8c74bb492affa
},
m {
  manf...@xyz.de: 1133;MALE;1997.02.01;Manfredus Maximus;e44c504ff16c8fcd2fe8c74bb492adda
},
r {
  rober...@xyz.de: 1133;MALE;1997.02.01;Manfredus Maximus;e44c504ff16c8fcd2fe8c74bb492adda
}

Which solution is better, especially for read performance? Do you have a better idea?

Thanks, Maciej
Re: Data Model Design for Login Service
But a secondary index is limited to repeating values like enums. In my case I would have a performance issue, right?

On 18.11.2011, at 02:08, Maxim Potekhin potek...@bnl.gov wrote:
1122: {
  gender: MALE,
  birthdate: 1987.11.09,
  name: Alfred Tester,
  pwd: e72c504dc16c8fcd2fe8c74bb492affa,
  alias1: alfred.tes...@xyz.de,
  alias2: alf...@aad.de,
  alias3: a...@dd.de
}
...and you can use secondary indexes to query on anything.
Maxim
Re: Data Model Design for Login Service
I will follow exactly this solution - thanks :)

On Fri, Nov 18, 2011 at 9:53 PM, David Jeske dav...@gmail.com wrote:
On Thu, Nov 17, 2011 at 1:08 PM, Maciej Miklas mac.mik...@googlemail.com wrote:
> A) Skinny rows - the row key contains the login name - this is the main search criterion - login data is replicated - each possible login is stored as a single row which contains all user data - 10 logins for a single customer create 10 rows, where each row has a different key and the same content

To me this seems reasonable. Remember, because of your replication of the data values you will want a quick way to find all the logins for a given ID, so you will also want to store a separate dataset like:

1122 {
  alfred.tes...@xyz.de = 1   (where the login is a column key)
  alf...@aad.de = 1
}

When you do an update, you'll need to fetch the entire row for the user-id, and then update all copies of the data. This can create problems if the data is out of sync (which it will be at certain times because of eventual consistency, and might be if something bad happens).

...the other option, of course, is to make a login-name indirection. You would have only one copy of the user data, stored by ID, and then you would store a separate mapping from login-name to ID. Of course this would require two round trips to get the user information from a login name, which is something I know you said you didn't want to do.
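For reference, the login-name indirection read path David describes would look roughly like this with Hector - a sketch with made-up column family names (LoginToId, UserById):

    import me.prettyprint.cassandra.serializers.StringSerializer;
    import me.prettyprint.hector.api.Keyspace;
    import me.prettyprint.hector.api.beans.ColumnSlice;
    import me.prettyprint.hector.api.beans.HColumn;
    import me.prettyprint.hector.api.factory.HFactory;
    import me.prettyprint.hector.api.query.QueryResult;
    import me.prettyprint.hector.api.query.SliceQuery;

    public class LoginLookup {
        private static final StringSerializer SS = StringSerializer.get();

        // two round trips: login name -> customer id, then customer id -> user data
        public static ColumnSlice<String, String> findUser(Keyspace ks, String login) {
            // 1) resolve the customer id for the login name
            QueryResult<HColumn<String, String>> id = HFactory
                    .createColumnQuery(ks, SS, SS, SS)
                    .setColumnFamily("LoginToId").setKey(login).setName("id")
                    .execute();
            if (id.get() == null) return null; // unknown login

            // 2) read the full user row stored once per customer
            SliceQuery<String, String, String> slice = HFactory.createSliceQuery(ks, SS, SS, SS);
            slice.setColumnFamily("UserById").setKey(id.get().getValue());
            slice.setRange(null, null, false, 100); // all (up to 100) columns
            return slice.execute().get();
        }
    }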
Cassandra - row range and column slice
Hello,

assuming an Ordered Partitioner, I would like to have the possibility to find records by row key range and columns by slice - for example: give me all rows between 2001 and 2003 and all columns between A and C.

For this data:

{
  2001: {A:v1, Z:v2},
  2002: {R:v2, Z:v3},
  2003: {C:v4, Z:v5},
  2004: {A:v1, B:v33, Z:v2}
}

the result would be:

2001: {A:v1},
2003: {C:v4}

Is such a multi-slice query possible with Cassandra? Are there any performance issues (besides an unbalanced cluster)?

Thanks, Maciej
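Such a query maps onto a single get_range_slices call - e.g. with Hector, assuming an order-preserving partitioner and a column family I'll call "Years":

    import me.prettyprint.cassandra.serializers.StringSerializer;
    import me.prettyprint.hector.api.Keyspace;
    import me.prettyprint.hector.api.beans.OrderedRows;
    import me.prettyprint.hector.api.beans.Row;
    import me.prettyprint.hector.api.factory.HFactory;
    import me.prettyprint.hector.api.query.RangeSlicesQuery;

    public class MultiSlice {
        private static final StringSerializer SS = StringSerializer.get();

        public static void query(Keyspace ks) {
            RangeSlicesQuery<String, String, String> q =
                    HFactory.createRangeSlicesQuery(ks, SS, SS, SS);
            q.setColumnFamily("Years");
            q.setKeys("2001", "2003");         // row key range - only meaningful with OPP
            q.setRange("A", "C", false, 100);  // column slice A..C, applied to every row

            OrderedRows<String, String, String> rows = q.execute().get();
            for (Row<String, String, String> row : rows) {
                System.out.println(row.getKey() + " -> " + row.getColumnSlice().getColumns());
            }
        }
    }

Note that rows whose slice comes back empty (2002 above) are still returned with an empty column list, so the client has to filter them out.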
Cassandra cache patterns with tiny and wide rows
I've asked this question already on Stack Overflow but got no answer - I will try again here.

My use case expects heavy read load - there are two possible model design strategies:

1. Tiny rows with row cache: in this case a row is small enough to fit into RAM and all columns are cached. Read access should be fast.

2. Wide rows with key cache: wide rows with a large number of columns are too big for the row cache. Access to a column subset requires an HDD seek.

As I understand it, using wide rows is a good design pattern. But we would need to disable the row cache - so what is the benefit of such a wide row (at least for read access)? Which approach is better, 1 or 2?
Re: hector connection pool
Have you tried changing me.prettyprint.cassandra.service.CassandraHostConfigurator#retryDownedHostsDelayInSeconds? Hector will ping down hosts every xx seconds and recover the connection.

Regards, Maciej

On Mon, Mar 5, 2012 at 8:13 PM, Daning Wang dan...@netseer.com wrote:
I just got this error: "All host pools marked down. Retry burden pushed out to client." in a few clients recently; the clients could not recover, and we had to restart the client application. We are using 0.8.0.3 Hector. At that time we ran a compaction for a CF; it took several hours and the server was busy. But I think the client should have recovered after the server load went down. Any bug reported about this? I searched but could not find one. Thanks, Daning
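A sketch of the relevant Hector configuration (host and cluster names are placeholders):

    import me.prettyprint.cassandra.service.CassandraHostConfigurator;
    import me.prettyprint.hector.api.Cluster;
    import me.prettyprint.hector.api.factory.HFactory;

    public class HectorPoolConfig {
        public static Cluster connect() {
            CassandraHostConfigurator config =
                    new CassandraHostConfigurator("host1:9160,host2:9160");
            // a background thread pings downed hosts and re-adds them to the pool
            config.setRetryDownedHosts(true);
            config.setRetryDownedHostsDelayInSeconds(10);
            return HFactory.getOrCreateCluster("myCluster", config);
        }
    }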
Cassandra as Database for Role Based Access Control System
Hi *,

I would like to know your opinion about using Cassandra to implement an RBAC-like authentication and authorization model. We have simplified the central relationship of the general model (http://en.wikipedia.org/wiki/Role-based_access_control) to:

user ---n:m--- role ---n:m--- resource

user(s) and resource(s) are indexed with externally visible identifiers. These identifiers need to be re-ownable (think: mail aliases), too.

The main reason to consider Cassandra is the availability, scalability and (global) geo-redundancy. This is hard to achieve with an RDBMS. On the other side, RBAC has many m:n relations. While some inconsistencies may be acceptable, resource ownership (i.e. role=owner) must never ever be mixed up.

What do you think? Is such a relational model an anti-pattern for Cassandra usage? Do you know similar solutions based on Cassandra?

Regards, Maciej

ps. I've posted this question on Stack Overflow as well, but I would also like to get feedback from the Cassandra community.
Re: Schema advice/help
multiget would require an Order Preserving Partitioner, and that can lead to an unbalanced ring and hot spots. Maybe you can use a secondary index on itemtype - it must have small cardinality: http://pkghosh.wordpress.com/2011/03/02/cassandra-secondary-index-patterns/

On Tue, Mar 27, 2012 at 10:10 AM, Guy Incognito dnd1...@gmail.com wrote:
without the ability to do disjoint column slices, i would probably use 5 different rows: userId:itemType -> activityId. then it's a multiget slice of 10 items from each of your 5 rows.

On 26/03/2012 22:16, Ertio Lew wrote:
I need to store activities by each user, on 5 item types. I always want to read the last 10 activities on each item type, by a user (i.e., total activities to read at a time = 50). I want to store these activities in a single row for each user so that they can be retrieved in a single row query, since I want to read all the last 10 activities on each item. I am thinking of creating composite names appending itemtype : activityId (activityId is just a timestamp value), but then I don't see how to read the last 10 activities from all itemtypes. Any ideas about a schema to do this in a better way?
Re: Schema advice/help
Yes - but anyway, in your example you need a key range query, and that requires OPP, right?

On Tue, Mar 27, 2012 at 5:13 PM, Guy Incognito dnd1...@gmail.com wrote:
multiget does not require OPP.
Re: Schema advice/help
Correct - I also see no other solution for this problem.

On Thu, Mar 29, 2012 at 1:46 AM, Guy Incognito dnd1...@gmail.com wrote:
well, no. my assumption is that he knows what the 5 itemTypes (or appropriate corresponding ids) are, so he can do a known 5-rowkey lookup. if he does not know, then agreed, my proposal is not a great fit. you could do (as originally suggested) userId -> itemType:activityId if you want to keep everything in the same row (again assuming that you know what the itemTypes are), but then you can't really do a multiget - you have to do 5 separate slice queries, one for each item type. you could also do some wacky stuff around maintaining a row that explicitly only holds the last 10 items by itemType (meaning you have to delete the oldest one every time you insert a new one), but that probably requires read-on-write etc and is a lot messier. and you will probably need to worry about the case where you (transiently) have more than 10 'latest' items for a single itemType.
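For completeness, the 5-row / multiget-slice approach Guy describes would look roughly like this in Hector (the CF name and the key scheme are mine):

    import me.prettyprint.cassandra.serializers.StringSerializer;
    import me.prettyprint.hector.api.Keyspace;
    import me.prettyprint.hector.api.beans.Rows;
    import me.prettyprint.hector.api.factory.HFactory;
    import me.prettyprint.hector.api.query.MultigetSliceQuery;

    public class LatestActivities {
        private static final StringSerializer SS = StringSerializer.get();

        public static Rows<String, String, String> lastTenPerType(Keyspace ks, String userId) {
            MultigetSliceQuery<String, String, String> q =
                    HFactory.createMultigetSliceQuery(ks, SS, SS, SS);
            q.setColumnFamily("Activities");
            // one row per (user, itemType); the 5 item types are known up front
            q.setKeys(userId + ":type1", userId + ":type2", userId + ":type3",
                      userId + ":type4", userId + ":type5");
            // reversed slice: the newest 10 columns (column names are timestamps)
            q.setRange(null, null, true, 10);
            return q.execute().get();
        }
    }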
Cassandra 1.1 - conflict resolution - any changes ?
Hi,

I've seen this blog entry: http://www.datastax.com/dev/blog/schema-in-cassandra-1-1 and I am trying to understand how Cassandra could support a PRIMARY KEY. Cassandra has silent conflict resolution, where each insert overwrites the previous one, and there are only inserts and deletes - no in-place updates. The latest data version is resolved during a read - as the newest entry from all corresponding SSTables. Is this still correct in Cassandra 1.1?

Thanks, Maciej
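That last-write-wins reconciliation can be sketched in a few lines - this is the idea, not Cassandra's actual code:

    public class Reconcile {
        static final class Cell {
            final String value;
            final long timestamp;    // supplied by the client/coordinator
            final boolean tombstone; // deletes are just timestamped markers
            Cell(String value, long timestamp, boolean tombstone) {
                this.value = value; this.timestamp = timestamp; this.tombstone = tombstone;
            }
        }

        // merge two versions of the same column read from different SSTables/memtable:
        // the higher timestamp wins, whether it carries a value or a tombstone
        static Cell reconcile(Cell a, Cell b) {
            return a.timestamp >= b.timestamp ? a : b;
        }

        public static void main(String[] args) {
            Cell older = new Cell("v1", 1000, false);
            Cell newer = new Cell("v2", 2000, false);
            System.out.println(reconcile(older, newer).value); // prints v2
        }
    }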
CQL 3.0 - UPDATE Statement - how it works?
CQL will have an UPDATE feature, and I am trying to understand how this could work. Every write is an append to an SSTable; an UPDATE would need to change data, but only if it exists, and this is problematic, since we have a distributed system. Is UPDATE a special kind of insert, which changes given data only if it already exists? Will an UPDATE be resolved during the read operation (SSTable merge)?

Thanks, Maciej
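For what it's worth, in CQL an UPDATE is an upsert: it does not read or check existence, it just writes a timestamped cell, and the read path reconciles versions exactly as described in the previous message. A cqlsh sketch (table and values are made up):

    CREATE TABLE users (user_id bigint PRIMARY KEY, email text);

    -- no row with user_id = 42 exists yet; the UPDATE still succeeds and creates it
    UPDATE users SET email = 'a@example.com' WHERE user_id = 42;

    -- INSERT behaves the same way: it silently overwrites existing cells
    INSERT INTO users (user_id, email) VALUES (42, 'b@example.com');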
Cassandra 1.0 - is disk seek required to access SSTable metadata
Hi all,

older Cassandra versions had to read columns from each SSTable with a positive bloom filter in order to find the most recent value. This was optimized with "Improve read performance in update-intensive workload": https://issues.apache.org/jira/browse/CASSANDRA-2498

Now each SSTable has metadata - SSTableMetadata. The bloom filter is stored in RAM, but what about the metadata? Is a disk seek required to access it?

Thanks, Maciej
SSTable Index and Metadata - are they cached in RAM?
Hi all,

the bloom filter for row keys is always in RAM. What about the SSTable index and metadata? Are they cached by Cassandra, or does it rely on memory-mapped files?

Thanks, Maciej
Re: SSTable Index and Metadata - are they cached in RAM?
Great articles, I did not find those before!

* SSTable Index - yes, I mean the column index. I would like to understand how many disk seeks might be required to find a column in a single SSTable, assuming a positive bloom filter on the row key. Cassandra needs to find out whether the given SSTable contains the column name, and this might require a few disk seeks:

1) Check the key cache; if found, go to 5)
2) Read all row keys from disk, in order to find ours (binary search)
3) The found row key contains the disk offset of its column index
4) Read the column index for our row key from disk. The index also contains a bloom filter on column names
5) Use the bloom filter on the column name to find out whether this SSTable might contain our column
6) Read the column to finally make sure that it exists

As I understand it, in the worst case we can have three disk seeks (2, 4, 6) per SSTable in order to check whether it contains a given column - is that correct? I would have expected that the sorted row keys (from point 2) already contain the bloom filter for their columns. But the bloom filter is stored together with the column index - is that correct?

Cheers, Maciej

On Fri, Aug 17, 2012 at 12:06 AM, aaron morton aa...@thelastpickle.com wrote:
> What about SSTable index,
Not sure what you are referring to there. Each row in an SSTable has a bloom filter and may have an index of columns. This is not cached. See http://thelastpickle.com/2011/07/04/Cassandra-Query-Plans/ or http://www.slideshare.net/aaronmorton/cassandra-sf-2012-technical-deep-dive-query-performance
> and Metadata?
This is the metadata we hold in memory for every open sstable: https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/io/sstable/SSTableMetadata.java
Cheers
- Aaron Morton, Freelance Developer, @aaronmorton, http://www.thelastpickle.com
Re: Understanding UnavailableException
UnavailableException is a bit tricky. It means that not all replicas required by the CL received the update. You actually do not know whether the update was stored or not, nor what exactly went wrong. This is why writing with CL.ALL might get problematic: it is enough that a single replica is offline and you will get the exception. Remember also that CL.ALL means all replicas in all data centers - not only the local DC. Writing with LOCAL_QUORUM could be a better idea.

There is only one CL where an exception guarantees that the data was really not stored: CL.ANY with hinted handoff enabled.

One more thing: a write always goes to all replicas, independent of the provided CL. The client request blocks only until the required replicas respond - the remaining responses arrive asynchronously. This means that when you write with a lower CL, the replicas still get the data at the same speed; only your client does not wait for acknowledgment from all of them.

Ciao, Maciej

On Fri, Aug 17, 2012 at 11:07 AM, Mohit Agarwal coolmoh...@gmail.com wrote:
Hi guys,
I am trying to understand what happens when an UnavailableException is thrown.
a) Suppose we are doing a ConsistencyLevel.ALL write on a 3 node cluster. My understanding is that if one of the nodes is down and the coordinator node is aware of that (through gossip), then it will respond to the request with an UnavailableException. Is this correct?
b) What happens if the coordinator isn't aware of a node being down, sends the request to all the nodes and never hears back from one of them? Would this result in a TimedOutException or an UnavailableException?
c) I am trying to understand the cases where the client receives an error but data could still have been inserted into Cassandra. One such case is the TimedOutException. Are there any other situations like these?
Thanks, Mohit
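Setting LOCAL_QUORUM with Hector would look roughly like this (the keyspace name is a placeholder):

    import me.prettyprint.cassandra.model.ConfigurableConsistencyLevel;
    import me.prettyprint.hector.api.Cluster;
    import me.prettyprint.hector.api.HConsistencyLevel;
    import me.prettyprint.hector.api.Keyspace;
    import me.prettyprint.hector.api.factory.HFactory;

    public class ConsistencySetup {
        public static Keyspace keyspace(Cluster cluster) {
            ConfigurableConsistencyLevel ccl = new ConfigurableConsistencyLevel();
            // require a quorum only in the local DC, instead of
            // every replica in every DC as CL.ALL does
            ccl.setDefaultWriteConsistencyLevel(HConsistencyLevel.LOCAL_QUORUM);
            ccl.setDefaultReadConsistencyLevel(HConsistencyLevel.LOCAL_QUORUM);
            return HFactory.createKeyspace("myKeyspace", cluster, ccl);
        }
    }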
Re: What is the ideal server-side technology stack to use with Cassandra?
I'm using Java + Tomcat + Spring + Hector on Linux - it works, as always, just great. It is also not a bad idea to mix databases - Cassandra is not the solution for every problem; Cassandra + Mongo could be ;)

On Fri, Aug 17, 2012 at 7:54 PM, Aaron Turner synfina...@gmail.com wrote:
My stack: Java + JRuby + Rails + Torquebox. I'm using the Hector client (arguably the most mature out there), and JRuby+RoR+Torquebox gives me a great development platform which really scales (full native thread support, for example) and is extremely powerful. Honestly, I expect all my future RoR apps will be built on JRuby/Torquebox because I've been so happy with it, even if I don't have a specific need to utilize Java libraries from inside the app. And the best part is that I've yet to write a single line of Java! :)

On Fri, Aug 17, 2012 at 6:53 AM, Edward Capriolo edlinuxg...@gmail.com wrote:
The best stack is the THC stack. :) Tomcat Hadoop Cassandra :)

On Fri, Aug 17, 2012 at 6:09 AM, Andy Ballingall TF balling...@thefoundry.co.uk wrote:
Hi,
I've been running a number of tests with Cassandra using a couple of PHP drivers (namely PHPCassa (https://github.com/thobbs/phpcassa/) and PDO-cassandra (http://code.google.com/a/apache-extras.org/p/cassandra-pdo/)), and the experience hasn't been great, mainly because I can't try out CQL3. Aaron Morton (aa...@thelastpickle.com) advised: "If possible I would avoid using PHP. The PHP story with Cassandra has not been great in the past. There is little love for it, so it takes a while for changes to get into the client drivers. AFAIK it lacks server-side state, which makes connection pooling impossible. You should not pool Cassandra connections in something like HAProxy."

So my question is: if you were to build a new scalable project from scratch tomorrow sitting on top of Cassandra, which technologies would you select to serve HTTP requests to ensure you get:
a) the best support from the Cassandra community (e.g. timely updates of drivers, better stability)
b) optimal efficiency between webservers and the Cassandra cluster, in terms of the performance of individual requests and the volume of connections handled per second
c) ease of development and deployment.
What worked for you, and why? What didn't work for you?

-- Aaron Turner, http://synfin.net/, Twitter: @synfinatic
http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix & Windows
"Those who would give up essential Liberty, to purchase a little temporary Safety, deserve neither Liberty nor Safety." -- Benjamin Franklin
carpe diem quam minimum credula postero
Cyclop - CQL3 web based editor
Hi all,

this is the Cassandra mailing list, but I've developed something strictly related to Cassandra, and some of you might find it useful, so I've decided to send an email to this group.

It is a web-based CQL3 editor. The idea is to deploy it once and have a simple and comfortable CQL3 interface over the web - without the need to install anything. The editor itself supports code completion, based not only on the CQL syntax but also on the database content - so for example the select statement will suggest tables from the active keyspace, or, in the where clause, only columns from the table provided after "select from". The results are displayed in a reversed table - rows horizontally and columns vertically. This seems more natural for a column-oriented database. You can also export query results to CSV, or add a query as a browser bookmark.

The whole application is based on Wicket + Bootstrap + Spring and can be deployed in any Servlet 3.0 container. Here is the project (open source): https://github.com/maciejmiklas/cyclop

Have fun! Maciej
Re: Cassandra 1.2 : OutOfMemoryError: unable to create new native thread
cassandra-env.sh has the option JVM_OPTS="$JVM_OPTS -Xss180k". It will give this error if you start Cassandra with Java 7 - so increase the value, or remove the option.

Regards, Maciej

On Mon, Dec 16, 2013 at 2:37 PM, srmore comom...@gmail.com wrote:
What is your thread stack size (xss)? Try increasing it, that could help. Sometimes the limitation is imposed by the host provider (e.g. Amazon EC2 etc.). Thanks, Sandeep

On Mon, Dec 16, 2013 at 6:53 AM, Oleg Dulin oleg.du...@gmail.com wrote:
Hi guys! I believe my limits settings are correct. Here is the output of ulimit -a:

core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 1547135
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 10
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 32768
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited

However, I just had a couple of Cassandra nodes go down over the weekend for no apparent reason with the following error:

java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:691)
at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:949)
at java.util.concurrent.ThreadPoolExecutor.processWorkerExit(ThreadPoolExecutor.java:1017)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1163)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:722)

Any input is greatly appreciated.
-- Regards, Oleg Dulin http://www.olegdulin.com
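The corresponding cassandra-env.sh change would be along these lines (256k is a commonly used value for Java 7; the exact minimum your JVM accepts is printed in its startup error message):

    # cassandra-env.sh - per-thread stack size; Java 7 rejects the old 180k value
    JVM_OPTS="$JVM_OPTS -Xss256k"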
Cyclop - CQL web based editor has been released!
Hi everybody,

I am aware that this mailing list is meant for Cassandra users, but I've developed something strictly related to Cassandra, so I thought it might be interesting for some of you. I already sent one email several months ago, but since then a lot of things have changed!

Cyclop is a web-based CQL editor - you can deploy it in a web container and use its web interface to execute CQL queries or to import/export data. There is also a live deployment, so you can try it out immediately. Of course the whole thing is open source. Here is the project link containing all details: https://github.com/maciejmiklas/cyclop

Regards, Maciej
Re: Cyclop - CQL web based editor has been released!
Thanks - I've fixed it.

Regards, Maciej

On Mon, May 12, 2014 at 2:50 AM, graham sanderson gra...@vast.com wrote:
Looks cool - giving it a try now. (Note, FYI, when building: TestDataConverter.java line 46 assumes a specific time zone.)
CQL 3 and wide rows
Hi *,

I've checked the DataStax driver code for CQL 3, and it looks like the column names for a particular table are fully loaded into memory - is this true? Cassandra should support wide rows, meaning tables with millions of columns. Knowing that, I would expect some kind of iterator for column names. Am I missing something here?

Regards, Maciej Miklas
Re: CQL 3 and wide rows
Hello Jack,

you have given a perfect example of a wide row: each reading from a sensor creates a new column within a row. It was also possible with Hector/CLI to have millions of columns within a single row. According to this page http://wiki.apache.org/cassandra/CassandraLimitations a single row can have 2 billion columns. How does this relate to CQL 3 and tables? I still do not understand it, because:
- it looks like the driver loads all column names into memory
- it looks to me that the 2 billion column limitation from the CLI is not valid anymore
- Map and Set values do not support iteration

Regards, Maciej

On 19 May 2014, at 17:31, Jack Krupansky j...@basetechnology.com wrote:
You might want to review this blog post on supporting dynamic columns in CQL3, which points out that "the way to model dynamic cells in CQL is with a compound primary key". See: http://www.datastax.com/dev/blog/does-cql-support-dynamic-columns-wide-rows
-- Jack Krupansky
Re: CQL 3 and wide rows
Hi James,

clustering is based on rows. I think you meant not clustering columns but compound columns. Still, all columns belong to a single table and are stored within a single folder on one computer. And it looks to me (but I'm not sure) that the CQL 3 driver loads all column names into memory - which is confusing to me. On one side we have a wide row; on the other we load the whole thing into RAM...

My understanding of a wide row is a row that supports millions of columns, or similar things like a map or set. In CLI you would generate column names (or use compound columns) to simulate a set or map; in CQL 3 you would use some static names plus Map or Set structures, or you could still alter the table and have a large number of columns. But still - I do not see iteration, so it looks to me that CQL 3 is limited when compared to CLI/Hector.

Regards, Maciej

On 19 May 2014, at 17:30, James Campbell ja...@breachintelligence.com wrote:
Maciej,
In CQL3 wide rows are expected to be created using clustering columns. So while the schema will have a relatively small number of named columns, the effect is a wide row. For example:

CREATE TABLE keyspace.widerow (
  row_key text,
  wide_row_column text,
  data_column text,
  PRIMARY KEY (row_key, wide_row_column));

Check out, for example, http://www.datastax.com/dev/blog/schema-in-cassandra-1-1.
James
Re: CQL 3 and wide rows
Yes :)

On 20 May 2014, at 14:24, Jack Krupansky j...@basetechnology.com wrote:
To keep the terminology clear, your "row_key" is actually the "partition key", "wide_row_column" is actually a "clustering column", and the combination of your row_key and wide_row_column is a "compound primary key".
-- Jack Krupansky

From: Aaron Morton
Sent: Tuesday, May 20, 2014 3:06 AM
To: Cassandra User
Subject: Re: CQL 3 and wide rows

In a CQL 3 table the only **column** names are the ones defined in the table; in the example below there are three column names.

CREATE TABLE keyspace.widerow (
  row_key text,
  wide_row_column text,
  data_column text,
  PRIMARY KEY (row_key, wide_row_column));

Check out, for example, http://www.datastax.com/dev/blog/schema-in-cassandra-1-1.

Internally there may be more **cells** (as we now call the internal columns). In the example above each value for row_key will create a single partition (as we now call internal storage engine rows). In each of those partitions there will be cells for each CQL 3 row that has the same row_key; those cells will use a Composite for the name. The first part of the composite will be the value of the wide_row_column and the second will be the literal name of the non-primary-key columns. IMHO wide partitions (storage engine rows) are more prevalent in CQL3 than in thrift models.

> But still - I do not see iteration, so it looks to me that CQL 3 is limited when compared to CLI/Hector.
Nowadays you can do pretty much everything you could in cli. Provide an example and we may be able to help.

Cheers, Aaron
- Aaron Morton, New Zealand, @aaronmorton
Co-Founder & Principal Consultant, Apache Cassandra Consulting, http://www.thelastpickle.com
Re: CQL 3 and wide rows
Hi Aaron,

thanks for the answer! Let's consider such CLI code:

for (int i = 0; i < 10_000_000; i++) {
    set['rowKey1']['myCol:' + i] = UUID.randomUUID();
}

The code above will create a single row that contains 10^7 columns sorted by 'i'. This will work fine, and this is the wide row as I understand it - a row that holds many columns AND lets me read only a part of them with the right slice query. On the other hand, I can iterate over all columns without latencies, because the data is stored on a single node. I've been using similar structures as a replacement for secondary indexes - it's a well-known pattern.

How would I model this in CQL 3?

1) I could create a Map, but Maps are fully loaded into memory, and a Map containing 10^7 elements is definitely a problem. Plus it's a big waste of RAM if you consider that I only need to read a small subset.

2) I could alter the table for each new column, which would create a structure similar to the one from my CLI example. But it looks to me that all column names are loaded into RAM, which is still a large limitation. I hope that I am wrong here - I am not sure.

3) I could redesign my model and divide the data into many rows - but why would I do that, if I can use wide rows?

My idea of a wide row is a row that can hold a large number of key-value pairs (in any form), where I can filter on those keys to efficiently load only the part which I currently need.

Regards, Maciej
Re: CQL 3 and wide rows
Thank you Nate - now I understand it ! This is real improvement when compared to CLI :) Regards, Maciej On 20 May 2014, at 17:16, Nate McCall n...@thelastpickle.com wrote: Something like this might work: cqlsh:my_keyspace CREATE TABLE my_widerow ( ... id text, ... my_col timeuuid, ... PRIMARY KEY (id, my_col) ... ) WITH caching='KEYS_ONLY' AND ... compaction={'class': 'LeveledCompactionStrategy'}; cqlsh:my_keyspace insert into my_widerow (id, my_col) values ('some_key_1',now()); cqlsh:my_keyspace insert into my_widerow (id, my_col) values ('some_key_1',now()); cqlsh:my_keyspace insert into my_widerow (id, my_col) values ('some_key_1',now()); cqlsh:my_keyspace insert into my_widerow (id, my_col) values ('some_key_1',now()); cqlsh:my_keyspace insert into my_widerow (id, my_col) values ('some_key_1',now()); cqlsh:my_keyspace insert into my_widerow (id, my_col) values ('some_key_1',now()); cqlsh:my_keyspace insert into my_widerow (id, my_col) values ('some_key_1',now()); cqlsh:my_keyspace insert into my_widerow (id, my_col) values ('some_key_1',now()); cqlsh:my_keyspace insert into my_widerow (id, my_col) values ('some_key_1',now()); cqlsh:my_keyspace insert into my_widerow (id, my_col) values ('some_key_1',now()); cqlsh:my_keyspace select * from my_widerow; id | my_col +-- some_key_1 | 7266d240-e030-11e3-a50d-8b2f9bfbfa10 some_key_1 | 73ba0630-e030-11e3-a50d-8b2f9bfbfa10 some_key_1 | 74404d30-e030-11e3-a50d-8b2f9bfbfa10 some_key_1 | 74defe30-e030-11e3-a50d-8b2f9bfbfa10 some_key_1 | 75569f30-e030-11e3-a50d-8b2f9bfbfa10 some_key_1 | 75bf9a30-e030-11e3-a50d-8b2f9bfbfa10 some_key_1 | 76227ab0-e030-11e3-a50d-8b2f9bfbfa10 some_key_1 | 76cfd1b0-e030-11e3-a50d-8b2f9bfbfa10 some_key_1 | 777364b0-e030-11e3-a50d-8b2f9bfbfa10 some_key_1 | 7aa061b0-e030-11e3-a50d-8b2f9bfbfa10 cqlsh:my_keyspace select * from my_widerow where id = 'some_key_1' and my_col 73ba0630-e030-11e3-a50d-8b2f9bfbfa10; id | my_col +-- some_key_1 | 74404d30-e030-11e3-a50d-8b2f9bfbfa10 some_key_1 | 74defe30-e030-11e3-a50d-8b2f9bfbfa10 some_key_1 | 75569f30-e030-11e3-a50d-8b2f9bfbfa10 some_key_1 | 75bf9a30-e030-11e3-a50d-8b2f9bfbfa10 some_key_1 | 76227ab0-e030-11e3-a50d-8b2f9bfbfa10 some_key_1 | 76cfd1b0-e030-11e3-a50d-8b2f9bfbfa10 some_key_1 | 777364b0-e030-11e3-a50d-8b2f9bfbfa10 some_key_1 | 7aa061b0-e030-11e3-a50d-8b2f9bfbfa10 cqlsh:my_keyspace select * from my_widerow where id = 'some_key_1' and my_col 73ba0630-e030-11e3-a50d-8b2f9bfbfa10 and my_col 76227ab0-e030-11e3-a50d-8b2f9bfbfa10; id | my_col +-- some_key_1 | 74404d30-e030-11e3-a50d-8b2f9bfbfa10 some_key_1 | 74defe30-e030-11e3-a50d-8b2f9bfbfa10 some_key_1 | 75569f30-e030-11e3-a50d-8b2f9bfbfa10 some_key_1 | 75bf9a30-e030-11e3-a50d-8b2f9bfbfa10 These queries would all work fine from the DS Java Driver. Note that only the cells that are needed are pulled into memory: ./bin/nodetool cfstats my_keyspace my_widerow ... Column Family: my_widerow ... Average live cells per slice (last five minutes): 6.0 ... This shows that we are slicing across 6 rows on average for the last couple of select statements. Hope that helps. -- - Nate McCall Austin, TX @zznate Co-Founder Sr. Technical Consultant Apache Cassandra Consulting http://www.thelastpickle.com