TTL on SecondaryIndex Columns. A bug?
Hi,

We are having an issue with TTL on secondary index columns: we get 0 rows in return when running queries on indexed columns that have a TTL. Everything works fine with small amounts of data, but once we get over a certain threshold it looks like older rows disappear from the index. In the example below we create 70 rows with 45k columns each, plus one indexed column with just the row key as value, so we have one row per indexed value. When the script is finished the index contains rows 66-69; rows 0-65 are gone from the index. Using 'indexedColumn' without TTL fixes the problem.

- SCHEMA START -
create keyspace ks123
  with placement_strategy = 'NetworkTopologyStrategy'
  and strategy_options = {datacenter1 : 1}
  and durable_writes = true;

use ks123;

create column family cf1
  with column_type = 'Standard'
  and comparator = 'AsciiType'
  and default_validation_class = 'AsciiType'
  and key_validation_class = 'AsciiType'
  and read_repair_chance = 0.1
  and dclocal_read_repair_chance = 0.0
  and gc_grace = 864000
  and min_compaction_threshold = 4
  and max_compaction_threshold = 32
  and replicate_on_write = true
  and compaction_strategy = 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'
  and caching = 'KEYS_ONLY'
  and column_metadata = [{column_name : 'indexedColumn', validation_class : AsciiType, index_name : 'INDEX1', index_type : 0}]
  and compression_options = {'sstable_compression' : 'org.apache.cassandra.io.compress.SnappyCompressor'};
- SCHEMA FINISH -

- POPULATE START -
from pycassa.batch import Mutator
import pycassa

pool = pycassa.ConnectionPool('ks123')
cf = pycassa.ColumnFamily(pool, 'cf1')
for rowKey in xrange(70):
    b = Mutator(pool)
    for datapoint in xrange(1, 45001):
        b.insert(cf, str(rowKey), {str(datapoint): 'val'}, ttl=7884000)
    b.insert(cf, str(rowKey), {'indexedColumn': str(rowKey)}, ttl=7887600)
    print 'row %d' % rowKey
    b.send()
    b = Mutator(pool)
pool.dispose()
- POPULATE FINISH -

- QUERY START -
[default@ks123] get cf1 where 'indexedColumn'='65';

0 Row Returned.
Elapsed time: 2.38 msec(s).

[default@ks123] get cf1 where 'indexedColumn'='66';
-------------------
RowKey: 66
=> (column=1, value=val, timestamp=1355818765548964, ttl=7884000)
...
=> (column=10087, value=val, timestamp=1355818766075538, ttl=7884000)
=> (column=indexedColumn, value=66, timestamp=1355818768119334, ttl=7887600)

1 Row Returned.
Elapsed time: 31 msec(s).
- QUERY FINISH -

This is all using Cassandra 1.1.7 with default settings.

Best regards,
Alexei Bakanov
Monitoring the number of client connections
Hi!

I want to know how many client connections each of my cluster nodes has (to check whether my load balancing is spreading requests in a balanced way, whether an increase in cluster load can be related to an increase in the number of connections, and things like that). I was thinking about going with netstat, counting ESTABLISHED connections to port 9160, but then I thought maybe there is some way in Cassandra to get that information (maybe a counter of connections in JMX?). I've tried installing MX4J and going over all the MBeans, but I haven't found any with a promising name; they all seem unrelated to this information. And I can't find anything skimming the manual, so... Can you think of a better way than netstat to get this information? Better yet, is there anything similar to SHOW PROCESSLIST in MySQL?

Thanks!

--
Tomàs Núñez
IT-Sysprod, Groupalia
www.groupalia.com
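Lacking an obvious JMX counter in this version, the netstat idea can at least be scripted. A minimal sketch that counts ESTABLISHED connections to the Thrift port per remote host from `netstat -an`-style text (the `sample` output below is hypothetical, for illustration only):

```python
from collections import Counter

def thrift_connections(netstat_output, port=9160):
    """Count ESTABLISHED connections to the given local port,
    grouped by remote host, from `netstat -an`-style output."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) < 6 or fields[5] != 'ESTABLISHED':
            continue
        local, remote = fields[3], fields[4]
        if local.rsplit(':', 1)[-1] == str(port):
            counts[remote.rsplit(':', 1)[0]] += 1
    return counts

# Hypothetical `netstat -an` lines for illustration:
sample = """\
tcp 0 0 10.0.0.1:9160 10.0.0.7:51234 ESTABLISHED
tcp 0 0 10.0.0.1:9160 10.0.0.7:51235 ESTABLISHED
tcp 0 0 10.0.0.1:9160 10.0.0.8:40112 ESTABLISHED
tcp 0 0 10.0.0.1:7199 10.0.0.9:39001 ESTABLISHED"""
print(thrift_connections(sample))
```

Running this per node (e.g. over ssh) would give a per-host connection breakdown, which is roughly the processlist-style view being asked for.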
Re: rpc_timeout exception while inserting
I was trying to mix CQL2 and CQL3 to check whether a column family with a compound key can be further indexed, because in CQL3 secondary indexing on a table with a composite PRIMARY KEY is not possible. Surprisingly, by mixing the CQL versions I was able to do so. But when I want to insert anything into the column family it gives me an rpc_timeout exception. I personally found it quite abnormal, so I thought of posting this in the forum.

Best,

On Mon, Dec 10, 2012 at 6:29 PM, Sylvain Lebresne sylv...@datastax.com wrote:

On Mon, Dec 10, 2012 at 12:36 PM, Abhijit Chanda abhijit.chan...@gmail.com wrote:

Hi All,

I have a column family whose structure is:

CREATE TABLE practice (
  id text,
  name text,
  addr text,
  pin text,
  PRIMARY KEY (id, name)
) WITH comment=''
  AND caching='KEYS_ONLY'
  AND read_repair_chance=0.10
  AND gc_grace_seconds=864000
  AND replicate_on_write='true'
  AND compaction_strategy_class='SizeTieredCompactionStrategy'
  AND compression_parameters:sstable_compression='SnappyCompressor';

CREATE INDEX idx_address ON practice (addr);

Initially I made the column family using CQL 3.0.0. Then for creating the index I used CQL 2.0. Now when I want to insert any data into the column family it always shows a timeout exception:

INSERT INTO practice (id, name, addr, pin) VALUES ('1','AB','kolkata','700052');
Request did not complete within rpc_timeout.

Please suggest where I am going wrong.

That would be creating the index through CQL 2. Why did you use CQL 3 for the CF creation and CQL 2 for the index? If you do both in CQL 3, that should work as expected. That being said, you should probably not get timeouts (it won't do what you want, though). If you look at the server log, do you have an exception there?

--
Sylvain

--
Abhijit Chanda
Analyst
VeHere Interactive Pvt. Ltd.
+91-974395
cassandra-cli + secured JMX = WARNING: Could not connect to the JMX
Hi, for a number of reasons (particularly to prevent accidental access) I've decided to try to secure (slightly) our Cassandra installation(s). To this end I've started by 'securing' client access via code very similar to SimpleAuthenticator (in the examples dir). This, however, has no effect on nodetool operation, which uses JMX. So I went ahead with securing JMX access as well (via -Dcom.sun.management.jmxremote.password.file and -Dcom.sun.management.jmxremote.access.file). This seemed to work well, except that now if I use cassandra-cli (on Windows) to interact with the server, it gives me the following:

apache-cassandra-1.1.6\bin> cassandra-cli -h hostname -u experiment -pw password
Starting Cassandra Client
Connected to: db on hostname/9160
Welcome to Cassandra CLI version 1.1.6

[experiment@unknown] show keyspaces;
WARNING: Could not connect to the JMX on hostname:7199, information won't be shown.

Keyspace: system:
  Replication Strategy: org.apache.cassandra.locator.LocalStrategy
  Durable Writes: true
    Options: [replication_factor:1]
... (rest of keyspace output snipped)

I am concerned about this WARNING. I still seem to get the output (at least for the 'system' keyspace; I don't have anything else there yet), but I'm wondering what problems I can expect and whether there's a way to resolve this WARNING (I'm guessing it has to do with specifying credentials for JMX in addition to the credentials used for client access)?

Thanks in advance,
Sergey
Re: rpc_timeout exception while inserting
CQL2 and CQL3 indexes are not compatible. I would guess CQL2 is able to detect that the table was defined in CQL3 and probably should not allow this. Backwards compatibility is something the storage engine and interfaces have to account for; at the least, they should prevent you from hurting yourself. But don't try to defeat the system: just stick with one CQL version.

On Tue, Dec 18, 2012 at 7:37 AM, Abhijit Chanda abhijit.chan...@gmail.com wrote:

I was trying to mix CQL2 and CQL3 to check whether a column family with a compound key can be further indexed, because in CQL3 secondary indexing on a table with a composite PRIMARY KEY is not possible. Surprisingly, by mixing the CQL versions I was able to do so. But when I want to insert anything into the column family it gives me an rpc_timeout exception. I personally found it quite abnormal, so I thought of posting this in the forum.

Best,

On Mon, Dec 10, 2012 at 6:29 PM, Sylvain Lebresne sylv...@datastax.com wrote:

On Mon, Dec 10, 2012 at 12:36 PM, Abhijit Chanda abhijit.chan...@gmail.com wrote:

Hi All,

I have a column family whose structure is:

CREATE TABLE practice (
  id text,
  name text,
  addr text,
  pin text,
  PRIMARY KEY (id, name)
) WITH comment=''
  AND caching='KEYS_ONLY'
  AND read_repair_chance=0.10
  AND gc_grace_seconds=864000
  AND replicate_on_write='true'
  AND compaction_strategy_class='SizeTieredCompactionStrategy'
  AND compression_parameters:sstable_compression='SnappyCompressor';

CREATE INDEX idx_address ON practice (addr);

Initially I made the column family using CQL 3.0.0. Then for creating the index I used CQL 2.0. Now when I want to insert any data into the column family it always shows a timeout exception:

INSERT INTO practice (id, name, addr, pin) VALUES ('1','AB','kolkata','700052');
Request did not complete within rpc_timeout.

Please suggest where I am going wrong.

That would be creating the index through CQL 2. Why did you use CQL 3 for the CF creation and CQL 2 for the index? If you do both in CQL 3, that should work as expected. That being said, you should probably not get timeouts (it won't do what you want, though). If you look at the server log, do you have an exception there?

--
Sylvain

--
Abhijit Chanda
Analyst
VeHere Interactive Pvt. Ltd.
+91-974395
Partition maintenance
Hi folks. Still working through the details of building out a Cassandra solution, and I have an interesting requirement that I'm not sure how to implement in Cassandra: in our current Oracle world, the data for this system is partitioned by month, and each month the data that is now 18 months old is archived to tape/cold storage and then the partition for that month is dropped. Is there a way to do something similar with Cassandra without destroying our overall performance?

Thanks in advance,
Steve
Re: Partition maintenance
You could make a column family for each period of time and then drop the column family when you want to destroy it. Before you drop it you could use the sstable2json converter and write the JSON files out to tape. This might make your life difficult, however, if you need an input split for map/reduce spanning time periods, because you would be limited to working on one column family at a time.

On Dec 18, 2012, at 8:09 AM, stephen.m.thomp...@wellsfargo.com wrote:

Hi folks. Still working through the details of building out a Cassandra solution, and I have an interesting requirement that I'm not sure how to implement in Cassandra: in our current Oracle world, the data for this system is partitioned by month, and each month the data that is now 18 months old is archived to tape/cold storage and then the partition for that month is dropped. Is there a way to do something similar with Cassandra without destroying our overall performance?

Thanks in advance,
Steve

--
Join Barracuda Networks in the fight against hunger. To learn how you can help in your community, please visit: http://on.fb.me/UAdL4f
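The dump-to-tape step could be scripted around sstable2json, which writes JSON to stdout. A sketch that only builds the commands (directory layout and file names below are illustrative assumptions, not taken from the thread):

```python
import os

def archive_jobs(cf_dir, sstable_names, out_dir):
    """Pair each SSTable's `sstable2json` command with the JSON file
    its stdout should be redirected to."""
    jobs = []
    for name in sorted(sstable_names):
        if name.endswith('-Data.db'):
            out_file = os.path.join(out_dir, name.replace('-Data.db', '.json'))
            jobs.append((['sstable2json', os.path.join(cf_dir, name)], out_file))
    return jobs

# Hypothetical SSTable file names for a CF named 'events':
names = ['events-hf-1-Data.db', 'events-hf-1-Index.db', 'events-hf-2-Data.db']
jobs = archive_jobs('/var/lib/cassandra/data/ks1/events', names, '/mnt/archive')
```

Only the `-Data.db` components are passed to the tool; each resulting JSON file would then be written to tape before the column family is dropped.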
Re: Partition maintenance
If I'm understanding you correctly, you can set TTLs on each insert. 18 months would be roughly 540 days, which is 46,656,000 seconds. I've not tried a number that large, but I use smaller TTLs all the time and they work fine. Once columns expire they get tombstones and are no longer searchable; space is reclaimed as with any tombstone.

--Chris

On Dec 18, 2012, at 11:08 AM, stephen.m.thomp...@wellsfargo.com wrote:

Hi folks. Still working through the details of building out a Cassandra solution, and I have an interesting requirement that I'm not sure how to implement in Cassandra: in our current Oracle world, the data for this system is partitioned by month, and each month the data that is now 18 months old is archived to tape/cold storage and then the partition for that month is dropped. Is there a way to do something similar with Cassandra without destroying our overall performance?

Thanks in advance,
Steve

--
The downside of being better than everyone else is that people tend to assume you're pretentious.
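The arithmetic checks out, and the value would simply be passed as the TTL on each insert (the pycassa-style call in the comment is illustrative; the names are made up):

```python
# 18 months ~= 540 days; Cassandra TTLs are specified in seconds.
SECONDS_PER_DAY = 24 * 60 * 60
ttl_18_months = 540 * SECONDS_PER_DAY

# With a pycassa ColumnFamily this would be applied per insert, e.g.:
# cf.insert(row_key, {column_name: value}, ttl=ttl_18_months)
print(ttl_18_months)
```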
Re: Partition maintenance
My understanding was that TTLs apply only to columns, not on a per-row basis. This means that for each column insert you would need to set that TTL. Does this mean that the extra data space used in such a case would be the TTL overhead times the number of columns? I was hoping there was a way to set a row TTL. See this older post: http://comments.gmane.org/gmane.comp.db.cassandra.user/12701

From: Christopher Keller cnkel...@gmail.com
Reply-To: user@cassandra.apache.org
Date: Tuesday, December 18, 2012 11:16 AM
To: user@cassandra.apache.org
Subject: Re: Partition maintenance

If I'm understanding you correctly, you can set TTLs on each insert. 18 months would be roughly 540 days, which is 46,656,000 seconds. I've not tried a number that large, but I use smaller TTLs all the time and they work fine. Once columns expire they get tombstones and are no longer searchable; space is reclaimed as with any tombstone.

--Chris

On Dec 18, 2012, at 11:08 AM, stephen.m.thomp...@wellsfargo.com wrote:

Hi folks. Still working through the details of building out a Cassandra solution, and I have an interesting requirement that I'm not sure how to implement in Cassandra: in our current Oracle world, the data for this system is partitioned by month, and each month the data that is now 18 months old is archived to tape/cold storage and then the partition for that month is dropped. Is there a way to do something similar with Cassandra without destroying our overall performance?

Thanks in advance,
Steve
RE: Partition maintenance
No way to read the taped data with TTL later - it will disappear from the tapes :)

Best regards / Pagarbiai
Viktor Jevdokimov
Senior Developer, Adform

From: Keith Wright [mailto:kwri...@nanigans.com]
Sent: Tuesday, December 18, 2012 18:33
To: user@cassandra.apache.org
Subject: Re: Partition maintenance

My understanding was that TTLs apply only to columns, not on a per-row basis. This means that for each column insert you would need to set that TTL. I was hoping there was a way to set a row TTL. See this older post: http://comments.gmane.org/gmane.comp.db.cassandra.user/12701

From: Christopher Keller cnkel...@gmail.com
Date: Tuesday, December 18, 2012 11:16 AM
To: user@cassandra.apache.org
Subject: Re: Partition maintenance

If I'm understanding you correctly, you can set TTLs on each insert. 18 months would be roughly 540 days, which is 46,656,000 seconds. I've not tried a number that large, but I use smaller TTLs all the time and they work fine. Once columns expire they get tombstones and are no longer searchable; space is reclaimed as with any tombstone.

--Chris

On Dec 18, 2012, at 11:08 AM, stephen.m.thomp...@wellsfargo.com wrote:

Hi folks. Still working through the details of building out a Cassandra solution, and I have an interesting requirement that I'm not sure how to implement in Cassandra: in our current Oracle world, the data for this system is partitioned by month, and each month the data that is now 18 months old is archived to tape/cold storage and then the partition for that month is dropped. Is there a way to do something similar with Cassandra without destroying our overall performance?

Thanks in advance,
Steve
RE: Partition maintenance
Michael - That is one approach I have considered, but it also makes querying the system particularly onerous, since every column family would require its own query - I don't think there is any good way to "join" those, right?

Chris - That is an interesting concept, but as Viktor and Keith note, it seems to have problems. Could we do this simply with mass deletes? For example, if I created a column which was just /MM, then during our maintenance we could spool off the records that match the month we are archiving, then do a bulk delete by that key. We would need a secondary index for that, I would assume.

From: Michael Kjellman [mailto:mkjell...@barracuda.com]
Sent: Tuesday, December 18, 2012 11:15 AM
To: user@cassandra.apache.org
Subject: Re: Partition maintenance

You could make a column family for each period of time and then drop the column family when you want to destroy it. Before you drop it you could use the sstable2json converter and write the JSON files out to tape. This might make your life difficult, however, if you need an input split for map/reduce spanning time periods, because you would be limited to working on one column family at a time.

On Dec 18, 2012, at 8:09 AM, stephen.m.thomp...@wellsfargo.com wrote:

Hi folks. Still working through the details of building out a Cassandra solution, and I have an interesting requirement that I'm not sure how to implement in Cassandra: in our current Oracle world, the data for this system is partitioned by month, and each month the data that is now 18 months old is archived to tape/cold storage and then the partition for that month is dropped. Is there a way to do something similar with Cassandra without destroying our overall performance?

Thanks in advance,
Steve
Re: Partition maintenance
Just make the month timestamp part of the row key. Then once a month, select the old data, move it, and delete it.

Andrey

On Tue, Dec 18, 2012 at 8:08 AM, stephen.m.thomp...@wellsfargo.com wrote:

Hi folks. Still working through the details of building out a Cassandra solution, and I have an interesting requirement that I'm not sure how to implement in Cassandra: in our current Oracle world, the data for this system is partitioned by month, and each month the data that is now 18 months old is archived to tape/cold storage and then the partition for that month is dropped. Is there a way to do something similar with Cassandra without destroying our overall performance?

Thanks in advance,
Steve
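A sketch of that bucketing idea (the entity names are made up for illustration): with the month in the row key, one month's worth of data can be addressed, archived, and deleted by enumerating the keys for that bucket.

```python
from datetime import date

def month_key(entity_id, day):
    """Compose a row key that buckets an entity's data by month."""
    return '%s:%04d-%02d' % (entity_id, day.year, day.month)

def keys_for_month(entity_ids, year, month):
    """Row keys to archive and then delete for one month's bucket."""
    return [month_key(e, date(year, month, 1)) for e in entity_ids]

print(month_key('sensor42', date(2012, 12, 18)))
```

The monthly maintenance job would read each key in `keys_for_month(...)` for the expiring month, write the data to cold storage, and then issue row deletes for those keys.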
Re: Partition maintenance
Yeah, no JOINs in Cassandra as of now. What if you dumped the CF in question to JSON once a month, and rewrote each record in the JSON data that met the timestamp you were interested in archiving? You could then bulk load each month back in if you had to restore. It doesn't help with deletes, though, and I would advise against large mass-delete operations each month - they tend to lead to a very unhappy cluster.

On Dec 18, 2012, at 9:23 AM, stephen.m.thomp...@wellsfargo.com wrote:

Michael - That is one approach I have considered, but it also makes querying the system particularly onerous, since every column family would require its own query - I don't think there is any good way to "join" those, right?

Chris - That is an interesting concept, but as Viktor and Keith note, it seems to have problems. Could we do this simply with mass deletes? For example, if I created a column which was just /MM, then during our maintenance we could spool off the records that match the month we are archiving, then do a bulk delete by that key. We would need a secondary index for that, I would assume.

From: Michael Kjellman [mailto:mkjell...@barracuda.com]
Sent: Tuesday, December 18, 2012 11:15 AM
To: user@cassandra.apache.org
Subject: Re: Partition maintenance

You could make a column family for each period of time and then drop the column family when you want to destroy it. Before you drop it you could use the sstable2json converter and write the JSON files out to tape. This might make your life difficult, however, if you need an input split for map/reduce spanning time periods, because you would be limited to working on one column family at a time.

On Dec 18, 2012, at 8:09 AM, stephen.m.thomp...@wellsfargo.com wrote:

Hi folks. Still working through the details of building out a Cassandra solution, and I have an interesting requirement that I'm not sure how to implement in Cassandra: in our current Oracle world, the data for this system is partitioned by month, and each month the data that is now 18 months old is archived to tape/cold storage and then the partition for that month is dropped. Is there a way to do something similar with Cassandra without destroying our overall performance?

Thanks in advance,
Steve
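The dump-and-filter step could be sketched like this; note the flat `{"key": ..., "ts": ...}` record layout is an assumption for illustration, not the actual sstable2json output format:

```python
import json

def split_archive(records, cutoff_ts):
    """Partition dumped records into (keep, archive) by timestamp:
    anything older than the cutoff goes to the archive pile."""
    keep, archive = [], []
    for rec in records:
        (archive if rec['ts'] < cutoff_ts else keep).append(rec)
    return keep, archive

# A tiny invented dump to demonstrate the split:
dump = json.loads('[{"key": "a", "ts": 100}, {"key": "b", "ts": 900}]')
keep, archive = split_archive(dump, cutoff_ts=500)
```

The `archive` list would be written to tape, and the `keep` list bulk-loaded back in on restore.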
Re: entire range of node out of sync -- out of the blue
In your data directory, for each column family using leveled compaction there is a <cfname>.json manifest where Cassandra stores the SSTables it knows about. Take a look at that file and see if it looks accurate. If not, this is a bug with Cassandra that we are checking into as well.

On Thu, Dec 6, 2012 at 7:38 PM, aaron morton aa...@thelastpickle.com wrote:

The log message matches what I would expect to see for nodetool -pr. Not using -pr means repairing all the ranges the node is a replica for. If you have RF == number of nodes, then it will repair all the data.

Cheers
-
Aaron Morton
Freelance Cassandra Developer
New Zealand
@aaronmorton
http://www.thelastpickle.com

On 6/12/2012, at 9:42 PM, Andras Szerdahelyi andras.szerdahe...@ignitionone.com wrote:

Thanks! I'm also thinking a repair run without -pr could have caused this, maybe?

Andras Szerdahelyi
Solutions Architect, IgnitionOne | 1831 Diegem E.Mommaertslaan 20A
M: +32 493 05 50 88 | Skype: sandrew84

On 06 Dec 2012, at 04:05, aaron morton aa...@thelastpickle.com wrote:

- how do I stop repair before I run out of storage? ( can't let this finish )

To stop the validation part of the repair: nodetool -h localhost stop VALIDATION. The only way I know to stop streaming is to restart the node; there may be a better way, though.

INFO [AntiEntropySessions:3] 2012-12-05 02:15:02,301 AntiEntropyService.java (line 666) [repair #7c7665c0-3eab-11e2--dae6667065ff] new session: will sync /X.X.1.113, /X.X.0.71 on range (85070591730234615865843651857942052964,0] for ( .. )

I am assuming this was run on the first node in DC west with -pr, as you said. The log message is saying this is going to repair the primary range for the node. The repair is then actually performed one CF at a time. You should also see log messages ending with "range(s) out of sync", which will say how out of sync the data is.

- how do I clean up my sstables ( grew from 6k to 20k since this started, while I shut writes off completely )

Sounds like repair is streaming a lot of differences. If you have the space, I would give leveled compaction time to take care of it.

Hope that helps.

- Aaron Morton
Freelance Cassandra Developer
New Zealand
@aaronmorton
http://www.thelastpickle.com

On 6/12/2012, at 1:32 AM, Andras Szerdahelyi andras.szerdahe...@ignitionone.com wrote:

hi list,

AntiEntropyService started syncing ranges of entire nodes ( ?! ) across my data centers and I'd like to understand why. I see log lines like this on all my nodes in my two ( east/west ) data centres:

INFO [AntiEntropySessions:3] 2012-12-05 02:15:02,301 AntiEntropyService.java (line 666) [repair #7c7665c0-3eab-11e2--dae6667065ff] new session: will sync /X.X.1.113, /X.X.0.71 on range (85070591730234615865843651857942052964,0] for ( .. )

( this is around 80-100 GB of data for a single node. )

- I did not observe any network failures or nodes falling off the ring
- good distribution of data ( load is equal on all nodes )
- hinted handoff is on
- read repair chance is 0.1 on the CF
- 2 replicas in each data centre ( which is also the number of nodes in each ) with NetworkTopologyStrategy
- repair -pr is scheduled to run off-peak hours, daily
- leveled compaction with sstable max size 256MB ( I have found this to trigger compaction at acceptable intervals while still keeping the sstable count down )
- I am on 1.1.6
- java heap 10G
- max memtables 2G
- 1G row cache
- 256M key cache

my nodes' ranges are:
DC west
0
85070591730234615865843651857942052864
DC east
100
85070591730234615865843651857942052964

symptoms are:
- logs show sstables being streamed over to other nodes
- 140k files in the data dir of the CF on all nodes
- cfstats reports 20k sstables, up from 6k, on all nodes
- compaction continuously running with no results whatsoever ( number of sstables growing )

I tried the following:
- offline scrub ( went OOM; I noticed the script in the debian package specifies a 256MB heap? )
- online scrub ( no effect )
- repair ( no effect )
- cleanup ( no effect )

my questions are:
- how do I stop repair before I run out of storage? ( can't let this finish )
- how do I clean up my sstables ( grew from 6k to 20k since this started, while I shut writes off completely )

thanks,
Andras

Andras Szerdahelyi
Solutions Architect, IgnitionOne | 1831 Diegem E.Mommaertslaan 20A
M: +32 493 05 50 88 | Skype: sandrew84
Re: Read operations resulting in a write?
AFAIK there is no way to disable hoisting. Feel free to let your JIRA fingers do the talking.

Cheers
-
Aaron Morton
Freelance Cassandra Developer
New Zealand
@aaronmorton
http://www.thelastpickle.com

On 18/12/2012, at 6:10 PM, Edward Capriolo edlinuxg...@gmail.com wrote:

Is there a way to turn this on and off through configuration? I am not necessarily sure I would want this feature. Also, it is confusing that these writes show up in JMX and look like user-generated write operations.

On Mon, Dec 17, 2012 at 10:01 AM, Mike mthero...@yahoo.com wrote:

Thank you Aaron, this was very helpful. Could it be an issue that this optimization does not really take effect until the memtable with the hoisted data is flushed? In my simple example below, the same row is updated, and multiple selects of the same row will result in multiple writes to the memtable. It seems it may be possible (although unlikely) that, if you go from a write-mostly to a read-mostly scenario, you could get into a state where you are stuck rewriting to the same memtable, and the memtable is not flushed because it absorbs the over-writes. I can foresee this especially if you are reading the same rows repeatedly.

I also noticed from the code paths that if row caching is enabled, this optimization will not occur. We made some changes this weekend to make this column family more suitable for row caching and enabled row caching with a small cache. Our initial result is that it seems to have corrected the write counts, and it has increased performance quite a bit. However, are there any hidden gotchas there because this optimization is not occurring?

https://issues.apache.org/jira/browse/CASSANDRA-2503 mentions a "compaction is behind" problem. Any history on that? I couldn't find much information on it.

Thanks,
-Mike

On 12/16/2012 8:41 PM, aaron morton wrote:

1) Am I reading things correctly?

Yes.
If you do a read/slice by name and more than min-compaction-threshold SSTables were read, the data is re-written so that the next read uses fewer SSTables.

2) What is really happening here? Essentially, minor compactions can occur between 4 and 32 memtable flushes. Looking through the code, this seems to affect only a couple of types of select statements (selecting a specific column on a specific key being one of them). During the time between these two values, every select statement will perform a write.

Yup, only for reading a row where the column names are specified. Remember, minor compaction when using SizeTiered compaction (the default) works on buckets of the same size. Imagine a row that has been around for a while and has fragments in more than min-compaction-threshold SSTables. Say it is in 3 SSTables in the 2nd tier and 2 SSTables in the 1st, so it takes (potentially) 5 SSTable reads. If this row is read it will get hoisted back up. But if the row is in only 1 SSTable in the 2nd tier and 2 in the 1st tier, it will not be hoisted. There are a few short circuits in the slice-by-name read path. One of them is to end the search when we know that no other SSTables contain columns that should be considered. So if the 4 columns you read frequently are hoisted into the 1st bucket, your reads will be handled by that one bucket. It's not every select, just those that touched more than min-compaction-threshold SSTables.

3) Is this desired behavior? Is there something else I should be looking at that could be causing this behavior?

Yes. https://issues.apache.org/jira/browse/CASSANDRA-2503

Cheers
-
Aaron Morton
Freelance Cassandra Developer
New Zealand
@aaronmorton
http://www.thelastpickle.com

On 15/12/2012, at 12:58 PM, Michael Theroux mthero...@yahoo.com wrote:

Hello,

We have an unusual situation that I believe I've reproduced, at least temporarily, in a test environment. I also think I see where this issue is occurring in the code.

We have a specific column family that is under heavy read and write load on a nightly basis. For the purposes of this description, I'll refer to this column family as Bob. During this nightly processing, sometimes Bob is under very heavy write load, other times very heavy read load. The application is such that when something is written to Bob, a write is made to one of two other tables. We've witnessed a situation where the write count on Bob far outstrips the write count on either of the other tables, by a factor of 3-10. This is based on the WriteCount available on the column family JMX MBean. We have not been able to find where in our code this is happening, and we have gone as far as tracing our CQL calls to determine that the relationship between Bob and the other tables is what we expect. I brought up a test node to experiment, and saw a situation where, when a select statement is executed, a write will occur.
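The hoisting behaviour discussed in this thread can be illustrated with a toy model (this is not Cassandra's actual code, just a sketch of the mechanism): a read by column name that touches more than min-compaction-threshold SSTable fragments rewrites the row into the memtable, producing an internally generated write, and the next read touches a single fragment.

```python
MIN_COMPACTION_THRESHOLD = 4

class ToyStore:
    """Toy model: each row key maps to a count of SSTable fragments."""
    def __init__(self, fragments):
        self.fragments = dict(fragments)  # row key -> fragment count
        self.write_count = 0              # JMX-style WriteCount stand-in

    def read_by_name(self, key):
        """Read named columns; hoist the row if too many fragments
        were touched, so the next read is cheaper."""
        touched = self.fragments[key]
        if touched > MIN_COMPACTION_THRESHOLD:
            self.write_count += 1     # internally generated write
            self.fragments[key] = 1   # row now lives in one place
        return touched

store = ToyStore({'bob-row': 5, 'other-row': 2})
store.read_by_name('bob-row')   # touches 5 fragments, so it is hoisted
store.read_by_name('bob-row')   # now touches only 1; no further write
```

This matches the symptom in the thread: selects against a fragmented row inflate the CF's write count even though the application issues no extra writes.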
Re: Data Model Review
I have heard it best to try and avoid the use of super columns for now. Yup. Your model makes sense. If you are creating the CF using the cassandra-cli you will probably want to reverse order the column names, see http://thelastpickle.com/2011/10/03/Reverse-Comparators/ If you want to use CQL 3 you could do something like this:

CREATE TABLE InstagramPhotos (
  user_name text,
  photo_seq timestamp,
  meta_1 text,
  meta_2 text,
  PRIMARY KEY (user_name, photo_seq)
);

That's pretty much the same. user_name is the row key, and photo_seq will be used as part of a composite column name internally. (You can do the same thing without CQL, just look up composite columns.) You can do something similar for the annotations. Depending on your use case I would use UNIX epoch time if possible rather than a time uuid. Hope that helps. - Aaron Morton Freelance Cassandra Developer New Zealand @aaronmorton http://www.thelastpickle.com On 18/12/2012, at 4:35 AM, Adam Venturella aventure...@gmail.com wrote: My use case is capturing some information about Instagram photos from the API. I have 2 use cases. One, I need to capture all of the media data for an account, and two, I need to be able to privately annotate that data. There is some nuance in this, multiple http queries for example, but ignoring that, and assuming I have obtained all of the data surrounding an account's photos, here is how I was thinking of storing that information for use case 1. ColumnFamily: InstagramPhotos Row Key: account_username Columns: Column Name: date_posted_timestamp Column Value: JSON representing the data for the individual photo (filter, comments, likes etc, not the binary photo data). So the idea would be to keep adding columns to the row that contain that serialized data (in JSON) with their timestamps as the name. Timestamps as the column names, I figure, should help to perform range queries, where I make the 1st column inserted the earliest timestamp and the last column inserted the most recent.
I could probably also use TimeUUIDs here as well since I will have things ordered prior to inserting. The question here: does this approach make sense? Is it common to store JSON in columns like this? I know there are super columns as well, so I could use those I suppose instead of JSON. The extra level of indexing would probably be useful to query specific photos for use case 2. I have heard it best to try and avoid the use of super columns for now. I have no information to back that claim up other than some time spent in the IRC. So feel free to debunk that statement if it is false. So that is use case one; use case two covers the private annotations. I figured here: ColumnFamily: InstagramAnnotations Row Key: Canonical Media Id Column Name: TimeUUID Column Value: JSON representing an annotation/internal comment. Writing out the above I can actually see where I might need to tighten some things up around how I store the photos. I am clearly missing an obvious connection between the InstagramPhotos and the InstagramAnnotations; maybe super columns would help with the photos instead of JSON? Otherwise I would need to build an index row where I tie the canonical photo id to a timestamp (column name) in the InstagramPhotos. I could also try to figure out how to make a TimeUUID of my own that can double as the media's canonical id, or further look at Instagram's canonical id for photos and see if it already counts up. In which case I could use that in place of a timestamp. Anyway, I figured I would see if anyone might help flesh out other potential pitfalls in the above. I am definitely new to cassandra and I am using this project as a way to learn some more about assembling systems using it.
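The layout discussed in this thread (one row per account, epoch-timestamp column names, JSON column values) can be sketched without touching a cluster. This is an in-memory model of the range-query idea only, not pycassa's API; the function names are made up:

```python
import json
from bisect import bisect_left, bisect_right

# Model of the proposed row: a mapping of epoch-timestamp column name ->
# JSON-serialized photo metadata. In Cassandra a LongType comparator would
# keep the columns sorted; here we sort the keys ourselves.

def add_photo(row, posted_epoch, photo_meta):
    """Store a photo's metadata as a JSON-valued column keyed by timestamp."""
    row[posted_epoch] = json.dumps(photo_meta)

def slice_photos(row, start_epoch, end_epoch):
    """Range query over column names, like a column_start/column_finish slice."""
    names = sorted(row)
    lo = bisect_left(names, start_epoch)
    hi = bisect_right(names, end_epoch)
    return [json.loads(row[n]) for n in names[lo:hi]]
```

The same shape falls out of the CQL 3 table above: `user_name` picks the row, and a `WHERE photo_seq >= x AND photo_seq <= y` clause does what `slice_photos` does here.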
Re: TTL on SecondaryIndex Columns. A bug?
Thanks for the nice steps to reproduce. I ran this on my MBP using C* 1.1.7 and got the expected results; both gets returned a row. Were you running against a single node or a cluster? If a cluster, did you change the CL? cassandra-cli defaults to ONE. Cheers - Aaron Morton Freelance Cassandra Developer New Zealand @aaronmorton http://www.thelastpickle.com On 18/12/2012, at 9:44 PM, Alexei Bakanov russ...@gmail.com wrote: Hi, We are having an issue with TTL on secondary index columns. We get 0 rows in return when running queries on indexed columns that have TTL. Everything works fine with small amounts of data, but when we get over a certain threshold it looks like older rows disappear from the index. In the example below we create 70 rows with 45k columns each + one indexed column with just the rowkey as value, so we have one row per indexed value. When the script is finished the index contains rows 66-69. Rows 0-65 are gone from the index. Using 'indexedColumn' without TTL fixes the problem.
- SCHEMA START -
create keyspace ks123
  with placement_strategy = 'NetworkTopologyStrategy'
  and strategy_options = {datacenter1 : 1}
  and durable_writes = true;

use ks123;

create column family cf1
  with column_type = 'Standard'
  and comparator = 'AsciiType'
  and default_validation_class = 'AsciiType'
  and key_validation_class = 'AsciiType'
  and read_repair_chance = 0.1
  and dclocal_read_repair_chance = 0.0
  and gc_grace = 864000
  and min_compaction_threshold = 4
  and max_compaction_threshold = 32
  and replicate_on_write = true
  and compaction_strategy = 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'
  and caching = 'KEYS_ONLY'
  and column_metadata = [{column_name : 'indexedColumn', validation_class : AsciiType, index_name : 'INDEX1', index_type : 0}]
  and compression_options = {'sstable_compression' : 'org.apache.cassandra.io.compress.SnappyCompressor'};
- SCHEMA FINISH -

- POPULATE START -
from pycassa.batch import Mutator
import pycassa

pool = pycassa.ConnectionPool('ks123')
cf = pycassa.ColumnFamily(pool, 'cf1')
for rowKey in xrange(70):
    b = Mutator(pool)
    for datapoint in xrange(1, 45001):
        b.insert(cf, str(rowKey), {str(datapoint): 'val'}, ttl=7884000)
    b.insert(cf, str(rowKey), {'indexedColumn': str(rowKey)}, ttl=7887600)
    print 'row %d' % rowKey
    b.send()
pool.dispose()
- POPULATE FINISH -

- QUERY START -
[default@ks123] get cf1 where 'indexedColumn'='65';
0 Row Returned.
Elapsed time: 2.38 msec(s).
[default@ks123] get cf1 where 'indexedColumn'='66';
-------------------
RowKey: 66
=> (column=1, value=val, timestamp=1355818765548964, ttl=7884000)
...
=> (column=10087, value=val, timestamp=1355818766075538, ttl=7884000)
=> (column=indexedColumn, value=66, timestamp=1355818768119334, ttl=7887600)
1 Row Returned.
Elapsed time: 31 msec(s).
- QUERY FINISH -

This is all using Cassandra 1.1.7 with default settings. Best regards, Alexei Bakanov
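For reference, the read path being exercised here can be modeled in a few lines. A secondary index is effectively a hidden table mapping indexed value to row keys, and its entries carry the TTL of the indexed column; if an index entry expires (or is otherwise lost, as in the repro), the data row becomes unreachable through the index even though it still exists. This is an illustrative in-memory model only, not Cassandra internals:

```python
# Model: index = {indexed_value: {row_key: expiry_epoch}}. A lookup skips
# expired entries. The bug reported above looks like entries vanishing
# long before their expiry, i.e. the "now=15" case firing at "now=5".

def index_insert(index, value, row_key, ttl, now):
    """Record that row_key carries the indexed value, expiring at now + ttl."""
    index.setdefault(value, {})[row_key] = now + ttl

def index_lookup(index, value, now):
    """Return live row keys for an indexed value, dropping expired entries."""
    return [k for k, expiry in index.get(value, {}).items() if expiry > now]
```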
Re: Monitoring the number of client connections
AFAIK the count of connections is not exposed. Cheers - Aaron Morton Freelance Cassandra Developer New Zealand @aaronmorton http://www.thelastpickle.com On 18/12/2012, at 10:37 PM, Tomas Nunez tomas.nu...@groupalia.com wrote: Hi! I want to know how many client connections each one of my cluster nodes has (to check if my load balancing is spreading in a balanced way, to check if an increase in the cluster load can be related to an increase in the number of connections, and things like that). I was thinking about going with netstat, counting ESTABLISHED connections to port 9160, but then I thought maybe there is some way in cassandra to get that information (maybe a counter of connections in the JMX?). I've tried installing MX4J and going over all MBeans, but I haven't found any with a promising name; they all seem unrelated to this information. And I can't find anything skimming the manual, so... Can you think of a better way than netstat to get this information? Better yet, is there anything similar to SHOW PROCESSLIST in mysql? Thanks! -- www.groupalia.com Tomàs Núñez IT-Sysprod Tel. + 34 93 159 31 00 Fax. + 34 93 396 18 52 Llull, 95-97, 2º planta, 08005 Barcelona Skype: tomas.nunez.groupalia tomas.nu...@groupalia.com
Re: Monitoring the number of client connections
netstat + cron is your friend at this point in time. On Dec 18, 2012, at 8:25 PM, aaron morton aa...@thelastpickle.com wrote: AFAIK the count of connections is not exposed. Cheers - Aaron Morton Freelance Cassandra Developer New Zealand @aaronmorton http://www.thelastpickle.com
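A minimal, cron-able sketch of the netstat approach. It assumes Linux `netstat -tn` output columns (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State); column positions may differ on other platforms, and the helper name is made up:

```python
# Count established Thrift client connections by parsing `netstat -tn`
# output, filtering on the local endpoint's port (9160 by default).

def count_established(netstat_lines, port=9160):
    """Count ESTABLISHED connections whose local address ends in :<port>."""
    count = 0
    for line in netstat_lines:
        fields = line.split()
        # Expected row shape: Proto Recv-Q Send-Q Local-Addr Foreign-Addr State
        if (len(fields) >= 6
                and fields[5] == 'ESTABLISHED'
                and fields[3].endswith(':%d' % port)):
            count += 1
    return count

# Usage from cron, roughly:
#   import subprocess
#   out = subprocess.check_output(['netstat', '-tn']).decode().splitlines()
#   print(count_established(out))
```

Filtering on the local address (not the foreign one) avoids counting this node's own outbound connections to other nodes' port 9160.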
Exception on running nodetool in windows
Hi, I have been trying to run Cassandra on Windows (1.1.6) and I am getting an exception while checking my node status. nodetool ring -h localhost ClassCastException: cannot convert java.lang.String to some other Java type (I don't remember the exact class). But somehow, this is fine on Ubuntu. Any idea? -Vivek