Schema size
Hi All, I have a question about schema cleaning in Cassandra. I am using Cassandra 1.0.9, with 5 keyspaces and about 1,500 column families per keyspace. After dynamically creating and deleting CFs, the schema's SSTables have grown very large: the Migrations SSTables are 45 GB and the Schema SSTables are another 45 GB, whereas all of the actual data is only about 10 GB. Why did the schema grow so fast? And how can I clean up the schema, or will it eventually clean itself up? -- Best regards, Alexander
Schema changes not getting picked up from different process
Hi all, This is my first message on this mailing list, so I'm sorry if I am breaking any rules. I wanted to report a problem I'm having with Cassandra. Short version: if I make changes to the schema from within one process, they do not get picked up by the other processes connected to the Cassandra cluster unless I trigger a reconnect. Long version:

Process 1: cassandra-cli connected to the cluster and keyspace.
Process 2: cassandra-cli connected to the same cluster and keyspace.
From within process 1: create column family test;
From within process 2: describe test; - fails with an error (other query/insert methods fail as well).

I'm not sure if this is indeed a bug or just a misunderstanding on my part. Regards, Victor
Re: Schema changes not getting picked up from different process
What version are you using? It might be related to https://issues.apache.org/jira/browse/CASSANDRA-4052

On 05/25/2012 07:32 AM, Victor Blaga wrote: [quoted message snipped]
Re: Schema changes not getting picked up from different process
Hi Dave, Thank you for your answer.

2012/5/25 Dave Brosius dbros...@mebigfatguy.com:
> What version are you using?
I am using version 1.1.0.
> It might be related to https://issues.apache.org/jira/browse/CASSANDRA-4052
Indeed, the issue you suggested points in the direction of my problem. However, things are a little more complex. I used cassandra-cli just for this example; I'm seeing the same behavior from other clients (Python and Ruby scripts). Basically I'm modifying the schema through the Ruby script and trying to query and insert data through the Python script. Both scripts are meant to run forever (as daemons), so they each establish a connection to Cassandra once at start, which is then kept alive. I can see from the comments on the issue that keeping a long-lived connection to the cluster might not be ideal, and that it would probably be better to reconnect before executing a set of queries.
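To make that workaround concrete, here is a minimal sketch of reconnecting before schema-dependent operations, assuming pycassa; the keyspace name and host are hypothetical:

import pycassa

SERVERS = ['127.0.0.1:9160']
pool = pycassa.ConnectionPool('my_keyspace', server_list=SERVERS)

def refresh_pool():
    # Tear down the long-lived connections so the next operation
    # picks up schema changes made by the other process.
    global pool
    pool.dispose()
    pool = pycassa.ConnectionPool('my_keyspace', server_list=SERVERS)

# After the other process has created the column family:
refresh_pool()
cf = pycassa.ColumnFamily(pool, 'test')  # now resolves instead of failing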
Re: Schema size
In Cassandra 1.1 the schema is no longer a full migration history. Before 1.1, every schema change was recorded in the Migrations table, and all of them had to be replayed when a node bootstrapped. Also, 1.1 has some bugs at the moment, which means you should not switch to it yet. People tend to say things like "with Cassandra X you can now have more CFs and keyspaces", but back in the day the thinking was "you only need one". Hector has a concept of virtual keyspaces, which is a model likely to give you more success than 1,500 CFs. I am not a fan of dynamically creating and tearing down CFs on the fly, or of one-CF-per-customer designs. Once the wrinkles come out of 1.1 and you upgrade, the schema size should shrink.

On Fri, May 25, 2012 at 6:25 AM, Sasha Yanushkevich yanus...@gmail.com wrote: [quoted message snipped]
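For readers unfamiliar with the virtual-keyspace idea: it is essentially key prefixing, so many tenants share one physical column family instead of getting a CF each, and the schema (and its migration history) stays small. This is not Hector's actual API, just a minimal Python sketch of the pattern (all names hypothetical):

import pycassa

pool = pycassa.ConnectionPool('shared_ks', server_list=['127.0.0.1:9160'])
cf = pycassa.ColumnFamily(pool, 'shared_cf')  # one physical CF for all tenants

def vkey(tenant, key):
    # Prefix every row key with the tenant id instead of
    # creating a column family per tenant.
    return '%s:%s' % (tenant, key)

cf.insert(vkey('customer42', 'user1'), {'email': 'x@example.com'})
row = cf.get(vkey('customer42', 'user1'))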
FYI: Java 7u4 on Linux requires higher stack size
Hello all, We've started testing Oracle Java 7u4 (we're currently on 7u3) on Linux in order to try the G1 GC. Cassandra can't start on 7u4, failing with:

The stack size specified is too small, Specify at least 160k
Cannot create Java VM

Changing -Xss128k to -Xss160k in cassandra-env.sh allowed Cassandra to start, but when a Thrift client disconnects, the Cassandra log fills with exceptions:

ERROR 17:08:56,300 Fatal exception in thread Thread[Thrift:13,5,main]
java.lang.StackOverflowError
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(Unknown Source)
at java.net.SocketInputStream.read(Unknown Source)
at java.io.BufferedInputStream.fill(Unknown Source)
at java.io.BufferedInputStream.read1(Unknown Source)
at java.io.BufferedInputStream.read(Unknown Source)
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)
at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2877)
at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)

Increasing the stack size from 160k to 192k eliminated these exceptions. Just wanted you to know, in case someone else tries to migrate to Java 7u4.

Best regards / Pagarbiai, Viktor Jevdokimov, Senior Developer, Adform
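For reference, the change described above amounts to something like the following in cassandra-env.sh (the exact line varies by version, so treat this as a sketch):

# was: JVM_OPTS="$JVM_OPTS -Xss128k"
JVM_OPTS="$JVM_OPTS -Xss192k"  # 160k starts, but still overflows on Thrift disconnect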
Re: FYI: Java 7u4 on Linux requires higher stack size
Thanks, we're investigating in https://issues.apache.org/jira/browse/CASSANDRA-4275.

On Fri, May 25, 2012 at 10:31 AM, Viktor Jevdokimov viktor.jevdoki...@adform.com wrote: [quoted message snipped]

-- Jonathan Ellis, Project Chair, Apache Cassandra; co-founder of DataStax, the source for professional Cassandra support, http://www.datastax.com
Frequent exception with Cassandra 1.0.9
I am running embedded Cassandra 1.0.9 on Windows 2008 Server and frequently encounter the following exception:

Stack: [0x7dc6,0x7dcb], sp=0x7dcaf0b0, free space=316k
Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
j java.io.WinNTFileSystem.getSpace0(Ljava/io/File;I)J+0
j java.io.WinNTFileSystem.getSpace(Ljava/io/File;I)J+10
j java.io.File.getUsableSpace()J+34
j org.apache.cassandra.config.DatabaseDescriptor.getDataFileLocationForTable(Ljava/lang/String;JZ)Ljava/lang/String;+44
j org.apache.cassandra.db.Table.getDataFileLocation(JZ)Ljava/lang/String;+6
j org.apache.cassandra.db.Table.getDataFileLocation(J)Ljava/lang/String;+3
j org.apache.cassandra.db.ColumnFamilyStore.getFlushPath(JLjava/lang/String;)Ljava/lang/String;+5
j org.apache.cassandra.db.ColumnFamilyStore.createFlushWriter(JJLorg/apache/cassandra/db/commitlog/ReplayPosition;)Lorg/apache/cassandra/io/sstable/SSTableWriter;+18
J org.apache.cassandra.db.Memtable.writeSortedContents(Lorg/apache/cassandra/db/commitlog/ReplayPosition;)Lorg/apache/cassandra/io/sstable/SSTableReader;
j org.apache.cassandra.db.Memtable.access$400(Lorg/apache/cassandra/db/Memtable;Lorg/apache/cassandra/db/commitlog/ReplayPosition;)Lorg/apache/cassandra/io/sstable/SSTableReader;+2
j org.apache.cassandra.db.Memtable$4.runMayThrow()V+36
j org.apache.cassandra.utils.WrappedRunnable.run()V+9
J java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Ljava/lang/Runnable;)V
J java.util.concurrent.ThreadPoolExecutor$Worker.run()V
j java.lang.Thread.run()V+11
v ~StubRoutines::call_stub

Java info:
java version 1.6.0_30
Java(TM) SE Runtime Environment (build 1.6.0_30-b12)
Java HotSpot(TM) 64-Bit Server VM (build 20.5-b03, mixed mode)
RE: will compaction delete empty rows after all columns expired?
This is an old thread from December 27, 2011. I interpret the "yes" answer to mean that you do not have to explicitly delete an empty row after all of its columns have been deleted: the empty row (i.e. row key) will automatically be removed eventually (after gc_grace). Is that true? I am not seeing that behavior on our 0.7.9 ring. We are accumulating a large number of old empty rows, and they take a lot of space because the row keys are big, inflating the data size by 10x. I have read conflicting information in blogs and the Cassandra docs. Someone mentioned that there are both row tombstones and column tombstones, implying that you do have to explicitly delete empty rows. Is that correct? My basic question is: how do I delete all these empty row keys?

From: Feng Qu, Sent: Tuesday, December 27, 2011 11:09 AM: Compaction should delete empty rows once gc_grace_seconds is passed, right?

From: Peter Schuller: Yes. But just to be extra clear: data will not actually be removed until the row in question participates in a compaction. Compactions will not be actively triggered by Cassandra for tombstone-processing reasons.
Re: will compaction delete empty rows after all columns expired?
Do not delete the empty rows again: re-deleting refreshes the tombstones' timestamps, and then they will never expire.
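To make the row-tombstone vs. column-tombstone distinction from the question above concrete, here is a minimal sketch assuming pycassa (keyspace and CF names hypothetical). Note, per the reply above, that re-issuing deletes against already-empty rows writes fresh tombstones and restarts their gc_grace clock:

import pycassa

pool = pycassa.ConnectionPool('my_ks', server_list=['127.0.0.1:9160'])
cf = pycassa.ColumnFamily(pool, 'my_cf')

# Column tombstones: only the named columns are marked deleted;
# the row key lingers until compaction after gc_grace.
cf.remove('row1', columns=['col_a', 'col_b'])

# Row tombstone: the whole row is marked deleted in one operation.
cf.remove('row1')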
CounterColumns with double, min/max
Hi all, I am currently investigating the possibilities of Cassandra for my company. I have played around with Hector and Cassandra for a month now, and I have a few questions to ask (in a separate mail for each topic). So here are my first ones: 1) I found counter columns to be a very powerful feature. I heard that they are more or less the implementation of Rainbird, described by Twitter a year ago, which was supposed to go open source one day (cf. http://techcrunch.com/2011/02/04/twitter-rainbird/ ). Is that true? 2) More seriously: do you think CASSANDRA-2833 will be done one day? What about CASSANDRA-1132? I think having counter columns able to handle doubles and min/max operations would be quite useful (at least for my company). Thanks for your time, Furcy. PS: if anyone happens to live in Paris (I know :-( ), I would be glad to talk about Cassandra over a beer.
Re: Composite keys question
I'm not very advanced in Cassandra, but looking at the pycassa doc http://pycassa.github.com/pycassa/assorted/composite_types.html, with composites you can't search on the second component alone: you need the first component, and the second one only filters within it. You just do range slices on the composite columns; it's totally different from secondary indexes on rows. Also, CQL can't yet do everything the other clients can.

2012/5/24 Roland Mechler rmech...@sencha.com: Suppose I have a table in CQL3 with a 2-part composite key, and I do a select that specifies just the second part of the key (not the partition key). Will this result in a full table scan, or is the second part of the key indexed? Example:

cqlsh:Keyspace1> CREATE TABLE test_table (part1 text, part2 text, data text, PRIMARY KEY (part1, part2));
cqlsh:Keyspace1> INSERT INTO test_table (part1, part2, data) VALUES ('1','1','a');
cqlsh:Keyspace1> SELECT * FROM test_table WHERE part2 = '1';

 part1 | part2 | data
-------+-------+------
     1 |     1 |    a

-Roland
Re: Composite keys question
Thanks for your response, Cyril. Yeah, I realized shortly after asking that the second component is indeed not indexed, so it must be doing a table scan. Indexing for composite columns is in the works (https://issues.apache.org/jira/browse/CASSANDRA-3680), but I'm not sure how soon that will be available. The thing is, it did actually let me search on the second component only, which is perhaps a little surprising. -Roland

On Fri, May 25, 2012 at 12:33 PM, Cyril Auburtin cyril.aubur...@gmail.com wrote: [quoted message snipped]
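For reference, continuing the thread's own cqlsh example: restricting on the partition key is what keeps a query off the scan path, so only the last of these three falls back to scanning (a sketch of the expected behavior, not a transcript):

cqlsh:Keyspace1> SELECT * FROM test_table WHERE part1 = '1';                  -- one partition
cqlsh:Keyspace1> SELECT * FROM test_table WHERE part1 = '1' AND part2 = '1';  -- slice within it
cqlsh:Keyspace1> SELECT * FROM test_table WHERE part2 = '1';                  -- no partition key: scans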
Re: CounterColumns with double, min/max
On 1: countandra.org. On 2: the issue is a little deeper (we have investigated this at Countandra). Approached more comprehensively, the issue has more to do with events than with counts (at least IMO). A similar issue arises with averages: Countandra does sums and counts quite easily, thanks to Cassandra's counter columns. However, averages are more challenging in the eventually consistent world. Average = Sum/Count, but which Sum and which Count? You can read a Sum and a Count from different times and divide the two, and you might get a division by zero, or 0/sum. We are investigating Interval Tree Clocks (ITCs) as the mechanism for coordinating these times. Incidentally, ITCs can also be used to produce globally consistent snapshots across Cassandra nodes. Regards, Milind

On Fri, May 25, 2012 at 11:55 AM, Furcy Pin fu...@qunb.com wrote: [quoted message snipped]
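To illustrate the sum/count race described above, a minimal sketch assuming pycassa and a counter column family named 'stats' (hypothetical). Counters are integer-only (hence the interest in CASSANDRA-2833 for doubles), and the two columns below are read in one call but incremented independently, so the average is only eventually consistent:

import pycassa

pool = pycassa.ConnectionPool('metrics_ks', server_list=['127.0.0.1:9160'])
stats = pycassa.ColumnFamily(pool, 'stats')  # must be a counter CF

def record(key, value):
    stats.add(key, 'sum', value)  # increment the running sum
    stats.add(key, 'count', 1)    # increment the event count

def average(key):
    row = stats.get(key, columns=['sum', 'count'])
    count = row.get('count', 0)
    if count == 0:
        return None  # guard the division-by-zero case described above
    return float(row['sum']) / count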
Re: nodetool repair taking forever
Thanks for the reply, Aaron. By compaction being on, do you mean whether I run nodetool compact? In that case the answer is no. I haven't set any explicit compaction thresholds, which means it should be using the defaults: min 4 and max 32. That said, to solve the problem I just did a full cluster restart and ran nodetool repair again; the entire cluster of 6 nodes was repaired in 10 hours. I am also wondering: since all 6 nodes are replicas of each other, do I even need to run repair on all of them? Wouldn't running it on the first node suffice, since it will repair all the ranges it's responsible for (which is everything)? So unless I upgrade to 1.0.x, where I can use the -pr option, is it advisable to just run repair on the first node? -Raj

On Tue, May 22, 2012 at 5:05 AM, aaron morton aa...@thelastpickle.com wrote:
> I also dont understand if all these nodes are replicas of each other why is that the first node has almost double the data.
Have you performed any token moves? Old data is not deleted unless you run nodetool cleanup. Another possibility is things like a lot of hints; admittedly it would have to be a *lot* of hints. The third is that compaction has fallen behind.
> This week its even worse, the nodetool repair has been running for the last 15 hours just on the first node and when I run nodetool compactionstats I constantly see this - pending tasks: 3
First check the logs for errors. Repair will first calculate the differences; you can see this as a validation compaction in nodetool compactionstats. Then it will stream the data, which you can watch with nodetool netstats. Try to work out which part is taking the most time. 15 hours for 50 GB sounds like a long time (btw, do you have compaction on?) Cheers, Aaron Morton, Freelance Developer, @aaronmorton, http://www.thelastpickle.com

On 20/05/2012, at 3:14 AM, Raj N wrote: Hi experts, I have a 6 node cluster spread across 2 DCs:

DC   Rack   Status  State   Load      Owns    Token
                                              113427455640312814857969558651062452225
DC1  RAC13  Up      Normal  95.98 GB  33.33%  0
DC2  RAC5   Up      Normal  50.79 GB  0.00%   1
DC1  RAC18  Up      Normal  50.83 GB  33.33%  56713727820156407428984779325531226112
DC2  RAC7   Up      Normal  50.74 GB  0.00%   56713727820156407428984779325531226113
DC1  RAC19  Up      Normal  61.72 GB  33.33%  113427455640312814857969558651062452224
DC2  RAC9   Up      Normal  50.83 GB  0.00%   113427455640312814857969558651062452225

They are all replicas of each other. All reads and writes are done at LOCAL_QUORUM. We are on Cassandra 0.8.4. I see that our weekend nodetool repair runs for more than 12 hours, especially on the first node, which has 96 GB of data. Is this usual? We are using 500 GB SAS drives with the ext4 file system. This gets worse every week. This week it's even worse: nodetool repair has been running for the last 15 hours on just the first node, and when I run nodetool compactionstats I constantly see this - pending tasks: 3 - and nothing else. Looks like it's just stuck. There's nothing substantial in the logs either. I also don't understand: if all these nodes are replicas of each other, why does the first node have almost double the data? Any help will be really appreciated. Thanks, -Raj
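For anyone else digging into a slow repair, the commands Aaron mentions can be used roughly like this (host name hypothetical):

nodetool -h node1 compactionstats   # validation compactions = difference-calculation phase
nodetool -h node1 netstats          # streaming activity = data-exchange phase
nodetool -h node1 repair Keyspace1  # repair can also be scoped to a single keyspace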
Re: nodetool repair taking forever
On Sat, May 19, 2012 at 8:14 AM, Raj N raj.cassan...@gmail.com wrote: [ repair seems to be hanging forever ]

https://issues.apache.org/jira/browse/CASSANDRA-2433 affects 0.8.4. I also believe there is a contemporaneous bug (reported by Stu Hood?) where a failed repair results in extra disk usage, but I can't currently find it in JIRA. =Rob
Re: NetworkTopologyStrategy with 1 node
> replication_factor = 1 and strategy_options = [{DC1:0}]
You should not be setting both of these. All you should need is: strategy_options = [{DC1:1}]

On Fri, May 25, 2012 at 1:47 PM, Cyril Auburtin cyril.aubur...@gmail.com wrote: I was using a single node on Cassandra 0.7.10 with placement strategy SimpleStrategy and replication_factor = 1; everything was fine, and I was using a consistency level of ONE for reading and writing. I then updated the keyspace with:

update keyspace Mymed with replication_factor = 1 and strategy_options = [{DC1:0}] and placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy';

with conf/cassandra-topology.properties containing just this for the moment: default=DC1:r1. The keyspace updated and I could 'use' it, but now I can't read anything, even from Thrift using ConsistencyLevel.ONE; it complains that this strategy requires a quorum. I tried ConsistencyLevel.LOCAL_QUORUM but get an exception like org.apache.thrift.TApplicationException: Internal error processing get_slice, and in the Cassandra console:

DEBUG 19:45:02,013 Command/ConsistencyLevel is SliceFromReadCommand(table='Mymed', key='637972696c2e617562757274696e40676d61696c2e636f6d', column_parent='QueryPath(columnFamilyName='Authentication', superColumnName='null', columnName='null')', start='', finish='', reversed=false, count=100)/LOCAL_QUORUM
ERROR 19:45:02,014 Internal error processing get_slice
java.lang.NullPointerException
at org.apache.cassandra.locator.NetworkTopologyStrategy.getReplicationFactor(NetworkTopologyStrategy.java:139)
at org.apache.cassandra.service.DatacenterReadCallback.determineBlockFor(DatacenterReadCallback.java:83)
at org.apache.cassandra.service.ReadCallback.init(ReadCallback.java:77)
at org.apache.cassandra.service.DatacenterReadCallback.init(DatacenterReadCallback.java:48)
at org.apache.cassandra.service.StorageProxy.getReadCallback(StorageProxy.java:461)
at org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:326)
at org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:291)

So I guess NetworkTopologyStrategy can't work with just one node? Thanks for any feedback.
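Putting the suggested fix together, the corrected update would look something like this in cassandra-cli (a sketch following the syntax already used in the thread): with DC1:1, i.e. one replica in DC1, rather than DC1:0 (no replicas), reads at ONE should work again on the single node.

update keyspace Mymed
  with placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy'
  and strategy_options = [{DC1:1}];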