Re: Problem when starting Cassandra 1.1.5
Please upgrade Java to 1.7.x and it should work.

Thanks & Regards,
Adeel Akbar

On 10/8/2012 1:36 PM, Thierry Templier wrote:
Hello, I want to upgrade Cassandra to version 1.1.5, but I have a problem when trying to start this version:

$ ./cassandra -f
xss = -ea -javaagent:./../lib/jamm-0.2.5.jar -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms1024M -Xmx1024M -Xmn256M -XX:+HeapDumpOnOutOfMemoryError -Xss180k
Segmentation fault

Here is the Java version I use:

$ java -version
java version "1.6.0_24"
OpenJDK Runtime Environment (IcedTea6 1.11.4) (6b24-1.11.4-1ubuntu0.10.04.1)
OpenJDK 64-Bit Server VM (build 20.0-b12, mixed mode)

Thanks very much for your help!
Thierry
Cassandra 1.1.4 performance issue
Hi,

We're running a small two-node Cassandra cluster (1.1.4) serving data to our web and Java applications. After upgrading Cassandra from 1.0.8 to 1.1.4, we're starting to see some weird issues. If we run the 'ring' command from the second node, it reports that it failed to connect to port 7199 on node 1:

$ /opt/apache-cassandra-1.1.4/bin/nodetool -h XX.XX.XX.01 ring
Failed to connect to 'XX.XX.XX.01:7199': Connection refused

We're using a Network Monitoring System and Monit to monitor the servers. In the NMS the average CPU usage has increased to around 500% on our quad-core Xeon servers with 16 GB RAM, and occasionally through Monit we can see the 1-minute load average go above 7. Is this common? Does this happen to everyone else? And why the spikiness in load? We can't find anything in the Cassandra logs indicating that something's up (such as a slow GC or compaction), and there's no corresponding traffic spike in the application either. Should we just add more nodes if any single one gets CPU spikes?

Another explanation could be that we've configured it wrong. We're running pretty much the default config and each node has 16 GB of RAM. A single keyspace with 15 to 20 column families, RF=2, and we have 260 GB of actual data.
Please find below top and I/O stats for further reference:

top - 14:21:51 up 29 days, 9:52, 1 user, load average: 6.59, 3.16, 1.42
Tasks: 163 total, 2 running, 161 sleeping, 0 stopped, 0 zombie
Cpu0 : 29.0%us, 0.0%sy, 0.0%ni, 71.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1 : 28.0%us, 0.0%sy, 0.0%ni, 72.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu2 : 13.3%us, 0.0%sy, 0.0%ni, 86.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu3 : 23.5%us, 0.7%sy, 0.0%ni, 75.5%id, 0.0%wa, 0.0%hi, 0.3%si, 0.0%st
Cpu4 : 89.4%us, 0.3%sy, 0.0%ni, 10.0%id, 0.0%wa, 0.0%hi, 0.3%si, 0.0%st
Cpu5 : 29.2%us, 0.0%sy, 0.0%ni, 70.8%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu6 : 25.1%us, 0.0%sy, 0.0%ni, 74.9%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu7 : 24.3%us, 0.0%sy, 0.0%ni, 72.0%id, 0.0%wa, 2.3%hi, 1.3%si, 0.0%st
Mem: 16427844k total, 16317416k used, 110428k free, 128824k buffers
Swap: 0k total, 0k used, 0k free, 11344696k cached

  PID USER PR NI  VIRT  RES  SHR S  %CPU %MEM    TIME+ COMMAND
 5284 root 18  0  265g 7.7g 3.6g S 266.6 49.0 474:24.38 java -ea -javaagent:/opt/apache-cassandra-1.1.4/bin/../lib/jamm-0.2.5.jar -XX:+UseThreadPriorities -XX:Thr
    1 root 15  0 10368  660  548 S   0.0  0.0   0:01.64 init [3]

# iostat -xmn 2 10
-x and -n options are mutually exclusive

avg-cpu: %user %nice %system %iowait %steal %idle
          9.77  0.03    0.54    0.98   0.00 88.68

Device: rrqm/s wrqm/s   r/s   w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sda       0.59   3.97  5.54  0.42  0.20  0.02    75.52     0.11 19.10  3.55  2.11
sda1      0.00   0.00  0.01  0.00  0.00  0.00    88.69     0.00  1.36  1.31  0.00
sda2      0.59   3.97  5.53  0.42  0.20  0.02    75.51     0.11 19.12  3.55  2.11
sdb       1.54   7.82 10.39  0.64  0.28  0.03    57.77     0.36 32.61  4.27  4.70
sdb1      1.54   7.82 10.39  0.64  0.28  0.03    57.77     0.36 32.61  4.27  4.70
dm-0      0.00   0.00  1.73  0.62  0.02  0.00    19.27     0.02  6.75  0.90  0.21
dm-1      0.00   0.00 16.32 12.23  0.46  0.05    36.47     0.50 17.67  2.07  5.92
dm-2      0.00   0.00  0.00  0.00  0.00  0.00     8.00     0.00  7.10  3.41  0.00

avg-cpu: %user %nice %system %iowait %steal %idle
         12.46  0.00    0.00    0.19   0.00 87.35

Device: rrqm/s wrqm/s   r/s   w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sda       0.00   2.50  0.00  1.00  0.00  0.01    28.00     0.00  0.00  0.00  0.00
sda1      0.00   0.00  0.00  0.00  0.00  0.00     0.00     0.00  0.00  0.00  0.00
sda2      0.00   2.50  0.00  1.00  0.00  0.01    28.00     0.00  0.00  0.00  0.00
sdb       0.00   4.50  0.50  1.50  0.00  0.02    28.00     0.01  6.00  6.00  1.20
sdb1      0.00   4.50  0.50  1.50  0.00  0.02    28.00     0.01  6.00  6.00  1.20
dm-0      0.00   0.00  0.50  4.50  0.00  0.02     8.80     0.04  8.00  2.40  1.20
dm-1      0.00   0.00  0.00  5.00  0.00  0.02     8.00     0.00  0.00  0.00  0.00
dm-2      0.00   0.00  0.00  0.00  0.00  0.00     0.00     0.00  0.00  0.00  0.00

avg-cpu: %user %nice %system %iowait %steal %idle
         12.52
Re: Problem when starting Cassandra 1.1.5
Thanks very much, Adeel! It works much better!
Thierry

On 10/8/2012, Adeel Akbar wrote:
> Please upgrade Java to 1.7.x and it should work.
Re: 1000's of CF's.
So what should the solution be, architecturally, when we need to run Hadoop M/R jobs and not be restricted by the number of CFs? What we have now is a fair number of CFs (~2K), and this number is slowly growing, so we are already planning to merge partitioned CFs. But our next goal is to run Hadoop tasks on those CFs. All we have is plain Hector and a custom ORM on top of it. As far as I understand, VirtualKeyspace doesn't help in our case. Also, I don't understand why support for many CFs (or built-in partitioning) is not implemented on the Cassandra side. Can anybody explain why this can or cannot be done in Cassandra? Just in case: we're using Cassandra 1.0.11 on 30 nodes (planning to upgrade to 1.1.* soon).

-- W/ best regards, Sergey.

On 04.10.2012 0:10, Hiller, Dean wrote:
Okay, so it only took me two solid days, not a week. PlayOrm in the master branch now supports virtual CFs, or virtual tables, in ONE CF, so you can have 1000's or millions of virtual CFs in one CF now. It works with all the Scalable-SQL, works with the joins, and works with the PlayOrm command line tool. There are two ways to do it. If you are using the ORM half, you just annotate

@NoSqlEntity("MyVirtualCfName")
@NoSqlVirtualCf(storedInCf="sharedCf")

So it's stored in sharedCf with the table name MyVirtualCfName (in the command line tool, use MyVirtualCfName to query the table). Then, if you don't know your metadata ahead of time, you need to create DboTableMeta and DboColumnMeta objects and save them for every table you create, and you can use TypedRow to read and persist (which is what we have a project doing). If you try it out, let me know. We usually get bug fixes in pretty fast if you run into anything. (More and more questions are forming on Stack Overflow as well ;) ).

Later, Dean
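Dean's virtual-CF approach boils down to multiplexing many logical tables into one physical CF by prefixing every row key with the virtual table name. A minimal sketch of the idea (the encoding below is illustrative, not PlayOrm's actual format):

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Illustrative sketch (not PlayOrm's wire format): pack many "virtual"
// column families into one physical CF by prefixing every row key with
// the virtual table name, length-prefixed so names and keys cannot collide.
public class VirtualCfKeys {
    // Encode: [2-byte table-name length][table name][raw row key]
    public static byte[] encode(String virtualCf, byte[] rowKey) {
        byte[] name = virtualCf.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(2 + name.length + rowKey.length);
        buf.putShort((short) name.length);
        buf.put(name);
        buf.put(rowKey);
        return buf.array();
    }

    // Decode just the virtual table name back out of a stored key.
    public static String virtualCfOf(byte[] encoded) {
        ByteBuffer buf = ByteBuffer.wrap(encoded);
        byte[] name = new byte[buf.getShort()];
        buf.get(name);
        return new String(name, StandardCharsets.UTF_8);
    }
}
```

Length-prefixing (rather than a delimiter character) means a table name can never be confused with the start of a row key, which matters when keys are arbitrary bytes.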
Problem while streaming SSTables with BulkOutputFormat
Hello, I am using BulkOutputFormat to load data from a .csv file into Cassandra. I am using Cassandra 1.1.3 and Hadoop 0.20.2. I have 7 Hadoop nodes: 1 namenode/jobtracker and 6 datanodes/tasktrackers. Cassandra is installed on 4 of these 6 datanodes/tasktrackers. The issue happens when I have more than 1 reducer: SSTables are generated on each node, however I get the following error in the tasktrackers' logs when they are streamed into the Cassandra cluster:

Exception in thread "Streaming to /172.16.110.79:1" java.lang.RuntimeException: java.io.EOFException
    at org.apache.cassandra.utils.FBUtilities.unchecked(FBUtilities.java:628)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
Caused by: java.io.EOFException
    at java.io.DataInputStream.readInt(Unknown Source)
    at org.apache.cassandra.streaming.FileStreamTask.receiveReply(FileStreamTask.java:194)
    at org.apache.cassandra.streaming.FileStreamTask.stream(FileStreamTask.java:181)
    at org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:94)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
    ...
3 more

Exception in thread "Streaming to /172.16.110.92:1" java.lang.RuntimeException: java.io.EOFException
    at org.apache.cassandra.utils.FBUtilities.unchecked(FBUtilities.java:628)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
Caused by: java.io.EOFException
    at java.io.DataInputStream.readInt(Unknown Source)
    at org.apache.cassandra.streaming.FileStreamTask.receiveReply(FileStreamTask.java:194)
    at org.apache.cassandra.streaming.FileStreamTask.stream(FileStreamTask.java:181)
    at org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:94)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
    ... 3 more
...

This is what I get in the logs of one of my Cassandra nodes:

ERROR 16:47:34,904 Sending retry message failed, closing session.
java.io.IOException: Broken pipe
    at sun.nio.ch.FileDispatcher.write0(Native Method)
    at sun.nio.ch.SocketDispatcher.write(Unknown Source)
    at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown Source)
    at sun.nio.ch.IOUtil.write(Unknown Source)
    at sun.nio.ch.SocketChannelImpl.write(Unknown Source)
    at java.nio.channels.Channels.writeFullyImpl(Unknown Source)
    at java.nio.channels.Channels.writeFully(Unknown Source)
    at java.nio.channels.Channels.access$000(Unknown Source)
    at java.nio.channels.Channels$1.write(Unknown Source)
    at java.io.OutputStream.write(Unknown Source)
    at java.nio.channels.Channels$1.write(Unknown Source)
    at java.io.DataOutputStream.writeInt(Unknown Source)
    at org.apache.cassandra.net.OutboundTcpConnection.write(OutboundTcpConnection.java:196)
    at org.apache.cassandra.streaming.StreamInSession.sendMessage(StreamInSession.java:171)
    at org.apache.cassandra.streaming.StreamInSession.retry(StreamInSession.java:160)
    at org.apache.cassandra.streaming.IncomingStreamReader.retry(IncomingStreamReader.java:168)
    at org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:98)
    at org.apache.cassandra.net.IncomingTcpConnection.stream(IncomingTcpConnection.java:182)
    at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:78)

Does anyone know what caused these errors? Thank you for your help.

Regards,
Ralph
Re: MBean cassandra.db.CompactionManager TotalBytesCompacted counts backwards
I'm attempting to plot how busy the node is with compactions, but there seem to be only a few metrics reported that might be suitable: CompletedTasks, PendingTasks, TotalBytesCompacted, TotalCompactionsCompleted. It's not clear to me what the difference between CompletedTasks and TotalCompactionsCompleted is, but I am plotting TotalCompactionsCompleted/sec as one metric; however, this rate is nearly always less than 1 and doesn't capture how much work is done by a compaction. A compaction of the 4 smallest SSTables counts the same as a compaction of the 4 largest SSTables, but the cost is hugely different. Thus, I'm also plotting TotalBytesCompacted/sec. Since the TotalBytesCompacted value sometimes moves backwards, I'm not confident that it's reporting what it is meant to report. The code and comments indicate that it should only be incremented, either by the final size of the newly created SSTable or by the bytes-compacted-so-far for a larger compaction, so I don't see why it should ever decrease. How should the impact of compaction be measured if not by bytes compacted?

-Bryan

On Sun, Oct 7, 2012 at 7:39 AM, Edward Capriolo edlinuxg...@gmail.com wrote:
I have not looked at this JMX object in a while; however, the compaction manager can support multiple threads. Also, it moves from 0 to filesize each time it has to compact a set of files. That is more useful for showing current progress rather than lifetime history.

On Fri, Oct 5, 2012 at 7:27 PM, Bryan Talbot btal...@aeriagames.com wrote:
I've recently added compaction rate (in bytes/second) to my monitors for Cassandra and am seeing some odd values. I wasn't expecting the values of TotalBytesCompacted to sometimes decrease from one reading to the next. It seems that the value should be monotonically increasing while a server is running -- obviously it would start again at 0 when the server is restarted, or if the counter rolls over (unlikely for a 64-bit long).
Below are two samples taken 60 seconds apart; the value decreased by 2,954,369,012 between the two readings.

reported_metric=[timestamp:1349476449, status:200, request:[mbean:org.apache.cassandra.db:type=CompactionManager, attribute:TotalBytesCompacted, type:read], value:7548675470069]
previous_metric=[timestamp:1349476389, status:200, request:[mbean:org.apache.cassandra.db:type=CompactionManager, attribute:TotalBytesCompacted, type:read], value:7551629839081]

I briefly looked at the code for CompactionManager and a few related classes and don't see any place that performs subtraction explicitly; however, there are many additions of signed long values that are not validated and could conceivably contain a negative value, thus causing totalBytesCompacted to decrease. It's interesting to note that all of the differences I've seen so far are more than the overflow value of a signed 32-bit value. The OS (CentOS 5.7) and Sun Java VM (1.6.0_29) are both 64-bit. JNA is enabled. Is this expected and normal? If so, what is the correct interpretation of this metric? I'm seeing the negative values a few times per hour when reading once every 60 seconds.

-Bryan

--
Bryan Talbot
Architect / Platform team lead, Aeria Games and Entertainment
Silicon Valley | Berlin | Tokyo | Sao Paulo
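Whatever the cause of the backwards jumps, the plotted rate can be made robust by discarding samples with a negative delta instead of graphing them. A sketch (class and method names are our own, not Cassandra's):

```java
// Sketch of a defensive rate calculation for a counter that should be
// monotonic but occasionally moves backwards (as TotalBytesCompacted
// appears to). Negative deltas are skipped rather than reported as
// huge negative rates.
public class CounterRate {
    private long lastValue = -1;
    private long lastTimeMs = -1;

    // Returns bytes/sec since the previous sample, or -1 if the sample
    // must be skipped (first reading, or the counter went backwards).
    public double sample(long value, long timeMs) {
        double rate = -1;
        if (lastValue >= 0 && value >= lastValue && timeMs > lastTimeMs) {
            rate = (value - lastValue) * 1000.0 / (timeMs - lastTimeMs);
        }
        lastValue = value;
        lastTimeMs = timeMs;
        return rate;
    }
}
```

A monitor reading every 60 seconds would simply not emit a data point for the skipped interval, which keeps occasional backwards jumps from distorting the graph.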
Re: RandomPartitioner and the token limits
AFAIK in the code the minimum exclusive token value is -1, so as a signed integer the maximum value is 2**127.

Cheers
- Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 4/10/2012, at 3:19 AM, Carlos Pérez Miguel cperez...@gmail.com wrote:
Hello, reading the operations wiki (http://wiki.apache.org/cassandra/Operations) I noticed something strange. When using RandomPartitioner, tokens are integers in the range [0, 2**127] (both limits included), but keys are converted into this range using MD5. MD5 has 128 bits, so shouldn't tokens be in the range [0, (2**128)-1]? And if Cassandra uses only 127 of those 128 bits because it converts the hash into a signed integer, shouldn't tokens be in the range [0, 2**127) (first limit included, last excluded)?

Thank you
Carlos Pérez Miguel
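The token derivation behind this can be sketched in a few lines: as we understand it, RandomPartitioner MD5s the key and takes the absolute value of the resulting 128-bit signed integer, which is exactly why the range is [0, 2**127] rather than the full 128 bits (this mirrors the approach in Cassandra's FBUtilities, but the class below is our own illustration):

```java
import java.math.BigInteger;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Sketch of how RandomPartitioner derives a token: MD5 the key and take
// the absolute value of the resulting 128-bit signed integer. abs() of a
// signed 128-bit value lands in [0, 2**127], which is why the token range
// is 127 bits wide, not 128.
public class TokenOf {
    public static BigInteger token(byte[] key) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(key);
            return new BigInteger(digest).abs();
        } catch (NoSuchAlgorithmException e) {
            throw new AssertionError(e); // MD5 is always present in the JDK
        }
    }
}
```

Note that abs() folds the negative half of the 128-bit space onto the positive half, so two distinct MD5 outputs can map to the same token; the range question in the original mail is really about this folding.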
Re: Regarding Row Cache configuration and non-heap memory
> In short, the question is whether row_cache_size_in_mb can exceed the heap setting for Cassandra 1.1.4 if jna.jar is present in the libs?
Yes. AFAIK jna.jar is not required for the off-heap row cache in 1.1.X.

> My heap settings are 8G and new heap size is 1600M.
You can reduce the size of the heap in 1.1.X. The default settings max out at 4G.

Cheers
- Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 4/10/2012, at 3:21 PM, Ananth Gundabattula agundabatt...@gmail.com wrote:
Hello, I have configured Cassandra 1.1.4 to use a row cache of 10 GB (the RAM on the machine is pretty big, hence the large row cache). My heap settings are 8G and new heap size is 1600M. As I read from the forum and documentation, jna.jar allows non-heap memory to be used for the row caches. The question I have is: how is the row_cache_size_in_mb setting in cassandra.yaml interpreted? Is it referring to the non-heap memory, or to the memory used inside the heap to maintain book-keeping information about the non-heap memory (as I gather from the postings, the heap is indeed used to some extent even when using non-heap memory for row caches)? In short, the question is whether row_cache_size_in_mb can exceed the heap setting for Cassandra 1.1.4 if jna.jar is present in the libs. Thanks for your time.

Regards,
Ananth
Re: Importing sstable with Composite key? (without is working)
Not sure why you have two different definitions for the bars2 CF. You will need to create SSTables that match the schema Cassandra has.

Cheers
- Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 5/10/2012, at 7:15 AM, T Akhayo t.akh...@gmail.com wrote:
Good evening,

Today I managed to get a small cluster of 2 computers running. I also managed to get my data model working, and I am able to import sstables created with SSTableSimpleUnsortedWriter using sstableloader. The only problem is when I try to use the composite key in my data model: after I import my sstables and issue a simple select, Cassandra crashes:

===
java.lang.IllegalArgumentException
    at java.nio.Buffer.limit(Unknown Source)
    at org.apache.cassandra.db.marshal.AbstractCompositeType.getBytes(AbstractCompositeType.java:51)
    at org.apache.cassandra.db.marshal.AbstractCompositeType.getWithShortLength(AbstractCompositeType.java:60)
    at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:76)
    at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:31)
    at java.util.TreeMap.put(Unknown Source)
    at org.apache.cassandra.db.TreeMapBackedSortedColumns.addColumn(TreeMapBackedSortedColumns.java:95)
    at org.apache.cassandra.db.AbstractColumnContainer.addColumn(AbstractColumnContainer.java:109)
    ...
    at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:108)
    at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:121)
    at org.apache.cassandra.thrift.CassandraServer.execute_cql_query(CassandraServer.java:1237)
    at org.apache.cassandra.thrift.Cassandra$Processor$execute_cql_query.getResult(Cassandra.java:3542)
    at org.apache.cassandra.thrift.Cassandra$Processor$execute_cql_query.getResult(Cassandra.java:3530)
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
    at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:186)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
===

Now I can get everything running again by removing the data directories on both nodes. I suspect Cassandra crashes because the sstable that is being imported has a different schema when it comes to the composite key (without a composite key the import works fine).
My schema with composite key is:
===
create table bars2 (
    id uuid,
    timeframe int,
    datum timestamp,
    open double,
    high double,
    low double,
    close double,
    bartype int,
    PRIMARY KEY (timeframe, datum)
);
===
create column family bars2
    with column_type = 'Standard'
    and comparator = 'CompositeType(org.apache.cassandra.db.marshal.DateType,org.apache.cassandra.db.marshal.UTF8Type)'
    and default_validation_class = 'UTF8Type'
    and key_validation_class = 'Int32Type'
    and read_repair_chance = 0.1
    and dclocal_read_repair_chance = 0.0
    and gc_grace = 864000
    and min_compaction_threshold = 4
    and max_compaction_threshold = 32
    and replicate_on_write = true
    and compaction_strategy = 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'
    and caching = 'KEYS_ONLY'
    and compression_options = {'sstable_compression' : 'org.apache.cassandra.io.compress.SnappyCompressor'};
===
My code to create the sstable is (only the interesting parts):
===
sstWriter = new SSTableSimpleUnsortedWriter(
    new File("c:\\cassandra\\newtables\\"), new RandomPartitioner(),
    "readtick", "bars2", UTF8Type.instance, null, 64);

CompositeType.Builder cb = new CompositeType.Builder(CompositeType.getInstance(compositeList));
cb.add(bytes(curMinuteBar.getDatum().getTime()));
cb.add(bytes(1));
sstWriter.newRow(cb.build());
(... add columns ...)
===
I highly suspect that the problem is in one of two places:
- In the SSTableSimpleUnsortedWriter I use UTF8Type.instance as the comparator; I'm not sure that is right with a composite key.
- When calling sstWriter.newRow I use CompositeType.Builder to build the composite key; I'm not sure I'm doing this the right way (I did try different combinations).

Does somebody know how I can continue on my journey?
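For what it's worth, here is a sketch of the composite encoding as we understand it (each component length-prefixed and followed by an end-of-component byte); comparing its output byte-for-byte against what CompositeType.Builder produces can help confirm whether the writer side matches what the comparator expects. This is an illustration, not Cassandra's code:

```java
import java.io.ByteArrayOutputStream;

// Sketch of Cassandra's CompositeType serialization as we understand it:
// each component is written as a 2-byte big-endian length, the component
// bytes, then a single end-of-component byte (0 for an exact value).
// Useful for sanity-checking bytes produced by CompositeType.Builder.
public class CompositeEncoder {
    public static byte[] encode(byte[]... components) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] c : components) {
            out.write((c.length >> 8) & 0xFF); // length, high byte
            out.write(c.length & 0xFF);        // length, low byte
            out.write(c, 0, c.length);         // component bytes
            out.write(0);                      // end-of-component byte
        }
        return out.toByteArray();
    }
}
```

If the comparator declared on the CF expects this per-component framing but the writer serializes plain UTF8 bytes (as a UTF8Type comparator would), the Buffer.limit failure above is exactly the kind of error a mismatch can produce.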
Re: Why data is not even distributed.
This is an issue with using the BOP. If you are just starting out, stick with the RandomPartitioner.

Cheers
- Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 5/10/2012, at 10:33 AM, Andrey Ilinykh ailin...@gmail.com wrote:
It was my first thought. Then I MD5'd the uuid and used the digest as a key:

MessageDigest md = MessageDigest.getInstance("MD5");
// in the loop
UUID uuid = UUID.randomUUID();
byte[] bytes = md.digest(asByteArray(uuid));

The result is exactly the same: the first node takes 66%, the second 33%, and the third one is empty. For some reason, rows which should be placed on the third node moved to the first one.

Address DC Rack Status State Load Effective-Ownership Token
                                                      Token(bytes[56713727820156410577229101238628035242])
127.0.0.1 datacenter1 rack1 Up Normal 7.68 MB 33.33% Token(bytes[00])
127.0.0.3 datacenter1 rack1 Up Normal 79.17 KB 33.33% Token(bytes[0113427455640312821154458202477256070485])
127.0.0.2 datacenter1 rack1 Up Normal 3.81 MB 33.33% Token(bytes[56713727820156410577229101238628035242])

On Thu, Oct 4, 2012 at 12:33 AM, Tom fivemile...@gmail.com wrote:
Hi Andrey, while the data values you generated might follow a true random distribution, your row key, a UUID, does not (because it is created on the same machines, by the same software, within a certain window of time). For example, if you were using the UUID class in Java, these are composed from several components (related to dimensions such as time and version), so you cannot expect a random distribution over the whole space.

Cheers
Tom

On Wed, Oct 3, 2012 at 5:39 PM, Andrey Ilinykh ailin...@gmail.com wrote:
Hello, everybody! I'm observing very strange behavior. I have a 3-node cluster with ByteOrderPartitioner (I run 1.1.5). I created a keyspace with a replication factor of 1. Then I created one column family and populated it with random data. I use a UUID as the row key and an Integer as the column name. Row keys were generated as UUID uuid = UUID.randomUUID(); I populated about 10 rows with 100 columns each. I would expect an equal load on each node, but the result is totally different. This is what nodetool gives me:

Address DC Rack Status State Load Effective-Ownership Token
                                                      Token(bytes[56713727820156410577229101238628035242])
127.0.0.1 datacenter1 rack1 Up Normal 27.61 MB 33.33% Token(bytes[00])
127.0.0.3 datacenter1 rack1 Up Normal 206.47 KB 33.33% Token(bytes[0113427455640312821154458202477256070485])
127.0.0.2 datacenter1 rack1 Up Normal 13.86 MB 33.33% Token(bytes[56713727820156410577229101238628035242])

One node (127.0.0.3) is almost empty. Any ideas what is wrong?

Thank you,
Andrey
Re: Query over secondary indexes
> get User where user_name = 'Vivek' is taking ages to retrieve that data. Is there anything I am doing wrong?
How long is "ages" and how many nodes do you have? Are there any errors in the server logs? When you do a get by secondary index at a CL higher than ONE, every RFth node is involved.

Cheers
- Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 5/10/2012, at 10:20 PM, Vivek Mishra mishra.v...@gmail.com wrote:
Thanks Rishabh. But I want to search over duplicate columns only.
-Vivek

On Fri, Oct 5, 2012 at 2:45 PM, Rishabh Agrawal rishabh.agra...@impetus.co.in wrote:
Try making user_name a primary key in combination with some other unique column and see if results improve.
-Rishabh

From: Vivek Mishra [mailto:mishra.v...@gmail.com]
Sent: Friday, October 05, 2012 2:35 PM
To: user@cassandra.apache.org
Subject: Query over secondary indexes

I have a column family User with an indexed column user_name. My schema has only around 0.1 million records, and user_name is duplicated across all rows. Now when I try to retrieve it as: get User where user_name = 'Vivek', it is taking ages to retrieve the data. Is there anything I am doing wrong? Also, I tried get_indexed_slices via the Thrift API with IndexClause.setCount(1); still no luck, it hangs and doesn't return even a single result. I believe 0.1 million is not a huge amount of data. Cassandra version: 1.1.2. Any idea?
-Vivek

Impetus Ranked in the Top 50 India’s Best Companies to Work For 2012. Impetus webcast ‘Designing a Test Automation Framework for Multi-vendor Interoperable Systems’ available at http://lf1.me/0E/.

NOTE: This message may contain information that is confidential, proprietary, privileged or otherwise protected by law. The message is intended solely for the named addressee. If received in error, please destroy and notify the sender. Any use of this email is prohibited when received in error.
Impetus does not represent, warrant and/or guarantee, that the integrity of this communication has been maintained nor that the communication is free of errors, virus, interception or interference.
READ messages dropped
Hi!

In the last 3 days I have seen many "READ messages dropped in last 5000ms" messages on one node of my 3-node cluster. I see no errors in the log. There are also "Finished hinted handoff of 0 rows to endpoint" messages, but I have had those for a while now, so I don't know if they are related. I am running Cassandra 1.0.8 on a 3-node cluster on EC2 m1.large instances, with a replication factor of 3 (quorum reads and writes). Does anyone have a clue what I should be looking for, or how to solve it?

Thanks,
Tamar Fraenkel
Senior Software Engineer, TOK Media
ta...@tok-media.com
Tel: +972 2 6409736
Mob: +972 54 8356490
Fax: +972 2 5612956
Re: Question regarding hinted handoffs and restoring backup in cluster
If you are restoring the backup to get back to a previous point in time, then you will want to remove all hints from the cluster. You will also want to stop recording them; IIRC the only way to do that is via the yaml config. If you are restoring the data to recover from some sort of loss, then keeping the hints in place is OK. Hope that helps.

- Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 6/10/2012, at 12:30 AM, Fredrik fredrik.l.stigb...@sitevision.se wrote:
When restoring a backup for the entire cluster, my understanding is that you must shut down the entire cluster, restore the backup, and then start up all nodes again (http://www.datastax.com/docs/1.0/operations/backup_restore). But how should I handle hinted handoffs (the Hints CF)? They're stored in the system keyspace, and according to the docs I only need to restore the specific keyspace, not the system keyspace. Won't these hinted handoffs, which aren't based on the backup, be delivered as soon as one of the nodes they're aimed at comes up, and thus be applied to the restored data? What is the recommended way to handle this situation? Removing the Hints CF from the system tables before restarting the cluster nodes?

Regards
/Fredrik
Re: rolling restart after gc_grace change
> Is it still an issue if you don't run a repair within gc_grace_seconds?
There is a potential issue. You want to make sure the tombstones are distributed to all replicas *before* gc_grace_seconds has expired. If they are not, you can have a case where some replicas compact and purge their tombstone (essentially a hard delete), while one replica keeps the original value. The result is data returning from the dead.

Cheers
- Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 6/10/2012, at 2:54 AM, Oleg Dulin oleg.du...@gmail.com wrote:
What if gc_grace_seconds is pretty low, say 2 minutes? What happens with nodetool repair? The wiki page below points at a bug that was fixed long ago. Is it still an issue if you don't run a repair within gc_grace_seconds?

On 2012-01-09 10:02:49 +0000, aaron morton said:
Nah, that's old style. gc_grace_seconds is a CF-level setting now. Make the change with "update column family" in the CLI or your favorite client.

Cheers
- Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 9/01/2012, at 9:33 PM, Igor wrote:
Hi! On http://wiki.apache.org/cassandra/Operations#Dealing_with_the_consequences_of_nodetool_repair_not_running_within_GCGraceSeconds you can read: "To minimize the amount of forgotten deletes, first increase GCGraceSeconds across the cluster (rolling restart required)". Is a rolling restart still required for 1.0.6?

--
Regards,
Oleg Dulin
NYC Java Big Data Engineer
http://www.olegdulin.com/
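The purge rule Aaron describes can be written down explicitly: a tombstone becomes purgeable once gc_grace_seconds have elapsed since the delete, and data can resurrect when a purge happens before every replica has seen the tombstone. A toy model of that rule (names are ours, not Cassandra's):

```java
// Toy model of the tombstone-purge rule: a replica may drop a tombstone
// during compaction only once gc_grace_seconds have elapsed since the
// delete. If a replica missed the delete and is not repaired before that
// deadline, the purged tombstone can no longer overwrite the stale value,
// and the deleted data "returns from the dead".
public class TombstoneRule {
    public static boolean purgeable(long deleteTimeSec, long gcGraceSec, long nowSec) {
        return nowSec > deleteTimeSec + gcGraceSec;
    }

    // Resurrection risk: the tombstone was purged somewhere while at
    // least one replica never received the delete.
    public static boolean resurrectionRisk(boolean allReplicasHaveTombstone,
                                           boolean tombstonePurged) {
        return tombstonePurged && !allReplicasHaveTombstone;
    }
}
```

With gc_grace_seconds of 2 minutes, as in Oleg's question, the purge deadline arrives long before any realistic repair cycle, which is why such a low setting is only safe if deletes can never be missed by a replica.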
Re: question about where clause of CQL update statement
What is the CF schema?

> Is it not possible to include a column in both the SET clause and in the WHERE clause? And if it is not possible, how come?
Not sure. It looks like you are after a conditional update here: you know the row is at ID 1 and you only want to update if locked = 'false'? Not sure that's supported. But I'm also not sure this is the right sort of error.

Cheers
- Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 6/10/2012, at 11:30 AM, John Sanda john.sa...@gmail.com wrote:
I am using CQL 3 and trying to execute the following:

UPDATE CHANGELOGLOCK
SET LOCKED = 'true', LOCKEDBY = '10.11.8.242 (10.11.8.242)', LOCKGRANTED = '2012-10-05 16:58:01'
WHERE ID = 1 AND LOCKED = 'false';

It gives me the error: Bad Request: PRIMARY KEY part locked found in SET part. The primary key consists only of the ID column, but I do have a secondary index on the locked column. Is it not possible to include a column in both the SET clause and in the WHERE clause? And if it is not possible, how come?

Thanks
- John
Re: Text searches and free form queries
> It works pretty fast.
Cool. Just keep an eye out for how big the Lucene token row gets.

Cheers
- Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 7/10/2012, at 2:57 AM, Oleg Dulin oleg.du...@gmail.com wrote:
So, what I ended up doing is this: as I write my records into the main CF, I tokenize the fields I want to search on using Lucene and write an index into a separate CF, such that my columns are a composite of luceneToken:recordKey. I can then search my records by doing a slice for each Lucene token in the search query and then taking the intersection of the sets. It works pretty fast.

Regards,
Oleg

On 2012-09-05 01:28:44 +0000, aaron morton said:
AFAIK if you want to keep it inside Cassandra, then it's DSE, roll your own from scratch, or start with https://github.com/tjake/Solandra. Outside of Cassandra I've heard of people using Elastic Search or Solr, which I *think* is now faster at updating the index. Hope that helps.

- Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 4/09/2012, at 3:00 AM, Andrey V. Panov panov.a...@gmail.com wrote:
Someone did search on Lucene, but for very fresh data they build the search index in memory so data becomes available for search without delays.

On 3 September 2012 22:25, Oleg Dulin oleg.du...@gmail.com wrote:
Dear Distinguished Colleagues:

--
Regards,
Oleg Dulin
NYC Java Big Data Engineer
http://www.olegdulin.com/
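The slice-then-intersect step Oleg describes is just a set intersection over the record keys returned per Lucene token. A sketch, with a plain Map standing in for the index column family (the data structure and names are ours, for illustration):

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch of the search-side intersection described above: each Lucene
// token maps (via a slice on the index CF) to the set of record keys
// containing it; ANDing the query terms is a set intersection. The Map
// stands in for the index column family here.
public class TokenIndexSearch {
    public static Set<String> search(Map<String, Set<String>> index, List<String> tokens) {
        Set<String> result = null;
        for (String token : tokens) {
            Set<String> keys = index.getOrDefault(token, Collections.emptySet());
            if (result == null) {
                result = new HashSet<>(keys); // first term seeds the result
            } else {
                result.retainAll(keys);       // AND with each further term
            }
        }
        return result == null ? Collections.emptySet() : result;
    }
}
```

Iterating terms in ascending result-set size keeps the intersection cheap, since the working set can only shrink after the first term.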
Re: Why data is not even distributed.
The problem was that I calculated 3 tokens for the random partitioner but used them with BOP, so the nodes were not supposed to be loaded evenly. That's OK, I get it. But what I don't understand is why nodetool ring shows equal ownership. This is an example: I created a small cluster with BOP and three tokens starting at 00, then I put in some random data, which is nicely distributed:

Address DC Rack Status State Load Effective-Ownership Token
                                                      Token(bytes[])
127.0.0.1 datacenter1 rack1 Up Normal 1.92 MB 33.33% Token(bytes[00])
127.0.0.2 datacenter1 rack1 Up Normal 1.93 MB 33.33% Token(bytes[])
127.0.0.3 datacenter1 rack1 Up Normal 1.99 MB 33.33% Token(bytes[])

Then I moved node 2 to 0100 and node 3 to 0200, which means node 1 owns almost everything:

Address DC Rack Status State Load Effective-Ownership Token
                                                      Token(bytes[0200])
127.0.0.1 datacenter1 rack1 Up Normal 5.76 MB 33.33% Token(bytes[00])
127.0.0.2 datacenter1 rack1 Up Normal 30.37 KB 33.33% Token(bytes[0100])
127.0.0.3 datacenter1 rack1 Up Normal 25.78 KB 33.33% Token(bytes[0200])

As you can see, all the data is located on node 1, but nodetool ring still shows 33.33% for each node. No matter how I move nodes, it always gives me 33.33%. It looks like a bug to me.

Thank you,
Andrey
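The wrapping behavior behind this can be sketched directly: BOP compares keys and tokens as unsigned byte strings, and any key above the highest token wraps around to the node with the lowest token. A toy model (not Cassandra's code; names are ours):

```java
// Toy model of ByteOrderedPartitioner placement: keys are compared to
// node tokens as unsigned byte strings. With tokens 0x00, 0x01, 0x02 on
// three nodes, any key whose first byte is above 0x02 falls in the
// wrapping range and lands on the node with the lowest token -- which is
// why almost all hash-distributed keys pile onto one node.
public class BopPlacement {
    // Unsigned lexicographic comparison of byte arrays.
    public static int compare(byte[] a, byte[] b) {
        for (int i = 0; i < Math.min(a.length, b.length); i++) {
            int d = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (d != 0) return d;
        }
        return a.length - b.length;
    }

    // Index of the owning node: the first token >= key, wrapping to 0.
    // Tokens must be sorted ascending.
    public static int ownerOf(byte[] key, byte[][] sortedTokens) {
        for (int i = 0; i < sortedTokens.length; i++) {
            if (compare(key, sortedTokens[i]) <= 0) return i;
        }
        return 0; // wraps around to the node with the lowest token
    }
}
```

With MD5-derived keys spread uniformly over 0x00..0xFF, nearly everything compares above a highest token of 0x02, so the wrap sends it all to the first node, exactly the skew seen in the ring output above.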
Re: Query over secondary indexes
It was on 1 node and there is no error in the server logs. -Vivek

On Tue, Oct 9, 2012 at 1:21 AM, aaron morton aa...@thelastpickle.com wrote:
"get User where user_name = 'Vivek' is taking ages to retrieve that data. Is there anything I am doing wrong?"
How long is ages, and how many nodes do you have? Are there any errors in the server logs? When you do a get by secondary index at a CL higher than ONE, every RFth node is involved. Cheers
- Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 5/10/2012, at 10:20 PM, Vivek Mishra mishra.v...@gmail.com wrote:
Thanks Rishabh. But I want to search over duplicate columns only. -Vivek

On Fri, Oct 5, 2012 at 2:45 PM, Rishabh Agrawal rishabh.agra...@impetus.co.in wrote:
Try making *user_name* a primary key in combination with some other unique column and see if results improve. -Rishabh

*From:* Vivek Mishra [mailto:mishra.v...@gmail.com]
*Sent:* Friday, October 05, 2012 2:35 PM
*To:* user@cassandra.apache.org
*Subject:* Query over secondary indexes

I have a column family User with an indexed column user_name. My schema has only around 0.1 million records, and user_name is duplicated across all rows. Now when I try to retrieve it as get User where user_name = 'Vivek', it takes ages to retrieve that data. Is there anything I am doing wrong? Also, I tried get_indexed_slices via the Thrift API with IndexClause.setCount(1); still no luck, it hung without returning even a single result. I believe 0.1 million is not a huge amount of data. Cassandra version: 1.1.2. Any idea? -Vivek

-- Impetus Ranked in the Top 50 India’s Best Companies to Work For 2012. Impetus webcast ‘Designing a Test Automation Framework for Multi-vendor Interoperable Systems’ available at http://lf1.me/0E/. NOTE: This message may contain information that is confidential, proprietary, privileged or otherwise protected by law. The message is intended solely for the named addressee.
If received in error, please destroy and notify the sender. Any use of this email is prohibited when received in error. Impetus does not represent, warrant and/or guarantee, that the integrity of this communication has been maintained nor that the communication is free of errors, virus, interception or interference.
Re: Query over secondary indexes
I waited at least 5 minutes before terminating it. Also, sometimes it results in a server crash as well, though the data volume is not very large. -Vivek

On Tue, Oct 9, 2012 at 7:05 AM, Vivek Mishra mishra.v...@gmail.com wrote:
It was on 1 node and there is no error in the server logs. -Vivek

On Tue, Oct 9, 2012 at 1:21 AM, aaron morton aa...@thelastpickle.com wrote:
How long is ages, and how many nodes do you have? Are there any errors in the server logs? When you do a get by secondary index at a CL higher than ONE, every RFth node is involved. Cheers
- Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
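The pathology in this thread is worth making concrete. The sketch below is a simplified illustration (not Cassandra's actual internals) of why a built-in secondary index on a column whose value is identical in every row behaves so badly: the index keeps one row per distinct value, so all 0.1 million keys pile into a single index row and the lookup degenerates into reading back essentially the whole column family.

```python
# 100,000 rows, all sharing the same user_name, as in Vivek's schema.
rows = {f"user{i}": {"user_name": "Vivek"} for i in range(100_000)}

# Simplified secondary index: one index row per distinct indexed value.
index = {}
for key, row in rows.items():
    index.setdefault(row["user_name"], []).append(key)

# The 'get User where user_name = Vivek' lookup hits one giant index row:
print(len(index))           # 1      -- a single, very wide index row
print(len(index["Vivek"]))  # 100000 -- every key matches the predicate
```

Secondary indexes pay off when the indexed value selects a small fraction of rows; with a value duplicated across all rows, the query must materialize every row, which lines up with the long hang Vivek reports.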
Nodetool repair, exit code/status?
Hello. In the process of trying to streamline and provide better reporting for various data storage systems, I've realized that although we're verifying that nodetool repair runs, we're not verifying that it is successful. I found a bug relating to the exit code for nodetool repair, where, in some situations, there is no way to verify the repair has completed successfully: https://issues.apache.org/jira/browse/CASSANDRA-2666 Is this still a problem? What is the best way to monitor the final status of the repair command to make sure all is well? Thank you ahead of time for any info. - David
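While CASSANDRA-2666 is open, one pragmatic workaround is to treat the exit code as necessary but not sufficient. Below is a minimal wrapper sketch for a cron or monitoring script: it checks the return code and also scans the command's combined output for error markers. The nodetool invocation in the comment is illustrative; the demonstration uses a stand-in command so the sketch runs anywhere.

```python
import subprocess
import sys

# Markers whose presence in the output suggests the repair did not complete cleanly.
ERROR_MARKERS = ("error", "exception", "failed")

def run_and_verify(cmd):
    """Run cmd; return (ok, stdout). ok is True only if the exit code is 0
    AND no error marker appears in stdout/stderr."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    output = (proc.stdout + proc.stderr).lower()
    ok = proc.returncode == 0 and not any(m in output for m in ERROR_MARKERS)
    return ok, proc.stdout

# Real usage would look something like:
#   ok, out = run_and_verify(["nodetool", "-h", "localhost", "repair", "my_keyspace"])
# Stand-in demonstration:
ok, out = run_and_verify([sys.executable, "-c", "print('repair completed')"])
print(ok)  # True
```

This is only a heuristic; the more reliable signal is the server-side log line indicating each repair session completed, so a belt-and-braces approach also tails the Cassandra log.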
Using Composite columns
Hi, I am trying to use a compound primary key with Cassandra, referring to http://www.datastax.com/dev/blog/whats-new-in-cql-3-0. I created a column family as:

CREATE TABLE altercations (
    instigator text,
    started_at timestamp,
    ships_destroyed int,
    energy_used float,
    alliance_involvement boolean,
    PRIMARY KEY (instigator, started_at)
);

Then I tried:

cqlsh:testcomp> select * from altercations;

which gives me no results, which looks fine. Then I tried an insert statement:

INSERT INTO altercations (instigator, started_at, ships_destroyed, energy_used, alliance_involvement)
VALUES ('Jayne Cobb', '7943-07-23', 2, 4.6, 'false');

(success with this). Then again I tried select * from altercations; and it gives me an error: [timestamp out of range for platform time_t]. I am able to make it work by changing '7943-07-23' to '2012-07-23'. I just wanted to know: why does Cassandra not complain about {timestamp out of range for platform time_t} at the time of persisting it? -Vivek
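A plausible explanation of the asymmetry Vivek asks about, sketched in Python: the timestamp is stored as a 64-bit millisecond count, which holds the year 7943 with room to spare, so the write succeeds; the "timestamp out of range for platform time_t" error appears only when cqlsh formats the value for display through time_t-based APIs, which on many platforms top out at the 32-bit limit in 2038.

```python
from datetime import datetime

EPOCH = datetime(1970, 1, 1)
TIME_T_32_MAX = 2**31 - 1  # seconds: the 32-bit time_t ceiling (2038-01-19)

def epoch_seconds(dt):
    # Pure arithmetic, deliberately avoiding the platform time_t APIs.
    return (dt - EPOCH).total_seconds()

far = epoch_seconds(datetime(7943, 7, 23))
near = epoch_seconds(datetime(2012, 7, 23))

print(far > TIME_T_32_MAX)    # True: formatting via 32-bit time_t fails
print(near <= TIME_T_32_MAX)  # True: displays fine
print(far * 1000 < 2**63)     # True: the stored 64-bit millis are never the problem
```

So the value is perfectly storable; it is only the read-side display path that rejects it, which is why the error surfaces at select time rather than insert time.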
RE: Using compound primary key
Did you use the --cql3 option with the cqlsh command?

From: Vivek Mishra [mailto:mishra.v...@gmail.com]
Sent: Monday, October 08, 2012 7:22 PM
To: user@cassandra.apache.org
Subject: Using compound primary key

Hi, I am trying to use a compound primary key and I am referring to http://www.datastax.com/dev/blog/whats-new-in-cql-3-0. As mentioned in that example, I tried to create a column family containing a compound primary key as:

CREATE TABLE altercations (
    instigator text,
    started_at timestamp,
    ships_destroyed int,
    energy_used float,
    alliance_involvement boolean,
    PRIMARY KEY (instigator, started_at, ships_destroyed)
);

And I am getting:

TSocket read 0 bytes

The insert and select statements that follow then give me errors:

cqlsh:testcomp> INSERT INTO altercations (instigator, started_at, ships_destroyed,
            ...     energy_used, alliance_involvement)
            ...     VALUES ('Jayne Cobb', '2012-07-23', 2, 4.6, 'false');
TSocket read 0 bytes

cqlsh:testcomp> select * from altercations;
Traceback (most recent call last):
  File "bin/cqlsh", line 1008, in perform_statement
    self.cursor.execute(statement, decoder=decoder)
  File "bin/../lib/cql-internal-only-1.0.10.zip/cql-1.0.10/cql/cursor.py", line 117, in execute
    response = self.handle_cql_execution_errors(doquery, prepared_q, compress)
  File "bin/../lib/cql-internal-only-1.0.10.zip/cql-1.0.10/cql/cursor.py", line 132, in handle_cql_execution_errors
    return executor(*args, **kwargs)
  File "bin/../lib/cql-internal-only-1.0.10.zip/cql-1.0.10/cql/cassandra/Cassandra.py", line 1583, in execute_cql_query
    self.send_execute_cql_query(query, compression)
  File "bin/../lib/cql-internal-only-1.0.10.zip/cql-1.0.10/cql/cassandra/Cassandra.py", line 1593, in send_execute_cql_query
    self._oprot.trans.flush()
  File "bin/../lib/thrift-python-internal-only-0.7.0.zip/thrift/transport/TTransport.py", line 293, in flush
    self.__trans.write(buf)
  File "bin/../lib/thrift-python-internal-only-0.7.0.zip/thrift/transport/TSocket.py", line 117, in write
    plus = self.handle.send(buff)
error: [Errno 32] Broken pipe

cqlsh:testcomp>

Any idea? Is it a problem with CQL3 or with Cassandra? P.S.: I posted the same query on the dev group as well to get a quick response. -Vivek
Re: Using compound primary key
Certainly, as these are available with CQL3 only! The example mentioned on the DataStax website works fine; the only difference is that I tried a compound primary key with 3 composite columns in place of 2. -Vivek

On Tue, Oct 9, 2012 at 7:57 AM, Arindam Barua aba...@247-inc.com wrote:
Did you use the --cql3 option with the cqlsh command?
Re: Using compound primary key
Hey Vivek, The same thing happened to me the other day. You may be missing a component in your compound key. See this thread: http://mail-archives.apache.org/mod_mbox/cassandra-dev/201210.mbox/%3ccajhhpg20rrcajqjdnf8sf7wnhblo6j+aofksgbxyxwcoocg...@mail.gmail.com%3E I also wrote a couple of blogs on it:
http://brianoneill.blogspot.com/2012/09/composite-keys-connecting-dots-between.html
http://brianoneill.blogspot.com/2012/10/cql-astyanax-and-compoundcomposite-keys.html
They've fixed this in the 1.2 beta, whereby it checks (at the Thrift layer) to ensure you have the requisite number of components in the compound/composite key. -brian

On Oct 8, 2012, at 10:32 PM, Vivek Mishra wrote:
Certainly, as these are available with CQL3 only! The example mentioned on the DataStax website works fine; the only difference is that I tried a compound primary key with 3 composite columns in place of 2. -Vivek

On Tue, Oct 9, 2012 at 7:57 AM, Arindam Barua aba...@247-inc.com wrote:
Did you use the --cql3 option with the cqlsh command?

--
Brian ONeill
Lead Architect, Health Market Science (http://healthmarketscience.com)
mobile: 215.588.6024
blog: http://weblogs.java.net/blog/boneill42/
blog: http://brianoneill.blogspot.com/
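The 1.2-beta check Brian describes can be sketched as a simple validation step: before a write is accepted, verify that the composite supplies every component of the compound key, in order, and reject it with a clear error instead of dropping the connection. This is an illustration only; the function and variable names are hypothetical, not Cassandra's actual code.

```python
def validate_compound_key(schema_components, provided):
    """schema_components: ordered key component names from the table schema.
    provided: mapping of component name -> value supplied by the client."""
    missing = [c for c in schema_components if provided.get(c) is None]
    if missing:
        # A clear, client-visible error instead of a broken-pipe crash.
        raise ValueError(f"compound key missing components: {missing}")
    return tuple(provided[c] for c in schema_components)

schema = ["instigator", "started_at", "ships_destroyed"]

# All three key components present: the write is accepted.
key = validate_compound_key(schema, {
    "instigator": "Jayne Cobb",
    "started_at": "2012-07-23",
    "ships_destroyed": 2,
})
print(key)  # ('Jayne Cobb', '2012-07-23', 2)

# Dropping a component is rejected up front.
try:
    validate_compound_key(schema, {"instigator": "Jayne Cobb", "started_at": "2012-07-23"})
except ValueError as e:
    print(e)
```

The contrast with the thread above is the point: pre-1.2, a malformed composite surfaced only as "TSocket read 0 bytes" and a broken pipe; validating at the boundary turns it into an actionable error message.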
Re: Using compound primary key
Hi Brian, Thanks for these references. They will surely help, as I am on my way to integrating this within Kundera. Surprisingly, the column family itself was not created with the example I was trying. Thanks again, -Vivek

On Tue, Oct 9, 2012 at 8:33 AM, Brian O'Neill b...@alumni.brown.edu wrote:
Hey Vivek, The same thing happened to me the other day. You may be missing a component in your compound key. See this thread: http://mail-archives.apache.org/mod_mbox/cassandra-dev/201210.mbox/%3ccajhhpg20rrcajqjdnf8sf7wnhblo6j+aofksgbxyxwcoocg...@mail.gmail.com%3E I also wrote a couple of blogs on it:
http://brianoneill.blogspot.com/2012/09/composite-keys-connecting-dots-between.html
http://brianoneill.blogspot.com/2012/10/cql-astyanax-and-compoundcomposite-keys.html
They've fixed this in the 1.2 beta, whereby it checks (at the Thrift layer) to ensure you have the requisite number of components in the compound/composite key. -brian