Re: Java heap space on Cassandra start up version 1.0.10
You may have a corrupt metadata/statistics sstable component. You can try deleting those and restarting; Cassandra can rebuild that component if it is missing.

On Fri, Jul 6, 2012 at 6:00 PM, Jason Hill jasonhill...@gmail.com wrote:

Hello friends, I'm getting the following error when I start Cassandra:

ERROR 22:50:29,695 Fatal exception in thread Thread[SSTableBatchOpen:2,5,main]
java.lang.OutOfMemoryError: Java heap space

This node was running fine, and after some server work/upgrades it started throwing this error when I start the Cassandra service. I was on 0.8.? and have upgraded to 1.0.10 to see if it would help, but I get the same error. I've removed some of the column families from my keyspace directory to see if I can get it to start without the heap space error, and with some combinations it will run. However, I'd like to get it running with all my colFams, and I wonder if someone could give me some advice on what might be causing my error. It doesn't seem to be related to compaction, if I am reading the log correctly, and most of the help I've found on this topic deals with compaction. I'm thinking that my 2 column families should not be enough to fill my heap, but I am at a loss as to what I should try next. Thanks for your consideration.

output.log:

INFO 22:50:26,319 JVM vendor/version: Java HotSpot(TM) 64-Bit Server VM/1.6.0_26
INFO 22:50:26,322 Heap size: 5905580032/5905580032
INFO 22:50:26,322 Classpath: /usr/share/cassandra/lib/antlr-3.2.jar:/usr/share/cassandra/lib/avro-1.4.0-fixes.jar:/usr/share/cassandra/lib/avro-1.4.0-sources-fixes.jar:/usr/share/cassandra/lib/commons-cli-1.1.jar:/usr/share/cassandra/lib/commons-codec-1.2.jar:/usr/share/cassandra/lib/commons-lang-2.4.jar:/usr/share/cassandra/lib/compress-lzf-0.8.4.jar:/usr/share/cassandra/lib/concurrentlinkedhashmap-lru-1.2.jar:/usr/share/cassandra/lib/guava-r08.jar:/usr/share/cassandra/lib/high-scale-lib-1.1.2.jar:/usr/share/cassandra/lib/jackson-core-asl-1.4.0.jar:/usr/share/cassandra/lib/jackson-mapper-asl-1.4.0.jar:/usr/share/cassandra/lib/jamm-0.2.5.jar:/usr/share/cassandra/lib/jline-0.9.94.jar:/usr/share/cassandra/lib/json-simple-1.1.jar:/usr/share/cassandra/lib/libthrift-0.6.jar:/usr/share/cassandra/lib/log4j-1.2.16.jar:/usr/share/cassandra/lib/servlet-api-2.5-20081211.jar:/usr/share/cassandra/lib/slf4j-api-1.6.1.jar:/usr/share/cassandra/lib/slf4j-log4j12-1.6.1.jar:/usr/share/cassandra/lib/snakeyaml-1.6.jar:/usr/share/cassandra/lib/snappy-java-1.0.4.1.jar:/usr/share/cassandra/apache-cassandra-1.0.10.jar:/usr/share/cassandra/apache-cassandra-thrift-1.0.10.jar:/usr/share/cassandra/apache-cassandra.jar:/usr/share/java/jna.jar:/etc/cassandra:/usr/share/java/commons-daemon.jar:/usr/share/cassandra/lib/jamm-0.2.5.jar
INFO 22:50:28,586 JNA mlockall successful
INFO 22:50:28,593 Loading settings from file:/etc/cassandra/cassandra.yaml
DEBUG 22:50:28,677 Syncing log with a period of 1
INFO 22:50:28,677 DiskAccessMode 'auto' determined to be mmap, indexAccessMode is mmap
INFO 22:50:28,686 Global memtable threshold is enabled at 1877MB
DEBUG 22:50:28,761 setting auto_bootstrap to true
snip
DEBUG 22:50:28,797 Checking directory /var/lib/cassandra/data
DEBUG 22:50:28,798 Checking directory /var/lib/cassandra/commitlog
DEBUG 22:50:28,798 Checking directory /var/lib/cassandra/saved_caches
DEBUG 22:50:28,806 Removing compacted SSTable files from NodeIdInfo (see http://wiki.apache.org/cassandra/MemtableSSTable)
DEBUG 22:50:28,808 Removing compacted SSTable files from Versions (see http://wiki.apache.org/cassandra/MemtableSSTable)
DEBUG 22:50:28,818 Removing compacted SSTable files from Versions.76657273696f6e (see http://wiki.apache.org/cassandra/MemtableSSTable)
DEBUG 22:50:28,819 Removing compacted SSTable files from IndexInfo (see http://wiki.apache.org/cassandra/MemtableSSTable)
DEBUG 22:50:28,821 Removing compacted SSTable files from Schema (see http://wiki.apache.org/cassandra/MemtableSSTable)
DEBUG 22:50:28,823 Removing compacted SSTable files from Migrations (see http://wiki.apache.org/cassandra/MemtableSSTable)
DEBUG 22:50:28,825 Removing compacted SSTable files from LocationInfo (see http://wiki.apache.org/cassandra/MemtableSSTable)
DEBUG 22:50:28,827 Removing compacted SSTable files from HintsColumnFamily (see http://wiki.apache.org/cassandra/MemtableSSTable)
DEBUG 22:50:28,833 Initializing system.NodeIdInfo
DEBUG 22:50:28,839 Starting CFS NodeIdInfo
DEBUG 22:50:28,868 Creating IntervalNode from []
DEBUG 22:50:28,869 KeyCache capacity for NodeIdInfo is 1
DEBUG 22:50:28,871 Initializing system.Versions
DEBUG 22:50:28,873 Starting CFS Versions
INFO 22:50:28,877 Opening /var/lib/cassandra/data/system/Versions-hd-5 (248 bytes)
DEBUG 22:50:28,879 Load metadata for /var/lib/cassandra/data/system/Versions-hd-5
INFO 22:50:28,880 Opening /var/lib/cassandra/data/system/Versions-hd-6 (248 bytes)
DEBUG 22:50:28,880 Load metadata for
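A minimal sketch of the fix suggested at the top of this thread, assuming a stock package install and a placeholder keyspace name (stop the node first; only the Statistics component is removed, the Data/Index/Filter files stay intact):

sudo service cassandra stop
# list, then delete, only the per-sstable Statistics components
find /var/lib/cassandra/data/MyKeyspace -name '*-Statistics.db' -print
find /var/lib/cassandra/data/MyKeyspace -name '*-Statistics.db' -delete
sudo service cassandra start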
Re: BulkLoading sstables from v1.0.3 to v1.1.1
Thanks Ivo. We are quite close to releasing, so we'd hope to understand what's causing the error and try to avoid it where possible. As said, it seems to work OK the first time round. The problem you were referring to in the last mail, was it restricted to bulk loading or otherwise? Thanks -A

On 10 Jul 2012 07:20, Ivo Meißner i...@overtronic.com wrote:

Hi, there are some problems in version 1.1.1 with secondary indexes and key caches that are fixed in 1.1.2. I would try to upgrade to 1.1.2 and see if the error still occurs. Ivo

Hi, As part of continuous development of a system migration, we have a test build that takes a snapshot of a keyspace from Cassandra v1.0.3 and bulk loads it into a cluster of 1.1.1 using sstableloader.sh. Not sure if relevant, but one of the CFs contains a secondary index. The build basically does: drop the destination keyspace if it exists; add the destination keyspace and wait for the schema to agree; run sstableloader; do some validation of the streamed data. The keyspace / column family schemas are basically the same, apart from the fact that in the v1.1.1 one we had compression and key cache switched on. On a clean cluster (empty data, commit log, saved-cache dirs) the sstables loaded beautifully. But a subsequent build failed with:

[21:02:02][exec] progress: [snip ip_addresses]... [total: 0 - 0MB/s (avg: 0MB/s)]
ERROR 21:02:02,811 Error in ThreadPoolExecutor
java.lang.RuntimeException: java.net.SocketException: Connection reset
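For reference, the snapshot-and-load cycle described above looks roughly like this on the command line (host names, keyspace name, and snapshot tag are placeholders; sstableloader options differ between 1.0.x and 1.1.x, so check --help on your version):

nodetool -h source-node snapshot MyKeyspace
# gather the snapshotted sstables into a directory named after the keyspace
mkdir -p /tmp/MyKeyspace
cp /var/lib/cassandra/data/MyKeyspace/snapshots/<tag>/* /tmp/MyKeyspace/
sstableloader -d target-node /tmp/MyKeyspace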
Re: Serious issue updating Cassandra version and topology
To be clear, this happened on a 1.1.2 node, and it happened again *after* you had run a scrub? Has this cluster been around for a while, or was the data created with 1.1? Can you confirm that all sstables were re-written for the CF? Check the timestamp on the files. Also, all files should have the same version, the -h?- part of the name. Can you repair the other CFs? If this cannot be repaired by scrub or upgradesstables, you may need to cut the row out of the sstables, using sstable2json and json2sstable.

Cheers
- Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 8/07/2012, at 4:05 PM, Michael Theroux wrote:

Hello, We're in the process of trying to move a 6-node cluster from RF=1 to RF=3. Once our replication factor was upped to 3, we ran nodetool repair, and immediately hit an issue on the first node we ran repair on:

INFO 03:08:51,536 Starting repair command #1, repairing 2 ranges.
INFO 03:08:51,552 [repair #3e724fe0-c8aa-11e1--4f728ab9d6ff] new session: will sync xxx-xx-xx-xxx-132.compute-1.amazonaws.com/10.202.99.101, /10.29.187.61 on range (Token(bytes[d558]),Token(bytes[])] for x.[a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s]
INFO 03:08:51,555 [repair #3e724fe0-c8aa-11e1--4f728ab9d6ff] requesting merkle trees for a (to [/10.29.187.61, xxx-xx-xx-xxx-compute-1.amazonaws.com/10.202.99.101])
INFO 03:08:52,719 [repair #3e724fe0-c8aa-11e1--4f728ab9d6ff] Received merkle tree for a from /10.29.187.61
INFO 03:08:53,518 [repair #3e724fe0-c8aa-11e1--4f728ab9d6ff] Received merkle tree for a from xxx-xx-xx-xxx-.compute-1.amazonaws.com/10.202.99.101
INFO 03:08:53,519 [repair #3e724fe0-c8aa-11e1--4f728ab9d6ff] requesting merkle trees for b (to [/10.29.187.61, xxx-xx-xx-xxx-132.compute-1.amazonaws.com/10.202.99.101])
INFO 03:08:53,639 [repair #3e724fe0-c8aa-11e1--4f728ab9d6ff] Endpoints /10.29.187.61 and xxx-xx-xx-xxx-132.compute-1.amazonaws.com/10.202.99.101 are consistent for a
INFO 03:08:53,640 [repair #3e724fe0-c8aa-11e1--4f728ab9d6ff] a is fully synced (18 remaining column family to sync for this session)
INFO 03:08:54,049 [repair #3e724fe0-c8aa-11e1--4f728ab9d6ff] Received merkle tree for b from /10.29.187.61
ERROR 03:09:09,440 Exception in thread Thread[ValidationExecutor:1,1,main]
java.lang.AssertionError: row DecoratedKey(Token(bytes[efd5654ce92a705b14244e2f5f73ab98c3de2f66c7adbd71e0e893997e198c47]), efd5654ce92a705b14244e2f5f73ab98c3de2f66c7adbd71e0e893997e198c47) received out of order wrt DecoratedKey(Token(bytes[f33a5ad4a45e8cac7987737db246ddfe9294c95bea40f411485055f5dbecbadb]), f33a5ad4a45e8cac7987737db246ddfe9294c95bea40f411485055f5dbecbadb)
    at org.apache.cassandra.service.AntiEntropyService$Validator.add(AntiEntropyService.java:349)
    at org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:712)
    at org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:68)
    at org.apache.cassandra.db.compaction.CompactionManager$8.call(CompactionManager.java:438)
    at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)

It looks, from the log above, like the sync of the a column family was successful. However, the b column family resulted in this error. In addition, the repair hung after this error. We ran nodetool scrub on all nodes, invalidated the key and row caches, and tried again (with RF=2), and it didn't help alleviate the problem. Some other important pieces of information: we use ByteOrderedPartitioner (we MD5 hash the keys ourselves); we're using Leveled Compaction; and as we're in the middle of a transition, one node is on 1.1.2 (the one we tried repair on), the other 5 are on 1.1.1. Thanks, -Mike
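A hedged illustration of the row-surgery Aaron describes as a last resort, using the out-of-order key from the log above (keyspace, CF, and file names are made up; check the sstable2json -x and json2sstable -K/-c options on your version, and work on an offline copy):

# dump the sstable, excluding the problem row
sstable2json /var/lib/cassandra/data/ks/b-hd-42-Data.db -x efd5654ce92a705b14244e2f5f73ab98c3de2f66c7adbd71e0e893997e198c47 > b.json
# rebuild a fresh sstable from the filtered dump
json2sstable -K ks -c b b.json /var/lib/cassandra/data/ks/b-hd-43-Data.db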
Re: Effect of rangequeries with RandomPartitioner
Index files map keys (not tokens) to offsets in the data file. A range scan uses the index file to seek to the start position in the data file and then does a partial scan of the data file.

Cheers
- Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 9/07/2012, at 7:24 PM, prasenjit mukherjee wrote:

Thanks for the response. Further questions inline...

On Mon, Jul 9, 2012 at 11:50 AM, samal samalgo...@gmail.com wrote:

1. With RandomPartitioner, on a given node, are the keys sorted by their hash_values or original/unhashed keys?

hash value.

1. Based on the second answer in http://stackoverflow.com/questions/2359175/cassandra-file-structure-how-are-the-files-used it seems that the index file (for a given sstable) contains the row key (and not the hash keys). Or maybe I am missing something.

2. Do the keys in the index file (ref http://hi.csdn.net/attachment/20/28/0_1322461982l3D8.gif) actually contain hash(row_key)+row_key or something like that? Otherwise you need separate mapping info from hash_bucket -> rows for reading.

-Thanks, Prasenjit
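A toy sketch of the lookup Aaron describes, in plain Java (illustrative stand-ins, not Cassandra's actual classes): the index maps row keys to data-file offsets, so a range scan is one seek plus a sequential read.

import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

public class IndexSeekSketch {
    public static void main(String[] args) {
        // stand-in for an index file: row key -> byte offset in the data file
        NavigableMap<String, Long> index = new TreeMap<String, Long>();
        index.put("apple", 0L);
        index.put("kiwi", 4096L);
        index.put("pear", 9230L);

        // range scan starting at "b": seek to the first indexed key >= "b" ...
        Map.Entry<String, Long> entry = index.ceilingEntry("b");
        if (entry != null) {
            // ... then read the data file sequentially from that offset to the
            // end of the range (the data-file read itself is elided here)
            System.out.println("seek data file to offset " + entry.getValue());
        }
    }
}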
Re: Setting the Memtable allocator on a per CF basis
Would you guys consider adding this option to a future release?

All improvements are considered :) Please create a ticket on https://issues.apache.org/jira/browse/CASSANDRA and reference CASSANDRA-3073.

If you want I can try to create a patch myself and submit it to you?

Sounds like a plan.

Thanks
- Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 10/07/2012, at 1:47 AM, Joost van de Wijgerd wrote:

Hello Cassandra Devs, We are currently trying to optimize our Cassandra system for different workloads. One of our workloads is (very) update heavy. Currently we are running with a patch that allows the Live Ratio to go below 1.0 (lower bound set to 0.1 now), which gives us a bit better performance in terms of flushes on this particular CF. We then experienced unexpected memory issues which on further inspection seem to be related to the SlabAllocator. What happens is that we allocate a Region of 1MB every couple of seconds (the columns we write in this CF contain serialized session data and can be 100K each), so overwrites are actually done into another Region, and these Regions are only freed (most of the time) when the Memtable is flushed. We actually added some debug logs, and to write about 300MB to disk we created roughly 3000 Regions (3GB of data; some of them might be collected before the flush, but probably not much). It would be really great if we could use the native allocator only for this CF, since the SlabAllocator gives us very good results on our other CFs. (We tried running on a patched version with the HeapAllocator set but went OOM almost immediately.) I have found this issue in which Jonathan mentions he is OK with adding a configuration option: https://issues.apache.org/jira/browse/CASSANDRA-3073 Unfortunately it seems the issue was closed and nothing was implemented. Would you guys consider adding this option to a future release? The SlabAllocator should be the default, but in the CF properties the HeapAllocator could be set. If you want I can try to create a patch myself and submit it to you? Kind Regards, Joost

--
Joost van de Wijgerd
Visseringstraat 21B
1051KH Amsterdam
+31624111401
joost.van.de.wijgerd@Skype
http://www.linkedin.com/in/jwijgerd
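For the record, a purely hypothetical sketch of what such a per-CF knob might look like in cassandra-cli if the CASSANDRA-3073 idea were implemented; no memtable_allocator attribute exists today:

update column family SessionData with memtable_allocator = 'HeapAllocator';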
Re: Composite Slice Query returning non-sliced data
Ah, it's a Hector query question. You may have better luck on the Hector email list. Or if you can turn on debug logging on the server and grab the query, that would be handy. The first thing that stands out is that (in cassandra) comparison operations are not used in a slice range.

Cheers
- Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 10/07/2012, at 12:36 PM, Sunit Randhawa wrote:

Aaron, Let me start from the beginning.

1- I have a ColumnFamily called Rollup15 with the below definition:

create column family Rollup15
  with comparator = 'CompositeType(org.apache.cassandra.db.marshal.Int32Type,org.apache.cassandra.db.marshal.UTF8Type,org.apache.cassandra.db.marshal.UTF8Type)'
  and key_validation_class = UTF8Type
  and default_validation_class = UTF8Type;

2- Once created, it is empty. Below is the output of the CLI:

[default@Schema] list Rollup15;
Using default limit of 100
0 Row Returned.
Elapsed time: 16 msec(s).

3- I use the code below to insert the composite data into Cassandra:

public void insertData(String columnFamilyName, String key, String value,
                       int rollupInterval, String... columnSlice) {
    Composite colKey = new Composite();
    colKey.addComponent(rollupInterval, IntegerSerializer.get());
    if (columnSlice != null) {
        for (String colName : columnSlice) {
            colKey.addComponent(colName, serializer);
        }
    }
    createMutator(keyspace, serializer).addInsertion(key, columnFamilyName,
        createColumn(colKey, value, new CompositeSerializer(), serializer)).execute();
}

4- After insertion, below is the CLI output:

[default@Schema] list Rollup15;
Using default limit of 100
---
RowKey: query1_1337295600
=> (column=15:Composite1:Composite2, value=value123, timestamp=134187983347)
1 Row Returned.
Elapsed time: 9 msec(s).

So, there is a record with 3 composite keys (15, Composite1 and Composite2).

5- Now I am doing a fetch based on the code below. I am doing a fetch for column 15:Composite3, which I know is not there:

Composite start = new Composite();
start.addComponent(0, 15, Composite.ComponentEquality.EQUAL);
start.addComponent(1, "Composite3", Composite.ComponentEquality.EQUAL);
Composite finish = new Composite();
finish.addComponent(0, 15, Composite.ComponentEquality.EQUAL);
finish.addComponent(1, "Composite3" + Character.MAX_VALUE, Composite.ComponentEquality.GREATER_THAN_EQUAL);
SliceQuery<String, Composite, String> sq = HFactory.createSliceQuery(keyspace,
    StringSerializer.get(), new CompositeSerializer(), StringSerializer.get());
sq.setColumnFamily("Rollup15");
sq.setKey("query1_1337295600");
sq.setRange(start, finish, false, 1);
QueryResult<ColumnSlice<Composite, String>> result = sq.execute();
ColumnSlice<Composite, String> orderedRows = result.get();

6- And I get output for RowKey: query1_1337295600 as (column=15:Composite1:Composite2, value=value123, timestamp=134187983347), which should not be the case since it does not belong to the 'Composite3' slice.

Sunit.

On Sun, Jul 8, 2012 at 11:45 AM, aaron morton aa...@thelastpickle.com wrote:

Something like: This is how I did the write in the CLI and this is what it printed. And then: This is how I did the read in the CLI and this is what it printed. It's hard to imagine what data is in cassandra based on code.

cheers
- Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 7/07/2012, at 1:28 PM, Sunit Randhawa wrote:

Aaron, For writing, I am using the CLI. Below is the piece of code that is reading column names of different types.

Composite start = new Composite();
start.addComponent(0, beginTime, Composite.ComponentEquality.EQUAL);
if (columns != null) {
    int colCount = 1;
    for (String colName : columns) {
        start.addComponent(colCount, colName, Composite.ComponentEquality.EQUAL);
        colCount++;
    }
}
Composite finish = new Composite();
finish.addComponent(0, endTime, Composite.ComponentEquality.EQUAL);
if (columns != null) {
    int colCount = 1;
    for (String colName : columns) {
        if (colCount == columns.size())
            finish.addComponent(colCount, colName + Character.MAX_VALUE, Composite.ComponentEquality.GREATER_THAN_EQUAL);
            // GREATER_THAN_EQUAL is meant for any subslices to A:B:C if searched on A:B
        else
Re: Dynamic CF
On Fri, Jul 6, 2012 at 10:49 PM, Leonid Ilyevsky lilyev...@mooncapital.com wrote: At this point I am really confused about what direction Cassandra is going. CQL 3 has the benefit of composite keys, but no dynamic columns. I thought the whole point of Cassandra was to provide dynamic tables.

CQL3 absolutely provides dynamic tables/wide rows, the syntax is just different. The typical example for wide rows is a time series, for instance keeping all the events for a given event_kind in the same C* row, ordered by time. You declare that in CQL3 using:

CREATE TABLE events (
  event_kind text,
  time timestamp,
  event_name text,
  event_details text,
  PRIMARY KEY (event_kind, time)
)

The important part in such a definition is that one CQL row (i.e. a given event_kind, time, event_name, event_details) does not map to an internal Cassandra row. More precisely, all events sharing the same event_kind will be in the same internal row. This is a wide row/dynamic table in the sense of thrift.

I need to have a huge table to store market quotes, and be able to query it by name and timestamp (t1 <= t <= t2), therefore I wanted the composite key. Loading data to such a table using prepared statements (CQL 3-based) was very slow, because it makes a server call for each row.

You should use a BATCH statement, which is the equivalent of batch_mutate.

-- Sylvain
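A minimal sketch of such a BATCH against the events table above (the values are invented for illustration):

BEGIN BATCH
  INSERT INTO events (event_kind, time, event_name, event_details) VALUES ('quote', '2012-07-10 12:00:00', 'tick', 'bid=1.23');
  INSERT INTO events (event_kind, time, event_name, event_details) VALUES ('quote', '2012-07-10 12:00:10', 'tick', 'bid=1.24');
APPLY BATCH;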
Re: cannot build 1.1.2 from source
I would check whether you have a version of antlr installed on your system that takes precedence over the one distributed with C* and happens to not be compatible. I don't remember there having been much change to the CLI grammar between 1.1.1 and 1.1.2, and nobody has had that problem so far.

-- Sylvain

On Mon, Jul 9, 2012 at 8:07 PM, Arya Goudarzi gouda...@gmail.com wrote: Thanks for your response. Yes, I do that every time before I build.

On Sun, Jul 8, 2012 at 11:51 AM, aaron morton aa...@thelastpickle.com wrote: Did you try running ant clean first?

Cheers
- Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 8/07/2012, at 1:57 PM, Arya Goudarzi wrote:

Hi Fellows, I used to be able to build cassandra 1.1 up to 1.1.1 with the same set of procedures by running ant on the same machine, but now the stuff associated with gen-cli-grammar breaks the build. Any advice will be greatly appreciated. -Arya

Source: source tarball for 1.1.2 downloaded from one of the mirrors on cassandra.apache.org
OS: Ubuntu 10.04 Precise 64bit
Ant: Apache Ant(TM) version 1.8.2 compiled on December 3 2011
Maven: Apache Maven 3.0.3 (r1075438; 2011-02-28 17:31:09+)
Java: java version 1.6.0_32, Java(TM) SE Runtime Environment (build 1.6.0_32-b05), Java HotSpot(TM) 64-Bit Server VM (build 20.7-b02, mixed mode)

Buildfile: /home/arya/workspace/cassandra-1.1.2/build.xml
maven-ant-tasks-localrepo:
maven-ant-tasks-download:
maven-ant-tasks-init:
maven-declare-dependencies:
maven-ant-tasks-retrieve-build:
init-dependencies:
     [echo] Loading dependency paths from file: /home/arya/workspace/cassandra-1.1.2/build/build-dependencies.xml
init:
    [mkdir] Created dir: /home/arya/workspace/cassandra-1.1.2/build/classes/main
    [mkdir] Created dir: /home/arya/workspace/cassandra-1.1.2/build/classes/thrift
    [mkdir] Created dir: /home/arya/workspace/cassandra-1.1.2/build/test/lib
    [mkdir] Created dir: /home/arya/workspace/cassandra-1.1.2/build/test/classes
    [mkdir] Created dir: /home/arya/workspace/cassandra-1.1.2/src/gen-java
check-avro-generate:
avro-interface-generate-internode:
     [echo] Generating Avro internode code...
avro-generate:
build-subprojects:
check-gen-cli-grammar:
gen-cli-grammar:
     [echo] Building Grammar /home/arya/workspace/cassandra-1.1.2/src/java/org/apache/cassandra/cli/Cli.g
     [java] warning(209): /home/arya/workspace/cassandra-1.1.2/src/java/org/apache/cassandra/cli/Cli.g:697:1: Multiple token rules can match input such as '-': IntegerNegativeLiteral, COMMENT
     [java]
     [java] As a result, token(s) COMMENT were disabled for that input
     [java] warning(209): /home/arya/workspace/cassandra-1.1.2/src/java/org/apache/cassandra/cli/Cli.g:628:1: Multiple token rules can match input such as 'I': INCR, INDEX, Identifier
     [java]
     [java] As a result, token(s) INDEX,Identifier were disabled for that input
     [java] warning(209): /home/arya/workspace/cassandra-1.1.2/src/java/org/apache/cassandra/cli/Cli.g:628:1: Multiple token rules can match input such as '0'..'9': IP_ADDRESS, IntegerPositiveLiteral, DoubleLiteral, Identifier
     [java]
     [java] As a result, token(s) IntegerPositiveLiteral,DoubleLiteral,Identifier were disabled for that input
     [java] warning(209): /home/arya/workspace/cassandra-1.1.2/src/java/org/apache/cassandra/cli/Cli.g:628:1: Multiple token rules can match input such as 'T': TRUNCATE, TTL, Identifier
     [java]
     [java] As a result, token(s) TTL,Identifier were disabled for that input
     [java] warning(209): /home/arya/workspace/cassandra-1.1.2/src/java/org/apache/cassandra/cli/Cli.g:628:1: Multiple token rules can match input such as 'A': T__109, API_VERSION, AND, ASSUME, Identifier
     [java]
     [java] As a result, token(s) API_VERSION,AND,ASSUME,Identifier were disabled for that input
     [java] warning(209): /home/arya/workspace/cassandra-1.1.2/src/java/org/apache/cassandra/cli/Cli.g:628:1: Multiple token rules can match input such as 'E': EXIT, Identifier
     [java]
     [java] As a result, token(s) Identifier were disabled for that input
     [java] warning(209): /home/arya/workspace/cassandra-1.1.2/src/java/org/apache/cassandra/cli/Cli.g:628:1: Multiple token rules can match input such as 'L': LIST, LIMIT, Identifier
     [java]
     [java] As a result, token(s) LIMIT,Identifier were disabled for that input
     [java] warning(209): /home/arya/workspace/cassandra-1.1.2/src/java/org/apache/cassandra/cli/Cli.g:628:1: Multiple token rules can match input such as 'B': BY, Identifier
     [java]
     [java] As a result, token(s) Identifier were disabled for that input
     [java] warning(209): /home/arya/workspace/cassandra-1.1.2/src/java/org/apache/cassandra/cli/Cli.g:628:1: Multiple token rules can match input such as 'O': ON, Identifier
     [java]
     [java] As a result, token(s)
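A quick, hedged way to look for a shadowing antlr jar on a setup like the one above (typical Ubuntu/ant locations assumed; adjust paths as needed):

ls ~/.ant/lib /usr/share/ant/lib /usr/share/java 2>/dev/null | grep -i antlr
echo $CLASSPATH | tr ':' '\n' | grep -i antlr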
Trigger and customized filter
Does anyone know something about the following questions? 1. Does Cassandra support customized filters? (A customized filter means a programmer can define his own filter to select data.) 2. Does Cassandra support triggers? (A trigger has the same meaning as in an RDBMS.) Thanks in advance. Regards, Felipe Mathias Schmidt (Computer Science UFRGS, RS, Brazil)
RE: Dynamic CF
Thanks Sylvain, this is useful. So I guess, in the batch_mutate call, in the map that I pass to it, only the first element of the composite key should be used as a key (because it is the real key), and the other parts of the key should be passed as regular columns? Is this correct? While I am waiting for your confirmation, I am going to try it.

-Original Message-
From: Sylvain Lebresne [mailto:sylv...@datastax.com]
Sent: Tuesday, July 10, 2012 8:24 AM
To: user@cassandra.apache.org
Subject: Re: Dynamic CF

snip
Re: Dynamic CF
I think he means something like having a fixed set of columns in the table definition, then in the actual rows having other columns not specified in the definition, independent of the composited part of the PK. When I reviewed CQL3 for use in Gossie [1] I realized I couldn't have this, and that it would complicate things like migrations or optional columns. For this reason I didn't use CQL3 and instead wrote a row unmapper that detects the discontinuities in the composited part and uses those as the boundaries for the individual concrete rows stored in a wide row [2]. For example, given a Timeline table defined with key validation UTF8Type, column name validation CompositeType(LongType, AsciiType), value validation BytesType:

Timeline: {
  "user1": {
    134193302100: { "Author": "Tom", "Body": "Hey!" },
    134193302200: { "Author": "Paul", "Body": "Nice", "Lat": 40.0, "Lon": 20.0 },
    134193302300: { "Author": "Lana", "Body": "Cool" }
  },
  ...
}

Both of the following structs are valid and will be able to be unmapped from the wide row user1:

type Tweet struct {
    UserID string `cf:"Timeline" key:"UserID" cols:"When"`
    When   int64
    Author string
    Body   string
}

type GeoTweet struct {
    UserID string `cf:"Timeline" key:"UserID" cols:"When"`
    When   int64
    Author string
    Body   string
    Lat    float32
    Lon    float32
}

Granted, I lose database-side validation over the individual column values (BytesType), but in exchange I get very flexible rows and much nicer behaviour for model changes and migrations.

1: https://github.com/carloscm/gossie
2: https://github.com/carloscm/gossie/blob/master/src/gossie/mapping.go#L339

On 10 July 2012 14:23, Sylvain Lebresne sylv...@datastax.com wrote:

snip

--
http://www.groupalia.com/
Carlos Carrasco
IT - Software Architect
Llull, 95-97, 2º planta, 08005 Barcelona
Skype: carlos.carrasco.groupalia
www.groupalia.com
carlos.carra...@groupalia.com
Re: Dynamic CF
On Tue, Jul 10, 2012 at 4:19 PM, Carlos Carrasco carlos.carra...@groupalia.com wrote:

snip

That's exactly how CQL3 works. In that example, you would declare:

CREATE TABLE tweet (
  UserID text,
  When int,
  Author text,
  Body text,
  Lat float,
  Long float,
  PRIMARY KEY (UserId, When)
)

and that would lay out things *exactly* like your Timeline above, but with validation. The fact that you have to declare Lat and Long does not mean that every CQL row must have them.

much nicer behaviour for model changes and migrations.

Not sure what you mean by that, since adding new columns to a CQL3 definition is basically free.

-- Sylvain
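To illustrate that last point, adding a column later is a single schema statement (hedged sketch; Geohash is an invented example column):

ALTER TABLE tweet ADD Geohash text;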
Re: Dynamic CF
On Tue, Jul 10, 2012 at 4:17 PM, Leonid Ilyevsky lilyev...@mooncapital.com wrote: So I guess, in the batch_mutate call, in the map that I pass to it, only the first element of the composite key should be used as a key (because it is the real key), and the other parts of the key should be passed as regular columns? Is this correct? While I am waiting for your confirmation, I am going to try it.

I would really advise you to use the BATCH statement of CQL3 rather than the thrift batch_mutate call. If only because until https://issues.apache.org/jira/browse/CASSANDRA-4377 is resolved it won't work at all, but also because the whole point of CQL3 is to hide that kind of complexity.

-- Sylvain

snip
Re: Dynamic CF
I am confused then. I remember reviewing the source for CQL3 and finding that the row reader used the column count in the CF definition in order to find how many columns it needed to read a single row. I guess I missed a filter over the composited part, or I reviewed an old version.

On 10 July 2012 16:34, Sylvain Lebresne sylv...@datastax.com wrote:

snip
Re: Trigger and customized filter
While Jonathan and crew work on the infrastructure to support triggers: https://issues.apache.org/jira/browse/CASSANDRA-4285 we have a project going over here that provides a trigger-like capability: https://github.com/hmsonline/cassandra-triggers/ https://github.com/hmsonline/cassandra-triggers/wiki/GettingStarted We are working on enhancements that would support synchronous triggers w/ javascript. For now, they are processed asynchronously, and you implement a Java interface. -brian

On Tue, Jul 10, 2012 at 9:24 AM, Felipe Schmidt felipef...@gmail.com wrote:

snip

--
Brian ONeill
Lead Architect, Health Market Science (http://healthmarketscience.com)
mobile: 215.588.6024
blog: http://weblogs.java.net/blog/boneill42/
blog: http://brianoneill.blogspot.com/
Re: Serious issue updating Cassandra version and topology
Hello Aaron, Thank you for responding. Since the time of my original email, we noticed that data was lost in the process of performing this upgrade. We have restored from backup and are now trying this again with two changes: 1) we will be using 1.1.2 throughout the cluster; 2) we have switched back to Tiered compaction. In the process I've hit another very interesting issue that I will write a separate email about. However, to answer your questions: this happened on the 1.1.2 node, and it happened again after we ran the scrub. The data has been around for a while; we upgraded from 1.0.7 -> 1.1.2. Unfortunately, I can't check the sstables, as we've restarted the migration from the beginning. If it happens again, I'll respond with more information. Thanks again, -Mike

On Jul 10, 2012, at 5:05 AM, aaron morton wrote:

snip
RE: Dynamic CF
I see. I actually tried it, and it consistently throws an exception. Below is my test code. I have two tests: test1 is for the composite key case, and test2 is for the simple key. test2 works fine, while test1 gives me:

Exception in thread "main" InvalidRequestException(why:Not enough bytes to read value of component 0)
    at org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:20253)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
    at org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:922)
    at org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:908)
    at com.moon.cql.BatchTest.test1(BatchTest.java:99)
    at com.moon.cql.BatchTest.main(BatchTest.java:45)

So you suggest using a BATCH statement. Since I do it from Java, that means creating a huge string (I may need to update thousands of records at once) and executing it. Does it even make sense? Why is this going to be any better than simply executing a prepared statement multiple times? The only thing it does is reduce the number of calls to the server, but I have to figure out whether this is the bottleneck I need to optimize. Or maybe I need to break all my updates into a number of batches. By the way, can a batch statement be prepared? With thousands of question marks in it?

public class BatchTest {

    /**
     * @param args the command line arguments
     */
    public static void main(String[] args) throws TTransportException, InvalidRequestException,
            TException, UnavailableException, TimedOutException {
        String host = args[0];
        int port = Integer.parseInt(args[1]);
        test1(host, port);
        //test2(host, port);
    }

    private static void test1(String host, int port) throws TTransportException,
            InvalidRequestException, TException, UnavailableException, TimedOutException {
        TTransport transport = new TFramedTransport(
                new org.apache.thrift.transport.TSocket(host, port));
        transport.open();
        TProtocol protocol = new TBinaryProtocol(transport);
        Cassandra.Client client = new Cassandra.Client(protocol);
        client.set_cql_version("3.0.0");
        client.set_keyspace("test");

        Map<ByteBuffer, Map<String, List<Mutation>>> mutationMap = new HashMap<ByteBuffer, Map<String, List<Mutation>>>();
        Map<String, List<Mutation>> mutations = new HashMap<String, List<Mutation>>();
        List<Mutation> columnsMutations = new ArrayList<Mutation>();

        // key
        ByteBuffer keyBuffer = AsciiType.instance.decompose("KEY1");

        // key1 as column
        Column key1 = new Column();
        key1.setName("key1".getBytes());
        key1.setValue(LongType.instance.decompose(System.nanoTime()));
        key1.setTimestamp(System.currentTimeMillis());
        ColumnOrSuperColumn cc = new ColumnOrSuperColumn();
        cc.setColumn(key1);
        Mutation m = new Mutation();
        m.setColumn_or_supercolumn(cc);
        columnsMutations.add(m);

        // value column
        Column value = new Column();
        value.setName("value".getBytes());
        value.setValue(DoubleType.instance.decompose(5.3));
        value.setTimestamp(System.currentTimeMillis());
        cc = new ColumnOrSuperColumn();
        cc.setColumn(value);
        m = new Mutation();
        m.setColumn_or_supercolumn(cc);
        columnsMutations.add(m);

        // Inner mutation map
        mutations.put("testtable1", columnsMutations);
        // outer map : use the partition key
        mutationMap.put(keyBuffer, mutations);

        // Execute
        client.batch_mutate(mutationMap, ConsistencyLevel.ANY);
    }

    private static void test2(String host, int port) throws TTransportException,
            InvalidRequestException, TException, UnavailableException, TimedOutException {
        TTransport transport = new TFramedTransport(
                new org.apache.thrift.transport.TSocket(host, port));
        transport.open();
        TProtocol protocol = new TBinaryProtocol(transport);
        Cassandra.Client client = new Cassandra.Client(protocol);
        client.set_cql_version("3.0.0");
        client.set_keyspace("test");

        Map<ByteBuffer, Map<String, List<Mutation>>> mutationMap = new HashMap<ByteBuffer, Map<String, List<Mutation>>>();
        Map<String, List<Mutation>> mutations = new HashMap<String, List<Mutation>>();
        List<Mutation> columnsMutations = new ArrayList<Mutation>();

        // key
        ByteBuffer keyBuffer = AsciiType.instance.decompose("KEY1");

        // value column
        Column value = new Column();
        value.setName("value".getBytes());
        value.setValue(DoubleType.instance.decompose(5.3));
        value.setTimestamp(System.currentTimeMillis());
        ColumnOrSuperColumn cc = new ColumnOrSuperColumn();
        cc.setColumn(value);
        Mutation m = new Mutation();
reading deleted rows is super-slow
We're finding that reading deleted columns can be very slow, and I'm trying to get confirmation for our analysis of what happens. We wrote lots of data eons ago into fairly large rows (up to 1MB). We recently read those rows and then deleted them. After this, we ran a verification-type pass that attempts to re-read these rows and verifies that they are indeed deleted. The interval between the deletion and the verification pass was far less than gc_grace. We noticed that the verification pass took as much time as the read+delete pass(!), while verifying the non-existence of rows that never existed is blindingly fast in comparison. So it seems that Cassandra is reading the old data, reading the new tombstones, and then returning "there is no data". Functionally correct, but rather unexpected performance characteristics... Am I missing something, or is this expected? Thanks! Thorsten
Re: Composite Slice Query returning non-sliced data
I think in this case that's just Hector's way of setting the EOC byte for a component. My guess is that the composite isn't being structured correctly through Hector, as well.

On Tue, Jul 10, 2012 at 4:40 AM, aaron morton aa...@thelastpickle.com wrote: The first thing that stands out is that (in cassandra) comparison operations are not used in a slice range.

--
Tyler Hobbs
DataStax
http://datastax.com/
Re: reading deleted rows is super-slow
This is expected due to tombstones, which this explains pretty well: http://wiki.apache.org/cassandra/DistributedDeletes If you don't have any tombstones for the row, the bloom filter will let Cassandra avoid doing any disk reads at all 99% of the time.

On Tue, Jul 10, 2012 at 10:50 AM, Thorsten von Eicken t...@rightscale.com wrote:

snip

--
Tyler Hobbs
DataStax
http://datastax.com/
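A toy model of that read path in Java (stand-in collections, not Cassandra's code), showing why a never-written key short-circuits at the bloom filter while a deleted key still costs a full read:

import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class TombstoneReadSketch {
    static Set<String> bloom = new HashSet<String>();                // stand-in bloom filter
    static Map<String, String> data = new HashMap<String, String>(); // stand-in sstable data
    static Set<String> tombstones = new HashSet<String>();           // deletion markers

    static String read(String key) {
        if (!bloom.contains(key)) {
            return null; // key never written: no disk read at all
        }
        // a deleted row still passes the filter: its data and its tombstone
        // must both be read and reconciled before answering "no data"
        String value = data.get(key);
        return tombstones.contains(key) ? null : value;
    }

    public static void main(String[] args) {
        data.put("row1", "v1");
        bloom.add("row1");
        tombstones.add("row1");                 // row1 was written, then deleted
        System.out.println(read("row1"));       // null, but only after the lookup work
        System.out.println(read("neverRow"));   // null, rejected by the filter up front
    }
}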
what is the best data model for time series of small data chunks...
Hi, I have an application that consists of multiple (possibly 1000's) of measurement series, and each measurement series generates a small amount of data output (only about 500 bytes) every 10 seconds. This time series data should be stored in Cassandra in a fashion that allows read access for a given time range. What I do today is:
- assign a timeuuid to each data output
- write into two CFs:
  - the first CF has key = measurement series ID, column name = timeuuid_of_output
  - the second CF has key = timeuuid_of_output, column value = data output (~500 bytes)
When someone requests a time range of data, I read the first CF, get a series of timeuuid's, and then do a row-multiget on the second CF. This works great, but tends to be slow for big series of data (let's say for 10 days, nearly 100,000 records will be requested from the second CF). This load of 100,000 reads will be distributed through the cluster (because the second CF scales very nicely with a RandomPartitioner), but more or less one ends up with 100,000 individual read requests, at least that's what I suspect. Can anyone say if there is a better data model for this type of query? Would it be a reasonable improvement to put all data into a single CF with:
- single CF, key = measurement series ID, column name = timeuuid_of_output, column value = data output
When I request a series of 100,000 columns from this row (now it's a single row), can the performance really be better? Is there any chance that Cassandra will be able to read this data en bloc from the hard drive? Any advice is appreciated... Greetings, Roland
Re: what is the best data model for time series of small data chunks...
On Tue, Jul 10, 2012 at 12:14 PM, Roland Hänel rol...@haenel.me wrote:

snip

Would it be a reasonable improvement to put all data into a single CF with: single CF, key = measurement series ID, column name = timeuuid_of_output, column value = data output. When I request a series of 100,000 columns from this row (now it's a single row), can the performance really be better?

This is definitely the approach I would take. Reading a single row is nearly sequential, so you'll get very good performance. I recommend you check these out:
- http://rubyscale.com/blog/2011/03/06/basic-time-series-with-cassandra/
- http://www.datastax.com/dev/blog/advanced-time-series-with-cassandra

--
Tyler Hobbs
DataStax
http://datastax.com/
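One possible CQL3 shape for that single-CF layout (table and column names are assumed for illustration; the range endpoints would be bound as timeuuid values in a prepared statement):

CREATE TABLE measurements (
  series_id text,
  output_time timeuuid,
  output blob,
  PRIMARY KEY (series_id, output_time)
);

-- a 10-day read becomes one mostly-sequential slice of a single row:
SELECT output_time, output FROM measurements
WHERE series_id = 'series-42' AND output_time >= ? AND output_time <= ?;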
Using a node in separate cluster without decommissioning.
Hi, I want to take out 2 nodes from an 8 node cluster and use them in another cluster, but I can't afford the overhead of streaming the data and rebalancing the cluster. Since the replication factor is 2 in the first cluster, I won't lose any data. I'm planning to save my commit_log and data directories and bootstrap the node in the second cluster. Afterwards I'll just replace both directories and join the node back to the original cluster. This should work, since Cassandra saves all the cluster and schema info in the system keyspace. Is it advisable and safe to go ahead? Thanks, Rohit
RE: Dynamic CF
I see now there is a package org.apache.cassandra.cql3.statements, with a BatchStatement class. Is this what I should use?

-Original Message-
From: Leonid Ilyevsky [mailto:lilyev...@mooncapital.com]
Sent: Tuesday, July 10, 2012 11:45 AM
To: user@cassandra.apache.org
Subject: RE: Dynamic CF

I see. I actually tried it, and it consistently throws an exception. Below is my test code. I have two tests: test1 is for the composite key case, and test2 is for the simple key. test2 works fine, while test1 gives me:

Exception in thread "main" InvalidRequestException(why:Not enough bytes to read value of component 0)
 at org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:20253)
 at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
 at org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:922)
 at org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:908)
 at com.moon.cql.BatchTest.test1(BatchTest.java:99)
 at com.moon.cql.BatchTest.main(BatchTest.java:45)

So you suggest using a BATCH statement. Since I do it from Java, that means creating a huge string (I may need to update thousands of records at once) and executing it. Does that even make sense? Why would it be any better than simply executing a prepared statement multiple times? The only thing it does is reduce the number of calls to the server, and I still have to figure out whether that is the bottleneck I need to optimize. Or maybe I need to break all my updates into a number of batches. By the way, can a batch statement be prepared? With thousands of question marks in it?

public class BatchTest {

    /**
     * @param args the command line arguments
     */
    public static void main(String[] args) throws TTransportException,
            InvalidRequestException, TException, UnavailableException, TimedOutException {
        String host = args[0];
        int port = Integer.parseInt(args[1]);
        test1(host, port);
        //test2(host, port);
    }

    private static void test1(String host, int port) throws TTransportException,
            InvalidRequestException, TException, UnavailableException, TimedOutException {
        TTransport transport = new TFramedTransport(
                new org.apache.thrift.transport.TSocket(host, port));
        transport.open();
        TProtocol protocol = new TBinaryProtocol(transport);
        Cassandra.Client client = new Cassandra.Client(protocol);
        client.set_cql_version("3.0.0");
        client.set_keyspace("test");

        Map<ByteBuffer, Map<String, List<Mutation>>> mutationMap =
                new HashMap<ByteBuffer, Map<String, List<Mutation>>>();
        Map<String, List<Mutation>> mutations = new HashMap<String, List<Mutation>>();
        List<Mutation> columnsMutations = new ArrayList<Mutation>();

        // key
        ByteBuffer keyBuffer = AsciiType.instance.decompose("KEY1");

        // key1 as column
        Column key1 = new Column();
        key1.setName("key1".getBytes());
        key1.setValue(LongType.instance.decompose(System.nanoTime()));
        key1.setTimestamp(System.currentTimeMillis());
        ColumnOrSuperColumn cc = new ColumnOrSuperColumn();
        cc.setColumn(key1);
        Mutation m = new Mutation();
        m.setColumn_or_supercolumn(cc);
        columnsMutations.add(m);

        // value column
        Column value = new Column();
        value.setName("value".getBytes());
        value.setValue(DoubleType.instance.decompose(5.3));
        value.setTimestamp(System.currentTimeMillis());
        cc = new ColumnOrSuperColumn();
        cc.setColumn(value);
        m = new Mutation();
        m.setColumn_or_supercolumn(cc);
        columnsMutations.add(m);

        // Inner mutation map
        mutations.put("testtable1", columnsMutations);
        // outer map : use the partition key
        mutationMap.put(keyBuffer, mutations);

        // Execute
        client.batch_mutate(mutationMap, ConsistencyLevel.ANY);
    }

    private static void test2(String host, int port) throws TTransportException,
            InvalidRequestException, TException, UnavailableException, TimedOutException {
        TTransport transport = new TFramedTransport(
                new org.apache.thrift.transport.TSocket(host, port));
        transport.open();
        TProtocol protocol = new TBinaryProtocol(transport);
        Cassandra.Client client = new Cassandra.Client(protocol);
        client.set_cql_version("3.0.0");
        client.set_keyspace("test");
        Map<ByteBuffer, Map<String, List<Mutation>>> mutationMap =
                new HashMap<ByteBuffer, Map<String, List<Mutation>>>();
        Map<String, List<Mutation>> mutations = new HashMap<String, List<Mutation>>();
        List<Mutation> columnsMutations = new ArrayList<Mutation>();
        // key
        ByteBuffer keyBuffer = AsciiType.instance.decompose("KEY1");
        // value column
        Column value = new Column();
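[Editor's note] The "Not enough bytes to read value of component 0" error is consistent with the column names being sent as plain bytes while the table's comparator is a CompositeType: under CQL3, a table with a composite PRIMARY KEY stores each cell under a composite column name, so the server tries to read a 2-byte component length out of the 4 bytes of "key1" and runs out of input. A minimal sketch of hand-encoding such a name follows; it assumes (my guess, not confirmed in the thread) a table like PRIMARY KEY (key, key1) with one "value" column, where the cell is stored under the composite name (key1's value, "value") and there is no separate "key1" cell at all.

import java.nio.ByteBuffer;
import org.apache.cassandra.db.marshal.LongType;

// Sketch only. Encodes a CompositeType column name by hand: each component is
// a 2-byte big-endian length, the raw bytes, then an end-of-component byte (0).
public final class CompositeName {

    public static ByteBuffer compose(ByteBuffer... components) {
        int size = 0;
        for (ByteBuffer c : components)
            size += 2 + c.remaining() + 1;
        ByteBuffer out = ByteBuffer.allocate(size);
        for (ByteBuffer c : components) {
            out.putShort((short) c.remaining()); // component length
            out.put(c.duplicate());              // component bytes
            out.put((byte) 0);                   // end-of-component byte
        }
        out.flip();
        return out;
    }

    public static void main(String[] args) {
        // The cell for CQL3 column "value" in the row clustered by key1:
        ByteBuffer name = compose(
                LongType.instance.decompose(System.nanoTime()), // key1 component
                ByteBuffer.wrap("value".getBytes()));           // CQL3 column name
        System.out.println("encoded name length: " + name.remaining());
    }
}

Under that reading, Column.setName(name) would replace both setName("key1".getBytes()) and setName("value".getBytes()) in test1, with no separate mutation for key1.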
Re: Composite Slice Query returning non-sliced data
I have tested this extensively, and the EOC byte has a huge impact on the usability of CompositeTypes in Cassandra. As an example: if you have two composite columns such as A:B:C and A:D:C, and you search with A:B as the start and end composite components, it will return D as well, because the slice returns all the remaining columns from your start point. Similarly, if you search with A:D as the start and end composite components, it will not return B, because D comes after B. Sadly, the intro to composite types given here: http://www.datastax.com/dev/blog/introduction-to-composite-columns-part-1 also does not work. On Tue, Jul 10, 2012 at 9:24 AM, Tyler Hobbs ty...@datastax.com wrote: I think in this case that's just Hector's way of setting the EOC byte for a component. My guess is that the composite isn't being structured correctly through Hector, as well. On Tue, Jul 10, 2012 at 4:40 AM, aaron morton aa...@thelastpickle.com wrote: The first thing that stands out is that (in Cassandra) comparison operations are not used in a slice range. -- Tyler Hobbs DataStax
Re: Composite Slice Query returning non-sliced data
On Tue, Jul 10, 2012 at 2:20 PM, Sunit Randhawa sunit.randh...@gmail.com wrote: I have tested this extensively, and the EOC byte has a huge impact on the usability of CompositeTypes in Cassandra. As an example: if you have two composite columns such as A:B:C and A:D:C, and you search with A:B as the start and end composite components, it will return D as well, because the slice returns all the remaining columns from your start point. That shouldn't be happening, and I can confirm that it works correctly using pycassa. So I suspect a problem with Hector. Similarly, if you search with A:D as the start and end composite components, it will not return B, because D comes after B. This is expected behavior. Sadly, the intro to composite types given here: http://www.datastax.com/dev/blog/introduction-to-composite-columns-part-1 also does not work. On Tue, Jul 10, 2012 at 9:24 AM, Tyler Hobbs ty...@datastax.com wrote: I think in this case that's just Hector's way of setting the EOC byte for a component. My guess is that the composite isn't being structured correctly through Hector, as well. On Tue, Jul 10, 2012 at 4:40 AM, aaron morton aa...@thelastpickle.com wrote: The first thing that stands out is that (in Cassandra) comparison operations are not used in a slice range. -- Tyler Hobbs DataStax http://datastax.com/
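[Editor's note] The EOC behavior under dispute can be made concrete without any client library. Each composite component is encoded as a 2-byte length, the raw bytes, and an end-of-component (EOC) byte; stored columns use EOC 0, while a query bound positions itself just before (EOC -1) or just after (EOC 1) everything sharing its prefix. A minimal hand-rolled sketch (UTF-8 string components assumed):

import java.nio.ByteBuffer;
import java.nio.charset.Charset;

// Sketch only: hand-encodes composite slice bounds to show the EOC rules.
// A start bound whose last EOC is -1 sorts just before every stored column
// sharing its prefix; an end bound with EOC 1 sorts just after them.
public final class CompositeSlice {

    static ByteBuffer bound(byte lastEoc, String... components) {
        Charset utf8 = Charset.forName("UTF-8");
        byte[][] raw = new byte[components.length][];
        int size = 0;
        for (int i = 0; i < components.length; i++) {
            raw[i] = components[i].getBytes(utf8);
            size += 2 + raw[i].length + 1;
        }
        ByteBuffer out = ByteBuffer.allocate(size);
        for (int i = 0; i < raw.length; i++) {
            out.putShort((short) raw[i].length); // component length
            out.put(raw[i]);                     // component bytes
            out.put(i == raw.length - 1 ? lastEoc : 0); // EOC byte
        }
        out.flip();
        return out;
    }

    public static void main(String[] args) {
        // Slice exactly the columns whose first two components are A:B.
        // With these bounds, A:B:C is returned but A:D:C is not, because the
        // end bound sorts before A:D:*. Getting D back suggests the end bound
        // was not encoded as prefix-plus-EOC-1.
        ByteBuffer start = bound((byte) -1, "A", "B");
        ByteBuffer end   = bound((byte) 1,  "A", "B");
        System.out.println(start.remaining() + " / " + end.remaining());
    }
}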
help using org.apache.cassandra.cql3
I am trying to use the org.apache.cassandra.cql3 package. I am having a problem connecting to the server using ClientState. I was not sure what to put in the credentials map (I did not set any users/passwords on my server), so I tried setting empty strings for username and password, setting them to bogus values, and passing null to the login method - there was no difference. It does not complain at login(), but then it complains about setKeyspace(my keyspace), saying that the specified keyspace does not exist (it obviously does exist). The configuration was loaded from the cassandra.yaml used by the server. I did not have any problem like this when I used org.apache.cassandra.thrift.Cassandra.Client. What am I doing wrong? Appreciate your help, Leonid
failed to delete commitlog, cassandra can't accept writes
after reading the JIRA, I decided to use Java 6. With Cassandra 1.1.2 on Java 6 x64 on Win7 SP1 x64 (all latest versions), after several minutes of sustained writes, I see this in system.log:

java.io.IOError: java.io.IOException: Failed to delete C:\var\lib\cassandra\commitlog\CommitLog-948695923996466.log
 at org.apache.cassandra.db.commitlog.CommitLogSegment.discard(CommitLogSegment.java:176)
 at org.apache.cassandra.db.commitlog.CommitLogAllocator$4.run(CommitLogAllocator.java:223)
 at org.apache.cassandra.db.commitlog.CommitLogAllocator$1.runMayThrow(CommitLogAllocator.java:95)
 at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
 at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: Failed to delete C:\var\lib\cassandra\commitlog\CommitLog-948695923996466.log
 at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:54)
 at org.apache.cassandra.db.commitlog.CommitLogSegment.discard(CommitLogSegment.java:172)
 ... 4 more

Anybody seen this before? Is this related to 4337?

On Sat, Jul 7, 2012 at 6:36 PM, Frank Hsueh frank.hs...@gmail.com wrote: bug already reported: https://issues.apache.org/jira/browse/CASSANDRA-4337

On Sat, Jul 7, 2012 at 6:26 PM, Frank Hsueh frank.hs...@gmail.com wrote: Hi, I'm running Cassandra 1.1.2 on Java 7 x64 on Win7 SP1 x64 (all latest versions). If it matters, I'm using a late version of Astyanax as my client. I'm using 4 threads to write a lot of data into a single CF. After several minutes of load (~ 30m at the last incident), Cassandra stops accepting writes (the client reports an OperationTimeoutException). I looked at the logs and I see this on the Cassandra server:

ERROR 18:00:42,807 Exception in thread Thread[COMMIT-LOG-ALLOCATOR,5,main]
java.io.IOError: java.io.IOException: Rename from \var\lib\cassandra\commitlog\CommitLog-701533048437587.log to 703272597990002 failed
 at org.apache.cassandra.db.commitlog.CommitLogSegment.init(CommitLogSegment.java:127)
 at org.apache.cassandra.db.commitlog.CommitLogSegment.recycle(CommitLogSegment.java:204)
 at org.apache.cassandra.db.commitlog.CommitLogAllocator$2.run(CommitLogAllocator.java:166)
 at org.apache.cassandra.db.commitlog.CommitLogAllocator$1.runMayThrow(CommitLogAllocator.java:95)
 at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
 at java.lang.Thread.run(Thread.java:722)
Caused by: java.io.IOException: Rename from \var\lib\cassandra\commitlog\CommitLog-701533048437587.log to 703272597990002 failed
 at org.apache.cassandra.db.commitlog.CommitLogSegment.init(CommitLogSegment.java:105)
 ... 5 more

Anybody else seen this before?

-- Frank Hsueh | frank.hs...@gmail.com
Re: failed to delete commitlog, cassandra can't accept writes
oops; I missed a log line:

ERROR [COMMIT-LOG-ALLOCATOR] 2012-07-10 14:19:39,776 AbstractCassandraDaemon.java (line 134) Exception in thread Thread[COMMIT-LOG-ALLOCATOR,5,main]
java.io.IOError: java.io.IOException: Failed to delete C:\var\lib\cassandra\commitlog\CommitLog-948695923996466.log
 at org.apache.cassandra.db.commitlog.CommitLogSegment.discard(CommitLogSegment.java:176)
 at org.apache.cassandra.db.commitlog.CommitLogAllocator$4.run(CommitLogAllocator.java:223)
 at org.apache.cassandra.db.commitlog.CommitLogAllocator$1.runMayThrow(CommitLogAllocator.java:95)
 at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
 at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: Failed to delete C:\var\lib\cassandra\commitlog\CommitLog-948695923996466.log
 at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:54)
 at org.apache.cassandra.db.commitlog.CommitLogSegment.discard(CommitLogSegment.java:172)
 ... 4 more

On Tue, Jul 10, 2012 at 2:35 PM, Frank Hsueh frank.hs...@gmail.com wrote: snip (quoted text identical to the previous message)

-- Frank Hsueh | frank.hs...@gmail.com
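[Editor's note] The likely mechanism behind both the rename and delete failures (and, going by the link above, what CASSANDRA-4337 appears to track) is a long-standing JVM limitation: on Windows, a memory-mapped file cannot be renamed or deleted while any mapping of it is still live, and the mapping is only released when the buffer is garbage-collected. A small standalone sketch of the underlying behavior (hypothetical file name, nothing Cassandra-specific):

import java.io.File;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

// Sketch only: on Windows, File.delete() fails while a MappedByteBuffer for
// the file is still reachable, because the OS keeps the file open for the
// mapping even after the channel is closed.
public class MmapDeleteDemo {
    public static void main(String[] args) throws Exception {
        File f = new File("commitlog-demo.log");
        RandomAccessFile raf = new RandomAccessFile(f, "rw");
        raf.setLength(1024);
        MappedByteBuffer map =
                raf.getChannel().map(FileChannel.MapMode.READ_WRITE, 0, 1024);
        map.put(0, (byte) 1);
        raf.close(); // closing the channel does NOT release the mapping

        System.out.println("delete while mapped: " + f.delete()); // false on Windows

        map = null;  // drop the only reference to the mapping
        System.gc(); // hope the buffer's cleaner runs (not guaranteed)
        Thread.sleep(100);
        System.out.println("delete after GC: " + f.delete()); // often true now
    }
}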
Re: help using org.apache.cassandra.cql3
On Tue, Jul 10, 2012 at 3:04 PM, Leonid Ilyevsky lilyev...@mooncapital.com wrote: I am trying to use the org.apache.cassandra.cql3 package. I am having a problem connecting to the server using ClientState. I was not sure what to put in the credentials map (I did not set any users/passwords on my server), so I tried setting empty strings for "username" and "password", setting them to bogus values, and passing null to the login method - there was no difference. It does not complain at login(), but then it complains about setKeyspace(my keyspace), saying that the specified keyspace does not exist (it obviously does exist). The configuration was loaded from the cassandra.yaml used by the server. I did not have any problem like this when I used org.apache.cassandra.thrift.Cassandra.Client. What am I doing wrong? I think that package just contains server classes. Everything you need should be in org.apache.cassandra.thrift. To use CQL3 I just use the client methods 'execute_cql_query', 'prepare_cql_query' and 'execute_prepared_cql_query', after setting the CQL version to '3.0.0'. -- Derek Williams
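[Editor's note] A minimal sketch of the approach Derek describes (host, port, and the query string are placeholders):

import java.nio.ByteBuffer;
import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.Compression;
import org.apache.cassandra.thrift.CqlResult;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;

// Drives CQL3 through the regular Thrift client instead of the server-side
// org.apache.cassandra.cql3 package.
public class Cql3ViaThrift {
    public static void main(String[] args) throws Exception {
        TTransport transport = new TFramedTransport(new TSocket("localhost", 9160));
        transport.open();
        Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(transport));

        client.set_cql_version("3.0.0"); // opt in to CQL3 before any query
        CqlResult result = client.execute_cql_query(
                ByteBuffer.wrap("USE test;".getBytes("UTF-8")),
                Compression.NONE);
        System.out.println(result.getType());
        // prepare_cql_query / execute_prepared_cql_query work the same way
        // for parameterized statements.
        transport.close();
    }
}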
Re: Multiple keyspace question
A problem with many keyspaces is that clients are bound to a keyspace, so connection pooling across multiple keyspaces is an issue (see the sketch after this message). CQL has support for some limited cross-keyspace operations. On Sunday, July 8, 2012, aaron morton aa...@thelastpickle.com wrote: I would do a test to see the latency difference under load between having 1 KS with 5 CF's and 50 KS with 5 CF's. Your test will need to read and write to all the CF's. Having many CF's may result in more frequent memtable flushes. (Personally it's not an approach I would take.) Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 7/07/2012, at 8:15 AM, Shahryar Sedghi wrote: Aaron, I am going to have many (over 50 eventually) keyspaces with a limited number of CFs (5-6); do you think this can cause a problem too? Thanks On Fri, Jul 6, 2012 at 2:28 PM, aaron morton aa...@thelastpickle.com wrote: Also, all CF's in the same KS share one commit log. So all writes for the same row key, across all CF's, are committed at the same time. Some other settings, such as caches in 1.1, are machine wide. If you have a small KS for something like app config, I'd say go with whatever feels right. If you are talking about two full application KS's, I would think about their prospective workloads and growth patterns. Will you always want to manage the two together? Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 6/07/2012, at 9:47 PM, Robin Verlangen wrote: Hi Ben, The number of keyspaces is not the problem: the number of column families is. Each column family adds a certain amount of memory usage to the system. You can cope with this by adding memory or by using generic column families that store different types of data. With kind regards, Robin Verlangen Software engineer W http://www.robinverlangen.nl E ro...@us2.nl 2012/7/6 Ben Kaehne ben.kae...@sirca.org.au Good evening, I have read in a few discussions that multiple keyspaces are bad, but to what extent? We have some reasonably powerful machines and are looking to host an additional 2 keyspaces (currently we have 1) within our Cassandra cluster (of 3 nodes, using RF 3). At what point does adding extra keyspaces start becoming an issue? Is there anything special we should be considering or watching out for as we implement this? I could not imagine that all Cassandra users out there are running one massive keyspace, and at the same time cannot imagine that all Cassandra users have multiple clusters just to host different keyspaces. Regards. -- -Ben
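[Editor's note] To make the pooling point concrete: a Thrift connection serves whatever keyspace set_keyspace() last selected, so a client covering many keyspaces typically ends up keying its pools by keyspace name. A toy sketch (one connection per keyspace, hypothetical helper, not a real pooling library):

import java.util.HashMap;
import java.util.Map;
import org.apache.cassandra.thrift.Cassandra;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TSocket;

// Illustrates why many keyspaces complicate pooling: each connection is
// bound to one keyspace, so connections cannot be shared across keyspaces.
public class KeyspacePools {
    private final Map<String, Cassandra.Client> byKeyspace =
            new HashMap<String, Cassandra.Client>();

    public synchronized Cassandra.Client get(String keyspace) throws Exception {
        Cassandra.Client client = byKeyspace.get(keyspace);
        if (client == null) {
            TFramedTransport transport =
                    new TFramedTransport(new TSocket("localhost", 9160));
            transport.open();
            client = new Cassandra.Client(new TBinaryProtocol(transport));
            client.set_keyspace(keyspace); // the connection is now bound to it
            byKeyspace.put(keyspace, client);
        }
        return client;
    }
}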
Cassandra takes 100% CPU for 2-3 minutes every half an hour and mutations are lost
Hi, I am encountering a high CPU problem with Cassandra 1.0.3; it happens with both size-tiered and leveled compaction, 6G heap, 64-bit Oracle Java. Under normal traffic Cassandra uses about 15% CPU, but every half an hour it uses almost 100% of total CPU (SUSE, 12 cores). Here is the top output for that moment:

#top -H -p 12451
top - 12:30:14 up 15 days, 12:49, 6 users, load average: 10.52, 8.92, 8.14
Tasks: 706 total, 21 running, 685 sleeping, 0 stopped, 0 zombie
Cpu(s): 25.7%us, 14.0%sy, 48.9%ni, 6.5%id, 0.0%wa, 0.0%hi, 4.9%si, 0.0%st
Mem: 24150M total, 12218M used, 11932M free, 142M buffers
Swap: 0M total, 0M used, 0M free, 3714M cached

  PID USER   PR NI  VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
20291 casadm 24  4 8003m 5.4g 167m R   92 22.7 0:42.46 java
20276 casadm 24  4 8003m 5.4g 167m R   88 22.7 0:43.88 java
20181 casadm 24  4 8003m 5.4g 167m R   86 22.7 0:52.97 java
20213 casadm 24  4 8003m 5.4g 167m R   85 22.7 0:49.21 java
20188 casadm 24  4 8003m 5.4g 167m R   82 22.7 0:54.34 java
20268 casadm 24  4 8003m 5.4g 167m R   81 22.7 0:46.25 java
20269 casadm 24  4 8003m 5.4g 167m R   41 22.7 0:15.11 java
20316 casadm 24  4 8003m 5.4g 167m S   20 22.7 0:02.35 java
20191 casadm 24  4 8003m 5.4g 167m R   15 22.7 0:16.85 java
12500 casadm 20  0 8003m 5.4g 167m R    6 22.7 1:07.86 java
15245 casadm 20  0 8003m 5.4g 167m D    5 22.7 0:36.45 java

While the CPU is pegged, jstack cannot print the stacks:

Thread 20291: (state = IN_JAVA) Error occurred during stack walking: ...
Thread 20276: (state = IN_JAVA) Error occurred during stack walking:

After it comes back, the stack shows:

Thread 20291: (state = BLOCKED)
 - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may be imprecise)
 - java.util.concurrent.locks.LockSupport.parkNanos(java.lang.Object, long) @bci=20, line=196 (Compiled frame)
 - java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(java.util.concurrent.SynchronousQueue$TransferStack$SNode, boolean, long) @bci=174, line=424 (Compiled frame)
 - java.util.concurrent.SynchronousQueue$TransferStack.transfer(java.lang.Object, boolean, long) @bci=102, line=323 (Compiled frame)
 - java.util.concurrent.SynchronousQueue.poll(long, java.util.concurrent.TimeUnit) @bci=11, line=874 (Compiled frame)
 - java.util.concurrent.ThreadPoolExecutor.getTask() @bci=62, line=945 (Compiled frame)
 - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=18, line=907 (Compiled frame)
 - java.lang.Thread.run() @bci=11, line=662 (Interpreted frame)

And after this happens, the data is not correct: some large columns that were supposed to be deleted come back again.
Here is the suspect thread when CPU is at 100%:

Thread 20191: (state = IN_VM)
 - sun.misc.Unsafe.unpark(java.lang.Object) @bci=0 (Compiled frame; information may be imprecise)
 - java.util.concurrent.locks.LockSupport.unpark(java.lang.Thread) @bci=8, line=122 (Compiled frame)
 - java.util.concurrent.SynchronousQueue$TransferStack$SNode.tryMatch(java.util.concurrent.SynchronousQueue$TransferStack$SNode) @bci=34, line=242 (Compiled frame)
 - java.util.concurrent.SynchronousQueue$TransferStack.transfer(java.lang.Object, boolean, long) @bci=268, line=344 (Compiled frame)
 - java.util.concurrent.SynchronousQueue.offer(java.lang.Object) @bci=19, line=846 (Compiled frame)
 - java.util.concurrent.ThreadPoolExecutor.execute(java.lang.Runnable) @bci=43, line=653 (Compiled frame)
 - java.util.concurrent.AbstractExecutorService.submit(java.util.concurrent.Callable) @bci=20, line=92 (Compiled frame)
 - org.apache.cassandra.db.compaction.ParallelCompactionIterable$Reducer.getCompactedRow(java.util.List) @bci=86, line=190 (Compiled frame)
 - org.apache.cassandra.db.compaction.ParallelCompactionIterable$Reducer.getReduced() @bci=31, line=164 (Compiled frame)
 - org.apache.cassandra.db.compaction.ParallelCompactionIterable$Reducer.getReduced() @bci=1, line=144 (Compiled frame)
 - org.apache.cassandra.utils.MergeIterator$ManyToOne.consume() @bci=88, line=116 (Compiled frame)
 - org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext() @bci=5, line=99 (Compiled frame)
 - com.google.common.collect.AbstractIterator.tryToComputeNext() @bci=9, line=140 (Compiled frame)
 - com.google.common.collect.AbstractIterator.hasNext() @bci=61, line=135 (Compiled frame)
 - org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext() @bci=4, line=103 (Compiled frame)
 - org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext() @bci=1, line=90 (Compiled frame)
 - com.google.common.collect.AbstractIterator.tryToComputeNext() @bci=9, line=140 (Compiled frame)
 - com.google.common.collect.AbstractIterator.hasNext() @bci=61, line=135 (Compiled frame)
 - com.google.common.collect.Iterators$7.computeNext() @bci=4, line=614 (Compiled frame)
 -
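[Editor's note] All of the Cassandra frames in that stack sit inside ParallelCompactionIterable, which is only used when multithreaded compaction is enabled, so one experiment worth trying (an educated guess from the stack, not a confirmed fix) is disabling it and relying on throughput throttling instead. The relevant cassandra.yaml settings:

# Disable parallel (multi-core) compaction of a single CF; the hot
# ParallelCompactionIterable code path is only taken when this is true.
multithreaded_compaction: false

# Throttle total compaction throughput across the node instead;
# 16 is the 1.0-era default.
compaction_throughput_mb_per_sec: 16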