Secondary index query + 2 Datacenters + Row Cache + Restart = 0 rows
Hello,

I've found a combination that doesn't work: a column family that has a secondary index and caching='ALL', with data in two datacenters; after I restart the nodes, my secondary index queries start returning 0 rows. It happens when the amount of data goes over a certain threshold, so I suspect that compactions are involved in this as well. Taking out any one of the ingredients fixes the problem and my queries return rows from the secondary index. I suspect that this person is struggling with the same thing: https://issues.apache.org/jira/browse/CASSANDRA-4785

Here is a sequence of actions that reproduces it with the help of CCM:

$ ccm create --cassandra-version 1.2.1 --nodes 2 -p RandomPartitioner testRowCacheDC
$ ccm updateconf 'endpoint_snitch: PropertyFileSnitch'
$ ccm updateconf 'row_cache_size_in_mb: 200'
$ cp ~/Downloads/cassandra-topology.properties ~/.ccm/testRowCacheDC/node1/conf/ (please find the .properties file below)
$ cp ~/Downloads/cassandra-topology.properties ~/.ccm/testRowCacheDC/node2/conf/
$ ccm start
$ ccm cli - create keyspace and column family (please find schema below)
$ python populate_rowcache.py
$ ccm stop (I tried flush first, doesn't help)
$ ccm start
$ ccm cli
Connected to: testRowCacheDC on 127.0.0.1/9160
Welcome to Cassandra CLI version 1.2.1-SNAPSHOT
Type 'help;' or '?' for help.
Type 'quit;' or 'exit;' to quit.
[default@unknown] use testks;
Authenticated to keyspace: testks
[default@testks] get cf1 where 'indexedColumn'='userId_75';
0 Row Returned.
Elapsed time: 68 msec(s).

My Cassandra instances run with -Xms1927M -Xmx1927M -Xmn400M

Thanks for help.
Best regards,
Alexei

-- START cassandra-topology.properties --
127.0.0.1=DC1:RAC1
127.0.0.2=DC2:RAC1
default=DC1:r1
-- FINISH cassandra-topology.properties --

-- START cassandra-cli schema --
create keyspace testks
  with placement_strategy = 'NetworkTopologyStrategy'
  and strategy_options = {DC2 : 1, DC1 : 1}
  and durable_writes = true;

use testks;

create column family cf1
  with column_type = 'Standard'
  and comparator = 'org.apache.cassandra.db.marshal.AsciiType'
  and default_validation_class = 'UTF8Type'
  and key_validation_class = 'UTF8Type'
  and read_repair_chance = 1.0
  and dclocal_read_repair_chance = 0.0
  and gc_grace = 864000
  and min_compaction_threshold = 4
  and max_compaction_threshold = 32
  and replicate_on_write = true
  and compaction_strategy = 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'
  and caching = 'ALL'
  and column_metadata = [
    {column_name : 'indexedColumn', validation_class : UTF8Type, index_name : 'INDEX1', index_type : 0}]
  and compression_options = {'sstable_compression' : 'org.apache.cassandra.io.compress.SnappyCompressor'};
-- FINISH cassandra-cli schema --

-- START populate_rowcache.py --
from pycassa.batch import Mutator
import pycassa

pool = pycassa.ConnectionPool('testks', timeout=5)
cf = pycassa.ColumnFamily(pool, 'cf1')
for userId in xrange(0, 1000):
    print userId
    b = Mutator(pool, queue_size=200)
    for itemId in xrange(20):
        rowKey = 'userId_%s:itemId_%s' % (userId, itemId)
        for message_number in xrange(10):
            b.insert(cf, rowKey, {'indexedColumn': 'userId_%s' % userId,
                                  str(message_number): str(message_number)})
    b.send()
pool.dispose()
-- FINISH populate_rowcache.py --
Re: Start token sorts after end token
See https://issues.apache.org/jira/browse/CASSANDRA-5168 - should be fixed in 1.1.10 and 1.2.2.

On Jan 30, 2013, at 9:18 AM, Tejas Patil tejas.patil...@gmail.com wrote:

While reading data from Cassandra in map-reduce, I am getting InvalidRequestException(why:Start token sorts after end token). Below is the code snippet that I used and the entire stack trace. (I am using Cassandra 1.2.0 and Hadoop 0.20.2.) Can you point out the issue here?

Code snippet:

SlicePredicate predicate = new SlicePredicate();
SliceRange sliceRange = new SliceRange();
sliceRange.start = ByteBuffer.wrap("1".getBytes());
sliceRange.finish = ByteBuffer.wrap("100".getBytes());
sliceRange.reversed = false;
//predicate.slice_range = sliceRange;
List<ByteBuffer> colNames = new ArrayList<ByteBuffer>();
colNames.add(ByteBuffer.wrap("url".getBytes()));
colNames.add(ByteBuffer.wrap("Parent".getBytes()));
predicate.column_names = colNames;
ConfigHelper.setInputSlicePredicate(job.getConfiguration(), predicate);

Full stack trace:

java.lang.RuntimeException: InvalidRequestException(why:Start token sorts after end token)
	at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.maybeInit(ColumnFamilyRecordReader.java:384)
	at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.computeNext(ColumnFamilyRecordReader.java:390)
	at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.computeNext(ColumnFamilyRecordReader.java:313)
	at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
	at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
	at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.nextKeyValue(ColumnFamilyRecordReader.java:184)
	at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:456)
	at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
	at
org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:
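An aside on the exception itself, independent of the CASSANDRA-5168 bug: a token range handed to get_range_slices must not wrap the ring. The sketch below is not from this thread; it reflects my understanding that a RandomPartitioner token is the absolute value of the key's 128-bit MD5 digest read as a signed big-endian integer, and shows the sort check that produces the error.

```python
import hashlib

def token(key: bytes) -> int:
    # RandomPartitioner token: abs() of the MD5 digest interpreted as a
    # signed big-endian 128-bit integer, so tokens sort in [0, 2**127].
    return abs(int.from_bytes(hashlib.md5(key).digest(), 'big', signed=True))

def starts_after_end(start: int, end: int) -> bool:
    # get_range_slices rejects a range whose start token sorts after its
    # end token ("Start token sorts after end token") instead of treating
    # it as a wrap around the ring; callers must split wrapped ranges.
    return start > end

t_lo, t_hi = sorted([token(b'1'), token(b'100')])
assert not starts_after_end(t_lo, t_hi)   # valid range
assert starts_after_end(t_hi, t_lo)       # the rejected, "wrapped" range
```

In the bug above the wrapped range was produced by the Hadoop input format itself, which is why upgrading (rather than changing the predicate) is the fix.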
Re: too many warnings of Heap is full
> What is the cardinality like on these indexes? Can you provide the schema creation for these two column families?

This is the schema of the CFs:

create column family CF_users
  with comparator = UTF8Type
  and column_metadata = [
    {column_name: userSBCode, validation_class: UTF8Type, index_type: KEYS},
    {column_name: userEmail, validation_class: UTF8Type, index_type: KEYS},
    {column_name: userName, validation_class: UTF8Type},
    {column_name: userLastName, validation_class: UTF8Type},
    {column_name: userOwnPhoneKey, validation_class: UTF8Type, index_type: KEYS},
    {column_name: userOwnPhone, validation_class: UTF8Type, index_type: KEYS},
    {column_name: userPasswordMD5, validation_class: UTF8Type},
    {column_name: userDOB, validation_class: UTF8Type},
    {column_name: userGender, validation_class: UTF8Type},
    {column_name: userProfilePicMD5, validation_class: UTF8Type},
    {column_name: userAbout, validation_class: UTF8Type},
    {column_name: userLastSession, validation_class: UTF8Type},
    {column_name: userMasterKey, validation_class: UTF8Type}
  ];

create column family CF_SBMessages
  with comparator = UTF8Type
  and column_metadata = [
    {column_name: SBMessageId, validation_class: UTF8Type, index_type: KEYS},
    {column_name: fromSBCode, validation_class: UTF8Type, index_type: KEYS},
    {column_name: SBMessageDate, validation_class: UTF8Type, index_type: KEYS},
    {column_name: SBMessageType, validation_class: UTF8Type},
    {column_name: SBMessageText, validation_class: UTF8Type},
    {column_name: SBMessageAttachments, validation_class: UTF8Type}
  ];

I've read about the importance of keeping the cardinality of secondary indexes low (great article at http://pkghosh.wordpress.com/2011/03/02/cassandra-secondary-index-patterns/), and I'm afraid that we did completely the opposite (we considered the secondary indexes as alternate indexes). I guess there is some work to do here to create other CFs to store these secondary indexes.
Anyway, I still don't understand why these peaks appeared (by the way, last night there weren't any).
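For what it's worth, the "other CFs to store these secondary indexes" rework mentioned above usually follows the index-CF pattern from the linked article: a second column family whose row key is the indexed value and whose column names are the keys of the matching data rows. The sketch below uses plain dicts in place of the two column families and invented names (CF_users_by_email); with pycassa both writes would go through one Mutator batch.

```python
# Sketch of the "alternate index CF" pattern: instead of a built-in
# secondary index on a high-cardinality column, keep a second column
# family keyed by the indexed value, whose columns are the matching
# row keys. Dicts stand in for the two CFs here.
cf_users = {}           # data CF: row key -> columns
cf_users_by_email = {}  # index CF: indexed value -> {row key: ''}

def insert_user(row_key, columns):
    cf_users[row_key] = columns
    # Write the index entry alongside the data row (one batch in pycassa).
    cf_users_by_email.setdefault(columns['userEmail'], {})[row_key] = ''

def users_with_email(email):
    # One row read on the index CF, then point reads on the data CF.
    return [cf_users[k] for k in cf_users_by_email.get(email, {})]

insert_user('u1', {'userEmail': 'a@example.com', 'userName': 'Ann'})
insert_user('u2', {'userEmail': 'b@example.com', 'userName': 'Bob'})
assert [u['userName'] for u in users_with_email('a@example.com')] == ['Ann']
```

The trade-off: the application must keep the index CF in step with the data CF on every update and delete, which the built-in index did automatically.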
neither 'nodetool repair' nor 'hinted handoff/read repair' works for secondary indexes
Hi again,

Once you start playing with CCM it's hard to stop, such a great tool.

My issue with secondary indexes is the following: neither an explicit 'nodetool repair' nor implicit 'hinted handoffs/read repairs' resolve inconsistencies in the data I get from secondary indexes. I observe this for both one- and two-datacenter deployments, independent of caching settings. Rebuilding/dropping and creating the index or restarting nodes doesn't help.

In the following scenario I start up 2 nodes and insert some rows with CL.ONE. During this process I deliberately stop and start the nodes in order to trigger inconsistencies. I then query all data by its index with read CL.ONE and stop if I see that data is missing. I see that none of C*'s repair mechanisms work for secondary indexes.

$ ccm create --cassandra-version 1.2.1 --nodes 2 -p RandomPartitioner test2ndIndexRepair
$ ccm start
$ ccm node1 cli - create keyspace and column family (please find schemas attached)
$ python populate_repair.py (in first terminal)
$ ccm node1 stop; sleep 10; ccm node1 start (in second terminal, while populate_repair.py runs)
$ ccm node2 stop; sleep 10; ccm node2 start (in second terminal, while populate_repair.py runs. Hinted handoffs do the work, but unfortunately not on secondary indexes)
$ python fetcher_repair.py
254
255
256
Traceback (most recent call last):
  File "fetcher_repair.py", line 19, in <module>
    raise Exception('missing rows for userId %s, data length is %d' % (userId, len(data)))
Exception: missing rows for userId 256, data length is 0
$ ccm cli
[default@unknown] use testks;
Authenticated to keyspace: testks
[default@testks] get cf1 where 'indexedColumn'='userId_256';
0 Row Returned.
Elapsed time: 47 msec(s).
$ python fetcher_repair.py (running one more time in the hope that 'read repair' kicked in after the last query, but unfortunately no)
254
255
256
Traceback (most recent call last):
  File "fetcher_repair.py", line 19, in <module>
    raise Exception('missing rows for userId %s, data length is %d' % (userId, len(data)))
Exception: missing rows for userId 256, data length is 0
$ ccm node1 repair
$ ccm node2 repair
$ ccm cli
[default@unknown] use testks;
Authenticated to keyspace: testks
[default@testks] get cf1 where 'indexedColumn'='userId_256';
0 Row Returned.

Both Cassandra instances run with -Xms1927M -Xmx1927M -Xmn400M

Thanks for help.

Best regards,
Alexei

-- START cassandra-cli schemas --
create keyspace testks
  with placement_strategy = 'NetworkTopologyStrategy'
  and strategy_options = {datacenter1 : 2}
  and durable_writes = true;

use testks;

create column family cf1
  with column_type = 'Standard'
  and comparator = 'AsciiType'
  and default_validation_class = 'UTF8Type'
  and key_validation_class = 'UTF8Type'
  and read_repair_chance = 1.0
  and dclocal_read_repair_chance = 1.0
  and gc_grace = 864000
  and min_compaction_threshold = 4
  and max_compaction_threshold = 32
  and replicate_on_write = true
  and compaction_strategy = 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'
  and caching = 'KEYS_ONLY'
  and column_metadata = [
    {column_name : 'indexedColumn', validation_class : UTF8Type, index_name : 'INDEX1', index_type : 0}]
  and compression_options = {'sstable_compression' : 'org.apache.cassandra.io.compress.SnappyCompressor'};
-- FINISH cassandra-cli schemas --

-- START populate_repair.py --
import datetime
from pycassa.batch import Mutator
import pycassa

pool = pycassa.ConnectionPool('testks', timeout=5, server_list=['127.0.0.1:9160', '127.0.0.2:9160'])
cf = pycassa.ColumnFamily(pool, 'cf1')
for userId in xrange(0, 2000):
    print userId
    b = Mutator(pool, queue_size=200)
    for itemId in xrange(20):
        rowKey = 'userId_%s:itemId_%s' % (userId, itemId)
        for message_number in xrange(10):
            b.insert(cf, rowKey, {'indexedColumn': 'userId_%s' % userId,
                                  str(message_number): str(message_number)})
    b.send()
pool.dispose()
-- FINISH populate_repair.py --

-- START fetcher_repair.py --
import pycassa
from pycassa.columnfamily import ColumnFamily
from pycassa.pool import ConnectionPool
from pycassa.index import *

pool = pycassa.ConnectionPool('testks', server_list=['127.0.0.1:9160', '127.0.0.2:9160'])
cf = pycassa.ColumnFamily(pool, 'cf1')
for userId in xrange(2000):
    print userId
    index_expr = create_index_expression('indexedColumn', 'userId_%s' % userId)
    index_clause = create_index_clause([index_expr], count=1000)
    data = list(cf.get_indexed_slices(index_clause=index_clause))
    if len(data) != 20:
        raise Exception('missing rows for userId %s, data length is %d' % (userId, len(data)))
pool.dispose()
-- FINISH fetcher_repair.py --
Re: Inserting via thrift interface to column family created with Compound Key via cql3
What's the full error stack on the client? Are you using a pre-built Thrift client or your own? If the latter, try using a pre-built client first, like Hector or pycassa. If it works there, look into how that code works and go from there.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 31/01/2013, at 5:24 AM, Oleksandr Petrov oleksandr.pet...@gmail.com wrote:

BTW, thanks for chiming in!

No-no, I'm using a Thrift client, not inserting via CQL. I'm serializing via CompositeType, actually:

CompositeType.getInstance(UTF8Type, UTF8Type).decompose([firstkeypart, secondkeypart]);

Hm... From what you say I understand that it's technically possible :/ So I must be wrong somewhere,
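On the serialization itself, a sketch of the composite layout may help with debugging. As I understand the CompositeType wire format, each component is encoded as a 2-byte big-endian length, the component bytes, and a 0x00 end-of-component byte; a stdlib-only packer can be compared byte-for-byte against what decompose() produces:

```python
import struct

def pack_composite(*parts: bytes) -> bytes:
    # CompositeType wire format (my understanding, verify against your
    # driver): for each component, a 2-byte big-endian unsigned length,
    # the raw component bytes, then a 0x00 end-of-component byte.
    out = b''
    for p in parts:
        out += struct.pack('>H', len(p)) + p + b'\x00'
    return out

key = pack_composite(b'firstkeypart', b'secondkeypart')
# The first two bytes are the length of the first component.
assert key[:2] == struct.pack('>H', len(b'firstkeypart'))
```

If the bytes your client sends don't match this shape (or what decompose() emits), the server will reject the key or store it under an unexpected partition.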
Re: cluster issues
For DataStax Enterprise specific questions try the support forums http://www.datastax.com/support-forums/

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 31/01/2013, at 8:27 AM, S C as...@outlook.com wrote:

I am using DseDelegateSnitch

Thanks,
SC

From: aa...@thelastpickle.com
Subject: Re: cluster issues
Date: Tue, 29 Jan 2013 20:15:45 +1300
To: user@cassandra.apache.org

> • We can always be proactive in keeping the time in sync. But is there any way to recover from a time drift (in a reactive manner)? Since it was a lab environment, I dropped the KS (deleted the data directory).

There is a way to remove future-dated columns, but it is not for the faint hearted. Basically:

1) Drop gc_grace_seconds to 0.
2) Delete the column with a timestamp way in the future, so it is guaranteed to be higher than the value you want to delete.
3) Flush the CF.
4) Compact all the SSTables that contain the row. The easiest way to do that is a major compaction, but we normally advise against that because it creates one big file. You can also do a user-defined compaction.

> • Are there any other scenarios that would lead a cluster to look like below? Note: actual topology of the cluster - ONE Cassandra node and TWO Analytics nodes.

What snitch are you using? If you have the property file snitch, do all nodes have the same configuration?

There is a lot of sickness there. If possible I would scrub and start again.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 29/01/2013, at 6:29 AM, S C as...@outlook.com wrote:

One of our nodes in a 3 node cluster drifted by ~20-25 seconds. While I figured this out pretty quickly, I have a few questions that I am looking for answers to.

• We can always be proactive in keeping the time in sync. But is there any way to recover from a time drift (in a reactive manner)? Since it was a lab environment, I dropped the KS (deleted the data directory).
• Are there any other scenarios that would lead a cluster to look like below? Note: actual topology of the cluster - ONE Cassandra node and TWO Analytics nodes.

On 192.168.2.100:

Address        DC         Rack   Status  State   Load       Owns    Token
                                                                    113427455640312821154458202477256070485
192.168.2.100  Cassandra  rack1  Up      Normal  601.34 MB  33.33%  0
192.168.2.101  Analytics  rack1  Down    Normal  149.75 MB  33.33%  56713727820156410577229101238628035242
192.168.2.102  Analytics  rack1  Down    Normal  ?          33.33%  113427455640312821154458202477256070485

On 192.168.2.101:

Address        DC         Rack   Status  State   Load       Owns    Token
                                                                    113427455640312821154458202477256070485
192.168.2.100  Analytics  rack1  Down    Normal  ?          33.33%  0
192.168.2.101  Analytics  rack1  Up      Normal  158.59 MB  33.33%  56713727820156410577229101238628035242
192.168.2.102  Analytics  rack1  Down    Normal  ?          33.33%  113427455640312821154458202477256070485

On 192.168.2.102:

Address        DC         Rack   Status  State   Load       Owns    Token
                                                                    113427455640312821154458202477256070485
192.168.2.100  Analytics  rack1  Down    Normal  ?          33.33%  0
192.168.2.101  Analytics  rack1  Down    Normal  ?          33.33%  56713727820156410577229101238628035242
192.168.2.102  Analytics  rack1  Up      Normal  117.02 MB  33.33%  113427455640312821154458202477256070485

Appreciate your valuable inputs.

Thanks,
SC
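The future-dated delete in step 2 above works because of how Cassandra reconciles a column against a tombstone: whichever carries the higher timestamp wins, so a column written by a drifted clock survives a "normal" delete but not one issued with a deliberately higher timestamp. A minimal sketch of that rule (tuples stand in for column/tombstone; timestamps are arbitrary integers):

```python
# Reconciliation rule sketched: between a live column and a tombstone
# for the same cell, the one with the higher timestamp wins.
def reconcile(column, tombstone):
    # column/tombstone are (kind, timestamp) pairs.
    return max([column, tombstone], key=lambda c: c[1])

future_col = ('value', 9_000_000)         # written by the drifted clock
normal_delete = ('tombstone', 5_000)      # now() on a healthy node: loses
forced_delete = ('tombstone', 9_999_999)  # deliberately higher: wins

assert reconcile(future_col, normal_delete) == future_col
assert reconcile(future_col, forced_delete) == forced_delete
```

Steps 1, 3, and 4 (gc_grace_seconds = 0, flush, compact) then ensure the winning tombstone is purged rather than lingering with its far-future timestamp.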
Re: CASSANDRA-5152
Can you update the ticket with your experiences? https://issues.apache.org/jira/browse/CASSANDRA-5152

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 31/01/2013, at 11:13 AM, yen-fen_...@mcafee.com wrote:

I had the same problem with 1.2.0. The problem went away after readline was easy-installed.

Regards,
Yen-Fen Hsu
Re: why set replica placement strategy at keyspace level ?
Many of my mental models bother people :)

This particular one came from my understanding of BigTable and the code. For me this works: I think of (internal) rows as roughly containing the CFs. In the CQL world it works for me as well; the partition key (the first part of the primary key) is important and identifies the storage container that has the columns.

Your mileage may vary.

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 31/01/2013, at 4:43 PM, Edward Capriolo edlinuxg...@gmail.com wrote:

That should not bother you. For example, if you're doing an HBase scan that crosses two column families, that can end up being two (disk) seeks. Having an API that hides the seeks from you does not give you better performance; it only helps you when you're debating with people that do not understand the fundamentals.
Re: Cassandra pending compaction tasks keeps increasing
> Will that cause the symptom of no data streamed from other nodes? Other nodes still think the node had all the data?

AFAIK they will not make assumptions like that.

> Can I just change it in yaml and restart C* and it will correct itself?

It's a schema config change, check the help for the CLI or the CQL docs.

> Any side effect? Since we are using SSD, a bit bigger SSTable won't slow down the read too much; I suppose that is the main concern for a bigger SSTable size?

Do some experiments to see how it works, and let others know :)

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 31/01/2013, at 5:30 PM, Wei Zhu wz1...@yahoo.com wrote:

Some updates:

Since we still have not fully turned on the system, we did something crazy today. We tried to treat the node as a dead one. (My boss wants us to practice replacing a dead node before going to full production.) And bootstrap it. Here is what we did:

• drain the node
• check nodetool on other nodes, and this node is marked down (the token for this node is 100)
• clear the data, commit log, saved cache
• change initial_token from 100 to 99 in the yaml file
• start the node
• check nodetool, the down node of 100 disappeared by itself (!!) and the new node with token 99 showed up
• checked the log, saw the message saying bootstrap completed. But only a couple of MB streamed.
• nodetool movetoken 98
• nodetool, see the node with token 98 come up
• check the log, saw the message saying bootstrap completed. But still only a couple of MB streamed.

The only reason I can think of is that the new node has the same IP as the dead node we tried to replace? Will that cause the symptom of no data streamed from other nodes? Other nodes still think the node had all the data?

We had to do nodetool repair -pr to bring in the data. After 3 hours, 150G transferred. And no surprise, pending compaction tasks are now at 30K.
There are about 30K SSTables transferred, and I guess all of them need to be compacted since we use LCS.

My concern is that, even if we did nothing wrong, replacing a dead node causes such a huge backlog of pending compactions. It might take a week to clear that off. And with RF = 3, we still need to bring in the data for the other two replicas since we use -pr for nodetool repair. It will take about 3 weeks to fully replace a 200G node using LCS?

We tried everything we could to speed up the compaction, with no luck. The only thing I can think of is to increase the default size of the SSTable, so fewer compactions will be needed. Can I just change it in yaml and restart C* and it will correct itself? Any side effect? Since we are using SSD, a bit bigger SSTable won't slow down the read too much; I suppose that is the main concern for a bigger SSTable size?

I think 1.2 comes with parallel LC, which should help the situation. But we are not going to upgrade for a little while.

Did I miss anything? Might it not be practical to use LCS for a 200G node? But if we use sized compaction, we need to have at least 400G for the HD... Although SSD is cheap now, it's still hard to convince management: three replicas + double the disk for compaction? That is 6 times the real data size!

Sorry for the long email. Any suggestion or advice?

Thanks.
-Wei

From: aaron morton aa...@thelastpickle.com
To: Cassandra User user@cassandra.apache.org
Sent: Tuesday, January 29, 2013 12:59:42 PM
Subject: Re: Cassandra pending compaction tasks keeps increasing

> * Will try it tomorrow. Do I need to restart server to change the log level?

You can set it via JMX, and supposedly log4j is configured to watch the config file.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 29/01/2013, at 9:36 PM, Wei Zhu wz1...@yahoo.com wrote:

Thanks for the reply. Here is some information:

> Do you have wide rows ?
> Are you seeing logging about "Compacting wide rows"?

* I don't see any log about wide rows.

> Are you seeing GC activity logged or seeing CPU steal on a VM ?

* There is some GC, but CPU in general is under 20%. We have a heap size of 8G; RAM is at 72G.

> Have you tried disabling multithreaded_compaction ?

* By default, it's disabled. We enabled it, but don't see much difference. Even a little slower with it enabled. Is it bad to enable it? We have SSD; according to the comment in yaml, it should help when using SSD.

> Are you using Key Caches ? Have you tried disabling compaction_preheat_key_cache?

* We have a fairly big key cache, set at 10% of heap, which is 800M. Yes, compaction_preheat_key_cache is disabled.

> Can you enable DEBUG level logging and make them available ?

* Will try it tomorrow. Do I need to restart the server to change the log
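For the "bigger SSTable" idea in this thread: with LCS the target size lives in the schema's compaction_strategy_options, not in cassandra.yaml, so it is applied with a schema update. Something along these lines should work in cassandra-cli (syntax from memory for the 1.x CLI, and 256 MB is only an example value; check `help update column family;` first):

```
update column family cf1
  with compaction_strategy_options = {sstable_size_in_mb : 256};
```

Note that existing SSTables are not rewritten immediately; only new compactions produce files at the new size.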
Re: CPU hotspot at BloomFilterSerializer#deserialize
> 5. the problematic Data file contains only 5 to 10 keys' data but is large (2.4G)

So very large rows? What does nodetool cfstats or cfhistograms say about the row sizes?

> 1. what is happening?

I think this is partially large rows and partially the query pattern. This is only roughly correct http://thelastpickle.com/2011/07/04/Cassandra-Query-Plans/ and my talk here http://www.datastax.com/events/cassandrasummit2012/presentations

> 3. any more info required to proceed?

Do some tests with different query techniques…
Get a single named column.
Get the first 10 columns using the natural column order.
Get the last 10 columns using the reversed order.

Hope that helps.

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 31/01/2013, at 7:20 PM, Takenori Sato ts...@cloudian.com wrote:

Hi all,

We have a situation where the CPU load on some of our nodes in a cluster has spiked occasionally since last November, triggered by requests for rows that reside on two specific sstables.

We confirmed the following (when spiked):

version: 1.0.7 (current) - 0.8.6 - 0.8.5 - 0.7.8
jdk: Oracle 1.6.0

1. a profiling showed that BloomFilterSerializer#deserialize was the hotspot (70% of the total load by running threads)

* the stack trace looked like this (simplified):

90.4% - org.apache.cassandra.db.ReadVerbHandler.doVerb
  90.4% - org.apache.cassandra.db.SliceByNamesReadCommand.getRow
  ...
  90.4% - org.apache.cassandra.db.CollationController.collectTimeOrderedData
  ...
  89.5% - org.apache.cassandra.db.columniterator.SSTableNamesIterator.read
  ...
  79.9% - org.apache.cassandra.io.sstable.IndexHelper.defreezeBloomFilter
    68.9% - org.apache.cassandra.io.sstable.BloomFilterSerializer.deserialize
      66.7% - java.io.DataInputStream.readLong

2. Usually, 1 should be so fast that a profiling by sampling can not detect it
3. no pressure on Cassandra's VM heap nor on the machine overall
4. a little I/O traffic for our 8 disks/node (up to 100tps/disk by iostat 1 1000)
5. the problematic Data file contains only 5 to 10 keys' data but is large (2.4G)
6. the problematic Filter file size is only 256B (could be normal)

So now, I am trying to read the Filter file in the same way BloomFilterSerializer#deserialize does, as far as possible, in order to see if the file is somehow wrong.

Could you give me some advice on:

1. what is happening?
2. the best way to simulate BloomFilterSerializer#deserialize
3. any more info required to proceed?

Thanks,
Takenori
Re: initial_token
Do not set initial_token when using Murmur3Partitioner; instead, set num_tokens. For example, say you have 3 hosts with the same hardware setup: then for each one set the same num_tokens. But now consider adding another, better host; this time I'd suggest you set the previous num_tokens * 2:

num_tokens: 128 (worse machines)
num_tokens: 256 (twice better machine)

This is the setup of virtual nodes. Check the current DataStax docs for it.

On Thu, Jan 31, 2013 at 8:43 PM, Edward Capriolo edlinuxg...@gmail.com wrote:

This is the bad side of changing defaults. There are going to be a few groups of unfortunates.

The first group, who cannot set up their cluster, and eventually figure out their tokens. (this thread)
The second group, who assume their tokens were correct and run around with an unbalanced cluster thinking the performance sucks. (the threads for the next few months)
The third group, who will google "how to balance my ring" and find a page with random partitioner instructions. (the occasional thread for the next N years)
The fourth group, because as of now map reduce is highly confused by this.

On Thu, Jan 31, 2013 at 4:52 PM, Rob Coli rc...@palominodb.com wrote:

On Thu, Jan 31, 2013 at 12:17 PM, Edward Capriolo edlinuxg...@gmail.com wrote:
> Now by default a new partitioner is chosen Murmer3.

"Now" = as of 1.2, to be unambiguous.

=Rob
--
=Robert Coli
AIM&GTALK - rc...@palominodb.com
YAHOO - rcoli.palominob
SKYPE - rcoli_palominodb
Re: Understanding Virtual Nodes on Cassandra 1.2
> Are there tickets/documents explaining how data is replicated on Virtual Nodes?

This http://www.datastax.com/dev/blog/virtual-nodes-in-cassandra-1-2
Check the CHANGES.txt file, it links to the tickets.

Not many people use BOP so you may be exploring new'ish territory. Try asking someone on the IRC channel.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 31/01/2013, at 11:47 PM, Manu Zhang owenzhang1...@gmail.com wrote:

On Thu 31 Jan 2013 03:43:32 AM CST, Zhong Li wrote:

Are there tickets/documents explaining how data is replicated on Virtual Nodes? If there are multiple tokens on one physical host, is there a chance that two or more tokens chosen by the replication strategy are located on the same host? If I move/remove/add a token manually, does the Cassandra engine validate the case?

Thanks.

On Jan 30, 2013, at 12:46 PM, Zhong Li wrote:

> You add a physical node and that in turn adds num_token tokens to the ring.

No, I am talking about Virtual Nodes with an order preserving partitioner, for an existing host with multiple tokens set as a list on cassandra.initial_token. After initial bootstrapping, the host will not be aware of changes to cassandra.initial_token. If I want to add a new token (virtual node), I have to rebuild the host with the new token list. My question is whether there is a way to add a virtual node without rebuilding the host.

Thanks,

On Jan 30, 2013, at 10:21 AM, Manu Zhang wrote:

On Wed 30 Jan 2013 02:29:27 AM CST, Zhong Li wrote:

One more question: can I add a virtual node manually without rebooting and rebuilding a host's data? I checked the nodetool command; there is no option to add a node.

Thanks.
Zhong

On Jan 29, 2013, at 11:09 AM, Zhong Li wrote:

I misunderstood this http://www.datastax.com/dev/blog/virtual-nodes-in-cassandra-1-2 , especially "If you want to get started with vnodes on a fresh cluster, however, that is fairly straightforward. Just don't set the initial_token parameter in your conf/cassandra.yaml and instead enable the num_tokens parameter. A good default value for this is 256."

Also, I couldn't find documentation about setting multiple tokens for cassandra.initial_token. Anyway, I just tested: it does work to set a comma separated list of tokens.

Thanks,
Zhong

On Jan 29, 2013, at 3:06 AM, aaron morton wrote:

> After I searched some documents on the Datastax website and some old tickets, it seems that it works for the random partitioner only, and leaves the order preserving partitioner out of luck.

Links?

> or allow adding Virtual Nodes manually?

I've not looked into it, but there is a cassandra.initial_token startup param that takes a comma separated list of tokens for the node. There also appears to be support for the ordered partitioners to generate random tokens. But you would still have the problem of having to balance your row keys around the token space.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 29/01/2013, at 10:31 AM, Zhong Li z...@voxeo.com wrote:

Hi All,

Virtual Nodes is a great feature. After I searched some documents on the Datastax website and some old tickets, it seems that it works for the random partitioner only, and leaves the order preserving partitioner out of luck. I may misunderstand, please correct me. If it doesn't love the order preserving partitioner, would it be possible to add support for multiple initial_token(s) for the order preserving partitioner, or allow adding Virtual Nodes manually?

Thanks,
Zhong

> You add a physical node and that in turn adds num_token tokens to the ring.

no, those tokens will be skipped
Re: JDBC : CreateresultSet fails with null column in CqlResultSet
I think http://code.google.com/a/apache-extras.org/p/cassandra-jdbc/issues/list is the place to raise the issue.

Can you update the mail thread with the ticket as well?

Thanks

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 1/02/2013, at 3:25 AM, Andy Cobley acob...@computing.dundee.ac.uk wrote:

As you may be aware, I've been trying to track down a problem using JDBC 1.1.2 with Cassandra 1.2.0 where I was getting a null pointer exception in the result set. I've done some digging into the JDBC driver and found the following. In CassandraResultSet.java the new result set is instantiated in:

CassandraResultSet(Statement statement, CqlResult resultSet, String keyspace)

I decided to trace the result set with the following code:

rowsIterator = resultSet.getRowsIterator();
System.out.println("---");
while (rowsIterator.hasNext()) {
    CqlRow row = rowsIterator.next();
    curRowKey = row.getKey();
    System.out.println("Row Key " + curRowKey);
    List<Column> cols = row.getColumns();
    Iterator<Column> iterator = cols.iterator();
    while (iterator.hasNext()) {
        Column col = (Column) iterator.next();
        String Name = new String(col.getName());
        String Value = new String(col.getValue());
        System.out.println("Col " + Name + " : " + Value);
    }
}

This produced the following output:

---
Row Key [B@617e53c9
Col key : jsmith
Col  :
Col password : ch@ngem3a
Row Key [B@2caee320
Col key : jbrown
Col  :
Col gender : male
---

As you can see there is a blank column at position 2 in each of the rows. As this resultset has come from the Cassandra Thrift client (I believe), the problem may lie there. There is no blank column defined by my CQL create statements, I believe.

If I'm correct here, should I raise a ticket with JDBC or Cassandra? (For now I've patched my local JDBC driver so it doesn't create a TypedColumn if the result set produces a null column.)

Andy

The University of Dundee is a Scottish Registered Charity, No. SC015096.
Re: JDBC : CreateresultSet fails with null column in CqlResultSet
Aaron,

The ticket is at http://code.google.com/a/apache-extras.org/p/cassandra-jdbc/issues/detail?id=61

Andy

On 1 Feb 2013, at 18:01, aaron morton aa...@thelastpickle.com wrote:

> I think http://code.google.com/a/apache-extras.org/p/cassandra-jdbc/issues/list is the place to raise the issue.
>
> Can you update the mail thread with the ticket as well?
>
> Thanks
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com

The University of Dundee is a Scottish Registered Charity, No. SC015096.
conditional update or insert
Hi All, On each row I have a column which maintains a timestamp, like lastUpdated. While inserting such a row I want to make sure the row is only updated if its lastUpdated is older than the one I am inserting. One way to do this is to read the record first, check the timestamp, and update only if the new one is later. But since I have a high volume of reads and writes, this additional read will add to the load. Any alternative to achieve this? Thanks, Jay
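One alternative that avoids the read: Cassandra already reconciles conflicting writes to the same column by write timestamp (last write wins), and clients can generally supply that timestamp explicitly (pycassa's ColumnFamily.insert, for instance, takes a timestamp argument). If you write with lastUpdated as the write timestamp, a stale insert simply loses at read/compaction time, with no read-before-write. A pure-Python sketch of the reconciliation rule (values and timestamps are made up):

```python
def reconcile(existing, incoming):
    """Return the winning (value, timestamp) pair, as Cassandra would.

    Ties go to the incoming write here; Cassandra itself breaks
    timestamp ties by comparing the values.
    """
    if incoming[1] >= existing[1]:
        return incoming
    return existing

stored = ("session_a", 1000)       # (value, write timestamp)
stale = ("session_b", 900)         # an out-of-date lastUpdated
fresh = ("session_c", 1100)

stored = reconcile(stored, stale)  # the stale write is discarded
assert stored == ("session_a", 1000)
stored = reconcile(stored, fresh)  # the newer write replaces it
assert stored == ("session_c", 1100)
```

The caveat is that "older update loses" is then enforced per column, not per row, so all columns written together should carry the same timestamp.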
Re: initial_token
You do not want to just jump to vnodes without being sure. Some queries are not optimized for vnodes and issue 128 slices to solve some secondary index queries. On Fri, Feb 1, 2013 at 12:55 PM, Víctor Hugo Oliveira Molinar vhmoli...@gmail.com wrote: Do not set initial_token when using Murmur3Partitioner; instead, set num_tokens. For example, if you have 3 hosts with the same hardware setup, then for each one set the same num_tokens. But now consider adding another, better host; this time I'd suggest you set previous num_tokens * 2. num_tokens: 128 (weaker machines), num_tokens: 256 (twice as capable a machine). This is the setup of virtual nodes. Check the current DataStax docs for it. On Thu, Jan 31, 2013 at 8:43 PM, Edward Capriolo edlinuxg...@gmail.com wrote: This is the bad side of changing defaults. There are going to be a few groups of unfortunates. The first group, who cannot set up their cluster, and eventually figure out their tokens (this thread). The second group, who assume their tokens were correct and run around with an unbalanced cluster thinking the performance sucks (the threads for the next few months). The third group, who will google "how to balance my ring" and find a page with RandomPartitioner instructions (the occasional thread for the next N years). The fourth group, because as of now map reduce is highly confused by this. On Thu, Jan 31, 2013 at 4:52 PM, Rob Coli rc...@palominodb.com wrote: On Thu, Jan 31, 2013 at 12:17 PM, Edward Capriolo edlinuxg...@gmail.com wrote: Now by default a new partitioner is chosen, Murmur3. Now = as of 1.2, to be unambiguous. =Rob -- =Robert Coli AIMGTALK - rc...@palominodb.com YAHOO - rcoli.palominob SKYPE - rcoli_palominodb
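Víctor's sizing rule works because, with vnodes, a node's expected share of the ring is proportional to its num_tokens relative to the cluster total. A minimal sketch of the arithmetic (node names are hypothetical):

```python
# Expected data ownership is num_tokens / total tokens in the cluster.
num_tokens = {"node_a": 128, "node_b": 128, "node_c": 256}  # node_c: ~2x hardware
total = sum(num_tokens.values())
ownership = {node: t / float(total) for node, t in num_tokens.items()}

assert ownership["node_c"] == 0.5                 # the beefy node owns half the data
assert ownership["node_a"] == ownership["node_b"] == 0.25
```

The actual ownership fluctuates around these expectations because token positions are random, but with hundreds of tokens per node the variance is small.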
Re: Cassandra pending compaction tasks keeps increasing
Did the node list itself as a seed node in cassandra.yaml? Unless something has changed, a node that considers itself a seed will not auto bootstrap. Although I haven't tried it, I think running 'nodetool rebuild' will cause it to stream in the data it needs without doing a repair. On Wed, Jan 30, 2013 at 9:30 PM, Wei Zhu wz1...@yahoo.com wrote: Some updates: Since we still have not fully turned on the system, we did something crazy today. We tried to treat the node as a dead one (my boss wants us to practice replacing a dead node before going to full production) and bootstrap it. Here is what we did: - drain the node - check nodetool on other nodes, and this node is marked down (the token for this node is 100) - clear the data, commit log, saved caches - change initial_token from 100 to 99 in the yaml file - start the node - check nodetool, the down node with token 100 disappeared by itself (!!) and a new node with token 99 showed up - checked the log, see the message saying bootstrap completed, but only a couple of MB streamed - nodetool movetoken 98 - nodetool, see the node with token 98 come up - check the log, see the message saying bootstrap completed, but still only a couple of MB streamed. The only reason I can think of is that the new node has the same IP as the dead node we tried to replace? Will that cause the symptom of no data being streamed from other nodes? Other nodes still think the node had all the data? We had to do nodetool repair -pr to bring in the data. After 3 hours, 150G transferred. And no surprise, pending compaction tasks are now at 30K. There are about 30K SSTables transferred and I guess all of them need to be compacted since we use LCS. My concern is that if we did nothing wrong, replacing a dead node will cause such a huge backlog of pending compactions. It might take a week to clear that off. And with RF = 3, we still need to bring in the data for the other two replicas since we use -pr for nodetool repair. 
It will take about 3 weeks to fully replace a 200G node using LCS? We tried everything we could to speed up compaction, with no luck. The only thing I can think of is to increase the default size of the SSTables, so fewer compactions will be needed. Can I just change it in the yaml and restart C* and it will correct itself? Any side effects? Since we are using SSDs, a somewhat bigger SSTable won't slow down reads too much; I suppose that is the main concern with a bigger SSTable size? I think 1.2 comes with parallel leveled compaction, which should help the situation, but we are not going to upgrade for a little while. Did I miss anything? It might not be practical to use LCS for a 200G node? But if we use sized compaction, we need to have at least 400G for the HD... Although SSD is cheap now, it's still hard to convince the management: three replicas + double the disk for compaction? That is 6 times the real data size! Sorry for the long email. Any suggestion or advice? Thanks. -Wei -- *From: *aaron morton aa...@thelastpickle.com *To: *Cassandra User user@cassandra.apache.org *Sent: *Tuesday, January 29, 2013 12:59:42 PM *Subject: *Re: Cassandra pending compaction tasks keeps increasing * Will try it tomorrow. Do I need to restart the server to change the log level? You can set it via JMX, and supposedly log4j is configured to watch the config file. Cheers - Aaron Morton Freelance Cassandra Developer New Zealand @aaronmorton http://www.thelastpickle.com On 29/01/2013, at 9:36 PM, Wei Zhu wz1...@yahoo.com wrote: Thanks for the reply. Here is some information: Do you have wide rows? Are you seeing logging about Compacting wide rows? * I don't see any log about wide rows Are you seeing GC activity logged or seeing CPU steal on a VM? * There is some GC, but CPU in general is under 20%. We have a heap size of 8G; RAM is 72G. Have you tried disabling multithreaded_compaction? * By default it's disabled. We enabled it, but didn't see much difference; it was even a little slower with it enabled. 
Is it bad to enable it? We have SSDs; according to the comment in the yaml, it should help when using SSDs. Are you using key caches? Have you tried disabling compaction_preheat_key_cache? * We have fairly big key caches; we set them to 10% of the heap, which is 800M. Yes, compaction_preheat_key_cache is disabled. Can you enable DEBUG level logging and make the logs available? * Will try it tomorrow. Do I need to restart the server to change the log level? -Wei -- From: aaron morton aa...@thelastpickle.com To: user@cassandra.apache.org Sent: Monday, January 28, 2013 11:31:42 PM Subject: Re: Cassandra pending compaction tasks keeps increasing * Why does nodetool repair increase the data size that much? It's not likely that much data needs to be repaired. Will that happen for all subsequent repairs? Repair only
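On Wei's idea of raising the SSTable size: the arithmetic behind it is simple. Under LCS the sstable count is roughly data size divided by sstable_size_in_mb, so going from the 5 MB default to, say, 128 MB (a value picked purely for illustration) shrinks a 200G node's sstable count by a factor of ~25. Note that, as far as I know, the new size only applies to sstables written after the change; existing ones shrink in number only as they get compacted.

```python
# Rough sstable counts for a 200 GB node under LCS at two sstable sizes.
data_mb = 200 * 1024              # 200 GB of data
default_size_mb = 5               # LCS sstable_size_in_mb default
raised_size_mb = 128              # hypothetical raised value

sstables_default = data_mb // default_size_mb
sstables_raised = data_mb // raised_size_mb

assert sstables_default == 40960  # ~41k sstables to track and compact
assert sstables_raised == 1600    # ~1.6k with the bigger size
```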
Not enough replicas???
I need to offer my profound thanks to this community which has been so helpful in trying to figure this system out. I've set up a simple ring with two nodes and I'm trying to insert data to them. I get failures 100% of the time with this error: me.prettyprint.hector.api.exceptions.HUnavailableException: : May not be enough replicas present to handle consistency level. I'm not doing anything fancy - this is just from setting up the cluster following the basic instructions from datastax for a simple one data center cluster. My config is basically the default except for the changes they discuss (except that I have configured for my IP addresses... my two boxes are .126 and .127):

cluster_name: 'MyDemoCluster'
num_tokens: 256
seed_provider:
  - class_name: org.apache.cassandra.locator.SimpleSeedProvider
    parameters:
      - seeds: 10.28.205.126
listen_address: 10.28.205.126
rpc_address: 0.0.0.0
endpoint_snitch: RackInferringSnitch

Nodetool shows both nodes active in the ring, status = up, state = normal. For the CF:

ColumnFamily: SystemEvent
  Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
  Default column value validator: org.apache.cassandra.db.marshal.UTF8Type
  Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
  GC grace seconds: 864000
  Compaction min/max thresholds: 4/32
  Read repair chance: 0.1
  DC Local Read repair chance: 0.0
  Replicate on write: true
  Caching: KEYS_ONLY
  Bloom Filter FP chance: default
  Built indexes: [SystemEvent.IdxName]
  Column Metadata:
    Column Name: eventTimeStamp
    Validation Class: org.apache.cassandra.db.marshal.DateType
    Column Name: name
    Validation Class: org.apache.cassandra.db.marshal.UTF8Type
    Index Name: IdxName
    Index Type: KEYS
  Compaction Strategy: org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy
  Compression Options:
    sstable_compression: org.apache.cassandra.io.compress.SnappyCompressor

Any ideas?
Re: Not enough replicas???
Please include the information on how your keyspace was created. This may indicate you set the replication factor to 3, when you only have 1 node, or some similar condition. On Fri, Feb 1, 2013 at 4:57 PM, stephen.m.thomp...@wellsfargo.com wrote: I need to offer my profound thanks to this community which has been so helpful in trying to figure this system out. I’ve setup a simple ring with two nodes and I’m trying to insert data to them. I get failures 100% with this error: me.prettyprint.hector.api.exceptions.HUnavailableException: : May not be enough replicas present to handle consistency level. I’m not doing anything fancy – this is just from setting up the cluster following the basic instructions from datastax for a simple one data center cluster. My config is basically the default except for the changes they discuss (except that I have configured for my IP addresses… my two boxes are .126 and .127) cluster_name: 'MyDemoCluster' num_tokens: 256 seed_provider: - class_name: org.apache.cassandra.locator.SimpleSeedProvider parameters: - seeds: 10.28.205.126 listen_address: 10.28.205.126 rpc_address: 0.0.0.0 endpoint_snitch: RackInferringSnitch Nodetool shows both nodes active in the ring, status = up, state = normal. 
For the CF: ColumnFamily: SystemEvent Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type Default column value validator: org.apache.cassandra.db.marshal.UTF8Type Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type GC grace seconds: 864000 Compaction min/max thresholds: 4/32 Read repair chance: 0.1 DC Local Read repair chance: 0.0 Replicate on write: true Caching: KEYS_ONLY Bloom Filter FP chance: default Built indexes: [SystemEvent.IdxName] Column Metadata: Column Name: eventTimeStamp Validation Class: org.apache.cassandra.db.marshal.DateType Column Name: name Validation Class: org.apache.cassandra.db.marshal.UTF8Type Index Name: IdxName Index Type: KEYS Compaction Strategy: org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy Compression Options: sstable_compression: org.apache.cassandra.io.compress.SnappyCompressor Any ideas?
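One thing worth checking along with the keyspace definition, since the config uses endpoint_snitch: RackInferringSnitch: that snitch derives topology from the IP address itself, taking the second octet as the datacenter and the third as the rack. If the keyspace was created with NetworkTopologyStrategy naming a datacenter like 'DC1', no node matches it, no replicas can be placed, and you get exactly this HUnavailableException. A sketch of the inference rule:

```python
def infer(ip):
    """RackInferringSnitch: second octet = datacenter, third octet = rack."""
    octets = ip.split(".")
    return {"dc": octets[1], "rack": octets[2]}

# Both of the poster's nodes land in datacenter "28", rack "205":
assert infer("10.28.205.126") == {"dc": "28", "rack": "205"}
assert infer("10.28.205.127") == {"dc": "28", "rack": "205"}
# So a replication setting like {DC1: 2} would match no nodes at all;
# it would have to say {"28": 2} (or the snitch would have to change).
```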
Re: Cassandra pending compaction tasks keeps increasing
That must be it. Yes, it happens to be the seed. I should have tried rebuild. Instead I did repair and now I am sitting here waiting for the compaction to finish... Thanks. -Wei From: Derek Williams de...@fyrie.net To: user@cassandra.apache.org; Wei Zhu wz1...@yahoo.com Sent: Friday, February 1, 2013 1:56 PM Subject: Re: Cassandra pending compaction tasks keeps increasing Did the node list itself as a seed node in cassandra.yaml? Unless something has changed, a node that considers itself a seed will not auto bootstrap. Although I haven't tried it, I think running 'nodetool rebuild' will cause it to stream in the data it needs without doing a repair. On Wed, Jan 30, 2013 at 9:30 PM, Wei Zhu wz1...@yahoo.com wrote: Some updates: Since we still have not fully turned on the system, we did something crazy today. We tried to treat the node as a dead one (my boss wants us to practice replacing a dead node before going to full production) and bootstrap it. Here is what we did: * drain the node * check nodetool on other nodes, and this node is marked down (the token for this node is 100) * clear the data, commit log, saved caches * change initial_token from 100 to 99 in the yaml file * start the node * check nodetool, the down node with token 100 disappeared by itself (!!) and a new node with token 99 showed up * checked the log, see the message saying bootstrap completed, but only a couple of MB streamed * nodetool movetoken 98 * nodetool, see the node with token 98 come up * check the log, see the message saying bootstrap completed, but still only a couple of MB streamed. The only reason I can think of is that the new node has the same IP as the dead node we tried to replace? Will that cause the symptom of no data being streamed from other nodes? Other nodes still think the node had all the data? We had to do nodetool repair -pr to bring in the data. After 3 hours, 150G transferred. And no surprise, pending compaction tasks are now at 30K. 
There are about 30K SSTables transferred and I guess all of them need to be compacted since we use LCS. My concern is that if we did nothing wrong, replacing a dead node will cause such a huge backlog of pending compactions. It might take a week to clear that off. And with RF = 3, we still need to bring in the data for the other two replicas since we use -pr for nodetool repair. It will take about 3 weeks to fully replace a 200G node using LCS? We tried everything we could to speed up compaction, with no luck. The only thing I can think of is to increase the default size of the SSTables, so fewer compactions will be needed. Can I just change it in the yaml and restart C* and it will correct itself? Any side effects? Since we are using SSDs, a somewhat bigger SSTable won't slow down reads too much; I suppose that is the main concern with a bigger SSTable size? I think 1.2 comes with parallel leveled compaction, which should help the situation, but we are not going to upgrade for a little while. Did I miss anything? It might not be practical to use LCS for a 200G node? But if we use sized compaction, we need to have at least 400G for the HD... Although SSD is cheap now, it's still hard to convince the management: three replicas + double the disk for compaction? That is 6 times the real data size! Sorry for the long email. Any suggestion or advice? Thanks. -Wei From: aaron morton aa...@thelastpickle.com To: Cassandra User user@cassandra.apache.org Sent: Tuesday, January 29, 2013 12:59:42 PM Subject: Re: Cassandra pending compaction tasks keeps increasing * Will try it tomorrow. Do I need to restart the server to change the log level? You can set it via JMX, and supposedly log4j is configured to watch the config file. Cheers - Aaron Morton Freelance Cassandra Developer New Zealand @aaronmorton http://www.thelastpickle.com On 29/01/2013, at 9:36 PM, Wei Zhu wz1...@yahoo.com wrote: Thanks for the reply. Here is some information: Do you have wide rows?
Are you seeing logging about Compacting wide rows? * I don't see any log about wide rows Are you seeing GC activity logged or seeing CPU steal on a VM? * There is some GC, but CPU in general is under 20%. We have a heap size of 8G; RAM is 72G. Have you tried disabling multithreaded_compaction? * By default it's disabled. We enabled it, but didn't see much difference; it was even a little slower with it enabled. Is it bad to enable it? We have SSDs; according to the comment in the yaml, it should help when using SSDs. Are you using key caches? Have you tried disabling compaction_preheat_key_cache? * We have fairly big key caches; we set them to 10% of the heap, which is 800M. Yes, compaction_preheat_key_cache is disabled. Can you enable DEBUG level logging and make the logs available? * Will try it tomorrow. Do I need to restart server to change the
Re: Cassandra behavior on single node
You are likely hitting the point where compaction is running all the time and consuming all of the weak cloud IO. EBS is not recommended for performance; you should use the ephemeral drives. On Friday, February 1, 2013, Marcelo Elias Del Valle wrote: Hello, I am trying to figure out why the following behavior happened. Any help would be highly appreciated. This graph shows the server resource allocation of my single Cassandra machine (running at Amazon EC2): http://mvalle.com/downloads/cassandra_host1.png I ran a hadoop process that reads a CSV file and writes data to Cassandra. For about 1 h, the process ran fine, but taking about 100% of CPU. After 1 h, my hadoop process started to have its connection attempts refused by Cassandra, as shown below. Since then, it has been taking 100% of the machine's IO. It has been 2 h already since IO hit 100% on the machine running Cassandra. I am running Cassandra on Amazon EBS, which is slow, but I didn't think it would be that slow. Just wondering, is it normal for Cassandra to use a high amount of CPU? I am guessing all the writes were going to the memtables and when it was time to flush, the server went down. Makes sense? I am still learning Cassandra as it's the first time I use it in production, so I am not sure if I am missing something really basic here. 
2013-02-01 16:44:43,741 ERROR com.s1mbi0se.dmp.input.service.InputService (Thread-18): EXCEPTION:PoolTimeoutException: [host=(10.84.65.108):9160, latency=5005(5005), attempts=1] Timed out waiting for connection com.netflix.astyanax.connectionpool.exceptions.PoolTimeoutException: PoolTimeoutException: [host=nosql1.s1mbi0se.com.br(10.84.65.108):9160, latency=5005(5005), attempts=1] Timed out waiting for connection at com.netflix.astyanax.connectionpool.impl.SimpleHostConnectionPool.waitForConnection(SimpleHostConnectionPool.java:201) at com.netflix.astyanax.connectionpool.impl.SimpleHostConnectionPool.borrowConnection(SimpleHostConnectionPool.java:158) at com.netflix.astyanax.connectionpool.impl.RoundRobinExecuteWithFailover.borrowConnection(RoundRobinExecuteWithFailover.java:60) at com.netflix.astyanax.connectionpool.impl.AbstractExecuteWithFailoverImpl.tryOperation(AbstractExecuteWithFailoverImpl.java:50) at com.netflix.astyanax.connectionpool.impl.AbstractHostPartitionConnectionPool.executeWithFailover(AbstractHostPartitionConnectionPool.java:229) at com.netflix.astyanax.thrift.ThriftColumnFamilyQueryImpl$1.execute(ThriftColumnFamilyQueryImpl.java:186) at com.s1mbi0se.dmp.input.service.InputService.searchUserByKey(InputService.java:700) ... at com.s1mbi0se.dmp.importer.map.ImporterMapper.map(ImporterMapper.java:20) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) at org.apache.hadoop.mapreduce.lib.map.MultithreadedMapper$MapRunner.run(MultithreadedMapper.java:268) 2013-02-01 16:44:43,743 ERROR com.s1mbi0se.dmp.input.service.InputService (Thread-15): EXCEPTION:PoolTimeoutException: Best regards, -- Marcelo Elias Del Valle http://mvalle.com - @mvallebr
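Independently of fixing the saturated EBS volume, the hadoop job can be made more tolerant of these pool timeouts with client-side retries and exponential backoff, so a transient IO spike doesn't fail the whole mapper. A minimal sketch (the flaky function is a hypothetical stand-in for the astyanax call):

```python
import random
import time

def with_backoff(fetch, attempts=5, base=0.05):
    """Retry fetch() on timeout with jittered exponential backoff."""
    for attempt in range(attempts):
        try:
            return fetch()
        except TimeoutError:
            if attempt == attempts - 1:
                raise                      # out of retries, propagate
            # back off 1x, 2x, 4x, ... the base delay, plus jitter
            time.sleep(base * (2 ** attempt) + random.random() * base)

calls = {"n": 0}
def flaky():
    """Fails with a timeout twice, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError
    return "row"

assert with_backoff(flaky) == "row"
assert calls["n"] == 3  # two timeouts absorbed by the retry loop
```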
Re: CQL binary protocol
The spec for the protocol is here https://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=doc/native_protocol.spec;hb=refs/heads/cassandra-1.2 Cheers - Aaron Morton Freelance Cassandra Developer New Zealand @aaronmorton http://www.thelastpickle.com On 1/02/2013, at 6:42 AM, Gabriel Ciuloaica gciuloa...@gmail.com wrote: Hi, You may take a look at the java-driver project. It has an implementation of a connection pool. Cheers, Gabi On 1/31/13 6:48 PM, Vivek Mishra wrote: Hi, Is any connection pool API available for the cassandra transport client (org.apache.cassandra.transport.Client)? -Vivek
Re: rangeQuery to traverse keys backward?
There is no facility to do a get_range in reverse. Rows are ordered by their token, and with the Random or Murmur3 partitioner this means they are effectively randomly ordered. So there is not much meaning in going backwards, or in getting the 10 rows on either side of a particular row. Can you change your data model to not require precise range scans? Cheers - Aaron Morton Freelance Cassandra Developer New Zealand @aaronmorton http://www.thelastpickle.com On 1/02/2013, at 1:36 PM, Yuhan Zhang yzh...@onescreen.com wrote: Hi all, I'm trying to use get_range to traverse the rows page by page by providing a :start_key and a :finish_key. This works fine when I traverse forward with :start_key=last_key, :finish_key= However, when I tried to traverse backward with :start_key=, :finish_key=first_key, this always gave me the first few rows in the column family. (My goal is to get the rows adjacent to my first_key.) It looks like it always gives priority to :start_key over :finish_key. As for column ranges, there is an option to reverse the order, but there is no such option for traversing rows. So I'm wondering whether cassandra is capable of doing this task with the current API. I tried both the Twitter cassandra client and the Hector client, but couldn't find a way to do it. Has someone been able to do this? Thank you Yuhan The information contained in this e-mail is for the exclusive use of the intended recipient(s) and may be confidential, proprietary, and/or legally privileged. Inadvertent disclosure of this message does not constitute a waiver of any privilege. If you receive this message in error, please do not directly or indirectly print, copy, retransmit, disseminate, or otherwise use the information. In addition, please delete this e-mail and all copies and notify the sender.
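To make Aaron's point concrete: under RandomPartitioner, get_range walks rows in md5-token order, which bears no relation to key order, so "the previous page" only exists if the client remembers where each page started. A sketch with hypothetical keys:

```python
import hashlib

def token(key):
    """RandomPartitioner orders rows by the md5 hash of the key."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

keys = ["user_%d" % i for i in range(8)]
ring = sorted(keys, key=token)   # the order get_range walks the rows

# Forward paging: start each page at the last key of the previous one.
page1, page2 = ring[:4], ring[3:7]
assert page1[-1] == page2[0]

# "Backward" paging: the client must remember where each page started
# and re-issue a forward scan from the remembered start key.
history = [page1[0], page2[0]]
previous_page_start = history[-2]
assert previous_page_start == page1[0]
```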