Re: Cassandra API Library.
On 08/23/2012 01:40 PM, Thomas Spengler wrote: 4) pelops (Thrift, Java) I've been using Pelops for quite some time with pretty good results; it felt much cleaner than Hector. Paolo -- @bernarpa http://paolobernardi.wordpress.com
Re: Secondary index partially created
If you are still having problems can you post the query and the output from nodetool cfstats on one of the nodes that fails? cfstats will tell us if the secondary index was built. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com

On 25/08/2012, at 6:02 AM, Roshni Rajagopal roshni.rajago...@wal-mart.com wrote: What does "list my_column_family" in the CLI show on all the nodes? Perhaps the syntax you're using isn't correct? You should be getting the same data on all the nodes irrespective of which node's CLI you use. The replication factor is for redundancy, to have copies of the data on different nodes to help if nodes go down. Even if you had a replication factor of 1 you should still get the same data on all nodes.

On 24/08/12 11:05 PM, Richard Crowley r...@rcrowley.org wrote: On Thu, Aug 23, 2012 at 6:54 PM, Richard Crowley r...@rcrowley.org wrote: I have a three-node cluster running Cassandra 1.0.10. In this cluster is a keyspace with RF=3. I *updated* a column family via Astyanax to add a column definition with an index on that column. Then I ran a backfill to populate the column in every row. Then I tried to query the index from Java and it failed, but so did cassandra-cli:

get my_column_family where my_column = 'my_value';

Two out of the three nodes are unable to query the new index and throw this error: InvalidRequestException(why:No indexed columns present in index clause with operator EQ) The third is able to query the new index happily but doesn't find any results, even when I expect it to. This morning the one node that's able to query the index is also able to produce the expected results. I'm a dummy and didn't use science, so I don't know if the `nodetool compact` I ran across the cluster had anything to do with it. Regardless, it did not change the situation in any other way.
`describe cluster;` in cassandra-cli confirms that all three nodes have the same schema, and `show schema;` confirms that schema includes the new column definition and its index. The my_column_family.my_index-hd-* files only exist on that one node that can query the index. I ran `nodetool repair` on each node and waited for `nodetool compactionstats` to report zero pending tasks. Ditto for `nodetool compact`. The nodes that failed still fail. The node that succeeded still succeeds. Can anyone shed some light? How do I convince it to let me query the index from any node? How do I get it to find results? Thanks, Richard
Re: Commit log periodic sync?
Brutally. kill -9. That's fine. I was thinking about reboot -f -n. We are wondering if the fsync of the commit log was working. I would say yes, only because there are no other reported problems; in that case I would not expect to see data loss. If you are still in a test scenario, can you try to reproduce the problem? If possible, can you reproduce it with a single node? Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com

On 25/08/2012, at 11:00 AM, rubbish me rubbish...@googlemail.com wrote: Thanks, Aaron, for your reply - please see inline. On 24 Aug 2012, at 11:04, aaron morton wrote: - we are running on production Linux VMs (not ideal but this is out of our hands) Is the VM doing anything wacky with the IO? Could be. But I thought we would ask here first. This is a bit difficult to prove because we don't have control over these VMs. As part of a DR exercise, we killed all 6 nodes in DC1. Nice disaster. Out of interest, what was the shutdown process? Brutally. kill -9. We noticed that data that was written an hour before the exercise, around when the last memtables were flushed, was not found in DC1. To confirm, data was written to DC1 at CL LOCAL_QUORUM before the DR exercise. Was the missing data written before or after the memtable flush? I'm trying to understand if the data should have been in the commit log or the memtables. The missing data was written after the last flush. This data was retrievable before the DR exercise. Can you provide some more info on how you are detecting it is not found in DC1? We tried Hector, consistencylevel=local quorum. We had a missing column or the whole row. We tried cassandra-cli on DC1 nodes, same. However, once we ran the same query on DC2, C* must have then done a read-repair: that particular piece of result data would appear in DC1 again. If we understand correctly, commit logs are written first and then synced to disk every 10s.
Writes are put into a bounded queue and processed as fast as the IO can keep up. Every 10s a sync message is added to the queue. Note that the commit log segment may rotate at any time, which requires a sync. A loss of data across all nodes in a DC seems odd. If you can provide some more information we may be able to help. We are wondering if the fsync of the commit log was working. But we saw no errors / warnings in the logs. Wondering if there is a way to verify. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com

On 24/08/2012, at 6:01 AM, rubbish me rubbish...@googlemail.com wrote: Hi all. First off, let's introduce the setup:
- 6 x C* 1.1.2 in the active DC (DC1), another 6 in another (DC2)
- keyspace's RF=3 in each DC
- Hector as client
- client talks only to DC1 unless DC1 can't serve the request, in which case it talks only to DC2
- commit log was periodically synced with the default setting of 10s
- consistency policy = LOCAL_QUORUM for both read and write
- we are running on production Linux VMs (not ideal but this is out of our hands)

As part of a DR exercise, we killed all 6 nodes in DC1; Hector started talking to DC2, all the data was still there, and everything continued to work perfectly. Then we brought all the nodes in DC1 up, one by one. We saw a message saying all the commit logs were replayed. No errors reported. We didn't run repair at this time. We noticed that data that was written an hour before the exercise, around when the last memtables were flushed, was not found in DC1. If we understand correctly, commit logs are written first and then synced to disk every 10s. At worst we should have lost the last 10s of data. What could be the cause of this behaviour? With the blessing of C* we could recover all this data from DC2. But we would like to understand why. Many thanks in advance. Amy
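Aaron's description of the commit-log path (mutations enter a bounded queue, a sync message is enqueued every 10s, and a crash loses whatever came after the last sync) can be modelled with a toy sketch. Everything below is a hypothetical illustration of the idea, not Cassandra's actual classes:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Toy model of periodic commit-log sync: mutations enter a bounded queue,
// a SYNC marker is enqueued on a timer, and the writer thread only "fsyncs"
// when it dequeues the marker. Names are made up -- this is not Cassandra.
public class PeriodicSyncLog {
    private static final String SYNC = "\u0000SYNC";
    private final BlockingQueue<String> queue = new ArrayBlockingQueue<>(1024);
    private final List<String> appended = new ArrayList<>(); // stands in for the log file
    private int synced = 0; // entries durable as of the last fsync

    public void write(String mutation) {
        queue.offer(mutation); // bounded: real writers block when IO falls behind
    }

    public void requestSync() {
        queue.offer(SYNC); // in the real system this happens every commitlog_sync_period
    }

    // Drain whatever is queued, marking data durable only at SYNC markers.
    public void drain() {
        String item;
        while ((item = queue.poll()) != null) {
            if (item.equals(SYNC)) {
                synced = appended.size(); // simulate fsync
            } else {
                appended.add(item);
            }
        }
    }

    public int durableCount() { return synced; }

    public static void main(String[] args) {
        PeriodicSyncLog log = new PeriodicSyncLog();
        log.write("row1");
        log.write("row2");
        log.requestSync();
        log.write("row3"); // written after the last sync marker
        log.drain();
        // A kill -9 at this point would lose row3 but keep row1 and row2.
        System.out.println(log.durableCount()); // prints 2
    }
}
```

With periodic sync, at most one sync interval of acknowledged writes is at risk per node, which is why a whole hour of missing data points at something other than the 10s window.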
Re: Cassandra 1.1.4 RPM required
On Aug 23, 2012, at 12:15 PM, Adeel Akbar wrote: Dear Aaron, It requires a username and password, which I don't have. Can you share a direct link? There is no username and password for the DataStax rpm repository: http://rpm.datastax.com/community/ But there is no 1.1.4 version yet from DataStax. If you really need a 1.1.4 rpm, you can give my build a shot. I just started rolling my own packages for various reasons. Until my public rpm repo goes online, you can grab the Cassandra rpm here: http://people.ogilvy.de/~mschirrmeister/linux/cassandra/ If you want, test it out. It's just a first build and not heavily tested. Marco
Re: optimizing use of sstableloader / SSTableSimpleUnsortedWriter
After thinking about how sstables are done on disk, it seems best (required??) to write out each row at once. Sort of. We only want one instance of the row per SSTable created. Any other tips to improve load time or reduce the load on the cluster or subsequent compaction activity? Fewer SSTables mean less compaction. So go as high as you can on the bufferSizeInMB param for the SSTableSimpleUnsortedWriter. There is also an SSTableSimpleWriter. Because it expects rows to be ordered it does not buffer and can create bigger sstables. https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/io/sstable/SSTableSimpleWriter.java Right now my Cassandra data store has about 4 months of data and we have 5 years of historical. Ingest all the histories! Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com

On 25/08/2012, at 12:56 PM, Aaron Turner synfina...@gmail.com wrote: So I've read: http://www.datastax.com/dev/blog/bulk-loading Are there any tips for using sstableloader / SSTableSimpleUnsortedWriter to migrate time series data from our old datastore (PostgreSQL) to Cassandra? After thinking about how sstables are done on disk, it seems best (required??) to write out each row at once. I.e.: if each row == 1 year's worth of data and you have say 30,000 rows, write one full row at a time (a full year's worth of data points for a given metric) rather than 1 data point for each of 30,000 rows. Any other tips to improve load time or reduce the load on the cluster or subsequent compaction activity? All the CFs I'll be writing to use compression and leveled compaction. Right now my Cassandra data store has about 4 months of data and we have 5 years of historical (not sure yet how much we'll actually load, but minimally 1 year's worth). Thanks!
-- Aaron Turner http://synfin.net/ Twitter: @synfinatic http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix & Windows "Those who would give up essential Liberty, to purchase a little temporary Safety, deserve neither Liberty nor Safety." -- Benjamin Franklin carpe diem quam minimum credula postero
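Aaron's point that fewer SSTables mean less compaction, and that a larger bufferSizeInMB helps, can be illustrated with a toy model of the unsorted writer's buffering. The class and counting below are made up for illustration (cells instead of megabytes); the real API is org.apache.cassandra.io.sstable.SSTableSimpleUnsortedWriter:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Toy model of the unsorted bulk writer: columns accumulate in an in-memory
// sorted buffer until it hits a size threshold, then the buffer is flushed
// as one "sstable" (one instance of each row per sstable). A bigger buffer
// means fewer sstables, hence less compaction work later.
public class UnsortedWriterModel {
    private final int bufferLimit; // stands in for bufferSizeInMB
    private final TreeMap<String, List<String>> buffer = new TreeMap<>();
    private final List<SortedMap<String, List<String>>> sstables = new ArrayList<>();
    private int bufferedCells = 0;

    public UnsortedWriterModel(int bufferLimit) { this.bufferLimit = bufferLimit; }

    public void addColumn(String rowKey, String column) {
        buffer.computeIfAbsent(rowKey, k -> new ArrayList<>()).add(column);
        bufferedCells++;
        if (bufferedCells >= bufferLimit) flush();
    }

    private void flush() {
        if (buffer.isEmpty()) return;
        sstables.add(new TreeMap<>(buffer)); // rows emitted sorted, once per sstable
        buffer.clear();
        bufferedCells = 0;
    }

    // Flush the remainder and report how many sstables were produced.
    public int close() { flush(); return sstables.size(); }

    public static void main(String[] args) {
        // Same 12 cells across 4 rows: a small buffer yields 4 sstables, a big one 1.
        UnsortedWriterModel small = new UnsortedWriterModel(3);
        UnsortedWriterModel big = new UnsortedWriterModel(100);
        for (int i = 0; i < 12; i++) {
            small.addColumn("row" + (i % 4), "c" + i);
            big.addColumn("row" + (i % 4), "c" + i);
        }
        System.out.println(small.close() + " vs " + big.close()); // prints "4 vs 1"
    }
}
```

This is also why the same row can land in several sstables when the buffer is small: each flush emits its own copy of any row it has seen, and compaction later merges them.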
Re: unsubscribe
http://wiki.apache.org/cassandra/FAQ#unsubscribe - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 26/08/2012, at 4:12 AM, Shen szs...@gmail.com wrote:
Re: Expanding cluster to include a new DR datacenter
I did a quick test on a clean 1.1.4 and it worked Can you check the logs for errors ? Can you see your schema change in there ? Also what is the output from show schema; in the cli ? Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 25/08/2012, at 6:53 PM, Bryce Godfrey bryce.godf...@azaleos.com wrote: Yes [default@unknown] describe cluster; Cluster Information: Snitch: org.apache.cassandra.locator.PropertyFileSnitch Partitioner: org.apache.cassandra.dht.RandomPartitioner Schema versions: 9511e292-f1b6-3f78-b781-4c90aeb6b0f6: [10.20.8.4, 10.20.8.5, 10.20.8.1, 10.20.8.2, 10.20.8.3] From: Mohit Anchlia [mailto:mohitanch...@gmail.com] Sent: Friday, August 24, 2012 1:55 PM To: user@cassandra.apache.org Subject: Re: Expanding cluster to include a new DR datacenter That's interesting can you do describe cluster? On Fri, Aug 24, 2012 at 12:11 PM, Bryce Godfrey bryce.godf...@azaleos.com wrote: So I’m at the point of updating the keyspaces from Simple to NetworkTopology and I’m not sure if the changes are being accepted using Cassandra-cli. I issue the change: [default@EBonding] update keyspace EBonding ... with placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy' ... and strategy_options={Fisher:2}; 9511e292-f1b6-3f78-b781-4c90aeb6b0f6 Waiting for schema agreement... ... schemas agree across the cluster Then I do a describe and it still shows the old strategy. Is there something else that I need to do? I’ve exited and restarted Cassandra-cli and it still shows the SimpleStrategy for that keyspace. Other nodes show the same information. 
[default@EBonding] describe EBonding; Keyspace: EBonding: Replication Strategy: org.apache.cassandra.locator.SimpleStrategy Durable Writes: true Options: [replication_factor:2] From: Bryce Godfrey [mailto:bryce.godf...@azaleos.com] Sent: Thursday, August 23, 2012 11:06 AM To: user@cassandra.apache.org Subject: RE: Expanding cluster to include a new DR datacenter Thanks for the information! Answers my questions. From: Tyler Hobbs [mailto:ty...@datastax.com] Sent: Wednesday, August 22, 2012 7:10 PM To: user@cassandra.apache.org Subject: Re: Expanding cluster to include a new DR datacenter If you didn't see this particular section, you may find it useful: http://www.datastax.com/docs/1.1/operations/cluster_management#adding-a-data-center-to-a-cluster Some comments inline: On Wed, Aug 22, 2012 at 3:43 PM, Bryce Godfrey bryce.godf...@azaleos.com wrote: We are in the process of building out a new DR system in another data center, and we want to mirror our Cassandra environment to that DR. I have a couple questions on the best way to do this after reading the documentation on the DataStax website. We didn't initially plan for this to be a DR setup when first deployed a while ago due to budgeting, but now we need to. So I'm just trying to nail down the order of doing this as well as any potential issues. For the nodes, we don't plan on querying the servers in this DR until we fail over to this data center. We are going to have 5 similar nodes in the DR; should I join them into the ring at token+1? Join them at token+10 just to leave a little space. Make sure you're using LOCAL_QUORUM for your queries instead of regular QUORUM. All keyspaces are set to the replication strategy of SimpleStrategy. Can I change the replication strategy after joining the new nodes in the DR to NetworkTopologyStrategy with the updated replication factor for each DC? Switch your keyspaces over to NetworkTopologyStrategy before adding the new nodes.
For the strategy options, just list the first DC until the second is up (e.g. {main_dc: 3}). Lastly, is changing the snitch from the default of SimpleSnitch to RackInferringSnitch going to cause any issues? Since it's in the cassandra.yaml file, I assume a rolling restart to pick up the value would be OK? This is the first thing you'll want to do. Unless your node IPs would naturally put all nodes in a DC in the same rack, I recommend using PropertyFileSnitch, explicitly using the same rack. (I tend to prefer PFSnitch regardless; it's harder to accidentally mess up.) A rolling restart is required to pick up the change. Make sure to fill out cassandra-topology.properties first if using PFSnitch. This is all on Cassandra 1.1.4. Thanks for any help! -- Tyler Hobbs DataStax
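Tyler's two-step advice (switch to NetworkTopologyStrategy listing only the existing DC, then add the second DC once it has joined) might look like this in cassandra-cli, using the EBonding keyspace and Fisher DC name that appear later in this thread plus a hypothetical DR DC name:

```
update keyspace EBonding
  with placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy'
  and strategy_options = {Fisher : 2};

update keyspace EBonding
  with strategy_options = {Fisher : 2, DR_DC : 2};
```

Run the first statement before joining the DR nodes and the second after they have joined, then repair so the new DC receives its replicas.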
Re: QUORUM writes, QUORUM reads -- and eventual consistency
Doesn't this mean that the read does not reflect the most recent write? Yes. A write that fails is not a write. If it were to have read the newer data from the 1 node and then afterwards read the old data from the other 2, then there is a consistency problem, but in the example you give the second reader seems to still have a consistent view. In the scenario of a TimedOutException for a write, that is entirely possible. The write is not considered to be successful at the CL requested, so R + W > N does not hold for that datum. When in doubt, ask Werner… When R + W > N we have strong consistency… "Strong consistency. After the update completes, any subsequent access (by A, B, or C) will return the updated value." When R + W <= N we have weak / eventual consistency… "*Eventual consistency. This is a specific form of weak consistency; the storage system guarantees that if no new updates are made to the object, eventually *all* accesses will return the last updated value." http://queue.acm.org/detail.cfm?id=1466448 (emphasis added) In C* this may mean HH or RR or repair or standard CL checks kicking in to make the second read return the correct consistent value. Isn't it cheaper to retry the mutation on _any exception_ than to have a transaction in place for the majority of non-failing writes? Yes (with the counter exception). If you get an UnavailableException it's from the point of view of the coordinator. It may be the case that the coordinator is isolated and all the other nodes are UP and happy. Hope that helps. - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 26/08/2012, at 5:03 AM, Guillermo Winkler gwink...@inconcertcc.com wrote: Isn't it cheaper to retry the mutation on _any exception_ than to have a transaction in place for the majority of non-failing writes?
The special case to be considered is obviously counters which are not idempotent https://issues.apache.org/jira/browse/CASSANDRA-2495 On Sat, Aug 25, 2012 at 4:38 AM, Russell Haering russellhaer...@gmail.com wrote: The issue is that it is possible for a quorum write to return an error, but for the result of the write to still be reflected in the view seen by the client. There is really no performant way around this (although reading at ALL can make it much less frequent). Guaranteeing complete success or failure would (barring a creative solution I'm unaware of) require a transactional commit of some sort across the replica nodes for the key being written to. The performance tradeoff might be desirable under some circumstances, but if this is a requirement you should probably look at other databases. Some good rules to play by (someone correct me if these aren't 100% true): 1. For writes to a single key, an UnavailableException means the write failed totally (clients will never see the data you wrote) 2. For writes to a single key, a TimedOutException means you cannot know whether the write succeeded or failed 3. For writes to multiple keys, either an UnavailableException or a TimedOutException means you cannot know whether the write succeeded or failed. -Russell On Sat, Aug 25, 2012 at 12:17 AM, Guillermo Winkler gwink...@inconcertcc.com wrote: Hi Philip, From http://wiki.apache.org/cassandra/ArchitectureOverview Quorum write: blocks until quorum is reached By my understanding if you _did_ a quorum write it means it successfully completed. Guille I *think* we're saying the same thing here. The addition of the word successful (or something more suitable) would make the documentation more precise, not less.
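The R + W > N arithmetic Aaron quotes from Vogels is easy to sanity-check: read and write replica sets must overlap in at least one node for a read to be guaranteed to see the latest acknowledged write. A minimal sketch (the class name is mine):

```java
// Vogels' condition: with N replicas, W write acks and R read responses,
// the read set is guaranteed to intersect the write set exactly when
// R + W > N. Otherwise consistency is only eventual.
public class ConsistencyCheck {
    public static boolean isStrong(int n, int w, int r) {
        return r + w > n;
    }

    public static void main(String[] args) {
        int n = 3;
        int quorum = n / 2 + 1;                          // QUORUM = 2 for RF=3
        System.out.println(isStrong(n, quorum, quorum)); // true: 2 + 2 > 3
        System.out.println(isStrong(n, 1, 1));           // false: ONE/ONE is eventual
        System.out.println(isStrong(n, n, 1));           // true: write ALL, read ONE
    }
}
```

The thread's failure cases fit the same frame: a timed-out write may have reached fewer than W replicas, so R + W > N no longer holds for that datum until hinted handoff, read repair, or anti-entropy repair restores it.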
Re: Decreasing the number of nodes in the ring
Removetoken should only be used when removing a dead node from a cluster; it's a much slower and more expensive operation, since it triggers a repair so that the remaining nodes can figure out which data they should now have. Decommission, on the other hand, is much simpler: the node being decommissioned streams the data it has to the nodes that should have it, and then removes itself. I don't know exactly what your load is like, but I think the best way to accomplish it is like this: You have nodes: 1 2 3 4 5 6 7 8 9. Add SSD nodes: S1 1 2 3 S2 4 5 6 S3 7 8 9. Decommission 1, 4, 7. Check if you can remove more nodes. Decommission 2, 5, 8. Check if you can remove more nodes. Decommission 3, 6, 9. And when you've stopped, make sure your ring is balanced by using nodetool move. It's probably a bad idea to run with a lopsided cluster where some servers are much faster than the others. If you have a replication factor of 3, that means that half of your data will be on two slow and one fast machine (so quorum will be slow) and the other half will be on two fast and one slow machine (so quorum will be fast). This leads to the somewhat unintuitive conclusion that you can make the cluster go faster by removing nodes. But it's your data and your cluster, so you need to measure and benchmark and figure out what's best for you and your app. /Henrik On Mon, Aug 27, 2012 at 4:22 AM, Mohit Anchlia mohitanch...@gmail.com wrote: Use nodetool decommission and nodetool removetoken. On Sun, Aug 26, 2012 at 5:31 PM, Senthilvel Rangaswamy senthil...@gmail.com wrote: We have a cluster of 9 nodes in the ring. We would like to move to SSD-backed boxes, but we may not need 9 nodes in that case. What is the best way to downscale the cluster to 6 or 3 nodes? -- ..Senthil "If there's anything more important than my ego around, I want it caught and shot now." - Douglas Adams
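Henrik's last step, balancing the ring with nodetool move, needs target tokens. For RandomPartitioner the ring spans 0..2^127, so a balanced layout puts node i of N at i * 2^127 / N. The helper below (a hypothetical utility, not a Cassandra tool) computes them:

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Balanced initial tokens for RandomPartitioner: the token space is
// 0..2^127, so node i of N gets token i * 2^127 / N. Feed these to
// `nodetool move` after shrinking the cluster.
public class BalancedTokens {
    public static List<BigInteger> tokens(int nodeCount) {
        BigInteger ringSize = BigInteger.valueOf(2).pow(127);
        List<BigInteger> result = new ArrayList<>();
        for (int i = 0; i < nodeCount; i++) {
            result.add(ringSize.multiply(BigInteger.valueOf(i))
                               .divide(BigInteger.valueOf(nodeCount)));
        }
        return result;
    }

    public static void main(String[] args) {
        // Tokens for the 3-node SSD cluster left after decommissioning.
        for (BigInteger t : tokens(3)) {
            System.out.println(t);
        }
    }
}
```

For the Murmur3Partitioner introduced later the range differs (-2^63..2^63-1), so the formula would need adjusting; the version above matches the RandomPartitioner clusters discussed in this thread.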
Understanding Cassandra + MapReduce + Composite Columns
Hi! I'm trying to use Hadoop's MapReduce on top of a Cassandra environment, and I've run into some issues while using composite columns. I'm currently using Cassandra 1.1.2 (I wouldn't mind having to update it) and Hadoop 1.0.3 (I'd rather keep this version). What I would like to do is send slices divided by the first key of the composite key, and do some processing, taking into account the rest of the elements of the composite key (as well as other columns). I've built a sandbox keyspace with some column families in order to test this:

CREATE TABLE test_1 ( field1 text, field2 text, field3 text, field4 text, PRIMARY KEY (field1) );

CREATE TABLE test_2 ( field1 text, field2 text, field3 text, field4 text, PRIMARY KEY (field1, field2) );

The job configuration (the relevant elements for Cassandra) is as follows:

// Cassandra config
ConfigHelper.setInputRpcPort(conf, "9160");
ConfigHelper.setInputInitialAddress(conf, "localhost");
ConfigHelper.setInputPartitioner(conf, "ByteOrderedPartitioner");
ConfigHelper.setInputColumnFamily(conf, KEYSPACE, INPUT_COLUMN_FAMILY);
SlicePredicate predicate = new SlicePredicate();
predicate.setSlice_range(new SliceRange().setStart(ByteBufferUtil.EMPTY_BYTE_BUFFER).setFinish(ByteBufferUtil.EMPTY_BYTE_BUFFER).setCount(5));
ConfigHelper.setInputSlicePredicate(conf, predicate);

My dummy map tries only to log the different keys and values received:

map(ByteBuffer key, SortedMap<ByteBuffer, IColumn> columns, OutputCollector<Text, IntWritable> output, Reporter reporter) { ... }

With CF test_1 everything seems to work fine. With CF test_2, I only receive field1's value inside the ByteBuffer key. The rest of the composite key seems to be encoded into each key of the SortedMap together with the particular key of that column (field3, field4, ...), but I don't know exactly how to extract it (I'm a bit new with ByteBuffers, so any help there will be welcome :)).
Is there any way to specify the schema of this particular CF at the MR level, in order to be able to extract the secondary key? Thanks!
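On the question of pulling the remaining composite components out of each SortedMap key: CompositeType lays each component out as a 2-byte big-endian length, the component's raw bytes, and a single end-of-component byte. The decoder below hand-parses that layout as an illustration; in practice you would normally go through Cassandra's CompositeType / AbstractCompositeType API rather than parsing by hand, and the class here is hypothetical:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Decodes the components of a composite column name. Wire layout per
// component: unsigned 16-bit big-endian length, the bytes themselves,
// then one end-of-component byte.
public class CompositeDecoder {
    public static List<String> decode(ByteBuffer name) {
        ByteBuffer buf = name.duplicate(); // don't disturb the caller's position
        List<String> components = new ArrayList<>();
        while (buf.remaining() > 0) {
            int length = buf.getShort() & 0xFFFF; // unsigned 16-bit length
            byte[] value = new byte[length];
            buf.get(value);
            buf.get();                            // skip the end-of-component byte
            components.add(new String(value, StandardCharsets.UTF_8));
        }
        return components;
    }

    // Encoder in the same layout, used here only to build example input.
    public static ByteBuffer encode(String... parts) {
        int size = 0;
        for (String p : parts) size += 2 + p.getBytes(StandardCharsets.UTF_8).length + 1;
        ByteBuffer buf = ByteBuffer.allocate(size);
        for (String p : parts) {
            byte[] b = p.getBytes(StandardCharsets.UTF_8);
            buf.putShort((short) b.length).put(b).put((byte) 0);
        }
        buf.flip();
        return buf;
    }

    public static void main(String[] args) {
        // For test_2, each column name carries the field2 value plus the
        // CQL column name (e.g. "field3").
        ByteBuffer columnName = encode("field2value", "field3");
        System.out.println(decode(columnName)); // prints [field2value, field3]
    }
}
```

This assumes all components are UTF-8 text, as in the test_2 schema above; for other types you would decode each component's bytes with the corresponding AbstractType instead.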
Re: Secondary index partially created
On Mon, Aug 27, 2012 at 12:59 AM, aaron morton aa...@thelastpickle.com wrote: If you are still having problems can you post the query and the output from nodetool cfstats on one of the nodes that fails ? driftx got me sorted. It escaped me that a rolling restart was necessary to build secondary indexes, which was masked by one node deciding to build its portion without a restart. Thanks, Richard
unsubscribe
This e-mail and files transmitted with it are confidential, and are intended solely for the use of the individual or entity to whom this e-mail is addressed. If you are not the intended recipient, or the employee or agent responsible to deliver it to the intended recipient, you are hereby notified that any dissemination, distribution or copying of this communication is strictly prohibited. If you are not one of the named recipient(s) or otherwise have reason to believe that you received this message in error, please immediately notify sender by e-mail, and destroy the original message. Thank You.
Re: unsubscribe
On Mon, Aug 27, 2012 at 9:50 AM, Nikolaidis, Christos cnikolai...@epsilon.com wrote: This e-mail and files transmitted with it are confidential, and are intended solely for the use of the individual or entity to whom this e-mail is addressed. If you are not the intended recipient, or the employee or agent responsible to deliver it to the intended recipient, you are hereby notified that any dissemination, distribution or copying of this communication is strictly prohibited. If you are not one of the named recipient(s) or otherwise have reason to believe that you received this message in error, please immediately notify sender by e-mail, and destroy the original message. Thank You. Since I am not in a position to unsubscribe anyone, I can only assume that I have received this message in error. As per the frightening legalese quoted above, I hereby notify you by email, and will now proceed to destroy the original message. Please don't sue me. Love, -- Eric Evans Acunu | http://www.acunu.com | @acunu
Re: unsubscribe
On Aug 27, 2012, at 4:11 PM, Eric Evans eev...@acunu.com wrote: Since I am not in a position to unsubscribe anyone, I can only assume that I have received this message in error. As per the frightening legalese quoted above, I hereby notify you by email, and will now proceed to destroy the original message. I think you have just engaged in dissemination, distribution or copying of this communication. Better blow up your computer and make a run for it. André
RE: unsubscribe
No worries :-) I was replying to the list so whoever manages it can unsubscribe me. -Original Message- From: Eric Evans [mailto:eev...@acunu.com] Sent: Monday, August 27, 2012 11:12 AM To: user@cassandra.apache.org Subject: Re: unsubscribe
Re: unsubscribe
On Aug 27, 2012, at 4:16 PM, Nikolaidis, Christos cnikolai...@epsilon.com wrote: No worries :-) I was replying to the list so whoever manages it can unsubscribe me. That's not how you unsubscribe. You need to send an email to user-unsubscr...@cassandra.apache.org. André
can you use hostnames in the topology file?
In the example, I see all IPs being used, but our machines are on DHCP, so I would prefer using hostnames for everything (plus if a machine goes down, I can bring it back online on another machine with a different IP but the same hostname). If I use hostnames, does the listen_address have to be hardwired to that same EXACT hostname for lookup purposes as well? Or will localhost work to grab the hostname, though it looks like it grabs the IP? Thanks, Dean
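For reference, a cassandra-topology.properties for PropertyFileSnitch maps each node address to DC:rack. A hypothetical file using hostnames might look like the following; the stock examples use IPs, so whether hostnames resolve correctly on every node is something to verify against your version and DNS setup (all names and DCs below are made up):

```
cass01.example.com=DC1:RAC1
cass02.example.com=DC1:RAC2
cass03.example.com=DC2:RAC1

# nodes not listed above fall back to this entry
default=DC1:RAC1
```

Every node needs an identical copy of this file, and every name in it must resolve consistently from every node, or nodes will disagree about the topology.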
Re: optimizing use of sstableloader / SSTableSimpleUnsortedWriter
On Mon, Aug 27, 2012 at 1:19 AM, aaron morton aa...@thelastpickle.com wrote: After thinking about how sstables are done on disk, it seems best (required??) to write out each row at once. Sort of. We only want one instance of the row per SSTable created. Ah, good clarification, although I think for my purposes they're one and the same. Any other tips to improve load time or reduce the load on the cluster or subsequent compaction activity? Fewer SSTables mean less compaction. So go as high as you can on the bufferSizeInMB param for the SSTableSimpleUnsortedWriter. Ok. There is also an SSTableSimpleWriter. Because it expects rows to be ordered it does not buffer and can create bigger sstables. https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/io/sstable/SSTableSimpleWriter.java Hmmm, probably not realistic in my situation... doing so would likely thrash the disks on my PG server a lot more and kill my read throughput, and that server is already hitting a wall. Right now my Cassandra data store has about 4 months of data and we have 5 years of historical. Ingest all the histories! Actually, I was a little worried about how much space that would take... my estimate was ~305GB/year, which is a lot when you consider the 300-400GB/node limit (something I didn't know about at the time). However, compression has turned out to be extremely efficient on my dataset... just under 4 months of data is less than 2GB! I'm pretty thrilled. -- Aaron Turner http://synfin.net/ Twitter: @synfinatic http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix & Windows "Those who would give up essential Liberty, to purchase a little temporary Safety, deserve neither Liberty nor Safety." -- Benjamin Franklin carpe diem quam minimum credula postero
RE: Expanding cluster to include a new DR datacenter
Show schema output still shows the simple strategy: [default@unknown] show schema EBonding; create keyspace EBonding with placement_strategy = 'SimpleStrategy' and strategy_options = {replication_factor : 2} and durable_writes = true; This is the only thing I see in the system log at the time on all the nodes: INFO [MigrationStage:1] 2012-08-27 10:54:18,608 ColumnFamilyStore.java (line 659) Enqueuing flush of Memtable-schema_keyspaces@1157216346(183/228 serialized/live bytes, 4 ops) INFO [FlushWriter:765] 2012-08-27 10:54:18,612 Memtable.java (line 264) Writing Memtable-schema_keyspaces@1157216346(183/228 serialized/live bytes, 4 ops) INFO [FlushWriter:765] 2012-08-27 10:54:18,627 Memtable.java (line 305) Completed flushing /opt/cassandra/data/system/schema_keyspaces/system-schema_keyspaces-he-34817-Data.db (241 bytes) for commitlog p$ Should I turn the logging level up on something to see some more info maybe? From: aaron morton [mailto:aa...@thelastpickle.com] Sent: Monday, August 27, 2012 1:35 AM To: user@cassandra.apache.org Subject: Re: Expanding cluster to include a new DR datacenter I did a quick test on a clean 1.1.4 and it worked. Can you check the logs for errors? Can you see your schema change in there? Also what is the output from show schema; in the cli?
Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 25/08/2012, at 6:53 PM, Bryce Godfrey bryce.godf...@azaleos.commailto:bryce.godf...@azaleos.com wrote: Yes [default@unknown] describe cluster; Cluster Information: Snitch: org.apache.cassandra.locator.PropertyFileSnitch Partitioner: org.apache.cassandra.dht.RandomPartitioner Schema versions: 9511e292-f1b6-3f78-b781-4c90aeb6b0f6: [10.20.8.4, 10.20.8.5, 10.20.8.1, 10.20.8.2, 10.20.8.3] From: Mohit Anchlia [mailto:mohitanch...@gmail.comhttp://gmail.com] Sent: Friday, August 24, 2012 1:55 PM To: user@cassandra.apache.orgmailto:user@cassandra.apache.org Subject: Re: Expanding cluster to include a new DR datacenter That's interesting can you do describe cluster? On Fri, Aug 24, 2012 at 12:11 PM, Bryce Godfrey bryce.godf...@azaleos.commailto:bryce.godf...@azaleos.com wrote: So I'm at the point of updating the keyspaces from Simple to NetworkTopology and I'm not sure if the changes are being accepted using Cassandra-cli. I issue the change: [default@EBonding] update keyspace EBonding ... with placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy' ... and strategy_options={Fisher:2}; 9511e292-f1b6-3f78-b781-4c90aeb6b0f6 Waiting for schema agreement... ... schemas agree across the cluster Then I do a describe and it still shows the old strategy. Is there something else that I need to do? I've exited and restarted Cassandra-cli and it still shows the SimpleStrategy for that keyspace. Other nodes show the same information. [default@EBonding] describe EBonding; Keyspace: EBonding: Replication Strategy: org.apache.cassandra.locator.SimpleStrategy Durable Writes: true Options: [replication_factor:2] From: Bryce Godfrey [mailto:bryce.godf...@azaleos.commailto:bryce.godf...@azaleos.com] Sent: Thursday, August 23, 2012 11:06 AM To: user@cassandra.apache.orgmailto:user@cassandra.apache.org Subject: RE: Expanding cluster to include a new DR datacenter Thanks for the information! 
Answers my questions. From: Tyler Hobbs [mailto:ty...@datastax.com] Sent: Wednesday, August 22, 2012 7:10 PM To: user@cassandra.apache.org Subject: Re: Expanding cluster to include a new DR datacenter If you didn't see this particular section, you may find it useful: http://www.datastax.com/docs/1.1/operations/cluster_management#adding-a-data-center-to-a-cluster Some comments inline: On Wed, Aug 22, 2012 at 3:43 PM, Bryce Godfrey bryce.godf...@azaleos.com wrote: We are in the process of building out a new DR system in another Data Center, and we want to mirror our Cassandra environment to that DR. I have a couple questions on the best way to do this after reading the documentation on the Datastax website. We didn't initially plan for this to be a DR setup when first deployed a while ago due to budgeting, but now we need to. So I'm just trying to nail down the order of doing this as well as any potential issues. For the nodes, we don't plan on querying the servers in this DR until we fail over to this data center. We are going to have 5 similar nodes in the DR, should I join them into the ring at token+1? Join them at token+10 just to leave a little space. Make sure you're using LOCAL_QUORUM for your queries instead of regular QUORUM. All keyspaces are set to the replication strategy of SimpleStrategy. Can I change the replication strategy after joining the new nodes in the DR to NetworkTopologyStrategy with the updated replication factor for each DR? Switch your keyspaces over to
Re: Expanding cluster to include a new DR datacenter
In your update command is it possible to specify RF for both DC? You could just do DC1:2, DC2:0. On Mon, Aug 27, 2012 at 11:16 AM, Bryce Godfrey bryce.godf...@azaleos.com wrote: Show schema output shows the simple strategy still [default@unknown] show schema EBonding; create keyspace EBonding with placement_strategy = 'SimpleStrategy' and strategy_options = {replication_factor : 2} and durable_writes = true; This is the only thing I see in the system log at the time on all the nodes: INFO [MigrationStage:1] 2012-08-27 10:54:18,608 ColumnFamilyStore.java (line 659) Enqueuing flush of Memtable-schema_keyspaces@1157216346(183/228 serialized/live bytes, 4 ops) INFO [FlushWriter:765] 2012-08-27 10:54:18,612 Memtable.java (line 264) Writing Memtable-schema_keyspaces@1157216346(183/228 serialized/live bytes, 4 ops) INFO [FlushWriter:765] 2012-08-27 10:54:18,627 Memtable.java (line 305) Completed flushing /opt/cassandra/data/system/schema_keyspaces/system-schema_keyspaces-he-34817-Data.db (241 bytes) for commitlog p$ Should I turn the logging level up on something to see some more info maybe? From: aaron morton [mailto:aa...@thelastpickle.com] Sent: Monday, August 27, 2012 1:35 AM To: user@cassandra.apache.org Subject: Re: Expanding cluster to include a new DR datacenter I did a quick test on a clean 1.1.4 and it worked. Can you check the logs for errors? Can you see your schema change in there? Also what is the output from show schema; in the cli? 
Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com 
Re: one node with very high loads
On Mon, Aug 27, 2012 at 9:25 AM, Senthilvel Rangaswamy senthil...@gmail.com wrote: We are running 1.1.2 on m1.xlarge with ephemeral store for data. We are seeing very high loads on one of the nodes in the ring, 30+. My first hunch would be that you are sending all client requests to this one node, so it is coordinating 30x as many requests as it should. If that's not the case, if I were you I would attempt to determine if the high i/o is high read or write on the node, via a tool like iotop. You can also compare the tpstats of two nodes with similar uptimes to see if your node is performing more of any stage than other members of its cohort. Once you determine whether it's read or write, determine which files are being read or written.. :) =Rob -- =Robert Coli AIMGTALK - rc...@palominodb.com YAHOO - rcoli.palominob SKYPE - rcoli_palominodb
JMX(RMI) dynamic port allocation problem still exists?
In my previous job we ran across the issue that JMX allocates ports for RMI dynamically, so nodetool does not work when the environment is in EC2: all the ports have to be specifically opened, and we can't open a range of ports, only specific ports. At the time, we followed this: https://blogs.oracle.com/jmxetc/entry/connecting_through_firewall_using_jmx to create a small javaagent jar for Cassandra startup, so that we use a fixed RMI port. Now, does Cassandra come with an out-of-the-box solution to fix the above problem? Or do I have to create that little javaagent jar myself? Thanks Yang
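The javaagent approach from that blog post can be sketched roughly as below. This is a hedged reconstruction, not the jar Yang actually built: the class name, the `fixed.jmx.port` system property, and the default port are all my own placeholders. The key idea is to bind both the RMI registry and the RMI server object to the same fixed port, so only one port needs to be opened in the firewall.

```java
import java.lang.management.ManagementFactory;
import java.rmi.registry.LocateRegistry;
import java.util.HashMap;
import javax.management.MBeanServer;
import javax.management.remote.JMXConnectorServer;
import javax.management.remote.JMXConnectorServerFactory;
import javax.management.remote.JMXServiceURL;

public class FixedPortJmxAgent {

    // Starts a JMX connector server with both the registry and the
    // exported RMI object pinned to the same fixed port.
    public static JMXConnectorServer start(int port) throws Exception {
        LocateRegistry.createRegistry(port);
        MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
        // The first port in the URL fixes the RMI server object's port;
        // the second is the registry port. Using the same value for both
        // avoids the second, randomly chosen RMI port.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi://localhost:" + port
                + "/jndi/rmi://localhost:" + port + "/jmxrmi");
        JMXConnectorServer server =
                JMXConnectorServerFactory.newJMXConnectorServer(url, new HashMap<>(), mbs);
        server.start();
        return server;
    }

    // Entry point when loaded via -javaagent:fixedportjmxagent.jar
    public static void premain(String agentArgs) throws Exception {
        start(Integer.parseInt(System.getProperty("fixed.jmx.port", "7199")));
    }
}
```

With an agent like this on the Cassandra command line, nodetool can connect through a firewall that opens only the one configured port.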
cassandra twitter ruby client
Hi all, I'm playing with Cassandra's Ruby client written by Twitter, trying to perform a simple get, but it looks like it assumes the value types to be UTF-8 strings. However, my values are doubles (keys and column names are UTF8Type). The values that I get back look like: {Top:?\ufffd\ufffd\ufffd\u\u\u\u, ... } How do I pass a double serializer to the API client? Thank you. Yuhan
Re: JMX(RMI) dynamic port allocation problem still exists?
In cassandra-env.sh, search on JMX_PORT and it is set to 7199 (i.e. fixed), so that solves your issue, correct? Dean From: Yang tedd...@gmail.com Reply-To: user@cassandra.apache.org Date: Monday, August 27, 2012 3:44 PM To: user@cassandra.apache.org Subject: JMX(RMI) dynamic port allocation problem still exists? Now, does Cassandra come with an out-of-the-box solution to fix the above problem? Or do I have to create that little javaagent jar myself?
Re: JMX(RMI) dynamic port allocation problem still exists?
No, the problem is that JMX listens on 7199, but once an incoming connection is made it literally tells the other side to come and connect to it on two RMI ports, and opens up two random RMI ports. We used to use the trick in the above link to resolve this. On Aug 27, 2012 3:04 PM, Hiller, Dean dean.hil...@nrel.gov wrote: In cassandra-env.sh, search on JMX_PORT and it is set to 7199 (i.e. fixed), so that solves your issue, correct? Dean
Re: Counters and replication factor
On 25.5.2012 2:41, Edward Capriolo wrote: Also it does not sound like you have run anti-entropy repair. You should do that when upping RF. I run anti-entropy repairs and it still does not fix counters. I have some reports from users with the same problem, but nobody has discovered a repeatable scenario. I am currently migrating to the Infinispan data grid; it does not seem to have problems with distributed counters.
RE: Expanding cluster to include a new DR datacenter
Same results. I restarted the node also to see if it just wasn't picking up the changes and it still shows Simple. When I specify the DC for strategy_options I should be using the DC name from the property file snitch, right? Ours is Fisher and TierPoint, so that's what I used. From: Mohit Anchlia [mailto:mohitanch...@gmail.com] Sent: Monday, August 27, 2012 1:21 PM To: user@cassandra.apache.org Subject: Re: Expanding cluster to include a new DR datacenter In your update command is it possible to specify RF for both DC? You could just do DC1:2, DC2:0.
Re: cassandra twitter ruby client
That library requires you to serialize and deserialize the data yourself. So to insert a Ruby Float you would:

value = 28.21
@client.insert(:somecf, 'key', {'floatval' => [value].pack('G')})

and to read it back out:

value = @client.get(:somecf, 'key', ['floatval']).unpack('G')[0]

Note that the cassandra-cql library will do (most) typecasts for you. -psanford On Mon, Aug 27, 2012 at 2:49 PM, Yuhan Zhang yzh...@onescreen.com wrote: Hi all, I'm playing with Cassandra's Ruby client written by Twitter, trying to perform a simple get, but it looks like it assumes the value types to be UTF-8 strings. However, my values are doubles (keys and column names are UTF8Type). The values that I get back look like: {Top:?\ufffd\ufffd\ufffd\u\u\u\u, ... } How do I pass a double serializer to the API client? Thank you. Yuhan
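For what it's worth, Ruby's pack('G') produces an 8-byte big-endian IEEE 754 double, which is the same layout Cassandra's DoubleType comparator expects. A minimal cross-language sketch of that encoding in Java (class and method names are mine, purely illustrative):

```java
import java.nio.ByteBuffer;

public class DoubleCodec {
    // Encode a double as 8 big-endian bytes, matching Ruby's [x].pack('G')
    // and Cassandra's DoubleType on-disk representation.
    public static byte[] encode(double value) {
        return ByteBuffer.allocate(8).putDouble(value).array(); // ByteBuffer defaults to big-endian
    }

    // Decode 8 big-endian bytes back to a double, matching unpack('G').
    public static double decode(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getDouble();
    }
}
```

Any client in any language that produces this byte layout will interoperate with values written by the Ruby pack('G') approach above.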
Re: Expanding cluster to include a new DR datacenter
Can you describe your schema again with TierPoint in it? On Mon, Aug 27, 2012 at 3:22 PM, Bryce Godfrey bryce.godf...@azaleos.com wrote: Same results. I restarted the node also to see if it just wasn't picking up the changes and it still shows Simple. When I specify the DC for strategy_options I should be using the DC name from the property file snitch, right? Ours is "Fisher" and "TierPoint", so that's what I used.
Re: Dynamic Column Families in CQLSH v3
It's not possible to have Dynamic Columns in CQL 3. The CF definition must specify the column names you expect to store. The COMPACT STORAGE (http://www.datastax.com/docs/1.1/references/cql/CREATE_COLUMNFAMILY) clause of the CREATE COLUMNFAMILY statement means you can have column names that are part dynamic, part static. But if you want to have CFs where the app code controls the column names, you need to create the CF using the CLI and stick with the Thrift API (because SELECT in CQL 3 does not support arbitrary column slicing). Background: http://www.mail-archive.com/user@cassandra.apache.org/msg23636.html Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 24/08/2012, at 2:24 PM, Erik Onnen eon...@gmail.com wrote: Hello All, Attempting to create what the Datastax 1.1 documentation calls a Dynamic Column Family (http://www.datastax.com/docs/1.1/ddl/column_family#dynamic-column-families) via CQLSH. This works in v2 of the shell: create table data ( key varchar PRIMARY KEY) WITH comparator=LongType; When defined this way via the v2 shell, I can successfully switch to the v3 shell and query the CF fine. The same syntax in v3 yields: Bad Request: comparator is not a valid keyword argument for CREATE TABLE The 1.1 documentation indicates that comparator is a valid option for at least ALTER TABLE: http://www.datastax.com/docs/1.1/configuration/storage_configuration#comparator This leads me to believe that the correct way to create a dynamic column family is to create a table with no named columns and alter the table later, but that also does not work: create table data (key varchar PRIMARY KEY); yields: Bad Request: No definition found that is not part of the PRIMARY KEY So, my question is, how do I create a Dynamic Column Family via CQLSH v3? Thanks! -erik
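For reference, a cassandra-cli definition along the lines Aaron describes might look like the fragment below. This is a sketch under assumptions: the CF name and the validation classes are placeholders, not taken from Erik's schema.

```
create column family data
    with comparator = LongType
    and key_validation_class = UTF8Type
    and default_validation_class = UTF8Type;
```

A CF created this way has no predeclared column names, so application code is free to choose column names (here, longs) at write time via the Thrift API.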
sstableloader error
Hi, I had uploaded data using sstableloader to a single node cluster earlier without any problem. Now, while trying to upload to a 3 node cluster it is giving me the below error:

localhost:~/apache-cassandra-1.0.7/sstableloader_folder # bin/sstableloader DEMO/
Starting client (and waiting 30 seconds for gossip) ...
Streaming revelant part of DEMO/UMD-hc-1-Data.db to [/10.245.28.232, /10.245.28.231, /10.245.28.230]
progress: [/10.245.28.232 0/0 (100)] [/10.245.28.231 0/1 (0)] [/10.245.28.230 0/0 (100)] [total: 0 - 0MB/s (avg: 0MB/s)]
(the progress line above repeats many times with no change)
WARN 21:41:15,200 Failed attempt 1 to connect to /10.245.28.232 to stream null. Retrying in 2 ms. (java.net.ConnectException: Connection timed out)
(more unchanged progress lines, then interrupted with ^C)

I am running cassandra in the foreground. So, on all of the cassandra nodes I get the below message:

INFO 21:40:30,335 Node /192.168.11.11 is now part of the cluster
INFO 21:40:30,336 InetAddress /192.168.11.11 is now UP
INFO 21:41:55,320 InetAddress /192.168.11.11 is now dead.
INFO 21:41:55,321 FatClient /192.168.11.11 has been silent for 3ms, removing from gossip

I used ByteOrderedPartitioner and filled in the initial token on all nodes. I have set seeds as 10.245.28.230,10.245.28.231. I have properly set the listen address, rpc_address (0.0.0.0) and ports. One thing I noticed is that when I try to connect to this cluster using a client (libQtCassandra) and try to create a column family, all the nodes respond and the column family gets created properly. Can anyone help me please. Thanks and Regards, Swat.vikas
Re: cassandra twitter ruby client
Hi Peter, works well. Thanks a lot! :D Will check out cassandra-cql. Yuhan
Automating nodetool repair
Hi all, So nodetool repair has to be run regularly on all nodes. Does anybody have any interesting strategies or tools for doing this, or is everybody just setting up cron to do it? For example, one could write some Puppet code to splay the cron times around so that only one should be running at once. Or, perhaps, a central orchestrator that is given some known quiet time and works its way through the list, running nodetool repair one at a time (using RPC?) until it runs out of time. Cheers, Edward -- Edward Sargisson senior java developer Global Relay edward.sargis...@globalrelay.net 866.484.6630 New York | Chicago | Vancouver | London (+44.0800.032.9829) | Singapore (+65.3158.1301) Global Relay Archive supports email, instant messaging, BlackBerry, Bloomberg, Thomson Reuters, Pivot, YellowJacket, LinkedIn, Twitter, Facebook and more. Ask about Global Relay Message --- The Future of Collaboration in the Financial Services World. All email sent to or from this address will be retained by Global Relay's email archiving system. This message is intended only for the use of the individual or entity to which it is addressed, and may contain information that is privileged, confidential, and exempt from disclosure under applicable law. Global Relay will not be liable for any compliance or technical information provided herein. All trademarks are the property of their respective owners.
Re: Automating nodetool repair
I use cron. On one box I just do:

for n in node1 node2 node3 node4 ; do
    nodetool -h $n repair
    sleep 120
done

A lot easier than managing a bunch of individual crontabs IMHO. I suppose I could have done it with Puppet, but then you always have to keep an eye out that your repairs don't overlap over time. On Mon, Aug 27, 2012 at 4:52 PM, Edward Sargisson edward.sargis...@globalrelay.net wrote: Hi all, So nodetool repair has to be run regularly on all nodes. Does anybody have any interesting strategies or tools for doing this, or is everybody just setting up cron to do it? For example, one could write some Puppet code to splay the cron times around so that only one should be running at once. Or, perhaps, a central orchestrator that is given some known quiet time and works its way through the list, running nodetool repair one at a time (using RPC?) until it runs out of time. Cheers, Edward 
-- Aaron Turner http://synfin.net/ Twitter: @synfinatic http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix & Windows "Those who would give up essential Liberty, to purchase a little temporary Safety, deserve neither Liberty nor Safety." -- Benjamin Franklin carpe diem quam minimum credula postero
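Edward's "central orchestrator" idea can be sketched as a small sequential runner that shells out to nodetool, one host at a time. Everything here is illustrative, not an existing tool: the class name, host list, and pause length are assumptions, and a real orchestrator would also enforce the "quiet time" budget he mentions.

```java
import java.util.List;

public class RepairRunner {
    // Run "<tool> -h <host> repair" once per host, strictly sequentially,
    // pausing between runs so repairs never overlap (the concern raised
    // about splayed cron jobs drifting over time). Returns the number of
    // hosts whose repair command exited non-zero.
    public static int runAll(List<String> hosts, String tool, long pauseMillis) throws Exception {
        int failures = 0;
        for (String host : hosts) {
            Process p = new ProcessBuilder(tool, "-h", host, "repair")
                    .inheritIO() // stream repair output to our own stdout/stderr
                    .start();
            if (p.waitFor() != 0) {
                failures++;  // keep going, but remember the failed node
            }
            Thread.sleep(pauseMillis);
        }
        return failures;
    }
}
```

The sequential waitFor() is the important part: unlike independent cron entries on each node, a repair can never start until the previous one has finished.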
Re: QUORUM writes, QUORUM reads -- and eventual consistency
Cool - thanks to all for the replies. I believe I have what I need now. Philip On Aug 25, 2012, at 12:17 AM, Guillermo Winkler gwink...@inconcertcc.com wrote: Hi Philip, From http://wiki.apache.org/cassandra/ArchitectureOverview Quorum write: blocks until quorum is reached By my understanding if you _did_ a quorum write it means it successfully completed. Guille I *think* we're saying the same thing here. The addition of the word successful (or something more suitable) would make the documentation more precise, not less.
Re: Order of the cyclic group of hashed partitioners
Sorry, I don't understand your question. Can you explain it a bit more, or maybe someone else knows. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 27/08/2012, at 7:16 PM, Romain HARDOUIN romain.hardo...@urssaf.fr wrote: Thank you Aaron. This limit was pushed down in RandomPartitioner but the question still exists... aaron morton aa...@thelastpickle.com wrote on 26/08/2012 23:35:50: AbstractHashedPartitioner does not exist in the trunk. https://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=commitdiff;h=a89ef1ffd4cd2ee39a2751f37044dba3015d72f1 Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 24/08/2012, at 10:51 PM, Romain HARDOUIN romain.hardo...@urssaf.fr wrote: Hi, AbstractHashedPartitioner defines a maximum of 2**127, hence an order of (2**127)+1. I'd say that tokens of such partitioners are intended to be distributed in Z/(2**127), hence a maximum of (2**127)-1. Could there be a mix-up between maximum and order? This is a detail, but could someone confirm/invalidate? Regards, Romain
Re: Cassandra 1.1.4 RPM required
Dear Aaron, It requires a username and password, which I don't have. Can you share a direct link? There is no security on the wiki, you should be able to see http://wiki.apache.org/cassandra/GettingStarted What about this page? http://wiki.apache.org/cassandra/DebianPackaging Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 27/08/2012, at 8:14 PM, Marco Schirrmeister ma...@schirrmeister.net wrote: On Aug 23, 2012, at 12:15 PM, Adeel Akbar wrote: Dear Aaron, It requires a username and password, which I don't have. Can you share a direct link? There is no username and password for the Datastax rpm repository. http://rpm.datastax.com/community/ But there is no 1.1.4 version yet from Datastax. If you really need a 1.1.4 rpm, you can give my build a shot. I just started rolling my own packages for some reasons. Until my public rpm repo goes online, you can grab the cassandra rpm here: http://people.ogilvy.de/~mschirrmeister/linux/cassandra/ If you want, test it out. It's just a first build and not heavily tested. Marco