Re: CQL binary protocol
https://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=doc/native_protocol.spec

On Fri, Jan 25, 2013 at 10:15 AM, Vivek Mishra mishra.v...@gmail.com wrote:

Hi, I am looking for some sort of documentation on using the CQL binary protocol. Basically, I need documentation about the Cassandra native transport and its usage.

-Vivek
Re: CQL binary protocol
Thanks, Sylvain. Should I refer to the org.apache.cassandra.transport package for a code walkthrough?

-vivek
Re: CQL binary protocol
On Fri, Jan 25, 2013 at 10:29 AM, Vivek Mishra mishra.v...@gmail.com wrote:

I need to refer org.apache.cassandra.transport package for code walkthrough?

Yes, that's where the code is.

-- Sylvain
Re: CQL binary protocol
Any documentation for this?

-Vivek
Cassandra-cli : with CQL binary protocol
Is it a valid assumption that cassandra-cli will not work if the Cassandra server is started with the CQL binary protocol enabled?

-Vivek
Re: CQL binary protocol
On Fri, Jan 25, 2013 at 11:24 AM, Vivek Mishra mishra.v...@gmail.com wrote:

Any documentation for this?

For what? For the code? Everyone knows the code is its own documentation!

-- Sylvain
Re: CQL binary protocol
OK. I guess that's why there is a need for class/method-level Java comments.

-Vivek
Re: CQL binary protocol error
cqlsh does not use the native protocol (it may, and hopefully will, someday, but that's not the case right now).

-- Sylvain

On Fri, Jan 25, 2013 at 11:49 AM, Vivek Mishra mishra.v...@gmail.com wrote:

I have enabled *start_native_transport: true* in cassandra.yaml and then tried to connect from cqlsh as:

*/home/vivek/vivek/apache-cassandra-1.2.0/bin/cqlsh -3 localhost 9042*

But I am getting the error below:

vivek@vivek-pc ~/source/apache-cassandra-1.2.0-src $ /home/vivek/vivek/apache-cassandra-1.2.0/bin/cqlsh -3 localhost 9042
Traceback (most recent call last):
  File "/home/vivek/vivek/apache-cassandra-1.2.0/bin/cqlsh", line 2247, in <module>
    main(*read_options(sys.argv[1:], os.environ))
  File "/home/vivek/vivek/apache-cassandra-1.2.0/bin/cqlsh", line 2233, in main
    display_float_precision=options.float_precision)
  File "/home/vivek/vivek/apache-cassandra-1.2.0/bin/cqlsh", line 482, in __init__
    cql_version=cqlver, transport=transport)
  File "/home/vivek/vivek/apache-cassandra-1.2.0/bin/../lib/cql-internal-only-1.4.0.zip/cql-1.4.0/cql/connection.py", line 143, in connect
  File "/home/vivek/vivek/apache-cassandra-1.2.0/bin/../lib/cql-internal-only-1.4.0.zip/cql-1.4.0/cql/connection.py", line 59, in __init__
  File "/home/vivek/vivek/apache-cassandra-1.2.0/bin/../lib/cql-internal-only-1.4.0.zip/cql-1.4.0/cql/thrifteries.py", line 159, in establish_connection
  File "/home/vivek/vivek/apache-cassandra-1.2.0/bin/../lib/cql-internal-only-1.4.0.zip/cql-1.4.0/cql/cassandra/Cassandra.py", line 1255, in describe_version
  File "/home/vivek/vivek/apache-cassandra-1.2.0/bin/../lib/cql-internal-only-1.4.0.zip/cql-1.4.0/cql/cassandra/Cassandra.py", line 1265, in recv_describe_version
  File "/home/vivek/vivek/apache-cassandra-1.2.0/bin/../lib/thrift-python-internal-only-0.7.0.zip/thrift/protocol/TBinaryProtocol.py", line 126, in readMessageBegin
  File "/home/vivek/vivek/apache-cassandra-1.2.0/bin/../lib/thrift-python-internal-only-0.7.0.zip/thrift/protocol/TBinaryProtocol.py", line 203, in readI32
  File "/home/vivek/vivek/apache-cassandra-1.2.0/bin/../lib/thrift-python-internal-only-0.7.0.zip/thrift/transport/TTransport.py", line 63, in readAll
EOFError

Any idea?

-Vivek
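The traceback above comes from a Thrift client talking to the native transport port. The two protocols frame their messages differently, which a small sketch can make concrete. The one below builds the header of a native protocol OPTIONS request. The 8-byte layout and the values 0x01 (request version) and 0x05 (OPTIONS opcode) are my reading of version 1 of native_protocol.spec and should be checked against that document; the function names are made up for illustration.

```python
import struct

# Values as I understand them from version 1 of native_protocol.spec.
# Treat them as assumptions and verify them against the spec.
REQUEST_VERSION_V1 = 0x01
OPCODE_OPTIONS = 0x05

# version, flags, stream id, opcode: one byte each; body length: 4-byte big-endian int
HEADER_FORMAT = ">BBbBi"


def build_options_frame(stream_id=0):
    """Return the bytes of an OPTIONS request, which has an empty body."""
    flags = 0x00
    body = b""
    header = struct.pack(HEADER_FORMAT, REQUEST_VERSION_V1, flags,
                         stream_id, OPCODE_OPTIONS, len(body))
    return header + body


def parse_header(data):
    """Split the first 8 bytes of a frame into its header fields."""
    version, flags, stream, opcode, length = struct.unpack(HEADER_FORMAT, data[:8])
    return {"version": version, "flags": flags, "stream": stream,
            "opcode": opcode, "length": length}
```

A Thrift client such as the cqlsh shipped with 1.2.0 sends a differently framed message, so a server listening for frames like the one above has no way to answer it in a form the client understands.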
Re: CQL binary protocol error
Then the only possible way is to write my own Java program to connect/execute? And then restart again in Thrift mode to connect via cqlsh/cassandra-cli?

-Vivek
Re: Cassandra-cli : with CQL binary protocol
Well, cassandra-cli uses the Thrift protocol, so it will not work if you use start_rpc: false. Whether you start the binary protocol server has no real impact, as Cassandra is fine starting both servers (as long as you don't set the same port for both, for obvious reasons). Obviously, if you point cassandra-cli to the port used by the binary protocol, that won't work either.

-- Sylvain
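To make the port rules in this reply concrete, here is a minimal sketch that models which server listens on which port for a given set of cassandra.yaml options. The default values (Thrift on 9160 and enabled, native transport on 9042 and disabled) are what I believe a 1.2-era cassandra.yaml ships with and are an assumption; the helper names `listeners` and `can_connect` are hypothetical and not part of Cassandra.

```python
# Assumed defaults for illustration only; check your own cassandra.yaml.
DEFAULTS = {
    "start_rpc": True,
    "rpc_port": 9160,
    "start_native_transport": False,
    "native_transport_port": 9042,
}


def listeners(settings):
    """Map each listening port to the protocol served on it."""
    cfg = dict(DEFAULTS, **settings)
    result = {}
    if cfg["start_rpc"]:
        result[cfg["rpc_port"]] = "thrift"
    if cfg["start_native_transport"]:
        port = cfg["native_transport_port"]
        if port in result:
            # Both servers can run side by side, but not on the same port.
            raise ValueError("Thrift and native transport cannot share port %d" % port)
        result[port] = "native"
    return result


def can_connect(settings, client_protocol, port):
    """True if a client speaking client_protocol finds a matching server on port."""
    return listeners(settings).get(port) == client_protocol
```

With both servers enabled, `can_connect(cfg, "thrift", 9160)` is true while `can_connect(cfg, "thrift", 9042)` is false, which matches the point that cassandra-cli only fails when Thrift is off or when it is pointed at the binary protocol port.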
Re: Cassandra-cli : with CQL binary protocol
Does that mean starting two instances, one for Thrift-based access and another for native, on different ports of course? Once I enable *native_transport_port: 9042*, will it not start Cassandra Thrift automatically, even if *start_rpc: true*?

-Vivek
Re: CQL binary protocol
For beginners on any new open source project, it is always good to have at least some sort of comments!

-Vivek
Re: trouble setting up initial cluster: Host ID collision between active endpoint
Hi Ben,

Thanks for the tip. I will certainly check it out. I really appreciate the information!

Tim

On Thu, Jan 24, 2013 at 6:32 PM, Ben Bromhead b...@instaclustr.com wrote:

Hi Tim,

If you want to check out Cassandra on AWS, you should also have a look at www.instaclustr.com. We are still very much in beta (so if you come across anything, please let us know), but if you have a few minutes and want to deploy a cluster in just a few clicks, I highly recommend trying Instaclustr out.

Cheers,
Ben Bromhead
*Instaclustr*

On Fri, Jan 25, 2013 at 12:35 AM, Tim Dunphy bluethu...@gmail.com wrote:

Cool. Thanks for the advice, Aaron. I actually did get this working before I read your reply. The trick, apparently, was to use the IP of the first node in the seeds setting of each successive node. But I like the idea of using larges for an hour or so and terminating them for some basic experimentation. Also, thanks for pointing me to the DataStax AMIs. I'll be sure to check them out.

Tim

On Thu, Jan 24, 2013 at 3:45 AM, aaron morton aa...@thelastpickle.com wrote:

They both have 0 for their token, and this is stored in their System keyspace. Scrub them and start again.

But I found that the tokens that were being generated would require way too much memory

Token assignments have nothing to do with memory usage.

m1.micro instances

You are better off using your laptop than micro instances. For playing around, try m1.large and terminate them when not in use. To make life easier, use this to make the cluster for you: http://www.datastax.com/docs/1.2/install/install_ami

Cheers

-
Aaron Morton
Freelance Cassandra Developer
New Zealand
@aaronmorton
http://www.thelastpickle.com

On 24/01/2013, at 5:17 AM, Tim Dunphy bluethu...@gmail.com wrote:

Hello list,

I really do appreciate the advice I've gotten here as I start building familiarity with Cassandra. Aside from the single-node instance I set up for a developer friend, I've just been playing with a single node in a VM on my laptop and playing around with cassandra-cli and PHP.

Well, I've decided to set up my first cluster on my Amazon EC2 account, and I'm running into an issue getting the nodes to gossip. I've set the IPs of the 'node01' and 'node02' EC2 instances in their respective listen_address and rpc_address, and made sure that the 'cluster_name' on both was in agreement.

I believe the problem may be in one of two places: either the seeds or the initial_token setting.

For the seeds, I have it set up as follows. I put the IPs for both machines in the 'seeds' setting of each, thinking this would be how the nodes discover each other:

- seeds: 10.xxx.xxx.248,10.xxx.xxx.123

Initially I tried the tokengen script that I found in the documentation. But I found that the tokens that were being generated would require way too much memory for the m1.micro instances that I'm experimenting with on the Amazon free tier. And according to the docs in the config, it is in some cases OK to leave that field blank. So that's what I did on both instances.

Not sure how much/if this matters, but I am using the setting

- endpoint_snitch: Ec2Snitch

Finally, when I start up the first node, all goes well. But when I start up the second node, I see this exception on both hosts:

node1

INFO 11:02:32,231 Listening for thrift clients...
INFO 11:02:59,262 Node /10.xxx.xxx.123 is now part of the cluster
INFO 11:02:59,268 InetAddress /10.xxx.xxx.123 is now UP
ERROR 11:02:59,270 Exception in thread Thread[GossipStage:1,5,main]
java.lang.RuntimeException: Host ID collision between active endpoint /10..xxx.248 and /10.xxx.xxx.123 (id=54ce7ccd-1b1d-418e-9861-1c281c078b8f)
    at org.apache.cassandra.locator.TokenMetadata.updateHostId(TokenMetadata.java:227)
    at org.apache.cassandra.service.StorageService.handleStateNormal(StorageService.java:1296)
    at org.apache.cassandra.service.StorageService.onChange(StorageService.java:1157)
    at org.apache.cassandra.service.StorageService.onJoin(StorageService.java:1895)
    at org.apache.cassandra.gms.Gossiper.handleMajorStateChange(Gossiper.java:805)
    at org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:883)
    at org.apache.cassandra.gms.GossipDigestAck2VerbHandler.doVerb(GossipDigestAck2VerbHandler.java:43)
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)

And on node02 I see:

INFO 11:02:58,817 Starting Messaging Service on port 7000
INFO 11:02:58,835 Using saved token [0]
INFO 11:02:58,837 Enqueuing flush of Memtable-local@672636645(84/84 serialized/live bytes, 4 ops)
INFO 11:02:58,838 Writing Memtable-local@672636645(84/84
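On the token discussion in this thread: both nodes ended up with token 0, whereas a balanced ring gives each node a distinct, evenly spaced token. The sketch below shows the usual even-spacing arithmetic. The token ranges (0 to 2^127 for RandomPartitioner, -2^63 to 2^63 - 1 for Murmur3Partitioner) are what I understand those partitioners to use, and which partitioner this cluster runs is not stated in the thread, so both are assumptions; `initial_tokens` is a hypothetical helper, not the tokengen script mentioned above.

```python
def initial_tokens(node_count, partitioner="RandomPartitioner"):
    """Return evenly spaced initial_token values, one per node."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    if partitioner == "RandomPartitioner":
        # Assumed token space: 0 .. 2**127
        return [i * (2 ** 127) // node_count for i in range(node_count)]
    if partitioner == "Murmur3Partitioner":
        # Assumed token space: -2**63 .. 2**63 - 1
        return [i * (2 ** 64) // node_count - 2 ** 63 for i in range(node_count)]
    raise ValueError("unsupported partitioner: %s" % partitioner)
```

For a two-node ring this yields two different values per partitioner, so the nodes no longer share a token. The values are plain integers and say nothing about memory requirements, consistent with Aaron's remark.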
Re: Cassandra-cli : with CQL binary protocol
Any suggestions? Once I enable *native_transport_port: 9042*, will it not start Cassandra Thrift automatically, even if *start_rpc: true*? Or should it start listening on both ports?
JMX CF Beans
Just a quick question about the attributes exposed via JMX. I have some doc [1], but it doesn't help with CF beans.

The BloomFilterFalseRatio: is that the ratio of found vs. missed, or the ratio of false positives vs. the number of tests, or something else?

The ReadCount and WriteCount: how do they count with regard to the replication factor? As far as I understand, the read and write counts on the StorageProxy are the actual number of requests coming from clients. So, judging that the sum over all CFs of the reads and writes is nearly equal to the replication factor multiplied by the number of reads and writes on the StorageProxy, I am guessing that the reads and writes per CF are the replica ones. Am I right?

Nicolas

[1] http://wiki.apache.org/cassandra/JmxInterface
RE: JMX CF Beans
src/java/org/apache/cassandra/db/DataTracker.java:

public double getBloomFilterFalseRatio() { … return (double) falseCount / (trueCount + falseCount); … }

ReadCount/WriteCount on a CF is for this CF on this node only, so it counts only the local/internal reads/writes for the node's range.

Best regards,
Viktor Jevdokimov
Senior Developer
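As a worked instance of the formula quoted from DataTracker.java, the following sketch computes the same ratio from two counters. Returning 0.0 when both counters are zero is my own choice for the illustration, since the quoted snippet elides whatever the real method does in that case; the function name is likewise only illustrative.

```python
def bloom_filter_false_ratio(false_count, true_count):
    """False positives as a fraction of all positive Bloom filter answers."""
    total = true_count + false_count
    if total == 0:
        # Assumption: report 0.0 when the filter has not answered positively yet.
        return 0.0
    return false_count / total
```

So with 5 false positives and 95 true positives the ratio is 5 / (95 + 5), that is, false positives divided by all positive answers rather than by all lookups.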
Re: Unavaliable Exception
FWIW, I have a mixed Ubuntu 11.10 / 12.04 6-node cluster (AWS m1.xlarge). The load average is always between 0 and 5 on the 11.10 nodes, while the 12.04 nodes always show a load between 2 and 20. I have the same configuration on each node, and the average request latency is slightly better on the 12.04 nodes (which have the higher load). htop shows about the same CPU usage on all of these nodes, and so does iostat (same iowait, user, system...). I am not sure why the load is higher, but I am not sure that it is a problem either.

Alain

2013/1/25 Everton Lima peitin.inu...@gmail.com

Hello,

I was storing a lot of data in a cluster of 7 nodes with replication factor 3. The Load (CPU) value, which can be viewed in OpsCenter under cluster - List view, is rising above 15. Is there any way to reduce this?

Thanks

--
Everton Lima Aleixo
B.Sc. in Computer Science, UFG
M.Sc. student in Computer Science, UFG
Programmer at LUPA
Re: Unavaliable Exception
More nodes!
Re: JMX CF Beans
On Fri, Jan 25, 2013 at 8:07 AM, Nicolas Lalevée nicolas.lale...@hibnet.org wrote:

Just a quick question about the attributes exposed via JMX. I have some doc [1] but it doesn't help about CF beans. The BloomFilterFalseRatio, is that the ratio of found vs missed, or the ratio of false positive vs the number of tests, or something else?

False positives. You should be aware of this bug, though: https://issues.apache.org/jira/browse/CASSANDRA-4043

The ReadCount and WriteCount, how do they count regarding the replication factor? As far as I understand, the read and write on the StorageProxy is the actual number of requests coming from clients. So judging that the sum on all cf of the read and write is near equal to the replication factor multiply by the number of read and write on the StorageProxy, I am guessing that the read and write per cf are the replicas one. Am I right?

StorageProxy read/write counts should equal the number of client requests. ColumnFamily read/write counts correspond to actual, local data reads, so the sum of this number across all nodes will be approximately RF * the StorageProxy counts.

--
Tyler Hobbs
DataStax
http://datastax.com/
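The relationship described in this reply can be written as a quick sanity check. This is only a sketch of the arithmetic: the function names and the sample numbers are invented, and since the reply says the totals match only approximately, the check rounds rather than expecting equality.

```python
def expected_cf_total(storage_proxy_count, replication_factor):
    """Approximate sum of a per-CF counter across all nodes."""
    return storage_proxy_count * replication_factor


def implied_replication_factor(cf_counts_per_node, storage_proxy_count):
    """Estimate RF from per-node CF counters and the client request count."""
    if storage_proxy_count <= 0:
        raise ValueError("storage_proxy_count must be positive")
    return round(sum(cf_counts_per_node) / storage_proxy_count)
```

For example, if clients issued 1000 writes and the per-node WriteCount values for a CF add up to roughly 3000, the implied replication factor is 3, which is the pattern Nicolas observed.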
astyanax connection ring describe discovery
Is anyone using astyanax with Cassandra along with TOKEN AWARE, as described here (Cassandra version 1.1.4; see the Token Aware section)?

https://github.com/Netflix/astyanax/wiki/Configuration

We have maxConnsPerHost set to 20 right now and 3 seeds for our cluster, but astyanax is not discovering any other nodes. If it works for you, what versions of Cassandra and astyanax are you using? For now, we had to add all nodes to the seeds list instead so that it distributes among all nodes.

Thanks,
Dean
Re: CQL3 and clients for new Cluster
If your project isn't meant to go into production immediately and you want to use Cassandra 1.2+, you can give DataStax's new Java driver a try. It is available at https://github.com/datastax/java-driver

It is just a snapshot for now. A 1.0 release is expected in late March; meanwhile we'll iterate over a few beta releases.

This driver has been designed to be a lightweight and efficient way to communicate with Cassandra using CQL3. It comes with a simple API, an asynchronous architecture, and efficient failover and load-balancing handling.

Michaël

On Thu, Jan 24, 2013 at 4:21 PM, Peter Lin wool...@gmail.com wrote:

I use both Thrift and CQL. My biased take is to use CQL for select queries and Thrift for insert/update. I like being able to insert exactly the data type I want for the column name and value.

CQL is more user friendly, but it lacks the flexibility of Thrift in terms of using different data types for column names and values. The SQL metaphor only goes so far. My biased opinion: you're not getting the most out of Cassandra if you're only using CQL.

On Thu, Jan 24, 2013 at 6:38 PM, Aaron Turner synfina...@gmail.com wrote:

Either CQL or a higher-level API running on top of Thrift, like Hector/Astyanax/etc. Thrift is uh... painful.

On Thu, Jan 24, 2013 at 3:35 PM, Matthew Langton mjla...@gmail.com wrote:

Hi all,

I started looking at Cassandra a while ago and got used to the Thrift API. I put it on the back burner for a while, though, until now. To get back up to speed I have read a lot of documentation on the DataStax website, and it appears that the Thrift API is no longer considered the ideal way to interface with Cassandra. So my questions are these:

What is the future of the Thrift API? Should I just ignore it going forward and use CQL?

If CQL is the preferred way to interface with Cassandra, does using any of the clients listed here: http://wiki.apache.org/cassandra/ClientOptions provide me any benefits over using a JDBC driver like the one listed here: http://code.google.com/a/apache-extras.org/p/cassandra-jdbc/

Thanks,
Matt

--
Aaron Turner
http://synfin.net/
Twitter: @synfinatic
RE: change Storage port
I am facing a similar issue too. I am using BulkOutputFormat for loading data from Hadoop to Cassandra. All I need is to set the storage_port explicitly in my job/program. Can anyone help me set the storage_port configuration parameter from my Java program?

Anand B

From: chandra Varahala [mailto:hadoopandcassan...@gmail.com]
Sent: Friday, January 25, 2013 3:33 PM
To: user@cassandra.apache.org
Subject: change Storage port

Hello,

I am using two Cassandra instances/clusters (random byte order) on the same server and trying to load data from Hadoop to Cassandra. The storage ports are configured as 7000 and 7002 in the yaml files. Is there a way I can pass the storage port from the Hadoop driver for a specific instance/cluster?

Thanks,
chandra
Re: astyanax connection ring describe discovery
I use astyanax 1.56.18 with cassandra 1.1.5. Everything works as it is supposed to. What does ThreadPoolMonitor report?

Andrey
Re: astyanax connection ring describe discovery
Make sure you are setting the discovery type to ring describe or token aware. This is different from the connection pool type, which can also be token aware.

--
Derek Williams
Re: Cassandra pending compaction tasks keeps increasing
To recap the problem: 1.1.6 on SSD, 5 nodes, RF = 3, one CF only.

After the data load, all 5 nodes initially had a very even data size (135G each). I ran nodetool repair -pr on node 1, which has replicas on node 2 and node 3 since we set RF = 3. It appears that a huge amount of data got transferred. Node 1 has 220G; nodes 2 and 3 have around 170G. Pending LCS tasks on node 1 are at 15K, and nodes 2 and 3 have around 7K each.

Questions:

* Why does nodetool repair increase the data size that much? It's not likely that much data needs to be repaired. Will that happen for all subsequent repairs?
* How can I make LCS run faster? After almost a day, the LCS tasks have only dropped by 1000. I am afraid it will never catch up. We set
  * compaction_throughput_mb_per_sec = 500
  * multithreaded_compaction: true
  Both disk and CPU utilization are less than 10%. I understand LCS is single threaded; is there any chance to speed it up?
* We use the default SSTable size of 5M. Will increasing the SSTable size help? What will happen if I change the setting after the data is loaded?

Any suggestion is very much appreciated.

-Wei

- Original Message -
From: Wei Zhu wz1...@yahoo.com
To: user@cassandra.apache.org
Sent: Thursday, January 24, 2013 11:46:04 PM
Subject: Re: Cassandra pending compaction tasks keeps increasing

I believe I am running into this one: https://issues.apache.org/jira/browse/CASSANDRA-4765

By the way, I am using 1.1.6 (I thought I was using 1.1.7), and this one is fixed in 1.1.7.

- Original Message -
From: Wei Zhu wz1...@yahoo.com
To: user@cassandra.apache.org
Sent: Thursday, January 24, 2013 11:18:59 PM
Subject: Re: Cassandra pending compaction tasks keeps increasing

Thanks Derek. In cassandra-env.sh, it says:

# reduce the per-thread stack size to minimize the impact of Thrift
# thread-per-client. (Best practice is for client connections to
# be pooled anyway.) Only do so on Linux where it is known to be
# supported.
# u34 and greater need 180k
JVM_OPTS="$JVM_OPTS -Xss180k"

What value should I use? Java defaults at 400K? Maybe try that first.

Thanks.
-Wei

- Original Message -
From: Derek Williams de...@fyrie.net
To: user@cassandra.apache.org, Wei Zhu wz1...@yahoo.com
Sent: Thursday, January 24, 2013 11:06:00 PM
Subject: Re: Cassandra pending compaction tasks keeps increasing

Increasing the stack size in cassandra-env.sh should help you get past the stack overflow. It doesn't help with your original problem, though.

On Fri, Jan 25, 2013 at 12:00 AM, Wei Zhu wz1...@yahoo.com wrote:

Well, even after a restart, it throws the same exception. I am basically stuck. Any suggestion to clear the pending compaction tasks? Below is the end of the stack trace:

    at com.google.common.collect.Sets$1.iterator(Sets.java:578)
    at com.google.common.collect.Sets$1.iterator(Sets.java:578)
    at com.google.common.collect.Sets$1.iterator(Sets.java:578)
    at com.google.common.collect.Sets$1.iterator(Sets.java:578)
    at com.google.common.collect.Sets$3.iterator(Sets.java:667)
    at com.google.common.collect.Sets$3.size(Sets.java:670)
    at com.google.common.collect.Iterables.size(Iterables.java:80)
    at org.apache.cassandra.db.DataTracker.buildIntervalTree(DataTracker.java:557)
    at org.apache.cassandra.db.compaction.CompactionController.<init>(CompactionController.java:69)
    at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:105)
    at org.apache.cassandra.db.compaction.LeveledCompactionTask.execute(LeveledCompactionTask.java:50)
    at org.apache.cassandra.db.compaction.CompactionManager$1.runMayThrow(CompactionManager.java:154)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)

Any suggestion is very much appreciated.

-Wei

- Original Message -
From: Wei Zhu wz1...@yahoo.com
To: user@cassandra.apache.org
Sent: Thursday, January 24, 2013 10:55:07 PM
Subject: Re: Cassandra pending compaction tasks keeps increasing

Do you mean 90% of the reads should come from 1 SSTable?

By the way, after I finished the data migration, I ran nodetool repair -pr on one of the nodes. Before nodetool repair, all the nodes had the same disk space usage. After I ran nodetool repair, the disk space for that node jumped from 135G to 220G, and there are also more than 15000 pending compaction tasks. After a while, Cassandra started to throw exceptions like the one below and stopped compacting. I had to restart the node. By the way, we are using 1.1.7. Something doesn't seem right.

INFO [CompactionExecutor:108804] 2013-01-24
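On the SSTable size question raised in this thread, a rough back-of-the-envelope calculation shows how many SSTables a node would hold at a given target size. This is only arithmetic on the figures quoted in the thread (220G on node 1, 5M SSTables). It assumes 1G = 1024M and that every SSTable is exactly the target size, which ignores compression and partially filled files; `approx_sstable_count` is a hypothetical helper name.

```python
def approx_sstable_count(data_size_gb, sstable_size_mb):
    """Rough number of SSTables needed to hold data_size_gb of data."""
    if sstable_size_mb <= 0:
        raise ValueError("sstable_size_mb must be positive")
    total_mb = data_size_gb * 1024
    # Ceiling division without floating point
    return -(-total_mb // sstable_size_mb)
```

Under these assumptions, 220G at 5M per SSTable comes to about 45,000 files on the node, and doubling the SSTable size halves that count. Whether fewer, larger SSTables actually reduce the pending compaction backlog is not something this arithmetic can answer.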