Re: Linear scalability problems
If you double your nodes, you should be doubling your webservers too (that is, if you are trying to prove it scales linearly). We had to spend time finding the correct ratio for our application (it ended up being 19 webservers to 20 data nodes, so now we just assume 1 to 1)…..you can use Amazon to find that info very cheaply. Dean

From: Anand Somani meatfor...@gmail.com
Reply-To: user@cassandra.apache.org
Date: Thursday, April 4, 2013 1:05 PM
To: user@cassandra.apache.org
Subject: Re: Linear scalability problems

RF=3.

On Thu, Apr 4, 2013 at 7:08 AM, Cem Cayiroglu cayiro...@gmail.com wrote:

What was the RF before adding nodes? Sent from my iPhone

On 04 Apr 2013, at 15:12, Anand Somani meatfor...@gmail.com wrote:

We are using a single process with multiple threads; will look at client side delays. Thanks

On Wed, Apr 3, 2013 at 9:30 AM, Tyler Hobbs ty...@datastax.com wrote:

If I had to guess, I would say that your client is the bottleneck, not the cluster. Are you inserting data with multiple threads or processes?

On Wed, Apr 3, 2013 at 8:49 AM, Anand Somani meatfor...@gmail.com wrote:

Hi, I am running some tests trying to scale our application out from a 3 node cluster to a 6 node cluster. What I observed is that with the 3 node cluster I was able to handle about 41 req/second, so I added 3 more nodes thinking it should close to double, but instead it only goes up to about 47 req/second! I am probably doing something wrong and it is not obvious, so I wanted some help on which stats I could/should monitor to tell me things like whether one node gets more requests or whether the load distribution is not random enough. Note I am using direct Thrift (old code base) and Cassandra 1.1.6. The data model is for storing blobs (split across columns) and has around 6 CFs, RF=3, and all operations are at quorum. Also, at the end of the run nodetool ring reports the same data size. Thanks, Anand

-- Tyler Hobbs, DataStax (http://datastax.com/)
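As an illustration of Tyler's point about client-side bottlenecks, here is a minimal sketch of a multi-threaded load driver; insertBlob() is a hypothetical stand-in for whatever Thrift insert the application performs at QUORUM. If one process tops out around the same rate regardless of cluster size, the client, not the cluster, is the ceiling.

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class LoadDriver {
    // hypothetical stand-in for one Thrift insert at QUORUM
    static void insertBlob() { /* e.g. client.batch_mutate(...) */ }

    public static void main(String[] args) throws InterruptedException {
        final int threads = 32;                 // scale this with the cluster size
        final AtomicLong done = new AtomicLong();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long start = System.nanoTime();
        for (int t = 0; t < threads; t++) {
            pool.submit(new Runnable() {
                public void run() {
                    while (!Thread.currentThread().isInterrupted()) {
                        insertBlob();
                        done.incrementAndGet();
                    }
                }
            });
        }
        TimeUnit.SECONDS.sleep(60);             // measure for one minute
        pool.shutdownNow();
        double secs = (System.nanoTime() - start) / 1e9;
        System.out.printf("%.1f req/s%n", done.get() / secs);
    }
}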
Re: Data Modeling: How to keep track of arbitrarily inserted column names?
Since there are few column names, what you can do is this: make a reverse index with a low read repair chance and be aggressive with compaction. It will be many extra writes, but that is OK. The other option is to turn on the row cache and try read-before-write. It is a good case for the row cache because it is a very small data set.

On Thursday, April 4, 2013, Drew Kutcharian d...@venarc.com wrote:

I don't really need to answer "what rows contain column named X", so no need for a reverse index here. All I want is a distinct set of all the column names, so I can answer "what are all the available column names".

On Apr 4, 2013, at 4:20 PM, Edward Capriolo edlinuxg...@gmail.com wrote:

Your reverse index of which rows contain a column named X will have very wide rows. You could look at Cassandra's secondary indexing, or possibly look at a Solandra/Solr approach. Another option is to shift the problem slightly: which rows have column X that was added between time y and time z. Remember, with few distinct column names, that reverse index of column to row is going to be a very big list.

On Thu, Apr 4, 2013 at 5:45 PM, Drew Kutcharian d...@venarc.com wrote:

Hi Edward, I anticipate that the column names will be reused a lot. For example, key1 will be in many rows. So I think the number of distinct column names will be much much smaller than the number of rows. Is there a way to have a separate CF that keeps track of the column names? What I was thinking was to have a separate CF where I write only the column name with a null value every time I write a key/value to the main CF. In this case, if that column name exists, it will just be overwritten. Now if I wanted to get all the column names, I could just query that CF. Not sure if that's the best approach at high load (100k inserts a second). -- Drew

On Apr 4, 2013, at 12:02 PM, Edward Capriolo edlinuxg...@gmail.com wrote:

You cannot get only the column names (which you are calling keys), but you can use get_range_slices, which returns all the columns. When you specify an empty byte array (new byte[0]) as the start and finish, you get back all the columns. From there you can return only the column names to the user in a format that you like.

On Thu, Apr 4, 2013 at 2:18 PM, Drew Kutcharian d...@venarc.com wrote:

Hey Guys, I'm working on a project and one of the requirements is to have a schema-free CF where end users can insert arbitrary key/value pairs per row. What would be the best way to know all the keys that were inserted (preferably w/o any locking)? For example:
Row1 = key1 -> XXX, key2 -> XXX
Row2 = key1 -> XXX, key3 -> XXX
Row3 = key4 -> XXX, key5 -> XXX
Row4 = key2 -> XXX, key5 -> XXX
…
The query would be "give me all the inserted keys" and the response would be {key1, key2, key3, key4, key5}. Thanks, Drew
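For reference, a sketch of the empty-byte-array slice Edward describes, against the 1.1-era Thrift API; the CF name and page sizes are placeholders, and in practice you would page through rows by re-issuing the call with the last key seen as the new start_key.

import java.nio.ByteBuffer;
import java.util.List;
import org.apache.cassandra.thrift.*;

// assumes an open Cassandra.Client already set to the right keyspace
List<KeySlice> pageAllColumns(Cassandra.Client client) throws Exception {
    ByteBuffer empty = ByteBuffer.wrap(new byte[0]);
    // empty start/finish means "all columns", capped here at 1000 per row
    SliceRange range = new SliceRange(empty, empty, false, 1000);
    SlicePredicate predicate = new SlicePredicate().setSlice_range(range);
    // empty start/end keys walk the whole token range, 100 rows per page
    KeyRange keys = new KeyRange().setStart_key(new byte[0])
                                  .setEnd_key(new byte[0])
                                  .setCount(100);
    return client.get_range_slices(new ColumnParent("MyCF"), predicate,
                                   keys, ConsistencyLevel.QUORUM);
}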
Re: Really have to repair ?
Hi Cyril, According to the documentation (http://wiki.apache.org/cassandra/Operations#Frequency_of_nodetool_repair), I understand that it is not necessary to repair every node before gc_grace_seconds if you are sure that you never fail to run a repair each time a node is down longer than gc_grace_seconds.

"*IF* your operations team is sufficiently on the ball, you can get by without repair as long as you do not have hardware failure -- in that case, HintedHandoff is adequate to repair successful updates that some replicas have missed"

Am I wrong? Thoughts?

2013/4/4 cscetbon@orange.com

Hi, I know that deleted rows can reappear if node repair is not run on every node before *gc_grace_seconds* seconds. However, do we really need to obey this rule if we run node repair on nodes that are down for more than *max_hint_window_in_ms* milliseconds? Thanks -- Cyril SCETBON
Re: Really have to repair ?
That's exactly what I understood, and why I was using the max_hint_window_in_ms threshold to force a manual repair. -- Cyril SCETBON

On Apr 5, 2013, at 5:22 PM, Jean-Armel Luce jaluc...@gmail.com wrote:

Hi Cyril, According to the documentation (http://wiki.apache.org/cassandra/Operations#Frequency_of_nodetool_repair), I understand that it is not necessary to repair every node before gc_grace_seconds if you are sure that you never fail to run a repair each time a node is down longer than gc_grace_seconds.

"*IF* your operations team is sufficiently on the ball, you can get by without repair as long as you do not have hardware failure -- in that case, HintedHandoff is adequate to repair successful updates that some replicas have missed"

Am I wrong? Thoughts?

2013/4/4 cscetbon@orange.com

Hi, I know that deleted rows can reappear if node repair is not run on every node before gc_grace_seconds seconds. However, do we really need to obey this rule if we run node repair on nodes that are down for more than max_hint_window_in_ms milliseconds? Thanks -- Cyril SCETBON
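For what it's worth, sites that do rely on routine repair rather than the hint window typically stagger a weekly nodetool repair -pr per node, kept well inside gc_grace_seconds. An illustrative crontab entry (the schedule and log path are assumptions; vary the day of week per node so runs don't overlap):

# node 1: Sundays at 02:00; use a different day on each node
0 2 * * 0  nodetool repair -pr >> /var/log/cassandra/repair.log 2>&1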
Re: Problem loading Saved Key-Cache on 1.2.3 after upgrading from 1.1.10
"skipping sstable due to bloom filter debug messages" — What were these messages? Do you have the logs from the start up?

Cheers - Aaron Morton, Freelance Cassandra Consultant, New Zealand, @aaronmorton, http://www.thelastpickle.com

On 4/04/2013, at 6:11 AM, Arya Goudarzi gouda...@gmail.com wrote:

Hi, I have upgraded 2 nodes out of a 12 node test cluster from 1.1.10 to 1.2.3. During startup, while tailing C*'s system.log, I observed a series of SSTable batch load messages and "skipping sstable due to bloom filter" debug messages, which is normal for startup, but when it reached loading the saved key caches, it got stuck forever. The I/O wait stays high in the CPU graph and I/O ops are sent to disk, but C* never gets past the step of loading the key cache file. The saved key cache file was about 75MB on one node and 125MB on the other, and they were for different CFs.

[attachment: image.jpeg - CPU graph]

The CPU I/O wait constantly stayed at ~40% while system.log was stuck at loading one saved key cache file. I have marked that on the graph above. The workaround was to delete the saved cache files, and things loaded fine (see "Normal Startup" marked on the graph). These machines are m1.xlarge EC2 instances, and this issue happened on both upgraded nodes. This did not happen during the exercise of upgrading from 1.1.6 to 1.2.2 using the same snapshot. Should I raise a JIRA? -Arya
Re: Data Model and Query
"What's the recommendation on querying a data model like StartDate > “X” and counter > “Y”?" — It's not possible. If you are using secondary indexes, you have to have an equals clause in the statement.

Cheers - Aaron Morton, Freelance Cassandra Consultant, New Zealand, @aaronmorton, http://www.thelastpickle.com

On 4/04/2013, at 6:53 AM, shubham srivastava shubha...@gmail.com wrote:

Hi, What's the recommendation on querying a data model like StartDate > “X” and counter > “Y”? It's a kind of range query across multiple columns and the key. I have the flexibility to model the data for the above query accordingly. Regards, Shubham
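To make Aaron's constraint concrete with the Thrift API of this era: get_indexed_slices requires at least one EQ IndexExpression in the clause, so a range on StartDate is only accepted alongside an equality on an indexed column. A sketch under those assumptions (the CF and column names, and the serialized values, are placeholders):

import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.List;
import org.apache.cassandra.thrift.*;

// yValue/xValue are the serialized comparison values for the query
List<KeySlice> query(Cassandra.Client client, ByteBuffer yValue, ByteBuffer xValue)
        throws Exception {
    // the EQ expression is mandatory; a clause with only GT would be rejected
    IndexExpression pin   = new IndexExpression(
            ByteBuffer.wrap("counter".getBytes("UTF-8")), IndexOperator.EQ, yValue);
    IndexExpression range = new IndexExpression(
            ByteBuffer.wrap("StartDate".getBytes("UTF-8")), IndexOperator.GT, xValue);
    IndexClause clause = new IndexClause(Arrays.asList(pin, range),
            ByteBuffer.wrap(new byte[0]), 100);   // first 100 matching rows
    SlicePredicate all = new SlicePredicate().setSlice_range(new SliceRange(
            ByteBuffer.wrap(new byte[0]), ByteBuffer.wrap(new byte[0]), false, 100));
    return client.get_indexed_slices(new ColumnParent("MyCF"), clause, all,
            ConsistencyLevel.QUORUM);
}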
Re: Issues running Bulkloader program on AIX server
"Caused by: java.lang.UnsatisfiedLinkError: snappyjava (Not found in java.library.path)" — You do not have the snappy compression library installed. http://www.datastax.com/docs/1.1/troubleshooting/index#cannot-initialize-class-org-xerial-snappy-snappy

Cheers - Aaron Morton, Freelance Cassandra Consultant, New Zealand, @aaronmorton, http://www.thelastpickle.com

On 4/04/2013, at 1:36 PM, praveen.akun...@wipro.com wrote:

Hi All, Sorry, my environment is as below: 3 node cluster with Cassandra 1.1.9 provided with DSE 3.0 on Linux. We are trying to run the bulk loader from an AIX 6.1 server, Java version 1.5. Regards, Praveen

From: Praveen Akunuru praveen.akun...@wipro.com
Date: Thursday, April 4, 2013 12:21 PM
To: user@cassandra.apache.org
Subject: Issues running Bulkloader program on AIX server

Hi All, I am facing issues running the java Bulkloader program from an AIX server. The program works fine on a Linux server, but I receive the error below on AIX. Can anyone help me get this working?

java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
    at java.lang.reflect.Method.invoke(Method.java:611)
    at org.xerial.snappy.SnappyLoader.loadNativeLibrary(SnappyLoader.java:317)
    at org.xerial.snappy.SnappyLoader.load(SnappyLoader.java:219)
    at org.xerial.snappy.Snappy.<clinit>(Snappy.java:44)
    at java.lang.J9VMInternals.initializeImpl(Native Method)
    at java.lang.J9VMInternals.initialize(J9VMInternals.java:200)
    at org.apache.cassandra.io.compress.SnappyCompressor.create(SnappyCompressor.java:45)
    at org.apache.cassandra.io.compress.SnappyCompressor.isAvailable(SnappyCompressor.java:55)
    at org.apache.cassandra.io.compress.SnappyCompressor.<clinit>(SnappyCompressor.java:37)
    at java.lang.J9VMInternals.initializeImpl(Native Method)
    at java.lang.J9VMInternals.initialize(J9VMInternals.java:200)
    at org.apache.cassandra.config.CFMetaData.<clinit>(CFMetaData.java:82)
    at java.lang.J9VMInternals.initializeImpl(Native Method)
    at java.lang.J9VMInternals.initialize(J9VMInternals.java:200)
    at org.apache.cassandra.io.sstable.SSTableSimpleUnsortedWriter.<init>(SSTableSimpleUnsortedWriter.java:80)
    at org.apache.cassandra.io.sstable.SSTableSimpleUnsortedWriter.<init>(SSTableSimpleUnsortedWriter.java:93)
    at BulkLoadExample.main(BulkLoadExample.java:55)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
    at java.lang.reflect.Method.invoke(Method.java:611)
    at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:58)
Caused by: java.lang.UnsatisfiedLinkError: snappyjava (Not found in java.library.path)
    at java.lang.ClassLoader.loadLibraryWithPath(ClassLoader.java:1011)
    at java.lang.ClassLoader.loadLibraryWithClassLoader(ClassLoader.java:975)
    at java.lang.System.loadLibrary(System.java:469)
    at org.xerial.snappy.SnappyNativeLoader.loadLibrary(SnappyNativeLoader.java:52)
    ... 25 more

log4j:WARN No appenders could be found for logger (org.apache.cassandra.io.compress.SnappyCompressor).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Unhandled exception
Type=Segmentation error vmState=0x
J9Generic_Signal_Number=0004 Signal_Number=000b Error_Value= Signal_Code=0032
Handler1=09001000A06FF5A0 Handler2=09001000A06F60F0

Regards, Praveen
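On platforms where snappy-java ships no bundled native library (AIX among them), one hedged workaround is to build or install libsnappyjava yourself and point the JVM at it explicitly; the install path below is illustrative:

java -Djava.library.path=/opt/snappy/lib \
     -Dorg.xerial.snappy.use.systemlib=true \
     -jar bulkloader.jar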
Re: Lost data after expanding cluster c* 1.2.3-1
"but nothing's happening, how can I monitor the progress? and how can I know when it's finished?" — Check nodetool compactionstats.

Cheers - Aaron Morton, Freelance Cassandra Consultant, New Zealand, @aaronmorton, http://www.thelastpickle.com

On 4/04/2013, at 2:51 PM, Kais Ahmed k...@neteck-fr.com wrote:

Hi Aaron, I ran the command "nodetool rebuild_index host keyspace cf" on all the nodes. In the log I see:

INFO [RMI TCP Connection(5422)-10.34.139.xxx] 2013-04-04 08:31:53,641 ColumnFamilyStore.java (line 558) User Requested secondary index re-build for ...

but nothing's happening. How can I monitor the progress, and how can I know when it's finished? Thanks,

2013/4/2 aaron morton aa...@thelastpickle.com

"The problem comes from my not setting auto_bootstrap to true for the new nodes; it is not in this documentation (http://www.datastax.com/docs/1.2/install/expand_ami)" — auto_bootstrap defaults to true if not specified in the yaml.

"can I do that at any time, or only when the cluster is not loaded?" — Not sure what the question is. Both of those are online operations you can do while the node is processing requests.

Cheers - Aaron Morton, Freelance Cassandra Consultant, New Zealand, @aaronmorton, http://www.thelastpickle.com

On 1/04/2013, at 9:26 PM, Kais Ahmed k...@neteck-fr.com wrote:

"At this moment the errors started; we see that members and other data are gone. At this moment nodetool status returns (the 3 new nodes in red)" / "What errors?" — The errors were on my side in the application, not Cassandra errors.

"I put for each of them seeds = A's IP, and started each at two-minute intervals." / "When I'm making changes I tend to change a single node first, confirm everything is OK and then do a bulk change." — Thank you for that advice.

"I'm not sure what or why it went wrong, but that should get you to a stable place. If you have any problems keep an eye on the logs for errors or warnings." — The problem comes from my not setting auto_bootstrap to true for the new nodes; it is not in this documentation (http://www.datastax.com/docs/1.2/install/expand_ami).

"if you are using secondary indexes use nodetool rebuild_index to rebuild those." — Can I do that at any time, or only when the cluster is not loaded? Thanks Aaron,

2013/4/1 aaron morton aa...@thelastpickle.com

Please do not rely on colour in your emails; the best way to get your emails accepted by the Apache mail servers is to use plain text.

"At this moment the errors started; we see that members and other data are gone. At this moment nodetool status returns (the 3 new nodes in red)" — What errors?

"I put for each of them seeds = A's IP, and started each at two-minute intervals." — When I'm making changes I tend to change a single node first, confirm everything is OK and then do a bulk change.

"Now the cluster seems to work normally, but I can't use the secondary indexes for the moment; the query answers are random." — Run nodetool repair -pr on each node; let it finish before starting the next one. If you are using secondary indexes, use nodetool rebuild_index to rebuild those. Add one new node to the cluster and confirm everything is OK, then add the remaining ones. I'm not sure what or why it went wrong, but that should get you to a stable place. If you have any problems keep an eye on the logs for errors or warnings.

Cheers - Aaron Morton, Freelance Cassandra Consultant, New Zealand, @aaronmorton, http://www.thelastpickle.com

On 31/03/2013, at 10:01 PM, Kais Ahmed k...@neteck-fr.com wrote:

Hi Aaron, thanks for the reply. I will try to explain what happened exactly. I had a 4 C* node cluster [A,B,C,D] (1.2.3-1 version), started with the EC2 AMI (https://aws.amazon.com/amis/datastax-auto-clustering-ami-2-2) with this config: --clustername myDSCcluster --totalnodes 4 --version community. Two days after this cluster went into production, I saw that the cluster was overloaded, and I wanted to extend it by adding 3 more nodes. I created a new cluster with 3 C* nodes [D,E,F] (https://aws.amazon.com/amis/datastax-auto-clustering-ami-2-2) and followed the documentation (http://www.datastax.com/docs/1.2/install/expand_ami) for adding them to the ring. I put for each of them seeds = A's IP, and started each at two-minute intervals. At this moment the errors started; we saw that members and other data were gone. At this moment nodetool status returned (the 3 new nodes in red):

Datacenter: eu-west
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns   Host ID                               Rack
UN  10.34.142.xxx  10.79 GB  256     15.4%  4e2e26b8-aa38-428c-a8f5-e86c13eb4442  1b
UN  10.32.49.xxx   1.48 MB   256     13.7%  e86f67b6-d7cb-4b47-b090-3824a5887145  1b
UN
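One illustrative way to watch for the index rebuild finishing is to poll until compactionstats reports no pending work:

while true; do nodetool -h localhost compactionstats; sleep 10; done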
Re: upgrading 1.1.x to 1.2.x via sstableloader
"Is it safe to change sstable file names to avoid name collisions?" — Yes. Make sure to change the generation number for all the components.

Cheers - Aaron Morton, Freelance Cassandra Consultant, New Zealand, @aaronmorton, http://www.thelastpickle.com

On 4/04/2013, at 3:01 PM, Michał Czerwiński mic...@qubitproducts.com wrote:

I see, thanks for the reply! One more question: I can see that multiple nodes have the same sstable names for a certain keyspace/CF. I am moving 8 nodes to a 6 node cluster, so at some point when putting sstables in place I would overwrite files from another node. What is the best way to solve this problem? Is it safe to change sstable file names to avoid name collisions?

On 4 April 2013 02:54, aaron morton aa...@thelastpickle.com wrote:

"java.lang.UnsupportedOperationException: SSTable zzz/xxx/yyy-hf-47-Data.db is not compatible with current version ib" — You cannot stream files that have a different on-disk format. 1.2 can read the old files, but cannot accept them as streams. You can copy the files to the new machines and use nodetool refresh to load them, then upgradesstables to re-write them before running repair.

Cheers - Aaron Morton, Freelance Cassandra Consultant, New Zealand, @aaronmorton, http://www.thelastpickle.com

On 3/04/2013, at 10:53 PM, Michał Czerwiński mic...@qubitproducts.com wrote:

Does anyone know the best process to move data from Cassandra 1.1.x (1.1.7 to be more precise) to Cassandra 1.2.3? I am trying to use sstableloader to stream data to a new cluster, but I get:

ERROR [Thread-125] 2013-04-03 16:37:27,330 IncomingTcpConnection.java (line 183) Received stream using protocol version 5 (my version 6). Terminating connection
ERROR [Thread-141] 2013-04-03 16:38:05,704 CassandraDaemon.java (line 164) Exception in thread Thread[Thread-141,5,main]
java.lang.UnsupportedOperationException: SSTable zzz/xxx/yyy-hf-47-Data.db is not compatible with current version ib
    at org.apache.cassandra.streaming.StreamIn.getContextMapping(StreamIn.java:77)
    at org.apache.cassandra.streaming.IncomingStreamReader.<init>(IncomingStreamReader.java:87)
    at org.apache.cassandra.net.IncomingTcpConnection.stream(IncomingTcpConnection.java:238)
    at org.apache.cassandra.net.IncomingTcpConnection.handleStream(IncomingTcpConnection.java:178)
    at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:78)

I've changed Murmur3Partitioner to RandomPartitioner already, and I noticed I am not able to use 1.1.7's sstableloader, so I copied the sstables to the new nodes and tried loading them locally on Cassandra 1.2.3, but the protocol versions do not match (see error above). The reason I want to use sstableloader is that I have a different number of nodes and would like to avoid using rsync and then repair/cleanup of excessive data. Thanks!
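A hedged sketch of that rename for a 1.1-format ("hf") sstable, assuming bash: generation 47 is moved to 1047 (a number that must be unused on the target node) for every component file of the table:

# hypothetical: rename all components of one sstable to a free generation
for f in yyy-hf-47-*; do
  mv "$f" "${f/-hf-47-/-hf-1047-}"
done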
Re: nodetool status inconsistencies, repair performance and system keyspace compactions
Monitor the repair using nodetool compactionstats to see the merkle trees being created, and nodetool netstats to see data streaming. Also look in the logs for messages from AntiEntropyService.java; that will tell you how long the node waited for each replica to get back to it.

Cheers - Aaron Morton, Freelance Cassandra Consultant, New Zealand, @aaronmorton, http://www.thelastpickle.com

On 4/04/2013, at 5:42 PM, Ondřej Černoš cern...@gmail.com wrote:

Hi, most of this has been resolved - the "failed to uncompress" error was really a bug in Cassandra (see https://issues.apache.org/jira/browse/CASSANDRA-5391), and the problem with different load reporting is a change between 1.2.1 (which reports 100% for the 3 replicas/3 nodes/2 DCs setup I have) and 1.2.3, which reports the fraction. Is this correct?

Anyway, nodetool repair still takes ages to finish, considering only megabytes of unchanging data are involved in my test:

[root@host:/etc/puppet] nodetool repair ks
[2013-04-04 13:26:46,618] Starting repair command #1, repairing 1536 ranges for keyspace ks
[2013-04-04 13:47:17,007] Repair session 88ebc700-9d1a-11e2-a0a1-05b94e1385c7 for range (-2270395505556181001,-2268004533044804266] finished
...
[2013-04-04 13:47:17,063] Repair session 65d31180-9d1d-11e2-a0a1-05b94e1385c7 for range (1069254279177813908,1070290707448386360] finished
[2013-04-04 13:47:17,063] Repair command #1 finished

This is the status before the repair (by the way, after the datacenter was bootstrapped from the remote one):

[root@host:/etc/puppet] nodetool status
Datacenter: us-east
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address          Load     Tokens  Owns   Host ID                               Rack
UN  xxx.xxx.xxx.xxx  5.74 MB  256     17.1%  06ff8328-32a3-4196-a31f-1e0f608d0638  1d
UN  xxx.xxx.xxx.xxx  5.73 MB  256     15.3%  7a96bf16-e268-433a-9912-a0cf1668184e  1d
UN  xxx.xxx.xxx.xxx  5.72 MB  256     17.5%  67a68a2a-12a8-459d-9d18-221426646e84  1d
Datacenter: na-dev
==================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address          Load     Tokens  Owns   Host ID                               Rack
UN  xxx.xxx.xxx.xxx  5.74 MB  256     16.4%  eb86aaae-ef0d-40aa-9b74-2b9704c77c0a  cmp02
UN  xxx.xxx.xxx.xxx  5.74 MB  256     17.0%  cd24af74-7f6a-4eaa-814f-62474b4e4df1  cmp01
UN  xxx.xxx.xxx.xxx  5.74 MB  256     16.7%  1a55cfd4-bb30-4250-b868-a9ae13d81ae1  cmp05

Why does it take 20 minutes to finish? Fortunately the big number of compactions I reported in the previous email was not triggered. And is there documentation where I could find the exact semantics of repair when vnodes are used (and what -pr means in such a setup) and when run in a multiple-datacenter setup? I still don't quite get it. regards, Ondřej Černoš

On Thu, Mar 28, 2013 at 3:30 AM, aaron morton aa...@thelastpickle.com wrote:

"During one of my tests - see this thread in this mailing list: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/java-io-IOException-FAILED-TO-UNCOMPRESS-5-exception-when-running-nodetool-rebuild-td7586494.html" — That thread has been updated; check the bug Ondrej created.

"How will this perform in production with much bigger data if repair takes 25 minutes on 7MB and 11k compactions were triggered by the repair run?" — Seems a little odd. See what happens the next time you run repair.

Cheers - Aaron Morton, Freelance Cassandra Consultant, New Zealand, @aaronmorton, http://www.thelastpickle.com

On 27/03/2013, at 2:36 AM, Ondřej Černoš cern...@gmail.com wrote:

Hi all, I have 2 DCs, 3 nodes each, RF:3, and I use local quorum for both reads and writes. Currently I am testing various operational qualities of the setup. During one of my tests - see this thread in this mailing list: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/java-io-IOException-FAILED-TO-UNCOMPRESS-5-exception-when-running-nodetool-rebuild-td7586494.html - I ran into this situation:

- all nodes have all data and agree on it:

[user@host1-dc1:~] nodetool status
Datacenter: na-prod
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address          Load     Tokens  Owns (effective)  Host ID                               Rack
UN  XXX.XXX.XXX.XXX  7.74 MB  256     100.0%            0b1f1d79-52af-4d1b-a86d-bf4b65a05c49  cmp17
UN  XXX.XXX.XXX.XXX  7.74 MB  256     100.0%            039f206e-da22-44b5-83bd-2513f96ddeac  cmp10
UN  XXX.XXX.XXX.XXX  7.72 MB  256     100.0%            007097e9-17e6-43f7-8dfc-37b082a784c4  cmp11
Datacenter: us-east
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address          Load     Tokens  Owns (effective)  Host ID                               Rack
UN
Re: Repair hangs when merkle tree request is not acknowledged
"A repair on a certain CF will fail, and I run it again and again; eventually it will succeed." — How does it fail? Can you see the repair start on the other node?

If you are getting errors in the log about streaming failing because a node died, and the FailureDetector is in the call stack, change the phi_convict_threshold. You can set it in the yaml file or via JMX on the FailureDetectorMBean; in either case boost it from 8 to 16 to get the repair through. This will make it less likely that a node is marked as down; you probably want to run with 8 or a little bit higher normally.

Cheers - Aaron Morton, Freelance Cassandra Consultant, New Zealand, @aaronmorton, http://www.thelastpickle.com

On 4/04/2013, at 6:41 PM, Paul Sudol paulsu...@gmail.com wrote:

Hello, I have a cluster with 4 nodes, 2 nodes in each of 2 data centers. I had a hardware failure in one DC and had to replace the nodes. I'm running 1.2.3 on all of the nodes now. I was able to run nodetool rebuild on the two replacement nodes, but now I cannot finish a repair on any of them. I have 18 column families; if I run a repair on a single CF at a time, I can get the node repaired eventually. A repair on a certain CF will fail, and I run it again and again; eventually it will succeed. I've got an RF of 2, 1 copy in each DC, so the repair needs to pull data from the other DC to finish. The problem seems to be that the merkle tree request sometimes is not received by the node in the other DC. Usually when the merkle tree request is sent, the nodes it was sent to start a compaction/validation. In certain cases this does not happen: only the node that I ran the repair on will begin compaction/validation and send the merkle tree to itself. Then it's waiting for a merkle tree from the other node, which it will never get. After about 24 hours it will time out and say the node in question died. Is there a setting I can use to force the merkle tree request to be acknowledged, or resent if it's not acknowledged? I set up NTPD on all the nodes and tried the cross_node_timeout, but that did not help. Thanks in advance, Paul
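The yaml change Aaron describes is a one-liner, to be reverted toward the default once the repairs get through:

# cassandra.yaml - temporary while repairs run; default is 8
phi_convict_threshold: 16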
Re: gossip not working
Starting the node with the JVM option -Dcassandra.load_ring_state=false in cassandra-env.sh sometimes works. If not, post the output from nodetool gossipinfo.

Cheers - Aaron Morton, Freelance Cassandra Consultant, New Zealand, @aaronmorton, http://www.thelastpickle.com

On 5/04/2013, at 9:38 AM, S C as...@outlook.com wrote:

Is there a way to force gossip among the nodes?

From: as...@outlook.com
To: user@cassandra.apache.org
Subject: RE: gossip not working
Date: Thu, 4 Apr 2013 19:59:45 -0500

I am not seeing anything in the logs other than "Starting up server gossip", and there is no firewall between the nodes.

From: paulsu...@gmail.com
Subject: Re: gossip not working
Date: Thu, 4 Apr 2013 18:49:29 -0500
To: user@cassandra.apache.org

What errors are you seeing in the log files of the down nodes? Did you run upgradesstables? You need to run upgradesstables when moving from 1.1.7 to 1.1.9.

On Apr 4, 2013, at 6:11 PM, S C as...@outlook.com wrote:

I was in the middle of an upgrade to 1.1.9. I brought one node up with 1.1.9 while the others were running 1.1.5. Once that node was on 1.1.9, it no longer recognized the other nodes in the ring.

On 192.168.56.10 and .11:
192.168.56.10  DC1-Cass  RAC1  Up    Normal  28.06 GB  50.00%  0
192.168.56.11  DC1-Cass  RAC1  Up    Normal  31.59 GB  25.00%  42535295865117307932921825928971026432
192.168.56.12  DC1-Cass  RAC1  Down  Normal  29.02 GB  25.00%  85070591730234615865843651857942052864

On 192.168.56.12:
192.168.56.10  DC1-Cass  RAC1  Down  Normal  28.06 GB  50.00%  0
192.168.56.11  DC1-Cass  RAC1  Down  Normal  31.59 GB  25.00%  42535295865117307932921825928971026432
192.168.56.12  DC1-Cass  RAC1  Up    Normal  29.02 GB  25.00%  85070591730234615865843651857942052864

I do not see anything in the logs that tells me there is a gossip issue.

nodetool info
Token: 85070591730234615865843651857942052864
Gossip active: true
Thrift active: true
Load: 29.05 GB
Generation No: 1365114563
Uptime (seconds): 2127
Heap Memory (MB): 848.71 / 7945.94
Exceptions: 0
Key Cache: size 2208 (bytes), capacity 104857584 (bytes), 1056 hits, 1099 requests, 0.961 recent hit rate, 14400 save period in seconds
Row Cache: size 0 (bytes), capacity 0 (bytes), 0 hits, 0 requests, NaN recent hit rate, 0 save period in seconds

nodetool info
Token: 42535295865117307932921825928971026432
Gossip active: true
Thrift active: true
Load: 31.59 GB
Generation No: 1364413038
Uptime (seconds): 703904
Heap Memory (MB): 733.02 / 7945.94
Exceptions: 1
Key Cache: size 3693312 (bytes), capacity 104857584 (bytes), 26071678 hits, 26616282 requests, 0.980 recent hit rate, 14400 save period in seconds
Row Cache: size 0 (bytes), capacity 0 (bytes), 0 hits, 0 requests, NaN recent hit rate, 0 save period in seconds

There is no firewall between the nodes, and they can reach each other on the storage port. What else should I be looking at to find the root cause? Appreciate your inputs.
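For reference, the option usually goes at the bottom of cassandra-env.sh and is removed again after a clean start:

# cassandra-env.sh - discard the saved ring state on the next start
JVM_OPTS="$JVM_OPTS -Dcassandra.load_ring_state=false"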
Re: Repair hangs when merkle tree request is not acknowledged
"How does it fail?" — If I wait 24 hours, the repair command will return an error saying that the node died… but the node didn't really die; I watched it the whole time. I have DEBUG messages on in the log files. When the node I'm repairing sends out a merkle tree request, I will normally see "ColumnFamilyStore.java (line 700) forceFlush requested but everything is clean in COLUMN FAMILY NAME" in the log of the node that should be generating the merkle tree (in addition, when I run nodetool -h localhost compactionstats, I will see activity). When the node that should be generating a merkle tree does not log this message and shows no activity via nodetool compactionstats, the repair will fail. There are no errors about streaming; it does not even get to the point of streaming. One node will send requests for merkle trees, and sometimes the node in the other data center just doesn't get the message - at least that's what it looks like. Should I still try the phi_convict_threshold? Thanks! Paul

On Apr 5, 2013, at 12:19 PM, aaron morton aa...@thelastpickle.com wrote:

"A repair on a certain CF will fail, and I run it again and again; eventually it will succeed." — Can you see the repair start on the other node? If you are getting errors in the log about streaming failing because a node died, and the FailureDetector is in the call stack, change the phi_convict_threshold. You can set it in the yaml file or via JMX on the FailureDetectorMBean; in either case boost it from 8 to 16 to get the repair through. This will make it less likely that a node is marked as down; you probably want to run with 8 or a little bit higher normally.

Cheers - Aaron Morton, Freelance Cassandra Consultant, New Zealand, @aaronmorton, http://www.thelastpickle.com

On 4/04/2013, at 6:41 PM, Paul Sudol paulsu...@gmail.com wrote:

Hello, I have a cluster with 4 nodes, 2 nodes in each of 2 data centers. I had a hardware failure in one DC and had to replace the nodes. I'm running 1.2.3 on all of the nodes now. I was able to run nodetool rebuild on the two replacement nodes, but now I cannot finish a repair on any of them. I have 18 column families; if I run a repair on a single CF at a time, I can get the node repaired eventually. A repair on a certain CF will fail, and I run it again and again; eventually it will succeed. I've got an RF of 2, 1 copy in each DC, so the repair needs to pull data from the other DC to finish. The problem seems to be that the merkle tree request sometimes is not received by the node in the other DC. Usually when the merkle tree request is sent, the nodes it was sent to start a compaction/validation. In certain cases this does not happen: only the node that I ran the repair on will begin compaction/validation and send the merkle tree to itself. Then it's waiting for a merkle tree from the other node, which it will never get. After about 24 hours it will time out and say the node in question died. Is there a setting I can use to force the merkle tree request to be acknowledged, or resent if it's not acknowledged? I set up NTPD on all the nodes and tried the cross_node_timeout, but that did not help. Thanks in advance, Paul
Re: Data Modeling: How to keep track of arbitrarily inserted column names?
One thing I can do is to have a client-side cache of the keys to reduce the number of updates.

On Apr 5, 2013, at 6:14 AM, Edward Capriolo edlinuxg...@gmail.com wrote:

Since there are few column names, what you can do is this: make a reverse index with a low read repair chance and be aggressive with compaction. It will be many extra writes, but that is OK. The other option is to turn on the row cache and try read-before-write. It is a good case for the row cache because it is a very small data set.

On Thursday, April 4, 2013, Drew Kutcharian d...@venarc.com wrote:

I don't really need to answer "what rows contain column named X", so no need for a reverse index here. All I want is a distinct set of all the column names, so I can answer "what are all the available column names".

On Apr 4, 2013, at 4:20 PM, Edward Capriolo edlinuxg...@gmail.com wrote:

Your reverse index of which rows contain a column named X will have very wide rows. You could look at Cassandra's secondary indexing, or possibly look at a Solandra/Solr approach. Another option is to shift the problem slightly: which rows have column X that was added between time y and time z. Remember, with few distinct column names, that reverse index of column to row is going to be a very big list.

On Thu, Apr 4, 2013 at 5:45 PM, Drew Kutcharian d...@venarc.com wrote:

Hi Edward, I anticipate that the column names will be reused a lot. For example, key1 will be in many rows. So I think the number of distinct column names will be much much smaller than the number of rows. Is there a way to have a separate CF that keeps track of the column names? What I was thinking was to have a separate CF where I write only the column name with a null value every time I write a key/value to the main CF. In this case, if that column name exists, it will just be overwritten. Now if I wanted to get all the column names, I could just query that CF. Not sure if that's the best approach at high load (100k inserts a second). -- Drew

On Apr 4, 2013, at 12:02 PM, Edward Capriolo edlinuxg...@gmail.com wrote:

You cannot get only the column names (which you are calling keys), but you can use get_range_slices, which returns all the columns. When you specify an empty byte array (new byte[0]) as the start and finish, you get back all the columns. From there you can return only the column names to the user in a format that you like.

On Thu, Apr 4, 2013 at 2:18 PM, Drew Kutcharian d...@venarc.com wrote:

Hey Guys, I'm working on a project and one of the requirements is to have a schema-free CF where end users can insert arbitrary key/value pairs per row. What would be the best way to know all the keys that were inserted (preferably w/o any locking)? For example:
Row1 = key1 -> XXX, key2 -> XXX
Row2 = key1 -> XXX, key3 -> XXX
Row3 = key4 -> XXX, key5 -> XXX
Row4 = key2 -> XXX, key5 -> XXX
…
The query would be "give me all the inserted keys" and the response would be {key1, key2, key3, key4, key5}. Thanks, Drew
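A minimal sketch of such a client-side cache, assuming per-process deduplication is acceptable; writeNameToIndexCF() is a hypothetical helper that performs the actual insert into the names CF:

import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class ColumnNameIndexer {
    // column names already pushed to the names CF by this process
    private final Set<String> seen =
            Collections.newSetFromMap(new ConcurrentHashMap<String, Boolean>());

    public void record(String columnName) {
        // add() returns true only on first sight, so each name is written once per process
        if (seen.add(columnName)) {
            writeNameToIndexCF(columnName);
        }
    }

    private void writeNameToIndexCF(String columnName) {
        // hypothetical: insert (columnName -> null) into the names CF
    }
}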
Re: Problem loading Saved Key-Cache on 1.2.3 after upgrading from 1.1.10
Here is a chunk of the bloom filter sstable skip messages from the node I enabled DEBUG on:

DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39459
DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39483
DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39332
DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39335
DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39438
DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39478
DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39456
DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39469
DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,451 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39334
DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,451 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39406

This is the last chunk of log before C* gets stuck, right before I stop the process, remove the key caches and start again (this is from another node that I upgraded 2 days ago):

INFO [SSTableBatchOpen:2] 2013-04-03 01:59:39,769 SSTableReader.java (line 166) Opening /var/lib/cassandra/data/cardspring_production/UniqueIndexes/cardspring_production-UniqueIndexes-hf-316499 (5273270 bytes)
INFO [SSTableBatchOpen:1] 2013-04-03 01:59:39,858 SSTableReader.java (line 166) Opening /var/lib/cassandra/data/cardspring_production/UniqueIndexes/cardspring_production-UniqueIndexes-hf-314755 (5264359 bytes)
INFO [SSTableBatchOpen:2] 2013-04-03 01:59:39,894 SSTableReader.java (line 166) Opening /var/lib/cassandra/data/cardspring_production/UniqueIndexes/cardspring_production-UniqueIndexes-hf-314762 (5260887 bytes)
INFO [SSTableBatchOpen:1] 2013-04-03 01:59:39,980 SSTableReader.java (line 166) Opening /var/lib/cassandra/data/cardspring_production/UniqueIndexes/cardspring_production-UniqueIndexes-hf-315886 (5262864 bytes)
INFO [OptionalTasks:1] 2013-04-03 01:59:40,298 AutoSavingCache.java (line 112) reading saved cache /var/lib/cassandra/saved_caches/cardspring_production-UniqueIndexes-KeyCache

I finally upgraded all 12 nodes in our test environment yesterday. This issue seemed to exist on 7 of the 12 nodes. They didn't always get stuck on the same CF loading its saved KeyCache.

On Fri, Apr 5, 2013 at 9:56 AM, aaron morton aa...@thelastpickle.com wrote:

"skipping sstable due to bloom filter debug messages" — What were these messages? Do you have the logs from the start up?

Cheers - Aaron Morton, Freelance Cassandra Consultant, New Zealand, @aaronmorton, http://www.thelastpickle.com

On 4/04/2013, at 6:11 AM, Arya Goudarzi gouda...@gmail.com wrote:

Hi, I have upgraded 2 nodes out of a 12 node test cluster from 1.1.10 to 1.2.3. During startup, while tailing C*'s system.log, I observed a series of SSTable batch load messages and "skipping sstable due to bloom filter" debug messages, which is normal for startup, but when it reached loading the saved key caches, it got stuck forever. The I/O wait stays high in the CPU graph and I/O ops are sent to disk, but C* never gets past the step of loading the key cache file. The saved key cache file was about 75MB on one node and 125MB on the other, and they were for different CFs.

[attachment: image.jpeg - CPU graph]

The CPU I/O wait constantly stayed at ~40% while system.log was stuck at loading one saved key cache file. I have marked that on the graph above. The workaround was to delete the saved cache files, and things loaded fine (see "Normal Startup" marked on the graph). These machines are m1.xlarge EC2 instances, and this issue happened on both upgraded nodes. This did not happen during the exercise of upgrading from 1.1.6 to 1.2.2 using the same snapshot. Should I raise a JIRA? -Arya
Re: Data Model and Query
I would partition either with Cassandra's partitioning or PlayOrm partitioning and query like so: where beginOfMonth = x and startDate > X and counter > Y. This only returns stuff after X in that partition, though, so you may need to run multiple queries like this, and if you have billions of rows it could take some time… instead you may want startDate > X and startDate < Z such that Z and X are in the same month, or if they span 2-3 partitions, just run the 2-3 queries. I don't know enough detail on your use case to know if this works for you though. Dean

From: aaron morton aa...@thelastpickle.com
Reply-To: user@cassandra.apache.org
Date: Friday, April 5, 2013 10:59 AM
To: user@cassandra.apache.org
Subject: Re: Data Model and Query

"What's the recommendation on querying a data model like StartDate > “X” and counter > “Y”?" — It's not possible. If you are using secondary indexes, you have to have an equals clause in the statement.

Cheers - Aaron Morton, Freelance Cassandra Consultant, New Zealand, @aaronmorton, http://www.thelastpickle.com

On 4/04/2013, at 6:53 AM, shubham srivastava shubha...@gmail.com wrote:

Hi, What's the recommendation on querying a data model like StartDate > “X” and counter > “Y”? It's a kind of range query across multiple columns and the key. I have the flexibility to model the data for the above query accordingly. Regards, Shubham
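A sketch of the month-bucket idea with hypothetical names: one row key per month keeps partitions bounded, and a date range spanning several months becomes one column slice query per bucket (issue one slice from X to Z against each returned key).

import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Calendar;
import java.util.Date;
import java.util.List;

public class MonthBuckets {
    // one row key per month bucket, e.g. "events:201304"
    public static List<String> bucketKeys(Date from, Date to) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyyMM");
        Calendar c = Calendar.getInstance();
        c.setTime(from);
        c.set(Calendar.DAY_OF_MONTH, 1);          // normalize to the start of the month
        List<String> keys = new ArrayList<String>();
        while (!c.getTime().after(to)) {
            keys.add("events:" + fmt.format(c.getTime()));
            c.add(Calendar.MONTH, 1);
        }
        return keys;
    }
}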
Re: Problem loading Saved Key-Cache on 1.2.3 after upgrading from 1.1.10
This has happened before: the saved cache files were not compatible between 0.6 and 0.7. I have run into this a couple of other times as well. The good news is that the saved key cache is just an optimization; you can blow it away and it is not usually a big deal.

On Fri, Apr 5, 2013 at 2:55 PM, Arya Goudarzi gouda...@gmail.com wrote:

Here is a chunk of the bloom filter sstable skip messages from the node I enabled DEBUG on:

DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39459
DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39483
DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39332
DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39335
DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39438
DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39478
DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39456
DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39469
DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,451 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39334
DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,451 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39406

This is the last chunk of log before C* gets stuck, right before I stop the process, remove the key caches and start again (this is from another node that I upgraded 2 days ago):

INFO [SSTableBatchOpen:2] 2013-04-03 01:59:39,769 SSTableReader.java (line 166) Opening /var/lib/cassandra/data/cardspring_production/UniqueIndexes/cardspring_production-UniqueIndexes-hf-316499 (5273270 bytes)
INFO [SSTableBatchOpen:1] 2013-04-03 01:59:39,858 SSTableReader.java (line 166) Opening /var/lib/cassandra/data/cardspring_production/UniqueIndexes/cardspring_production-UniqueIndexes-hf-314755 (5264359 bytes)
INFO [SSTableBatchOpen:2] 2013-04-03 01:59:39,894 SSTableReader.java (line 166) Opening /var/lib/cassandra/data/cardspring_production/UniqueIndexes/cardspring_production-UniqueIndexes-hf-314762 (5260887 bytes)
INFO [SSTableBatchOpen:1] 2013-04-03 01:59:39,980 SSTableReader.java (line 166) Opening /var/lib/cassandra/data/cardspring_production/UniqueIndexes/cardspring_production-UniqueIndexes-hf-315886 (5262864 bytes)
INFO [OptionalTasks:1] 2013-04-03 01:59:40,298 AutoSavingCache.java (line 112) reading saved cache /var/lib/cassandra/saved_caches/cardspring_production-UniqueIndexes-KeyCache

I finally upgraded all 12 nodes in our test environment yesterday. This issue seemed to exist on 7 of the 12 nodes. They didn't always get stuck on the same CF loading its saved KeyCache.

On Fri, Apr 5, 2013 at 9:56 AM, aaron morton aa...@thelastpickle.com wrote:

"skipping sstable due to bloom filter debug messages" — What were these messages? Do you have the logs from the start up?

Cheers - Aaron Morton, Freelance Cassandra Consultant, New Zealand, @aaronmorton, http://www.thelastpickle.com

On 4/04/2013, at 6:11 AM, Arya Goudarzi gouda...@gmail.com wrote:

Hi, I have upgraded 2 nodes out of a 12 node test cluster from 1.1.10 to 1.2.3. During startup, while tailing C*'s system.log, I observed a series of SSTable batch load messages and "skipping sstable due to bloom filter" debug messages, which is normal for startup, but when it reached loading the saved key caches, it got stuck forever. The I/O wait stays high in the CPU graph and I/O ops are sent to disk, but C* never gets past the step of loading the key cache file. The saved key cache file was about 75MB on one node and 125MB on the other, and they were for different CFs.

[attachment: image.jpeg - CPU graph]

The CPU I/O wait constantly stayed at ~40% while system.log was stuck at loading one saved key cache file. I have marked that on the graph above. The workaround was to delete the saved cache files, and things loaded fine (see "Normal Startup" marked on the graph). These machines are m1.xlarge EC2 instances, and this issue happened on both upgraded nodes. This did not happen during the exercise of upgrading from 1.1.6 to 1.2.2 using the same snapshot. Should I raise a JIRA? -Arya
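Concretely, the workaround described in this thread amounts to something like the following (paths per the logs quoted above; adjust to your install and service manager):

sudo service cassandra stop          # or however the node is managed
rm /var/lib/cassandra/saved_caches/*-KeyCache*
sudo service cassandra start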
Re: Really have to repair ?
There are a series of edge cases that dictate the need for repair. The largest are 1) lost deletes and 2) random disk corruption. In our use case we only delete entire row keys, and if a row key comes back it is not actually a problem, because our software will find it and delete it again. In those places we dodge running repair, believe it or not. Edward

On Fri, Apr 5, 2013 at 11:22 AM, Jean-Armel Luce jaluc...@gmail.com wrote:

Hi Cyril, According to the documentation (http://wiki.apache.org/cassandra/Operations#Frequency_of_nodetool_repair), I understand that it is not necessary to repair every node before gc_grace_seconds if you are sure that you never fail to run a repair each time a node is down longer than gc_grace_seconds.

"*IF* your operations team is sufficiently on the ball, you can get by without repair as long as you do not have hardware failure -- in that case, HintedHandoff is adequate to repair successful updates that some replicas have missed"

Am I wrong? Thoughts?

2013/4/4 cscetbon@orange.com

Hi, I know that deleted rows can reappear if node repair is not run on every node before *gc_grace_seconds* seconds. However, do we really need to obey this rule if we run node repair on nodes that are down for more than *max_hint_window_in_ms* milliseconds? Thanks -- Cyril SCETBON
Counter batches query
Hi, I have an application that does batch (counter) writes to multiple CFs. The application itself is multi-threaded, and I'm using C* 1.2.2 and the Astyanax driver. Could someone share insights on the following?

1) When I look at the cluster write throughput graph in OpsCenter, the number does not reflect the actual number of writes. For example, if I issue a single batch write (internally containing 5 mutations), is the OpsCenter/JMX cluster/node write count supposed to indicate 1 or 5? (I would assume 5.)

2) I read that from C* 1.2.x there are atomic counter batches, which can cause a 30% performance hit - wondering if this is applicable to existing Thrift-based clients like Astyanax/Hector, and if so, what is the way to turn it off? Are there any server-side settings too?

Thanks!
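On question 1, a hedged Astyanax sketch (the CF names and serializers are placeholders): each withRow/increment below is a separate mutation carried inside one batch call, so a per-mutation write metric would indeed read 5 for a 5-mutation batch while the client sees a single execute().

import com.netflix.astyanax.Keyspace;
import com.netflix.astyanax.MutationBatch;
import com.netflix.astyanax.model.ColumnFamily;
import com.netflix.astyanax.serializers.StringSerializer;

public class CounterBump {
    // placeholder CF definitions; real names and serializers depend on your schema
    static final ColumnFamily<String, String> CF_HOURLY = new ColumnFamily<String, String>(
            "hourly_counters", StringSerializer.get(), StringSerializer.get());
    static final ColumnFamily<String, String> CF_DAILY = new ColumnFamily<String, String>(
            "daily_counters", StringSerializer.get(), StringSerializer.get());

    static void bump(Keyspace keyspace, String rowKey) throws Exception {
        MutationBatch m = keyspace.prepareMutationBatch();   // one client call...
        m.withRow(CF_HOURLY, rowKey).incrementCounterColumn("hits", 1);  // ...but
        m.withRow(CF_DAILY, rowKey).incrementCounterColumn("hits", 1);   // two mutations
        m.execute();
    }
}

On question 2, as I understand it the atomic-batch overhead applies to the separate atomic_batch_mutate Thrift call introduced in 1.2 (and to CQL3 atomic batches), so a client that keeps issuing plain batch_mutate should not pay it; whether and how your driver version opts in is worth checking.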
Re: schema disagrement exception
Thanks a lot; I have now solved this problem.

2013/3/28 aaron morton aa...@thelastpickle.com

Your cluster is angry: http://wiki.apache.org/cassandra/FAQ#schema_disagreement. If you are just starting out, I suggest blasting it away and restarting. Hope that helps.

- Aaron Morton, Freelance Cassandra Consultant, New Zealand, @aaronmorton, http://www.thelastpickle.com

On 26/03/2013, at 8:57 PM, zg zhang.i...@gmail.com wrote:

Hi, I just tried to set up a 2 node cluster. It seems to work, but when I use the CLI to create a keyspace I get a SchemaDisagreementException(). Does anyone know how to solve it? Thanks
Failed to connect to '127.0.0.1:7199'
Hi everyone, I have a 3 node cluster. On one node, when I use nodetool, I get the error "Failed to connect to '127.0.0.1:7199': Connection timed out", but the CLI and cqlsh work fine on that node. The other two nodes are fine when I use the nodetool command. So what's the problem with that node?