Reverting from VirtualNode
Once we set nodes to act as virtual nodes, is there a way to revert to manually assigned tokens? I have two nodes for testing; on them I set 'num_tokens: 256' and left the initial_token line commented out. Virtual nodes worked fine. But then I tried to switch back by commenting out the 'num_tokens' line and uncommenting 'initial_token'; even so, after starting Cassandra and running ./nodetool -h 'ip' ring, there are still the default 256 tokens per node. What am I missing? Att, *Víctor Hugo Molinar*
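One note for anyone following this thread: simply editing cassandra.yaml is usually not enough, because the node has already registered its 256 tokens in the system keyspace; the common approach is to decommission and re-bootstrap (or wipe the node's data) with initial_token set. Reverting also means picking balanced tokens by hand. A small sketch of the usual even-spacing arithmetic for Murmur3Partitioner (the 2.0 default, whose token range is [-2^63, 2^63 - 1]; the helper name is mine, not a Cassandra API):

```python
def initial_tokens(node_count, ring_range=2**64):
    """Evenly spaced Murmur3Partitioner tokens for `node_count` nodes.

    Token i sits at i * (ring size / node count), shifted to start at -2**63.
    """
    return [(ring_range // node_count) * i - 2**63 for i in range(node_count)]

# Two-node cluster: one node at the range minimum, the other at 0.
print(initial_tokens(2))  # [-9223372036854775808, 0]
```

Each value would go into one node's initial_token line (with num_tokens left commented out).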
Re: CentOS - Could not set up cluster (snappy error)
Thanks for all the answers. The problem was exactly the noexec setting in the fstab file. The Cassandra cluster started successfully after removing that entry. Att, *Víctor Hugo Molinar* On Mon, Dec 30, 2013 at 6:59 PM, Erik Forkalsud eforkals...@cj.com wrote: You can add something like this to cassandra-env.sh: JVM_OPTS="$JVM_OPTS -Dorg.xerial.snappy.tempdir=/path/that/allows/executables" - Erik - On 12/28/2013 08:36 AM, Edward Capriolo wrote: Check your fstab settings. On some systems /tmp has noexec set, and unpacking a library into /tmp and trying to run it does not work. On Fri, Dec 27, 2013 at 5:33 PM, Víctor Hugo Oliveira Molinar vhmoli...@gmail.com wrote: Hi, I'm not able to start a multi-node cluster in a CentOS environment due to a snappy loading error. Here is my current setup for both machines (Node 1 and 2). CentOS: CentOS release 6.5 (Final). Java: java version 1.7.0_25, Java(TM) SE Runtime Environment (build 1.7.0_25-b15), Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode). Cassandra version: 2.0.3. Also, I've already replaced the current snappy jar (snappy-java-1.0.5.jar) with the older one (snappy-java-1.0.4.1.jar).
However, the following error still happens when I try to start the second node: INFO 20:25:51,879 Handshaking version with /200.219.219.51 java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.xerial.snappy.SnappyLoader.loadNativeLibrary(SnappyLoader.java:312) at org.xerial.snappy.SnappyLoader.load(SnappyLoader.java:219) at org.xerial.snappy.Snappy.clinit(Snappy.java:44) at org.xerial.snappy.SnappyOutputStream.init(SnappyOutputStream.java:79) at org.xerial.snappy.SnappyOutputStream.init(SnappyOutputStream.java:66) at org.apache.cassandra.net.OutboundTcpConnection.connect(OutboundTcpConnection.java:359) at org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:150) Caused by: java.lang.UnsatisfiedLinkError: /tmp/snappy-1.0.4.1-libsnappyjava.so: /tmp/snappy-1.0.4.1-libsnappyjava.so: failed to map segment from shared object: Operation not permitted at java.lang.ClassLoader$NativeLibrary.load(Native Method) at java.lang.ClassLoader.loadLibrary1(ClassLoader.java:1957) at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1882) at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1843) at java.lang.Runtime.load0(Runtime.java:795) at java.lang.System.load(System.java:1061) at org.xerial.snappy.SnappyNativeLoader.load(SnappyNativeLoader.java:39) ... 
11 more ERROR 20:25:52,201 Exception in thread Thread[WRITE-/200.219.219.51 ,5,main] org.xerial.snappy.SnappyError: [FAILED_TO_LOAD_NATIVE_LIBRARY] null at org.xerial.snappy.SnappyLoader.load(SnappyLoader.java:229) at org.xerial.snappy.Snappy.clinit(Snappy.java:44) at org.xerial.snappy.SnappyOutputStream.init(SnappyOutputStream.java:79) at org.xerial.snappy.SnappyOutputStream.init(SnappyOutputStream.java:66) at org.apache.cassandra.net.OutboundTcpConnection.connect(OutboundTcpConnection.java:359) at org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:150) ERROR 20:26:22,924 Exception encountered during startup java.lang.RuntimeException: Unable to gossip with any seeds at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1160) at org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:416) at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:608) at org.apache.cassandra.service.StorageService.initServer(StorageService.java:576) at org.apache.cassandra.service.StorageService.initServer(StorageService.java:475) at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:346) at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:461) at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:504) What else can I do to fix it? Att, *Víctor Hugo Molinar*
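For anyone hitting the same "failed to map segment from shared object: Operation not permitted" error: it is the signature of /tmp being mounted noexec, so the extracted libsnappyjava.so cannot be mapped executable. A rough, illustrative way to spot noexec mounts from fstab-style text (the helper and the sample text are mine; in practice you would read /etc/fstab or /proc/mounts, or set the snappy tempdir JVM option as Erik suggested):

```python
def noexec_mounts(fstab_text):
    """Return mount points whose mount options include 'noexec'."""
    mounts = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip blanks and comments
        fields = line.split()
        # fstab fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 4 and 'noexec' in fields[3].split(','):
            mounts.append(fields[1])
    return mounts

sample = """
# /etc/fstab (hypothetical)
/dev/sda1  /      ext4   defaults              1 1
tmpfs      /tmp   tmpfs  noexec,nosuid,nodev   0 0
"""
print(noexec_mounts(sample))  # ['/tmp']
```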
CentOS - Could not set up cluster (snappy error)
Hi, I'm not able to start a multi-node cluster in a CentOS environment due to a snappy loading error. Here is my current setup for both machines (Node 1 and 2). CentOS: CentOS release 6.5 (Final). Java: java version 1.7.0_25, Java(TM) SE Runtime Environment (build 1.7.0_25-b15), Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode). Cassandra version: 2.0.3. Also, I've already replaced the current snappy jar (snappy-java-1.0.5.jar) with the older one (snappy-java-1.0.4.1.jar). However, the following error still happens when I try to start the second node: INFO 20:25:51,879 Handshaking version with /200.219.219.51 java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.xerial.snappy.SnappyLoader.loadNativeLibrary(SnappyLoader.java:312) at org.xerial.snappy.SnappyLoader.load(SnappyLoader.java:219) at org.xerial.snappy.Snappy.clinit(Snappy.java:44) at org.xerial.snappy.SnappyOutputStream.init(SnappyOutputStream.java:79) at org.xerial.snappy.SnappyOutputStream.init(SnappyOutputStream.java:66) at org.apache.cassandra.net.OutboundTcpConnection.connect(OutboundTcpConnection.java:359) at org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:150) Caused by: java.lang.UnsatisfiedLinkError: /tmp/snappy-1.0.4.1-libsnappyjava.so: /tmp/snappy-1.0.4.1-libsnappyjava.so: failed to map segment from shared object: Operation not permitted at java.lang.ClassLoader$NativeLibrary.load(Native Method) at java.lang.ClassLoader.loadLibrary1(ClassLoader.java:1957) at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1882) at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1843) at java.lang.Runtime.load0(Runtime.java:795) at java.lang.System.load(System.java:1061) at 
org.xerial.snappy.SnappyNativeLoader.load(SnappyNativeLoader.java:39) ... 11 more ERROR 20:25:52,201 Exception in thread Thread[WRITE-/200.219.219.51,5,main] org.xerial.snappy.SnappyError: [FAILED_TO_LOAD_NATIVE_LIBRARY] null at org.xerial.snappy.SnappyLoader.load(SnappyLoader.java:229) at org.xerial.snappy.Snappy.clinit(Snappy.java:44) at org.xerial.snappy.SnappyOutputStream.init(SnappyOutputStream.java:79) at org.xerial.snappy.SnappyOutputStream.init(SnappyOutputStream.java:66) at org.apache.cassandra.net.OutboundTcpConnection.connect(OutboundTcpConnection.java:359) at org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:150) ERROR 20:26:22,924 Exception encountered during startup java.lang.RuntimeException: Unable to gossip with any seeds at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1160) at org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:416) at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:608) at org.apache.cassandra.service.StorageService.initServer(StorageService.java:576) at org.apache.cassandra.service.StorageService.initServer(StorageService.java:475) at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:346) at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:461) at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:504) What else can I do to fix it? Att, *Víctor Hugo Molinar*
Re: Cleanup understanding
Thanks for the answers. I got it. I was using cleanup because I thought it would delete the tombstones. But that is still awkward. Does cleanup take so much disk space to complete the compaction operation? In other words, twice the size? *Atenciosamente,* *Víctor Hugo Molinar - *@vhmolinar http://twitter.com/#!/vhmolinar On Tue, May 28, 2013 at 9:55 PM, Takenori Sato (Cloudian) ts...@cloudian.com wrote: Hi Victor, As Andrey said, running cleanup doesn't work as you expect. The reason I need to clean things is that I won't need most of my inserted data on the next day. Deleted objects (columns/records) become deletable from the sstable file when they expire (after gc_grace_seconds). Such deletable objects are actually gotten rid of by compaction. The tricky part is that a deletable object remains unless all of its old objects (the same row key) are contained in the set of sstable files involved in the compaction. - Takenori (2013/05/29 3:01), Andrey Ilinykh wrote: cleanup removes data which doesn't belong to the current node. You have to run it only if you move (or add new) nodes. In your case there is no reason to do it. On Tue, May 28, 2013 at 7:39 AM, Víctor Hugo Oliveira Molinar vhmoli...@gmail.com wrote: Hello everyone. I have a daily maintenance task at C* which does: -truncate cfs -clearsnapshots -repair -cleanup The reason I need to clean things is that I won't need most of my inserted data on the next day. It's kind of a business requirement. Well, the problem I'm running into is a misunderstanding about the cleanup operation. I have 2 nodes at lower than half disk usage, which is more or less 13 GB. But over the last few days, each node has arbitrarily reported a cleanup error indicating that the disk was full, which is not true. *Error occured during cleanup* *java.util.concurrent.ExecutionException: java.io.IOException: disk full* So I'd like to know more about what happens in a cleanup operation. I appreciate any help.
Cleanup understanding
Hello everyone. I have a daily maintenance task at C* which does: -truncate cfs -clearsnapshots -repair -cleanup The reason I need to clean things is that I won't need most of my inserted data on the next day. It's kind of a business requirement. Well, the problem I'm running into is a misunderstanding about the cleanup operation. I have 2 nodes at lower than half disk usage, which is more or less 13 GB. But over the last few days, each node has arbitrarily reported a cleanup error indicating that the disk was full, which is not true. *Error occured during cleanup* *java.util.concurrent.ExecutionException: java.io.IOException: disk full* So I'd like to know more about what happens in a cleanup operation. I appreciate any help.
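On the "twice the size" question raised in this thread: operations like cleanup and compaction rewrite SSTables, so while a rewrite is in flight the node holds both the old and the new files. In the worst case (nothing gets discarded) the headroom needed is roughly the size of the data being rewritten, which is why a node well under half full can still hit "disk full". A toy illustration (all sizes hypothetical):

```python
def worst_case_headroom_gb(sstable_sizes_gb):
    """Worst-case extra disk needed to rewrite a set of SSTables:
    if nothing is discarded, the rewritten copy is as large as the input,
    so total usage temporarily approaches twice the data size."""
    return sum(sstable_sizes_gb)

sizes = [5, 4, 3, 1]  # hypothetical SSTables of one column family, in GB
print(worst_case_headroom_gb(sizes))  # 13 -- as much free space again as the data itself
```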
Re: C++ Thrift client
Aaron, whenever I get a GCInspector event in the log, does it mean that I'm having a GC pause? *Atenciosamente,* *Víctor Hugo Molinar - *@vhmolinar http://twitter.com/#!/vhmolinar On Thu, May 16, 2013 at 8:53 PM, aaron morton aa...@thelastpickle.com wrote: (Assuming you have enabled tcp_nodelay on the client socket) Check the server side latency, using nodetool cfstats or nodetool cfhistograms. Check the logs for messages from the GCInspector about ParNew pauses. Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com On 16/05/2013, at 12:58 PM, Bill Hastings bllhasti...@gmail.com wrote: Hi All, I am doing very small inserts into Cassandra, in the range of say 64 bytes. I use a C++ Thrift client and seem to consistently get latencies anywhere between 35-45 ms. Could someone please advise as to what might be happening? thanks
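On the GCInspector question: the inspector only logs collections that exceed a reporting threshold, so a line does indicate a pause, though not necessarily a harmful one; the duration it reports is what matters. A rough sketch of pulling collector name and pause time out of a 1.x-style log line (the sample line is illustrative and the exact format may differ between versions):

```python
import re

# Approximate GCInspector line shape from Cassandra 1.x system.log (example only):
LOG = ("INFO [ScheduledTasks:1] 2013-05-16 20:12:01,123 GCInspector.java "
       "(line 119) GC for ParNew: 412 ms for 2 collections, 1234567 used; "
       "max is 8506048512")

def gc_pause_ms(line):
    """Extract (collector, pause in ms) from a GCInspector line, or None."""
    m = re.search(r"GC for (\w+): (\d+) ms", line)
    return (m.group(1), int(m.group(2))) if m else None

print(gc_pause_ms(LOG))  # ('ParNew', 412)
```

Frequent long ParNew or CMS pauses in these lines would line up with the 35-45 ms client latencies being discussed.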
Re: Delete Issues with cassandra cluster
What is the consistency level of your read and write operations? On Mon, Mar 25, 2013 at 8:39 AM, Byron Wang byron.w...@woowteam.com wrote: Hi, I'm using Cassandra 1.2.3. I've successfully clustered 3 machines and created a keyspace with replication factor 3. Node1 seeds Node2, Node2 seeds Node1, Node3 seeds Node1. I insert an entry using node1. Using cqlsh from another node, I try to delete the item by sending out the delete command. After sending the command there seems to be no error, but when I try to select the item it is still there. When I try to send the same delete command from node1's cqlsh it seems to work. Basically, any delete command I send from the other nodes doesn't work unless I issue it from node1. However, I can select the items using the other nodes. Is this a problem? I can't seem to modify objects inserted via node1 using the other nodes. Truncate works though. Please help. Thanks! Byron
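The consistency-level question matters here because Byron's symptom is exactly what weak consistency looks like: a delete acknowledged by some replicas can be invisible to a read that consults others. The usual rule of thumb is that a read is guaranteed to see the latest write when the write and read replica sets must overlap, i.e. W + R > RF. A minimal sketch of that arithmetic (treating consistency levels as replica counts):

```python
def strongly_consistent(write_replicas, read_replicas, rf):
    """True when every read replica set must intersect every write
    replica set, i.e. W + R > RF (the quorum overlap condition)."""
    return write_replicas + read_replicas > rf

rf = 3
quorum = rf // 2 + 1  # 2 when RF=3
print(strongly_consistent(quorum, quorum, rf))  # True:  QUORUM/QUORUM overlaps
print(strongly_consistent(1, 1, rf))            # False: ONE/ONE can miss the delete
```

With RF=3, QUORUM writes plus QUORUM reads (2 + 2 > 3) would make the delete visible from any coordinator node.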
Truncate behaviour
Hello guys! I'm researching the behaviour of truncate operations in Cassandra. Reading the official wiki page (http://wiki.apache.org/cassandra/API) we can understand it as: *Removes all the rows from the given column family.* And reading the DataStax page (http://www.datastax.com/docs/1.0/references/cql/TRUNCATE) we can understand it as: *A TRUNCATE statement results in the immediate, irreversible removal of all data in the named column family.* But I think there is a missing and important point about truncate operations. At least in version 1.2.0, whenever I run a truncate operation, C* automatically creates a snapshot of the column family, resulting in fake free disk space. I'm intentionally saying 'fake free disk space' because I only figured it out when the machine's disk usage was high. - Is it a safety behaviour of C* to create snapshots for each CF before a truncate operation? - In my scenario I need to purge my column family data every day. I thought that truncate could handle it, based on the docs, but it doesn't. And since I don't want to manually delete those snapshots, I'd like to know if there is a safe and practical way to perform a daily purge of this CF data. Thanks in advance!
Re: Java client options for C* v1.2
I guess Hector fits your requirements; the last release is pretty new. But I'd suggest you take a look at Astyanax too. On Tue, Mar 19, 2013 at 6:34 PM, Marko Asplund marko.aspl...@gmail.com wrote: Hi, I'm about to start my first Cassandra project and am a bit puzzled by the multitude of different client options available for Java. Are there any good comparisons of the different options that have been done recently? I'd like to choose a client that - is feature complete (provides access to all major Cassandra features) - works well with Cassandra 1.2 - is being actively developed - is widely used and has an active community. Some of the clients don't appear to be actively developed (at least based on latest release dates). Any recommendations? marko
Re: Truncate behaviour
Hum, my bad. Thank you! On Tue, Mar 19, 2013 at 11:55 PM, Wei Zhu wz1...@yahoo.com wrote: There is a setting in the cassandra.yaml file which controls that. # Whether or not a snapshot is taken of the data before keyspace truncation # or dropping of column families. The STRONGLY advised default of true # should be used to provide data safety. If you set this flag to false, you will # lose data on truncation or drop. auto_snapshot: true - Original Message - From: Víctor Hugo Oliveira Molinar vhmoli...@gmail.com To: user@cassandra.apache.org Sent: Tuesday, March 19, 2013 11:50:35 AM Subject: Truncate behaviour Hello guys! I'm researching the behaviour of truncate operations in Cassandra. Reading the official wiki page (http://wiki.apache.org/cassandra/API) we can understand it as: Removes all the rows from the given column family. And reading the DataStax page (http://www.datastax.com/docs/1.0/references/cql/TRUNCATE) we can understand it as: A TRUNCATE statement results in the immediate, irreversible removal of all data in the named column family. But I think there is a missing and important point about truncate operations. At least in version 1.2.0, whenever I run a truncate operation, C* automatically creates a snapshot of the column family, resulting in fake free disk space. I'm intentionally saying 'fake free disk space' because I only figured it out when the machine's disk usage was high. - Is it a safety behaviour of C* to create snapshots for each CF before a truncate operation? - In my scenario I need to purge my column family data every day. I thought that truncate could handle it, based on the docs, but it doesn't. And since I don't want to manually delete those snapshots, I'd like to know if there is a safe and practical way to perform a daily purge of this CF data. Thanks in advance!
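To summarize the thread: with auto_snapshot: true, every truncate (or drop) leaves a snapshot behind, which is where the "fake free disk space" goes, and nodetool clearsnapshot is the supported way to reclaim it. The sketch below just shows where those snapshots accumulate on disk (the <data_dir>/<keyspace>/<cf>/snapshots layout is assumed, and the helper name is mine):

```python
import os
import tempfile

def find_snapshot_dirs(data_dir):
    """Locate 'snapshots' directories under a Cassandra data directory.
    Useful for auditing disk usage; removal should go through
    nodetool clearsnapshot rather than rm."""
    hits = []
    for root, dirs, _files in os.walk(data_dir):
        if 'snapshots' in dirs:
            hits.append(os.path.join(root, 'snapshots'))
    return hits

# Demo against a throwaway layout mimicking <data>/<keyspace>/<cf>/snapshots
base = tempfile.mkdtemp()
os.makedirs(os.path.join(base, 'ks1', 'events', 'snapshots'))
print(find_snapshot_dirs(base))  # one path ending in .../ks1/events/snapshots
```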
Column Slice Query performance after deletions
Hello guys. I'm investigating the reasons for performance degradation in my scenario, which follows: - I have a column family which is filled with thousands of columns inside a single row (varying between 10k ~ 200k). I also have thousands of rows, not much more than 15k. - These rows are constantly updated, but the write load is not that intensive. I estimate it at 100 writes/sec on the column family. - Each column represents a message which is read and processed by another process. After reading it, the column is marked for deletion in order to keep it out of the next query on this row. Ok, so, I've figured out that after many insertions plus deletion updates, my queries (column slice queries) are taking more time to be performed, even if there are only a few columns, fewer than 100. So it looks like the larger the number of deleted columns, the longer the time spent on a query. - Internally in C*, does a column slice query range over deleted columns? If so, how can I mitigate the impact on my queries? Or, how can I avoid those deleted columns?
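The slowdown described above is consistent with how slice reads interact with tombstones: a deleted column is not removed immediately but replaced by a tombstone marker, and a slice query still has to scan past those markers until it finds enough live columns. A toy model of that cost (not Cassandra internals, just the idea):

```python
def slice_query(row, limit):
    """Simulate a forward slice over a wide row: columns are read in order
    and tombstones (value None) must be scanned past, so cost grows with
    the number of deleted columns in front of the live ones."""
    live, scanned = [], 0
    for name, value in row:
        scanned += 1
        if value is not None:
            live.append(name)
            if len(live) == limit:
                break
    return live, scanned

# 1000 deleted messages sit in front of 10 live ones
row = [(i, None) for i in range(1000)] + [(1000 + i, 'msg') for i in range(10)]
live, scanned = slice_query(row, 5)
print(live, scanned)  # 5 live columns returned, but 1005 cells scanned
```

This is why the queue-like "insert, read, delete" pattern degrades over the day even when only a handful of live columns remain.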
Re: Column Slice Query performance after deletions
I have a daily maintenance window for my cluster where I truncate this column family, because its data doesn't need to be kept for more than a day, and all the regular operations on it finish around 4 hours before the end of the day. I regularly run a truncate on it followed by a repair at the end of the day. And every day, when operations start (when there are only a few deleted columns), the performance looks pretty good. Unfortunately it degrades along the day. On Sat, Mar 2, 2013 at 2:54 PM, Michael Kjellman mkjell...@barracuda.com wrote: When is the last time you did a cleanup on the cf? On Mar 2, 2013, at 9:48 AM, Víctor Hugo Oliveira Molinar vhmoli...@gmail.com wrote: Hello guys. I'm investigating the reasons for performance degradation in my scenario, which follows: - I have a column family which is filled with thousands of columns inside a single row (varying between 10k ~ 200k). I also have thousands of rows, not much more than 15k. - These rows are constantly updated, but the write load is not that intensive. I estimate it at 100 writes/sec on the column family. - Each column represents a message which is read and processed by another process. After reading it, the column is marked for deletion in order to keep it out of the next query on this row. Ok, so, I've figured out that after many insertions plus deletion updates, my queries (column slice queries) are taking more time to be performed, even if there are only a few columns, fewer than 100. So it looks like the larger the number of deleted columns, the longer the time spent on a query. - Internally in C*, does a column slice query range over deleted columns? If so, how can I mitigate the impact on my queries? Or, how can I avoid those deleted columns? Copy, by Barracuda, helps you store, protect, and share all your amazing things. Start today: www.copy.com.
Re: Column Slice Query performance after deletions
What is your gc_grace set to? Sounds like as the number of tombstone records increases, your performance decreases. (Which I would expect) gc_grace is default. Cassandra's data files are write once. Deletes are another write. Until compaction, they all live on disk. Making really big rows has this problem. Oh, so it looks like I should lower the min_compaction_threshold for this column family. Right? What does this threshold value really mean? Guys, thanks for the help so far. On Sat, Mar 2, 2013 at 3:42 PM, Michael Kjellman mkjell...@barracuda.com wrote: What is your gc_grace set to? Sounds like as the number of tombstone records increases, your performance decreases. (Which I would expect) On Mar 2, 2013, at 10:28 AM, Víctor Hugo Oliveira Molinar vhmoli...@gmail.com wrote: I have a daily maintenance window for my cluster where I truncate this column family, because its data doesn't need to be kept for more than a day, and all the regular operations on it finish around 4 hours before the end of the day. I regularly run a truncate on it followed by a repair at the end of the day. And every day, when operations start (when there are only a few deleted columns), the performance looks pretty good. Unfortunately it degrades along the day. On Sat, Mar 2, 2013 at 2:54 PM, Michael Kjellman mkjell...@barracuda.com wrote: When is the last time you did a cleanup on the cf? On Mar 2, 2013, at 9:48 AM, Víctor Hugo Oliveira Molinar vhmoli...@gmail.com wrote: Hello guys. I'm investigating the reasons for performance degradation in my scenario, which follows: - I have a column family which is filled with thousands of columns inside a single row (varying between 10k ~ 200k). I also have thousands of rows, not much more than 15k. - These rows are constantly updated, but the write load is not that intensive. I estimate it at 100 writes/sec on the column family. - Each column represents a message which is read and processed by another process. 
After reading it, the column is marked for deletion in order to keep it out of the next query on this row. Ok, so, I've figured out that after many insertions plus deletion updates, my queries (column slice queries) are taking more time to be performed, even if there are only a few columns, fewer than 100. So it looks like the larger the number of deleted columns, the longer the time spent on a query. - Internally in C*, does a column slice query range over deleted columns? If so, how can I mitigate the impact on my queries? Or, how can I avoid those deleted columns?
Re: Column Slice Query performance after deletions
Tombstones stay around until gc_grace, so you could lower that to see if that fixes the performance issues. If the tombstones get collected before the delete has reached every replica, the column will live again, causing data inconsistency, since I can't run a repair during regular operations. Not sure if I got your thoughts on this. Size tiered or leveled compaction? I'm actually running Size Tiered Compaction, but I've been looking into changing it to Leveled. It seems to be the case. Although even if I gain some performance, I would still have the same problem with the deleted columns. I need something to keep the deleted columns away from my query fetch, not only the tombstones. It looks like the min compaction threshold might help with this, but I'm not sure yet what would be a reasonable value for it. On Sat, Mar 2, 2013 at 4:22 PM, Michael Kjellman mkjell...@barracuda.com wrote: Tombstones stay around until gc_grace, so you could lower that to see if that fixes the performance issues. Size tiered or leveled compaction? On Mar 2, 2013, at 11:15 AM, Víctor Hugo Oliveira Molinar vhmoli...@gmail.com wrote: What is your gc_grace set to? Sounds like as the number of tombstone records increases, your performance decreases. (Which I would expect) gc_grace is default. Cassandra's data files are write once. Deletes are another write. Until compaction, they all live on disk. Making really big rows has this problem. Oh, so it looks like I should lower the min_compaction_threshold for this column family. Right? What does this threshold value really mean? Guys, thanks for the help so far. On Sat, Mar 2, 2013 at 3:42 PM, Michael Kjellman mkjell...@barracuda.com wrote: What is your gc_grace set to? Sounds like as the number of tombstone records increases, your performance decreases. (Which I would expect) On Mar 2, 2013, at 10:28 AM, Víctor Hugo Oliveira Molinar vhmoli...@gmail.com wrote: I have a daily maintenance of my cluster where I truncate this column family. 
Because its data doesn't need to be kept for more than a day, and all the regular operations on it finish around 4 hours before the end of the day, I regularly run a truncate on it followed by a repair at the end of the day. And every day, when operations start (when there are only a few deleted columns), the performance looks pretty good. Unfortunately it degrades along the day. On Sat, Mar 2, 2013 at 2:54 PM, Michael Kjellman mkjell...@barracuda.com wrote: When is the last time you did a cleanup on the cf? On Mar 2, 2013, at 9:48 AM, Víctor Hugo Oliveira Molinar vhmoli...@gmail.com wrote: Hello guys. I'm investigating the reasons for performance degradation in my scenario, which follows: - I have a column family which is filled with thousands of columns inside a single row (varying between 10k ~ 200k). I also have thousands of rows, not much more than 15k. - These rows are constantly updated, but the write load is not that intensive. I estimate it at 100 writes/sec on the column family. - Each column represents a message which is read and processed by another process. After reading it, the column is marked for deletion in order to keep it out of the next query on this row. Ok, so, I've figured out that after many insertions plus deletion updates, my queries (column slice queries) are taking more time to be performed, even if there are only a few columns, fewer than 100. So it looks like the larger the number of deleted columns, the longer the time spent on a query. - Internally in C*, does a column slice query range over deleted columns? If so, how can I mitigate the impact on my queries? Or, how can I avoid those deleted columns? 
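On the min_compaction_threshold question left open in this thread: under size-tiered compaction, SSTables of similar size are grouped into buckets, and a bucket is compacted once it holds at least min_compaction_threshold tables (default 4). Lowering it makes compaction, and therefore the purging of expired tombstones after gc_grace, happen sooner, at the cost of more I/O. A toy sketch of the idea (the bucketing here is deliberately crude; the real strategy uses size ratios):

```python
from collections import defaultdict

def maybe_compact(sstable_sizes, min_threshold=4):
    """Sketch of size-tiered compaction: when a bucket of similarly sized
    SSTables reaches `min_threshold` members, they merge into one table."""
    buckets = defaultdict(list)
    for size in sstable_sizes:
        buckets[round(size)].append(size)  # crude similar-size bucketing
    out = []
    for bucket in buckets.values():
        if len(bucket) >= min_threshold:
            out.append(sum(bucket))        # merged SSTable (worst case: no discards)
        else:
            out.extend(bucket)             # not enough tables yet; nothing happens
    return sorted(out)

print(maybe_compact([1, 1, 1, 10]))                   # [1, 1, 1, 10]: threshold 4 not reached
print(maybe_compact([1, 1, 1, 10], min_threshold=2))  # [3, 10]: lower threshold compacts sooner
```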
Re: Reading old data problem
Ok guys, let me try to ask it in a different way: Will repair totally ensure data synchronization among nodes? Extra question: Once I write at CL=ALL, will C* ensure that I can read from ANY node without an inconsistency? Will the reverse, writing at CL=ONE but reading at CL=ALL, also ensure that? On Wed, Feb 27, 2013 at 11:24 PM, Víctor Hugo Oliveira Molinar vhmoli...@gmail.com wrote: Hello, I need some help to manage my live cluster! I'm currently running a cluster with 2 nodes, RF:2, CL:1. Since I'm limited by hardware upgrade issues, I'm not able to increase my ConsistencyLevel for now. Anyway, I ran a full repair on each node of the cluster, followed by a flush. However, I'm still reading old data when performing queries. Well, it's known that I might read old data during normal operations, but shouldn't it be in sync after the full anti-entropy repair? What am I missing? Thanks in advance!
Reading old data problem
Hello, I need some help to manage my live cluster! I'm currently running a cluster with 2 nodes, RF:2, CL:1. Since I'm limited by hardware upgrade issues, I'm not able to increase my ConsistencyLevel for now. Anyway, I ran a full repair on each node of the cluster, followed by a flush. However, I'm still reading old data when performing queries. Well, it's known that I might read old data during normal operations, but shouldn't it be in sync after the full anti-entropy repair? What am I missing? Thanks in advance!
Understanding system.log
Hello everyone! I'd like to know if there is any guide or description of the Cassandra server log (system.log). I mean, how should I interpret each log event, and what information may I retain from it?
Re: Mutation dropped
Aaron, what did you mean by "RF 3 CL QUORUM is a more real world scenario"? If there are only 2 nodes, where will the third replica be placed? By increasing the CL, won't it decrease write/read performance and then increase the TimedOutExceptions of the mentioned case? On Fri, Feb 22, 2013 at 1:59 PM, aaron morton aa...@thelastpickle.com wrote: If you are running repair, using QUORUM, and there are no dropped writes, you should not be getting DigestMismatch during reads. If everything else looks good, but the request latency is higher than the CF latency, I would check that client load is evenly distributed. Then start looking to see if the request throughput is at its maximum for the cluster. Cheers - Aaron Morton Freelance Cassandra Developer New Zealand @aaronmorton http://www.thelastpickle.com On 22/02/2013, at 8:15 PM, Wei Zhu wz1...@yahoo.com wrote: Thanks Aaron for the great information as always. I just checked cfhistograms and only a handful of read latencies are bigger than 100ms, but for proxyhistograms there are 10 times more greater than 100ms. We are using QUORUM for reading with RF=3, and I understand the coordinator needs to get the digest from other nodes and read repair on the mismatch etc. But is it normal to see the latency from proxyhistograms go beyond 100ms? Is there any way to improve that? We are tracking the metrics from the client side and we see the 95th percentile response time averages at 40ms, which is a bit high. Our 50th percentile was great, under 3ms. Any suggestion is very much appreciated. Thanks. -Wei - Original Message - From: aaron morton aa...@thelastpickle.com To: Cassandra User user@cassandra.apache.org Sent: Thursday, February 21, 2013 9:20:49 AM Subject: Re: Mutation dropped What does rpc_timeout control? Only the reads/writes? Yes. like data stream, streaming_socket_timeout_in_ms in the yaml merkle tree request? Either no time out or a number of days, cannot remember which right now. 
What is the side effect if it's set to a really small number, say 20ms? You will probably get a lot more requests that fail with a TimedOutException. rpc_timeout needs to be longer than the time it takes a node to process the message, and the time it takes the coordinator to do its thing. You can look at cfhistograms and proxyhistograms to get a better idea of how long a request takes in your system. Cheers - Aaron Morton Freelance Cassandra Developer New Zealand @aaronmorton http://www.thelastpickle.com On 21/02/2013, at 6:56 AM, Wei Zhu wz1...@yahoo.com wrote: What does rpc_timeout control? Only the reads/writes? How about other inter-node communication, like data streams, merkle tree requests? What is a reasonable value for rpc_timeout? The default value of 10 seconds is way too long. What is the side effect if it's set to a really small number, say 20ms? Thanks. -Wei From: aaron morton aa...@thelastpickle.com To: user@cassandra.apache.org Sent: Tuesday, February 19, 2013 7:32 PM Subject: Re: Mutation dropped Does the rpc_timeout not control the client timeout? No, it is how long a node will wait for a response from other nodes before raising a TimedOutException if fewer than CL nodes have responded. Set the client side socket timeout using your preferred client. Is there any param which is configurable to control the replication timeout between nodes? There is no such thing. rpc_timeout is roughly like that, but it's not right to think about it that way. i.e. if a message to a replica times out and CL nodes have already responded, then we are happy to call the request complete. Cheers - Aaron Morton Freelance Cassandra Developer New Zealand @aaronmorton http://www.thelastpickle.com On 19/02/2013, at 1:48 AM, Kanwar Sangha kan...@mavenir.com wrote: Thanks Aaron. Does the rpc_timeout not control the client timeout? Is there any param which is configurable to control the replication timeout between nodes? 
Or is the same param used to control that, since the other node also acts like a client?

From: aaron morton [mailto:aa...@thelastpickle.com] Sent: 17 February 2013 11:26 To: user@cassandra.apache.org Subject: Re: Mutation dropped

You are hitting the maximum throughput of the cluster. The messages are dropped because the node fails to start processing them before rpc_timeout. However, the request is still a success because the client-requested CL was achieved. Testing with RF 2 and CL 1 really just tests the disks on one local machine: both nodes replicate each row, and writes are sent to each replica, so the only thing the client is waiting on is the local node writing to its commit log. Testing with (and running in prod) RF 3 and CL QUORUM is a more real-world scenario. Cheers - Aaron Morton Freelance Cassandra Developer New Zealand @aaronmorton
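Aaron's explanation above maps onto a handful of cassandra.yaml settings. A sketch for reference (option names are from the 1.2-era yaml; 1.1 and earlier expose only a single rpc_timeout_in_ms, and the values below are illustrative defaults, not recommendations):

```yaml
# cassandra.yaml -- how long a coordinator waits for CL replicas to respond
# before raising a TimedOutException (Cassandra 1.2+; 1.1 and earlier use
# a single rpc_timeout_in_ms covering all request types)
read_request_timeout_in_ms: 10000    # single-partition reads
write_request_timeout_in_ms: 10000   # writes / mutations
range_request_timeout_in_ms: 10000   # range scans
request_timeout_in_ms: 10000         # default for everything else
```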
Re: Both nodes own 100% of cluster
Why have you assigned a generated token to both nodes? And how did you calculate it? Shouldn't you choose one of them to have '0' as its starting token? At least that is what is said in the tutorials I've read.

On Mon, Feb 18, 2013 at 2:55 PM, Boris Solovyov boris.solov...@gmail.com wrote: What does it mean that each node owns an effective 100% of the cluster? Both nodes report the same output.

[ec2-user@ip-10-152-162-228 ~]$ nodetool status
Datacenter: us-east
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load      Tokens  Owns (effective)  Host ID                               Rack
UN  10.152.162.228  62.98 KB  256     100.0%            7c50a482-1a0b-4dda-a58c-9232c2f18149  1a
UN  10.147.166.207  60.98 KB  256     100.0%            4aebbf59-dbe5-4736-a7b7-6a59611e66e5  1a
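For a manually assigned ring (no vnodes), evenly spaced tokens can be computed from the partitioner's range. A minimal sketch for Murmur3Partitioner, whose tokens span -2^63 to 2^63 - 1 (the '0' starting token in older tutorials comes from RandomPartitioner, whose range is 0 to 2^127 - 1):

```python
# Evenly spaced initial_token values for Murmur3Partitioner,
# whose token range is [-2**63, 2**63 - 1].
def balanced_tokens(node_count: int) -> list[int]:
    step = 2**64 // node_count
    return [i * step - 2**63 for i in range(node_count)]

# Two-node cluster: one node takes the start of the range, the other the middle.
print(balanced_tokens(2))  # [-9223372036854775808, 0]
```

Note that the `nodetool status` output above shows 256 tokens per node, i.e. vnodes are enabled; with vnodes, tokens are assigned automatically and initial_token is not used.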
Deletion consistency
Hello everyone! I have a column family filled with event objects which need to be processed by query threads. Once each thread queries for those objects (spread among columns below a row), it performs a delete operation for each object in Cassandra. This is done to ensure that these events won't be processed again. Some tests have shown me that it works, but sometimes I'm not getting those events deleted. I checked it through cassandra-cli, etc. So, reading http://wiki.apache.org/cassandra/DistributedDeletes, I came to the conclusion that I may be reading old data. My cluster is currently configured as: 2 nodes, RF 1, CL 1. In that case, what should I do?
- Increase the consistency level for the write operations (in this case, the deletions), to ensure that those deletions are stored on all nodes; or
- Increase the consistency level for the read operations, to ensure that I'm reading only already-processed (deleted) events?
Thanks in advance
Re: Deletion consistency
*Mike*, for now I can't upgrade my cluster. I'm going to check the servers' time sync. Thanks. *Bryan*, so you think it's not a distributed-deletes problem. Thanks for bringing that up. By the way, Hector should not be hiding any exception from me, although there is a reused mutator in my application; I'm going to check whether that may be the problem too. Guys, one more question: is there any limitation on how many columns I should delete per delete operation? I'm currently sending 100 deletions each time.

On Fri, Feb 15, 2013 at 4:46 PM, Bryan Talbot btal...@aeriagames.com wrote: With an RF and CL of one, there is no replication, so there can be no issue with distributed deletes. Writes (and reads) can only go to the one host that has the data and will be refused if that node is down. I'd guess that your app isn't deleting records when you think it is, or that the delete is failing but not being detected as failed. -Bryan

On Fri, Feb 15, 2013 at 10:21 AM, Mike mthero...@yahoo.com wrote: If you increase the number of nodes to 3, with an RF of 3, then you should be able to read/delete using a QUORUM consistency level, which I believe will help here. Also, make sure the clocks of your servers are in sync, using NTP, as drifting time between your client and server could cause updates to be mistakenly dropped for being old. Also, make sure you are running with a gc_grace period that is high enough; the default is 10 days. Hope this helps, -Mike

On 2/15/2013 1:13 PM, Víctor Hugo Oliveira Molinar wrote: Hello everyone! I have a column family filled with event objects which need to be processed by query threads. Once each thread queries for those objects (spread among columns below a row), it performs a delete operation for each object in Cassandra. This is done to ensure that these events won't be processed again. Some tests have shown me that it works, but sometimes I'm not getting those events deleted. I checked it through cassandra-cli, etc.
So, reading http://wiki.apache.org/cassandra/DistributedDeletes, I came to the conclusion that I may be reading old data. My cluster is currently configured as: 2 nodes, RF 1, CL 1. In that case, what should I do?
- Increase the consistency level for the write operations (in this case, the deletions), to ensure that those deletions are stored on all nodes; or
- Increase the consistency level for the read operations, to ensure that I'm reading only already-processed (deleted) events?
Thanks in advance
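The trade-off between the two options above is the usual R + W > RF rule: a read is guaranteed to overlap the most recent write (or delete) only when the read and write consistency levels together span more replicas than the replication factor. A small sketch of that arithmetic (the cluster configurations shown are illustrative):

```python
# R + W > RF guarantees a read overlaps the most recent write,
# so a deleted column cannot be read back as live.
def strongly_consistent(read_cl: int, write_cl: int, rf: int) -> bool:
    return read_cl + write_cl > rf

def quorum(rf: int) -> int:
    return rf // 2 + 1

# The poster's setup: RF=1 with CL ONE for both -> 1 + 1 > 1 is already
# consistent, so stale reads point at something other than distributed deletes.
print(strongly_consistent(1, 1, 1))                   # True
# Mike's suggestion: RF=3 with QUORUM reads and writes -> 2 + 2 > 3.
print(strongly_consistent(quorum(3), quorum(3), 3))   # True
# QUORUM writes but CL ONE reads with RF=3 would not be enough: 1 + 2 > 3 fails.
print(strongly_consistent(1, quorum(3), 3))           # False
```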
Re:
How do you establish the connection? Are you closing and reopening it? It's normal for Cassandra to slow down after many insertions, but that would only make your writes take more time to process, nothing more than that.

On Fri, Feb 1, 2013 at 5:53 PM, Marcelo Elias Del Valle mvall...@gmail.com wrote: Hello, I am trying to figure out why the following behavior happened. Any help would be highly appreciated. This graph shows the server resource allocation of my single Cassandra machine (running at Amazon EC2): http://mvalle.com/downloads/cassandra_host1.png I ran a Hadoop process that reads a CSV file and writes data to Cassandra. For about 1 h, the process ran fine, but taking about 100% of CPU. After 1 h, my Hadoop process started to have its connection attempts refused by Cassandra, as shown below. Since then, it has been taking 100% of the machine's IO; it has been 2 h already since IO hit 100% on the machine running Cassandra. I am running Cassandra on Amazon EBS, which is slow, but I didn't think it would be that slow. Just wondering, is it normal for Cassandra to use a high amount of CPU? I am guessing all the writes were going to the memtables, and when it was time to flush, the server went down. Makes sense? I am still learning Cassandra, as it's the first time I'm using it in production, so I am not sure if I am missing something really basic here.
2013-02-01 16:44:43,741 ERROR com.s1mbi0se.dmp.input.service.InputService (Thread-18): EXCEPTION:PoolTimeoutException: [host=(10.84.65.108):9160, latency=5005(5005), attempts=1] Timed out waiting for connection
com.netflix.astyanax.connectionpool.exceptions.PoolTimeoutException: PoolTimeoutException: [host=nosql1.s1mbi0se.com.br(10.84.65.108):9160, latency=5005(5005), attempts=1] Timed out waiting for connection
    at com.netflix.astyanax.connectionpool.impl.SimpleHostConnectionPool.waitForConnection(SimpleHostConnectionPool.java:201)
    at com.netflix.astyanax.connectionpool.impl.SimpleHostConnectionPool.borrowConnection(SimpleHostConnectionPool.java:158)
    at com.netflix.astyanax.connectionpool.impl.RoundRobinExecuteWithFailover.borrowConnection(RoundRobinExecuteWithFailover.java:60)
    at com.netflix.astyanax.connectionpool.impl.AbstractExecuteWithFailoverImpl.tryOperation(AbstractExecuteWithFailoverImpl.java:50)
    at com.netflix.astyanax.connectionpool.impl.AbstractHostPartitionConnectionPool.executeWithFailover(AbstractHostPartitionConnectionPool.java:229)
    at com.netflix.astyanax.thrift.ThriftColumnFamilyQueryImpl$1.execute(ThriftColumnFamilyQueryImpl.java:186)
    at com.s1mbi0se.dmp.input.service.InputService.searchUserByKey(InputService.java:700)
    ...
    at com.s1mbi0se.dmp.importer.map.ImporterMapper.map(ImporterMapper.java:20)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapreduce.lib.map.MultithreadedMapper$MapRunner.run(MultithreadedMapper.java:268)
2013-02-01 16:44:43,743 ERROR com.s1mbi0se.dmp.input.service.InputService (Thread-15): EXCEPTION:PoolTimeoutException:

Best regards, -- Marcelo Elias Del Valle http://mvalle.com - @mvallebr
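The flush behavior Marcelo is guessing at is governed by a few memtable settings in cassandra.yaml. A hedged sketch of the knobs involved (option names exist in the 1.x line; the values are illustrative, not tuning advice):

```yaml
# cassandra.yaml -- memtable flush tuning (Cassandra 1.x era)
memtable_total_space_in_mb: 2048   # total memtable space before forced flushes
                                   # (defaults to 1/3 of the heap when unset)
memtable_flush_writers: 1          # flush threads; one per data directory is typical
memtable_flush_queue_size: 4       # memtables allowed to queue for a flush writer
```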
Re: initial_token
Do not set initial_token when using Murmur3Partitioner; instead, set num_tokens. For example, if you have 3 hosts with the same hardware setup, then set the same num_tokens on each one. But if you then add another, better host, I'd suggest setting its num_tokens to twice the previous value: num_tokens: 128 (existing machines), num_tokens: 256 (machine with twice the capacity). This is the virtual nodes setup; check the current DataStax docs for it.

On Thu, Jan 31, 2013 at 8:43 PM, Edward Capriolo edlinuxg...@gmail.com wrote: This is the bad side of changing defaults. There are going to be a few groups of unfortunates. The first group, who simply cannot set up their cluster, and eventually figure out their tokens (this thread). The second group, who assume their tokens were correct and run around with an unbalanced cluster thinking the performance sucks (the threads for the next few months). The third group, who will google "how to balance my ring" and find a page with RandomPartitioner instructions (the occasional thread for the next N years). The fourth group, because as of now map/reduce is highly confused by this.

On Thu, Jan 31, 2013 at 4:52 PM, Rob Coli rc...@palominodb.com wrote: On Thu, Jan 31, 2013 at 12:17 PM, Edward Capriolo edlinuxg...@gmail.com wrote: Now by default a new partitioner is chosen, Murmur3. Now = as of 1.2, to be unambiguous. =Rob -- =Robert Coli AIMGTALK - rc...@palominodb.com YAHOO - rcoli.palominob SKYPE - rcoli_palominodb
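Spelled out as cassandra.yaml fragments, the heterogeneous vnode setup described above would look like this (leave initial_token commented out whenever num_tokens is set):

```yaml
# cassandra.yaml on each of the three equal hosts
num_tokens: 128
# initial_token:   # must stay unset when vnodes are in use

# cassandra.yaml on the newer host with twice the capacity,
# so it receives roughly twice as many token ranges
num_tokens: 256
```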
Large commit log reasons
Hi fellows. I currently have a 3-node cluster running with a replication factor of 1. It's a pretty simple deployment, and my workload is focused on writes rather than reads. I'm noticing that my commit log size is always very big compared to the amount of data being persisted (which varies around 5 GB). That leads me to three questions:
1 - When the commit log gets bigger, does it mean that Cassandra hasn't yet processed those writes?
2 - How can I speed up my flushes to SSTables?
3 - Does my commit log shrink by as much as my SSTables grow? Is that a rule?
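For reference, the commit log's on-disk footprint is bounded by settings in cassandra.yaml; segments are only recycled once every memtable they cover has been flushed, which ties questions 1 and 3 together. A sketch (option names are from the 1.x yaml; values are illustrative, not recommendations):

```yaml
# cassandra.yaml -- commit log sizing
commitlog_segment_size_in_mb: 32     # size of each segment file
commitlog_total_space_in_mb: 4096    # above this, the oldest dirty memtables
                                     # are flushed so segments can be recycled
commitlog_sync: periodic             # fsync mode
commitlog_sync_period_in_ms: 10000   # fsync interval for periodic mode
```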