Re: select count(*) returns 10000
Add LIMIT N and it will count more than 10000. Of course it will be slow when you increase N.

Tamar Fraenkel
Senior Software Engineer, TOK Media
ta...@tok-media.com
Tel: +972 2 6409736, Mob: +972 54 8356490, Fax: +972 2 5612956

On Tue, Jun 12, 2012 at 10:07 PM, Derek Williams de...@fyrie.net wrote:

> It's a known issue, here is a bit of extra info on it:
> http://stackoverflow.com/questions/8795923/wrong-count-with-cassandra-cql
>
> On Tue, Jun 12, 2012 at 12:40 PM, Leonid Ilyevsky lilyev...@mooncapital.com wrote:
>
>> The "select count(*) ..." query returns the correct count only if it is <= 10000; otherwise it returns exactly 10000. This happens in both the Java API and cqlsh. Can somebody verify?
>
> --
> Derek Williams
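The default-limit behaviour above can be shown concretely: CQL applies a default LIMIT of 10000 to queries, and COUNT(*) only counts rows within that limit. A sketch, assuming a column family named `users` (the name is illustrative):

```cql
-- Default: counts at most 10000 rows, so larger tables report exactly 10000.
SELECT COUNT(*) FROM users;

-- Raise the ceiling explicitly; slower as N grows, since rows are scanned.
SELECT COUNT(*) FROM users LIMIT 1000000;
```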
portability between enterprise and community version
Hi All,

Is it possible for a DataStax Enterprise edition node to communicate with a DataStax Community edition node? Actually I want to set up one of my nodes on a Linux box and the other on Windows. Please suggest.

With Regards,
--
Abhijit Chanda
VeHere Interactive Pvt. Ltd.
+91-974395
RE: Hector code not running when replication factor set to 2
The error stack is as follows:

Exception in thread "main" java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:58)
Caused by: me.prettyprint.hector.api.exceptions.HUnavailableException: : May not be enough replicas present to handle consistency level.
    at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:59)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl$7.execute(KeyspaceServiceImpl.java:285)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl$7.execute(KeyspaceServiceImpl.java:268)
    at me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:103)
    at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:258)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:131)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl.getSlice(KeyspaceServiceImpl.java:289)
    at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery$1.doInKeyspace(ThriftSliceQuery.java:53)
    at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery$1.doInKeyspace(ThriftSliceQuery.java:49)
    at me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20)
    at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:85)
    at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery.execute(ThriftSliceQuery.java:48)
    at me.prettyprint.cassandra.service.ColumnSliceIterator.hasNext(ColumnSliceIterator.java:88)
    at com.musigma.hectortest.HectorUtilTest.getAllColumns(HectorUtil.java:70)
    at com.musigma.hectortest.HectorUtil.main(HectorUtil.java:168)
    ... 5 more
Caused by: UnavailableException()
    at org.apache.cassandra.thrift.Cassandra$get_slice_result.read(Cassandra.java:7204)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
    at org.apache.cassandra.thrift.Cassandra$Client.recv_get_slice(Cassandra.java:543)
    at org.apache.cassandra.thrift.Cassandra$Client.get_slice(Cassandra.java:527)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl$7.execute(KeyspaceServiceImpl.java:273)
    ... 18 more

The nodetool ring output from the node I connected to looks like:

Address         DC           Rack   Status  State   Load       Effective-Ownership  Token
                                                                                    85070591730234615865843651857942052864
162.192.100.16  datacenter1  rack1  Up      Normal  239.82 MB  100.00%              0
162.192.100.9   datacenter1  rack1  Down    Normal  239.81 MB  100.00%              85070591730234615865843651857942052864

I did not find any error in the logs generated by Cassandra on the running machine. Please help me.

Thanks and Regards
Prakrati

From: aaron morton [mailto:aa...@thelastpickle.com]
Sent: Tuesday, June 12, 2012 2:42 PM
To: user@cassandra.apache.org
Subject: Re: Hector code not running when replication factor set to 2

What was the exact error stack you got?
What does nodetool ring look like from the node you connected to?
Did you notice any errors in the logs on the machine you connected to?

Cheers
- Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 12/06/2012, at 8:41 PM, Prakrati Agrawal wrote:

> I am using the consistency level one and replication factor 2.
>
> Thanks
> Prakrati

From: aaron morton [mailto:aa...@thelastpickle.com]
Sent: Tuesday, June 12, 2012 2:12 PM
To: user@cassandra.apache.org
Subject: Re: Hector code not running when replication factor set to 2

What consistency level and replication factor were you using?

UnavailableException is thrown when fewer nodes are UP than the consistency level requires.

Cheers
- Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 11/06/2012, at 10:19 PM, Prakrati Agrawal wrote:

> Dear all
>
> I had a 2 node cluster with replication factor set to 1. Then I changed the replication factor to 2 and brought down one node so that only 1 node was up and running. Then I ran my Hector code on the running node. But it gave me Unavailable Exception. I also had a Thrift code which ran successfully. I am confused as to why the Hector code did not run. Did I miss something? Please help me.
>
> Thanks and Regards
> Prakrati
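For reference, the arithmetic behind UnavailableException can be sketched as follows. This is illustrative code, not Hector's or Cassandra's actual implementation: a request needs a minimum number of live replicas for its consistency level, so with RF=2 and one node down, CL ONE can proceed but anything stricter cannot.

```java
public class ConsistencyCheck {
    enum CL { ONE, QUORUM, ALL }

    // Number of replicas that must be alive for a request to proceed.
    static int required(CL cl, int replicationFactor) {
        switch (cl) {
            case ONE:    return 1;
            case QUORUM: return replicationFactor / 2 + 1;
            case ALL:    return replicationFactor;
            default:     throw new IllegalArgumentException();
        }
    }

    // UnavailableException corresponds to this returning false.
    static boolean available(CL cl, int replicationFactor, int liveReplicas) {
        return liveReplicas >= required(cl, replicationFactor);
    }

    public static void main(String[] args) {
        // RF=2, one of the two replicas down:
        System.out.println(available(CL.ONE, 2, 1));    // true  -> CL ONE succeeds
        System.out.println(available(CL.QUORUM, 2, 1)); // false -> UnavailableException
    }
}
```

So if the error appears even though CL ONE was intended, it is worth checking every read/write path for a stricter level being set somewhere.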
RE: portability between enterprise and community version
Do not mix Linux and Windows nodes.

Best regards / Pagarbiai
Viktor Jevdokimov
Senior Developer
Email: viktor.jevdoki...@adform.com
Phone: +370 5 212 3063, Fax: +370 5 261 0453
J. Jasinskio 16C, LT-01112 Vilnius, Lithuania
Follow us on Twitter: @adforminsider (http://twitter.com/#!/adforminsider)
What is Adform: watch this short video (http://vimeo.com/adform/display)

Disclaimer: The information contained in this message and attachments is intended solely for the attention and use of the named addressee and may be confidential. If you are not the intended recipient, you are reminded that the information remains the property of the sender. You must not use, disclose, distribute, copy, print or rely on this e-mail. If you have received this message in error, please contact the sender immediately and irrevocably delete this message and any copies.

From: Abhijit Chanda [mailto:abhijit.chan...@gmail.com]
Sent: Wednesday, June 13, 2012 09:21
To: user@cassandra.apache.org
Subject: portability between enterprise and community version

> Hi All, Is it possible to communicate from a datastax enterprise edition to datastax community edition. Actually i want to set one of my node in linux box and other in windows. Please suggest.
>
> With Regards,
> --
> Abhijit Chanda
> VeHere Interactive Pvt. Ltd.
> +91-974395
Re: portability between enterprise and community version
@Viktor: I've read/heard this many times before; however, I've never seen a real explanation. Java is cross-platform. If Cassandra runs properly on both Linux and Windows clusters, why would it be impossible for them to communicate? Of course I understand the disadvantages of having a combined cluster.

2012/6/13 Viktor Jevdokimov viktor.jevdoki...@adform.com

> Do not mix Linux and Windows nodes.

--
With kind regards,
Robin Verlangen
Software engineer
W http://www.robinverlangen.nl
E ro...@us2.nl
How does commitlog_sync affect performance?
Hi All,

Has anybody done any testing of the commitlog_sync modes, and how they affect performance?
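For reference, the modes are configured in cassandra.yaml. A sketch of the relevant settings (the values shown are illustrative defaults from the 1.x line; check the yaml bundled with your version):

```yaml
# Periodic (default): writes are acknowledged immediately and the commit log
# is fsynced every commitlog_sync_period_in_ms. Fast, but a crash can lose
# up to that window of acknowledged writes.
commitlog_sync: periodic
commitlog_sync_period_in_ms: 10000

# Batch (alternative): a write is not acknowledged until the commit log has
# been fsynced; writes are grouped for up to the batch window. More durable,
# but write latency is bounded below by the fsync.
# commitlog_sync: batch
# commitlog_sync_batch_window_in_ms: 50
```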
Re: portability between enterprise and community version
Hi Viktor Jevdokimov,

May I know what issues I may face if I mix Windows nodes and Linux nodes in a cluster?

Regards,
--
Abhijit Chanda
VeHere Interactive Pvt. Ltd.
+91-974395
RE: portability between enterprise and community version
Repair (streaming) will not work.

Schema updates probably will not work either; it was a long time ago, I don't remember.

Migrating a cluster between Windows and Linux is also not an easy task, a lot of manual work.

Finally, mixed Cassandra environments are not supported by DataStax or by anyone else.

Best regards / Pagarbiai
Viktor Jevdokimov
Senior Developer

From: Abhijit Chanda [mailto:abhijit.chan...@gmail.com]
Sent: Wednesday, June 13, 2012 10:54
To: user@cassandra.apache.org
Subject: Re: portability between enterprise and community version

> Hi Viktor Jevdokimov, May i know what are the issues i may face if i mix windows cluster along with linux cluster.
>
> Regards,
> --
> Abhijit Chanda
> VeHere Interactive Pvt. Ltd.
> +91-974395
Re: Much more native memory used by Cassandra than the configured JVM heap size
We suppose the cached memory will be released by the OS, but from /proc/meminfo the cached memory is in Active status, so I am not sure it will be released by the OS.

And for low memory: we found "Unable to reduce heap usage since there are no dirty column families" in system.log, and then Cassandra on this node was marked as down. And because we configure the JVM heap at 6G and memtables at 1G, I don't know why we have OOM errors.

So we wonder whether the Cassandra node went down because of:
1. Low OS memory
2. Impact of our configuration: memtable_flush_writers=32, memtable_flush_queue_size=12
3. Delete operations (the data in our traffic is dynamic, which means each record may be deleted within one hour and new ones inserted) https://issues.apache.org/jira/browse/CASSANDRA-3741

So we want to find out why Cassandra went down after the 24-hour load test (RCA of the OOM).

2012/6/12 aaron morton aa...@thelastpickle.com

> see http://wiki.apache.org/cassandra/FAQ#mmap
>
>> which cause the OS low memory.
>
> If the memory is used for mmapped access the OS can get it back later. Is the low free memory causing a problem?
>
> Cheers
> - Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com

On 12/06/2012, at 5:52 PM, Jason Tang wrote:

> Hi
>
> I found some information on this issue. It seems we can use another strategy for data access to reduce mmap usage, in order to use less memory. But I didn't find the document describing this parameter for Cassandra 1.x. Is it a good way to use this parameter to reduce shared memory usage, and what's the impact? (BTW, our data model is dynamic: although the throughput is high, the life cycle of the data is short, one hour or less.)
>
> # Choices are auto, standard, mmap, and mmap_index_only.
> disk_access_mode: auto
>
> http://comments.gmane.org/gmane.comp.db.cassandra.user/7390

2012/6/12 Jason Tang ares.t...@gmail.com

> See my post: I limit the JVM heap to 6G, but actually Cassandra will use more memory, which is not counted in the JVM heap. I use top to monitor the total memory used by Cassandra.
>
> -Xms6G -Xmx6G -Xmn1600M

2012/6/12 Jeffrey Kesselman jef...@gmail.com

> Btw. I suggest you spin up JConsole as it will give you much more detail on what your VM is actually doing.

On Mon, Jun 11, 2012 at 9:14 PM, Jason Tang ares.t...@gmail.com wrote:

> Hi
>
> We have some problems with Cassandra memory usage. We configured the JVM heap at 6G, but after running Cassandra for several hours (insert, update, delete), the total memory used by Cassandra goes up to 15G, which causes low OS memory. So I wonder if it is normal for Cassandra to use so much memory, and how to limit the native memory used by Cassandra?
>
> Cassandra 1.0.3, 64-bit JDK. Memory occupied by Cassandra: 15G
>
>   PID USER   PR NI  VIRT  RES  SHR S %CPU %MEM    TIME+   COMMAND
>  9567 casadm 20  0 28.3g  15g 9.1g S  269 65.1 385:57.65  java
>
> -Xms6G -Xmx6G -Xmn1600M
>
> # ps -ef | grep 9567
> casadm 9567 1 55 Jun11 ? 05:59:44 /opt/jdk1.6.0_29/bin/java -ea -javaagent:/opt/dve/cassandra/bin/../lib/jamm-0.2.5.jar -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms6G -Xmx6G -Xmn1600M -XX:+HeapDumpOnOutOfMemoryError -Xss128k -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -Djava.net.preferIPv4Stack=true -Dcom.sun.management.jmxremote.port=6080 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Daccess.properties=/opt/dve/cassandra/conf/access.properties -Dpasswd.properties=/opt/dve/cassandra/conf/passwd.properties -Dpasswd.mode=MD5 -Dlog4j.configuration=log4j-server.properties -Dlog4j.defaultInitOverride=true -cp
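One reason top's numbers exceed the heap: memory-mapped data files (what disk_access_mode: auto resolves to on 64-bit JVMs) are charged to the process's VIRT/RES/SHR even though the pages live in the OS page cache, not on the Java heap. A minimal sketch of the mechanism, not Cassandra code:

```java
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;

public class MmapDemo {
    public static void main(String[] args) throws Exception {
        Path f = Files.createTempFile("sstable-demo", ".db");
        try (RandomAccessFile raf = new RandomAccessFile(f.toFile(), "rw")) {
            raf.setLength(1 << 20); // a 1 MB stand-in for a data file
            // Mapping adds the file to the process's address space; reads
            // fault pages into the OS page cache, not the JVM heap.
            MappedByteBuffer buf = raf.getChannel()
                    .map(FileChannel.MapMode.READ_ONLY, 0, raf.length());
            System.out.println(buf.get(0));     // 0 (file is zero-filled)
            System.out.println(buf.capacity()); // 1048576
        } finally {
            Files.deleteIfExists(f);
        }
    }
}
```

Those mapped pages are reclaimable by the kernel under memory pressure, which is why mapped-file growth in top is usually not the same problem as a Java OOM.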
Re: Dead node still being pinged
Here is what I *think* is going on; if Brandon is around he may be able to help out.

The old nodes are being included in the gossip rounds because Gossiper.doGossipToUnreachableMember() just looks at the nodes that are unreachable. It does not check whether they have been removed from the cluster.

Information about the removed nodes is kept by gossip so that if a node is removed while it is down, it will shut down when restarted. This information *should* stay in gossip for 3 days.

In your gossip info, the last long on the STATUS lines is the expiry time for this info:

/10.10.0.24
  STATUS:removed,127605887595351923798765477786913079296,1336530323263
  REMOVAL_COORDINATOR:REMOVER,0
/10.10.0.22
  STATUS:removed,42535295865117307932921825928971026432,1336529659203
  REMOVAL_COORDINATOR:REMOVER,113427455640312814857969558651062452224

For the first line it's:

In [48]: datetime.datetime.fromtimestamp(1336530323263/1000)
Out[48]: datetime.datetime(2012, 5, 9, 14, 25, 23)

So that's good. The gossip round will remove the 0.24 and 0.22 nodes from the local state if the expiry time has passed, the node is marked as dead, and it's not in the token ring.

You can see whether the node thinks 0.24 and 0.22 are up by looking at getSimpleStates() on the FailureDetectorMBean. (I use jmxterm to do this sort of thing.)

The other thing that can confuse things is the gossip generation. If your old nodes were started with a datetime in the future, that can muck things up.

The simple thing to try is starting the server with the -Dcassandra.join_ring=false JVM option. This will force the node to get the ring info from the other nodes. Check things with nodetool gossipinfo to see if the other nodes tell it about the old ones again.

Sorry, gossip can be tricky to diagnose over email.

- Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 12/06/2012, at 10:33 PM, Nicolas Lalevée wrote:

> I have one dirty solution to try: bring data-2 and data-4 back up and down again. Is there any way I can tell Cassandra to not get any data, so that when I bring my old node up, no streaming would start?
>
> cheers,
> Nicolas

On 12 June 2012 at 12:25, Nicolas Lalevée wrote:

> On 12 June 2012 at 11:03, aaron morton wrote:
>
>> Try purging the hints for 10.10.0.24 using the HintedHandOffManager MBean.
>
> As far as I could tell, there were no hinted handoffs to be delivered. Nevertheless I have called deleteHintsForEndpoint on every node for the two nodes expected to be out. Nothing changed; I still see packets being sent to these old nodes.
>
> I looked closer at ResponsePendingTasks of MessagingService. Actually the numbers change, between 0 and about 4. So tasks are ending but new ones come just after.
>
> Nicolas

On 12/06/2012, at 3:33 AM, Nicolas Lalevée wrote:

> finally, thanks to the groovy jmx builder, it was not that hard.
>
> On 11 June 2012 at 12:12, Samuel CARRIERE wrote:
>
>> If I were you, I would connect (through JMX, with jconsole) to one of the nodes that is sending messages to an old node, and would have a look at these MBeans:
>>
>> - org.apache.net.FailureDetector: does SimpleStates look good? (or do you see an IP of an old node)
>
> SimpleStates: [/10.10.0.22:DOWN, /10.10.0.24:DOWN, /10.10.0.26:UP, /10.10.0.25:UP, /10.10.0.27:UP]
>
>> - org.apache.net.MessagingService: do you see one of the old IPs in one of the attributes?
>
> data-5:
> CommandCompletedTasks:  [10.10.0.22:2, 10.10.0.26:6147307, 10.10.0.27:6084684, 10.10.0.24:2]
> CommandPendingTasks:    [10.10.0.22:0, 10.10.0.26:0, 10.10.0.27:0, 10.10.0.24:0]
> ResponseCompletedTasks: [10.10.0.22:1487, 10.10.0.26:6187204, 10.10.0.27:6062890, 10.10.0.24:1495]
> ResponsePendingTasks:   [10.10.0.22:0, 10.10.0.26:0, 10.10.0.27:0, 10.10.0.24:0]
>
> data-6:
> CommandCompletedTasks:  [10.10.0.22:2, 10.10.0.27:6064992, 10.10.0.24:2, 10.10.0.25:6308102]
> CommandPendingTasks:    [10.10.0.22:0, 10.10.0.27:0, 10.10.0.24:0, 10.10.0.25:0]
> ResponseCompletedTasks: [10.10.0.22:1463, 10.10.0.27:6067943, 10.10.0.24:1474, 10.10.0.25:6367692]
> ResponsePendingTasks:   [10.10.0.22:0, 10.10.0.27:0, 10.10.0.24:2, 10.10.0.25:0]
>
> data-7:
> CommandCompletedTasks:  [10.10.0.22:2, 10.10.0.26:6043653, 10.10.0.24:2, 10.10.0.25:5964168]
> CommandPendingTasks:    [10.10.0.22:0, 10.10.0.26:0, 10.10.0.24:0, 10.10.0.25:0]
> ResponseCompletedTasks: [10.10.0.22:1424, 10.10.0.26:6090251, 10.10.0.24:1431, 10.10.0.25:6094954]
> ResponsePendingTasks:   [10.10.0.22:4, 10.10.0.26:0, 10.10.0.24:1, 10.10.0.25:0]
>
>> - org.apache.net.StreamingService: do you see an old IP in StreamSources or StreamDestinations?
>
> Nothing streaming on the 3 nodes. nodetool netstats confirmed that.
>
>> - org.apache.internal.HintedHandoff: are there non-zero ActiveCount, CurrentlyBlockedTasks, PendingTasks, TotalBlockedTask?
>
> On the 3 nodes, all at 0. I don't
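The timestamp-to-date conversion Aaron does in ipython above can be double-checked in Java. A small sketch; note that Instant prints UTC, so the wall-clock time differs from the local-zone time shown in his session:

```java
import java.time.Instant;

public class GossipExpiry {
    public static void main(String[] args) {
        // The last long on a gossip STATUS:removed line is the expiry time
        // of the removal record, in epoch milliseconds.
        long expiry = 1336530323263L;
        System.out.println(Instant.ofEpochMilli(expiry)); // 2012-05-09T02:25:23.263Z
    }
}
```

If the printed expiry is in the past and the stale entries are still being gossiped to, that points at something other than the 3-day retention window.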
Re: Odd problem with cli and display of value types
The set with the type casts updates the client-side column metadata with the type (just like assume does). So after the first set, the client will act as if you had said:

assume users validator as long;

In this case it's not particularly helpful.

Can you add a trivial ticket to https://issues.apache.org/jira/browse/CASSANDRA to update the example?

Thanks
- Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 13/06/2012, at 3:40 AM, Holger Hoffstaette wrote:

> While trying to play around with 1.1.1 and secondary indexes I just noticed something odd in cassandra-cli. Example straight from the README:
>
> -- show Mr. Smith
> holger> cassandra-cli [..]
> [default@Users] list users;
> Using default limit of 100
> Using default column limit of 100
> -------------------
> RowKey: jsmith
> => (column=first, value=John, timestamp=1339507271651000)
> => (column=last, value=Smith, timestamp=1339507280745000)
> 1 Row Returned.
> Elapsed time: 0 msec(s).
>
> -- Hello Mr. Smith with no age. You should be 64 years old:
> [default@Users] set Users[jsmith][age] = long(64);
> Value inserted.
> Elapsed time: 16 msec(s).
> [default@Users] list users;
> Using default limit of 100
> Using default column limit of 100
> -------------------
> RowKey: jsmith
> => (column=age, value=64, timestamp=1339513585914000)
> => (column=first, value=John, timestamp=1339507271651000)
> => (column=last, value=Smith, timestamp=1339507280745000)
> 1 Row Returned.
> Elapsed time: 0 msec(s).
>
> -- That worked, as expected. Exit and restart the cli:
> holger> cassandra-cli [..]
> [default@Users] list users;
> Using default limit of 100
> Using default column limit of 100
> -------------------
> RowKey: jsmith
> => (column=age, value= @, timestamp=1339513585914000)
> => (column=first, value=John, timestamp=1339507271651000)
> => (column=last, value=Smith, timestamp=1339507280745000)
> 1 Row Returned.
> Elapsed time: 78 msec(s).
>
> // age=@ you say?
>
> I understand of course that since the default validation class is set to UTF8 I should have inserted '64' as age and not the long(64) given in the README; probably an oversight/bug/typo. (The README uses 42 as the value, which results in a * as output. To verify the behaviour I used 64, which is the ASCII value of @.)
>
> What I find more curious is that the cli displays the value in human-readable form immediately after insertion, yet a new session displays it in native form (as it should). Should it not always display the value according to the validation class, i.e. show the @ immediately after insertion?
>
> thanks, Holger
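Holger's ASCII reading can be checked directly: the new cli session renders the stored bytes as characters, and the byte values mentioned map as he describes.

```java
public class AsciiCheck {
    public static void main(String[] args) {
        // The low byte of long(64) rendered as a character is '@';
        // the README's value 42 renders as '*'.
        System.out.println((char) 64); // @
        System.out.println((char) 42); // *
    }
}
```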
Re: Disappearing keyspaces in Cassandra 1.1
Can you supply the error logs?

Thanks
- Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 13/06/2012, at 6:54 AM, Oleg Dulin wrote:

> I am using Cassandra 1.1.0 in a 3-node environment. I just truncated a few column families, then restarted the nodes. Now when I restarted them it says my keyspace doesn't exist. The data for the keyspace is still in the data directory. Does anyone know what could have caused this?
Re: portability between enterprise and community version
I consistently move keyspaces from Linux machines onto Windows machines for development purposes. I've had no issues ... but I would probably be hesitant to roll this out into a production instance. Depends on the level of risk you want to take. :)

Run some tests ... mix things up and share your experiences ...

Personally, I could see some value in not really caring what OS my Cassandra instances are running on ... just that the JVMs are consistent and the available hardware resources are sufficient.

I don't speak for the vendors mentioned in this thread, but traditionally, the first step towards supportability is finding the problems / identifying the risks and seeing if they can be resolved ...

-sd

On Wed, Jun 13, 2012 at 10:26 AM, Viktor Jevdokimov viktor.jevdoki...@adform.com wrote:

> Repair (streaming) will not work.
>
> Probably schema update will not work also, it was long time ago, don't remember.
>
> Migration of the cluster between Windows and Linux also not an easy task, a lot of manual work.
>
> Finally, mixed Cassandra environments are not supported as by DataStax as by anyone else.
Re: Hector code not running when replication factor set to 2
It looks like the request used a CL higher than ONE.

Cheers
- Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 13/06/2012, at 6:49 PM, Prakrati Agrawal wrote:

> The error stack is as follows: [same HUnavailableException stack trace and nodetool ring output as quoted earlier in this thread]
>
> I did not find any error in the logs generated by Cassandra on the running machine. Please help me.
>
> Thanks and Regards
> Prakrati
RE: Hector code not running when replication factor set to 2
How do I verify that I am using a wrong consistency level? Thanks and Regards Prakrati Agrawal | Developer - Big Data(ID)| 9731648376 | www.mu-sigma.com From: aaron morton [mailto:aa...@thelastpickle.com] Sent: Wednesday, June 13, 2012 2:38 PM To: user@cassandra.apache.org Subject: Re: Hector code not running when replication factor set to 2 It looks like the request used a CL higher than ONE. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 13/06/2012, at 6:49 PM, Prakrati Agrawal wrote: The error stack is as follows: Exception in thread main java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:58) Caused by: me.prettyprint.hector.api.exceptions.HUnavailableException: : May not be enough replicas present to handle consistency level. 
at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:59) at me.prettyprint.cassandra.service.KeyspaceServiceImpl$7.execute(KeyspaceServiceImpl.java:285) at me.prettyprint.cassandra.service.KeyspaceServiceImpl$7.execute(KeyspaceServiceImpl.java:268) at me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:103) at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:258) at me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:131) at me.prettyprint.cassandra.service.KeyspaceServiceImpl.getSlice(KeyspaceServiceImpl.java:289) at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery$1.doInKeyspace(ThriftSliceQuery.java:53) at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery$1.doInKeyspace(ThriftSliceQuery.java:49) at me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20) at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:85) at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery.execute(ThriftSliceQuery.java:48) at me.prettyprint.cassandra.service.ColumnSliceIterator.hasNext(ColumnSliceIterator.java:88) at com.musigma.hectortest.HectorUtilTest.getAllColumns(HectorUtil.java:70) at com.musigma.hectortest.HectorUtil.main(HectorUtil.java:168) ... 5 more Caused by: UnavailableException() at org.apache.cassandra.thrift.Cassandra$get_slice_result.read(Cassandra.java:7204) at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78) at org.apache.cassandra.thrift.Cassandra$Client.recv_get_slice(Cassandra.java:543) at org.apache.cassandra.thrift.Cassandra$Client.get_slice(Cassandra.java:527) at me.prettyprint.cassandra.service.KeyspaceServiceImpl$7.execute(KeyspaceServiceImpl.java:273) ... 
18 more

The nodetool ring output from the node I connected to looks like:

Address DC Rack Status State Load Effective-Ownership Token
85070591730234615865843651857942052864
162.192.100.16 datacenter1 rack1 Up Normal 239.82 MB 100.00% 0
162.192.100.9 datacenter1 rack1 Down Normal 239.81 MB 100.00% 85070591730234615865843651857942052864

I did not find any error in the logs generated by Cassandra on the running machine. Please help me. Thanks and Regards Prakrati From: aaron morton [mailto:aa...@thelastpickle.com] Sent: Tuesday, June 12, 2012 2:42 PM To: user@cassandra.apache.org Subject: Re: Hector code not running when replication factor set to 2 What was the exact error stack you got? What does nodetool ring look like from the node you connected to? Did you notice any errors in the logs on the machine you connected to? Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 12/06/2012, at 8:41 PM, Prakrati Agrawal wrote: I am using consistency level ONE and replication factor 2. Thanks Prakrati From: aaron morton [mailto:aa...@thelastpickle.com] Sent: Tuesday, June 12, 2012 2:12 PM To: user@cassandra.apache.org Subject: Re: Hector code not running when replication factor set to 2 What consistency level and replication factor were you using? UnavailableException is thrown when fewer nodes are UP than the consistency level requires. Cheers - Aaron Morton Freelance
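Aaron's explanation can be made concrete: UnavailableException is raised when fewer replicas are alive than the consistency level requires. With RF=2 and one of the two nodes Down (as in the ring output above), anything above ONE cannot be satisfied. A minimal sketch of the replica arithmetic; the function name is illustrative, not Cassandra's API:

```python
# Replicas that must be alive for a request to proceed, per consistency level.
# Simplified model of how Cassandra decides whether to raise UnavailableException.

def required_live_replicas(cl: str, rf: int) -> int:
    if cl == "ONE":
        return 1
    if cl == "TWO":
        return 2
    if cl == "QUORUM":
        return rf // 2 + 1  # majority of replicas
    if cl == "ALL":
        return rf
    raise ValueError(f"unknown consistency level: {cl}")

# The scenario from this thread: RF=2, one of the two replicas is down.
rf, live = 2, 1
for cl in ("ONE", "QUORUM", "ALL"):
    needed = required_live_replicas(cl, rf)
    status = "ok" if live >= needed else "UnavailableException"
    print(f"CL={cl}: needs {needed} live replica(s) -> {status}")
```

With RF=2, QUORUM already needs both replicas (2 // 2 + 1 = 2), so a single node being down is enough to make any CL above ONE fail.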
Re: Much more native memory used by Cassandra than the configured JVM heap size
Low OS memory

Low OS memory is not the same as low JVM memory. Normally the JVM allocates and locks all the memory it needs at start up.

impact by our configuration: memtable_flush_writers=32, memtable_flush_queue_size=12

Increasing flush writers will increase IO load; increasing the queue size would only increase memory usage in extreme circumstances (i.e. when the IO system cannot keep up).

Caused by delete operation (The data in our traffic is dynamic, which means each request may be deleted within one hour, and new data will be inserted) https://issues.apache.org/jira/browse/CASSANDRA-3741

Maybe. Some other people who were not doing a lot of deletes have reported low JVM memory and the "Unable to reduce…" log message.

So we want to find out why Cassandra went down after the 24-hour load test. (RCA of the OOM)

I would reset all configuration to the default settings, including letting it pick the JVM heap size, and run your test. If you still see Java OOM, low JVM memory, or the "Unable to…" log message, try setting the log config described here http://www.mail-archive.com/user@cassandra.apache.org/msg22850.html Then we can see if there is something stopping things from flushing. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 13/06/2012, at 8:28 PM, Jason Tang wrote: We supposed the cached memory would be released by the OS, but from /proc/meminfo the cached memory is in Active status, so I am not sure it will be released by the OS. As for low memory: we found "Unable to reduce heap usage since there are no dirty column families" in system.log, and then Cassandra on this node was marked as down. And because we configure the JVM heap to 6G and memtables to 1G, I don't know why we get OOM errors.
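To illustrate why the flush writer/queue settings get flagged here: in the worst case (the IO system falling behind), memory can be pinned by memtables that are actively being flushed plus those queued behind them. This is a back-of-envelope sketch, not Cassandra's actual accounting, and the 500 MB memtable size is only an assumption for illustration:

```python
# Rough upper bound on memory pinned by memtables waiting to be flushed:
# memtables being written by flush writers plus memtables queued behind them.
# Illustration only -- not Cassandra's exact memory model.

memtable_flush_writers = 32   # from the configuration in this thread
memtable_flush_queue_size = 12
memtable_mb = 500             # assumed memtable size, for illustration

worst_case_mb = (memtable_flush_writers + memtable_flush_queue_size) * memtable_mb
print(f"worst case pinned by pending flushes: {worst_case_mb} MB (~{worst_case_mb / 1024:.1f} GB)")
```

Even with modest memtables, 32 writers plus a 12-deep queue can in principle hold far more than a 6 GB heap if flushing stalls, which is why resetting these to defaults is a sensible first test.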
So we wonder whether Cassandra went down because of low OS memory, impacted by our configuration (memtable_flush_writers=32, memtable_flush_queue_size=12), or caused by delete operations (the data in our traffic is dynamic, which means each request may be deleted within one hour and new data inserted) https://issues.apache.org/jira/browse/CASSANDRA-3741 So we want to find out why Cassandra went down after the 24-hour load test. (RCA of the OOM) 2012/6/12 aaron morton aa...@thelastpickle.com see http://wiki.apache.org/cassandra/FAQ#mmap "which causes low OS memory" If the memory is used for mmapped access the OS can get it back later. Is the low free memory causing a problem? Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 12/06/2012, at 5:52 PM, Jason Tang wrote: Hi I found some information on this issue, and it seems we can use another strategy for data access to reduce mmap usage, in order to use less memory. But I didn't find the document describing this parameter for Cassandra 1.x; is it a good way to use this parameter to reduce shared memory usage, and what's the impact? (BTW, our data model is dynamic, which means that although the throughput is high, the life cycle of the data is short, one hour or less.) # Choices are auto, standard, mmap, and mmap_index_only. disk_access_mode: auto http://comments.gmane.org/gmane.comp.db.cassandra.user/7390 2012/6/12 Jason Tang ares.t...@gmail.com See my post, I limit the JVM heap to 6G, but actually Cassandra will use more memory which is not counted in the JVM heap. I use top to monitor the total memory used by Cassandra. = -Xms6G -Xmx6G -Xmn1600M 2012/6/12 Jeffrey Kesselman jef...@gmail.com Btw. I suggest you spin up JConsole as it will give you much more detail on what your VM is actually doing.
On Mon, Jun 11, 2012 at 9:14 PM, Jason Tang ares.t...@gmail.com wrote: Hi We have a problem with Cassandra memory usage. We configure the JVM heap to 6G, but after running Cassandra for several hours (insert, update, delete) the total memory used by Cassandra goes up to 15G, which causes low OS memory. So I wonder, is it normal for Cassandra to use so much memory, and how can we limit the native memory used by Cassandra? === Cassandra 1.0.3, 64-bit JDK. Memory occupied by Cassandra: 15G

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
9567 casadm 20 0 28.3g 15g 9.1g S 269 65.1 385:57.65 java

= -Xms6G -Xmx6G -Xmn1600M # ps -ef | grep 9567 casadm 9567 1 55 Jun11 ? 05:59:44 /opt/jdk1.6.0_29/bin/java -ea -javaagent:/opt/dve/cassandra/bin/../lib/jamm-0.2.5.jar -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms6G -Xmx6G -Xmn1600M -XX:+HeapDumpOnOutOfMemoryError -Xss128k -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -Djava.net.preferIPv4Stack=true
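A rough way to reconcile the numbers in the `top` output above with the FAQ's mmap explanation: SHR is dominated by mmapped SSTable pages that the OS can reclaim under pressure, so the process's "own" resident memory is approximately RES minus SHR, which lands close to the configured 6 GB heap. This is an interpretation aid, not an exact accounting:

```python
# Reading the `top` line from this thread: VIRT 28.3g, RES 15g, SHR 9.1g.
# SHR here is mostly mmap()ed SSTables; the OS can reclaim those pages,
# so the process's non-shared resident memory is roughly RES - SHR.

virt_gb, res_gb, shr_gb = 28.3, 15.0, 9.1
jvm_heap_gb = 6.0  # -Xms6G -Xmx6G

own_gb = res_gb - shr_gb
print(f"resident minus shared: {own_gb:.1f} GB")  # close to the 6 GB heap
print(f"mmapped/shared pages:  {shr_gb:.1f} GB")  # reclaimable by the OS
```

So the 15 GB RES is not 15 GB of leaked native memory: roughly 9 GB of it is page cache for mmapped files, and the remainder is in line with the JVM heap plus off-heap overhead.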
RE: portability between enterprise and community version
I remember that join and decommission didn't work, since they use streaming. All the problems were due to path differences between Windows and Linux. So how do you move keyspaces? Using streaming (join/decommission), or manually? Best regards / Pagarbiai Viktor Jevdokimov Senior Developer Email: viktor.jevdoki...@adform.com Phone: +370 5 212 3063, Fax +370 5 261 0453 J. Jasinskio 16C, LT-01112 Vilnius, Lithuania Follow us on Twitter: @adforminsider http://twitter.com/#!/adforminsider What is Adform: watch this short video http://vimeo.com/adform/display Disclaimer: The information contained in this message and attachments is intended solely for the attention and use of the named addressee and may be confidential. If you are not the intended recipient, you are reminded that the information remains the property of the sender. You must not use, disclose, distribute, copy, print or rely on this e-mail. If you have received this message in error, please contact the sender immediately and irrevocably delete this message and any copies. From: Sasha Dolgy [mailto:sdo...@gmail.com] Sent: Wednesday, June 13, 2012 12:04 To: user@cassandra.apache.org Subject: Re: portability between enterprise and community version I consistently move keyspaces from Linux machines onto Windows machines for development purposes. I've had no issues ... but I would probably be hesitant to roll this out into a production instance. Depends on the level of risk you want to take. : ) Run some tests ... mix things up and share your experiences ... Personally, I could see some value in not really caring what OS my Cassandra instances are running on ... just that the JVMs are consistent and the available hardware resources are sufficient. I don't speak for the vendors mentioned in this thread, but traditionally, the first step towards supportability is finding the problems / identifying the risks and seeing if they can be resolved ...
-sd On Wed, Jun 13, 2012 at 10:26 AM, Viktor Jevdokimov viktor.jevdoki...@adform.com wrote: Repair (streaming) will not work. Probably schema updates will not work either; it was a long time ago, I don't remember. Migration of the cluster between Windows and Linux is also not an easy task, a lot of manual work. Finally, mixed Cassandra environments are not supported, neither by DataStax nor by anyone else. Best regards / Pagarbiai Viktor Jevdokimov Senior Developer From: Abhijit Chanda [mailto:abhijit.chan...@gmail.com] Sent: Wednesday, June 13, 2012 10:54 To: user@cassandra.apache.org Subject: Re: portability between enterprise and community version Hi Viktor Jevdokimov, May I know what issues I may face if I mix a Windows cluster along with a Linux cluster?
Re: portability between enterprise and community version
Hi Sasha, Viktor, In my case I have a project in which there are both Java and .NET modules. For Java I'm using the Astyanax API, and for .NET, FluentCassandra. I am using DSE 2.1 for development purposes, as I need partial searching in some of my queries, which I'm doing with the help of Solr, which is integrated in the latest DSE versions. This is working fine for all the Java modules, but I want the same for the .NET modules also. That's why I was looking for a mixed environment. Finally, I want to ask whether I am moving in the right direction or not? Please suggest. Thanks, -- Abhijit Chanda VeHere Interactive Pvt. Ltd. +91-974395
RE: portability between enterprise and community version
Clients are clients, servers are servers. Why do you need a mixed-environment Cassandra cluster? Isn't mixing clients enough? Best regards / Pagarbiai Viktor Jevdokimov Senior Developer From: Abhijit Chanda [mailto:abhijit.chan...@gmail.com] Sent: Wednesday, June 13, 2012 12:41 To: user@cassandra.apache.org Subject: Re: portability between enterprise and community version
Re: portability between enterprise and community version
Viktor, For .NET I am currently using the DataStax Community version, as DSE can't be installed on Windows. And due to the absence of DSE, I'm not able to use Solr as an integrated part of DataStax. That's why I was looking for the mixed environment, so that I can use the features of DSE. Thanks, -- Abhijit Chanda Software Developer VeHere Interactive Pvt. Ltd. +91-974395
how to reduce latency?
I have three nodes that have been running Cassandra 0.7.4 for about two years, as shown below:

10.x.x.x Up Normal 138.07 GB 33.33% 0
10.x.x.x Up Normal 143.97 GB 33.33% 56713727820156410577229101238628035242
10.x.x.x Up Normal 137.33 GB 33.33% 113427455640312821154458202477256070484

The commitlog and data directories are on separate disks (Western Digital WD RE3 WD1002FBYS 1TB). As the data size grows, read and write times keep increasing, frequently slowing down the website. In my experience, every time I use nodetool to maintain the nodes it takes a very long time, consumes a lot of system resources (and always ends nowhere), and makes my web service very unstable. I really have no idea what to do; upgrading doesn't seem to solve this either. I have a newer cluster with the same configuration but version 1.0.2, which also shows increasing latency, and the new system also suffers from the same instability... I'm just wondering: does that mean I must add more nodes (which is also a painful and slow path)?
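One thing worth checking before (or while) adding nodes is that the ring stays balanced. With RandomPartitioner, balanced initial tokens are i * 2**127 / N, which is exactly what the three tokens in the ring listing above are for N = 3. A quick sketch:

```python
# Balanced initial tokens for RandomPartitioner: token_i = i * 2**127 // N.
# The three tokens in the ring listing above are exactly this for N = 3.

def balanced_tokens(n: int) -> list:
    return [i * (2**127 // n) for i in range(n)]

for token in balanced_tokens(3):
    print(token)
```

When growing from 3 to, say, 6 nodes, recomputing `balanced_tokens(6)` and moving nodes to those tokens keeps each node at an equal share of the ring, so new nodes actually relieve load instead of shifting it.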
Re: portability between enterprise and community version
On 13.6.2012 11:29, Viktor Jevdokimov wrote: I remember that join and decommission didn't work, since they use streaming. All the problems were due to path differences between Windows and Linux. What about using the Unix-style File.separator in the streaming protocol to make it OS-independent? It's a simple fix. I want to mix Windows and Linux nodes too.
Re: Dead node still being pinged
On 13 June 2012 at 10:30, aaron morton wrote: Here is what I *think* is going on; if Brandon is around he may be able to help out. The old nodes are being included in the Gossip rounds, because Gossiper.doGossipToUnreachableMember() just looks at the nodes that are unreachable. It does not check if they have been removed from the cluster. Information about the removed nodes is kept by gossip so that if a node is removed while it is down it will shut down when restarted. This information *should* stay in gossip for 3 days. In your gossip info, the last long on the STATUS lines is the expiry time for this info… /10.10.0.24 STATUS:removed,127605887595351923798765477786913079296,1336530323263 REMOVAL_COORDINATOR:REMOVER,0 /10.10.0.22 STATUS:removed,42535295865117307932921825928971026432,1336529659203 REMOVAL_COORDINATOR:REMOVER,113427455640312814857969558651062452224 For the first line it's In [48]: datetime.datetime.fromtimestamp(1336530323263/1000) Out[48]: datetime.datetime(2012, 5, 9, 14, 25, 23) So that's good. The Gossip round will remove the 0.24 and 0.22 nodes from the local state if the expiry time has passed, the node is marked as dead, and it's not in the token ring. You can see if the node thinks 0.24 and 0.22 are up by looking at getSimpleStates() on the FailureDetectorMBean. (I use jmxterm to do this sort of thing) The two old nodes are still seen as down: SimpleStates:[/10.10.0.22:DOWN, /10.10.0.24:DOWN, /10.10.0.26:UP, /10.10.0.25:UP, /10.10.0.27:UP] The other thing that can confuse things is the gossip generation. If your old nodes were started with a datetime in the future, that can muck things up. I have just checked; my old nodes' machines are nicely synchronized. My new nodes have a lag of a few seconds, some in the future, some in the past. I definitely need to fix that. The simple thing to try is starting the server with the -Dcassandra.join_ring=false JVM option. This will force the node to get the ring info from other nodes.
Check things with nodetool gossipinfo to see if the other nodes tell it about the old ones again. You meant -Dcassandra.load_ring_state=false, right? Then nothing changed. Sorry, gossip can be tricky to diagnose over email. No worries, I really appreciate that you take the time to look into my issues. Maybe I could open a JIRA about my issue? Maybe there was a config mess on my part at some point, i.e. the unsynchronized dates on my machines, but I think it would be nice if Cassandra could resolve that inconsistent state by itself. Nicolas - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 12/06/2012, at 10:33 PM, Nicolas Lalevée wrote: I have one dirty solution to try: bring data-2 and data-4 back up and down again. Is there any way I can tell Cassandra not to get any data, so that when I bring my old node up, no streaming starts? cheers, Nicolas On 12 June 2012 at 12:25, Nicolas Lalevée wrote: On 12 June 2012 at 11:03, aaron morton wrote: Try purging the hints for 10.10.0.24 using the HintedHandOffManager MBean. As far as I could tell, there were no hinted handoffs to be delivered. Nevertheless I have called deleteHintsForEndpoint on every node for the two nodes expected to be out. Nothing changed, I still see packets being sent to these old nodes. I looked closer at ResponsePendingTasks of MessagingService. Actually the numbers change, between 0 and about 4. So tasks are ending but new ones come just after. Nicolas Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 12/06/2012, at 3:33 AM, Nicolas Lalevée wrote: finally, thanks to the Groovy JMX builder, it was not that hard. On 11 June 2012 at 12:12, Samuel CARRIERE wrote: If I were you, I would connect (through JMX, with jconsole) to one of the nodes that is sending messages to an old node, and would have a look at these MBeans: - org.apache.net.FailureDetector : does SimpleStates look good?
(or do you see an IP of an old node) SimpleStates:[/10.10.0.22:DOWN, /10.10.0.24:DOWN, /10.10.0.26:UP, /10.10.0.25:UP, /10.10.0.27:UP] - org.apache.net.MessagingService : do you see one of the old IP in one of the attributes ? data-5: CommandCompletedTasks: [10.10.0.22:2, 10.10.0.26:6147307, 10.10.0.27:6084684, 10.10.0.24:2] CommandPendingTasks: [10.10.0.22:0, 10.10.0.26:0, 10.10.0.27:0, 10.10.0.24:0] ResponseCompletedTasks: [10.10.0.22:1487, 10.10.0.26:6187204, 10.10.0.27:6062890, 10.10.0.24:1495] ResponsePendingTasks: [10.10.0.22:0, 10.10.0.26:0, 10.10.0.27:0, 10.10.0.24:0] data-6: CommandCompletedTasks: [10.10.0.22:2, 10.10.0.27:6064992, 10.10.0.24:2, 10.10.0.25:6308102] CommandPendingTasks: [10.10.0.22:0, 10.10.0.27:0, 10.10.0.24:0, 10.10.0.25:0] ResponseCompletedTasks: [10.10.0.22:1463, 10.10.0.27:6067943,
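Following the `datetime.datetime.fromtimestamp` example earlier in this thread, both removed-node records can be decoded the same way. A small sketch using UTC so the result does not depend on the local machine's timezone (the in-thread IPython output was in a local zone, hence the different hour):

```python
import datetime

# Decode the expiry timestamps (milliseconds since the epoch) from the two
# STATUS:removed lines quoted above.

def gossip_expiry_utc(expiry_ms: int) -> datetime.datetime:
    return datetime.datetime.fromtimestamp(expiry_ms // 1000, tz=datetime.timezone.utc)

for ip, expiry_ms in (("10.10.0.24", 1336530323263), ("10.10.0.22", 1336529659203)):
    print(ip, "removal record expires", gossip_expiry_utc(expiry_ms))
```

Both records expire on 2012-05-09 UTC, i.e. the "3 days in gossip" window had long passed by the time of this thread, which is what makes the continued pinging surprising.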
Cassandra out of Heap memory
Hi My Cassandra node went out of heap memory with this message: GCInspector.java (line 88): Heap is .9934 full. Is this expected, or should I adjust my flush_largest_memtable_at variable? Also, one change I made in my cluster was adding 5 column families which are empty. Should empty column families cause a significant increase in Cassandra heap usage? Thanks Rohit
Re: Cassandra out of Heap memory
To clarify things: our setup consists of 8 nodes with 32 GB RAM, a max heap size of 12 GB and a new-gen size of 1.6 GB. The load on our nodes is a write/read ratio of 10, with 6 main column families. The flushes of column families occur every hour with SSTable sizes of around 50-100 MB, while the memtable size for those seems to be around 500 MB. (Is this 10-20 times overhead expected?) Also, this is the first time I'm seeing max-heap-size-reached exceptions. Could there be a significant reason for this other than that the Cassandra servers have been running without a restart for 2 months? On Wed, Jun 13, 2012 at 6:30 PM, rohit bhatia rohit2...@gmail.com wrote: Hi My Cassandra node went out of heap memory with this message: GCInspector.java (line 88): Heap is .9934 full. Is this expected, or should I adjust my flush_largest_memtable_at variable? Also, one change I made in my cluster was adding 5 column families which are empty. Should empty column families cause a significant increase in Cassandra heap usage? Thanks Rohit
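For context, the "Heap is .9934 full" line comes from Cassandra's GC inspector, which also drives an "emergency valve": above a configurable heap fraction it flushes the largest memtable, and higher still it shrinks caches. A simplified sketch of that decision follows; the threshold values are the cassandra.yaml defaults for the 1.0 line as I recall them, so verify against your own configuration before relying on them:

```python
# Simplified sketch of the 1.0-era GCInspector "emergency valve" decision.
# Thresholds are assumed defaults -- check your own cassandra.yaml.

FLUSH_LARGEST_MEMTABLES_AT = 0.75
REDUCE_CACHE_SIZES_AT = 0.85

def emergency_actions(heap_used_gb: float, heap_max_gb: float) -> list:
    usage = heap_used_gb / heap_max_gb
    actions = []
    if usage > FLUSH_LARGEST_MEMTABLES_AT:
        actions.append("flush largest memtable")
    if usage > REDUCE_CACHE_SIZES_AT:
        actions.append("reduce cache sizes")
    return actions

# The log line in this thread reported the heap 0.9934 full on a 12 GB heap.
print(emergency_actions(0.9934 * 12, 12.0))
```

At 0.9934 both valves have already fired, so the log message means even flushing the largest memtable could not bring heap usage back under the threshold.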
Re: Snapshot failing on JSON files in 1.1.0
Hi Aaron, We are using Ubuntu (AMI Datastax 1.0.9 as I said). Release:10.10 Codename: maverick ERROR [RMI TCP Connection(37732)-10.248.10.94] 2012-06-13 15:00:17,157 CLibrary.java (line 153) Unable to create hard link com.sun.jna.LastErrorException: errno was 1 at org.apache.cassandra.utils.CLibrary.link(Native Method) at org.apache.cassandra.utils.CLibrary.createHardLink(CLibrary.java:145) at org.apache.cassandra.io.sstable.SSTableReader.createLinks(SSTableReader.java:857) at org.apache.cassandra.db.ColumnFamilyStore.snapshotWithoutFlush(ColumnFamilyStore.java:1412) at org.apache.cassandra.db.ColumnFamilyStore.snapshot(ColumnFamilyStore.java:1462) at org.apache.cassandra.db.Table.snapshot(Table.java:210) at org.apache.cassandra.service.StorageService.takeSnapshot(StorageService.java:1710) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27) at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208) at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:120) at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:262) at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:836) at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:761) at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1427) at javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72) at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1265) at 
javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1360) at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:788) at sun.reflect.GeneratedMethodAccessor46.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:303) at sun.rmi.transport.Transport$1.run(Transport.java:159) at java.security.AccessController.doPrivileged(Native Method) at sun.rmi.transport.Transport.serviceCall(Transport.java:155) at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) WARN [RMI TCP Connection(37732)-10.248.10.94] 2012-06-13 15:00:17,158 CLibrary.java (line 97) Obsolete version of JNA present; unable to read errno. Upgrade to JNA 3.2.7 or later. I see that a JNA update seems necessary. Shouldn't this AMI be fully working out of the box? I am going to search how to upgrade JNA. Alain 2012/5/31 aaron morton aa...@thelastpickle.com: CASSANDRA-4230 is a bug in 1.1. I am not aware of issues using snapshot on 1.0.9. But errno 0 is a bit odd. On the server side there should be a log message at ERROR level that contains the string Unable to create hard link and the error message. What does that say? Can you also include the OS version. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 28/05/2012, at 9:27 PM, Alain RODRIGUEZ wrote: I have the same error with the latest DataStax AMI (1.0.9). Is that the same bug?
Requested snapshot for: cassa_teads Exception in thread main java.io.IOError: java.io.IOException: Unable to create hard link from /raid0/cassandra/data/cassa_teads/stats_product-hc-233-Index.db to /raid0/cassandra/data/cassa_teads/snapshots/20120528/stats_product-hc-233-Index.db (errno 0) at org.apache.cassandra.db.ColumnFamilyStore.snapshotWithoutFlush(ColumnFamilyStore.java:1433) at org.apache.cassandra.db.ColumnFamilyStore.snapshot(ColumnFamilyStore.java:1462) at org.apache.cassandra.db.Table.snapshot(Table.java:210) at org.apache.cassandra.service.StorageService.takeSnapshot(StorageService.java:1710) at
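For context on what the failing CLibrary.createHardLink call is doing: a Cassandra snapshot is essentially a directory of hard links to the live SSTable files, so it costs no extra data copying. A minimal sketch of the same mechanism using Python's os.link; the file names echo the error message above but are made up for the demo:

```python
import os
import tempfile

# A snapshot is a directory of hard links to live SSTable files: same inode,
# no data duplicated. The failing JNA createHardLink call above is the native
# equivalent of os.link().

data_dir = tempfile.mkdtemp()
sstable = os.path.join(data_dir, "stats_product-hc-233-Index.db")
with open(sstable, "wb") as f:
    f.write(b"sstable bytes")

snap_dir = os.path.join(data_dir, "snapshots", "20120528")
os.makedirs(snap_dir)
link = os.path.join(snap_dir, os.path.basename(sstable))
os.link(sstable, link)  # the hard link: both names point at the same inode

print("same inode:", os.stat(sstable).st_ino == os.stat(link).st_ino)
print("link count:", os.stat(sstable).st_nlink)
```

Because both names point at the same inode, the snapshot is instantaneous and free of extra disk usage until compaction replaces the original files, at which point the snapshot copies keep the old data alive.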
RE: Much more native memory used by Cassandra than the configured JVM heap size
I have experienced the same issue. The Java heap seems fine, but eventually the OS runs out of memory. In my case it renders the entire box unusable without a hard reboot. Is there a way to limit the native heap usage? Console shows: xfs invoked oom-killer: gfp_mask=0x201d2, order=0, oomkilladj=0 Call Trace: [800c9d3a] out_of_memory+0x8e/0x2f3 [8002dfd7] __wake_up+0x38/0x4f [8000f677] __alloc_pages+0x27f/0x308 [80013034] __do_page_cache_readahead+0x96/0x17b [80013971] filemap_nopage+0x14c/0x360 [8000896c] __handle_mm_fault+0x1fd/0x103b [8002dfd7] __wake_up+0x38/0x4f [800671f2] do_page_fault+0x499/0x842 [800b8f39] audit_filter_syscall+0x87/0xad [8005dde9] error_exit+0x0/0x84 Node 0 DMA per-cpu: empty Node 0 DMA32 per-cpu: empty Node 0 Normal per-cpu: cpu 0 hot: high 186, batch 31 used:23 cpu 0 cold: high 62, batch 15 used:14 ... cpu 23 cold: high 62, batch 15 used:8 Node 1 HighMem per-cpu: empty Free pages: 158332kB (0kB HighMem) Active:16225503 inactive:1 dirty:0 writeback:0 unstable:0 free:39583 slab:21496 Node 0 DMA free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB lowmem_reserve[]: 0 0 32320 32320 Node 0 DMA32 free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0 lowmem_reserve[]: 0 0 32320 32320 Node 0 Normal free:16136kB min:16272kB low:20340kB high:24408kB active:3255624 From: aaron morton [mailto:aa...@thelastpickle.com] Sent: Tuesday, June 12, 2012 4:08 AM To: user@cassandra.apache.org Subject: Re: Much more native memory used by Cassandra than the configured JVM heap size see http://wiki.apache.org/cassandra/FAQ#mmap "which causes low OS memory" If the memory is used for mmapped access the OS can get it back later. Is the low free memory causing a problem?
Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 12/06/2012, at 5:52 PM, Jason Tang wrote: Hi I found some information on this issue, and it seems we can use another strategy for data access to reduce mmap usage, in order to use less memory. But I didn't find the document describing this parameter for Cassandra 1.x; is it a good way to use this parameter to reduce shared memory usage, and what's the impact? (BTW, our data model is dynamic, which means that although the throughput is high, the life cycle of the data is short, one hour or less.) # Choices are auto, standard, mmap, and mmap_index_only. disk_access_mode: auto http://comments.gmane.org/gmane.comp.db.cassandra.user/7390 2012/6/12 Jason Tang ares.t...@gmail.com See my post, I limit the JVM heap to 6G, but actually Cassandra will use more memory which is not counted in the JVM heap. I use top to monitor the total memory used by Cassandra. = -Xms6G -Xmx6G -Xmn1600M 2012/6/12 Jeffrey Kesselman jef...@gmail.com Btw. I suggest you spin up JConsole as it will give you much more detail on what your VM is actually doing. On Mon, Jun 11, 2012 at 9:14 PM, Jason Tang ares.t...@gmail.com wrote: Hi We have a problem with Cassandra memory usage. We configure the JVM heap to 6G, but after running Cassandra for several hours (insert, update, delete) the total memory used by Cassandra goes up to 15G, which causes low OS memory. So I wonder, is it normal for Cassandra to use so much memory, and how can we limit the native memory used by Cassandra? === Cassandra 1.0.3, 64-bit JDK.
Memory occupied by Cassandra: 15G

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
9567 casadm 20 0 28.3g 15g 9.1g S 269 65.1 385:57.65 java

= -Xms6G -Xmx6G -Xmn1600M # ps -ef | grep 9567 casadm 9567 1 55 Jun11 ? 05:59:44 /opt/jdk1.6.0_29/bin/java -ea -javaagent:/opt/dve/cassandra/bin/../lib/jamm-0.2.5.jar -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms6G -Xmx6G -Xmn1600M -XX:+HeapDumpOnOutOfMemoryError -Xss128k -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -Djava.net.preferIPv4Stack=true -Dcom.sun.management.jmxremote.port=6080 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Daccess.properties=/opt/dve/cassandra/conf/access.properties -Dpasswd.properties=/opt/dve/cassandra/conf/passwd.properties -Dpasswd.mode=MD5 -Dlog4j.configuration=log4j-server.properties -Dlog4j.defaultInitOverride=true -cp
Re: Snapshot failing on JSON files in 1.1.0
if I do : locate jna /opt/java/64/jdk1.6.0_31/db/docs/html/ref/rrefsqljnaturaljoin.html /root/.m2/repository/net/java/dev/jna /root/.m2/repository/net/java/dev/jna/jna /root/.m2/repository/net/java/dev/jna/jna/3.2.7 /root/.m2/repository/net/java/dev/jna/jna/3.2.7/jna-3.2.7-sources.jar /root/.m2/repository/net/java/dev/jna/jna/3.2.7/jna-3.2.7-sources.jar.sha1 /root/.m2/repository/net/java/dev/jna/jna/3.2.7/jna-3.2.7.jar /root/.m2/repository/net/java/dev/jna/jna/3.2.7/jna-3.2.7.jar.sha1 /root/.m2/repository/net/java/dev/jna/jna/3.2.7/jna-3.2.7.pom /root/.m2/repository/net/java/dev/jna/jna/3.2.7/jna-3.2.7.pom.sha1 /usr/share/doc/libjna-java /usr/share/doc/libjna-java/README.Debian /usr/share/doc/libjna-java/changelog.Debian.gz /usr/share/doc/libjna-java/copyright /usr/share/java/jna-3.2.4.jar /usr/share/java/jna.jar /usr/share/maven-repo/net/java/dev/jna /usr/share/maven-repo/net/java/dev/jna/jna /usr/share/maven-repo/net/java/dev/jna/jna/3.2.4 /usr/share/maven-repo/net/java/dev/jna/jna/debian /usr/share/maven-repo/net/java/dev/jna/jna/3.2.4/jna-3.2.4.jar /usr/share/maven-repo/net/java/dev/jna/jna/3.2.4/jna-3.2.4.pom /usr/share/maven-repo/net/java/dev/jna/jna/debian/jna-debian.jar /usr/share/maven-repo/net/java/dev/jna/jna/debian/jna-debian.pom /var/cache/apt/archives/libjna-java_3.2.4-2_amd64.deb /var/lib/dpkg/info/libjna-java.list /var/lib/dpkg/info/libjna-java.md5sums So what version am I using (jna 3.2.7 or 3.2.4 ?) Should I do an apt-get install libjna-java and does this need a restart ? Alain 2012/6/13 Alain RODRIGUEZ arodr...@gmail.com: Hi Aaron, We are using Ubuntu (AMI Datastax 1.0.9 as I said). 
Release: 10.10 Codename: maverick
RE: Much more native memory used by Cassandra than the configured JVM heap size
Seems like my only recourse is to remove jna.jar and just take the performance/swapping pain? Obviously can't have the entire box lock up. I can provide a pmap etc. if needed. From: Poziombka, Wade L [mailto:wade.l.poziom...@intel.com] Sent: Wednesday, June 13, 2012 10:28 AM To: user@cassandra.apache.org Subject: RE: Much more native memory used by Cassandra then the configured JVM heap size I have experienced the same issue. The Java heap seems fine but eventually the OS runs out of heap. In my case it renders the entire box unusable without a hard reboot. Console shows: is there a way to limit the native heap usage? xfs invoked oom-killer: gfp_mask=0x201d2, order=0, oomkilladj=0 Call Trace: [800c9d3a] out_of_memory+0x8e/0x2f3 [8002dfd7] __wake_up+0x38/0x4f [8000f677] __alloc_pages+0x27f/0x308 [80013034] __do_page_cache_readahead+0x96/0x17b [80013971] filemap_nopage+0x14c/0x360 [8000896c] __handle_mm_fault+0x1fd/0x103b [8002dfd7] __wake_up+0x38/0x4f [800671f2] do_page_fault+0x499/0x842 [800b8f39] audit_filter_syscall+0x87/0xad [8005dde9] error_exit+0x0/0x84 Node 0 DMA per-cpu: empty Node 0 DMA32 per-cpu: empty Node 0 Normal per-cpu: cpu 0 hot: high 186, batch 31 used:23 cpu 0 cold: high 62, batch 15 used:14 ... 
cpu 23 cold: high 62, batch 15 used:8 Node 1 HighMem per-cpu: empty Free pages: 158332kB (0kB HighMem) Active:16225503 inactive:1 dirty:0 writeback:0 unstable:0 free:39583 slab:21496 Node 0 DMA free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB lowmem_reserve[]: 0 0 32320 32320 Node 0 DMA32 free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0 lowmem_reserve[]: 0 0 32320 32320 Node 0 Normal free:16136kB min:16272kB low:20340kB high:24408kB active:3255624 From: aaron morton [mailto:aa...@thelastpickle.com] Sent: Tuesday, June 12, 2012 4:08 AM To: user@cassandra.apache.org Subject: Re: Much more native memory used by Cassandra than the configured JVM heap size see http://wiki.apache.org/cassandra/FAQ#mmap which cause the OS low memory. If the memory is used for mmapped access the OS can get it back later. Is the low free memory causing a problem? Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 12/06/2012, at 5:52 PM, Jason Tang wrote: Hi I found some information on this issue. And it seems we can use another strategy for data access to reduce mmap usage, in order to use less memory. But I didn't find a document describing the parameters for Cassandra 1.x; is it a good way to use this parameter to reduce shared memory usage, and what's the impact? (btw, our data model is dynamic, which means that although the throughput is high, the life cycle of the data is short, one hour or less). # Choices are auto, standard, mmap, and mmap_index_only. disk_access_mode: auto http://comments.gmane.org/gmane.comp.db.cassandra.user/7390 2012/6/12 Jason Tang ares.t...@gmail.com See my post, I limit the JVM heap to 6G, but actually Cassandra will use more memory which is not calculated in the JVM heap. I use top to monitor the total memory used by Cassandra. = -Xms6G -Xmx6G -Xmn1600M 2012/6/12 Jeffrey Kesselman jef...@gmail.com Btw.
I suggest you spin up JConsole as it will give you much more detail on what your VM is actually doing. On Mon, Jun 11, 2012 at 9:14 PM, Jason Tang ares.t...@gmail.com wrote: Hi We have a problem with Cassandra memory usage. We configured the JVM heap to 6G, but after running Cassandra for several hours (insert, update, delete), the total memory used by Cassandra goes up to 15G, which causes the OS to run low on memory. So I wonder if it is normal to have so much memory used by Cassandra? And how to limit the native memory used by Cassandra? === Cassandra 1.0.3, 64 bit jdk. Memory occupied by Cassandra 15G PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 9567 casadm 20 0 28.3g 15g 9.1g S 269 65.1 385:57.65 java = -Xms6G -Xmx6G -Xmn1600M # ps -ef | grep 9567 casadm 9567 1 55 Jun11 ? 05:59:44 /opt/jdk1.6.0_29/bin/java -ea -javaagent:/opt/dve/cassandra/bin/../lib/jamm-0.2.5.jar -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms6G -Xmx6G -Xmn1600M -XX:+HeapDumpOnOutOfMemoryError -Xss128k -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -Djava.net.preferIPv4Stack=true -Dcom.sun.management.jmxremote.port=6080 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Daccess.properties=/opt/dve/cassandra/conf/access.properties
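The 15G RES vs 6G heap gap described in the thread above is what memory-mapped I/O looks like from top: pages of mapped files are charged to the process as resident/shared memory but never pass through the Java heap, so -Xmx cannot bound them. A minimal standalone sketch (illustrative only, not Cassandra code) that maps a scratch file the way Cassandra maps SSTables:

```java
import java.io.File;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class MmapDemo {
    // Map a scratch file of the given size, touch one page, and read it back.
    // The mapping consumes virtual address space and, once touched, resident
    // pages (the RES/SHR columns in top) -- but no Java heap, so the -Xmx
    // limit never accounts for it.
    static byte mapAndTouch(long size) throws Exception {
        File f = File.createTempFile("sstable-demo", ".db");
        f.deleteOnExit();
        try (RandomAccessFile raf = new RandomAccessFile(f, "rw")) {
            raf.setLength(size);
            MappedByteBuffer buf = raf.getChannel()
                    .map(FileChannel.MapMode.READ_WRITE, 0, size);
            buf.put(0, (byte) 42); // touching the page faults it in
            return buf.get(0);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("read back: " + mapAndTouch(64L * 1024 * 1024));
    }
}
```

Because the OS can reclaim these pages under pressure, high RES from mmap is not by itself a leak; the question in this thread is whether the reclaim (kswapd) activity hurts latency.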
Distinct Counter Proposal for Cassandra
Hi All, Let's assume we have a use case where we need to count the number of columns for a given key. Let's say the key is the URL and the column-name is the IP address or any cardinality identifier. The straightforward implementation seems simple: just inserting the IP addresses as columns under the key defined by the URL and using get_count to count them back. However, the problem here is in the case of large rows (with too many IP addresses in them); the get_count method has to de-serialize the whole row and calculate the count. As also noted in the user guides, it's not an O(1) operation and it's quite costly. However, this problem seems to have better solutions if you don't have a strict requirement for the count to be exact. There are streaming algorithms that will provide good cardinality estimations within a predefined failure rate; I think the most popular one is the (Hyper)LogLog algorithm, and there's also an optimal one developed recently, please check http://dl.acm.org/citation.cfm?doid=1807085.1807094 If you want to take a look at a Java implementation for LogLog, Clearspring has both LogLog and a space-optimized HyperLogLog available at https://github.com/clearspring/stream-lib I don't see a reason why this can't be implemented in Cassandra. The distributed nature of all these algorithms can easily be adapted to Cassandra's model. I think most of us would love to see some cardinality-estimating columns in Cassandra. Regards, Utku
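For readers who want to see the shape of such an estimator, here is a minimal, self-contained HyperLogLog sketch in Java. It is illustrative only — not the Clearspring stream-lib code and not a proposed Cassandra patch — and the hash is a splitmix64-style mixer chosen just for the example. The register-wise max in merge() is what makes the structure friendly to Cassandra-style distribution: replicas can sketch independently and combine losslessly.

```java
public class Hll {
    private final int p;      // precision: m = 2^p registers
    private final int m;
    private final byte[] regs;

    public Hll(int p) { this.p = p; this.m = 1 << p; this.regs = new byte[m]; }

    // splitmix64 finalizer, standing in for a proper 64-bit hash of the value.
    static long hash(long x) {
        x += 0x9E3779B97F4A7C15L;
        x = (x ^ (x >>> 30)) * 0xBF58476D1CE4E5B9L;
        x = (x ^ (x >>> 27)) * 0x94D049BB133111EBL;
        return x ^ (x >>> 31);
    }

    public void offer(long hash) {
        int idx = (int) (hash >>> (64 - p));  // top p bits select a register
        long w = hash << p;                   // remaining bits
        int rank = (w == 0) ? (64 - p) + 1 : Long.numberOfLeadingZeros(w) + 1;
        if (rank > regs[idx]) regs[idx] = (byte) rank;
    }

    // Distributed merge: register-wise max, the same way replicas or shards
    // could combine their local sketches.
    public void merge(Hll other) {
        for (int i = 0; i < m; i++)
            if (other.regs[i] > regs[i]) regs[i] = other.regs[i];
    }

    public double estimate() {
        double sum = 0; int zeros = 0;
        for (byte r : regs) { sum += 1.0 / (1L << r); if (r == 0) zeros++; }
        double alpha = 0.7213 / (1 + 1.079 / m);
        double e = alpha * m * (double) m / sum;
        if (e <= 2.5 * m && zeros > 0)        // small-range (linear counting) correction
            e = m * Math.log((double) m / zeros);
        return e;
    }
}
```

With p = 14 (16 KB of registers) the standard error is about 1.04/sqrt(2^14) ≈ 0.8%, regardless of how many distinct items are offered — which is the appeal over get_count on a large row.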
Release: OpsCenter 2.1
Hey Everyone, Today we released version 2.1 of OpsCenter. The main theme of the release was bug fixes, although a few small features have been added. Some highlights: * Data Browser improvements * Online feedback form * Optional HTTPS support * Cross-platform agent installation * Column Family truncate support We hope the data browser improvements are particularly useful. Previous versions displayed any non-UTF-8 data as hex bytes. The new release should correctly display all built-in Cassandra data types (including composites). There is also an integrated feedback mechanism now. Please feel free to send us any feedback you have. Full Release Notes: http://www.datastax.com/docs/opscenter/release_notes Download: http://www.datastax.com/download/community *Note: The community version is completely free for all types of use. Thanks! Nick
Re: Distinct Counter Proposal for Cassandra
You can open a JIRA ticket at https://issues.apache.org/jira/browse/CASSANDRA with your proposal. Just for the input: I had once implemented a HyperLogLog counter to use internally in Cassandra, but it turned out I didn't need it, so I just put it in a gist. You can find it here: https://gist.github.com/2597943 The above implementation and most of the other ones (including stream-lib) implement the optimized version of the algorithm which counts up to 10^9, so it may need some work. Another alternative is the self-learning bitmap (http://ect.bell-labs.com/who/aychen/sbitmap4p.pdf) which, in my understanding, is more memory efficient when counting small values. Yuki On Wednesday, June 13, 2012 at 11:28 AM, Utku Can Topçu wrote: Hi All, Let's assume we have a use case where we need to count the number of columns for a given key. Let's say the key is the URL and the column-name is the IP address or any cardinality identifier. The straightforward implementation seems simple: just inserting the IP addresses as columns under the key defined by the URL and using get_count to count them back. However, the problem here is in the case of large rows (with too many IP addresses in them); the get_count method has to de-serialize the whole row and calculate the count. As also noted in the user guides, it's not an O(1) operation and it's quite costly. However, this problem seems to have better solutions if you don't have a strict requirement for the count to be exact.
There are streaming algorithms that will provide good cardinality estimations within a predefined failure rate; I think the most popular one is the (Hyper)LogLog algorithm, and there's also an optimal one developed recently, please check http://dl.acm.org/citation.cfm?doid=1807085.1807094 If you want to take a look at a Java implementation for LogLog, Clearspring has both LogLog and a space-optimized HyperLogLog available at https://github.com/clearspring/stream-lib I don't see a reason why this can't be implemented in Cassandra. The distributed nature of all these algorithms can easily be adapted to Cassandra's model. I think most of us would love to see some cardinality-estimating columns in Cassandra. Regards, Utku
Re: Errors with Cassandra 1.0.10, 1.1.0, 1.1.1-SNAPSHOT and 1.2.0-SNAPSHOT
Hi! I've found that in all cases, the errors happened when running the test cases of the application. Before running each test case, a test keyspace is created, then the tests are executed, and the keyspace is removed. The repeated creation, use and removal of a keyspace causes that kind of error, but the tests ran ok... those errors are not propagated to the application... Thanks! Horacio On Mon, Jun 4, 2012 at 3:50 PM, aaron morton aa...@thelastpickle.com wrote: I remember someone having the file-exists issue a few weeks ago, IIRC it magically went away. Do you have steps to reproduce this fault? If you can reproduce it on a release version please create a ticket on https://issues.apache.org/jira/browse/CASSANDRA and update the email thread. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 3/06/2012, at 2:56 PM, Horacio G. de Oro wrote: Permissions are ok. The writes work ok, and the data can be read. Thanks! Horacio On Sat, Jun 2, 2012 at 11:50 PM, Kirk True k...@mustardgrain.com wrote: Permissions problems on /var for the user running Cassandra? Sent from my iPhone On Jun 2, 2012, at 6:56 PM, Horacio G. de Oro hgde...@gmail.com wrote: Hi! While using Cassandra, I've seen these log messages when running some test cases (which insert lots of columns in 4 rows). I've tried Cassandra 1.0.10, 1.1.0, 1.1.1-SNAPSHOT and 1.2.0-SNAPSHOT (built from git). I'm using the default configuration, Oracle jdk 1.6.0_32, Ubuntu 12.04 and pycassa. Since I'm very new to Cassandra (I'm just starting to learn it) I don't know if I'm doing something wrong, or maybe there are some bugs in the several versions of Cassandra I've tested. cassandra-1.0.10 - IOException: unable to mkdirs cassandra-1.1.0 - IOException: Unable to create directory cassandra-1.1.1-SNAPSHOT - IOException: Unable to create directory cassandra-1.2.0-SNAPSHOT - IOException: Unable to create directory - CLibrary.java (line 191) Unable to create hard link (...)
command output: ln: failed to create hard link `(...)/lolog_tests-Logs_by_app-ia-3-Summary.db': File exists

Thanks in advance! Horacio

system-cassandra-1.0.10.log

ERROR [MutationStage:1] 2012-06-02 20:37:41,115 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[MutationStage:1,5,main]
java.io.IOError: java.io.IOException: unable to mkdirs /var/lib/cassandra/data/lolog_tests/snapshots/1338680261112-Logs_by_app
    at org.apache.cassandra.db.ColumnFamilyStore.snapshotWithoutFlush(ColumnFamilyStore.java:1433)
    at org.apache.cassandra.db.ColumnFamilyStore.snapshot(ColumnFamilyStore.java:1462)
    at org.apache.cassandra.db.ColumnFamilyStore.truncate(ColumnFamilyStore.java:1657)
    at org.apache.cassandra.db.TruncateVerbHandler.doVerb(TruncateVerbHandler.java:50)
ERROR [MutationStage:11] 2012-06-02 20:37:55,730 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[MutationStage:11,5,main]
java.io.IOError: java.io.IOException: unable to mkdirs /var/lib/cassandra/data/lolog_tests/snapshots/1338680275729-Logs_by_app
    at org.apache.cassandra.db.ColumnFamilyStore.snapshotWithoutFlush(ColumnFamilyStore.java:1433)
    at org.apache.cassandra.db.ColumnFamilyStore.snapshot(ColumnFamilyStore.java:1462)
    at org.apache.cassandra.db.ColumnFamilyStore.truncate(ColumnFamilyStore.java:1657)
    at org.apache.cassandra.db.TruncateVerbHandler.doVerb(TruncateVerbHandler.java:50)
ERROR [MutationStage:19] 2012-06-02 20:37:57,395 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[MutationStage:19,5,main]
java.io.IOError: java.io.IOException: unable to mkdirs /var/lib/cassandra/data/lolog_tests/snapshots/1338680277394-Logs_by_app
    at org.apache.cassandra.db.ColumnFamilyStore.snapshotWithoutFlush(ColumnFamilyStore.java:1433)
    at org.apache.cassandra.db.ColumnFamilyStore.snapshot(ColumnFamilyStore.java:1462)
    at org.apache.cassandra.db.ColumnFamilyStore.truncate(ColumnFamilyStore.java:1657)
    at org.apache.cassandra.db.TruncateVerbHandler.doVerb(TruncateVerbHandler.java:50)
ERROR [MutationStage:20] 2012-06-02 20:41:26,666 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[MutationStage:20,5,main]
java.io.IOError: java.io.IOException: unable to mkdirs /var/lib/cassandra/data/lolog_tests/snapshots/133868048-Logs_by_app
    at org.apache.cassandra.db.ColumnFamilyStore.snapshotWithoutFlush(ColumnFamilyStore.java:1433)
    at org.apache.cassandra.db.ColumnFamilyStore.snapshot(ColumnFamilyStore.java:1462)
    at org.apache.cassandra.db.ColumnFamilyStore.truncate(ColumnFamilyStore.java:1657)
    at
Re: Distinct Counter Proposal for Cassandra
Hi Yuki, I think I should have used the word 'discussion' instead of 'proposal' for the mailing subject. I have quite a bit of a design in my mind, but I think it's not yet ripe enough to formalize. I'll try to simplify it and open a Jira ticket. But first I'm wondering if there would be any excitement in the community for such a feature. Regards, Utku On Wed, Jun 13, 2012 at 7:00 PM, Yuki Morishita mor.y...@gmail.com wrote: You can open a JIRA ticket at https://issues.apache.org/jira/browse/CASSANDRA with your proposal. Just for the input: I had once implemented a HyperLogLog counter to use internally in Cassandra, but it turned out I didn't need it, so I just put it in a gist. You can find it here: https://gist.github.com/2597943 The above implementation and most of the other ones (including stream-lib) implement the optimized version of the algorithm which counts up to 10^9, so it may need some work. Another alternative is the self-learning bitmap (http://ect.bell-labs.com/who/aychen/sbitmap4p.pdf) which, in my understanding, is more memory efficient when counting small values. Yuki On Wednesday, June 13, 2012 at 11:28 AM, Utku Can Topçu wrote: Hi All, Let's assume we have a use case where we need to count the number of columns for a given key. Let's say the key is the URL and the column-name is the IP address or any cardinality identifier. The straightforward implementation seems simple: just inserting the IP addresses as columns under the key defined by the URL and using get_count to count them back. However, the problem here is in the case of large rows (with too many IP addresses in them); the get_count method has to de-serialize the whole row and calculate the count. As also noted in the user guides, it's not an O(1) operation and it's quite costly. However, this problem seems to have better solutions if you don't have a strict requirement for the count to be exact.
There are streaming algorithms that will provide good cardinality estimations within a predefined failure rate; I think the most popular one is the (Hyper)LogLog algorithm, and there's also an optimal one developed recently, please check http://dl.acm.org/citation.cfm?doid=1807085.1807094 If you want to take a look at a Java implementation for LogLog, Clearspring has both LogLog and a space-optimized HyperLogLog available at https://github.com/clearspring/stream-lib I don't see a reason why this can't be implemented in Cassandra. The distributed nature of all these algorithms can easily be adapted to Cassandra's model. I think most of us would love to see some cardinality-estimating columns in Cassandra. Regards, Utku
Re: kswapd0 causing read timeouts
Alright, here it goes again... Even with mmap_index_only, once the RES memory hit 15 gigs, the read latency went berserk. This happens in 12 hours if disk_access_mode is mmap, about 48 hrs if it's mmap_index_only. Only reads happening, at 50 reads/second.

row cache size: 730 mb, row cache hit ratio: 0.75
key cache size: 400 mb, key cache hit ratio: 0.4
heap size (max 8 gigs): used 6.1-6.9 gigs
No messages about reducing cache sizes in the logs

stats:
vmstat 1: no swapping here, however high sys cpu utilization
iostat (looks great) - avg-qu-sz = 8, avg await = 7 ms, svc time = 0.6, util = 15-30%
top - VIRT - 19.8g, SHR - 6.1g, RES - 15g, high cpu, buffers - 2mb
cfstats - 70-100 ms. This number used to be 20-30 ms.

The value of SHR keeps increasing (owing to mmap, I guess), while at the same time buffers keep decreasing. buffers start as high as 50 mb and go down to 2 mb. This is very easily reproducible for me. Every time the RES memory hits about 15 gigs, the client starts getting timeouts from Cassandra and the sys cpu jumps a lot. All this even though my row cache hit ratio is almost 0.75. Other than just turning off mmap completely, is there any other solution or setting to avoid a Cassandra restart every couple of days? Something to keep the RES memory from hitting such a high number. I have been constantly monitoring the RES, and was not seeing issues when RES was at 14 gigs. /G On Fri, Jun 8, 2012 at 10:02 PM, Gurpreet Singh gurpreet.si...@gmail.com wrote: Aaron, Ruslan, I changed the disk access mode to mmap_index_only, and it has been stable ever since, well at least for the past 20 hours. Previously, in about 10-12 hours, as soon as the resident memory was full, the client would start timing out on all its reads. It looks fine for now; I am going to let it continue to see how long it lasts and if the problem comes again. Aaron, yes, I had turned swap off. The total cpu utilization was at 700% roughly..
It looked like kswapd0 was using just 1 cpu, but cassandra (jsvc) cpu utilization increased quite a bit. top was reporting high system cpu and low user cpu. vmstat was not showing swapping. java heap size max is 8 gigs, while only 4 gigs was in use, so the java heap was doing great. no gc in the logs. iostat was doing ok from what I remember; I will have to reproduce the issue for the exact numbers. cfstats latency had gone very high, but that is partly due to high cpu usage. One thing was clear: the SHR was inching higher (due to the mmap) while the buffer cache, which started at about 20-25mb, reduced to 2 MB by the end, which probably means that the page cache was being evicted by kswapd0. Is there a way to fix the size of the buffer cache and not let the system evict it in favour of mmap? Also, mmapping data files would basically cause not only the data (asked for) to be read into main memory, but also a bunch of extra pages (readahead), which would not be very useful, right? The same thing for the index would actually be more useful, as there would be more index entries in the readahead part.. and the index files being small wouldn't cause enough memory pressure that the page cache would be evicted. mmapping the data files would make sense if the data size is smaller than the RAM or the hot data set is smaller than the RAM; otherwise just the index would probably be a better thing to mmap, no? In my case the data size is 85 gigs, while available RAM is 16 gigs (only 8 gigs after heap). /G On Fri, Jun 8, 2012 at 11:44 AM, aaron morton aa...@thelastpickle.com wrote: Ruslan, Why did you suggest changing the disk_access_mode ? Gurpreet, I would leave the disk_access_mode with the default until you have a reason to change it. 8 core, 16 gb ram, 6 data disks raid0, no swap configured is swap disabled ? Gradually, the system cpu becomes high almost 70%, and the client starts getting continuous timeouts 70% of one core or 70% of all cores ? Check the server logs, is there GC activity ?
check nodetool cfstats to see the read latency for the cf. Take a look at vmstat to see if you are swapping, and look at iostat to see if io is the problem http://spyced.blogspot.co.nz/2010/01/linux-performance-basics.html Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 8/06/2012, at 9:00 PM, Gurpreet Singh wrote: Thanks Ruslan. I will try the mmap_index_only. Is there any guideline as to when to leave it to auto and when to use mmap_index_only? /G On Fri, Jun 8, 2012 at 1:21 AM, ruslan usifov ruslan.usi...@gmail.com wrote: disk_access_mode: mmap?? set to disk_access_mode: mmap_index_only in cassandra.yaml 2012/6/8 Gurpreet Singh gurpreet.si...@gmail.com: Hi, I am testing cassandra 1.1 on a 1 node cluster. 8 core, 16 gb ram, 6 data disks raid0, no swap configured cassandra 1.1.1 heap size: 8 gigs key cache size in mb: 800 (used only 200mb till now) memtable_total_space_in_mb : 2048 I am
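Since the failure mode in this thread announces itself through rising RES, a process can also watch its own resident set size the same way top does. Below is a small, hypothetical Java helper (not part of Cassandra, names are illustrative; Linux-only, since it parses /proc/self/status) that reads VmRSS, the number top reports as RES:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class RssMonitor {
    // Parse VmRSS (resident set size, in kB) from /proc/self/status.
    // Linux-only; returns -1 if the field is missing.
    static long residentKb() throws IOException {
        for (String line : Files.readAllLines(Paths.get("/proc/self/status"))) {
            if (line.startsWith("VmRSS:")) {
                // Line looks like: "VmRSS:     12345 kB"
                String[] parts = line.trim().split("\\s+");
                return Long.parseLong(parts[1]);
            }
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        long kb = residentKb();
        System.out.println("RES = " + kb + " kB");
        // e.g. alert well before the ~15 GB danger zone observed in the thread
        // (the threshold here is an illustrative choice, not a recommendation):
        if (kb > 14L * 1024 * 1024)
            System.err.println("RES approaching mmap pressure threshold");
    }
}
```

Polling this from a scheduled task and alerting a gig or two below the point where latency degraded would give warning before a restart becomes the only option.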
RE: Much more native memory used by Cassandra than the configured JVM heap size
actually, this is without jna.jar. I will add it and see if I still have the same issue. From: Poziombka, Wade L Sent: Wednesday, June 13, 2012 10:53 AM To: user@cassandra.apache.org Subject: RE: Much more native memory used by Cassandra than the configured JVM heap size Seems like my only recourse is to remove jna.jar and just take the performance/swapping pain? Obviously can't have the entire box lock up. I can provide a pmap etc. if needed. From: Poziombka, Wade L [mailto:wade.l.poziom...@intel.com] Sent: Wednesday, June 13, 2012 10:28 AM To: user@cassandra.apache.org Subject: RE: Much more native memory used by Cassandra than the configured JVM heap size I have experienced the same issue. The Java heap seems fine but eventually the OS runs out of memory. In my case it renders the entire box unusable without a hard reboot. Console shows: is there a way to limit the native heap usage? xfs invoked oom-killer: gfp_mask=0x201d2, order=0, oomkilladj=0 Call Trace: [800c9d3a] out_of_memory+0x8e/0x2f3 [8002dfd7] __wake_up+0x38/0x4f [8000f677] __alloc_pages+0x27f/0x308 [80013034] __do_page_cache_readahead+0x96/0x17b [80013971] filemap_nopage+0x14c/0x360 [8000896c] __handle_mm_fault+0x1fd/0x103b [8002dfd7] __wake_up+0x38/0x4f [800671f2] do_page_fault+0x499/0x842 [800b8f39] audit_filter_syscall+0x87/0xad [8005dde9] error_exit+0x0/0x84 Node 0 DMA per-cpu: empty Node 0 DMA32 per-cpu: empty Node 0 Normal per-cpu: cpu 0 hot: high 186, batch 31 used:23 cpu 0 cold: high 62, batch 15 used:14 ...
cpu 23 cold: high 62, batch 15 used:8 Node 1 HighMem per-cpu: empty Free pages: 158332kB (0kB HighMem) Active:16225503 inactive:1 dirty:0 writeback:0 unstable:0 free:39583 slab:21496 Node 0 DMA free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB lowmem_reserve[]: 0 0 32320 32320 Node 0 DMA32 free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0 lowmem_reserve[]: 0 0 32320 32320 Node 0 Normal free:16136kB min:16272kB low:20340kB high:24408kB active:3255624 From: aaron morton [mailto:aa...@thelastpickle.com] Sent: Tuesday, June 12, 2012 4:08 AM To: user@cassandra.apache.org Subject: Re: Much more native memory used by Cassandra than the configured JVM heap size see http://wiki.apache.org/cassandra/FAQ#mmap which cause the OS low memory. If the memory is used for mmapped access the OS can get it back later. Is the low free memory causing a problem? Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 12/06/2012, at 5:52 PM, Jason Tang wrote: Hi I found some information on this issue. And it seems we can use another strategy for data access to reduce mmap usage, in order to use less memory. But I didn't find a document describing the parameters for Cassandra 1.x; is it a good way to use this parameter to reduce shared memory usage, and what's the impact? (btw, our data model is dynamic, which means that although the throughput is high, the life cycle of the data is short, one hour or less). # Choices are auto, standard, mmap, and mmap_index_only. disk_access_mode: auto http://comments.gmane.org/gmane.comp.db.cassandra.user/7390 2012/6/12 Jason Tang ares.t...@gmail.com See my post, I limit the JVM heap to 6G, but actually Cassandra will use more memory which is not calculated in the JVM heap. I use top to monitor the total memory used by Cassandra. = -Xms6G -Xmx6G -Xmn1600M 2012/6/12 Jeffrey Kesselman jef...@gmail.com Btw.
I suggest you spin up JConsole as it will give you much more detail on what your VM is actually doing. On Mon, Jun 11, 2012 at 9:14 PM, Jason Tang ares.t...@gmail.com wrote: Hi We have a problem with Cassandra memory usage. We configured the JVM heap to 6G, but after running Cassandra for several hours (insert, update, delete), the total memory used by Cassandra goes up to 15G, which causes the OS to run low on memory. So I wonder if it is normal to have so much memory used by Cassandra? And how to limit the native memory used by Cassandra? === Cassandra 1.0.3, 64 bit jdk. Memory occupied by Cassandra 15G PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 9567 casadm 20 0 28.3g 15g 9.1g S 269 65.1 385:57.65 java = -Xms6G -Xmx6G -Xmn1600M # ps -ef | grep 9567 casadm 9567 1 55 Jun11 ? 05:59:44 /opt/jdk1.6.0_29/bin/java -ea -javaagent:/opt/dve/cassandra/bin/../lib/jamm-0.2.5.jar -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms6G -Xmx6G -Xmn1600M -XX:+HeapDumpOnOutOfMemoryError -Xss128k -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1
Re: kswapd0 causing read timeouts
Hmm, that's very strange. What is the amount of your data? Your Linux kernel version? Java version? PS: I can suggest switching disk_access_mode to standard in your case. PS: also upgrade your Linux to the latest, and Java HotSpot to 1.6.32 (from the Oracle site). 2012/6/13 Gurpreet Singh gurpreet.si...@gmail.com: Alright, here it goes again... Even with mmap_index_only, once the RES memory hit 15 gigs, the read latency went berserk. This happens in 12 hours if disk_access_mode is mmap, about 48 hrs if it's mmap_index_only. only reads happening at 50 reads/second row cache size: 730 mb, row cache hit ratio: 0.75 key cache size: 400 mb, key cache hit ratio: 0.4 heap size (max 8 gigs): used 6.1-6.9 gigs No messages about reducing cache sizes in the logs stats: vmstat 1 : no swapping here, however high sys cpu utilization iostat (looks great) - avg-qu-sz = 8, avg await = 7 ms, svc time = 0.6, util = 15-30% top - VIRT - 19.8g, SHR - 6.1g, RES - 15g, high cpu, buffers - 2mb cfstats - 70-100 ms. This number used to be 20-30 ms. The value of SHR keeps increasing (owing to mmap, I guess), while at the same time buffers keep decreasing. buffers start as high as 50 mb and go down to 2 mb. This is very easily reproducible for me. Every time the RES memory hits about 15 gigs, the client starts getting timeouts from Cassandra and the sys cpu jumps a lot. All this even though my row cache hit ratio is almost 0.75. Other than just turning off mmap completely, is there any other solution or setting to avoid a Cassandra restart every couple of days? Something to keep the RES memory from hitting such a high number. I have been constantly monitoring the RES, and was not seeing issues when RES was at 14 gigs. /G On Fri, Jun 8, 2012 at 10:02 PM, Gurpreet Singh gurpreet.si...@gmail.com wrote: Aaron, Ruslan, I changed the disk access mode to mmap_index_only, and it has been stable ever since, well at least for the past 20 hours.
Previously, in about 10-12 hours, as soon as the resident memory was full, the client would start timing out on all its reads. It looks fine for now; I am going to let it continue to see how long it lasts and if the problem comes again. Aaron, yes, I had turned swap off. The total cpu utilization was at 700% roughly.. It looked like kswapd0 was using just 1 cpu, but cassandra (jsvc) cpu utilization increased quite a bit. top was reporting high system cpu and low user cpu. vmstat was not showing swapping. java heap size max is 8 gigs, while only 4 gigs was in use, so the java heap was doing great. no gc in the logs. iostat was doing ok from what I remember; I will have to reproduce the issue for the exact numbers. cfstats latency had gone very high, but that is partly due to high cpu usage. One thing was clear: the SHR was inching higher (due to the mmap) while the buffer cache, which started at about 20-25mb, reduced to 2 MB by the end, which probably means that the page cache was being evicted by kswapd0. Is there a way to fix the size of the buffer cache and not let the system evict it in favour of mmap? Also, mmapping data files would basically cause not only the data (asked for) to be read into main memory, but also a bunch of extra pages (readahead), which would not be very useful, right? The same thing for the index would actually be more useful, as there would be more index entries in the readahead part.. and the index files being small wouldn't cause memory pressure that the page cache would be evicted. mmapping the data files would make sense if the data size is smaller than the RAM or the hot data set is smaller than the RAM; otherwise just the index would probably be a better thing to mmap, no? In my case the data size is 85 gigs, while available RAM is 16 gigs (only 8 gigs after heap). /G On Fri, Jun 8, 2012 at 11:44 AM, aaron morton aa...@thelastpickle.com wrote: Ruslan, Why did you suggest changing the disk_access_mode ?
Gurpreet, I would leave the disk_access_mode with the default until you have a reason to change it. 8 core, 16 gb ram, 6 data disks raid0, no swap configured is swap disabled ? Gradually, the system cpu becomes high almost 70%, and the client starts getting continuous timeouts 70% of one core or 70% of all cores ? Check the server logs, is there GC activity ? check nodetool cfstats to see the read latency for the cf. Take a look at vmstat to see if you are swapping, and look at iostats to see if io is the problem http://spyced.blogspot.co.nz/2010/01/linux-performance-basics.html Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 8/06/2012, at 9:00 PM, Gurpreet Singh wrote: Thanks Ruslan. I will try the mmap_index_only. Is there any guideline as to when to leave it to auto and when to use mmap_index_only? /G On Fri, Jun 8, 2012 at 1:21 AM, ruslan usifov ruslan.usi...@gmail.com wrote: disk_access_mode: mmap?? set to disk_access_mode: mmap_index_only in
Cassandra upgrade to 1.1.1 resulted in slow query issue
Greetings, We have recently introduced Cassandra at the Globe and Mail here in Toronto, Canada. We are processing and storing the North American stock-market feed. We have found it to work very quickly and things have been looking very good. Recently we upgraded to version 1.1.1, and since then we have noticed some issues, which I will try to describe for you here. One operation that we perform very often, and which is very critical, is the ability to 'get the latest quote'. This returns the latest Quote adjusted against exchange delay rules. With Cassandra version 1.0.3 we could get a Quote in around 2 ms. After the upgrade we are looking at times of at least 2-3 seconds. The way we query the quote is a REVERSED SuperSliceQuery with start=now, end=00:00:00.000 (beginning of day), limited to 1. Our investigation leads us to suspect that, since the upgrade, Cassandra seems to be reading the sstable from disk even when we request a small range only 5 seconds back. If you look at the output below you can see that the query does NOT get slower as the lookback increases through 5 sec, 60 sec, 15 min, 60 min, and 24 hours; it is uniformly slow. We also noticed that the query was very fast for the first five minutes of trading, apparently until the first sstable was flushed to disk. After that we see query times of 1-2 seconds or so.

Query time[lookback=5]:[1711ms]
Query time[lookback=60]:[1592ms]
Query time[lookback=900]:[1520ms]
Query time[lookback=3600]:[1294ms]
Query time[lookback=86400]:[1391ms]

We would really appreciate input or help on this. 
Cassandra version: 1.1.1 Hector version: 1.0-1
---
public void testCassandraIssue() {
    try {
        int[] seconds = new int[]{ 5, 60, 60 * 15, 60 * 60, 60 * 60 * 24 };
        for (int sec : seconds) {
            DateTime start = new DateTime();
            SuperSliceQuery<String, String, String, String> superSliceQuery =
                    HFactory.createSuperSliceQuery(keyspaceOperator,
                            StringSerializer.get(), StringSerializer.get(),
                            StringSerializer.get(), StringSerializer.get());
            superSliceQuery.setKey("101390" + "." + testFormatter.print(start));
            superSliceQuery.setColumnFamily("Quotes");
            superSliceQuery.setRange(superKeyFormatter.print(start),
                    superKeyFormatter.print(start.minusSeconds(sec)),
                    true, 1);

            long theStart = System.currentTimeMillis();
            QueryResult<SuperSlice<String, String, String>> result = superSliceQuery.execute();
            long end = System.currentTimeMillis();
            System.out.println("Query time[lookback=" + sec + "]:[" + (end - theStart) + "ms]");
        }
    } catch (Exception e) {
        e.printStackTrace();
        fail(e.getMessage());
    }
}
---
create column family Quotes
  with column_type = Super
  and comparator = BytesType
  and subcomparator = BytesType
  and keys_cached = 7000
  and rows_cached = 0
  and row_cache_save_period = 0
  and key_cache_save_period = 3600
  and memtable_throughput = 255
  and memtable_operations = 0.29
  AND compression_options = {sstable_compression:SnappyCompressor, chunk_length_kb:64};
---
-Ivan/
Ivan Ganza | Senior Developer | Information Technology c: 647.701.6084 | e: iga...@globeandmail.com
Help with configuring replication
Before going into complex clustering topologies, I would like to try the most simple configuration: just set up two nodes that will completely replicate each other. Could somebody tell me how to configure it? Thanks!
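For a two-node, fully replicated setup, the usual approach in the 1.x era is a keyspace whose replication_factor equals the node count. A minimal sketch in cassandra-cli syntax; 'MyKeyspace' is a placeholder name, and it assumes both nodes already share the same cluster_name and list a common seed in cassandra.yaml:

```
create keyspace MyKeyspace
  with placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy'
  and strategy_options = {replication_factor:2};
```

With replication_factor equal to the number of nodes (2), every node holds a full copy of the data, so either node can serve any read, consistency level permitting.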
Re: Snapshot failing on JSON files in 1.1.0
Hello Alain, Yes, the AMI is geared to working out of the box for most dev purposes. We recently spotted an issue with JNA 3.2.7 on Ubuntu 10.10 not being picked up. You can try running `apt-get install libjna-java` but in order for the change to be activated, you must restart your Cassandra service. Could you try removing the older version in the /usr/share/java/ folder and restarting your Cassandra service? You should see a 'JNA mlockall successful' message in your system.log. If not, then remove the remaining jna.jar and replace it with the newest jna.jar as found here: https://github.com/twall/jna. Upon another restart of your node, you should see the 'JNA mlockall successful' message as well as the ability to run the snapshots. I was unable to replicate this on a new instance so all new launches should include the patched code. Do let me know if anyone else sees this issue. Thanks, Joaquin Casares DataStax Software Engineer/Support

On Wed, Jun 13, 2012 at 10:28 AM, Alain RODRIGUEZ arodr...@gmail.com wrote:

If I do: locate jna

/opt/java/64/jdk1.6.0_31/db/docs/html/ref/rrefsqljnaturaljoin.html
/root/.m2/repository/net/java/dev/jna
/root/.m2/repository/net/java/dev/jna/jna
/root/.m2/repository/net/java/dev/jna/jna/3.2.7
/root/.m2/repository/net/java/dev/jna/jna/3.2.7/jna-3.2.7-sources.jar
/root/.m2/repository/net/java/dev/jna/jna/3.2.7/jna-3.2.7-sources.jar.sha1
/root/.m2/repository/net/java/dev/jna/jna/3.2.7/jna-3.2.7.jar
/root/.m2/repository/net/java/dev/jna/jna/3.2.7/jna-3.2.7.jar.sha1
/root/.m2/repository/net/java/dev/jna/jna/3.2.7/jna-3.2.7.pom
/root/.m2/repository/net/java/dev/jna/jna/3.2.7/jna-3.2.7.pom.sha1
/usr/share/doc/libjna-java
/usr/share/doc/libjna-java/README.Debian
/usr/share/doc/libjna-java/changelog.Debian.gz
/usr/share/doc/libjna-java/copyright
/usr/share/java/jna-3.2.4.jar
/usr/share/java/jna.jar
/usr/share/maven-repo/net/java/dev/jna
/usr/share/maven-repo/net/java/dev/jna/jna
/usr/share/maven-repo/net/java/dev/jna/jna/3.2.4
/usr/share/maven-repo/net/java/dev/jna/jna/debian
/usr/share/maven-repo/net/java/dev/jna/jna/3.2.4/jna-3.2.4.jar
/usr/share/maven-repo/net/java/dev/jna/jna/3.2.4/jna-3.2.4.pom
/usr/share/maven-repo/net/java/dev/jna/jna/debian/jna-debian.jar
/usr/share/maven-repo/net/java/dev/jna/jna/debian/jna-debian.pom
/var/cache/apt/archives/libjna-java_3.2.4-2_amd64.deb
/var/lib/dpkg/info/libjna-java.list
/var/lib/dpkg/info/libjna-java.md5sums

So what version am I using (jna 3.2.7 or 3.2.4)? Should I do an apt-get install libjna-java, and does this need a restart? Alain

2012/6/13 Alain RODRIGUEZ arodr...@gmail.com: Hi Aaron, We are using Ubuntu (AMI Datastax 1.0.9 as I said). Release: 10.10 Codename: maverick

ERROR [RMI TCP Connection(37732)-10.248.10.94] 2012-06-13 15:00:17,157 CLibrary.java (line 153) Unable to create hard link
com.sun.jna.LastErrorException: errno was 1
        at org.apache.cassandra.utils.CLibrary.link(Native Method)
        at org.apache.cassandra.utils.CLibrary.createHardLink(CLibrary.java:145)
        at org.apache.cassandra.io.sstable.SSTableReader.createLinks(SSTableReader.java:857)
        at org.apache.cassandra.db.ColumnFamilyStore.snapshotWithoutFlush(ColumnFamilyStore.java:1412)
        at org.apache.cassandra.db.ColumnFamilyStore.snapshot(ColumnFamilyStore.java:1462)
        at org.apache.cassandra.db.Table.snapshot(Table.java:210)
        at org.apache.cassandra.service.StorageService.takeSnapshot(StorageService.java:1710)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93)
        at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27)
        at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208)
        at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:120)
        at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:262)
        at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:836)
        at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:761)
        at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1427)
        at javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72)
        at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1265)
        at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1360)
        at
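Joaquin's check above can be scripted. A hedged sketch: the helper name is made up, the log path is the common Debian/Ubuntu default, and the 'JNA mlockall successful' message is the one quoted in his reply:

```shell
# Hypothetical helper: report whether a given Cassandra system.log shows
# that JNA initialized (mlockall) successfully.
check_jna() {
  if grep -q 'JNA mlockall successful' "$1" 2>/dev/null; then
    echo "JNA is active"
  else
    echo "JNA not active - remove stale jna jars (e.g. /usr/share/java/jna-3.2.4.jar) and restart Cassandra"
  fi
}

# Demo against a sample log line; on a real node point it at your
# system.log (often /var/log/cassandra/system.log, but path varies).
printf 'INFO 12:00:00 JNA mlockall successful\n' > /tmp/sample_system.log
check_jna /tmp/sample_system.log
```

Run after each restart; if the message never appears, an older jna jar is probably still shadowing the new one on the classpath.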
Supercolumn behavior on writes
Does a write to a sub column involve deserialization of the entire super column ? Thanks, Oleg
Re: Supercolumn behavior on writes
That's a good question. I just went to a class, Ben was saying that any action on a super column requires de-re-serialization. But, it would be nice if a write had this sort of efficiency. I have been playing with the 1.1.1 version, in that one there are 'composite' columns, which I think are like super columns, but they don't require serialization and deserialization. However, there seems to be a catch. You can't 'invent' columns on the fly, everything has to be declared when you declare the column family. ---greg On Wed, Jun 13, 2012 at 6:52 PM, Oleg Dulin oleg.du...@gmail.com wrote: Does a write to a sub column involve deserialization of the entire super column ? Thanks, Oleg
Re: Supercolumn behavior on writes
You can create composite columns on the fly. On 06/13/2012 09:58 PM, Greg Fausak wrote: [...]
Re: Supercolumn behavior on writes
You can't 'invent' columns on the fly, everything has to be declared when you declare the column family. That's incorrect. You can define column names on the fly. Only the validation must be defined when declaring the CF.
Re: Supercolumn behavior on writes
Interesting. How do you do it? I have a version 2 CF that works fine. A version 3 table won't let me invent columns that don't exist yet (for composite tables). What's the trick?

cqlsh -3 cas1
cqlsh> use onplus;
cqlsh:onplus> select * from at_event where ac_event_id = 7690254;
 ac_event_id | ac_creation          | ac_event_type | ac_id | ev_sev
-------------+----------------------+---------------+-------+--------
     7690254 | 2011-07-23 00:11:47+ | SERV.CPE.CONN |    \N |      5
cqlsh:onplus> update at_event set wingy = 'toto' where ac_event_id = 7690254;
Bad Request: Unknown identifier wingy

This is what I used to create it:

//
// create the event column family, this contains the static
// part of the definition. many additional columns can be specified
// in the port from relational, these would be mainly the at_event table
//
use onplus;
create columnfamily at_event (
  ac_event_id int PRIMARY KEY,
  ac_event_type text,
  ev_sev int,
  ac_id text,
  ac_creation timestamp
) with compression_parameters:sstable_compression = '';

-g

On Wed, Jun 13, 2012 at 9:36 PM, samal samalgo...@gmail.com wrote: You can't 'invent' columns on the fly, everything has to be declared when you declare the column family. That' s incorrect. You can define name on fly. Validation must be define when declaring CF
Re: Supercolumn behavior on writes
Via thrift, or a high level client on thrift, see as an example http://www.datastax.com/dev/blog/introduction-to-composite-columns-part-1 On 06/13/2012 11:08 PM, Greg Fausak wrote: [...]
Re: kswapd0 causing read timeouts
I would check /etc/sysctl.conf and get the values of /proc/sys/vm/swappiness and /proc/sys/vm/vfs_cache_pressure. If you don't have JNA enabled (which Cassandra uses to fadvise) and swappiness is at its default of 60, the Linux kernel will happily swap out your heap for cache space. Set swappiness to 1 or 'swapoff -a' and kswapd shouldn't be doing much unless you have a too-large heap or some other app using up memory on the system.

On Wed, Jun 13, 2012 at 11:30 AM, ruslan usifov ruslan.usi...@gmail.com wrote: Hm, it's very strange. What is the amount of your data? Your Linux kernel version? Java version? PS: I can suggest switching disk_access_mode to standard in your case. PS PS: also upgrade your Linux to the latest, and Java HotSpot to 1.6.32 (from the Oracle site).

2012/6/13 Gurpreet Singh gurpreet.si...@gmail.com: Alright, here it goes again... Even with mmap_index_only, once the RES memory hit 15 gigs, the read latency went berserk. This happens in 12 hours if disk_access_mode is mmap, about 48 hours if it is mmap_index_only. Only reads are happening, at 50 reads/second.

row cache size: 730 MB, row cache hit ratio: 0.75
key cache size: 400 MB, key cache hit ratio: 0.4
heap size (max 8 gigs): used 6.1-6.9 gigs

No messages about reducing cache sizes in the logs.

Stats:
vmstat 1: no swapping here, however high sys CPU utilization
iostat (looks great): avg-qu-sz = 8, avg await = 7 ms, svc time = 0.6, util = 15-30%
top: VIRT 19.8g, SHR 6.1g, RES 15g, high CPU, buffers 2 MB
cfstats: 70-100 ms. This number used to be 20-30 ms.

The value of SHR keeps increasing (owing to the mmap, I guess), while at the same time buffers keeps decreasing. buffers starts as high as 50 MB and goes down to 2 MB. This is very easily reproducible for me. Every time the RES memory hits about 15 gigs, the client starts getting timeouts from Cassandra and the sys CPU jumps a lot. All this even though my row cache hit ratio is almost 0.75. 
Other than just turning off mmap completely, is there any other solution or setting to avoid a Cassandra restart every couple of days, something to keep the RES memory from hitting such a high number? I have been constantly monitoring the RES, and was not seeing issues when RES was at 14 gigs. /G

On Fri, Jun 8, 2012 at 10:02 PM, Gurpreet Singh gurpreet.si...@gmail.com wrote: Aaron, Ruslan, I changed the disk access mode to mmap_index_only, and it has been stable ever since, well at least for the past 20 hours. [...]
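The swappiness advice at the top of this thread can be turned into a quick check. A sketch only; the helper name is made up, and the threshold of 1 follows the advice above rather than any official recommendation:

```shell
# Hypothetical helper: warn when vm.swappiness is high enough to let the
# kernel swap out the Cassandra heap in favour of page cache.
check_swappiness() {
  s=$(cat "$1" 2>/dev/null || echo 60)   # 60 is the usual kernel default
  if [ "$s" -gt 1 ]; then
    echo "vm.swappiness=$s - consider 'sysctl -w vm.swappiness=1' or 'swapoff -a'"
  else
    echo "vm.swappiness=$s - ok"
  fi
}

# Demo with a sample value; on a real node pass /proc/sys/vm/swappiness.
echo 60 > /tmp/swappiness_sample
check_swappiness /tmp/swappiness_sample
```

To make the change persistent, the corresponding line in /etc/sysctl.conf would be `vm.swappiness = 1`.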
Re: Much more native memory used by Cassandra then the configured JVM heap size
Linux's default on busy IO boxes is to use all available memory for cache. Try `echo 1 > /proc/sys/vm/drop_caches` and see if your memory comes back (this will drop vfs caches, and in my experience is safe, but YMMV). If your memory comes back, everything is normal and you should leave it alone. It may block for a while if you have a lot of unflushed pages; this is expected. Try setting /proc/sys/vm/dirty_ratio lower if you notice around 20% of your memory is being consumed for dirty (written pages not flushed to storage) memory. I usually run all of my systems at 5 or lower; 20 is too high for large memory servers IMO. -Al

On Wed, Jun 13, 2012 at 11:01 AM, Poziombka, Wade L wade.l.poziom...@intel.com wrote: Actually, this is without jna.jar. I will add it and see if I still have the same issue.

From: Poziombka, Wade L
Sent: Wednesday, June 13, 2012 10:53 AM
To: user@cassandra.apache.org
Subject: RE: Much more native memory used by Cassandra then the configured JVM heap size

Seems like my only recourse is to remove jna.jar and just take the performance/swapping pain? Obviously can't have the entire box lock up. I can provide a pmap etc. if needed.

From: Poziombka, Wade L [mailto:wade.l.poziom...@intel.com]
Sent: Wednesday, June 13, 2012 10:28 AM
To: user@cassandra.apache.org
Subject: RE: Much more native memory used by Cassandra then the configured JVM heap size

I have experienced the same issue. The Java heap seems fine but eventually the OS runs out of heap. In my case it renders the entire box unusable without a hard reboot. Console shows:

Is there a way to limit the native heap usage? 
xfs invoked oom-killer: gfp_mask=0x201d2, order=0, oomkilladj=0

Call Trace:
[800c9d3a] out_of_memory+0x8e/0x2f3
[8002dfd7] __wake_up+0x38/0x4f
[8000f677] __alloc_pages+0x27f/0x308
[80013034] __do_page_cache_readahead+0x96/0x17b
[80013971] filemap_nopage+0x14c/0x360
[8000896c] __handle_mm_fault+0x1fd/0x103b
[8002dfd7] __wake_up+0x38/0x4f
[800671f2] do_page_fault+0x499/0x842
[800b8f39] audit_filter_syscall+0x87/0xad
[8005dde9] error_exit+0x0/0x84

Node 0 DMA per-cpu: empty
Node 0 DMA32 per-cpu: empty
Node 0 Normal per-cpu:
cpu 0 hot: high 186, batch 31 used:23
cpu 0 cold: high 62, batch 15 used:14
…
cpu 23 cold: high 62, batch 15 used:8
Node 1 HighMem per-cpu: empty
Free pages: 158332kB (0kB HighMem)
Active:16225503 inactive:1 dirty:0 writeback:0 unstable:0 free:39583 slab:21496
Node 0 DMA free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB
lowmem_reserve[]: 0 0 32320 32320
Node 0 DMA32 free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0
lowmem_reserve[]: 0 0 32320 32320
Node 0 Normal free:16136kB min:16272kB low:20340kB high:24408kB active:3255624

From: aaron morton [mailto:aa...@thelastpickle.com]
Sent: Tuesday, June 12, 2012 4:08 AM
To: user@cassandra.apache.org
Subject: Re: Much more native memory used by Cassandra then the configured JVM heap size

See http://wiki.apache.org/cassandra/FAQ#mmap

"which cause the OS low memory": if the memory is used for mmapped access, the OS can get it back later. Is the low free memory causing a problem?

Cheers
- Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com

On 12/06/2012, at 5:52 PM, Jason Tang wrote:

Hi, I found some information on this issue, and it seems we can choose another strategy for data access to reduce mmap usage, in order to use less memory. 
But I didn't find the documentation describing this parameter for Cassandra 1.x. Is it a good way to use this parameter to reduce shared memory usage, and what's the impact? (BTW, our data model is dynamic: although the throughput is high, the life cycle of the data is short, one hour or less.)

# Choices are auto, standard, mmap, and mmap_index_only.
disk_access_mode: auto

http://comments.gmane.org/gmane.comp.db.cassandra.user/7390

2012/6/12 Jason Tang ares.t...@gmail.com: See my post. I limit the JVM heap to 6G, but actually Cassandra will use more memory, which is not counted in the JVM heap. I use top to monitor the total memory used by Cassandra.

-Xms6G -Xmx6G -Xmn1600M

2012/6/12 Jeffrey Kesselman jef...@gmail.com: Btw. I suggest you
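Al's 20% dirty-memory rule of thumb earlier in this thread can be checked directly from /proc/meminfo. A sketch under stated assumptions: the helper name is invented, and the sample numbers below are illustrative only:

```shell
# Hypothetical helper: report dirty (written-but-unflushed) memory as a
# share of total, per the suggestion to lower vm.dirty_ratio when this
# approaches 20% on large-memory servers.
dirty_pct() {
  awk '/^MemTotal:/ {t=$2} /^Dirty:/ {d=$2}
       END { printf "dirty %d kB of %d kB (%.1f%%)\n", d, t, d * 100.0 / t }' "$1"
}

# Demo with a sample meminfo; on a real box pass /proc/meminfo.
printf 'MemTotal: 16000 kB\nDirty: 3200 kB\n' > /tmp/meminfo_sample
dirty_pct /tmp/meminfo_sample
```

If the percentage sits near the vm.dirty_ratio ceiling, `sysctl -w vm.dirty_ratio=5` (Al's preferred value) makes the kernel flush earlier and in smaller bursts.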
Re: Supercolumn behavior on writes
I have just checked the DataStax blog; CQL3 does not support this, as far as I am aware. But we can still do it via a client library using CQL. On Thu, Jun 14, 2012 at 9:12 AM, Dave Brosius dbros...@mebigfatguy.com wrote: Via thrift, or a high level client on thrift, see as an example http://www.datastax.com/dev/blog/introduction-to-composite-columns-part-1 [...]
Re: Supercolumn behavior on writes
On Wed, Jun 13, 2012 at 9:08 PM, Greg Fausak g...@named.com wrote: Interesting. How do you do it? I have a version 2 CF, that works fine. A version 3 table won't let me invent columns that don't exist yet. (for composite tables). What's the trick?

You are able to get the same behaviour as non-CQL by doing something like this:

CREATE TABLE mytable (
  id bigint,
  name text,
  value text,
  PRIMARY KEY (id, name)
) WITH COMPACT STORAGE;

This table will work exactly like a standard column family with no defined columns. For example:

cqlsh:testing> INSERT INTO mytable (id, name, value) VALUES (1, 'firstname', 'Alice');
cqlsh:testing> INSERT INTO mytable (id, name, value) VALUES (1, 'email', 'al...@example.org');
cqlsh:testing> INSERT INTO mytable (id, name, value) VALUES (2, 'firstname', 'Bob');
cqlsh:testing> INSERT INTO mytable (id, name, value) VALUES (2, 'webpage', 'http://bob.example.org');
cqlsh:testing> INSERT INTO mytable (id, name, value) VALUES (2, 'email', 'b...@example.org');
cqlsh:testing> SELECT name, value FROM mytable WHERE id = 2;

 name      | value
-----------+------------------------
 email     | b...@example.org
 firstname | Bob
 webpage   | http://bob.example.org

Not very exciting, but when you take a look with cassandra-cli:

[default@testing] get mytable[2];
=> (column=email, value=b...@example.org, timestamp=1339648270284000)
=> (column=firstname, value=Bob, timestamp=1339648270275000)
=> (column=webpage, value=http://bob.example.org, timestamp=133964827028)
Returned 3 results.
Elapsed time: 11 msec(s).

which is exactly what you would expect from a normal Cassandra column family. So the trick is to separate your static columns and your dynamic columns into separate column families. Column names and types can of course be something different than my example, and inserts can be done within a BATCH to avoid multiple round trips. Also, I'm not trying to advocate this as being a better solution than just using the old thrift interface, I'm just showing an example of how to do it. 
I personally do prefer this way as it is more predictable, but of course others will have a different opinion. -- Derek Williams
Re: cassandra still tried to join old ring nodes
but on the cassandra -f process, It keeps sending every 5 seconds Your explanation is a bit confusing. WARN 16:01:44,042 ClusterName mismatch from /123.123.123.123 big cluster!=Test Cluster The .123 machine is contacting this machine and asking to join the ring. Possibly because the 123 machine has the ip for this one as a seed, or it's in the gossip data. Can you shut down the .123 machine ? Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 14/06/2012, at 2:06 AM, Cyril Auburtin wrote: I joined a ring, then left it now my local ring is just showing my local cassndra node, like intended but on the cassandra -f process, It keeps sending every 5 seconds WARN 16:01:44,042 ClusterName mismatch from /123.123.123.123 big cluster!=Test Cluster where big cluster is the cluster name used when in the previous ring Why do I keep receiving this alerts? is it because 123.123.123.123 still tries to be in the ring? Should I run nodetool decommission (token that had my node) on that node? to stop this alert? thx for explanations
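Aaron's diagnosis above (the .123 machine gossiping with a different cluster_name) can be confirmed by comparing the cluster_name setting in each node's cassandra.yaml. A sketch; the helper name and file paths are hypothetical, and the two names come straight from the warning in this thread:

```shell
# Hypothetical helper: extract cluster_name from a cassandra.yaml.
cluster_name() {
  sed -n 's/^cluster_name:[[:space:]]*//p' "$1" | tr -d "'\""
}

# Demo with sample yaml fragments; real files would be each node's
# cassandra.yaml (location varies by install).
printf "cluster_name: 'big cluster'\n"  > /tmp/node_a.yaml
printf "cluster_name: 'Test Cluster'\n" > /tmp/node_b.yaml
a=$(cluster_name /tmp/node_a.yaml)
b=$(cluster_name /tmp/node_b.yaml)
if [ "$a" = "$b" ]; then
  echo "cluster names match: $a"
else
  echo "ClusterName mismatch: $a != $b"
fi
```

A mismatch here, plus the remote node still listing this host as a seed, is enough to produce the repeating warning; fixing the seed list (or shutting the remote node down, as Aaron suggests) stops it.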