Re: storage engine series
Great set of learning material, Jon. Thank you so much for the hard work.

Sincerely,
Ranjib

On Mon, Apr 29, 2024 at 4:24 PM Jon Haddad wrote:
> Hey everyone,
>
> I'm doing a 4-week YouTube series on the C* storage engine. My first
> video was last week, where I gave an overview of some of the storage
> engine internals [1].
>
> The next 3 weeks look at the new Trie indexes coming in 5.0 [2],
> running Cassandra on EBS [3], and finally some potential
> optimizations [4] that could be done to improve things even further in the
> future.
>
> I hope these videos are useful to the community, and I welcome feedback!
>
> Jon
>
> [1] https://www.youtube.com/live/yj0NQw9DgcE?si=ra1zqusMdSs6vl4T
> [2] https://www.youtube.com/live/ZdzwtH0cJDE?si=CumcPny2UG8zwtsw
> [3] https://www.youtube.com/live/kcq1TC407U4?si=pZ8AkXkMzIylQgB6
> [4] https://www.youtube.com/live/yj0NQw9DgcE?si=ra1zqusMdSs6vl4T
Re: Open source equivalents of OpsCenter
We use Datadog (metrics emitted as raw statsd) for the dashboard. All repair & compaction is done via blender & serf [1].

[1] https://github.com/pagerduty/blender

On Wed, Jul 13, 2016 at 2:42 PM, Kevin O'Connor wrote:
> Now that OpsCenter doesn't work with open source installs, are there any
> runs at an open source equivalent? I'd be more interested in looking at
> metrics of a running cluster and doing other tasks like managing
> repairs/rolling restarts more so than historical data.
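As a minimal sketch of the "metrics emitted as raw statsd" approach above: only the statsd wire format (`name:value|type` over UDP, default port 8125) is standard here; the metric names and host are illustrative assumptions, not what PagerDuty actually used.

```python
import socket

def statsd_line(name, value, metric_type="g"):
    """Build one metric in the plain statsd wire format, e.g. 'a.b:1|g'
    ('g' = gauge, 'c' = counter, 'ms' = timer)."""
    return f"{name}:{value}|{metric_type}"

def emit(lines, host="127.0.0.1", port=8125):
    """Fire-and-forget UDP send; a statsd/datadog agent listens on 8125 by default."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        for line in lines:
            sock.sendto(line.encode("ascii"), (host, port))
    finally:
        sock.close()

# hypothetical Cassandra-derived values, e.g. scraped from JMX
metrics = [
    statsd_line("cassandra.compaction.pending", 3, "g"),
    statsd_line("cassandra.read.latency", 1.7, "ms"),
]
```

A cron job or agent sidecar could call `emit(metrics)` each interval; Datadog's agent accepts this plain statsd format (plus its own tag extensions, not shown).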
Re: Operating on large cluster
We use Chef for configuration management and blender for on-demand jobs.

https://github.com/opscode/chef
https://github.com/PagerDuty/blender

On Oct 23, 2014 2:18 AM, Alain RODRIGUEZ arodr...@gmail.com wrote:

Hi,

I was wondering how you guys handle a large cluster (50+ machines). I mean, there are times you need to change configuration (cassandra.yaml) or send a command to one, some, or all nodes (cleanup, upgradesstables, setstreamthroughput or whatever).

So far we have been using things like custom scripts for repairs or any routine maintenance, and cssh for specific, one-shot actions on the cluster. But I guess this doesn't really scale; I guess we could use pssh instead. For configuration changes we use Capistrano, which might scale properly.

So I would like to know: what are the methods that operators use on large clusters out there? Have some of you built open-sourced cluster management interfaces or scripts that could make things easier while operating large Cassandra clusters?

Alain
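The pssh-style fan-out discussed above can be sketched with a thread pool and ssh. This is a sketch under assumptions: the host names and the `nodetool cleanup` command are illustrative, and a real setup would take the node list from Chef, Capistrano, or an inventory file rather than hard-coding it.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

def ssh_cmd(host, remote_cmd):
    """Build the argv for running remote_cmd on host over ssh, pssh-style.
    BatchMode avoids hanging on a password prompt on an unreachable node."""
    return ["ssh", "-o", "BatchMode=yes", host, remote_cmd]

def run_parallel(cmds, workers=10):
    """Run each argv concurrently, returning (returncode, stdout) per command,
    in the same order as the input."""
    def run_one(argv):
        proc = subprocess.run(argv, capture_output=True, text=True)
        return proc.returncode, proc.stdout
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_one, cmds))

# usage against a hypothetical node list:
#   run_parallel([ssh_cmd(h, "nodetool cleanup") for h in ["cass-01", "cass-02"]])
```

For rolling actions (restarts, upgradesstables) you would cap `workers` at 1 or at one-per-rack instead of fanning out to everything at once.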
Re: How do you run integration tests for your cassandra code?
You can use tools like Chef alongside Vagrant to bring up a Cassandra cluster. I personally prefer LXC containers, as they mimic full-blown VMs, alongside chef-lxc, which provides Chef's awesome DSL for container customization (similar to a Dockerfile, and you won't install Chef inside the container). For our scenarios, I use 5-node clusters, and the whole spawn + token set + rebalance is done as part of the test setup. This assumes you want to run full-blown integration tests within a single host. If you can afford multiple hosts, you can set up either ephemeral hosts (chef-metal) or dedicated hosts (normal Chef nodes, but with a CI agent like TeamCity, go-cd, etc.).

So, depending on how big you want the integration environment (from a single-host developer environment up to a production clone, say for capacity-planning load testing), you can automate Cassandra cluster provisioning before you actually kick in your tests, which might be jmeter/gatling-based load testing or actual unit/functional/UAT-style testing. I use Chef, but I'm sure alternatives exist in other frameworks.

Hope this helps,
ranjib

On Mon, Oct 13, 2014 at 10:36 PM, Paco Trujillo f.truji...@genetwister.nl wrote:

Hi Kevin

We are using a similar solution to horschi's. In the past we used CassandraUnit (https://github.com/jsevellec/cassandra-unit), but truncating the tables after and before each test works better for us. We also set gc_grace_seconds to zero.

*From:* horschi [mailto:hors...@gmail.com]
*Sent:* Monday, 13 October 2014 22:17
*To:* user@cassandra.apache.org
*Subject:* Re: How do you run integration tests for your cassandra code?

Hi Kevin,

I run my tests against my locally running Cassandra instance. I am not using any framework, but simply truncate all my tables after/before each test, and with that I am quite happy. You have to enable the unsafeSystem property, disable durable writes on the CFs, and disable auto-snapshot in the yaml for it to be fast.
Kind regards,
Christian

On Mon, Oct 13, 2014 at 9:50 PM, Kevin Burton bur...@spinn3r.com wrote:

Curious to see if any of you have an elegant solution here. Right now I'm using cassandra-unit (https://github.com/jsevellec/cassandra-unit) for my integration tests. The biggest problem is that it doesn't support shutdown, so I can't stop or clean up after Cassandra between tests. I have other Java daemons that have the same problem. For example, ActiveMQ doesn't clean up after itself.

I was *thinking* of using Docker or Vagrant to start up a daemon in a container, then shut it down between tests. But this seems difficult to set up and configure, as well as not being amazingly portable.

Another solution is to use a test suite with a setUp/tearDown that drops all tables created by a test. This way you're still on the same Cassandra instance, but the tables are removed for each pass.

Anyone have an elegant solution to this?

--
Founder/CEO Spinn3r.com
Location: San Francisco, CA
blog: http://burtonator.wordpress.com
or check out my Google+ profile https://plus.google.com/102718274791889610666/posts
http://spinn3r.com
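The truncate-all-tables-between-tests pattern that horschi and Paco describe could be sketched as below. This is a sketch under assumptions: `reset_keyspace` assumes a cassandra-driver `Session` and reads the table names from its cluster's schema metadata; only the CQL-generation half runs without a live cluster.

```python
def truncate_statements(keyspace, tables):
    """CQL to clear every table between tests: TRUNCATE removes the rows but
    keeps the schema (disable auto-snapshot in cassandra.yaml so it stays fast)."""
    return [f"TRUNCATE {keyspace}.{table}" for table in sorted(tables)]

def reset_keyspace(session, keyspace):
    """Assuming a cassandra-driver Session: look up the keyspace's tables in
    the schema metadata and truncate each one. Call from setUp/tearDown."""
    tables = session.cluster.metadata.keyspaces[keyspace].tables
    for stmt in truncate_statements(keyspace, tables):
        session.execute(stmt)
```

Wired into a pytest fixture or JUnit-style setUp, this keeps one long-lived local Cassandra instance while each test starts from empty tables, which is the trade-off both replies converged on.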
Re: Exploring Simply Queueing
I want to answer the first question: why might one use Cassandra as a queuing solution?

- It's the only open-source distributed persistence layer (i.e. no SPOF) that you can run over WAN, and it provides LAN/WAN-specific quorum controls.

I know it's suboptimal, as the deletion imposes additional compaction/repair penalties, but there is no other solution I am aware of. Most AMQP solutions are broker-based and clustering is a pain, while things like Riak only support WAN-based clusters in their commercial offering. I would love to know about other alternatives. And thanks for sharing the Ruby-based priority queue prototype; it helps people like me (sysadmins :-) ) explore these concepts better.

cheers
ranjib

On Mon, Oct 6, 2014 at 1:35 PM, Jan Algermissen jan.algermis...@nordsc.com wrote:

Shane,

On 06 Oct 2014, at 16:34, Shane Hansen shanemhan...@gmail.com wrote:

Sorry if I'm hijacking the conversation, but why in the world would you want to implement a queue on top of Cassandra? It seems like using a proper queuing service would make your life a lot easier.

Agreed - however, the use case simply does not justify the additional operations.

That being said, there might be a better way to play to the strengths of C*. Ideally everything you do is append-only, with few deletes or updates. So an interesting way to implement a queue might be to do one insert to put the job in the queue and another insert to mark the job as done or in process or whatever. This would also give you the benefit of being able to replay the state of the queue.

Thanks, I'll try that, too.

Jan

On Mon, Oct 6, 2014 at 12:57 AM, Jan Algermissen jan.algermis...@nordsc.com wrote:

Chris,

thanks for taking a look.

On 06 Oct 2014, at 04:44, Chris Lohfink clohf...@blackbirdit.com wrote:

It appears you are aware of the tombstone effect that leads people to label this an anti-pattern. Without a due or other time-based value being part of the partition key, you will still get a lot of buildup.
You only have 1 partition per shard, which just linearly decreases the tombstones. That isn't likely to be enough to really help in a situation of high queue throughput, especially with the default of 4 shards.

Yes, dealing with the tombstone effect is the whole point. The workloads I have to deal with are not really high-throughput; it is unlikely we'll ever reach multiple messages per second. The emphasis is also more on coordinating producer and consumer than on high-volume capacity problems.

Your comment seems to suggest including larger time frames (e.g. the due-hour) in the partition keys and using the current time to select the active partitions (e.g. the shards of the hour). Once an hour has passed, the corresponding shards will never be touched again. Am I understanding this correctly?

You may want to consider switching to LCS from the default STCS, since you are re-writing the same partitions a lot. It will still use STCS in L0, so in high write/delete scenarios with low enough gc_grace, when it never gets higher than L1, write throughput will be sameish. In scenarios where you get more data, LCS will shine, I suspect, by reducing the number of obsolete tombstones. It would be hard to identify the difference in small tests, I think.

Thanks, I'll try to explore the various effects.

What's the plan to prevent two consumers from reading the same message off of a queue? You mention in the docs that you will address it at a later point in time, but it's kind of a biggy. Big-lock batch reads like the Astyanax recipe?

I have included a static column per shard to act as a lock (the 'lock' column in the examples) in combination with conditional updates. I must admit, I have not quite understood what Netflix is doing in terms of coordination - but since performance isn't our concern, CAS should do fine, I guess(?)

Thanks again,

Jan

---
Chris Lohfink

On Oct 5, 2014, at 6:03 PM, Jan Algermissen jan.algermis...@nordsc.com wrote:

Hi,

I have put together some thoughts on realizing simple queues with Cassandra.
https://github.com/algermissen/cassandra-ruby-queue

The design is inspired by (the much more sophisticated) Netflix approach [1], but very reduced. Given that I am still a C* newbie, I'd be very glad to hear some thoughts on the design path I took.

Jan

[1] https://github.com/Netflix/astyanax/wiki/Message-Queue
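Chris's suggestion of putting a time bucket (e.g. the due-hour) plus a shard index into the partition key can be sketched as below. This is an illustrative sketch, not code from the linked repo: the bucket format, the shard count, and the function names are assumptions; only the idea (closed hour buckets are never written again, so their tombstones stop mattering) comes from the thread.

```python
import datetime
import random

NUM_SHARDS = 4  # matches the "default of 4 shards" mentioned in the thread

def queue_partition(due, shard=None):
    """Partition key (hour bucket, shard) for a time-bucketed, sharded queue.
    Once an hour has passed, its partitions are never written again, so their
    tombstones age out without sitting in front of live messages."""
    if shard is None:
        shard = random.randrange(NUM_SHARDS)  # producers spread load across shards
    return due.strftime("%Y%m%d%H"), shard

def active_partitions(now):
    """All partitions a consumer polls for messages due in the current hour."""
    bucket = now.strftime("%Y%m%d%H")
    return [(bucket, s) for s in range(NUM_SHARDS)]
```

A consumer would SELECT from each of `active_partitions(now)`, and the static lock column plus conditional update (CAS) that Jan describes would guard each shard while it is drained.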
Re: Monitoring with Cacti
I use nagios + NRPE + some custom scripts to monitor our Cassandra/Hadoop nodes. Given our long-time comfort with nagios, I didn't find any major gotchas.

regards
ranjib

On Mon, Sep 13, 2010 at 2:10 AM, Aaron Morton aa...@thelastpickle.com wrote:

This is my first encounter with Cacti, and it feels a lot like having a cactus violently inserted in me :) Hopefully this week I can get back to it with a clearer head; part of my annoyance was probably trying to rush it through on a Friday, plus its somewhat taxing configuration. Over the weekend I was thinking about going with some Python (our in-house favorite) in front of the jmxterm jar. I'll also try to learn a bit more about Cacti; it cannot be as hard as it seemed on Friday. I'll email you off the list this week if I make some progress.

Aaron

On 11 Sep 2010, at 03:31 PM, Edward Capriolo edlinuxg...@gmail.com wrote:

On Fri, Sep 10, 2010 at 7:29 PM, aaron morton aa...@thelastpickle.com wrote:

I am going through the rather painful process of trying to monitor Cassandra using Cacti (it's what we use at work). At the moment it feels like a losing battle :) Does anyone know of some Cacti resources for monitoring the JVM or Cassandra metrics other than...

mysql-cacti-templates http://code.google.com/p/mysql-cacti-templates/ - provides templates and data sources that require ssh and can monitor JVM heap and a few things.

Cassandra-cacti-m6 http://www.jointhegrid.com/cassandra/cassandra-cacti-m6.jsp - coded for version 0.6*; I have made some changes to stop it looking for stats that no longer exist. Missing some metrics, I think, but it's probably the best bet so far. If I get it working I'll contribute it back to them. Most of the problems were probably down to how much effort it takes to set up Cacti.

jmxterm http://www.cyclopsgroup.org/projects/jmxterm/ - allows for command-line access to JMX. I started down the path of writing a Cacti data source to use this just to see how it worked. Looks like a lot of work.
Thanks for any advice.
Aaron

Setting up Cacti is easy - the second time, and the third time :)

As for cassandra-cacti-m6 (I am the author): unfortunately, I have been fighting the JMX switcharoo battle for about 3 years now (hadoop/hbase/cassandra/hornetq/vserver). In a nutshell, there is ALWAYS work involved. First, because, as you noticed, attributes get changed/removed/added/renamed. Second, it takes a human to logically group things together. For example, if you have two items, cache hits and cache misses, you really do not want two separate graphs that will scale independently. You want one slick stacked graph, with nice colors, and you want a CDEF to calculate the cache hit percentage by dividing one into the other and show that at the bottom.

If you want to make a 0.7 branch of cassandra-cacti-m6, I would love the help. We are not on 0.7 yet, so I have not had the time to go out and make graphs for a version we are not using yet :) but if you come up with patches, they are happily accepted.

Edward
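The cache-hit-percentage CDEF Edward describes is just hits divided by total lookups. A sketch of the same arithmetic in Python (Cacti itself would express this as an RPN CDEF over the two data sources, not Python):

```python
def cache_hit_pct(hits, misses):
    """Cache hits as a percentage of total lookups, the value the CDEF would
    show below the stacked hits/misses graph; 0 when there is no traffic yet."""
    total = hits + misses
    return 0.0 if total == 0 else 100.0 * hits / total
```

Guarding the zero-traffic case matters in Cacti too, since a naive divide produces NaN gaps on an idle cache.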