Re: C* 2.1.2 invokes oom-killer
In all tables the SSTable count is below 30.

On Thu, Feb 19, 2015 at 9:43 AM, Carlos Rolo r...@pythian.com wrote:

Can you check how many SSTables you have? It is a more or less known fact that 2.1.2 has lots of problems with compaction, so an upgrade can solve it. But a high number of SSTables would confirm that compaction is indeed your problem and not something else.

Regards,
Carlos Juzarte Rolo
Cassandra Consultant
Pythian - Love your data
rolo@pythian | Twitter: cjrolo | Linkedin: linkedin.com/in/carlosjuzarterolo
Tel: 1649
www.pythian.com

On Thu, Feb 19, 2015 at 9:16 AM, Michał Łowicki mlowi...@gmail.com wrote:

We don't have other things running on these boxes and C* is consuming all the memory. Will try to upgrade to 2.1.3 and if that won't help, downgrade to 2.1.2.

— Michał

On Thu, Feb 19, 2015 at 2:39 AM, Jacob Rhoden jacob.rho...@me.com wrote:

Are you tweaking the nice priority on Cassandra? (Type: man nice if you don't know much about it.) Certainly improving Cassandra's nice score becomes important when you have other things running on the server, like scheduled jobs or people logging in to the server and doing things.

Sent from iPhone

On 19 Feb 2015, at 5:28 am, Michał Łowicki mlowi...@gmail.com wrote:

Hi,

A couple of times a day 2 out of 4 nodes in the cluster are killed:

root@db4:~# dmesg | grep -i oom
[4811135.792657] [ pid ] uid tgid total_vm rss cpu oom_adj oom_score_adj name
[6559049.307293] java invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0

Nodes are using an 8GB heap (confirmed with nodetool info) and aren't using the row cache. I noticed that a couple of times a day used RSS grows really fast within a couple of minutes, and I see CPU spikes at the same time - https://www.dropbox.com/s/khco2kdp4qdzjit/Screenshot%202015-02-18%2015.10.54.png?dl=0. It could be related to compaction, but after compaction finishes the used RSS doesn't shrink.

Output from pmap when the C* process uses 50GB RAM (out of 64GB) is available at http://paste.ofcode.org/ZjLUA2dYVuKvJHAk9T3Hjb. At the time the dump was made, heap usage was far below 8GB (~3GB) but total RSS was ~50GB. Any help will be appreciated.

--
BR,
Michał Łowicki
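The checks discussed in this thread can be tied together in a few commands. This is a hedged sketch; the keyspace name is an example, and the live commands (which need a running node) are shown as comments so the fragment is runnable anywhere:

```shell
# Hedged sketch of the diagnostics discussed above.
#
#   dmesg | grep -i oom                           # did the kernel oom-killer fire?
#   nodetool info | grep -i heap                  # JVM heap usage vs. configured max
#   ps -o rss= -p "$(pgrep -f CassandraDaemon)"   # total process RSS in KB
#   nodetool cfstats my_keyspace | grep -E 'Table:|SSTable count'   # per-table SSTable counts
#
# The dmesg check, demonstrated on a captured log line (prints 1):
echo '[6559049.307293] java invoked oom-killer: gfp_mask=0x201da, order=0' \
  | grep -ci 'oom-killer'
```

Comparing the `nodetool info` heap number against the `ps` RSS number is what separates a heap problem from the off-heap/native growth described above.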
Re: C* 2.1.2 invokes oom-killer
On Thu, Feb 19, 2015 at 10:41 AM, Carlos Rolo r...@pythian.com wrote:

> So compaction doesn't seem to be your problem (you can check with nodetool compactionstats just to be sure).

pending tasks: 0

> How much is your write latency on your column families? I had OOMs related to this before, and there was a tipping point around 70ms.

Write request latency is below 0.05 ms/op (avg). Checked with OpsCenter.

--
BR,
Michał Łowicki
Re: Many pending compactions
Hi,

2.1.3 is now the official latest release - I checked this morning and got this good surprise. Now it's update time - thanks to all the guys involved; if I meet anyone, one beer from me :-) The changelist is rather long: https://git1-us-west.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/cassandra-2.1.3

Hopefully that will solve many of those oddities and not invent too many new ones :-)

Cheers,
Roland
Re: Cancel subscription
Please use user-unsubscr...@cassandra.apache.org to unsubscribe from this mailing list.

Thanks.

Regards,
Mark

On 19 February 2015 at 09:14, Hilary Albutt - CEO hil...@incrediblesoftwaresolutions.com wrote:

Cancel subscription
Re: C* 2.1.2 invokes oom-killer
So compaction doesn't seem to be your problem (you can check with nodetool compactionstats just to be sure).

How much is your write latency on your column families? I had OOMs related to this before, and there was a tipping point around 70ms.
Re: C* 2.1.2 invokes oom-killer
Do you have trickle_fsync enabled? Try to enable that and see if it solves your problem, since you are running out of non-heap memory. Another question: is it always the same nodes that die, or just any 2 out of the 4?

Regards,
Carlos Juzarte Rolo
Cassandra Consultant
Pythian - Love your data
rolo@pythian | Twitter: cjrolo | Linkedin: linkedin.com/in/carlosjuzarterolo
Tel: 1649
www.pythian.com

On Thu, Feb 19, 2015 at 10:49 AM, Michał Łowicki mlowi...@gmail.com wrote:

On Thu, Feb 19, 2015 at 10:41 AM, Carlos Rolo r...@pythian.com wrote:

> So compaction doesn't seem to be your problem (you can check with nodetool compactionstats just to be sure).

pending tasks: 0

> How much is your write latency on your column families? I had OOMs related to this before, and there was a tipping point around 70ms.

Write request latency is below 0.05 ms/op (avg). Checked with OpsCenter.

--
BR,
Michał Łowicki
Re: can't delete tmp file
Thank you Roland

--
曹志富
手机:18611121927
邮箱:caozf.zh...@gmail.com
微博:http://weibo.com/boliza/

2015-02-19 20:32 GMT+08:00 Roland Etzenhammer r.etzenham...@t-online.de:

Hi,

try 2.1.3 - with 2.1.2 this is normal. From the changelog:

* Make sure we don't add tmplink files to the compaction strategy (CASSANDRA-8580)
* Remove tmplink files for offline compactions (CASSANDRA-8321)

In most cases they are safe to delete; I did this when the node was down.

Cheers,
Roland
Re: C* 2.1.2 invokes oom-killer
Then you are probably hitting a bug... trying to find it in Jira. The bad news is that the fix is only to be released in 2.1.4. Once I find it I will post it here.

Regards,
Carlos Juzarte Rolo
Cassandra Consultant
Pythian - Love your data
rolo@pythian | Twitter: cjrolo | Linkedin: linkedin.com/in/carlosjuzarterolo
Tel: 1649
www.pythian.com

On Thu, Feb 19, 2015 at 12:16 PM, Michał Łowicki mlowi...@gmail.com wrote:

trickle_fsync has been enabled for a long time in our settings (just noticed):

trickle_fsync: true
trickle_fsync_interval_in_kb: 10240

On Thu, Feb 19, 2015 at 12:12 PM, Michał Łowicki mlowi...@gmail.com wrote:

On Thu, Feb 19, 2015 at 11:02 AM, Carlos Rolo r...@pythian.com wrote:

> Do you have trickle_fsync enabled? Try to enable that and see if it solves your problem, since you are running out of non-heap memory. Another question: is it always the same nodes that die, or just any 2 out of the 4?

Always the same nodes. Upgraded to 2.1.3 two hours ago, so we'll monitor whether the issue has been fixed there. If not, will try to enable trickle_fsync.

--
BR,
Michał Łowicki
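For reference, the cassandra.yaml fragment being discussed (the values are the ones Michał quotes; whether trickle_fsync helps depends on the workload and disk type):

```yaml
# cassandra.yaml - fsync data in small increments while writing, instead of
# letting the OS accumulate dirty pages and flush them in one large burst
# (mainly useful on SSDs; the interval below is the one quoted in the thread).
trickle_fsync: true
trickle_fsync_interval_in_kb: 10240
```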
Node joining take a long time
Hi guys:

I have a 20 node C* cluster with vnodes, version 2.1.2. When I add a node to the cluster, it takes a long time, and on some existing nodes nodetool netstats shows this:

Mode: NORMAL
Unbootstrap cfe03590-b02a-11e4-95c5-b5f6ad9c7711
/172.19.105.49 Receiving 68 files, 23309801005 bytes total

I want to know: is there some problem with my cluster?

--
曹志富
手机:18611121927
邮箱:caozf.zh...@gmail.com
微博:http://weibo.com/boliza/
Re: can't delete tmp file
Hi,

try 2.1.3 - with 2.1.2 this is normal. From the changelog:

* Make sure we don't add tmplink files to the compaction strategy (CASSANDRA-8580)
* Remove tmplink files for offline compactions (CASSANDRA-8321)

In most cases they are safe to delete; I did this when the node was down.

Cheers,
Roland
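A hedged sketch of how one might locate the leftover tmplink files before deleting them (with the node down, as Roland says). The real data directory is typically /var/lib/cassandra/data; here a mock directory and an example 2.1-style file name are used so the commands are runnable anywhere:

```shell
# Simulate a data directory containing a leftover tmplink file
# (the file name is an illustrative 2.1-era example).
datadir=$(mktemp -d)
mkdir -p "$datadir/ks1/table1"
touch "$datadir/ks1/table1/ks1-table1-tmplink-ka-12-Data.db"
touch "$datadir/ks1/table1/ks1-table1-ka-12-Data.db"

# List candidate leftovers first; only delete after the node is stopped.
find "$datadir" -name '*tmplink*' -type f

# Count them before acting.
find "$datadir" -name '*tmplink*' -type f | wc -l

rm -rf "$datadir"
```

The same `find` against the real data directory, followed by a careful `rm` of the listed files, matches the "safe to delete while the node is down" advice above.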
Re: can't delete tmp file
Just upgrade my cluster to 2.1.3???

--
曹志富
手机:18611121927
邮箱:caozf.zh...@gmail.com
微博:http://weibo.com/boliza/

2015-02-19 20:32 GMT+08:00 Roland Etzenhammer r.etzenham...@t-online.de:

Hi,

try 2.1.3 - with 2.1.2 this is normal. From the changelog:

* Make sure we don't add tmplink files to the compaction strategy (CASSANDRA-8580)
* Remove tmplink files for offline compactions (CASSANDRA-8321)

In most cases they are safe to delete; I did this when the node was down.

Cheers,
Roland
Re: can't delete tmp file
You should upgrade to 2.1.3 for sure. Check the changelog here: https://git1-us-west.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/cassandra-2.1.3

Regards,
Carlos Juzarte Rolo
Cassandra Consultant
Pythian - Love your data
rolo@pythian | Twitter: cjrolo | Linkedin: linkedin.com/in/carlosjuzarterolo
Tel: 1649
www.pythian.com

On Thu, Feb 19, 2015 at 1:44 PM, 曹志富 cao.zh...@gmail.com wrote:

Thank you Roland

--
曹志富
手机:18611121927
邮箱:caozf.zh...@gmail.com
微博:http://weibo.com/boliza/

2015-02-19 20:32 GMT+08:00 Roland Etzenhammer r.etzenham...@t-online.de:

Hi,

try 2.1.3 - with 2.1.2 this is normal. From the changelog:

* Make sure we don't add tmplink files to the compaction strategy (CASSANDRA-8580)
* Remove tmplink files for offline compactions (CASSANDRA-8321)

In most cases they are safe to delete; I did this when the node was down.

Cheers,
Roland
Re: Data tiered compaction and data model question
What's the typical size of the data field? Unless it's very large, I don't think table 2 is a very wide row (10 x 20 x 60 x 24 = 288,000 events/partition at worst). Plus you only need to store 30 days of data. The overall data size is 288,000 x 30 = 8,640,000 events. I am not even sure you need C*, depending on event size.

On Thu, Feb 19, 2015 at 12:00 AM, cass savy casss...@gmail.com wrote:

10-20 per minute is the average. Worst case can be 10x the average.

On Wed, Feb 18, 2015 at 4:49 PM, Mohammed Guller moham...@glassbeam.com wrote:

What is the maximum number of events that you expect in a day? What is the worst-case scenario?

Mohammed

From: cass savy [mailto:casss...@gmail.com]
Sent: Wednesday, February 18, 2015 4:21 PM
To: user@cassandra.apache.org
Subject: Data tiered compaction and data model question

We want to track events in a log CF/table and should be able to query for events that occurred in a range of minutes or hours for a given day. Multiple events can occur in a given minute. I've listed 2 table designs and am leaning towards table 1 to avoid a large wide row. Please advise.

Table 1: not a very wide row; still able to query for a range of minutes for a given day, and/or a given day and range of hours:

Create table log_Event (
  event_day text,
  event_hr int,
  event_time timeuuid,
  data text,
  PRIMARY KEY ((event_day, event_hr), event_time)
)

Table 2: this will be a very wide row:

Create table log_Event (
  event_day text,
  event_time timeuuid,
  data text,
  PRIMARY KEY (event_day, event_time)
)

Date-tiered compaction: recommended for time series data as per the doc below. Our data will be kept only for 30 days, hence the thought of using this compaction strategy: http://www.datastax.com/dev/blog/datetieredcompactionstrategy

I created table 1 listed above with this compaction strategy, added some rows and did a manual flush. I do not see any sstables created yet. Is that expected?

compaction={'max_sstable_age_days': '1', 'class': 'DateTieredCompactionStrategy'}
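For reference, the compaction options quoted in the original post are normally attached to the table like this (a sketch based on the table 1 definition above; option names follow the DataStax post that is linked):

```cql
CREATE TABLE log_Event (
    event_day  text,
    event_hr   int,
    event_time timeuuid,
    data       text,
    PRIMARY KEY ((event_day, event_hr), event_time)
) WITH compaction = {
    'class': 'DateTieredCompactionStrategy',
    'max_sstable_age_days': '1'
};
```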
Re: Node joining take a long time
What is a long time in your scenario? What is the data size in your cluster?

I'm sure Rob will be along shortly to say that 2.1.2 is, in his opinion, broken for production use... an opinion I'd agree with. So bear that in mind if you are running a production cluster.

Regards,
Mark

On 19 February 2015 at 12:19, 曹志富 cao.zh...@gmail.com wrote:

Hi guys:

I have a 20 node C* cluster with vnodes, version 2.1.2. When I add a node to the cluster, it takes a long time, and on some existing nodes nodetool netstats shows this:

Mode: NORMAL
Unbootstrap cfe03590-b02a-11e4-95c5-b5f6ad9c7711
/172.19.105.49 Receiving 68 files, 23309801005 bytes total

I want to know: is there some problem with my cluster?

--
曹志富
手机:18611121927
邮箱:caozf.zh...@gmail.com
微博:http://weibo.com/boliza/
Re: run cassandra on a small instance
> What does your schema look like, your total data size and your read/write patterns? Maybe you are simply doing a heavier workload than a small instance can handle.

Hi Mark,

OK, well, as mentioned this is all test data with almost literally no workload, so I doubt it's the data and/or workload that's causing it to crash on the 2GB instance after 5 hours. But when I describe the schema with my test data this is what I see:

cqlsh> use joke_fire1 ;
cqlsh:joke_fire1> describe schema;

CREATE KEYSPACE joke_fire1 WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'} AND durable_writes = true;

'module' object has no attribute 'UserTypesMeta'

If I take a look at the size of the total amount of data, this is what I see:

[root@beta-new:/etc/alternatives/cassandrahome/data] #du -hs data
17M data

which includes the system keyspace. But the test data that I created for my use is only 15MB:

[root@beta-new:/etc/alternatives/cassandrahome/data/data] #du -hs joke_fire1/
15M joke_fire1/

But just to see if it's my data that could be causing the problem, I tried removing it all and setting the IP of the 2GB instance itself as the seed node. I'll try running that for a while and see if it crashes.

Also, I tried just installing a plain Cassandra 2.1.3 onto a plain CentOS 6.6 instance on the AWS free tier. It's a t2.micro instance. So far it's running. I'll keep an eye on both.

At this point I'm thinking that there might be something about my data that could be causing it to fail after 5 or so hours. However, I might need some help diagnosing the data, as I'm not familiar with how to do that with Cassandra.

Thanks!
Tim

On Thu, Feb 19, 2015 at 3:51 AM, Mark Reddy mark.l.re...@gmail.com wrote:

What does your schema look like, your total data size and your read/write patterns? Maybe you are simply doing a heavier workload than a small instance can handle.
Regards,
Mark

On 19 February 2015 at 08:40, Carlos Rolo r...@pythian.com wrote:

I have Cassandra instances running on VMs with smaller RAM (1GB even) and I don't go OOM when testing them. Although I use them in AWS and other providers, never tried Digital Ocean. Does Cassandra just fail after some time running, or is it failing on some specific read/write?

Regards,
Carlos Juzarte Rolo
Cassandra Consultant
Pythian - Love your data
rolo@pythian | Twitter: cjrolo | Linkedin: linkedin.com/in/carlosjuzarterolo
Tel: 1649
www.pythian.com

On Thu, Feb 19, 2015 at 7:16 AM, Tim Dunphy bluethu...@gmail.com wrote:

Hey guys,

After the upgrade to 2.1.3, and after almost exactly 5 hours running, Cassandra did indeed crash again on the 2GB RAM VM. This is how the memory on the VM looked after the crash:

[root@web2:~] #free -m
             total       used       free     shared    buffers     cached
Mem:          2002       1227        774          8         45        386
-/+ buffers/cache:        794       1207
Swap:            0          0          0

And that's with this set in the cassandra-env.sh file:

MAX_HEAP_SIZE=800M
HEAP_NEWSIZE=200M

So I'm thinking now, do I just have to abandon this idea I have of running Cassandra on a 2GB instance? Or is this something we can all agree can be done? And if so, how can we do that? :)

Thanks
Tim

On Wed, Feb 18, 2015 at 8:39 PM, Jason Kushmaul | WDA jason.kushm...@wda.com wrote:

I asked this previously when a similar message came through, with a similar response. planetcassandra seems to have it "right", in that stable=2.0, development=2.1, whereas the apache site says stable is 2.1. "Right" in that they assume the latest minor version is development. Why not have the apache site do the same? That's just my lowly non-contributing opinion though.
Jason

From: Andrew [mailto:redmu...@gmail.com]
Sent: Wednesday, February 18, 2015 8:26 PM
To: Robert Coli; user@cassandra.apache.org
Subject: Re: run cassandra on a small instance

Robert,

Let me know if I'm off base about this, but I feel like I see a lot of posts that are like this (i.e., use this arbitrary version, not this other arbitrary version). Why are releases going out if they're "broken"? This seems like a very confusing way for new (and existing) users to approach versions...

Andrew

On February 18, 2015 at 5:16:27 PM, Robert Coli (rc...@eventbrite.com) wrote:

On Wed, Feb 18, 2015 at 5:09 PM, Tim Dunphy bluethu...@gmail.com wrote:

> I'm attempting to run Cassandra 2.1.2 on a smallish 2GB RAM instance over at Digital Ocean. It's a CentOS 7 host.

2.1.2 is IMO broken and should not be used for any purpose. Use 2.1.1 or 2.1.3. https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/

=Rob

--
GPG me!! gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B
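For a box this small the usual levers are the cassandra-env.sh values Tim already quotes. A hedged sketch of overrides one might experiment with on a ~2GB host (the values are guesses to try, not a recommendation, and smaller than the 800M/200M that crashed above):

```shell
# Fragment for cassandra-env.sh on a ~2GB host (experimental values).
MAX_HEAP_SIZE="512M"          # leave more headroom for off-heap and the OS
HEAP_NEWSIZE="128M"           # commonly kept around 1/4 of MAX_HEAP_SIZE
JVM_OPTS="$JVM_OPTS -Xss256k" # modest per-thread stacks

# Sanity check: print what will be passed to the JVM.
echo "heap=${MAX_HEAP_SIZE} new=${HEAP_NEWSIZE}"
```

Other knobs often trimmed alongside the heap (concurrent_reads/writes, memtable sizes in cassandra.yaml) are workload-dependent, so they are left out here.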
Re: run cassandra on a small instance
On Wed, Feb 18, 2015 at 5:26 PM, Andrew redmu...@gmail.com wrote:

> Let me know if I'm off base about this, but I feel like I see a lot of posts that are like this (i.e., use this arbitrary version, not this other arbitrary version). Why are releases going out if they're "broken"? This seems like a very confusing way for new (and existing) users to approach versions...

In my opinion, and in no way speaking for or representing Apache Cassandra, Datastax, or anyone else:

I think it's a problem of messaging, and a mismatch of expectations between the development team and operators. I think the stable versions are stable by the dev team's standards, and not by operators' standards. While testing has historically been IMO insufficient for a data-store (where correctness really matters), there are also various issues which probably can not realistically be detected in testing. Of course, operators need to be willing to operate (ideally in non-production) near the cutting edge in order to assist in the detection and resolution of these bugs, but I think the project does itself a disservice by encouraging noobs to run these versions. You only get one chance to make a first impression, as the saying goes.

My ideal messaging would probably say something like: versions near the cutting edge should be treated cautiously; conservative operators should run mature point releases in production and only upgrade to near the cutting edge after extended burn-in in dev/QA/stage environments.

A fair response to this critique is that operators should know better than to trust that x.y.0-5 release versions of any open source software are likely to be production ready, even if the website says "stable" next to the download. Trust, but verify?

=Rob
RE: Data tiered compaction and data model question
Reading 288,000 rows from a partition may cause problems. It is recommended not to read more than 100k rows in a partition (although paging may help), so Table 2 may cause issues.

I agree with Kai that you may not even need C* for this use-case. C* is ideal for data with the 3 Vs: volume, velocity and variety. It doesn't look like your data has the volume or velocity that a standard RDBMS cannot handle.

Mohammed

From: Kai Wang [mailto:dep...@gmail.com]
Sent: Thursday, February 19, 2015 6:06 AM
To: user@cassandra.apache.org
Subject: Re: Data tiered compaction and data model question

What's the typical size of the data field? Unless it's very large, I don't think table 2 is a very wide row (10 x 20 x 60 x 24 = 288,000 events/partition at worst). Plus you only need to store 30 days of data. The overall data size is 288,000 x 30 = 8,640,000 events. I am not even sure you need C*, depending on event size.

On Thu, Feb 19, 2015 at 12:00 AM, cass savy casss...@gmail.com wrote:

10-20 per minute is the average. Worst case can be 10x the average.

On Wed, Feb 18, 2015 at 4:49 PM, Mohammed Guller moham...@glassbeam.com wrote:

What is the maximum number of events that you expect in a day? What is the worst-case scenario?

Mohammed

From: cass savy [mailto:casss...@gmail.com]
Sent: Wednesday, February 18, 2015 4:21 PM
To: user@cassandra.apache.org
Subject: Data tiered compaction and data model question

We want to track events in a log CF/table and should be able to query for events that occurred in a range of minutes or hours for a given day. Multiple events can occur in a given minute. I've listed 2 table designs and am leaning towards table 1 to avoid a large wide row. Please advise.

Table 1: not a very wide row; still able to query for a range of minutes for a given day, and/or a given day and range of hours:

Create table log_Event (
  event_day text,
  event_hr int,
  event_time timeuuid,
  data text,
  PRIMARY KEY ((event_day, event_hr), event_time)
)

Table 2: this will be a very wide row:

Create table log_Event (
  event_day text,
  event_time timeuuid,
  data text,
  PRIMARY KEY (event_day, event_time)
)

Date-tiered compaction: recommended for time series data as per the doc below. Our data will be kept only for 30 days, hence the thought of using this compaction strategy: http://www.datastax.com/dev/blog/datetieredcompactionstrategy

I created table 1 listed above with this compaction strategy, added some rows and did a manual flush. I do not see any sstables created yet. Is that expected?

compaction={'max_sstable_age_days': '1', 'class': 'DateTieredCompactionStrategy'}
designing table
I am trying to design a table in Cassandra in which I will have multiple JSON strings for a particular client id:

abc123 - jsonA
abc123 - jsonB
abcd12345 - jsonC

My query patterns are going to be:

- Give me all JSON strings for a particular client id.
- Give me all the client ids and JSON strings for a particular date.

What is the best way to design a table for this?
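One common way to serve both queries is one table per query pattern, with the application writing each JSON to both. A sketch (the column names and the text day format are my guesses, not from the post):

```cql
-- Query 1: all JSON strings for a particular client id.
CREATE TABLE json_by_client (
    client_id  text,
    created_at timeuuid,     -- orders multiple JSONs per client
    json       text,
    PRIMARY KEY (client_id, created_at)
);

-- Query 2: all client ids and JSON strings for a particular date.
-- The day is the partition key, so one day's rows live together.
CREATE TABLE json_by_day (
    day        text,         -- e.g. '2015-02-19'
    client_id  text,
    created_at timeuuid,
    json       text,
    PRIMARY KEY (day, client_id, created_at)
);
```

Since Cassandra favors modeling tables per query, the duplication between the two tables is the usual trade-off; note the by-day partition grows with daily volume, so a busy system may need a bucketing component in the partition key.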
Re: [ANNOUNCE] Apache Gora 0.6 Released
Congrats!

On Feb 20, 2015 2:59 AM, Lewis John Mcgibbney lewis.mcgibb...@gmail.com wrote:

Hi Folks,

The Apache Gora team are pleased to announce the immediate availability of Apache Gora 0.6. This release addresses a modest 47 issues http://s.apache.org/gora-0.6, with some being major improvements, new functionality and dependency upgrades. Most notably the release involves key upgrades to Hadoop, HBase and Solr dependencies, as well as some extremely important bug fixes for the MongoDB module. Suggested Gora database support is as follows:

- Apache Avro 1.7.6
- Apache Hadoop 1.2.1 and 2.5.2
- Apache HBase 0.98.8-hadoop2
- Apache Cassandra 2.0.2
- Apache Solr 4.10.3
- MongoDB 2.6.X
- Apache Accumulo 1.5.1

Gora is released as both source code, downloads for which can be found at our downloads page http://gora.apache.org/downloads.html, as well as Maven artifacts which can be found on Maven central http://search.maven.org/#search%7Cga%7C1%7Cgora.

Thank you
Lewis
(on behalf of Gora PMC)

--
Lewis
Why no virtual nodes for Cassandra on EC2?
Hi all,

The guide for installing Cassandra on EC2 says: "Note: The DataStax AMI does not install DataStax Enterprise nodes with virtual nodes enabled." http://www.datastax.com/documentation/datastax_enterprise/4.6/datastax_enterprise/install/installAMI.html

Just curious why this is the case. It was my understanding that virtual nodes make taking Cassandra nodes on and offline an easier process, and that seems like something that an EC2 user would want to do quite frequently.

-Clint
Re: Data tiered compaction and data model question
Hi Cass,

just a hint from the off - if I got it right you have:

Table 1: PRIMARY KEY ((event_day, event_hr), event_time)
Table 2: PRIMARY KEY (event_day, event_time)

Assuming your events to write come in by wall clock time, the first table design will have a hotspot on a specific node getting all writes for a single hour, as (event_day, event_hr) is the partitioning key. The second table design will put this hotspot on a specific node per day, as event_day is the partitioning key. So please be careful if you have a write-intensive workload.

I have designed my logging tables with a non-datetime key in my partitioning key to distribute writes to all nodes at a specific point in time. I have for example PRIMARY KEY ((sensor_id, measure_date)) and the timestamp-value pairs in the rows. They are quite wide as I have about 1 measurements per sensor and id, but analytics and cleanup jobs run daily.

Of course, as a not so long time Cassandra user, I can be wrong; please feel free to correct me.

Cheers,
Roland
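Roland's point about wall-clock partition keys can also be addressed by adding an artificial bucket to the partition key, so one day's writes spread across several nodes. A sketch (the bucket count of 8 and the column names are arbitrary examples, not from the thread):

```cql
CREATE TABLE log_event_bucketed (
    event_day  text,
    bucket     int,          -- e.g. chosen by the writer as hash(event id) % 8
    event_time timeuuid,
    data       text,
    PRIMARY KEY ((event_day, bucket), event_time)
);
-- Writes for one day now spread over 8 partitions; a read for a day
-- queries all 8 buckets and merges the results client-side.
```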
Re: Node joining take a long time
So, what can I do??? Wait for 2.1.4 or upgrade to 2.1.3??

--
曹志富
手机:18611121927
邮箱:caozf.zh...@gmail.com
微博:http://weibo.com/boliza/

2015-02-20 3:16 GMT+08:00 Robert Coli rc...@eventbrite.com:

On Thu, Feb 19, 2015 at 7:34 AM, Mark Reddy mark.l.re...@gmail.com wrote:

> I'm sure Rob will be along shortly to say that 2.1.2 is, in his opinion, broken for production use... an opinion I'd agree with. So bear that in mind if you are running a production cluster.

If you speak of the devil, he will appear. But yes, really, run 2.1.1 or 2.1.3; 2.1.2 is a bummer. Don't take the brown 2.1.2.

This commentary is likely unrelated to the problem the OP is having, which I would need the information Mark asked for to comment on. :)

=Rob
Re: run cassandra on a small instance
> I have Cassandra instances running on VMs with smaller RAM (1GB even) and I don't go OOM when testing them. Although I use them in AWS and other providers, never tried Digital Ocean. Does Cassandra just fail after some time running, or is it failing on some specific read/write?

Hi Carlos,

OK, that's really interesting. So I have to ask: did you have to do anything special to get Cassandra to run on those 1GB AWS instances? I'd love to do the same. I even tried there as well and failed due to lack of memory to run it. And there is no specific reason other than lack of memory that I can tell for it to fail. And it doesn't seem to matter what data I use either. Because even if I remove the data directory with rm -rf, the phenomenon is the same. It'll run for a while, usually about 5 hours, and then just crash with the word 'killed' as the last line of output.

Thanks
Tim

On Thu, Feb 19, 2015 at 3:40 AM, Carlos Rolo r...@pythian.com wrote:

I have Cassandra instances running on VMs with smaller RAM (1GB even) and I don't go OOM when testing them. Although I use them in AWS and other providers, never tried Digital Ocean. Does Cassandra just fail after some time running, or is it failing on some specific read/write?

Regards,
Carlos Juzarte Rolo
Cassandra Consultant
Pythian - Love your data
rolo@pythian | Twitter: cjrolo | Linkedin: linkedin.com/in/carlosjuzarterolo
Tel: 1649
www.pythian.com

On Thu, Feb 19, 2015 at 7:16 AM, Tim Dunphy bluethu...@gmail.com wrote:

Hey guys,

After the upgrade to 2.1.3, and after almost exactly 5 hours running, Cassandra did indeed crash again on the 2GB RAM VM.
This is how the memory on the VM looked after the crash:

[root@web2:~] #free -m
             total       used       free     shared    buffers     cached
Mem:          2002       1227        774          8         45        386
-/+ buffers/cache:        794       1207
Swap:            0          0          0

And that's with this set in the cassandra-env.sh file:

MAX_HEAP_SIZE=800M
HEAP_NEWSIZE=200M

So I'm thinking now, do I just have to abandon this idea I have of running Cassandra on a 2GB instance? Or is this something we can all agree can be done? And if so, how can we do that? :)

Thanks
Tim

On Wed, Feb 18, 2015 at 8:39 PM, Jason Kushmaul | WDA jason.kushm...@wda.com wrote:

I asked this previously when a similar message came through, with a similar response. planetcassandra seems to have it "right", in that stable=2.0, development=2.1, whereas the apache site says stable is 2.1. "Right" in that they assume the latest minor version is development. Why not have the apache site do the same? That's just my lowly non-contributing opinion though.

Jason

From: Andrew [mailto:redmu...@gmail.com]
Sent: Wednesday, February 18, 2015 8:26 PM
To: Robert Coli; user@cassandra.apache.org
Subject: Re: run cassandra on a small instance

Robert,

Let me know if I'm off base about this, but I feel like I see a lot of posts that are like this (i.e., use this arbitrary version, not this other arbitrary version). Why are releases going out if they're "broken"? This seems like a very confusing way for new (and existing) users to approach versions...

Andrew

On February 18, 2015 at 5:16:27 PM, Robert Coli (rc...@eventbrite.com) wrote:

On Wed, Feb 18, 2015 at 5:09 PM, Tim Dunphy bluethu...@gmail.com wrote:

> I'm attempting to run Cassandra 2.1.2 on a smallish 2GB RAM instance over at Digital Ocean. It's a CentOS 7 host.

2.1.2 is IMO broken and should not be used for any purpose. Use 2.1.1 or 2.1.3. https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/

=Rob

--
GPG me!! gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B
Re: run cassandra on a small instance
What I normally do is install plain CentOS (Not any AMI build for Cassandra) and I don't use them for production! I run them for testing, fire drills and some cassandra-stress benchmarks. I will look if I had more than 5h Cassandra uptime. I can even put one up now and do the test and get the results back to you. Regards, Carlos Juzarte Rolo Cassandra Consultant Pythian - Love your data rolo@pythian | Twitter: cjrolo | Linkedin: *linkedin.com/in/carlosjuzarterolo http://linkedin.com/in/carlosjuzarterolo* Tel: 1649 www.pythian.com On Thu, Feb 19, 2015 at 6:41 PM, Tim Dunphy bluethu...@gmail.com wrote: I have Cassandra instances running on VMs with smaller RAM (1GB even) and I don't go OOM when testing them. Although I use them in AWS and other providers, never tried Digital Ocean. Does Cassandra just fails after some time running or it is failing on some specific read/write? Hi Carlos, Ok, that's really interesting. So I have to ask, did you have to do anything special to get Cassandra to run on those 1GB AWS instances? I'd love to do the same. I even tried there as well and failed due to lack of memory to run it. And there is no specific reason other than lack of memory that I can tell for it to fail. And it doesn's seem to matter what data I use either. Because even if I remove the data directory with rm -rf, the phenomenon is the same. It'll run for a while, usually about 5 hours and then just crash with the word 'killed' as the last line of output. Thanks Tim On Thu, Feb 19, 2015 at 3:40 AM, Carlos Rolo r...@pythian.com wrote: I have Cassandra instances running on VMs with smaller RAM (1GB even) and I don't go OOM when testing them. Although I use them in AWS and other providers, never tried Digital Ocean. Does Cassandra just fails after some time running or it is failing on some specific read/write? 
Regards, Carlos Juzarte Rolo Cassandra Consultant Pythian - Love your data rolo@pythian | Twitter: cjrolo | Linkedin: *linkedin.com/in/carlosjuzarterolo http://linkedin.com/in/carlosjuzarterolo* Tel: 1649 www.pythian.com On Thu, Feb 19, 2015 at 7:16 AM, Tim Dunphy bluethu...@gmail.com wrote: Hey guys, After the upgrade to 2.1.3, and after almost exactly 5 hours running cassandra did indeed crash again on the 2GB ram VM. This is how the memory on the VM looked after the crash: [root@web2:~] #free -m total used free sharedbuffers cached Mem: 2002 1227774 8 45386 -/+ buffers/cache:794 1207 Swap:0 0 0 And that's with this set in the cassandra-env.sh file: MAX_HEAP_SIZE=800M HEAP_NEWSIZE=200M So I'm thinking now, do I just have to abandon this idea I have of running Cassandra on a 2GB instance? Or is this something we can all agree can be done? And if so, how can we do that? :) Thanks Tim On Wed, Feb 18, 2015 at 8:39 PM, Jason Kushmaul | WDA jason.kushm...@wda.com wrote: I asked this previously when a similar message came through, with a similar response. planetcassandra seems to have it “right”, in that stable=2.0, development=2.1, whereas the apache site says stable is 2.1. “Right” in they assume latest minor version is development. Why not have the apache site do the same? That’s just my lowly non-contributing opinion though. *Jason * *From:* Andrew [mailto:redmu...@gmail.com] *Sent:* Wednesday, February 18, 2015 8:26 PM *To:* Robert Coli; user@cassandra.apache.org *Subject:* Re: run cassandra on a small instance Robert, Let me know if I’m off base about this—but I feel like I see a lot of posts that are like this (i.e., use this arbitrary version, not this other arbitrary version). Why are releases going out if they’re “broken”? This seems like a very confusing way for new (and existing) users to approach versions... 
Andrew

On February 18, 2015 at 5:16:27 PM, Robert Coli (rc...@eventbrite.com) wrote:

On Wed, Feb 18, 2015 at 5:09 PM, Tim Dunphy bluethu...@gmail.com wrote:

I'm attempting to run Cassandra 2.1.2 on a smallish 2GB RAM instance over at Digital Ocean. It's a CentOS 7 host.

2.1.2 is IMO broken and should not be used for any purpose. Use 2.1.1 or 2.1.3. https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/

=Rob

--
GPG me!! gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B
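For reference, heap sizing on a small box like the 2GB VM discussed above is normally pinned in conf/cassandra-env.sh. A minimal sketch mirroring the values quoted in this thread (in the 2.1-era script these two are meant to be set or unset as a pair, and the defaults are otherwise computed from system memory):

```shell
# conf/cassandra-env.sh -- explicit heap sizing for a small (~2GB) VM.
# Values mirror the ones quoted in this thread; adjust to your box.
MAX_HEAP_SIZE="800M"   # total JVM heap; leaves headroom for off-heap and OS cache
HEAP_NEWSIZE="200M"    # young generation; commonly around 1/4 of MAX_HEAP_SIZE
```

Note that off-heap allocations (index summaries, bloom filters, and, depending on settings, memtables) are not bounded by MAX_HEAP_SIZE, which is one reason process RSS can sit far above the configured heap, as in the pmap output from the oom-killer thread above.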
Re: run cassandra on a small instance
What I normally do is install plain CentOS (not any AMI built for Cassandra) and I don't use them for production! I run them for testing, fire drills and some cassandra-stress benchmarks. I will check whether I had more than 5h of Cassandra uptime. I can even put one up now, do the test and get the results back to you.

Hey, thanks for letting me know that. And yep, same here: it's just a plain CentOS 7 VM I've been using. None of this is for production. I also have an AWS account that I use only for testing. I can try setting it up there too and get back to you with my results. Thank you!

Tim

On Thu, Feb 19, 2015 at 12:52 PM, Carlos Rolo r...@pythian.com wrote:

What I normally do is install plain CentOS (not any AMI built for Cassandra) and I don't use them for production! I run them for testing, fire drills and some cassandra-stress benchmarks. I will check whether I had more than 5h of Cassandra uptime. I can even put one up now, do the test and get the results back to you.
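Since the crashes described above end with 'killed' (the oom-killer), it helps to log the JVM's resident set size in the hours before the kill. A sketch, assuming a Linux host; the "CassandraDaemon" pgrep pattern is an assumption about how the daemon was launched, so adjust it to your setup:

```shell
#!/bin/sh
# Sketch: periodically log the RSS of the Cassandra JVM so the growth
# leading up to an oom-killer event is captured. Linux-only.

rss_mb() {
    # resident set size of pid $1, in MB (ps reports RSS in KB on Linux)
    ps -o rss= -p "$1" | awk '{ printf "%d", $1 / 1024 }'
}

watch_rss() {
    pid=$1; samples=$2; interval=$3
    i=0
    # stop after the requested number of samples or when the process dies
    while [ "$i" -lt "$samples" ] && kill -0 "$pid" 2>/dev/null; do
        printf '%s pid=%s RSS=%sMB\n' "$(date '+%F %T')" "$pid" "$(rss_mb "$pid")"
        sleep "$interval"
        i=$((i + 1))
    done
}

# usage: watch_rss "$(pgrep -f CassandraDaemon)" 60 60   # 60 samples, one per minute
```

Correlating those timestamps with the dmesg oom-killer entries and compaction activity would show whether the RSS spikes line up with compactions, as suspected earlier in this thread.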
[ANNOUNCE] Apache Gora 0.6 Released
Hi Folks,

The Apache Gora team are pleased to announce the immediate availability of Apache Gora 0.6. This release addresses a modest 47 issues http://s.apache.org/gora-0.6, with some being major improvements, new functionality and dependency upgrades. Most notably, the release involves key upgrades to the Hadoop, HBase and Solr dependencies, as well as some extremely important bug fixes for the MongoDB module.

Suggested Gora database support is as follows:

- Apache Avro 1.7.6
- Apache Hadoop 1.2.1 and 2.5.2
- Apache HBase 0.98.8-hadoop2
- Apache Cassandra 2.0.2
- Apache Solr 4.10.3
- MongoDB 2.6.X
- Apache Accumulo 1.5.1

Gora is released both as source code, downloads for which can be found at our downloads page http://gora.apache.org/downloads.html, and as Maven artifacts, which can be found on Maven Central http://search.maven.org/#search%7Cga%7C1%7Cgora.

Thank you,
Lewis (on behalf of the Gora PMC)
Re: Node joining take a long time
First, thank all of you. It has been almost three days now and the status is still Joining. My cluster holds about 650G per node.

--
曹志富
Mobile: 18611121927
Email: caozf.zh...@gmail.com
Weibo: http://weibo.com/boliza/

2015-02-20 3:16 GMT+08:00 Robert Coli rc...@eventbrite.com:

On Thu, Feb 19, 2015 at 7:34 AM, Mark Reddy mark.l.re...@gmail.com wrote:

I'm sure Rob will be along shortly to say that 2.1.2 is, in his opinion, broken for production use... an opinion I'd agree with. So bear that in mind if you are running a production cluster.

If you speak of the devil, he will appear. But yes, really, run 2.1.1 or 2.1.3; 2.1.2 is a bummer. Don't take the brown 2.1.2. This commentary is likely unrelated to the problem the OP is having, which I would need the information Mark asked for to comment on. :)

=Rob
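Carlos's suggestion at the top of this digest (check SSTable counts to see whether compaction is falling behind) can be scripted. A sketch assuming 2.1-era `nodetool cfstats` output, where each table section carries a "Table:" line and an "SSTable count:" line; these labels changed across versions, so verify them against your build:

```shell
# Sketch: list tables whose SSTable count exceeds a threshold, reading
# `nodetool cfstats` output on stdin. Labels assume 2.1-era nodetool.
count_high_sstables() {
    limit=$1
    awk -v limit="$limit" '
        $1 == "Table:" { table = $2 }               # remember current table name
        /SSTable count:/ && $3 + 0 > limit {        # numeric comparison
            printf "%s %s\n", table, $3
        }'
}

# usage: nodetool cfstats | count_high_sstables 30
```

A table stuck with hundreds of SSTables under size-tiered compaction would support the theory that 2.1.2's compaction problems, rather than something else, are behind the symptoms reported in these threads.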