I think very high uptime and very low data loss are achievable in
Cassandra, but for new users there are TONS of gotchas. You really
have to know what you're doing, and I doubt that many people acquire
that knowledge without making a lot of mistakes.
I see above that most people are talking
On 06/23/11 09:43, David Boxenhorn wrote:
I think very high uptime and very low data loss are achievable in
Cassandra, but for new users there are TONS of gotchas. You really
have to know what you're doing, and I doubt that many people acquire
that knowledge without making a lot of mistakes.
Les,
Cassandra is a good system, but it has not reached version 1.0 yet, nor has
HBase etc. It is cutting edge technology and therefore in practice you are
unlikely to achieve five nines immediately - even if in theory with perfect
planning, perfect administration and so on, this should be
On 06/22/2011 10:03 PM, Edward Capriolo wrote:
I have not read the original thread concerning the problem you mentioned.
One way to avoid OOM is large amounts of RAM :) On a more serious note most
OOMs are caused by setting caches or memtables too large. If the OOM was
caused by a software
On 06/22/2011 07:12 PM, Les Hazlewood wrote:
Telling me to read the mailing lists and follow the issue tracker and use
monitoring software is all great and fine - and I do all of these things
today already - but this is a philosophical recommendation that does not
actually address my question.
Great stuff Chris - thanks so much for the feedback!
Les
In the spirit of your re-formulated questions:
- Read-before-write is a Cassandra anti-pattern, avoid it if at all
possible.
This leads me to believe that Cassandra may not be a good idea for a
primary OLTP data store. For example only create a user object if email
foo is not already in
On 06/23/2011 01:56 PM, Les Hazlewood wrote:
Is there a roadmap or time to 1.0? Even a ballpark time (e.g. next year's 3rd
quarter, end of year, etc) would be great as it would help me understand
where it may lie in relation to my production rollout.
The C* devs are rather strongly inclined
As an additional concrete detail to Edward's response, 'result
pinning' can provide some performance improvements depending on
topology and workload. See the conf file comments for details:
https://github.com/apache/cassandra/blob/cassandra-0.8.0/conf/cassandra.yaml#L308-315
I would also advise
I'm planning on using Cassandra as a product's core data store, and it is
imperative that it never goes down or loses data, even in the event of a
data center failure. This uptime requirement (five nines: 99.999% uptime)
w/ WAN capabilities is largely what led me to choose Cassandra over other
On Wed, Jun 22, 2011 at 2:24 PM, Les Hazlewood l...@katasoft.com wrote:
I'm planning on using Cassandra as a product's core data store, and it is
imperative that it never goes down or loses data, even in the event of a
data center failure. This uptime requirement (five nines: 99.999% uptime)
Just to be clear:
I understand that resources like [1] and [2] exist, and I've read them. I'm
just wondering if there are any 'gotchas' that might be missing from that
documentation that should be considered and if there are any recommendations
in addition to these documents.
Thanks,
Les
[1]
I understand that every environment is different and it always 'depends' :)
But recommending settings and techniques based on an existing real
production environment (like the user's suggestion to run nodetool repair as
a regular cron job) is always a better starting point for a new Cassandra
Implement monitoring and be proactive...that will stop you waking up to a
big surprise. I'm sure there were symptoms leading up to all 4 nodes going
down. willing to wager that each node went down at different times and not
all went down at once...
On Jun 22, 2011 11:50 PM, Les Hazlewood
Sadly, they all went down within minutes of each other.
On Jun 22, 2011, at 6:16 PM, Sasha Dolgy sdo...@gmail.com wrote:
Implement monitoring and be proactive...that will stop you waking up
to a big surprise. I'm sure there were symptoms leading up to all 4
nodes going
On 06/22/2011 05:33 PM, Les Hazlewood wrote:
Just to be clear:
I understand that resources like [1] and [2] exist, and I've read them. I'm
just wondering if there are any 'gotchas' that might be missing from that
documentation that should be considered and if there are any recommendations
Committing to that many 9s is going to be impossible since as far as I
know no internet service provider will SLA you more than 2 9s. You
cannot have more uptime than your ISP.
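To put rough numbers on that point, here is a minimal sketch, assuming the service sits behind a single uplink so that both must be up at once (a simple serial model; the function names are made up for illustration). Five nines allows about 5.3 minutes of downtime per year, and a chain of dependencies can never be more available than its weakest link.

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60


def availability_from_nines(nines: int) -> float:
    """Convert a count of nines to a fraction, e.g. 5 -> 0.99999."""
    return 1.0 - 10.0 ** (-nines)


def downtime_minutes_per_year(availability: float) -> float:
    """Downtime budget per year implied by an availability fraction."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def serial_availability(*components: float) -> float:
    """Availability of a chain in which every component must be up.

    Assumes independent failures; this is an illustrative model only.
    """
    result = 1.0
    for a in components:
        result *= a
    return result


if __name__ == "__main__":
    five_nines = availability_from_nines(5)
    isp_two_nines = availability_from_nines(2)
    print(downtime_minutes_per_year(five_nines))
    print(serial_availability(isp_two_nines, five_nines))
```

Under this model a five-nines cluster behind a two-nines link ends up slightly below two nines overall.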
On Wednesday, June 22, 2011, Chris Burroughs chris.burrou...@gmail.com wrote:
On 06/22/2011 05:33 PM, Les Hazlewood
you have to use multiple data centers to really deliver 4 or 5 9's of service
On Wed, Jun 22, 2011 at 7:09 PM, Edward Capriolo edlinuxg...@gmail.com wrote:
Committing to that many 9s is going to be impossible since as far as I
know no internet service provider will SLA you more than 2 9s. You
[1] http://www.datastax.com/docs/0.8/operations/index
[2] http://wiki.apache.org/cassandra/Operations
Well if they knew some secret gotcha the dutiful Cassandra operators of
the world would update the wiki.
As I am new to the Cassandra community, I don't know how 'dutifully' this is
On Wed, Jun 22, 2011 at 4:11 PM, Peter Lin wool...@gmail.com wrote:
you have to use multiple data centers to really deliver 4 or 5 9's of
service
We do, hence my question, as well as my choice of Cassandra :)
Best,
Les
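As a back-of-the-envelope illustration of why multiple data centers matter here, the sketch below assumes that sites fail independently and that any single surviving site can carry the service. Both assumptions are optimistic, and the function names are hypothetical.

```python
def redundant_availability(per_site: float, sites: int) -> float:
    """Probability that at least one of `sites` independent sites is up."""
    return 1.0 - (1.0 - per_site) ** sites


def sites_needed(per_site: float, target: float, max_sites: int = 10) -> int:
    """Smallest number of sites whose combined availability meets `target`.

    Raises ValueError if `max_sites` is not enough.
    """
    for n in range(1, max_sites + 1):
        if redundant_availability(per_site, n) >= target:
            return n
    raise ValueError("target not reachable within max_sites")


if __name__ == "__main__":
    # Two-nines sites, five-nines target.
    print(sites_needed(0.99, 0.99999))
```

Correlated failures (a shared network path, a bad deploy, operator error) break the independence assumption, so real numbers will be worse than this model suggests.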
In my opinion 5 9s don't matter. It's the number of impacted customers. You
might be down during peak for 5 minutes causing 1000s of customer turn aways
while you might be down during night causing only a few customer turn aways.
There is no magic bullet. It's all about learning and improving. You will
so having multiple data centers is step 1 of 4/5 9's.
I've worked on some services that had 3-4 9's SLA. Getting there is
really tough as others have stated. You have to have auditing built into
your service, capacity metrics, capacity planning, some kind of
real-time monitoring, staff to respond to
Forget the 5 9's - I apologize for even writing that. It was my shorthand
way of saying 'this can never go down'. I'm not asking for philosophical
advice - I've been doing large scale enterprise deployments for over 10
years. I 'get' the 'it depends' and 'do your homework' philosophy.
All I'm
Start with reading comments on cassandra.yaml and
http://wiki.apache.org/cassandra/Operations
As far as I know there is no comprehensive list for performance tuning, or more
specifically, of common settings applicable to everyone. For the most part issues
revolve
I have architected, built and been responsible for systems that support 4-5
9s for years. This discussion is not about how to do that generally. It
was intended to be about concrete techniques that have been found valuable
when deploying Cassandra in HA environments beyond what is documented in
Yep, that was [2] on my existing list. Thanks very much for actually
addressing my question - it is greatly appreciated!
If anyone else has examples they'd like to share (like their own cron
techniques, or JVM settings and why, etc), I'd love to hear them!
Best regards,
Les
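As one example of a cron technique, here is a minimal sketch that generates staggered crontab entries so each node runs `nodetool repair` once a week on its own weekday. The host names, the 02:00 start time, and the weekly cadence are assumptions for illustration; as I understand the Operations wiki, the real constraint is that every node gets repaired within GCGraceSeconds (10 days by default).

```python
def repair_crontab(hosts, hour=2, minute=0):
    """Return one crontab line per host, each on a different weekday.

    Weekday 0 is Sunday in cron. Host names are supplied by the caller;
    the schedule is illustrative, not a recommendation.
    """
    if len(hosts) > 7:
        raise ValueError("weekly stagger supports at most 7 hosts")
    lines = []
    for weekday, host in enumerate(hosts):
        lines.append(f"{minute} {hour} * * {weekday} nodetool -h {host} repair")
    return lines


if __name__ == "__main__":
    # Hypothetical host names.
    for line in repair_crontab(["cass1", "cass2", "cass3"]):
        print(line)
```

Staggering keeps repairs from running on several replicas at once; clusters with more than seven nodes would need a different spacing scheme than this one.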
On Wed, Jun 22,
Les Hazlewood wrote:
I have architected, built and been responsible for systems that support
4-5
9s for years.
So have most of us. But probably by now it should be clear that no
technology can provide concrete recommendations. They can only provide what
might be helpful which varies from
On Wed, Jun 22, 2011 at 4:35 PM, mcasandra mohitanch...@gmail.com wrote:
might be helpful which varies from env to env. That's why I suggest looking at
the comments in cassandra.yaml and seeing which are applicable in your
scenario. I learn something new every time I read it.
Yep, and this was
Hi Les,
I wanted to offer a couple thoughts on where to start and strategies for
approaching development and deployment with reliability in mind.
One way that we've found to more productively think about the reliability of
our data tier is to focus our thoughts away from a concept of uptime or
I think that Les's question was reasonable. Why *not* ask the community for the
'gotchas'?
Whether the info is already documented or not, it could be an opportunity to
improve the documentation based on users' perception.
The 'you just have to learn' responses are fair also, but that reminds me
Hi Scott,
First, let me say that this email was amazing - I'm always appreciative of
the time that anyone puts into mailing list replies, especially ones as
thorough, well thought out and articulated as this one. I'm a firm believer
that these types of replies reflect a strong and durable
Hi Thoku,
You were able to more concisely represent my intentions (and their
reasoning) in this thread than I was able to do so myself. Thanks!
On Wed, Jun 22, 2011 at 5:14 PM, Thoku Hansen tho...@gmail.com wrote:
I think that Les's question was reasonable. Why *not* ask the community for
On Wed, Jun 22, 2011 at 8:31 PM, Les Hazlewood l...@katasoft.com wrote:
Hi Thoku,
You were able to more concisely represent my intentions (and their
reasoning) in this thread than I was able to do so myself. Thanks!
On Wed, Jun 22, 2011 at 5:14 PM, Thoku Hansen tho...@gmail.com wrote:
I
Edward,
Thank you so much for this reply - this is great stuff, and I really
appreciate it.
You'll be happy to know that I've already pre-ordered your book. I'm
looking forward to it! (When is the ship date?)
Best regards,
Les
On Wed, Jun 22, 2011 at 7:03 PM, Edward Capriolo