Re: Big Data Question

2023-08-21 Thread daemeon reiydelle
C. Clarke famously said that "technology sufficiently advanced is indistinguishable from magic." Magic is coming, and it's coming for all of us* *Daemeon Reiydelle* *email: daeme...@gmail.com * *LI: https://www.linkedin.com/in/daemeonreiydelle/ <https://www.linkedin.com/in/daemeonreiy

Re: Big Data Question

2023-08-17 Thread daemeon reiydelle
iently advanced is indistinguishable from magic." Magic is coming, and it's coming for all of us* *Daemeon Reiydelle* *email: daeme...@gmail.com * *LI: https://www.linkedin.com/in/daemeonreiydelle/ <https://www.linkedin.com/in/daemeonreiydelle/>* *San Francisco 1.415.501.0198/Skype daemeon.c.m.r

Re: Big Data Question

2023-08-17 Thread daemeon reiydelle
different discussion. More of a monologue with an idiot in Finance, but *.* *Arthur C. Clarke famously said that "technology sufficiently advanced is indistinguishable from magic." Magic is coming, and it's coming for all of us* *Daemeon Reiydelle* *email: daeme...@gm

Re: TLS/SSL overhead

2022-02-06 Thread daemeon reiydelle
The % numbers seem high for a clean network and a reasonably fast client. The 5% is really not reasonable. No jumbo frames? No network retries (netstat)?

Re: about memory problem in write heavy system..

2022-01-07 Thread daemeon reiydelle
Maybe SSDs? Take a look at the IO read/write wait times. FYI, your config changes simply push more activity into memory, trading IO for memory footprint ;{)

Re: Latest Supported RedHat Linux version for Cassandra 3.11

2021-09-27 Thread daemeon reiydelle
runs through 8.4 for sure.

Re: High mutation stage in multi dc deployment

2021-07-19 Thread daemeon reiydelle
You may want to think about the latency impacts of a cluster that has one node "far away". This is such a basic design flaw that you need to do some basic learning, and some basic understanding of networking and latency. On Mon, Jul 19, 2021 at 10:38 AM MyWorld wrote: > Hi all, > >

Re: underutilized servers

2021-03-05 Thread daemeon reiydelle
that you have 3 VMs on THREE SEPARATE physical systems and WITHOUT network attached storage ...

Re: AWS ephemeral instances + backup

2019-12-05 Thread daemeon reiydelle
If you can handle the slower IO of S3 this can work, but you will have a window of out-of-date images. You don't have a concept of persistent snapshots.

Re: JOB | The Last Pickle (Consultant) in USA

2019-11-20 Thread daemeon reiydelle
Sounds VERY interesting! If the resume passes the BS sniff test (I do big data, which has included C* for a NUMBER of years), I would love to chat. FYI I do a fair amount of readiness assessments, before, during (with laughable results), and now/after my tenure at Accenture/Avanade. Cheers, D.

Re: Aws instance stop and star with ebs

2019-11-06 Thread daemeon reiydelle
aa-d585-38e0-a72b-b36ce82da9cb, r > emote=cdbb639b-1675-31b3-8a0d-84aca18e86bf > > i tried running some tcpdump during that time i dont see any packet loss > during that time. still unsure why east instance which was stopped and > started unreachable to west node almost for 15 minute

Re: Aws instance stop and star with ebs

2019-11-05 Thread daemeon reiydelle
10 minutes is 600 seconds, and there are several timeouts that are set to that, including the data center timeout as I recall. You may be forced to tcpdump the interface(s) to see where the chatter is. Out of curiosity, when you restart the node, have you snapped the jvm's memory to see if e.g.

Re: Ram & Space...

2019-10-23 Thread daemeon reiydelle
pretty clear evidence of a memory leak, tombstone problem (still memory), etc. If this is Apache, then you may need to do some heap dumps and see what is going on (if it is the Java heap that is OOM'ing, which I suspect). Might want to do some periodic vmstat or equivalent (brute force might be screen

Re: Looking for feedback on automated root-cause system

2019-02-19 Thread daemeon reiydelle
Welcome to the world of testing predictive analytics. I will pass this on to my folks at Accenture, know of a couple of C* clients we run, wondering what you had in mind? *Daemeon C.M. Reiydelle* *email: daeme...@gmail.com * *San Francisco 1.415.501.0198/London 44 020 8144 9872/Skype

Re: benefits oh HBase over Cassandra

2018-08-25 Thread daemeon reiydelle
Messenger can allow for some losses in degenerate infra cases, given a given infra footprint. Also some ability to handle scale up faster as demand increases, peak loads, etc. It therefore becomes a use case specific optimization. Also hBase can run in Hadoop more easily, leveraging blobs (HDFS),

Re: JBOD disk failure

2018-08-14 Thread daemeon reiydelle
You have to explain what you mean by "JBOD". All in one large vdisk? Separate drives? At the end of the day, if a device fails in a way that the data housed on that device (or array) is no longer available, that HDFS storage is marked down. HDFS now needs to create a 3rd replica. Various timers

Re: Size of a single Data Row?

2018-06-10 Thread daemeon reiydelle
I'd like to split your question into two parts. Part one is around recovery. If you lose a copy of the underlying data because a node fails, and let's assume you have three copies, how long can you tolerate the time to restore the third copy? The second question is about the absolute length of a

Re: Mongo DB vs Cassandra

2018-05-31 Thread daemeon reiydelle
If you are starting with a modest amount of data (e.g. under .25 PB) and do not have extremely high availability requirements, then it is easier to start with MongoDB, avoiding HA clusters. I would suggest you start with MongoDB. Both are great, but C* scales far beyond MongoDB FOR A GIVEN LEVEL

Re: Does Cassandra supports ACID txn

2018-04-25 Thread daemeon reiydelle
If ACID is needed, then C* is the wrong architecture. Your architecture needs to match your business processes, as Ben pointed out: "Ask if it’s really needed". There is a concept of a velocity file (modern tech is memSQL'ish) that delivers the high performance, ACID transactions of lambda

Re: 答复: A node down every day in a 6 nodes cluster

2018-03-26 Thread daemeon reiydelle
Look for errors on your network interface. I think you have periodic errors in your network connectivity.

Re: Cassandra on high performance machine: virtualization vs Docker

2018-02-27 Thread daemeon reiydelle
Docker will provide less per node overhead. And yes, virtualizing smaller nodes out of a bigger physical makes sense. Of course you lose the per node failure protection, but I guess this is not production?

Re: What kind of Automation you have for Cassandra related operations on AWS ?

2018-02-08 Thread daemeon reiydelle
Terraform plus Ansible. OK but messy. 5-30,000 nodes and infra. On Thu, Feb 8, 2018, 15:57 Ben Wood wrote: > Shameless plug of our (DC/OS) Apache Cassandra service: >

Re: Meltdown/Spectre Linux patch - Performance impact on Cassandra?

2018-01-09 Thread daemeon reiydelle
Good luck with that. PCID has been out since mid 2017, as I recall? On Jan 9, 2018 10:31 AM, "Dor Laor" wrote: Make sure you pick instances with PCID cpu capability, their TLB overhead flush overhead is much smaller On Tue, Jan 9, 2018 at

Re:

2017-10-01 Thread daemeon reiydelle
What specifically are you looking to monitor? As per above, Datadog has superb components for monitoring, and no need to develop and support anything, for a price of course. I have found management sometimes sees devops resources as pretty low cost (pay for 40, get 70 hours work per week). Depends

Re: new question ;-) // RE: understanding batch atomicity

2017-09-29 Thread daemeon reiydelle
Recall that a delete is actually a corner case of an update, as is an insert. As I read the snippet, you are updating multiple tables. The partition key is table specific, so two sets of update batches are handled here.

Re: cassandra hardware requirements (STAT/SSD)

2017-09-29 Thread daemeon reiydelle
Note to the AWS poster, you have some limited understanding of how disks are presented to AWS compute nodes. As a result your post is not relevant, and misleading. When considering throughput, recall that disk IO is ideally parallel. While C* handles IO across multiple devices nicely, the unit of

Re: Tool to manage cassandra

2017-06-16 Thread daemeon reiydelle
Ambari *Daemeon C.M. ReiydelleUSA (+1) 415.501.0198London (+44) (0) 20 8144 9872* *"It is better to be insulted with the truth than kissed with a lie”* On Fri, Jun 16, 2017 at 6:01 AM, Ram Bhatia wrote: > Hi > > May I know, if there a tool similar to Oracle

Re: Restarting nodes and reported load

2017-06-01 Thread daemeon reiydelle
Some random thoughts; I would like to thank you for giving us an interesting problem. Cassandra can get boring sometimes, it is too stable.
- Do you have a way to monitor the network traffic to see if it is increasing between restarts or does it seem relatively flat?
- What activities are

Re: Restarting nodes and reported load

2017-05-30 Thread daemeon reiydelle
of the day are dangerous men, for they may act their dreams with open eyes, to make it possible.” — T.E. Lawrence* On Tue, May 30, 2017 at 2:18 PM, Jonathan Haddad <j...@jonhaddad.com> wrote: > This isn't an HDFS mailing list. > > On Tue, May 30, 2017 at 2:14 PM daemeon r

Re: Restarting nodes and reported load

2017-05-30 Thread daemeon reiydelle
; I don't believe incremental repair is enabled, I have never enabled it on >> the cluster, and unless it's the default then it is off. Also I don't see a >> setting in cassandra.yaml for it. >> >> >> >> On May 30 2017, at 1:10 pm, daemeon reiydelle <daeme..

Re: Restarting nodes and reported load

2017-05-30 Thread daemeon reiydelle
1:36 PM, Daniel Steuernol <dan...@sendwithus.com> >> wrote: >> >> I don't believe incremental repair is enabled, I have never enabled it on >> the cluster, and unless it's the default then it is off. Also I don't see a >> setting in cassandra.yaml for it. >

Re: Restarting nodes and reported load

2017-05-30 Thread daemeon reiydelle
wrote: > I don't believe incremental repair is enabled, I have never enabled it on > the cluster, and unless it's the default then it is off. Also I don't see a > setting in cassandra.yaml for it. > > > On May 30 2017, at 1:10 pm, daemeon reiydelle <daeme...@gmail.com> wro

Re: Restarting nodes and reported load

2017-05-30 Thread daemeon reiydelle
status. >> >> >> >> On May 30 2017, at 10:25 am, daemeon reiydelle <daeme...@gmail.com> >> wrote: >> >>> When you say "the load rises ... ", could you clarify what you mean by >>> "load"? That has a specific Linux ter

Re: Restarting nodes and reported load

2017-05-30 Thread daemeon reiydelle
When you say "the load rises ... ", could you clarify what you mean by "load"? That is a specific term in Linux, and in e.g. Cloudera Manager. But in neither case would that be relevant to transient or persisted disk. Am I missing something? On Tue, May 30, 2017 at 10:18 AM, tommaso barbugli

Re: How do you do automatic restacking of AWS instance for cassandra?

2017-05-28 Thread daemeon reiydelle
, but not equally. Those who dream by night in the dusty recesses of their minds wake up in the day to find it was vanity, but the dreamers of the day are dangerous men, for they may act their dreams with open eyes, to make it possible.” — T.E. Lawrence sent from my mobile Daemeon Reiydelle skype

Re: How do you do automatic restacking of AWS instance for cassandra?

2017-05-25 Thread daemeon reiydelle
What is restacking?

Re: How to avoid flush if the data can fit into memtable

2017-05-25 Thread daemeon reiydelle
wake up in the day to find it was vanity, but the dreamers of the day are dangerous men, for they may act their dreams with open eyes, to make it possible.” — T.E. Lawrence sent from my mobile Daemeon Reiydelle skype daemeon.c.m.reiydelle USA 415.501.0198 On May 25, 2017 9:14 AM, "Jonathan H

Re: Replication issue with Multi DC setup in cassandra

2017-05-24 Thread daemeon reiydelle
. Lawrence sent from my mobile Daemeon Reiydelle skype daemeon.c.m.reiydelle USA 415.501.0198 On May 16, 2017 2:42 PM, "suraj pasuparthy" <suraj.pasupar...@gmail.com> wrote: > So i though the same, > I see the data via the CQLSH in both the datacenters. consistency is set >

Re: Replication issue with Multi DC setup in cassandra

2017-05-24 Thread daemeon reiydelle
May I inquire if your configuration is actually data center aware? Do you understand the difference between LQ and replication?

Re: Impact on latency with larger memtable

2017-05-24 Thread daemeon reiydelle
. “All men dream, but not equally. Those who dream by night in the dusty recesses of their minds wake up in the day to find it was vanity, but the dreamers of the day are dangerous men, for they may act their dreams with open eyes, to make it possible.” — T.E. Lawrence sent from my mobile Daemeon

Re: Cassandra Node Density thresholds

2017-05-19 Thread daemeon reiydelle
are dangerous men, for they may act their dreams with open eyes, to make it possible.” — T.E. Lawrence sent from my mobile Daemeon Reiydelle skype daemeon.c.m.reiydelle USA 415.501.0198 On May 19, 2017 9:05 AM, "ZAIDI, ASAD A" <az1...@att.com> wrote: > Hello Folks - > > I'm

Re: Can I have multiple datacenter with different versions of Cassandra

2017-05-18 Thread daemeon reiydelle
with open eyes, to make it possible.” — T.E. Lawrence sent from my mobile Daemeon Reiydelle skype daemeon.c.m.reiydelle USA 415.501.0198 On May 18, 2017 8:20 AM, "Chuck Reynolds" <creyno...@ancestry.com> wrote: > I have a need to create another datacenter and upgrade my existing >

Re: Bootstraping a Node With a Newer Version

2017-05-17 Thread daemeon reiydelle
972-74-700-4035 <+972%2074-700-4035> > <http://www.linkedin.com/company/164748> <http://twitter.com/liveperson> > <http://www.facebook.com/LivePersonInc> We Create Meaningful Connections > > > > On Tue, May 16, 2017 at 6:48 PM, daemeon reiydelle <

Re: Bootstraping a Node With a Newer Version

2017-05-16 Thread daemeon reiydelle
it possible.” — T.E. Lawrence sent from my mobile Daemeon Reiydelle skype daemeon.c.m.reiydelle USA 415.501.0198 On May 16, 2017 5:27 AM, "Shalom Sagges" <shal...@liveperson.com> wrote: > Hi All, > > Hypothetically speaking, let's say I want to upgrade my Cassandra cluster, >

Re: Cassandra as a key/object store for many small (10-60k) files

2017-05-05 Thread daemeon reiydelle
cal storage volumes on each machine. > > On May 5, 2017, at 3:25 PM, daemeon reiydelle <daeme...@gmail.com> wrote: > > These numbers do not match e.g. AWS, so guessing you are using local > storage? > > > *...* > > *Making a billion dollar startup is easy: &quo

Re: Cassandra as a key/object store for many small (10-60k) files

2017-05-05 Thread daemeon reiydelle
These numbers do not match e.g. AWS, so guessing you are using local storage?

Re: Service discovery in the Cassandra cluster

2017-05-02 Thread daemeon reiydelle
My compliments to all of you for being adults, excessively kind, and definitely excessively nice. *...* *Daemeon C.M. ReiydelleUSA (+1) 415.501.0198London (+44) (0) 20 8144 9872* On Tue, May 2, 2017 at 5:08 PM, Steve Robenalt wrote: > Hi Roman, > > I'm assuming

Re: Service discovery in the Cassandra cluster

2017-05-01 Thread daemeon reiydelle
Yes, you can use host names. That merely adds another level of configuration. When using terraform, I often use node names like and just use those. They are only routable within the region/VPC but are in fact already in dns. You do have to watch out as if you change the seeds (in tf) or the

Re: Seed nodes as part of cluster

2017-05-01 Thread daemeon reiydelle
Caps below for emphasis, not shouting ;{) Seed nodes are IDENTICAL to all other hdfs nodes or you will wish otherwise. Folks get confused because of terminology. I refer to this stuff as "the seed node service of a normal hdfs node". ANY HDFS NODE IS ABLE TO ACT AS A SEED NODE BY DEFINITION.

Re: Migrating from Datastax Distribution to Apache Cassandra

2017-04-07 Thread daemeon reiydelle
Having done variants of this, I would suggest you bring up new nodes at approximately the same Apache version as a separate data center, in your same cluster. Replication strategy may need to be tweaked.

Re: Cassandra and LINUX CPU Context Switches

2017-04-05 Thread daemeon reiydelle
This would be normal if the switches are user to kernel mode (disk & network IO are kernel mode activities). If your run queue (jobs waiting to run) is much larger than the number of cores (just a swag but less than 2-3*# of cores), you might have other issues.
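The rule of thumb above (run queue compared against 2-3 times the core count) can be sketched as a small check. This is only an illustration: the function names and the default factor of 3 are assumptions, and the 1-minute load average is used as a rough stand-in for the run queue, which it is not exactly (on Linux it also counts tasks in uninterruptible sleep).

```python
import os


def run_queue_is_suspicious(runnable: float, cores: int, factor: float = 3.0) -> bool:
    """Return True when the number of runnable jobs exceeds factor * cores.

    `factor` is the "2-3 times the number of cores" swag from the post;
    3.0 is an assumed default, not a hard limit.
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return runnable > factor * cores


def check_this_host(factor: float = 3.0) -> bool:
    """Apply the check to the local machine (Unix only).

    Uses the 1-minute load average as an approximation of the run queue.
    """
    load_1m, _, _ = os.getloadavg()
    cores = os.cpu_count() or 1
    return run_queue_is_suspicious(load_1m, cores, factor)
```

For example, 40 runnable jobs on an 8-core node would be flagged, while 10 would not. A tool such as vmstat reports the run queue directly and is the better source when available.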

Re: nodes are always out of sync

2017-04-01 Thread daemeon reiydelle
What you are doing is correctly going to result in this, IF there is substantial backlog/network/disk or whatever pressure. What do you think will happen when you write with a replication factor greater than consistency level of write? Perhaps your mental model of how C* works needs work?
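To make the point about replication factor versus write consistency concrete, here is a minimal sketch of the arithmetic. The function names are hypothetical; the only rule used is the standard one that a read is guaranteed to touch at least one up-to-date replica when write acknowledgements plus replicas read exceed the replication factor.

```python
def replicas_possibly_stale(rf: int, write_acks: int) -> int:
    """Replicas that may not yet hold a write when the client is told "done".

    rf is the replication factor; write_acks is how many replicas must
    acknowledge before the write returns (the write consistency level).
    """
    if not 1 <= write_acks <= rf:
        raise ValueError("write_acks must be between 1 and rf")
    return rf - write_acks


def read_sees_latest_write(rf: int, write_acks: int, read_replicas: int) -> bool:
    """True when every read set must overlap every write set (R + W > RF)."""
    return write_acks + read_replicas > rf
```

With a replication factor of 3 and writes at consistency ONE, two replicas may lag behind, and a read at ONE has no guarantee of hitting the replica that has the data. Under backlog, network, or disk pressure that lag is what shows up as nodes being "out of sync".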

Re: How to add a node with zero downtime

2017-03-21 Thread daemeon reiydelle
Possible areas to check:
- too few nodes (node overload)
- you did not indicate either replication factor or number of nodes. Assume nodes are *rather* full.
- network overload (check your TOR switches' errors, also the TCP stats on the relevant nodes)
- look for stop the world garbage collection on

Re: repair performance

2017-03-20 Thread daemeon reiydelle
I would zero in on network throughput, especially inter-rack trunks. On Mar 17, 2017 2:07 PM, "Roland Otta" <roland.o...@willhaben.at> wrote: > hello, > > we are quite inexperienced wit

Re: Random slow read times in Cassandra

2017-03-17 Thread daemeon reiydelle
Check for level 2 (stop the world) garbage collections. On Fri, Mar 17, 2017 at 11:51 AM, Chuck Reynolds wrote: > I have a large Cassandra 2.1.13 ring (60 nodes) in AWS that has >

Re: Issue with Cassandra consistency in results

2017-03-17 Thread daemeon reiydelle
queries are hitting the cluster at peak? If many clients, how do you balance the connection load or do you always hit the same node? sent from my mobile Daemeon Reiydelle skype daemeon.c.m.reiydelle USA 415.501.0198 On Mar 16, 2017 3:25 PM, "srinivasarao daruna" <sree.srin...@gmail.com&

Re: Issue with Cassandra consistency in results

2017-03-16 Thread daemeon reiydelle
The discard due to OOM is causing the zero returned. I would guess a cache miss problem of some sort, but not sure. Are you using row, index, etc. caches? Are you seeing the failed prep statement on random nodes (duh, nodes that have the relevant data ranges)?

Re: Does "nodetool repair" need to be run on each node for a given table?

2017-03-14 Thread daemeon reiydelle
Am I unreasonable in expecting a poster to have looked at the documentation before posting? And that reposting the same query WITHOUT reading the documents (when pointed out to them) when asked to do so is not appropriate? Do we have a way to blackball such?

Re: Does "nodetool repair" need to be run on each node for a given table?

2017-03-13 Thread daemeon reiydelle
I find it helpful to read the manual first. After review, I would be happy to answer specific questions. https://docs.datastax.com/en/cassandra/2.1/cassandra/tools/toolsRepair.html On Mon, Mar 13, 2017

Re: scylladb

2017-03-11 Thread daemeon reiydelle
Recall that garbage collection on a busy node can occur minutes or seconds apart. Note that stop the world GC also happens as frequently as every couple of minutes on every node. Remove that and do the simple arithmetic.

Re: Disconnecting two data centers

2017-03-08 Thread daemeon reiydelle
I guess it depends on the experience one has. This is a common process to bring up, move, build full prod copies, etc. What is outlined is pretty much exactly what I have done 20-50 times (too many to remember). FYI, some of this should be done with nodes DOWN.

Re: AWS NVMe i3 instances performances

2017-03-01 Thread daemeon reiydelle
We did. Found that, even with CentOS and Ubuntu (both for application compatibility reasons), there is somewhat less IO and better CPU throughput at the price point. At the time my optimization work for that client ended, Amazon was looking at the IO issue, as perhaps the frame configurations

Re: Current data density limits with Open Source Cassandra

2017-02-08 Thread daemeon reiydelle
your MMV. Think of that storage limit as fairly reasonable for active data likely to tombstone. Add more for older/historic data. Then think about time to recover a node. *...* *Daemeon C.M. ReiydelleUSA (+1) 415.501.0198London (+44) (0) 20 8144 9872* On Wed, Feb 8, 2017 at 2:14 PM, Ben

Re: Instaclustr Masters scholarship

2017-02-07 Thread daemeon reiydelle
A bunch more welcome than here in the US, to our deep shame and foolishness. Sadly while I am actually involved in this area, I am happy in San Francisco. I would be interested in being part of a pro bono team should that transpire. Thanks, D. *...* *Daemeon C.M. ReiydelleUSA (+1)

Re: Why does `now()` produce different times within the same query?

2016-11-30 Thread daemeon reiydelle
This is not a bug, and in fact changing it would be a serious bug. What it is is a wonderful case of bad coding: would one expect a java/py/bash script that loops on a bunch of read/execute/update calls where each iteration calls time to return the same exact time for the duration of the execution
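The analogy in this post can be shown with a short sketch. It is an illustration of the general pattern only, not of how Cassandra evaluates `now()` internally; the function names and the injectable `clock` parameter are assumptions made so the behavior is easy to observe.

```python
import time
from typing import Callable, List, Sequence, Tuple


def stamp_rows_per_call(rows: Sequence[str],
                        clock: Callable[[], float] = time.time) -> List[Tuple[str, float]]:
    """Call the clock once per row: each row can get a different timestamp."""
    return [(row, clock()) for row in rows]


def stamp_rows_once(rows: Sequence[str],
                    clock: Callable[[], float] = time.time) -> List[Tuple[str, float]]:
    """Read the clock once, then reuse that value for every row."""
    ts = clock()
    return [(row, ts) for row in rows]
```

If every row in an operation must carry the same time, the caller captures the time once and passes that value in, as in `stamp_rows_once`, rather than expecting repeated clock calls to agree.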

Re: Throughout of hints delivery

2016-09-17 Thread daemeon reiydelle
timeouts indicate network or equivalent throughput delays, from the physical box's network card out and to the other dc's card. If you are using VM's add that layer. Your network team needs to be looking for ANY timeouts, retries, packets delivered in retry window > 0, etc. ANY value other than

Re: Questions about anti-entropy repair

2016-07-20 Thread daemeon reiydelle
I don't know if my perspective on this will assist, so YMMV:
Summary
1. Nodetool repairs are required when a node has issues and can't get its (e.g. hinted handoff) resync done: culprit: usually network, sometimes container/vm, rarely disk.
2. Scripts to do partition range are a pain

Re: Problems with cassandra on AWS

2016-07-11 Thread daemeon reiydelle
Well, I seem to recall that the private IPs are valid for communications WITHIN one VPC. I assume you can log into one machine and ping (or ssh) the others. If so, check that cassandra.yaml is not set to listen on 127.0.0.1 (localhost).
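The cassandra.yaml check suggested above could be scripted roughly as follows. This is a sketch under assumptions: it does a plain line scan rather than real YAML parsing, the function name is hypothetical, and it looks only at the `listen_address` and `rpc_address` settings, treating `127.0.0.1`, `localhost`, and `::1` as loopback.

```python
from typing import Dict, Iterable

LOOPBACK_VALUES = {"127.0.0.1", "localhost", "::1"}


def loopback_settings(yaml_text: str,
                      keys: Iterable[str] = ("listen_address", "rpc_address")) -> Dict[str, str]:
    """Return the address settings in yaml_text that point at loopback.

    Simple "key: value" line scan; comments after '#' are ignored.
    """
    wanted = set(keys)
    found: Dict[str, str] = {}
    for raw in yaml_text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        key, value = line.split(":", 1)
        key = key.strip()
        value = value.strip().strip("'\"")
        if key in wanted and value in LOOPBACK_VALUES:
            found[key] = value
    return found
```

An empty result means neither setting points at loopback; a non-empty one names the settings other nodes in the VPC would be unable to reach.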

Re: Blog post on Cassandra's inner workings and performance - feedback welcome

2016-07-09 Thread daemeon reiydelle
d > the correct URL (journeymonitor.com > :4000/tutorials/2016/02/29/cassandra-inner-workings-and-how-this-relates-to-performance/). > > Substantial feedback regarding the actual post still very much welcome. > > Regards, > Manuel > > Am 09.07.2016 um 03:32 schrieb daemeon reiydelle &l

Re: Blog post on Cassandra's inner workings and performance - feedback welcome

2016-07-08 Thread daemeon reiydelle
Localhost is a special network address that never leaves the operating system. It only goes "half way" down the IP stack. Thanks for your efforts! *...* *Daemeon C.M. ReiydelleUSA (+1) 415.501.0198London (+44) (0) 20 8144 9872* On Fri, Jul 8, 2016 at 5:53 PM, Joaquin Alzola

Re: Is my cluster normal?

2016-07-07 Thread daemeon reiydelle
ick the defaults). >> >> Sent from my iPhone >> >> On Jul 7, 2016, at 4:39 PM, Yuan Fang <y...@kryptoncloud.com> wrote: >> >> yes, it is about 8k writes per node. >> >> >> >> On Thu, Jul 7, 2016 at 2:18 PM, daemeon reiydelle <daeme...@gm

Re: Is my cluster normal?

2016-07-07 Thread daemeon reiydelle
ul 7, 2016 at 1:51 PM, daemeon reiydelle <daeme...@gmail.com> > wrote: > >> Assuming you meant 100k, that likely for something with 16mb of storage >> (probably way small) where the data is more that 64k hence will not fit >> into the row cache. >> >

Re: Is my cluster normal?

2016-07-07 Thread daemeon reiydelle
Assuming you meant 100k, that is likely for something with 16mb of storage (probably way small) where the data is more than 64k, hence will not fit into the row cache. On Thu, Jul 7, 2016 at 1:25 PM, Yuan Fang

Re: Debugging high tail read latencies (internal timeout)

2016-07-07 Thread daemeon reiydelle
Hmm. Would you mind looking at your network interface (appropriate netstat commands)? If I am right you will be seeing packet errors, drops, retries, packet out of window receives, etc. What you may be missing is that you reported zero DROPPED latency. Not mean LATENCY. Check your netstats. ANY

Re: all the nost are not reacheable when running massive deletes

2016-04-04 Thread daemeon reiydelle
Network issues. Could be jumbo frames not consistent or other. On Apr 4, 2016 5:34 AM, "Paco Trujillo" wrote: > Hi everyone > > > > We are having problems with

Re: Unexpected high internode network activity

2016-02-25 Thread daemeon reiydelle
This traffic is not part of the anomalous > traffic we're seeing above, since this one goes on port 80 and it's clearly > visible with a separate bpf filter, and its magnitude is far lower than > that anyway > > Thanks > > On Thu, Feb 25, 2016 at 9:03 PM, daemeon reiydelle <daem

Re: Unexpected high internode network activity

2016-02-25 Thread daemeon reiydelle
watch (divided by two) > > So unfortunately I still don't have any ideas about what's going on and > why I'm seeing 17 GB of internode traffic instead of ~ 5-6. > > On Thursday, February 25, 2016, daemeon reiydelle <daeme...@gmail.com> > wrote: > >> If read & writ

Re: Unexpected high internode network activity

2016-02-25 Thread daemeon reiydelle
If read & write at quorum then you write 3 copies of the data then return to the caller; when reading you read one copy (assume it is not on the coordinator), and 1 digest (because read at quorum is 2, not 3). When you insert, how many keyspaces get written to? (Are you using e.g. inverted
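The counts in this post follow from simple quorum arithmetic, sketched below. The function names are hypothetical. The sketch assumes the usual behavior in which a write is sent to every replica while the coordinator waits only for a quorum of acknowledgements, and a quorum read fetches full data from one replica and digests from the rest of the quorum.

```python
from typing import Dict


def quorum(replication_factor: int) -> int:
    """Quorum size: a majority of the replicas."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    return replication_factor // 2 + 1


def quorum_fanout(replication_factor: int) -> Dict[str, int]:
    """Per-request replica traffic when both reads and writes use QUORUM."""
    q = quorum(replication_factor)
    return {
        "write_copies_sent": replication_factor,  # every replica receives the write
        "write_acks_awaited": q,                  # coordinator returns after a quorum acks
        "data_reads": 1,                          # one replica returns the full data
        "digest_reads": q - 1,                    # the rest of the quorum return digests
    }
```

For a replication factor of 3 this gives 3 copies written, 2 acknowledgements awaited, 1 data read and 1 digest read, matching the "one copy and 1 digest" in the post.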

Re: Checking replication status

2016-02-25 Thread daemeon reiydelle
Hmm. What are your processes when a node comes back after "a long offline"? Long enough to take the node offline and do a repair? Run the risk of serving stale data? Parallel repairs? ??? So, what sort of time frames are "a long time"? *...* *Daemeon C.M. ReiydelleUSA (+1)

Re: Nodes go down periodically

2016-02-23 Thread daemeon reiydelle
If you can, do a few (short, maybe 10m records, delete the default schema between executions) runs of the Cassandra stress test against your production cluster (replication=3, force quorum to 3). Look for latency max in the 10s of SECONDS. If your devops team is running a monitoring tool that looks at

RE: Restart Cassandra automatically

2016-02-23 Thread daemeon reiydelle
Cassandra nodes do not go down "for no reason". They are not stateless. I would like to thank you for this marvelous example of a wonderful antipattern. Absolutely fantastic. Thank you! I am not being a satirical smartass. I sometimes am challenged by clients in my presentations about sre best

Re: Live upgrade 2.0 to 2.1 temporarily increases GC time causing timeouts and unavailability

2016-02-19 Thread daemeon reiydelle
FYI, my observations were with native, not thrift. On Fri, Feb 19, 2016 at 10:12 AM, Sotirios Delimanolis wrote: > Does your cluster contain 24+ nodes or fewer? > > We did the same

Re: Live upgrade 2.0 to 2.1 temporarily increases GC time causing timeouts and unavailability

2016-02-19 Thread daemeon reiydelle
May be unrelated, but I found highly variable latency (latency max) when on the 2.1 code tree loading new data (and reading). Others found that G1 or CMS do not make a difference. Some evidence that 8/12/16gb memory make no difference. These were latencies in the 10-30 SECOND range. It did cause

Re: Compatability, performance & portability of Cassandra data types (MAP, UDT & JSON) in DSE Search & Analytics

2016-02-18 Thread daemeon reiydelle
Given you only have 16 columns vs. over 200 ... I would expect a substantial improvement in writes, but not 5x. Ditto reads. I would be interested to understand where that 5x comes from. *...* *Daemeon C.M. ReiydelleUSA (+1) 415.501.0198London (+44) (0) 20 8144 9872* On Thu, Feb 18, 2016

Re: High Bloom filter false ratio

2016-02-18 Thread daemeon reiydelle
and worse case latencies (allowing for gc times)? Daemeon Reiydelle On Feb 18, 2016 8:57 AM, "Tyler Hobbs" <ty...@datastax.com> wrote: > You can try slightly lowering the bloom_filter_fp_chance on your table. > > Otherwise, it's possible that you're repeatedly queryin

Re: Cassandra Collections performance issue

2016-02-09 Thread daemeon reiydelle
I think the key to your problem might be around "we overwrite every value". You are creating a large number of tombstones, forcing many reads to pull current results. You would do well to rethink why you are having to overwrite values all the time under the same key. You would be better to

Re: Need Feedback about cassandra-stress tests

2016-01-23 Thread daemeon reiydelle
Might I suggest you START by using the default schema provided by cassandra-stress. Using someone else's schema is great AFTER you have used a standard and generally well understood baseline. From that you can decide whether a 4 node x 2 cluster is right for you. FYI, given your 6 way

Re: In UJ status for over a week trying to rejoin cluster in Cassandra 3.0.1

2016-01-17 Thread daemeon reiydelle
What do the logs say on the seed node (and on the UJ node)? Look for timeout messages. This problem has occurred for me when there was high network utilization between the seed and the joining node, also routing issues.

Re: electricity outage problem

2016-01-15 Thread daemeon reiydelle
> persisted Gossip state the seed nodes will again be needed to find the rest >> of the cluster. >> >> I'm not sure whether a power outage is the same as stopping and >> restarting an instance (AWS) in terms of whether the restarted instance >> retains

Re: Encryption in cassandra

2016-01-14 Thread daemeon reiydelle
The keys don't have to be on the box. You do need a login/password for c*. sent from my mobile Daemeon C.M. Reiydelle USA 415.501.0198 London +44.0.20.8144.9872 On Jan 14, 2016 5:16 PM, "oleg yusim" wrote: > Greetings, > > Guys, can you please help me to understand

Re: electricity outage problem

2016-01-12 Thread daemeon reiydelle
This happens when there is insufficient time for nodes coming up to join a network. It takes a few seconds for a node to come up, e.g. your seed node. If you tell a node to join a cluster you can get this scenario because of high network utilization as well. I wait 90 seconds after the first (i.e.

Re: Three questions about cassandra

2015-11-27 Thread daemeon reiydelle
There is a window after a node goes down during which changes that node should have gotten will be kept. If the node is down LONGER than that, it will serve stale data. If the consistency is greater than two, its data will be ignored (if consistency one, its data could be the first returned, if

Re: Repair Hangs while requesting Merkle Trees

2015-11-11 Thread daemeon reiydelle
Have you checked the network statistics on that machine? (netstat -tas) while attempting to repair ... if netstat shows ANY issues you have a problem. If you can put the command in a loop running every 60 seconds for maybe 15 minutes and post back? Out of curiosity, how many remote DC nodes are
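The sampling loop suggested in that message could look roughly like the Python sketch below. It is only an illustration: the function name `sample_command` and its parameters are made up here, and it assumes the command meant is the standard `netstat` tool, passed in as an argument list.

```python
import subprocess
import time
from datetime import datetime


def sample_command(cmd, interval_s=60, duration_s=15 * 60,
                   run=subprocess.run, sleep=time.sleep):
    """Run cmd once per interval_s for duration_s seconds; return (timestamp, stdout) pairs.

    run and sleep are injectable so the loop can be exercised without
    waiting or without the real command being installed.
    """
    samples = []
    count = duration_s // interval_s  # e.g. 900 // 60 = 15 samples
    for i in range(count):
        result = run(cmd, capture_output=True, text=True)
        stamp = datetime.now().isoformat(timespec="seconds")
        samples.append((stamp, result.stdout))
        if i < count - 1:
            sleep(interval_s)  # no need to wait after the final sample
    return samples
```

Calling `sample_command(["netstat", "-tas"])` while the repair is running would collect 15 timestamped snapshots, which can then be printed or written to a file and posted back to the list.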

Re: Can consistency-levels be different for "read" and "write" in Datastax Java-Driver?

2015-10-26 Thread daemeon reiydelle
If one rethinks "consistency" to mean "copies returned" and "copies written" then one can have different values for the former (datastax) and the latter (within Cassandra). The latter changes eventual consistency (e.g. two copies must be written), the former can speed up a result at the (slight)
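The "copies written" vs. "copies returned" framing can be made concrete with a little arithmetic. The Python sketch below is not taken from the thread: the helper names are hypothetical, and it relies on the general rule that a read is guaranteed to touch at least one replica holding the latest write when write copies plus read copies exceed the replication factor.

```python
def quorum(replication_factor):
    """Number of replicas in a majority: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def read_overlaps_write(replication_factor, copies_written, copies_read):
    """True when every read set must intersect every write set (W + R > RF)."""
    for n in (copies_written, copies_read):
        if not 1 <= n <= replication_factor:
            raise ValueError("copy counts must be between 1 and the replication factor")
    return copies_written + copies_read > replication_factor
```

With a replication factor of 3, writing two copies and reading two copies overlaps (2 + 2 > 3), while writing one and reading one does not, which is the faster but only eventually consistent combination.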

Re: How much disk is needed to compact Leveled compaction?

2015-04-05 Thread daemeon reiydelle
You appear to have multiple java binaries in your path. That needs to be resolved. sent from my mobile Daemeon C.M. Reiydelle USA 415.501.0198 London +44.0.20.8144.9872 On Apr 5, 2015 1:40 AM, Jean Tremblay jean.tremb...@zen-innovations.com wrote: Hi, I have a cluster of 5 nodes. We use

Re: COMMERCIAL:Re: Cross-datacenter requests taking a very long time.

2015-04-02 Thread daemeon reiydelle
, and loudly proclaiming “Wow! What a Ride!” - Hunter Thompson Daemeon C.M. Reiydelle USA (+1) 415.501.0198 London (+44) (0) 20 8144 9872* On Thu, Apr 2, 2015 at 12:39 PM, Andrew Vant andrew.v...@rackspace.com wrote: On Mar 31, 2015, at 4:59 PM, daemeon reiydelle daeme...@gmail.com wrote: What

Re: Best practice: Multiple clusters vs multiple tables in a single cluster?

2015-04-02 Thread daemeon reiydelle
Jack did a superb job of explaining all of your issues, and his last sentence seems to fit your needs (and my experience) very well. The only other point I would add is to ascertain if the use patterns commend microservices to abstract from data locality, even if the initial deployment is a noop

Re: Cluster status instability

2015-04-02 Thread daemeon reiydelle
Do you happen to be using a tool like Nagios or Ganglia that is able to report utilization (CPU, load, disk IO, network)? There are plugins for both that will also notify you (depending on whether you enabled the intermediate GC logging) about what is happening. On Thu, Apr 2, 2015 at 8:35

Re: Frequent timeout issues

2015-04-02 Thread daemeon reiydelle
May not be relevant, but what is the default heap size you have deployed? Should be no more than 16 GB (and be aware of the impacts of gc on that large size), suggest not smaller than 8-12 GB. On Wed, Apr 1, 2015 at 11:28 AM, Anuj Wadehra anujw_2...@yahoo.co.in wrote: Are you writing multiple

Re: Column value not getting updated

2015-04-02 Thread daemeon reiydelle
Interesting that you are finding excessive drift from public time servers. I only once saw that problem with AWS' time servers. To be conservative I sometimes recommend that clients spool up their own time server, but realize IT will also drift if the public time servers do! Somewhat different if
