Potential bug re to MVs after upgrading from 3.11.x to 4.0.x

2023-05-31 Thread Rahul Singh
Good afternoon, I am looking into some latent issues following a 3.11.x to 4.0.x upgrade. The system uses materialized views, and the core problem is related to how mutations are sent from the parent table to two related materialized views. In 3.11.x, without any tuning (no flag set for

Re: Ansible Cassandra Collection

2021-03-19 Thread Rahul Singh
/ similar issues. Best regards, Rahul Singh rahul.xavier.si...@gmail.com http://cassandra.link On Fri, Mar 19, 2021 at 9:27 AM Erick Ramirez wrote: > Fantastic, Rhys! Thanks very much. I'm sure it will prove very useful for > users in the community. Cheers! > >>

Re: data modeling qu: use a Map datatype, or just simple rows... ?

2020-09-19 Thread Rahul Singh
with natural string keys like “email.” Best regards, Rahul Singh From: Sagar Jambhulkar Sent: Saturday, September 19, 2020 6:45:25 AM To: user@cassandra.apache.org ; Attila Wind Subject: Re: data modeling qu: use a Map datatype, or just simple rows... ? Don't really see a difference in two options

Cassandra.Link Knowledge Base - v. 0.7 - Jobs Section

2020-08-05 Thread Rahul Singh
Folks, Quick update. We added a jobs section that's aggregating jobs from a few different job markets but only those that relate to Cassandra. Right now it's just a "lucene" based filter from a larger data set, but our team is working to put some ML action to do NLP based classification. Would

Open Request for Reference Architectures / Case Studies of Apache Cassandra

2020-07-31 Thread Rahul Singh
Folks, I'm looking for articles, blogs, diagrams, or someone to answer some questions so that I can help repopulate a *case study database of uses of Cassandra*. I'll be publishing these on the *https://cassandra.link* site. I'm also collecting reference architectures so

Cassandra.Link Knowledge Base - v. 0.5

2020-03-03 Thread Rahul Singh
Our apprentice / analyst team (Tanaka, Jordon, and Cynthia) at Anant.us have been making improvements to the best public collection of curated links for Apache Cassandra at https://Cassandra.Link — they’ve fixed several issues related to the search interface. This week we’ll also be releasing

Cassandra.Link Knowledge Base - v. 0.4

2019-07-20 Thread Rahul Singh
Hey Cassandra community, Thanks for all the feedback in the past on my Cassandra knowledge base project. Without the feedback cycle it’s not really for the community. V. 0.1 - Awesome Cassandra readme.me https://anant.github.io/awesome-cassandra Hundreds of Cassandra articles, tools, etc.

Re: CassKop : a Cassandra operator for Kubernetes developped by Orange

2019-05-24 Thread Rahul Singh
Fantastic! Now there are three teams making k8s operators for C*: Datastax, Instaclustr, and now Orange. rahul.xavier.si...@gmail.com http://cassandra.link I'm speaking at #DataStaxAccelerate, the world’s premiere #ApacheCassandra conference, and I want to see you there! Use my code Singh50 for

Re: Cassandra cross dc replication row isolation

2019-05-07 Thread Rahul Singh
Depends on the consistency level you are setting on write and read. What CL are you writing at and what CL are you reading at? The consistency level tells the coordinator when to send acknowledgement of a write and whether to cross DCs to confirm a write. It also tells the coordinator how many
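The point about consistency levels and crossing DCs can be made concrete with a little arithmetic. The sketch below is only an illustration, not driver code: the function names and the two-DC topology (`dc1`, `dc2`, RF 3 each) are assumptions chosen for the example. It computes how many replica acknowledgements the coordinator waits for at a few common consistency levels, and checks the usual overlap rule that a read is guaranteed to touch a replica holding the latest write only when write acks plus read acks exceed the number of replicas in scope.

```python
from typing import Dict


def quorum(replication_factor: int) -> int:
    """Majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def required_acks(consistency_level: str, rf_by_dc: Dict[str, int], local_dc: str) -> int:
    """Replica acknowledgements the coordinator waits for at a given level.

    rf_by_dc maps a datacenter name to its replication factor (hypothetical topology).
    """
    total_rf = sum(rf_by_dc.values())
    level = consistency_level.upper()
    if level in ("ONE", "LOCAL_ONE"):
        return 1
    if level == "QUORUM":
        # Majority across all datacenters, so remote DCs may be needed.
        return quorum(total_rf)
    if level == "LOCAL_QUORUM":
        # Majority within the coordinator's datacenter only.
        return quorum(rf_by_dc[local_dc])
    if level == "EACH_QUORUM":
        # A majority in every datacenter.
        return sum(quorum(rf) for rf in rf_by_dc.values())
    if level == "ALL":
        return total_rf
    raise ValueError("unsupported consistency level: " + consistency_level)


def overlap_guaranteed(write_acks: int, read_acks: int, replicas_in_scope: int) -> bool:
    """True when every read set must intersect every write set (W + R > N)."""
    return write_acks + read_acks > replicas_in_scope


if __name__ == "__main__":
    topology = {"dc1": 3, "dc2": 3}  # assumed example topology
    for level in ("ONE", "LOCAL_QUORUM", "QUORUM", "EACH_QUORUM", "ALL"):
        print(level, required_acks(level, topology, local_dc="dc1"))
```

Under these assumptions, writing and reading at LOCAL_QUORUM inside the same DC overlaps (2 + 2 > 3), while writing at LOCAL_QUORUM in one DC and reading at LOCAL_QUORUM in the other does not (2 + 2 is not greater than 6), which is why the answer depends on both the write and the read CL.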

Re: Five Questions for Cassandra Users

2019-04-01 Thread Rahul Singh
Answers inline. 1. Do the same people where you work operate the cluster and write the code to develop the application? No, but the operators need to know development, data modeling, and generally how to "code" the application. (Coding is a low-level task of assigning a code to a

Re: How do u setup networking for Opening Solr Web Interface when on cloud?

2019-04-01 Thread Rahul Singh
This is probably not a question for this community, but rather for Datastax support or the Datastax Academy Slack group. More specifically, this is a "how to expose Solr securely" question, which is amply answered on the interwebs if you look for it on Google. rahul.xavier.si...@gmail.com

Re: TWCS Compactions & Tombstones

2019-03-26 Thread Rahul Singh
What's your time window? Roughly how much data is in each window? If you examine the SSTable data and see that it is truly old data with little chance of containing any new data, you can just remove the SSTables. You can do a rolling restart: take down a node, remove mc-254400-*, and then start it up.

Re: Merging two cluster's in to one without any downtime

2019-03-26 Thread Rahul Singh
In my experience, I'd use two methods to make sure that you are covering your ass. 1. "old school" methodology would be to do the SStable load from old to new cluster -- if you do incremental snapshots, then you could technically minimize downtime and just load the latest increments with a little

Re: good monitoring tool for cassandra

2019-03-14 Thread Rahul Singh
I wrote this last year. It's mostly still relevant --- as Jonathan said, Prometheus+Grafana is the best "make your own hammers and nails" approach. https://blog.anant.us/resources-for-monitoring-datastax-cassandra-spark-solr-performance/ On Thu, Mar 14, 2019 at 8:13 PM Jonathan Haddad wrote:

Re: Adding New Column with Default Value

2019-03-14 Thread Rahul Singh
*Spark.* Alter the table, add a column. Run a spark job to scan your table, and set a value. * val myKeyspace = "pinch" val myTable = "hitter"* *def updateColumns(row: CassandraRow): CassandraRow = { * * val inputMap = row.toMap val newData = Map( "newColumn" -> "somevalue" ) * * var outputMap

Re: update manually rows in cassandra

2019-03-14 Thread Rahul Singh
CQL supports JSON in and out from the Cassandra table, but if your JSON in the table is a string, then you need to update it as a string. https://docs.datastax.com/en/cql/3.3/cql/cql_using/useInsertJSON.html https://docs.datastax.com/en/cql/3.3/cql/cql_using/useQueryJSON.html What's the schema

Re: Inconsistent results after restore with Cassandra 3.11.1

2019-03-14 Thread Rahul Singh
Can you define "inconsistent" results? What's the topology of the cluster? What were you expecting and what did you get? On Thu, Mar 14, 2019 at 7:09 AM sandeep nethi wrote: > Hello, > > Does anyone experience inconsistent results after restoring Cassandra > 3.11.1 with refresh command? Was

Re: [EXTERNAL] Re: Migrate large volume of data from one table to another table within the same cluster when COPY is not an option.

2019-03-14 Thread Rahul Singh
Adding to Stefan's comment: there is a "scylladb" migrator, which uses the Spark connector from Datastax and theoretically can work on any Cassandra-compliant DB, so it should not be limited to Cassandra-to-Scylla migrations.

Re: Audit in C*

2019-03-13 Thread Rahul Singh
Which version are you referring to? On Wed, Mar 13, 2019 at 10:28 AM Nitan Kainth wrote: > Hi, > > Anybody have used auditing to find out failed login attempts, or > unauthorized access tries. > > I found ecAudit by Ericsson, is it free to use? Has anybody tried it? > > Ref:

Re: AxonOps - Cassandra operational management tool

2019-03-12 Thread Rahul Singh
Nice.. Good to see the community producing tools around the Cassandra product. Few pieces of feedback *Kudos* 1. Glad that you are doing it 2. Looks great 3. Willing to try it out if you find this guy called "Free Time" for me :) *Criticism* 1. It mimics a lot of stack components that are out

Re: cassandra upgrades multi-DC in parallel

2019-03-12 Thread Rahul Singh
Carl, If you have automated it and tested it a few times on a lower environment with the same data from production, I'd say go for it, but as Jonathan said, if there's an issue, you won't be able to continue operations. On Tue, Mar 12, 2019 at 3:20 PM Jonathan Haddad wrote: > Nothing

Re: [EXTERNAL] RE: SASI queries- cqlsh vs java driver

2019-02-27 Thread Rahul Singh
+1 on Datastax and could consider looking at Elassandra. On Thu, Feb 7, 2019 at 9:14 AM Durity, Sean R wrote: > Kenneth is right. Trying to port/support a relational model to a CQL model > the way you are doing it is not going to go well. You won’t be able to > scale or get the search

Re: High GC pauses leading to client seeing impact

2019-02-27 Thread Rahul Singh
There are a few factors: sometimes data that is in a fat partition clogs up the heap space / memtable space, and tombstones don't help that much either. This is worsened by data skew. I agree, if CMS is working for now, continue using it and then upgrade to better versions of Java / C*. Few

Re: Connection status on cluster exposed anywhere?

2019-02-27 Thread Rahul Singh
You can get statistics at a table level, and you can get some information at a keyspace level, but it's an approximation. Better to get table-level statistics and aggregate up. Here are some pointers. https://blog.anant.us/resources-for-monitoring-datastax-cassandra-spark-solr-performance/ On Wed, Feb 27,

Re: Feedback wanted for Knowledge base for all things cassandra (cassandra.link)

2019-02-25 Thread Rahul Singh
> Author/Presenter > > Publisher/Producer/Event > > > > Thank you for the continuing effort you have made on this project Rahul! > > > > Kenneth Brotman > > > > *From:* Rahul Singh [mailto:rahul.xavier.si...@gmail.com] > *Sent:* Monday, Feb

Feedback wanted for Knowledge base for all things cassandra (cassandra.link)

2019-02-25 Thread Rahul Singh
Folks, I've been scrounging time to work on a knowledge resource for all things Cassandra (Cassandra, DSE, Scylla, YugaByte, Elassandra). I feel like the Cassandra core community still has the most knowledge even though people are fragmenting into their brands. Would love to get your feedback

Re: C* as fluent data storage, 10MB/sec/node?

2018-12-20 Thread Rahul Singh
Agree with Jeff on TWCS. Also look at https://github.com/paradoxical-io/cassieq for reference. Good ideas for a queue on Cassandra. Rahul Singh Chief Executive Officer m 202.905.2818 Anant Corporation 1010 Wisconsin Ave NW, Suite 250 Washington, D.C. 20007 We build and manage digital business

Re: Optimizing for connections

2018-12-20 Thread Rahul Singh
See inline Rahul Singh Chief Executive Officer m 202.905.2818 Anant Corporation 1010 Wisconsin Ave NW, Suite 250 Washington, D.C. 20007 We build and manage digital business technology platforms. On Dec 9, 2018, 2:02 PM -0500, Devaki, Srinivas , wrote: > Hi Guys, > > Have a couple of

Re: Alter table

2018-12-20 Thread Rahul Singh
If you use collections such as a map you could get by with just upserts. A collection in a column gives you the ability to have “flexible” schema for your “documents” as in mongo while the regular fields can act as “records” as in a more Traditional table. Rahul Singh Chief Executive Officer m

Re: Cassandra repair in different version

2018-09-21 Thread Rahul Singh
Is there a reason why these versions are so different? I would recommend bringing 3.0.6 to 3.0.13 before doing cluster-wide commands. Rahul Singh Chief Executive Officer m 202.905.2818 Anant Corporation 1010 Wisconsin Ave NW, Suite 250 Washington, D.C. 20007 We build and manage digital

Re: Cassandra system table diagram

2018-09-21 Thread Rahul Singh
I think his question was related specifically to the system tables. KDM is a good tool for designing the tables but not necessarily for viewing the system tables. Abdul, try out a tool called DB Schema Visualizer. It supports Cassandra Rahul Singh Chief Executive Officer m 202.905.2818 Anant

Re: Scrub a single SSTable only?

2018-09-11 Thread Rahul Singh
What’s the RF for that data? If you can manage downtime on one node, I’d recommend just bringing it down, and then repairing after you delete the bad file and bring it back up. Rahul Singh Chief Executive Officer m 202.905.2818 Anant Corporation 1010 Wisconsin Ave NW, Suite 250 Washington, D.C

Re: Using CDC Feature to Stream C* to Kafka (Design Proposal)

2018-09-11 Thread Rahul Singh
p 10, 2018 at 3:08 PM, Rahul Singh > > wrote: > > > In response to mimicking Advanced replication in DSE. I understand the > > > goal. Although DSE advanced replication does one way, those are use cases > > > with limited value to me because ultimately it’s sti

Re: Regarding migrating data from Oracle to Cassandra

2018-09-10 Thread Rahul Singh
Look into Kafka Connect. It does tracking internally in a topic. Works better going from relational to Cassandra. Still won’t fix your potential data model issue related to skew and wide partitions. Rahul Singh Chief Executive Officer m 202.905.2818 Anant Corporation 1010 Wisconsin Ave NW

Re: Using CDC Feature to Stream C* to Kafka (Design Proposal)

2018-09-10 Thread Rahul Singh
on both clusters / DBs. All that means is that I need to sequence the change before it happens so I can predictably ensure it’s scheduled for write / mutation. So I’m back to square one: having a definitive queue / ledger separate from the individual commit log of the cluster. Rahul Singh

Re: Using CDC Feature to Stream C* to Kafka (Design Proposal)

2018-09-10 Thread Rahul Singh
you’ll want to do what Jon suggested and source the event from Kafka for all subsequent processes rather than process in Cassandra and then create the event in Kafka. Rahul Singh Chief Executive Officer m 202.905.2818 Anant Corporation 1010 Wisconsin Ave NW, Suite 250 Washington, D.C. 20007 We

Re: [EXTERNAL] Regarding migrating data from Oracle to Cassandra

2018-09-05 Thread Rahul Singh
Look here for some “migration” or data modeling articles. https://anant.github.io/awesome-cassandra/ Rahul Singh Chief Executive Officer m 202.905.2818 Anant Corporation 1010 Wisconsin Ave NW, Suite 250 Washington, D.C. 20007 We build and manage digital business technology platforms. On Sep 5

Re: [EXTERNAL] Regarding migrating data from Oracle to Cassandra

2018-09-05 Thread Rahul Singh
providing. Rahul Singh Chief Executive Officer m 202.905.2818 Anant Corporation 1010 Wisconsin Ave NW, Suite 250 Washington, D.C. 20007 We build and manage digital business technology platforms. On Sep 5, 2018, 10:47 AM -0500, Jeff Jirsa , wrote: > All of  Sean's points are good, a few m

Re: Datastax encryption with kms

2018-09-04 Thread Rahul Singh
This is a Cassandra user group — consider joining the Datastax Academy Slack group and asking there. Rahul Singh Chief Executive Officer m 202.905.2818 Anant Corporation 1010 Wisconsin Ave NW, Suite 250 Washington, D.C. 20007 We build and manage digital business technology platforms. On Sep 4

Re: A blog about Cassandra in the IoT arena

2018-08-29 Thread Rahul Singh
research paper about Dotted DB and an > attempt to make delete without using tombstones:  > http://haslab.uminho.pt/tome/files/dotteddb_srds.pdf > > > > > On Fri, Aug 24, 2018 at 12:38 AM, Rahul Singh > > wrote: > > > Agreed. One of the ideas I had on partitio

RE: [EXTERNAL] Re: Re: bigger data density with Cassandra 4.0?

2018-08-29 Thread Rahul Singh
YugaByte is another new dancer in the Cassandra dance. The data store is based on RocksDB, and it’s written in C++. Although they are wire compliant with C*, I’m pretty sure everything under the hood is NOT a port like Scylla was initially. Rahul Singh Chief Executive Officer m 202.905.2818

Re: Tombstone experience

2018-08-24 Thread Rahul Singh
it to set a short TTL that you would have deleted and that will eventually clear our data depending on the value you set. My suggestion for those cases where you must do business rules deletions, use a continuous spark job / Spark streaming on another DC to maintain data hygiene. Rahul Singh Chief

Re: How to rename the column name in Cassandra tables

2018-08-23 Thread Rahul Singh
a static column, since you cannot use a static column in the table's primary key. Rahul Singh Chief Executive Officer m 202.905.2818 Anant Corporation 1010 Wisconsin Ave NW, Suite 250 Washington, D.C. 20007 We build and manage digital business technology platforms. On Aug 13, 2018, 7:42 AM -0500

Re: Fwd: Removing Extra Spaces and Row counts while using Capture Command

2018-08-23 Thread Rahul Singh
What’s your goal? Just output the results and save as JSON? There may be a better way to do what you want. https://github.com/tenmax/cqlkit/blob/master/README.md Rahul Singh Chief Executive Officer m 202.905.2818 Anant Corporation 1010 Wisconsin Ave NW, Suite 250 Washington, D.C. 20007 We

Re: Cassandra 2.2.7 Compaction after Truncate issue

2018-08-23 Thread Rahul Singh
David , What CL do you set when running this command? Rahul Singh Chief Executive Officer m 202.905.2818 Anant Corporation 1010 Wisconsin Ave NW, Suite 250 Washington, D.C. 20007 We build and manage digital business technology platforms. On Aug 14, 2018, 11:49 AM -0500, David Payne , wrote

Re: 90million reads

2018-08-23 Thread Rahul Singh
Agreed. If your data model is good and no major read latencies due to little or no data skew, wide partitions, or tombstones, you can literally scale linearly. You could also consider having a plan in which you ramp up as the traffic increases. Rahul Singh Chief Executive Officer m

Re: A blog about Cassandra in the IoT arena

2018-08-23 Thread Rahul Singh
n index file) > 2. tombstone a non-issue > > that day, Cassandra will dominate any other IoT technology out there > > Until then ... > > > On Thu, Aug 23, 2018 at 4:54 PM, Rahul Singh > > wrote: > > > Good analysis of how the different key structures affect

Re: A blog about Cassandra in the IoT arena

2018-08-23 Thread Rahul Singh
Good analysis of how the different key structures affect use cases and performance. I think you could extend this article with potential evaluation of FiloDB which specifically tries to solve the OLAP issue with arbitrary queries. Another option is leveraging Elassandra (index in Elasticsearch

Re: Work in Progress - Awesome Cassandra Resources w/ Outline

2018-08-22 Thread Rahul Singh
good, I would say https://academy.datastax.com/s > upport-blog/deeper-dive-diagnosing-dse-performance-issues-ttop-and- > multidump > > And since now there is a official blog, wouldn't be good to have this > resources there? > > Regards, > Horia > > On ons, 2018-08-08 at 07:14 -0400,

Re: Repair daily refreshed table

2018-08-19 Thread Rahul Singh
because I'm overwriting whole table with new TTL, process > creates tons of tombstones and I'm more concerned with them. > > Regards, > Maxim. > > > On Sun, Aug 19, 2018 at 3:02 AM Rahul Singh > > wrote: > > > Are you loading using a batch process? What’s

Re: Repair daily refreshed table

2018-08-18 Thread Rahul Singh
Are you loading using a batch process? What’s the frequency of the data ingest, and does it have to be very fast? If it is not too frequent and can be a little slower, you may consider a higher consistency level to ensure data is on replicas. Rahul On Aug 18, 2018, 2:29 AM -0700, Maxim Parkachov , wrote: > Hi

Re: ETL options from Hive/Presto/s3 to cassandra

2018-08-07 Thread Rahul Singh
Spark is scalable to as many nodes as you want and could be collocated with the data nodes. sstableloader won’t be as performant for larger datasets. Although it can be run in parallel on different nodes, I don’t believe it to be as fault tolerant. If you have to do it continuously I would even

Re: Hinted Handoff

2018-08-07 Thread Rahul Singh
What is the data size that you are talking about? What is your compaction strategy? I wouldn’t recommend having such an aggressive TTL. Why not put a clustering key that allows you to get the data fairly quickly but have a longer TTL? Cassandra can still be used if there is a legitimate

Re: Huge daily outbound network traffic

2018-08-07 Thread Rahul Singh
Are you sure you don’t have an outside process that is doing an export, Spark job, or non-AWS managed backup process? Is this network out from Cassandra or from the network? Rahul On Aug 7, 2018, 4:09 AM -0400, Behnam B.Marandi , wrote: > Hi, > I have a 3 node Cassandra cluster (version 3.11.1)

Re: Data model storage optimization

2018-07-29 Thread Rahul Singh
How many rows on average per partition? Let me get this straight: you are bifurcating your partitions on either email or username, essentially potentially doubling the data because you don’t have a way to manage a central system of record of users? I would do this: (my opinion) Migrate to a

Re: optimization to cassandra-env.sh

2018-07-29 Thread Rahul Singh
Depends on which GC you are using but you can definitely manage GC - but you will always be stuck to the upper limit of memory. I found the Hubspot gc visualizer and the associated blog post very helpful in the past. https://github.com/HubSpot/gc_log_visualizer/blob/master/README.md

Re: cassandro nodes restarts

2018-07-29 Thread Rahul Singh
Need to review Java GC, system, network, disk, memory, node, and table statistics. A lot can be discerned from visually examining the charts, e.g. whether the failing node is the one with the most local reads, the one with the most writes, or completely unrelated. Since it’s a distributed

Re: Cassandra crashes after loading data with sstableloader

2018-07-29 Thread Rahul Singh
What does “hash” Data look like? Rahul On Jul 24, 2018, 11:30 AM -0400, Arpan Khandelwal , wrote: > I need to clone data from one keyspace to another keyspace. > We do it by taking snapshot of keyspace1 and restoring in keyspace2 using > sstableloader. > > Suppose we have following table with

Work in Progress - Bringing it all together in one "Awesome Cassandra" README

2018-07-26 Thread Rahul Singh
uting (e.g. Kafka, Spark, Akka, Kubernetes, etc.) . I've got about ~120 or so resources organized in this Readme, and I have a queue of another 100 or so. Please feel free to send me any focused Cassandra blogs related to development, architecture, or devops. Thanks, Rahul Singh Chief Executi

Re: Infinite loop of single SSTable compactions

2018-07-26 Thread Rahul Singh
Few questions What is your maximumcompactedbytes across the cluster for this table ? What’s your TTL ? What does your data model look like as in what’s your PK? Rahul On Jul 25, 2018, 1:07 PM -0400, James Shaw , wrote: > nodetool compactionstats  --- see compacting which table > nodetool

Re: cassandro nodes restarts

2018-07-26 Thread Rahul Singh
Do the same nodes reboot or is it arbitrary? I’m wondering if it’s an isolated incident related to data / traffic skew or could happen on any coordinator. Rahul On Jul 26, 2018, 12:31 AM -0400, Jeff Jirsa , wrote: > It’s a warning, but probably not causing you problems > > A 20kB batch is a hint

Re: apache cassandra development process and future

2018-07-18 Thread Rahul Singh
acle now has a Datastax offering 3. Mesosphere offers supported versions of Cassandra and Datastax 4. Kubernetes and related purveyors use Cassandra as prime example as a part of a Kubernetes backed cloud agnostic orchestration framework 5. What Alain mentioned earlier. -- Rahul Singh rahul.si...@

Re: Cassandra node RAM amount vs data-per-node/total data?

2018-07-17 Thread Rahul Singh
heap space, so unnecessary GC pressure even with G1GC … which has STW pauses … eventually. Non-response was generally due to GC pauses… (considering that the data model was good all around) On Jul 17, 2018, 10:39 AM -0400, Vsevolod Filaretov , wrote: > @Rahul Singh thank you for the answer! >

Re: Cassandra Repair

2018-07-17 Thread Rahul Singh
17, 2018, at 4:45 AM, Rahul Singh > > wrote: > > > > Have you considered looking into reaper project — could save you time in > > figuring out your own strategy.  > > https://github.com/thelastpickle/cassandra-reaper > > > > Otherwise you can always do a

Re: Cassandra node RAM amount vs data-per-node/total data?

2018-07-17 Thread Rahul Singh
~ 128GB. The lowest I’ve gone is 16GB but that’s for dev purposes only. -- Rahul Singh rahul.si...@anant.us https://www.anant.us/datastax Anant Corporation On Jul 17, 2018, 8:26 AM -0400, Vsevolod Filaretov , wrote: > What are general community and/or your personal experience viewpoi

RE: [EXTERNAL] New cluster vs Increasing nodes to already existed cluster

2018-07-17 Thread Rahul Singh
You can make new clusters or you can isolate with datacenters that don’t have a keyspace replicated. On Jul 16, 2018, 10:41 AM -0400, Durity, Sean R , wrote: > In most cases, we separate clusters by application. This does help with > isolating problems. A bad query in one application won’t

Re: Cassandra Repair

2018-07-17 Thread Rahul Singh
less than your shortest GC grace seconds. So if you have a GC of 10 days, you want to complete your repairs in 9 days… -- Rahul Singh rahul.si...@anant.us Anant Corporation On Jul 16, 2018, 5:15 PM -0400, rajasekhar kommineni , wrote: > Hello All, > > > I have all cluster no
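The rule of thumb above (finish a full repair cycle in less time than the shortest gc_grace_seconds) is simple arithmetic, sketched here as a small helper. The function name, the table names, and the one-day safety margin are assumptions made for illustration; the one-day margin simply mirrors the "10 days, so repair within 9" example in the message. 864000 seconds is Cassandra's default gc_grace_seconds (10 days).

```python
from datetime import timedelta
from typing import Dict


def repair_cycle_budget(
    gc_grace_seconds_by_table: Dict[str, int],
    safety_margin: timedelta = timedelta(days=1),  # assumed margin, tune to taste
) -> timedelta:
    """Longest time a full repair cycle may take, based on the shortest gc_grace_seconds."""
    if not gc_grace_seconds_by_table:
        raise ValueError("at least one table is required")
    shortest = min(gc_grace_seconds_by_table.values())
    budget = timedelta(seconds=shortest) - safety_margin
    if budget <= timedelta(0):
        raise ValueError("gc_grace_seconds is too short for the chosen safety margin")
    return budget


if __name__ == "__main__":
    # Hypothetical tables: one with the default 10 days, one lowered to 5 days.
    tables = {"ks.events": 864000, "ks.sessions": 432000}
    print(repair_cycle_budget(tables))
```

The table with the shortest setting drives the schedule, so lowering gc_grace_seconds on a single table tightens the repair window for the whole cycle.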

Re: Bind keyspace to specific data directory

2018-07-17 Thread Rahul Singh
What’s the goal, Abdul? Is it for security reasons or for organizational reasons? You could try prefixing / suffixing the keyspace names if it’s for organizational reasons (for now) if you don’t want to do the manual management of mounts as Anthony suggested. -- Rahul Singh rahul.si

Re: Cassandra recommended server uptime?

2018-07-17 Thread Rahul Singh
It’s likely that if you have server stability issues, it’s because of data model or compaction strategy configurations which lead to out of memory issues or massive GC pauses. Rebooting wouldn’t solve those issues. -- Rahul Singh rahul.si...@anant.us Anant Corporation On Jul 17, 2018, 7:28 AM

Clarification needed on how triggers execute on batch mutations

2018-07-12 Thread Rahul Singh
management , I am expecting that regardless of whether I'm doing a logged or unlogged batch, the trigger on any given table will only be triggered once per mutated partition. Is my assumption correct? Rahul Singh Chief Executive Officer | Internet Architecture https://www.anant.us/datastax m

Re: Jmx_exporter CPU spike

2018-07-10 Thread Rahul Singh
Nice find, Ben. I added this to my list of c* monitoring tools. -- Rahul Singh rahul.si...@anant.us Anant Corporation On Jul 9, 2018, 8:20 PM -0500, rajpal reddy , wrote: > Thanks Ben!. will look into it > > On Jul 9, 2018, at 10:42 AM, Ben Bromhead wrote: > > > > Hi Rajp

Re: Installation

2018-07-10 Thread Rahul Singh
turn on the new binaries, one node at a time. -- Rahul Singh rahul.si...@anant.us Anant Corporation On Jul 9, 2018, 6:35 PM -0500, rajpal reddy , wrote: > We have our infrastructure in cloud so opted for adding new dc with tar.gz > then removed the old dc with package installation > &

Re: Jmx_exporter CPU spike

2018-07-08 Thread Rahul Singh
How often are you polling the JMX? How much of a spike are you seeing in CPU? -- Rahul Singh rahul.si...@anant.us Anant Corporation On Jul 5, 2018, 2:45 PM -0500, rajpal reddy , wrote: > > we have Qualys security scan running causing the cpu spike. We are seeing the > CPU spike only

Re: Is there a plan for Feature like this in C* ?

2018-07-03 Thread Rahul Singh
Some of my links related to Kafka and Cassandra http://leaves.anant.us/#!/leaf/10767?tag=cassandra,kafka Rahul On Jul 3, 2018, 11:48 AM -0400, Joshua Galbraith , wrote: > There is more info and background context on CDC here: > https://issues.apache.org/jira/browse/CASSANDRA-8844 > > > On Mon,

Re: Is there a plan for Feature like this in C* ?

2018-07-03 Thread Rahul Singh
There is a source connector from Landoop for Kafka Connect but it is based on polling a “kcql” select statement. They claim to be working on a CDC source connector for Kafka Connect but I couldn’t find anything. Smart Cat Labs has a CDC trigger based Kafka producer but I don’t think it uses

Resources for Monitoring Cassandra, Spark, Solr

2018-07-02 Thread Rahul Singh
/ This is a work in progress and I'll update this with screenshots as well as with links from other contributors. -- Rahul Singh rahul.si...@anant.us Anant Corporation

Re: C* in multiple AWS AZ's

2018-06-29 Thread Rahul Singh
> > rebuild. > > > > > > > > > > On another note you could just replace the nodes but use GPFS instead > > > > > of EC2 snitch, using the same rack name. > > > > > > > > > > > On Fri., 29 Jun. 2018, 00:19 Rahul Sin

Re: Check Cluster Health

2018-06-28 Thread Rahul Singh
When you run the tpstats or tablestats subcommands in nodetool you are actually accessing data inside Cassandra via JMX. You can start there at first. Rahul On Jun 28, 2018, 10:55 AM -0500, Thouraya TH , wrote: > Hi, > > Please, how can check the health of my cluster / data center using cassandra

Re: C* in multiple AWS AZ's

2018-06-28 Thread Rahul Singh
carry > a heavy load until the others are migrated over? > and then I think "repair" to cleanup the replications? > > > > On Thu, Jun 28, 2018 at 10:09 AM, Rahul Singh > > wrote: > > > You don’t have to use EC2 snitch on AWS but if you hav

Re: C* in multiple AWS AZ's

2018-06-28 Thread Rahul Singh
You don’t have to use EC2 snitch on AWS, but if you have already started with it, it may put a node in a different DC. If your data density won’t be ridiculous, you could add 3 to a different DC / Region and then sync up. After the new DC is operational you can remove one at a time on the old DC

Re: How do you monitoring Cassandra Cluster?

2018-06-21 Thread Rahul Singh
I’ve collected a bunch at http://leaves.anant.us/#!/?tag=cassandra,monitoring I recommend Grafana / Prometheus if you don’t have DSE (which has OpsCenter) -- Rahul Singh rahul.si...@anant.us Anant Corporation On Jun 19, 2018, 1:06 PM -0400, Romain Gérard , wrote: > Hi Felipe, > > Yo

RE: [EXTERNAL] Re: Tombstone

2018-06-21 Thread Rahul Singh
. -- Rahul Singh rahul.si...@anant.us Anant Corporation On Jun 19, 2018, 12:39 PM -0400, Durity, Sean R , wrote: > This sounds like a queue pattern, which is typically an anti-pattern for > Cassandra. I would say that it is very difficult to get the access patterns, > tombstones, and everyt

RE: how to avoid lightwieght transactions

2018-06-21 Thread Rahul Singh
A read before write is always going to be tremendously more expensive than just writing. Depending on your architecture you may consider both of the options described. If you have a CQRS architecture and are processing an event queue, doing LWT / read before write, then your “write” is processed

Re: Options to replace hardware of the cluster

2018-06-14 Thread Rahul Singh
How much data do you have and what is the timeline? If you can manage with a maintenance window the snapshot / move and restore method may be the fastest. Streaming data can take a long time to sync two DCs if there is a lot of data. -- Rahul Singh rahul.si...@anant.us Anant Corporation On Jun

Re: Options to replace hardware of the cluster

2018-06-14 Thread Rahul Singh
For no downtime and no lost data, I would make a new DC in the same cluster, and wait for the data / MVs to stream over. Otherwise, the best way is to snapshot everything and bring up the nodes all at once. On Jun 14, 2018, 4:11 AM -0400, Christian Lorenz , wrote: > Hi, > > we need to move our

Re: nodetool repair -pr

2018-06-08 Thread Rahul Singh
From DS docs: "Do not use -pr with this option to repair only a local data center." On Jun 8, 2018, 10:42 AM -0400, user@cassandra.apache.org, wrote: > > nodetool repair -pr

Re: Certified Cassandra for Enterprise use

2018-05-31 Thread Rahul Singh
is a DataStax services partner. -- Rahul Singh rahul.si...@anant.us Anant Corporation On May 29, 2018, 4:01 AM -0400, Ben Slater , wrote: > Hi Pranay > > We (Instaclustr) provide enterprise support for Cassandra > (https://www.instaclustr.com/services/cassandra-support/) which may cover

Re: Fwd: Re: cassandra update vs insert + delete

2018-05-30 Thread Rahul Singh
> My 2 cents, if you want to update some information just update it. There’s > > no need to overthink it. > > > > Batches are good if they’re constrained to a single partition, not so hot > > otherwise. > > > > > > On Sun, May 27, 2018 at 8:1

Re: cassandra update vs insert + delete

2018-05-27 Thread Rahul Singh
Deletes create tombstones — not really something to consider. Better to add / update or insert data and do a soft delete on old data and apply a TTL to remove it at a future time. -- Rahul Singh rahul.si...@anant.us Anant Corporation On May 27, 2018, 5:36 AM -0400, onmstester onmstester

Re: EXT: Cassandra Monitoring tool

2018-05-25 Thread Rahul Singh
Good article about it on LI https://www.linkedin.com/pulse/snap-cassandra-s3-tablesnap-vijaya-kumar-hosamani/ On May 25, 2018, 2:52 PM -0500, Joaquin Casares , wrote: > Hello Aneesh, > > While this doesn't provide a GUI, tablesnap is a community tool that does a >

Re: estimated number of keys vs ttl

2018-05-23 Thread Rahul Singh
If the TTL actually reduces the key count , should. It’s possible to TTL a row from a partition but not the whole partition. 1 key = 1 partition != 1 row != 1 cell -- Rahul Singh rahul.si...@anant.us Anant Corporation On May 23, 2018, 6:07 AM -0500, Grzegorz Pietrusza <gpietru...@gmail.

Re: How to measure time to execute joinToCassandraTable

2018-05-13 Thread Rahul Singh
and saved them into Cassandra. I could then later get time aggregates and average times per operation. -- Rahul Singh rahul.si...@anant.us Anant Corporation On May 13, 2018, 4:14 PM -0500, Guillermo Ortiz <konstt2...@gmail.com>, wrote: > I'm using the driver from Cassandra-Spark, I w

Re: Determining active sstables and table- dir

2018-05-01 Thread Rahul Singh
Schema column families is the most authoritative. You may have different data directories. -- Rahul Singh rahul.si...@anant.us Anant Corporation On Apr 27, 2018, 1:24 PM -0700, Carl Mueller <carl.muel...@smartthings.com>, wrote: > IN cases where a table was dropped and re-added, ther

Re: GUI clients for Cassandra

2018-04-23 Thread Rahul Singh
Zeppelin and Dbeaver EE are both good. -- Rahul Singh rahul.si...@anant.us Anant Corporation On Apr 23, 2018, 12:53 AM -0400, Eunsu Kim <eunsu.bil...@gmail.com>, wrote: > I am now writing dbeaver EE, but I’m waiting for TeamSQL (https://teamsql.io) > to support cassandra. > >

Re: read repair with consistency one

2018-04-21 Thread Rahul Singh
Read repairs are one anti-entropy measure. Continuous repair is another. If you do repairs via Reaper or your own method it will resolve your discrepancies. On Apr 21, 2018, 3:16 AM -0400, Grzegorz Pietrusza , wrote: > Hi all > > I'm a bit confused with how read repair

Re: copy from one table to another

2018-04-21 Thread Rahul Singh
bles where keyspace_name='test' and > table_name='usr'; > >  id > -- >  ea2f6da0-f931-11e7-8224-43ca70555242 > > > Directory name: > ./data/test/usr-ea2f6da0f93111e7822443ca70555242 > > Correct? > > Regards, > Kyrill > From: Rah
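
The mapping quoted in the snippet above (table id to data directory name) can be sketched in a few lines of Python. This is a minimal illustration, assuming the directory name is the table name, a hyphen, and the table id with its hyphens removed, which is what the example in the thread shows; the helper name `table_dir_name` is hypothetical and not part of any Cassandra tooling.

```python
import uuid


def table_dir_name(table_name: str, table_id: str) -> str:
    """Return '<table_name>-<table id as 32 hex characters, no hyphens>'.

    Assumption: this mirrors the directory naming shown in the thread;
    verify against the actual data directory before relying on it.
    """
    # uuid.UUID validates the id, and .hex yields it without hyphens.
    return f"{table_name}-{uuid.UUID(table_id).hex}"


if __name__ == "__main__":
    # Values taken from the snippet above.
    print(table_dir_name("usr", "ea2f6da0-f931-11e7-8224-43ca70555242"))
```

Comparing the result against the directories actually present under the keyspace's data path is one way to tell which directory belongs to the live table when an older one is left over from a dropped table of the same name.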

Re: copy from one table to another

2018-04-19 Thread Rahul Singh
Each table has a different Guid — doing a hard link may work as long as the sstable dir’s guid is the same as the newly created table in the system schema. -- Rahul Singh rahul.si...@anant.us Anant Corporation On Apr 19, 2018, 10:41 AM -0500, Kyrylo Lebediev <kyrylo_lebed...@epam.com>,

Re: Phantom growth resulting automatically node shutdown

2018-04-19 Thread Rahul Singh
to data growth. What does your cfstats / tablestats say? Are you monitoring your key tables data via cfstats metrics like SpaceUsedLive or SpaceUsedTotal? What is your snapshotting / backup process doing? -- Rahul Singh rahul.si...@anant.us Anant Corporation On Apr 19, 2018, 7:01 AM -0500, horschi

Re: where does c* store the schema?

2018-04-18 Thread Rahul Singh
, It should catch up but every now and then if the changes are too great, it’s easier to run nodetool resetlocalschema https://docs.datastax.com/en/cassandra/2.1/cassandra/tools/toolsResetLocalSchema.html -- Rahul Singh rahul.si...@anant.us Anant Corporation On Apr 18, 2018, 1:17 AM -0500, Jinhua

Re: multiple table directories for system_schema keyspace

2018-04-17 Thread Rahul Singh
it reinitialized the system. -- Rahul Singh rahul.si...@anant.us Anant Corporation On Apr 17, 2018, 2:25 PM -0500, John Sanda <john.sa...@gmail.com>, wrote: > On a couple different occasions I have run into this exception at start up: >
