Re: Is periodic manual repair necessary?

2017-02-28 Thread Thakrar, Jayesh
after some time and is not guaranteed. On Monday, February 27, 2017, Thakrar, Jayesh <jthak...@conversantmedia.com> wrote: Thanks Roth and Oskar for your quick responses. This is a single datacenter, multi-rack setup. > A TTL is technically simila

Is periodic manual repair necessary?

2017-02-27 Thread Thakrar, Jayesh
Suppose I have an application, where there are no deletes, only 5-10% of rows being occasionally updated (and that too only once) and a lot of reads. Furthermore, I have replication = 3 and both read and write are configured for local_quorum. Occasionally, servers do go into maintenance. I
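As a small worked example of the setup described in this post (replication = 3, reads and writes at local_quorum), the sketch below computes the quorum size and checks whether a read set and a write set must share a replica. The function names are hypothetical and chosen only for illustration. It does not by itself settle whether periodic repair is needed; it only shows the overlap that successful quorum reads and writes provide.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond for a QUORUM / LOCAL_QUORUM request."""
    return replication_factor // 2 + 1


def reads_overlap_writes(rf: int, read_replicas: int, write_replicas: int) -> bool:
    """True when every read set shares at least one replica with every write set."""
    return read_replicas + write_replicas > rf


rf = 3  # replication factor mentioned in the post
q = quorum(rf)
print("quorum size:", q)
print("reads overlap writes:", reads_overlap_writes(rf, q, q))
```

With a replication factor of 3 the quorum is 2, and 2 + 2 > 3, so a quorum read always touches at least one replica that acknowledged the latest quorum write.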

Re: Is periodic manual repair necessary?

2017-02-27 Thread Thakrar, Jayesh
antuee a 100% that data is read-repaired before gc_grace_seconds after the data has been TTL'ed, you won't need an extra repair. 2017-02-27 18:29 GMT+01:00 Oskar Kjellin <oskar.kjel...@gmail.com>: Are you running multi dc? Sent from my iPad

Re: [Cassandra 3.0.9] Cannot allocate memory

2017-03-22 Thread Thakrar, Jayesh
Is/are the Cassandra server(s) shared? E.g. do they run mesos + spark? From: Abhishek Kumar Maheshwari Date: Wednesday, March 22, 2017 at 12:45 AM To: "user@cassandra.apache.org" Subject: [Cassandra 3.0.9] Cannot allocate memory

Re: [Cassandra 3.0.9] Cannot allocate memory

2017-03-23 Thread Thakrar, Jayesh
wari <abhishek.maheshw...@timesinternet.in> Date: Wednesday, March 22, 2017 at 5:18 PM To: "user@cassandra.apache.org" <user@cassandra.apache.org> Subject: RE: [Cassandra 3.0.9] Cannot allocate memory JVM config is as below: -Xms16G -Xmx16G -Xmn3000M What I need to check in

RE: [Cassandra 3.0.9] Cannot allocate memory

2017-03-22 Thread Thakrar, Jayesh
And what is the configured max heap? Sometimes you may also be able to see some useful messages in "dmesg" output. Jayesh From: Abhishek Kumar Maheshwari <abhishek.maheshw...@timesinternet.in> Sent: Wednesday, March 22, 2017 5:05:14 PM To: Thakr

Re: repair performance

2017-03-18 Thread Thakrar, Jayesh
You changed compaction_throughput_mb_per_sec, but did you also increase concurrent_compactors? In reference to the reaper and some other info I received on the user forum to my question on "nodetool repair", here are some useful links/slides -

Re: Does "nodetool repair" need to be run on each node for a given table?

2017-03-15 Thread Thakrar, Jayesh
Thank you Eric for helping out. The reason I sent the question a second time is because I did not see my question and the first reply from the usergroup. After I sent the question a second time, I got a personal flame from somebody else too and so examined my "spam" folders and that's where I

Re: Why are automatic anti-entropy repairs required when hinted hand-off is enabled?

2017-04-21 Thread Thakrar, Jayesh
ication? Is there a setting to change it? On Thu, Apr 6, 2017 at 10:54 PM, Thakrar, Jayesh <jthak...@conversantmedia.com> wrote: I had asked a similar/related question - on how to carry out repair, etc and got some useful pointers. I would highly

Re: OOM on Apache Cassandra on 30 Plus node at the same time

2017-03-03 Thread Thakrar, Jayesh
Had been fighting a similar battle, but am now over the hump for the most part. Get info on the server config (e.g. memory, cpu, free memory (free -g), etc). Run "nodetool info" on the nodes to get heap and off-heap sizes. Run "nodetool tablestats" or "nodetool tablestats ." on the key large tables

Re: OOM on Apache Cassandra on 30 Plus node at the same time

2017-03-04 Thread Thakrar, Jayesh
andra on 30 Plus node at the same time We run C* at 32 GB and all servers have 96GB RAM. We use STCS . LCS is not an option for us as we have frequent updates. Thanks, Shravan ________ From: Thakrar, Jayesh <jthak...@conversantmedia.com> Sent: Friday, March

Re: OOM on Apache Cassandra on 30 Plus node at the same time

2017-03-04 Thread Thakrar, Jayesh
y JVM heap used is ~ 12GB and off heap is ~ 4-5GB. ________ From: Thakrar, Jayesh <jthak...@conversantmedia.com> Sent: Saturday, March 4, 2017 9:23:01 AM To: Shravan C; Joaquin Casares; user@cassandra.apache.org Subject: Re: OOM on Apache Cassandra on 30 Plus node at

Does "nodetool repair" need to be run on each node for a given table?

2017-03-14 Thread Thakrar, Jayesh
I understand that the nodetool command connects to a specific server and for many of the commands, e.g. "info", "compactionstats", etc, the information is for that specific node. While for some other commands like "status", the info is for the whole cluster. So is "nodetool repair" that

Does "nodetool repair" need to be run on each node for a given table?

2017-03-13 Thread Thakrar, Jayesh
I understand that the nodetool command connects to a specific server and for many of the commands, e.g. "info", "compactionstats", etc, the information is for that specific node. While for some other commands like "status", the info is for the whole cluster. So is "nodetool repair" that

Re: Any way to control/limit off-heap memory?

2017-03-06 Thread Thakrar, Jayesh
a/operations/ops_tuning_bloom_filters_c.html Hannu On 4 Mar 2017, at 22:54, Thakrar, Jayesh <jthak...@conversantmedia.com> wrote: I have a situation where the off-heap memory is bloating the jvm process memory, making it a candidate to be killed by the oom_killer.

Any way to control/limit off-heap memory?

2017-03-04 Thread Thakrar, Jayesh
I have a situation where the off-heap memory is bloating the jvm process memory, making it a candidate to be killed by the oom_killer. My server has 256 GB RAM and Cassandra heap memory of 16 GB. Below is the output of "nodetool info" and "nodetool compactionstats" for a culprit table which causes

Re: Why are automatic anti-entropy repairs required when hinted hand-off is enabled?

2017-04-06 Thread Thakrar, Jayesh
I had asked a similar/related question - on how to carry out repair, etc and got some useful pointers. I would highly recommend the youtube video or the slideshare link below (both are for the same presentation). https://www.youtube.com/watch?v=1Sz_K8UID6E

Re: READ Queries timing out.

2017-07-07 Thread Thakrar, Jayesh
Can you provide more details? E.g. table structure, the app used for the query, the query itself and the error message. Also get the output of the following commands from your cluster nodes (note that one command uses "." and the other "space" between keyspace and tablename) nodetool -h

Re: Question: Behavior of inserting a list multiple times with same timestamp

2017-06-30 Thread Thakrar, Jayesh
Thanks Sander - this helps get a better understanding! From: Fridtjof Sander <fridtjof.san...@googlemail.com> Date: Friday, June 30, 2017 at 4:19 AM To: Vladimir Yudovin <vla...@winguzone.com>, "Thakrar, Jayesh" <jthak...@conversantmedia.com> Cc: Subroto Barua <s

Re: Why are automatic anti-entropy repairs required when hinted hand-off is enabled?

2017-04-27 Thread Thakrar, Jayesh
will reduce entropy, chance to read the same data everywhere increases. C*heers, --- Alain Rodriguez - @arodream - al...@thelastpickle.com France The Last Pickle - Apache Cassandra Consulting http://www.thelastpickle.com 2017-

Cassandra crashes....

2017-08-22 Thread Thakrar, Jayesh
Hi All, We are somewhat new users to Cassandra 3.10 on Linux and wanted to ping the user group for their experiences. Our usage profile is batch jobs that load millions of rows to Cassandra every hour. And there are similar periodic batch jobs that read millions of rows and do some processing,

Re: Cassandra crashes....

2017-08-22 Thread Thakrar, Jayesh
AND memtable_flush_period_in_ms = 0 AND min_index_interval = 128 AND read_repair_chance = 0.0 AND speculative_retry = '99PERCENTILE'; From: "Fay Hou [Storage Service]" <fay...@coupang.com> Date: Tuesday, August 22, 2017 at 10:52 AM To: "Th

Re: Cassandra crashes....

2017-08-22 Thread Thakrar, Jayesh
executed are not needed anymore, so its ok for the older prepared statements to be purged. All the same, I will do some analysis on the prepared statements table. Thanks for the tip/pointer! On 8/22/17, 5:17 PM, "Alain Rastoul" <alf.mmm@gmail.com> wrote: On 08/22/2017 0

Re: Cassandra crashes....

2017-08-22 Thread Thakrar, Jayesh
Yep, similar symptoms - but no, there's no OOM killer. Also, if you look in the gc log around the time of failure, the heap memory was much below the 16 GB limit. And if I look at the 2nd last GC log before the crash, here’s what we see. And you will notice that cleaning up the 4 GB Eden (along

Re: Question: Behavior of inserting a list multiple times with same timestamp

2017-06-19 Thread Thakrar, Jayesh
Subroto, Cassandra docs say otherwise. Writing list data is accomplished with a JSON-style syntax. To write a record using INSERT, specify the entire list as a JSON array. Note: An INSERT will always replace the entire list. Maybe you can elaborate/shed some more light? Thanks, Jayesh

Re: Question: Behavior of inserting a list multiple times with same timestamp

2017-06-20 Thread Thakrar, Jayesh
ECT * FROM test.test ; k | v ---+- 1 | [3] // = EXPECTED RESULT = From: Subroto Barua <sbarua...@yahoo.com> Date: Monday, June 19, 2017 at 11:09 PM To: "Thakrar, Jayesh" <jthak...@conversantmedia.com>, Subroto Barua <sbarua...@yahoo.com.INVA

Re: Effect of frequent mutations / memtable

2017-05-25 Thread Thakrar, Jayesh
"polling" cycle overhead. Furthermore, zk does not have the "overhead" of other things that Cassandra does. Honestly I am not familiar with Paxos and stuff, so can't speak to it. On 5/25/17, 3:40 PM, "Jan Algermissen" <algermissen1...@icloud.com> wrote: Hi

Re: Apache Cassandra - Memory usage on server

2017-06-14 Thread Thakrar, Jayesh
Asad, The rest of the 42 GB of memory on your server is used by the filesystem buffer cache - see the "cached" column and the -/+ buffers/cache line. The OS (Linux) uses all free memory for filesystem buffer cache and if applications need memory, will relinquish it appropriately. To see the
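To make the buffer-cache point concrete, here is a minimal sketch of the arithmetic behind the "-/+ buffers/cache" line: memory in use by applications is total minus free minus buffers minus cached. The sample figures below are invented for illustration and are not taken from the poster's server; the helper names are hypothetical, and the input is shaped like /proc/meminfo.

```python
# Invented sample in /proc/meminfo format (values in kB); not real server data.
SAMPLE_MEMINFO = """\
MemTotal:       65842216 kB
MemFree:         1203456 kB
Buffers:          412300 kB
Cached:         43980120 kB
"""


def parse_meminfo(text):
    """Return a dict mapping each field name to its value in kB."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if rest.strip():
            values[key.strip()] = int(rest.split()[0])
    return values


def used_by_applications_kb(info):
    """Memory held by processes, excluding reclaimable buffers and page cache."""
    return info["MemTotal"] - info["MemFree"] - info["Buffers"] - info["Cached"]


info = parse_meminfo(SAMPLE_MEMINFO)
print(f"page cache (reclaimable): {info['Cached'] / 1024**2:.1f} GiB")
print(f"used by applications:     {used_by_applications_kb(info) / 1024**2:.1f} GiB")
```

On a real host the same functions could be fed the contents of /proc/meminfo instead of the sample string.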

Re: Question: Large partition warning

2017-06-14 Thread Thakrar, Jayesh
Thank you Kurt - that makes sense. Will certainly reduce it to 1024. Greatly appreciate your quick reply. Thanks, Jayesh From: kurt greaves Sent: Wednesday, June 14, 5:53 PM Subject: Re: Question: Large partition warning To: Fay Hou [Data Pipeline & Real-time Analytics] Cc: Thakrar, Ja

Question: Large partition warning

2017-06-14 Thread Thakrar, Jayesh
We are on Cassandra 2.2.5 and I am constantly seeing warning messages about large partitions in system.log even though our setting for partition warning threshold is set to 4096 (MB). WARN [CompactionExecutor:43180] 2017-06-14 20:02:13,189 BigTableWriter.java:184 - Writing large partition

Re: Effect of frequent mutations / memtable

2017-05-25 Thread Thakrar, Jayesh
Hi Jan, I would suggest looking at using Zookeeper for such a usecase. See http://zookeeper.apache.org/doc/trunk/recipes.html for some examples. Zookeeper is used for such purposes in Apache HBase (active master), Apache Kafka (active controller), Apache Hadoop, etc. Look for the "Leader

Re: Why don't I see my spark jobs running in parallel in Cassandra/Spark DSE cluster?

2017-10-27 Thread Thakrar, Jayesh
What you have is sequential code, hence the sequential processing. Also Spark/Scala are not parallel programming languages. But even if they were, statements are executed sequentially unless you exploit the parallel/concurrent execution features. Anyway, see if this works: val (RDD1, RDD2) =
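The general point, that independent jobs only overlap when they are submitted concurrently, can be sketched in plain Python. This is not Spark and not the Scala snippet from the reply, which is cut off above. The job function and its names are hypothetical stand-ins for any blocking call, such as triggering an action on a dataset.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_job(name, seconds):
    """Stand-in for a blocking job; sleeps, then reports completion."""
    time.sleep(seconds)
    return f"{name} done"


start = time.monotonic()
with ThreadPoolExecutor(max_workers=2) as pool:
    # Both jobs are submitted before either result is awaited, so they overlap.
    futures = [pool.submit(run_job, "job1", 0.3), pool.submit(run_job, "job2", 0.3)]
    results = [f.result() for f in futures]
elapsed = time.monotonic() - start

print(results)
print(f"elapsed: {elapsed:.2f}s")
```

Calling run_job twice in a row instead would take roughly the sum of the two durations, which is the sequential behaviour described in the reply.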

Re: Question upon gracefully restarting c* node(s)

2018-01-10 Thread Thakrar, Jayesh
Just curious - aside from the "sleep", is this all not part of the shutdown command? Is this an "opportunity" to improve C*? Having worked with RDBMSes, Hadoop and HBase, stopping communication, flushing memcache (HBase), and relinquishing ownership of data (HBase) is all part of the shutdown

Re: C* Logs to Kibana

2018-01-10 Thread Thakrar, Jayesh
Wondering what is the purpose - is it to get some insight into the cluster? Besides the logs themselves, another approach that many and I have taken is to pull the JMX metrics from Cassandra and push them to an appropriate metrics/timeseries system. Here's one approach of getting JMX metrics

Re: TWCS not deleting expired sstables

2018-01-30 Thread Thakrar, Jayesh
Thanks Kurt and Kenneth. Now only if they would work as expected. node111.ord.ae.tsg.cnvr.net:/ae/disk1/data/ae/raw_logs_by_user-f58b9960980311e79ac26928246f09c1>ls -lt | tail -rw-r--r--. 1 vchadoop vchadoop 286889260 Sep 18 14:14 mc-1070-big-Index.db -rw-r--r--. 1 vchadoop vchadoop

Re: TWCS not deleting expired sstables

2018-02-01 Thread Thakrar, Jayesh
.sh> sstableexpiredblockers ae raw_logs_by_user On 30 January 2018 at 15:34, Thakrar, Jayesh <jthak...@conversantmedia.com> wrote: Thanks Kurt and Kenneth. Now only if they would work as expected. node111.ord.ae.tsg.cnvr.net:/ae/disk1/data/ae/raw_logs_by

Re: TWCS not deleting expired sstables

2018-01-28 Thread Thakrar, Jayesh
m/blog/2016/12/08/TWCS-part1.html Specifically the “Hints and repairs” and “Timestamp overlap” sections might be of use. -B On Jan 25, 2018, at 11:05 AM, Thakrar, Jayesh <jthak...@conversantmedia.com> wrote: Wondering if I can get some pointer

Re: if the heap size exceeds 32GB..

2018-02-13 Thread Thakrar, Jayesh
In most cases, Cassandra is pretty efficient about memory usage. However, if your use case does require/need/demand more memory for your workload, I would not hesitate to use heap > 32 GB. FYI, we have configured our heap for 84 GB. However there's more tuning that we have done beyond just the

Re: if the heap size exceeds 32GB..

2018-02-13 Thread Thakrar, Jayesh
urrent marking without STW. If you ran 84G heaps with CMS, you'd either need to dramatically tune your CMS initiating occupancy, or you'd probably see horrible, horrible pauses. On Tue, Feb 13, 2018 at 8:44 AM, James Rothering <jrother...@codojo.me>