Re: Hinted Handoff

2018-08-07 Thread Agrawal, Pratik
Please find my comments inline. From: kurt greaves Reply-To: "user@cassandra.apache.org" Date: Tuesday, August 7, 2018 at 1:20 AM To: User Subject: Re: Hinted Handoff Does Cassandra TTL out the hints after max_hint_window_in_ms? From my understanding, Cassandra only stops collecting hints
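
For reference, the hint window being asked about is a node-level setting in cassandra.yaml; a quick way to check it (config path assumed, adjust for your install):

    grep max_hint_window_in_ms /etc/cassandra/cassandra.yaml
    # max_hint_window_in_ms: 10800000   # default is 3 hours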

Re: Secure data

2018-08-07 Thread rajpal reddy
Hi Jon, I was trying the LUKS encryption on EC2, following the doc https://aws.amazon.com/blogs/security/how-to-protect-data-at-rest-with-amazon-ec2-instance-store-encryption/
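
For anyone following that AWS post, a minimal sketch of the LUKS steps it describes (the device name and mount point here are assumptions, adjust for your instance type):

    # format and open the instance-store volume (destroys any existing data)
    sudo cryptsetup luksFormat /dev/nvme1n1
    sudo cryptsetup luksOpen /dev/nvme1n1 cassandra_data
    # create a filesystem and mount it as the Cassandra data directory
    sudo mkfs.ext4 /dev/mapper/cassandra_data
    sudo mount /dev/mapper/cassandra_data /var/lib/cassandra/data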

Re: User Defined Types?

2018-08-07 Thread shalom sagges
Thanks a lot Anup! :-) On Mon, Aug 6, 2018 at 5:45 AM, Anup Shirolkar < anup.shirol...@instaclustr.com> wrote: > Hi, > > Few of the caveats can be found here: > https://issues.apache.org/jira/browse/CASSANDRA-7423 > > The JIRA is implemented in version *3.6* and you are on 3.0, > So you are
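
For context, the main caveat from CASSANDRA-7423: before 3.6, a UDT column in a table must be frozen, meaning the whole value is rewritten on every update. A hypothetical example (keyspace, type, and table names are made up):

    cqlsh -e "CREATE TYPE ks.address (street text, city text, zip text);"
    # on 3.0 the column must be frozen; unfrozen UDT columns need 3.6+
    cqlsh -e "ALTER TABLE ks.users ADD home_address frozen<address>;"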

Configuration parameter to reject incremental repair?

2018-08-07 Thread Steinmaurer, Thomas
Hello, we are running Cassandra in AWS and on-premise at customer sites, currently 2.1 in production with 3.11 in load test. In a migration path from 2.1 to 3.11.x, I'm afraid that at some point we end up with incremental repairs being enabled / run for the first time unintentionally, because:
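
For background, from 2.2 onward a plain "nodetool repair" defaults to incremental, so absent a reject switch the usual guard is to make all repair tooling explicit (keyspace name is a placeholder):

    # force a full, non-incremental repair regardless of version defaults
    nodetool repair -full my_keyspace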

Re: TWCS Compaction backed up

2018-08-07 Thread Jonathan Haddad
What's your window size? When you say backed up, how are you measuring that? Are there pending tasks or do you just see more files than you expect? On Tue, Aug 7, 2018 at 4:38 PM Brian Spindler wrote: > Hey guys, quick question: > > I've got a v2.1 cassandra cluster, 12 nodes on aws i3.2xl,
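
One way to separate the two symptoms, using stock nodetool (table name is a placeholder; 2.1 uses cfstats rather than tablestats):

    # pending and active compactions
    nodetool compactionstats
    # per-table SSTable count
    nodetool cfstats my_ks.my_table | grep "SSTable count"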

Re: TWCS Compaction backed up

2018-08-07 Thread Jeff Jirsa
May be worth seeing if any of the sstables got promoted to repaired - if so they’re not eligible for compaction with unrepaired sstables and that could explain some higher counts Do you actually do deletes or is everything ttl’d? -- Jeff Jirsa > On Aug 7, 2018, at 5:09 PM, Brian Spindler
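
A quick way to check for that, assuming default data paths (adjust the glob for your keyspace/table):

    # a non-zero "Repaired at" means the sstable was marked repaired
    sstablemetadata /var/lib/cassandra/data/my_ks/my_table-*/*Data.db | grep "Repaired at"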

Re: TWCS Compaction backed up

2018-08-07 Thread Jeff Jirsa
You could toggle off the tombstone compaction to see if that helps, but that should be lower priority than normal compactions Are the lots-of-little-files from memtable flushes or repair/anticompaction? Do you do normal deletes? Did you try to run Incremental repair? -- Jeff Jirsa > On
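
A sketch of toggling that off; note that ALTER TABLE replaces the whole compaction map, so carry over your existing settings (names and values here are placeholders, and on 2.1 the class string for the community TWCS jar differs):

    cqlsh -e "ALTER TABLE my_ks.my_table WITH compaction = {
        'class': 'TimeWindowCompactionStrategy',
        'compaction_window_unit': 'DAYS',
        'compaction_window_size': '1',
        'unchecked_tombstone_compaction': 'false'};"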

Apache Cassandra Blog is now live

2018-08-07 Thread sankalp kohli
Hi, Apache Cassandra Blog is now live. Check out the first blog post. http://cassandra.apache.org/blog/2018/08/07/faster_streaming_in_cassandra.html Thanks, Sankalp

TWCS Compaction backed up

2018-08-07 Thread Brian Spindler
Hey guys, quick question: I've got a v2.1 cassandra cluster, 12 nodes on aws i3.2xl, commit log on one drive, data on nvme. That was working very well; it's a time-series DB and has been accumulating data for about 4 weeks. The nodes have increased in load and compaction seems to be falling behind. I
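
When compaction falls behind on fast local NVMe, one common first lever is the compaction throughput throttle (value below is illustrative):

    nodetool getcompactionthroughput
    # raise the cap, or set 0 to unthrottle entirely while catching up
    nodetool setcompactionthroughput 0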

Re: dynamic_snitch=false, prioritisation/order or reads from replicas

2018-08-07 Thread Alain RODRIGUEZ
Hello Kyrill, > But in case of CL=QUORUM/LOCAL_QUORUM, if I'm not wrong, read request is sent to all replicas waiting for first 2 to reply. My understanding is that this sentence is wrong. It is as you described for writes indeed: all the replicas get the information (and to all the data
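
One way to see this behaviour directly is query tracing in cqlsh, where one replica serves the full data read and the others only digest reads (keyspace, table, and key are placeholders):

    cqlsh> CONSISTENCY LOCAL_QUORUM;
    cqlsh> TRACING ON;
    cqlsh> SELECT * FROM my_ks.my_table WHERE id = 'some-key';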

Re: TWCS Compaction backed up

2018-08-07 Thread Brian Spindler
Hi, I spot checked a couple of the files that were ~200MB and they mostly had "Repaired at: 0" so maybe that's not it? -B On Tue, Aug 7, 2018 at 8:16 PM wrote: > Everything is ttl’d > > I suppose I could use sstablemeta to see the repaired bit, could I just > set that to unrepaired somehow and

Re: TWCS Compaction backed up

2018-08-07 Thread Brian Spindler
Hi Jonathan, both I believe. The window size is 1 day, full settings: AND compaction = {'timestamp_resolution': 'MILLISECONDS', 'unchecked_tombstone_compaction': 'true', 'compaction_window_size': '1', 'compaction_window_unit': 'DAYS', 'tombstone_compaction_interval': '86400',
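
As a sanity check on backlog: with 1-day windows and roughly 4 weeks of data, steady state should be on the order of 28-30 windowed sstables per node for the table, plus the current window's flushes. A count far above that suggests compaction is behind (path is an assumption):

    ls /var/lib/cassandra/data/my_ks/my_table-*/ | grep -c Data.db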

Re: TWCS Compaction backed up

2018-08-07 Thread Brian Spindler
In fact all of them say Repaired at: 0. On Tue, Aug 7, 2018 at 9:13 PM Brian Spindler wrote: > Hi, I spot checked a couple of the files that were ~200MB and the mostly > had "Repaired at: 0" so maybe that's not it? > > -B > > > On Tue, Aug 7, 2018 at 8:16 PM wrote: > >> Everything is ttl’d >>

Re: Apache Cassandra Blog is now live

2018-08-07 Thread Nate McCall
You can tell how psyched we are about it because we cross posted! Seriously though - this is by the community for the community, so any ideas - please send them along. On Wed, Aug 8, 2018 at 1:53 PM, sankalp kohli wrote: > Hi, > Apache Cassandra Blog is now live. Check out the first blog

Re: TWCS Compaction backed up

2018-08-07 Thread Brian Spindler
Hi Jeff, mostly lots of little files, like there will be 4-5 that are 1-1.5GB or so and then many at 5-50MB and many at 40-50MB each. Re incremental repair: yes, one of my engineers started an incremental repair on this column family that we had to abort. In fact, the node that the repair was

Re: TWCS Compaction backed up

2018-08-07 Thread brian . spindler
Everything is ttl’d I suppose I could use sstablemeta to see the repaired bit, could I just set that to unrepaired somehow and that would fix it? Thanks! > On Aug 7, 2018, at 8:12 PM, Jeff Jirsa wrote: > > May be worth seeing if any of the sstables got promoted to repaired - if so > they’re
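
There is a stock tool for exactly that: sstablerepairedset ships in tools/bin. A sketch, assuming the node is stopped first so the files aren't live (paths are placeholders):

    # mark sstables back to unrepaired so they compact with the rest
    sstablerepairedset --really-set --is-unrepaired /var/lib/cassandra/data/my_ks/my_table-*/*Data.db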

New community blog with inaugural post on faster streaming in 4.0

2018-08-07 Thread Nate McCall
Hi folks, We just added a blog section to our site, with a post detailing performance improvements of streaming coming in 4.0: http://cassandra.apache.org/blog/2018/08/07/faster_streaming_in_cassandra.html I think it's a good indicator of what we are going for that our first author is not a

Huge daily outbound network traffic

2018-08-07 Thread Behnam B.Marandi
Hi, I have a 3 node Cassandra cluster (version 3.11.1) on m4.xlarge EC2 instances with separate EBS volumes for root (gp2), data (gp2) and commitlog (io1). I get daily outbound traffic at a certain time every day. As you can see in the attached screenshot, while my normal network load hardly meets

Re: Bootstrap OOM issues with Cassandra 3.11.1

2018-08-07 Thread Jeff Jirsa
That's a direct memory OOM - it's not the heap, it's the offheap. You can see that gpsmessages.addressreceivedtime_idx is holding about 2GB of offheap memory (most of it for the bloom filter), but none of the others look like they're holding a ton offheap
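
To see where the off-heap memory is going per table and index (3.11 syntax; a secondary index shows up as its own entry):

    nodetool tablestats my_ks | grep -i "off heap"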

Re: Bootstrap OOM issues with Cassandra 3.11.1

2018-08-07 Thread Laszlo Szabo
The last run I attempted used 135GB of RAM allocated to the JVM (arguments below), and while there are OOM errors, there is no stack trace in either the system or debug log. On direct memory runs, there is a stack trace. The last direct memory run used a 60GB heap and 60GB for off heap (that
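
For reference, the knobs in play live in jvm.options on 3.11; the values below mirror the numbers in this thread rather than recommendations:

    -Xms60G
    -Xmx60G
    -XX:MaxDirectMemorySize=60G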

Re: Hinted Handoff

2018-08-07 Thread Rahul Singh
What is the data size that you are talking about? What is your compaction strategy? I wouldn’t recommend having such an aggressive TTL. Why not put a clustering key that allows you to get the data fairly quickly but have a longer TTL? Cassandra can still be used if there is a legitimate
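
A sketch of the shape being suggested, a clustering column for fast reads plus a table-level TTL rather than an aggressive per-write one (all names and values hypothetical):

    # table-level TTL of 7 days (604800 s) instead of per-insert TTLs
    cqlsh -e "CREATE TABLE my_ks.events (
        source_id text,
        event_time timestamp,
        payload blob,
        PRIMARY KEY (source_id, event_time)
    ) WITH CLUSTERING ORDER BY (event_time DESC)
        AND default_time_to_live = 604800;"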

Re: ETL options from Hive/Presto/s3 to cassandra

2018-08-07 Thread Rahul Singh
Spark is scalable to as many nodes as you want and could be co-located with the data nodes; sstableloader won't be as performant for larger datasets. Although it can be run in parallel on different nodes, I don’t believe it to be as fault tolerant. If you have to do it continuously I would even
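
For comparison, the sstableloader path being discussed (seed addresses and path are placeholders; it streams to all replicas, so throttle accordingly):

    # stream pre-built sstables into the cluster; -t caps throughput in Mbits
    sstableloader -d 10.0.0.1,10.0.0.2 -t 100 /path/to/my_ks/my_table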

Re: Huge daily outbound network traffic

2018-08-07 Thread Rahul Singh
Are you sure you don’t have an outside process that is doing an export, Spark job, or non-AWS-managed backup process? Is this network out from Cassandra or from the network? Rahul On Aug 7, 2018, 4:09 AM -0400, Behnam B.Marandi , wrote: > Hi, > I have a 3 node Cassandra cluster (version 3.11.1)
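
A quick way to attribute the spike to a process or port while it is happening (interface name is an assumption):

    # per-connection bandwidth; 7000 is inter-node, 9042 is client traffic
    sudo iftop -i eth0 -P
    # map the talking sockets back to a PID
    sudo ss -tnp | grep -E ':7000|:9042'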

Re: Bootstrap OOM issues with Cassandra 3.11.1

2018-08-07 Thread Laszlo Szabo
Hi, Thanks for the fast response! We are not using any materialized views, but there are several indexes. I don't have a recent heap dump, and it will be about 24 hours before I can generate an interesting one, but most of the memory was allocated to byte buffers, so not entirely helpful. nodetool

Re: Bootstrap OOM issues with Cassandra 3.11.1

2018-08-07 Thread Jonathan Haddad
By default Cassandra is set to generate a heap dump on OOM. It can be a bit tricky to figure out what’s going on exactly but it’s the best evidence you can work with. On Tue, Aug 7, 2018 at 6:30 AM Laszlo Szabo wrote: > Hi, > > Thanks for the fast response! > > We are not using any materialized
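
For anyone reproducing this: the relevant flags are usually wired up by cassandra-env.sh already (paths vary by install), and the resulting .hprof can be opened in a tool like Eclipse MAT:

    # confirm whether and where a dump will land on the next OOM
    grep -i heapdump /etc/cassandra/cassandra-env.sh /etc/cassandra/jvm.options
    # the underlying JVM flags look like:
    #   -XX:+HeapDumpOnOutOfMemoryError
    #   -XX:HeapDumpPath=/var/lib/cassandra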