Re: sstablescrub for unreadable sstables?

2015-08-18 Thread Jeff Jirsa
For key cache files (which you have below – note ‘cassandra_saved_caches’ and -KeyCache– ), the safest thing to do is to simply remove them (move them aside or delete them). They’re simple caches, and they’ll be recreated shortly after starting. - Jeff From: David Paulsen Reply-To:
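
For illustration, a hedged sketch of that cleanup, assuming a package install and the default saved_caches path (adjust to match saved_caches_directory in cassandra.yaml):

    sudo service cassandra stop
    mkdir -p /tmp/saved_caches_backup                               # move aside rather than delete, just in case
    mv /var/lib/cassandra/saved_caches/*KeyCache* /tmp/saved_caches_backup/
    sudo service cassandra start                                    # the key cache is rebuilt automatically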

Re: Practical limitations of too many columns/cells ?

2015-08-23 Thread Jeff Jirsa
A few months back, a user in #cassandra on freenode mentioned that when they transitioned from thrift to cql, their overall performance decreased significantly. They had 66 columns per table, so I ran some benchmarks with various versions of Cassandra and thrift/cql combinations. It shouldn’t

Re: Practical limitations of too many columns/cells ?

2015-08-23 Thread Jeff Jirsa
@cassandra.apache.org Subject: Re: Practical limitations of too many columns/cells ? Ah.. yes. Great benchmarks. If I’m interpreting them correctly it was ~15x slower for 22 columns vs 2 columns? Guess we have to refactor again :-P Not the end of the world of course. On Sun, Aug 23, 2015 at 1:53 PM, Jeff

Re: Leak detected during repair (v2.1.8)

2015-07-29 Thread Jeff Jirsa
https://issues.apache.org/jira/browse/CASSANDRA-9382 From: Frisch, Michael Reply-To: user@cassandra.apache.org Date: Wednesday, July 29, 2015 at 6:56 AM To: Cassandra List Subject: Leak detected during repair (v2.1.8) ERROR [Reference-Reaper:1] 2015-07-29 12:34:45,941 Ref.java:179 - LEAK

Re: query statement return empty

2015-07-30 Thread Jeff Jirsa
What consistency level are you using with your query? What replication factor are you using on your keyspace? Have you run repair? The most likely explanation is that you wrote with low consistency (ANY, ONE, etc), and that one or more replicas does not have the cell. You’re then reading with
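
As a hedged illustration of the interaction (keyspace, table, and RF=3 are placeholders; cqlsh shown for brevity — drivers set consistency per statement or session):

    # a write at ONE only guarantees a single replica has the cell
    cqlsh -e "CONSISTENCY ONE; INSERT INTO my_ks.my_table (id, val) VALUES (1, 'x');"
    # a read at ONE may be served by a replica that never received that write, returning nothing
    cqlsh -e "CONSISTENCY ONE; SELECT val FROM my_ks.my_table WHERE id = 1;"
    # QUORUM on both writes and reads (with RF=3) guarantees overlap; repair also reconciles replicas
    nodetool repair my_ks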

Re: Duplicating a cluster with different # of disks

2015-08-06 Thread Jeff Jirsa
You can copy all of the sstables into any given data directory without issue (keep them within the keyspace/table directories, but the mnt/mnt2/mnt3 location is irrelevant). You can also stream them in via sstableloader if your ring topology has changed (especially if tokens have moved)
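
A hedged sketch of the sstableloader path (hosts and directory layout are placeholders; the directory passed to the tool must end in keyspace/table):

    sstableloader -d 10.0.0.1,10.0.0.2 /staging/my_keyspace/my_table/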

Re: limit the size of data type LIST

2015-08-13 Thread Jeff Jirsa
This is not currently possible, though it has been proposed in the past and may potentially be implemented in the future: https://issues.apache.org/jira/browse/CASSANDRA-9110 - Jeff From: yuankui Reply-To: user@cassandra.apache.org Date: Thursday, August 13, 2015 at 6:24 PM To:

Re: What's the format of Cassandra's timestamp, microsecond or millisecond?

2015-08-10 Thread Jeff Jirsa
The timestamp is arbitrary precision, selected by the client. If you’re seeing milliseconds on some data and microseconds on others, then you have one client that’s using microseconds and another using milliseconds – adjust your clients. From: yhq...@sina.com Reply-To:
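
For illustration, one way to see which precision your clients actually wrote (table and column names are placeholders):

    # writetime() returns the stored timestamp: for current dates, 16 digits ≈ microseconds, 13 digits ≈ milliseconds
    cqlsh -e "SELECT writetime(val) FROM my_ks.my_table WHERE id = 1;"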

Re: auto_bootstrap=false broken?

2015-08-06 Thread Jeff Jirsa
You’re trying to force your view onto an established ecosystem. It’s not “wrong only because its currently bootstrapping”, it’s not bootstrapping at all, you told it not to bootstrap. ‘auto_bootstrap’ is the knob that tells cassandra whether or not you want to stream data from other replicas

Re: if seed is diff on diff nodes, any problem ?

2015-07-26 Thread Jeff Jirsa
Seeds are used in two different ways: 1) When joining the ring, the joining node knows NOTHING about the cluster, so it uses a seed list to discover the cluster. Once discovered, it saves the peers to disk, so subsequent starts it will find/reconnect to other nodes beyond just the explicitly
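
For reference, a quick way to see what seed list a node is configured with (default config path assumed):

    grep -A 4 'seed_provider' /etc/cassandra/cassandra.yaml
    # expect a line like:   - seeds: "10.0.0.1,10.0.0.2"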

Re: Nodetool cleanup takes long time and no progress

2015-07-24 Thread Jeff Jirsa
You can check for progress using `nodetool compactionstats` (which will show Cleanup tasks), or check for ‘Cleaned up’ messages in the log (/var/log/cassandra/system.log). However, `nodetool cleanup` has a very specific and limited task - it deletes data no longer owned by the node, typically
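
The two checks described above, as commands (log path assumes a package install):

    nodetool compactionstats                                # Cleanup tasks appear alongside compactions
    grep 'Cleaned up' /var/log/cassandra/system.log | tail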

Re: How can I specify the file_data_directories for a keyspace

2015-08-25 Thread Jeff Jirsa
At this point, it is only/automatically managed by cassandra, but if you’re clever with mount points you can probably work around the limitation. From: Ahmed Eljami Reply-To: user@cassandra.apache.org Date: Tuesday, August 25, 2015 at 2:09 AM To: user@cassandra.apache.org Subject: How can

Re: C* Table Changed and Data Migration with new primary key

2015-10-21 Thread Jeff Jirsa
Because the data format has changed, you’ll need to read it out and write it back in again. This means using either a driver (java, python, c++, etc), or something like spark. In either case, split up the token range so you can parallelize it for significant speed improvements. From:
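
A hedged sketch of one token-range slice from cqlsh (keyspace, table, and partition key are placeholders; a real migration would use a driver or Spark and cover the whole ring in parallel):

    cqlsh -e "SELECT * FROM old_ks.old_table WHERE token(pk) > -9223372036854775808 AND token(pk) <= -4611686018427387904;"
    # each worker reads one slice like this and re-inserts the rows under the new primary key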

Re: memtable flush size with LCS

2015-10-28 Thread Jeff Jirsa
It’s worth mentioning that initial flushed file size is typically determined by memtable_cleanup_threshold and the memtable space options (memtable_heap_space_in_mb, memtable_offheap_space_in_mb, depending on memtable_allocation_type) From: Nate McCall Reply-To: "user@cassandra.apache.org"
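
For reference, the settings named above live in cassandra.yaml; a quick way to check a node’s current values (default path assumed):

    grep -E 'memtable_(cleanup_threshold|heap_space_in_mb|offheap_space_in_mb|allocation_type)' /etc/cassandra/cassandra.yaml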

Re: Error Connecting to Cassandra

2015-10-28 Thread Jeff Jirsa
The cassandra system.log would be more useful When Cassandra starts rejecting or dropping tcp connections, try to connect using cqlsh, and check the logs for indication that it’s failing. From: Eduardo Alfaia Reply-To: "user@cassandra.apache.org" Date: Wednesday, October 28, 2015 at 5:09 PM

Re: Automatic pagination does not get all results

2015-10-22 Thread Jeff Jirsa
It’s possible that it could be different depending on your consistency level (on write and on read). It’s also possible it’s a bug, but you didn’t give us much information – here are some questions to help us help you: What version? What results are you seeing? What’s the “right” result?

Re: too many full gc in one node of the cluster

2015-11-13 Thread Jeff Jirsa
What Kai Wang is hinting at: one of the common tuning problems people face is they follow the advice in cassandra-env.sh, which says 100M of young gen space (Xmn) per core. Many people find that insufficient – raising that to 30, 40, or 50% of heap size (Xmx) MAY help keep short-lived objects
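
A hedged illustration only (cassandra-env.sh is itself a shell script; the sizes below are an assumption for a node with an 8G heap, not a recommendation):

    # in cassandra-env.sh: instead of 100M per core, try young gen at roughly 40% of the heap
    MAX_HEAP_SIZE="8G"
    HEAP_NEWSIZE="3G"      # becomes -Xmn; watch GC logs before and after changing this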

Re: SSTables are not getting removed

2015-10-30 Thread Jeff Jirsa
After 15 hours, you have "112000 SSTables over all nodes for all CF’s” - assuming I’m parsing that correctly, that’s about 30 per table (100 KS * 10 tables * 4 nodes), which is actually not unreasonable with LCS. Your symptom is heavy GC and you’re running 2.1.6 and don’t want to upgrade

Re: Cassandra Data Model with Narrow partition

2015-10-30 Thread Jeff Jirsa
I’m going to disagree with Carlos in two points. You do have a lot of columns, so many that it’s likely to impact performance. Rather than using collections, serializing those into a single JSON field is far more performant. Since you write each record exactly once, this should be easily

Re: Would we have data corruption if we bootstrapped 10 nodes at once?

2015-10-18 Thread Jeff Jirsa
r. I'm running a nodetool repair now to hopefully fix this. On Sun, Oct 18, 2015 at 7:25 PM, Jeff Jirsa <jeff.ji...@crowdstrike.com> wrote: auto_bootstrap=false tells it to join the cluster without running bootstrap – the node assumes it has all of the necessary data, and won’t stream an

Re: Would we have data corruption if we bootstrapped 10 nodes at once?

2015-10-18 Thread Jeff Jirsa
auto_bootstrap=false tells it to join the cluster without running bootstrap – the node assumes it has all of the necessary data, and won’t stream any missing data. This generally violates consistency guarantees, but if done on a single node, is typically correctable with `nodetool repair`. If

Re: Would we have data corruption if we bootstrapped 10 nodes at once?

2015-10-19 Thread Jeff Jirsa
Worth noting that repair may not work, as it’s possible that NONE of the nodes with data (for some given row) are still valid replicas according to the DHT/Tokens, so repair will not find any of the replicas with the data. From: Robert Coli Reply-To: "user@cassandra.apache.org" Date:

Re: Hiper-V snapshot and Cassandra

2015-10-20 Thread Jeff Jirsa
As long as your hyper-v/vss snapshots include both the data directory and the commit log directory, then they’re exactly as good as tolerating a single power outage – you should be able to load the sstables and replay commit log and be fine. Assuming you’re moving the hyper-v/vss snapshot to

Re: Is Cassandra really Strong consistency?

2015-09-06 Thread Jeff Jirsa
If anyone can read the above scenario and confirm whether this can occur or not, and if it is possible, how current Cassandra can solve it? Regards, Ibrahim On Sun, Sep 6, 2015 at 5:57 PM,

Re: Is Cassandra really Strong consistency?

2015-09-06 Thread Jeff Jirsa
Yes, it can occur, if you allow it to occur. Clients should send their own timestamps. Clocks should be synchronized. Failure to do so while relying on ‘last write wins’ timestamp resolution will cause undesirable results. This is unrelated to strong/weak/eventual consistency discussions or
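
A hedged example of a client-supplied timestamp in plain CQL (the microsecond value and table are made up; drivers can also set this per statement):

    cqlsh -e "INSERT INTO my_ks.my_table (id, val) VALUES (1, 'x') USING TIMESTAMP 1441555200000000;"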

Re: Is Cassandra really Strong consistency?

2015-09-06 Thread Jeff Jirsa
In the cases where NTP and client timestamps with microsecond resolution is insufficient, LWT “IF EXISTS, IF NOT EXISTS” is generally used. From: ibrahim El-sanosi Reply-To: "user@cassandra.apache.org" Date: Sunday, September 6, 2015 at 7:40 AM To: "user@cassandra.apache.org" Subject: Re:
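
For illustration, the LWT forms referred to above (table and columns are placeholders):

    cqlsh -e "INSERT INTO my_ks.my_table (id, val) VALUES (1, 'x') IF NOT EXISTS;"
    cqlsh -e "UPDATE my_ks.my_table SET val = 'y' WHERE id = 1 IF EXISTS;"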

Re: High CPU usage on some of nodes

2015-09-10 Thread Jeff Jirsa
With a 5s collection, the problem is almost certainly GC. GC pressure can be caused by a number of things, including normal read/write loads, but ALSO compaction calculation (pre-2.1.9 / #9882) and very large partitions (trying to load a very large partition with something like row cache in

Re: Using DTCS, TTL but old SSTables not being removed

2015-09-13 Thread Jeff Jirsa
2.2.1 has a pretty significant bug in compaction: https://issues.apache.org/jira/browse/CASSANDRA-10270 That prevents it from compacting files after 60 minutes. It may or may not be the cause of the problem you’re seeing, but it seems like it may be possibly related, and you can try the

Re: How to remove huge files with all expired data sooner?

2015-09-28 Thread Jeff Jirsa
There’s a seldom discussed parameter called: unchecked_tombstone_compaction The documentation describes the option as follows: True enables more aggressive than normal tombstone compactions. A single SSTable tombstone compaction runs without checking the likelihood of success. Cassandra 2.0.9
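
A hedged example of turning it on (table name and strategy are placeholders; note the ALTER replaces the whole compaction map, so repeat your existing options):

    cqlsh -e "ALTER TABLE my_ks.my_table WITH compaction = {'class': 'SizeTieredCompactionStrategy', 'unchecked_tombstone_compaction': 'true'};"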

Re: Help with tombstones and compaction

2015-09-20 Thread Jeff Jirsa
2.1.4 is getting pretty old. There’s a DTCS deletion tweak in 2.1.5 ( https://issues.apache.org/jira/browse/CASSANDRA-8359 ) that may help you. 2.1.5 and 2.1.6 have some memory leak issues in DTCS, so go to 2.1.7 or newer (probably 2.1.9 unless you have a compelling reason not to go to 2.1.9)

Re: Cassandra Summit 2015 Roll Call!

2015-09-22 Thread Jeff Jirsa
I’m here. Will be speaking Wednesday on DTCS for time series workloads: http://cassandrasummit-datastax.com/agenda/real-world-dtcs-for-operators/ Picture if you recognize me, say hi: https://events.mfactormeetings.com/accounts/register123/mfactor/datastax/events/dstaxsummit2015/jirsa.jpg

Re: Unable to remove dead node from cluster.

2015-09-23 Thread Jeff Jirsa
When you run unsafeAssassinateEndpoint, to which host are you connected, and what argument are you passing? Are there other nodes in the ring that you’re not including in the ‘nodetool status’ output? From: Dikang Gu Reply-To: "user@cassandra.apache.org" Date: Tuesday, September 22, 2015

Re: Running Cassandra on Java 8 u60..

2015-09-25 Thread Jeff Jirsa
We saw no problems with 8u60. From: on behalf of Kevin Burton Reply-To: "user@cassandra.apache.org" Date: Friday, September 25, 2015 at 5:08 PM To: "user@cassandra.apache.org" Subject: Running Cassandra on Java 8 u60.. Any issues with running Cassandra 2.0.16 on

Re: Unable to remove dead node from cluster.

2015-09-25 Thread Jeff Jirsa
s of being refactored (here's at least one of the issues: https://issues.apache.org/jira/browse/CASSANDRA-9667), but it would be worth opening an issue with as much information as you can provide to, at the very least, have information available for others. On Fri, Sep 25, 2015 at 7:08 AM, Jeff Jirsa <

Re: Help with tombstones and compaction

2015-09-21 Thread Jeff Jirsa
for your reply Jeff! I will switch to Cassandra 2.1.9. Quick follow up question: Do the schema and settings I have set up look alright? My timestamp column's type is blob - I was wondering if this could confuse DTCS? On Sun, Sep 20, 2015 at 3:37 PM, Jeff Jirsa <jeff.ji...@crowdstrike.com>

Re: Unable to remove dead node from cluster.

2015-09-25 Thread Jeff Jirsa
To: cassandra Subject: Re: Unable to remove dead node from cluster. @Jeff, I just use jmx to connect to one node, run unsafeAssassinateEndpoint, and pass in the "10.210.165.55" ip address. Yes, we have hundreds of other nodes in the nodetool status output as well. On Tue, Sep 22, 201

Re: lots of tombstone after compaction

2015-12-07 Thread Jeff Jirsa
https://issues.apache.org/jira/browse/CASSANDRA-7953 https://issues.apache.org/jira/browse/CASSANDRA-10505 There are buggy versions of cassandra that will multiply tombstones during compaction. 2.1.12 SHOULD correct that, if you’re on 2.1. From: Kai Wang Reply-To:

Re: Switching to Vnodes

2015-12-09 Thread Jeff Jirsa
Streaming with vnodes is not always pleasant – rebuild uses streaming (as does bootstrap, repair, and decommission). The rebuild delay you see may or may not be related to that. It could also be that the streams timed out, and you don’t have a stream timeout set. Are you seeing data move? Are

Re: Unable to start one Cassandra node: OutOfMemoryError

2015-12-09 Thread Jeff Jirsa
8G is probably too small for a G1 heap. Raise your heap or try CMS instead. 71% of your heap is collections – may be a weird data model quirk, but try CMS first and see if that behaves better. From: Mikhail Strebkov Reply-To: "user@cassandra.apache.org" Date: Wednesday, December 9, 2015 at

Re: Thousands of pending compactions using STCS

2015-12-11 Thread Jeff Jirsa
There were a few buggy versions in 2.1 (2.1.7, 2.1.8, I believe) that showed this behavior. The number of pending compactions was artificially high, and not meaningful. As long as the number of -Data.db sstables remains normal, compaction is keeping up and you’re fine. - Jeff From:

Re: Replicating Data Between Separate Data Centres

2015-12-14 Thread Jeff Jirsa
2015 at 3:18 PM, Jeff Jirsa <jeff.ji...@crowdstrike.com> wrote: There is research into causal consistency and cassandra ( http://da-data.blogspot.com/2013/02/caring-about-causality-now-in-cassandra.html , for example), though you’ll note that it uses a fork ( https://github.com/wllo

Re: Thousands of pending compactions using STCS

2015-12-11 Thread Jeff Jirsa
Same bug also affects 2.0.16 - https://issues.apache.org/jira/browse/CASSANDRA-9662 From: Jeff Jirsa Reply-To: <user@cassandra.apache.org> Date: Friday, December 11, 2015 at 9:12 AM To: "user@cassandra.apache.org" Subject: Re: Thousands of pending compactions using STCS

Re: Issues on upgrading from 2.2.3 to 3.0

2015-12-14 Thread Jeff Jirsa
https://issues.apache.org/jira/browse/CASSANDRA-10745 GPFS seems like it SHOULD be easier to package and distribute in most use cases… From: "sean_r_dur...@homedepot.com" Reply-To: "user@cassandra.apache.org" Date: Monday, December 14, 2015 at 11:56 AM To: "user@cassandra.apache.org"

Re: Replicating Data Between Separate Data Centres

2015-12-14 Thread Jeff Jirsa
There is research into causal consistency and cassandra ( http://da-data.blogspot.com/2013/02/caring-about-causality-now-in-cassandra.html , for example), though you’ll note that it uses a fork ( https://github.com/wlloyd/eiger ) which is unlikely something you’d ever want to consider in

Re: compaction_throughput_mb_per_sec

2016-01-04 Thread Jeff Jirsa
Why do you think it’s cluster wide? That param is per-node, and you can change it at runtime with nodetool (or via the JMX interface using jconsole to ip:7199 ) From: Ken Hancock Reply-To: "user@cassandra.apache.org" Date: Monday, January 4, 2016 at 12:59 PM To:
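
The runtime change mentioned above, per node (the value is an example, in MB/s):

    nodetool setcompactionthroughput 64
    nodetool getcompactionthroughput        # confirm the node-local setting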

Re: Slow performance after upgrading from 2.0.9 to 2.1.11

2016-01-06 Thread Jeff Jirsa
Anecdotal evidence typically agrees that 2.1 is faster than 2.0 (our experience was anywhere from 20-60%, depending on workload). However, it’s not necessarily true that everything behaves exactly the same – in particular, memtables are different, commitlog segment handling is different, and

Re: Revisit Cassandra EOL Policy

2016-01-08 Thread Jeff Jirsa
You chose a specific point in time that is especially painful. Had you chosen most of 2014, you would have had a long period of 2.0.x that was stable. Yes, if you were deploying in April 2015, you had an unpleasant choice between an about-to-EOL 2.0 and an omg-memory-leak 2.1 – if you deploy

Re: Three questions about cassandra

2015-11-26 Thread Jeff Jirsa
1) It comes online in its former state. The operator is responsible for consistency beyond that point. Common solutions would be `nodetool repair` (and if you get really smart, you can start the daemon with the thrift/native listeners disabled, run repair, and then enable listeners, so that
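
A hedged sketch of the repair-before-serving trick mentioned above (assumes the startup script forwards -D properties; service management varies by install):

    cassandra -Dcassandra.start_rpc=false -Dcassandra.start_native_transport=false   # come up without client listeners
    nodetool repair
    nodetool enablethrift       # then open the node to clients
    nodetool enablebinary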

Re: [Typo correction] Is it good for performance to put rows that are of different types but are always queried together in the same table partition?

2016-01-08 Thread Jeff Jirsa
You’ll see better performance using a slice (which is effectively what will happen if you put them into the same table and use query-1table-b), as each node will only need to merge cells/results once. It may not be twice as fast, but it’ll be fast enough to make it worthwhile. On 1/8/16,

Re: how long does "nodetool upgradesstables" take?

2016-06-04 Thread Jeff Jirsa
“It takes as long as necessary to rewrite any sstable that needs to be upgraded”. From 2.2.4 to 2.2.6, the sstable format did not change, so there’s nothing to upgrade. If you want to force the matter (and you probably don’t), ‘nodetool upgradesstables -a’ will rewrite them again, but you

Re: Why there is no native shutdown command in cassandra

2016-06-13 Thread Jeff Jirsa
`nodetool drain` From: Anshu Vajpayee Reply-To: "user@cassandra.apache.org" Date: Monday, June 13, 2016 at 9:28 AM To: "user@cassandra.apache.org" Subject: Why there is no native shutdown command in cassandra
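
A hedged shutdown sequence built around that command (service name assumes a package install):

    nodetool drain                # flush memtables and stop accepting writes
    sudo service cassandra stop   # or kill the JVM once drain completes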

Re: Time series data with only inserts

2016-05-30 Thread Jeff Jirsa
Your compaction strategy gets triggered whenever you flush memtables to disk. Most compaction strategies, especially those designed for write-only time-series workloads, check for fully expired sstables (getFullyExpiredSStables()) “often” (DTCS does it every 10 minutes, because it’s fairly

Re: Understanding when Cassandra drops expired time series data

2016-06-17 Thread Jeff Jirsa
k into updating soon). Just to clarify, in the current version of Cassandra when do fully expired SSTables get dropped? Is it when a minor compaction runs or is it separate from minor compactions? Also thanks to the link to the slides, great stuff! Jerome From: Jeff Jirsa <jeff.ji...@crowdstrike.co

Re: Understanding when Cassandra drops expired time series data

2016-06-17 Thread Jeff Jirsa
First, DTCS in 2.0.15 has some weird behaviors - https://issues.apache.org/jira/browse/CASSANDRA-9572 . That said, some other general notes: Data deleted by TTL isn’t the same as issuing a delete – each expiring cell internally has a ttl/timestamp at which it will be converted into a

Re: Understanding when Cassandra drops expired time series data

2016-06-17 Thread Jeff Jirsa
> Date: Friday, June 17, 2016 at 12:29 PM To: "user@cassandra.apache.org" <user@cassandra.apache.org> Subject: Re: Understanding when Cassandra drops expired time series data Hey Jeff, Do most of those behaviors apply to TWCS too? -J On Fri, Jun 17, 2016 at 1:25 P

Re: [External] Re: Understanding when Cassandra drops expired time series data

2016-06-17 Thread Jeff Jirsa
Correcting myself - https://issues.apache.org/jira/browse/CASSANDRA-9882 made it check for fully expired sstables no more than once every 10 minutes (still happens on flush as described, just not EVERY flush). Went in 2.0.17 / 2.1.9. - Jeff From: Jeff Jirsa <jeff

Re: sstableloader throughput

2016-01-11 Thread Jeff Jirsa
Make sure streaming throughput isn’t throttled on the destination cluster. Stream from more machines (divide sstables between a bunch of machines, run in parallel). On 1/11/16, 5:21 AM, "Noorul Islam K M" wrote: > >I have a need to stream data to new cluster using
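
Hedged examples of both suggestions (hosts, paths, and the throttle value are placeholders):

    # on destination nodes: raise or disable the streaming throttle (0 is commonly used for "unthrottled")
    nodetool setstreamthroughput 0
    # run several loaders in parallel, each pointed at a subset of the sstables
    sstableloader -d dest1,dest2 /staging/part1/my_keyspace/my_table/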

Re: Help debugging a very slow query

2016-01-13 Thread Jeff Jirsa
Very large partitions create a lot of garbage during reads: https://issues.apache.org/jira/browse/CASSANDRA-9754 - you will see significant GC pauses trying to read from large enough partitions. I suspect GC, though it’s odd that you’re unable to see it. From: Bryan Cheng Reply-To:

Re: Impact of Changing Compaction Strategy

2016-01-15 Thread Jeff Jirsa
When you change compaction strategy, nothing happens until the next flush. On the next flush, the new compaction strategy will decide what to do – if you change from STCS to DTCS, it will look at various timestamps of files, and attempt to group them by time windows based on the sstable’s

Re: compaction throughput

2016-01-15 Thread Jeff Jirsa
With SSDs, the typical recommendation is up to 0.8-1 compactor per core (depending on other load). How many CPU cores do you have? From: Kai Wang Reply-To: "user@cassandra.apache.org" Date: Friday, January 15, 2016 at 12:53 PM To: "user@cassandra.apache.org" Subject: compaction throughput

Re: Slow performance after upgrading from 2.0.9 to 2.1.11

2016-01-14 Thread Jeff Jirsa
), but significantly speeds up read workloads for us. From: Zhiyan Shao Date: Thursday, January 14, 2016 at 9:49 AM To: "user@cassandra.apache.org" Cc: Jeff Jirsa, "Agrawal, Pratik" Subject: Re: Slow performance after upgrading from 2.0.9 to 2.1.11 Praveen, if you search "Read is sl

Re: Slow performance after upgrading from 2.0.9 to 2.1.11

2016-01-14 Thread Jeff Jirsa
ccess_mode which should be mmap. We also used LZ4Compressor when created table. We will let you know if this property had any effect. We were testing with 2.1.11 and this was only fixed in 2.1.12 so we need to play with latest version. Praveen From: Jeff Jirsa <jeff.ji...@crowdstrike.com> R

Re: Debugging write timeouts on Cassandra 2.2.5

2016-02-10 Thread Jeff Jirsa
What disk size are you using? From: Mike Heffner Reply-To: "user@cassandra.apache.org" Date: Wednesday, February 10, 2016 at 2:24 PM To: "user@cassandra.apache.org" Cc: Peter Norton Subject: Re: Debugging write timeouts on Cassandra 2.2.5 Paulo, Thanks for the suggestion, we ran some

Re: cassandra client testing

2016-02-09 Thread Jeff Jirsa
http://www.scassandra.org/ From: Abdul Jabbar Azam Reply-To: "user@cassandra.apache.org" Date: Tuesday, February 9, 2016 at 2:23 PM To: "user@cassandra.apache.org" Subject: cassandra client testing Hello, What do people do to test their cassandra client code? Do you a) mock out the

Re: Read operations freeze for a few second while adding a new node

2016-01-28 Thread Jeff Jirsa
Is this during streaming plan setup (is your 10-20 second time of impact approximately 30 seconds from the time you start the node that’s joining the ring), or does it happen for the entire time you’re joining the node to the ring? If so, there’s a chance it’s GC related – the streaming plan

Re: Session timeout

2016-01-29 Thread Jeff Jirsa
> For instance, the way AAA (authentication, authorization, audit) is done doesn't allow for centralized account and access control management, which in reality translates into shared accounts and no hierarchy. Authentication and Authorization are both pluggable. Any organization can write

Re: EC2 storage options for C*

2016-02-01 Thread Jeff Jirsa
a, such as during the full processing of flushing memtables, but for the fsync at the end a solid guarantee is needed. -- Jack Krupansky On Mon, Feb 1, 2016 at 12:56 AM, Eric Plowe <eric.pl...@gmail.com> wrote: Jeff, If EBS goes down, then EBS Gp2 will go down as well, no? I'm no

Re: EC2 storage options for C*

2016-02-01 Thread Jeff Jirsa
the "small to medium databases" use case. Do older instances with local HDD still exist on AWS (m1, m2, etc.)? Is the doc simply for any newly started instances? See: https://aws.amazon.com/ec2/instance-types/ http://aws.amazon.com/ebs/details/ -- Jack Krupansky On Mon, Feb 1, 20

Re: EC2 storage options for C*

2016-01-31 Thread Jeff Jirsa
Also in that video - it's long but worth watching We tested up to 1M reads/second as well, blowing out page cache to ensure we weren't "just" reading from memory -- Jeff Jirsa > On Jan 31, 2016, at 9:52 AM, Jack Krupansky <jack.krupan...@gmail.com> wrote: >

Re: EC2 storage options for C*

2016-01-31 Thread Jeff Jirsa
Yes, but getting at why you think EBS is going down is the real point. New GM in 2011. Very different product. 35:40 in the video -- Jeff Jirsa > On Jan 31, 2016, at 9:57 PM, Eric Plowe <eric.pl...@gmail.com> wrote: > > Jeff, > > If EBS goes down, then EBS Gp2 will go

Re: EC2 storage options for C*

2016-01-31 Thread Jeff Jirsa
Free to choose what you'd like, but EBS outages were also addressed in that video (second half, discussion by Dennis Opacki). 2016 EBS isn't the same as 2011 EBS. -- Jeff Jirsa > On Jan 31, 2016, at 8:27 PM, Eric Plowe <eric.pl...@gmail.com> wrote: > > Thank you all for

Re: EC2 storage options for C*

2016-02-03 Thread Jeff Jirsa
of m1.large. -- Jack Krupansky On Mon, Feb 1, 2016 at 5:12 PM, Jeff Jirsa <jeff.ji...@crowdstrike.com> wrote: A lot of people use the old gen instances (m1 in particular) because they came with a ton of effectively free ephemeral storage (up to 1.6TB). Whether or not they’re viable is a d

Re: EC2 storage options for C*

2016-01-29 Thread Jeff Jirsa
If you have to ask that question, I strongly recommend m4 or c4 instances with GP2 EBS. When you don’t care about replacing a node because of an instance failure, go with i2+ephemerals. Until then, GP2 EBS is capable of amazing things, and greatly simplifies life. We gave a talk on this topic

Re: Problem while migrating a single node cluster from 2.1 to 3.2

2016-01-30 Thread Jeff Jirsa
Upgrade from 2.1.9+ directly to 3.0 is supported: https://github.com/apache/cassandra/blob/cassandra-3.0/NEWS.txt#L83-L85 - Upgrade to 3.0 is supported from Cassandra 2.1 versions greater or equal to 2.1.9, or Cassandra 2.2 versions greater or equal to 2.2.2. Upgrade from Cassandra 2.0 and

Re: EC2 storage options for C*

2016-01-31 Thread Jeff Jirsa
disk is rarely the bottleneck. YMMV, of course. On Fri, Jan 29, 2016 at 7:32 PM, Jeff Jirsa <jeff.ji...@crowdstrike.com> wrote: If you have to ask that question, I strongly recommend m4 or c4 instances with GP2 EBS. When you don’t care about replacing a node because of an instance failu

Re: High Bloom filter false ratio

2016-02-23 Thread Jeff Jirsa
> JVM_OPTS="$JVM_OPTS -XX:+UnlockDiagnosticVMOptions" JVM_OPTS="$JVM_OPTS -XX:+UseGCTaskAffinity" JVM_OPTS="$JVM_OPTS -XX:+BindGCTaskThreadsToCPUs" # earlier value 131072 JVM_OPTS="$JVM_OPTS -XX:ParGCCardsPerStrideChunk=32678" JVM_OPTS="$JVM_OPTS -XX:CMSScheduleRemark

Re: High Bloom filter false ratio

2016-02-22 Thread Jeff Jirsa
ke i have to force a major compaction to delete a lot of data ? are there any other solutions ? thanks anishek On Mon, Feb 22, 2016 at 11:21 PM, Jeff Jirsa <jeff.ji...@crowdstrike.com> wrote: 1) getFullyExpiredSSTables in 2.0 isn’t as thorough as many expect, so it’s very likely

Re: Nodetool Rebuild sending few big packets of data. Is it normal?

2016-02-26 Thread Jeff Jirsa
Cassandra is streaming it at a near constant rate (if you had metrics for network interface, you’d probably see that), but it doesn’t register in nodetool status until it completes all of the sstables for a column family. At that point, the -tmp-Data.db files get renamed to drop the -tmp, and

Re: High Bloom filter false ratio

2016-02-22 Thread Jeff Jirsa
1) getFullyExpiredSSTables in 2.0 isn’t as thorough as many expect, so it’s very likely that some sstables stick around longer than you expect. 2) max_sstable_age_days tells cassandra when to stop compacting that file, not when to delete it. 3) You can change the window size using both the

Re: Production with Single Node

2016-01-22 Thread Jeff Jirsa
The value of cassandra is in its replication – as a single node solution, it’s slower and less flexible than alternatives From: John Lammers Reply-To: "user@cassandra.apache.org" Date: Friday, January 22, 2016 at 12:57 PM To: Cassandra Mailing List Subject: Fwd: Production with Single Node

Re: Using TTL for data purge

2016-01-22 Thread Jeff Jirsa
"As I understand TTL, if there is a compaction of a cell (or row) with a TTL that has been reached, a tombstone will be written.” The expiring cell is treated as a tombstone once it reaches it’s end of life, it does not write an additional tombstone to disk. From:

Re: ntpd clock sync

2016-03-09 Thread Jeff Jirsa
If you don’t overwrite or delete data, it’s not a concern. If the clocks show a time in the past instead of in the future, it’s not a concern. If the clock has drifted significantly into the future, when you start NTP you may be writing data with timestamps lower than timestamps on data that

Re: How to measure the write amplification of C*?

2016-03-10 Thread Jeff Jirsa
A bit of Splunk-fu probably works for this – you’ll have different line entries for memtable flushes vs compaction output. Comparing the two will give you a general idea of compaction amplification. From: Dikang Gu Reply-To: "user@cassandra.apache.org" Date: Thursday, March 10, 2016 at
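
A hedged grep along those lines (log message formats differ between versions, so treat the patterns as assumptions):

    # bytes written by memtable flushes vs. bytes rewritten by compaction, roughly
    grep 'Completed flushing' /var/log/cassandra/system.log | tail
    grep 'Compacted'          /var/log/cassandra/system.log | tail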

Re: [C*2.1]memtable_allocation_type: offheap_objects

2016-03-09 Thread Jeff Jirsa
We use them. No signs of stability problems in 2.1 that we’ve attributed to offheap objects. From: "aeljami@orange.com" Reply-To: "user@cassandra.apache.org" Date: Wednesday, March 9, 2016 at 1:54 AM To: user Subject: [C*2.1]memtable_allocation_type: offheap_objects Hi,

Re: nodetool drain running for days

2016-04-06 Thread Jeff Jirsa
Drain should not run for days – if it were me, I’d be: checking for ‘DRAINED’ in the server logs; running ‘nodetool flush’ to explicitly flush the commitlog/memtables (generally useful before doing drain, too, since drain can be somewhat race-y); and explicitly killing cassandra following the flush – drain
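
The sequence above, as hedged commands (log path assumes a package install; the pid is a placeholder):

    grep DRAINED /var/log/cassandra/system.log   # did drain actually finish?
    nodetool flush                               # flush commitlog/memtables explicitly
    kill <cassandra_pid>                         # then stop the JVM; nothing unflushed remains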

Re: 1, 2, 3...

2016-04-08 Thread Jeff Jirsa
SELECT COUNT(*) probably works (with internal paging) on many datasets with enough time and assuming you don’t have any partitions that will kill you. No, it doesn’t count extra replicas / duplicates. The old way to do this (before paging / fetch size) was to use manual paging based on

Re: DataStax OpsCenter with Apache Cassandra

2016-04-10 Thread Jeff Jirsa
It is possible to use OpsCenter for open source / community versions up to 2.2.x. It will not be possible in 3.0+ From: Anuj Wadehra Reply-To: "user@cassandra.apache.org" Date: Sunday, April 10, 2016 at 9:28 AM To: User Subject: DataStax OpsCenter with Apache Cassandra Hi, Is it

Re: DTCS bucketing Question

2016-03-19 Thread Jeff Jirsa
> am trying to concretely understand how DTCS makes buckets and I am looking > at the DateTieredCompactionStrategyTest.testGetBuckets method and played with > some of the parameters to GetBuckets method call (Cassandra 2.1.12). I don’t > think I fully understand something there. Don’t feel

Re: Pending compactions not going down on some nodes of the cluster

2016-03-21 Thread Jeff Jirsa
> We added a bunch of new nodes to a cluster (2.1.13) and everything went fine, > except for the number of pending compactions that is staying quite high on a > subset of the new nodes. Over the past 3 days, the pending compactions have > never been less than ~130 on such nodes, with peaks of

Re: Lot of GC on two nodes out of 7

2016-03-03 Thread Jeff Jirsa
0 bytes on all nodes. We have replication factor 3 but the problem is only on two nodes. the only other thing that stands out in cfstats is the read time and write time on the nodes with high GC is 5-7 times higher than other 5 nodes, but i think thats expected. thanks anishek

Re: Lot of GC on two nodes out of 7

2016-03-01 Thread Jeff Jirsa
Compaction falling behind will likely cause additional work on reads (more sstables to merge), but I’d be surprised if it manifested in super long GC. When you say twice as many sstables, how many is that?. In cfstats, does anything stand out? Is max row size on those nodes larger than on

Re: Balancing tokens over 2 datacenter

2016-04-13 Thread Jeff Jirsa
100% ownership on all nodes isn’t wrong with 3 nodes in each of 2 DCs with RF=3 in both of those DCs. That’s exactly what you’d expect it to be, and a perfectly viable production config for many workloads. From: Anuj Wadehra Reply-To: "user@cassandra.apache.org" Date: Wednesday, April 13,

Re: Problem Replacing a Dead Node

2016-04-21 Thread Jeff Jirsa
The keyspace with RF=1 may lose data, but isn’t blocking the replacement. The most likely cause of the delay is hung streaming. Run `nodetool netstats` on the joining (replacement) node. Do the byte counters change? If not, streaming is hung, and you’ll likely need to restart the process. If
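
The check described above, repeated a few times on the replacement node:

    nodetool netstats | grep -i receiving    # byte counters should keep increasing; if frozen, streaming is hung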

Re: Bloom filter memory usage disparity

2016-05-17 Thread Jeff Jirsa
Even with the same data, bloom filter is based on sstables. If your compaction behaves differently on 2 nodes than the third, your bloom filter RAM usage may be different. From: Kai Wang Reply-To: "user@cassandra.apache.org" Date: Tuesday, May 17, 2016 at 8:02 PM To:

Re: Removing a datacenter

2016-05-23 Thread Jeff Jirsa
If you remove a node at a time, you’ll eventually end up with a single node in the DC you’re decommissioning which will own all of the data, and you’ll likely overwhelm that node. It’s typically recommended that you ALTER the keyspace, remove the replication settings for that DC, and then you
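
A hedged sketch of that order of operations (keyspace and DC names are placeholders):

    # 1. remove the retiring DC from replication first
    cqlsh -e "ALTER KEYSPACE my_ks WITH replication = {'class': 'NetworkTopologyStrategy', 'dc_keep': 3};"
    # 2. then retire the nodes in the removed DC
    nodetool decommission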

Re: Replication lag between data center

2016-05-18 Thread Jeff Jirsa
Cassandra isn’t a traditional DB – it doesn’t “replicate” in the same way that a relational DB replicates. Cassandra clients send mutations (via native protocol or thrift). Those mutations include a minimum consistency level for the server to return a successful write. If a write says

Re: Removing a datacenter

2016-05-24 Thread Jeff Jirsa
“removenode” instead of “decommission” to make it even faster. Will that have any side-effect (I think it shouldn’t) ? From: Jeff Jirsa [mailto:jeff.ji...@crowdstrike.com] Sent: Monday, May 23, 2016 4:43 PM To: user@cassandra.apache.org Subject: Re: Removing a datacenter If you remove a node at a tim

Re: Applying TTL Change quickly

2016-05-17 Thread Jeff Jirsa
Fastest way? Stop cassandra, use sstablemetadata to find (and remove) any files whose maxTimestamp is more than 2 days old. Start cassandra. Works better with some compaction strategies than others (you’ll probably find a few droppable sstables with either DTCS or STCS, but it’s not perfect). Cleanest way? One by one (starting with
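
A hedged sketch of the inspection step while the node is stopped (paths are placeholders and the output field name may vary slightly between versions):

    for f in /var/lib/cassandra/data/my_ks/my_table-*/*-Data.db; do
      echo "$f"
      sstablemetadata "$f" | grep -i 'maximum timestamp'   # candidates: sstables whose newest data is already past the new TTL
    done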

Re: restore cassandra snapshots on a smaller cluster

2016-05-17 Thread Jeff Jirsa
http://www.datastax.com/dev/blog/using-the-cassandra-bulk-loader-updated From: Luigi Tagliamonte Reply-To: "user@cassandra.apache.org" Date: Tuesday, May 17, 2016 at 5:35 PM To: "user@cassandra.apache.org" Subject: restore cassandra snapshots on a smaller cluster Hi everyone, i'm

Re: Extending a partially upgraded cluster - supported

2016-05-18 Thread Jeff Jirsa
You can’t stream between versions, so in order to grow the cluster, you’ll need to be entirely on 2.0 or entirely on 2.1. If you go to 2.1 first, be sure you run upgradesstables before you try to extend the cluster. On 5/18/16, 11:17 AM, "Erik Forsberg" wrote: >Hi! >
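
The order of operations, as hedged commands run on each upgraded node before growing the cluster:

    nodetool upgradesstables    # rewrite sstables into the 2.1 format
    # add new nodes only after every existing node is on 2.1 with upgraded sstables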
