Hey guys, quick question:
I've got a v2.1 Cassandra cluster, 12 nodes on AWS i3.2xl, commit log on
one drive, data on NVMe. That was working very well; it's a time-series DB and
has been accumulating data for about 4 weeks.
The nodes have increased in load and compaction seems to be falling
behind. I use
On Tue, Aug 7, 2018 at 7:57 PM Jonathan Haddad wrote:
> What's your window size?
>
> When you say backed up, how are you measuring that? Are there pending
> tasks or do you just see more files than you expect?
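For reference, a few ways to check whether compaction is actually behind on
2.1 (keyspace/table names are placeholders):

  nodetool compactionstats                      # pending compaction tasks
  nodetool tpstats | grep -i compaction         # CompactionExecutor pending/blocked
  nodetool cfstats <keyspace>.<table> | grep "SSTable count"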
>
> On Tue, Aug 7, 2018 at 4:38 PM Brian Spi
lower priority than normal compactions
>
> Are the lots-of-little-files from memtable flushes or
> repair/anticompaction?
>
> Do you do normal deletes? Did you try to run Incremental repair?
>
> --
> Jeff Jirsa
>
>
> On Aug 7, 2018, at 5:00 PM, Brian Spindler
> wrote
> they’re not eligible for compaction with unrepaired sstables and that could
> explain some higher counts
>
> Do you actually do deletes or is everything ttl’d?
>
>
> --
> Jeff Jirsa
>
>
>> On Aug 7, 2018, at 5:09 PM, Brian Spindler wrote:
>>
> Hi Jeff, mostly lots of little files, like there will be 4-5 that are
> 1-1.5 GB or so and then m
In fact all of them say Repaired at: 0.
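For reference, a rough way to spot-check "Repaired at" across a table's
SSTables; sstablemetadata ships with 2.1, and the data path below is only an
example:

  for f in /var/lib/cassandra/data/<keyspace>/<table>-*/*Data.db; do
    echo "$f: $(sstablemetadata "$f" | grep 'Repaired at')"
  done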
On Tue, Aug 7, 2018 at 9:13 PM Brian Spindler
wrote:
> Hi, I spot checked a couple of the files that were ~200 MB and they mostly
> had "Repaired at: 0", so maybe that's not it?
>
> -B
>
>
> On Tue, Aug 7, 2018 at 8:16
I am thinking of abandoning incremental repairs by:
- Setting all repairedAt values to 0 on any/all *Data.db SSTables
- Using either range_repair.py or Reaper to run sub-range repairs
Will this clean everything up?
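A sketch of the first step using the offline sstablerepairedset tool; this
assumes the node is stopped first, and the data path and service name are
placeholders:

  sudo service cassandra stop
  find /var/lib/cassandra/data/<keyspace>/ -name "*Data.db" \
    | xargs sstablerepairedset --really-set --is-unrepaired
  sudo service cassandra start

Repeat per node, then run the sub-range repairs.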
On Tue, Aug 7, 2018 at 9:18 PM Brian Spindler
wrote:
> In fact all of them say Repair
Ma, did you try what Mohamadreza suggested? Having such a large heap means
you are getting a ton of stuff that needs full GC.
On Tue, Aug 28, 2018 at 4:31 AM Pradeep Chhetri
wrote:
> You may want to try upgrading to 3.11.3 instead, which has some memory
> leak fixes.
>
> On Tue, Aug 28, 2018 at
Hi all, we're planning an upgrade from 2.1.5 -> 3.11.3 and currently we have
several column families configured with the TWCS class
'com.jeffjirsa.cassandra.db.compaction.TimeWindowCompactionStrategy', and
with 3.11.3 we need to set it to 'TimeWindowCompactionStrategy'.
Is that a safe operation? Will cas
> --
> Jeff Jirsa
>
>
> On Nov 2, 2018, at 11:28 AM, Brian Spindler
> wrote:
>
> Hi all, we're planning an upgrade from 2.1.5->3.11.3 and currently we have
> several column families configured with the TWCS class
> 'com.jeffjirsa.cassandra.db.compaction.TimeWin
Nevermind, I spoke too quickly. I can change the Cassandra version in the
pom.xml and recompile, thanks!
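In case it helps anyone later, the rough steps I had in mind (repo URL from
memory; check the pom yourself for the exact Cassandra dependency to bump):

  git clone https://github.com/jeffjirsa/twcs.git
  cd twcs
  # edit pom.xml so the Cassandra dependency matches the target version
  mvn clean package
  # copy the resulting jar into Cassandra's lib/ on each node and restart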
On Fri, Nov 2, 2018 at 2:38 PM Brian Spindler
wrote:
> [image: image.png]
>
>
> On Fri, Nov 2, 2018 at 2:34 PM Jeff Jirsa wrote:
>
>> Easiest approach is to build the
Try it in a test cluster before prod
>
>
> --
> Jeff Jirsa
>
>
> On Nov 2, 2018, at 11:49 AM, Brian Spindler
> wrote:
>
> Nevermind, I spoke too quickly. I can change the Cassandra version in the
> pom.xml and recompile, thanks!
>
> On Fri, Nov 2, 2018 at 2:38 PM B
> I suppose you could also disable compaction, switch to something else
> (STCS), do the upgrade, then alter it back to the official TWCS. Whether or
> not this is viable depends on how quickly you write, and how long it'll
> take you to upgrade.
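A sketch of that workaround as I understand it; keyspace/table names and the
window settings are placeholders, not recommendations:

  # before the upgrade: move the table off the custom class
  nodetool disableautocompaction <keyspace> <table>
  cqlsh -e "ALTER TABLE <keyspace>.<table> WITH compaction =
    {'class': 'SizeTieredCompactionStrategy'};"

  # after every node is on 3.11.3: switch to the bundled TWCS
  cqlsh -e "ALTER TABLE <keyspace>.<table> WITH compaction =
    {'class': 'TimeWindowCompactionStrategy',
     'compaction_window_unit': 'DAYS',
     'compaction_window_size': '1'};"
  nodetool enableautocompaction <keyspace> <table>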
>
>
>
>
>
> On Fri,
3b258139da
Thanks!
On Fri, Nov 2, 2018 at 7:24 PM Brian Spindler
wrote:
> I hope I can do as you suggest and leapfrog to 3.11 rather than
> two-stepping it from 3.7 -> 3.11
>
> Just having TWCS has saved me lots of hassle so it’s all good, thanks for
> all you do for our community.
Hi all, we recently started using the cassandra-lucene secondary index
support that Instaclustr assumed ownership of, thank you btw!
We are experiencing a strange issue where adding/removing nodes fails and
the joining node is left hung with a compaction "Secondary index build" and
it jus
Hi folks, hopefully a quick one:
We are running a 12-node cluster (2.1.15) in AWS with Ec2Snitch. It's all in
one region but spread across 3 availability zones. It was nicely balanced with
4 nodes in each.
But with a couple of failures and subsequent provisions to the wrong AZ we now
have a
Thanks for replying Jeff.
Responses below.
On Sat, Aug 12, 2017 at 8:33 PM Jeff Jirsa wrote:
> Answers inline
>
> --
> Jeff Jirsa
>
>
> > On Aug 12, 2017, at 2:58 PM, brian.spind...@gmail.com wrote:
> >
> > Hi folks, hopefully a quick one:
> >
> > We are running a 12 node cluster (2.1.15) in AW
ld. Hard to say for
> sure. ‘nodetool compactionstats’ if you’re able to provide it. The jstack is
> probably not necessary; streaming is being marked as failed and it’s
> turning itself off. Not sure why streaming is marked as failing, though,
> anything on the sending sides?
>
>
>
ruption. Disk space
is getting low on these nodes ...
On Sat, Aug 12, 2017 at 9:51 PM Brian Spindler
wrote:
> Nothing in the logs on the node that it was streaming from.
>
> however, I think I found the issue on the other node in the C rack:
>
> ERROR [STREAM-IN-/10.40.17.114
; faster with less data.
>
>
> --
> Jeff Jirsa
>
>
> On Aug 13, 2017, at 7:11 AM, Brian Spindler
> wrote:
>
> Hi Jeff, I ran the scrub online and that didn't help. I went ahead and
> stopped the node, deleted all the corrupted data files --*.db
across all nodes for corrupt exceptions - so far no new
occurrences.
Thanks again.
-B
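For reference, the sweep can be as simple as this (log path and SSH access
are assumptions):

  for ip in $(nodetool status | awk '/^UN/ {print $2}'); do
    echo "== $ip =="
    ssh "$ip" 'grep -c -i corrupt /var/log/cassandra/system.log'
  done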
On Sun, Aug 13, 2017 at 17:52 kurt greaves wrote:
>
>
> On 14 Aug. 2017 00:59, "Brian Spindler" wrote:
>
> Do you think with the setup I've described I'd be ok doing that
Hi guys, our cluster - around 18 nodes - just started having nodes die, and
when we restart them they die with OOM. How can we handle this?
I've tried adding a couple of extra gigs on these machines to help but it's
not helping.
Help!
-B
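(In case it's useful to others: on 2.1 the heap is normally raised via
conf/cassandra-env.sh rather than on the command line; the sizes below are
placeholders only, as a stop-gap while the real cause is found.)

  # conf/cassandra-env.sh
  MAX_HEAP_SIZE="8G"
  HEAP_NEWSIZE="800M"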
that column families. I think we
should review this column family design so it doesn't generate so many
tombstones. Could that be the cause? What else would you recommend?
Thank you in advance.
On Fri, Oct 6, 2017 at 6:33 AM Brian Spindler
wrote:
> Hi guys, our cluster - around 18 nodes -
memory available, the heap size and GC type
> and options in use. Do you see some GC pauses in the logs or do you control
> this value through a chart using JVM metrics?
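For the GC-pause question, a quick check against the logs; 2.1 reports pauses
via GCInspector, and the log path is an example:

  grep "GCInspector" /var/log/cassandra/system.log | tail -n 20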
>
> C*heers,
>
> -------
> Alain Rodriguez - @arodream - al...@thelastpickle.com
> France / Spain
>
> The Last P
? Or do I (can I?) manually delete these
files and will C* just ignore the overlapping data and treat it as tombstoned?
What else should/could be done?
Thank you in advance for your advice,
Brian Spindler
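One tool that may help here is sstableexpiredblockers, which ships with 2.1
and reports which newer SSTables overlap fully expired ones and keep them on
disk (keyspace/table are placeholders):

  sstableexpiredblockers <keyspace> <table>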
I probably should have mentioned our setup: we’re on Cassandra version
2.1.15.
On Sat, Jan 20, 2018 at 9:33 AM Brian Spindler
wrote:
> Hi, I have several column families using TWCS and it’s great.
> Unfortunately we seem to have missed the great advice in Alex’s article
> h
d be kept
> as a last resort option if you're running out of space.
>
> Cheers,
>
> On Sat, Jan 20, 2018 at 15:41, Brian Spindler
> wrote:
>
>> I probably should have mentioned our setup: we’re on Cassandra version
>> 2.1.15.
>>
>>
>> On Sa
/58440e707cd6490847a37dc8d76c150d3eb27aab#diff-e8e282423dcbf34d30a3578c8dec15cdR176
Cheers,
-B
On Sat, Jan 20, 2018 at 10:49 AM Brian Spindler
wrote:
> Hi Alexander, thanks for your response! I'll give it a shot.
>
> On Sat, Jan 20, 2018 at 10:22 AM Alexander Dejanovski <
> a...@thelastpickle.com>
> So put dclocal_... to 0.0.
>
> The commit you're referring to has been merged in 3.11.1, as 2.1 isn't
> patched anymore.
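My reading of that advice, assuming "dclocal_..." means
dclocal_read_repair_chance (keyspace/table are placeholders):

  cqlsh -e "ALTER TABLE <keyspace>.<table>
    WITH dclocal_read_repair_chance = 0.0
    AND read_repair_chance = 0.0;"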
>
>
>> On Sat, Jan 20, 2018 at 16:55, Brian Spindler
>> wrote:
>> Hi Alexander, after re-reading this
>> https://issues.a
Hi Karthick, repairs can be tricky.
You can (and probably should) run repairs as a part of routine maintenance. And
of course absolutely if you lose a node in a bad way. If you decommission a
node for example, no “extra” repair needed.
If you are using TWCS you should probably not run repai
I would start here: http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html
Specifically the “Hints and repairs” and “Timestamp overlap” sections might be
of use.
-B
> On Jan 25, 2018, at 11:05 AM, Thakrar, Jayesh
> wrote:
>
> Wondering if I can get some pointers to what's happening here
It's all here:
https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/opsRepairNodesWhen.html
-B
On Thu, Jan 25, 2018 at 6:08 AM Karthick V wrote:
> *You can (and probably should) run repairs as a part of routine
>> maintenance.*
>
>
> Can you explain any use case for why we need th
Hi guys, I've got a 2.1.15 node that will not start, it seems. It hangs on
"Opening system.size_estimates". Sometimes that can take a while, but I've let
it run for 90 minutes and nothing. Should I move this SSTable out of the way
to let it start? Will it rebuild/refresh size estimates if I remove that
folder?
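If it comes to moving it aside, the sort of thing I have in mind (paths and
service name are examples); my understanding is Cassandra repopulates
size_estimates periodically, so nothing should be lost:

  sudo service cassandra stop
  mkdir -p /tmp/size_estimates.bak
  mv /var/lib/cassandra/data/system/size_estimates-* /tmp/size_estimates.bak/
  sudo service cassandra start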
lem. It
> flushed some new SSTables a short while after.
>
> I honestly do not know the specifics of how size_estimates is used, but if it
> prevented a node from restarting I'd definitely remove the sstables to get it
> back up.
>
> Cheers,
>
>> On Sat, Fe