Re: Data corruption, invalid UTF-8 bytes

2018-01-03 Thread Stefano Ortolani
used by the java driver, but I fail to see how that could be different when using CQLSH (python). Is anybody more familiar with the reading path able to shed some light on the stack trace? Thanks, Stefano On Tue, Jan 2, 2018 at 6:44 PM, Stefano Ortolani <ostef...@gmail.com> wrote: >

Data corruption, invalid UTF-8 bytes

2018-01-02 Thread Stefano Ortolani
Hi all, apparently the year started with a node (version 3.0.15) exhibiting some data corruption (discovered by a spark job enumerating all keys). The exception is attached below. The invalid string is a partition key, and it is supposed to be a file name. If I manually decode the bytes I get
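
A minimal sketch of the kind of manual check mentioned above: feeding raw partition-key bytes through Python's UTF-8 decoder to see where they stop being valid. The byte string below is a hypothetical stand-in, not the actual corrupted key from the exception.

    # Hypothetical bytes standing in for the corrupted partition key;
    # 0xF5 can never appear in well-formed UTF-8.
    raw_key = b"some-file-name-\xf5\x9c.txt"

    try:
        print("valid UTF-8:", raw_key.decode("utf-8"))
    except UnicodeDecodeError as exc:
        # Report where the invalid sequence starts, then decode lossily
        # so the readable part of the key is still visible.
        print("invalid UTF-8 at byte offset", exc.start)
        print("lossy decoding:", raw_key.decode("utf-8", errors="replace"))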

Re: Bootstrapping a node fails because of compactions not keeping up

2017-10-15 Thread Stefano Ortolani
of each partition is in rac3, which is > going to blow up that instance > > > > -- > Jeff Jirsa > > > On Oct 15, 2017, at 1:42 PM, Stefano Ortolani <ostef...@gmail.com> wrote: > > Hi Jeff, > > this is my third attempt bootstrapping the node, so I tried several tri

Re: Bootstrapping a node fails because of compactions not keeping up

2017-10-15 Thread Stefano Ortolani
> Can you post (anonymize as needed) nodetool status, nodetool netstats, > nodetool tpstats, and nodetool compactionstats? > > -- > Jeff Jirsa > > > On Oct 15, 2017, at 1:14 PM, Stefano Ortolani <ostef...@gmail.com> wrote: > > Hi Jeff, > > that would be 3.0.15,

Re: Bootstrapping a node fails because of compactions not keeping up

2017-10-15 Thread Stefano Ortolani
Hi Jeff, that would be 3.0.15, single disk, vnodes enabled (num_tokens 256). Stefano On Sun, Oct 15, 2017 at 9:11 PM, Jeff Jirsa <jji...@gmail.com> wrote: > What version? > > Single disk or JBOD? > > Vnodes? > > -- > Jeff Jirsa > > > On Oct 15, 201

Re: Bootstrapping a node fails because of compactions not keeping up

2017-10-15 Thread Stefano Ortolani
Does anybody know anything else I could try? Cheers, Stefano On Fri, Oct 13, 2017 at 3:58 PM, Stefano Ortolani <ostef...@gmail.com> wrote: > Other little update: at the same time I see the number of pending tasks > stuck (in this case at 1847); restarting the node doesn't help, so

Re: Bootstrapping a node fails because of compactions not keeping up

2017-10-13 Thread Stefano Ortolani
other nodes. Feeling more and more puzzled here :S On Fri, Oct 13, 2017 at 1:28 PM, Stefano Ortolani <ostef...@gmail.com> wrote: > I have been trying to add another node to the cluster (after upgrading to > 3.0.15) and I just noticed through "nodetool netstats" that

Re: Bootstrapping a node fails because of compactions not keeping up

2017-10-13 Thread Stefano Ortolani
I have been trying to add another node to the cluster (after upgrading to 3.0.15) and I just noticed through "nodetool netstats" that all nodes have been streaming to the joining node approx 1/3 of their SSTables, basically their whole primary range (using RF=3)? Is this expected/normal? I was
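
A back-of-the-envelope check of that observation, with assumed, illustrative numbers: with RF=3 each node stores roughly three times its primary range, so streaming only the primary range would indeed amount to about one third of a node's SSTable data.

    nodes = 12            # hypothetical cluster size
    rf = 3                # replication factor from the thread
    total_data_tb = 10.0  # hypothetical total dataset size

    per_node = total_data_tb * rf / nodes   # data each node stores
    primary_range = total_data_tb / nodes   # data in its primary range
    print(primary_range / per_node)         # ~0.333, i.e. 1/3 of its SSTables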

Re: Wide rows splitting

2017-09-18 Thread Stefano Ortolani
You might find this interesting: https://medium.com/@foundev/synthetic-sharding-in-cassandra-to-deal-with-large-partitions-2124b2fd788b Cheers, Stefano On Mon, Sep 18, 2017 at 5:07 AM, Adam Smith wrote: > Dear community, > > I have a table with inlinks to URLs, i.e.
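
For reference, a sketch of the synthetic-sharding idea the linked article describes, applied to the inlinks use case from the question: add a shard column to the partition key and derive it from a hash, so one huge logical partition becomes N smaller physical ones. Table name, column names, and the bucket count are assumptions, not details from the thread.

    import hashlib

    NUM_SHARDS = 16  # assumed bucket count; pick it to keep partitions small

    SCHEMA = """
    CREATE TABLE IF NOT EXISTS links.inlinks_sharded (
        url    text,
        shard  int,
        inlink text,
        PRIMARY KEY ((url, shard), inlink)
    );
    """

    def shard_for(inlink: str) -> int:
        # Deterministic bucket derived from the clustering value, so the
        # same (url, inlink) pair always lands in the same partition.
        return hashlib.md5(inlink.encode("utf-8")).digest()[0] % NUM_SHARDS

    # Reads fan out over all NUM_SHARDS partitions of a URL, e.g.
    # SELECT * FROM links.inlinks_sharded WHERE url = ? AND shard = ?
    # issued once per shard, with results merged client-side.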

Re: Bootstrapping a node fails because of compactions not keeping up

2017-08-23 Thread Stefano Ortolani
Hi Kurt, On Wed, Aug 23, 2017 at 11:32 AM, kurt greaves wrote: > > 1) You mean restarting the node in the middle of the bootstrap with >> join_ring=false? Would this option require me to issue a nodetool bootstrap >> resume, correct? I didn't know you could instruct the

Re: Bootstrapping a node fails because of compactions not keeping up

2017-08-23 Thread Stefano Ortolani
Hi Kurt, 1) You mean restarting the node in the middle of the bootstrap with join_ring=false? Would this option require me to issue a nodetool bootstrap resume, correct? I didn't know you could instruct the join via JMX. Would it be the same as the nodetool bootstrap command? 2) Yes, they are

Re: Bootstrapping a node fails because of compactions not keeping up

2017-08-23 Thread Stefano Ortolani
Hi Kurt, sorry, I forgot to specify. I am on 3.0.14. Cheers, Stefano On Wed, Aug 23, 2017 at 12:11 AM, kurt greaves wrote: > What version are you running? 2.2 has an improvement that will retain > levels when streaming and this shouldn't really happen. If you're on 2.1 >

Bootstrapping a node fails because of compactions not keeping up

2017-08-22 Thread Stefano Ortolani
compaction at L0 is done with STCS, but 1 TB is way more than twice the amount of data the node should own in theory, so something else might be responsible for the over streaming. Thanks in advance! Stefano Ortolani

Re: Is it safe to upgrade 2.2.6 to 3.0.13?

2017-05-20 Thread Stefano Ortolani
AM, Varun Gupta <var...@uber.com> wrote: > We upgraded from 2.2.5 to 3.0.11 and it works fine. I will suggest not to > go with 3.0.13, we are seeing some issues with schema mismatch due to which > we had to rollback to 3.0.11. > > Thanks, > Varun > > On May 19, 2

Re: Is it safe to upgrade 2.2.6 to 3.0.13?

2017-05-19 Thread Stefano Ortolani
Here (https://github.com/apache/cassandra/blob/cassandra-3.0/NEWS.txt) it is stated that the minimum supported version for the 2.2.X branch is 2.2.2. On Fri, May 19, 2017 at 2:16 PM, Nicolas Guyomar wrote: > Hi Xihui, > > I was looking for this documentation also, but I

Re: Range deletes, wide partitions, and reverse iterators

2017-05-16 Thread Stefano Ortolani
> On 16 May 2017, at 19:40, Stefano Ortolani <ostef...@gmail.com> wrote: > > Little update: also the following query times out, which is weird since the > range tombstone should have been read by then... > > SELECT * > FROM test_cql.test_cf > WHERE hash

Re: Range deletes, wide partitions, and reverse iterators

2017-05-16 Thread Stefano Ortolani
2017 at 5:17 PM, Stefano Ortolani <ostef...@gmail.com> wrote: > Yes, that was my intention but I wanted to cross-check with the ML and the > devs keeping an eye on it first. > > On Tue, May 16, 2017 at 5:10 PM, Hannu Kröger <hkro...@gmail.com> wrote: > >> Well, >

Re: Range deletes, wide partitions, and reverse iterators

2017-05-16 Thread Stefano Ortolani
t > information and the tombstone timestamp it might be possible to skip some > data but I’m not sure that Cassandra currently does that. Maybe it would be > worth a JIRA ticket and see what the devs think about it. If optimizing > this case would make sense. > > Hannu > > On 16 May

Re: Range deletes, wide partitions, and reverse iterators

2017-05-16 Thread Stefano Ortolani
16, 2017, at 10:03 AM, Stefano Ortolani <ostef...@gmail.com> wrote: > > Hi Hannu, > > the piece of data in question is older. In my example the tombstone is the > newest piece of data. > Since a range tombstone has information re the clustering key ranges, and > the data i

Re: Range deletes, wide partitions, and reverse iterators

2017-05-16 Thread Stefano Ortolani
> Therefore some partition level statistics of cell ages would need to be > kept in the column index for the skipping and that is probably not there. > > Hannu > > On 16 May 2017, at 17:33, Stefano Ortolani <ostef...@gmail.com> wrote: > > That is another way to see the question:

Re: Range deletes, wide partitions, and reverse iterators

2017-05-16 Thread Stefano Ortolani
Therefore you will get an immediate answer. > > > > Does it make sense? > > > > Hannu > > > >> On 16 May 2017, at 16:33, Stefano Ortolani <ostef...@gmail.com> wrote: > >> > >> Hi all, > >> > >> I am seeing inconsist

Range deletes, wide partitions, and reverse iterators

2017-05-16 Thread Stefano Ortolani
Hi all, I am seeing inconsistencies when mixing range tombstones, wide partitions, and reverse iterators. I still have to understand if the behaviour is to be expected, hence the message on the mailing list. The situation is conceptually simple. I am using a table defined as follows: CREATE
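
Since the CREATE statement is cut off in the excerpt, here is a hypothetical reproduction of the scenario being described, using the DataStax Python driver: a wide partition (test_cql.test_cf with a text hash key, as in the later messages of the thread), a range tombstone deleting the older rows, and a reverse read over the same partition. The column names, clustering key, and keyspace settings are assumptions.

    from datetime import datetime
    from cassandra.cluster import Cluster

    session = Cluster(["127.0.0.1"]).connect()

    session.execute("""
        CREATE KEYSPACE IF NOT EXISTS test_cql
        WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}
    """)
    session.execute("""
        CREATE TABLE IF NOT EXISTS test_cql.test_cf (
            hash text,
            ts   timestamp,
            body text,
            PRIMARY KEY (hash, ts)
        )
    """)

    # Range delete: a single range tombstone covering every row of the
    # partition older than the cutoff.
    session.execute(
        "DELETE FROM test_cql.test_cf WHERE hash = %s AND ts < %s",
        ("some-partition", datetime(2017, 1, 1)),
    )

    # Reverse iteration over the same partition: the read path the thread
    # reports as behaving inconsistently.
    for row in session.execute(
        "SELECT * FROM test_cql.test_cf WHERE hash = %s ORDER BY ts DESC LIMIT 10",
        ("some-partition",),
    ):
        print(row)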

Re: LCS, range tombstones, and eviction

2017-05-16 Thread Stefano Ortolani
'deleted' > data from being returned in the read. It's a bit more complicated than > that, but that's the general idea. > > > On May 12, 2017 at 6:23:01 AM, Stefano Ortolani (ostef...@gmail.com) > wrote: > > Thanks a lot Blake, that definitely helps! > > I

Re: LCS, range tombstones, and eviction

2017-05-12 Thread Stefano Ortolani
e a *lot* of over streaming, so you might want to take a look at how > much streaming your cluster is doing with full repairs, and incremental > repairs. It might actually be more efficient to run full repairs. > > Hope that helps, > > Blake > > On May 11, 2017 at 7:16:26

LCS, range tombstones, and eviction

2017-05-11 Thread Stefano Ortolani
Hi all, I am trying to wrap my head around how C* evicts tombstones when using LCS. Based on what I understood reading the docs, if the ratio of garbage collectable tombstones exceeds the "tombstone_threshold", C* should start compacting and evicting. I am quite puzzled however by what might
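
A sketch (with assumed keyspace/table names and illustrative values) of the knobs involved: the LCS subproperties that make a single SSTable eligible for a tombstone compaction. tombstone_threshold is the ratio of droppable tombstones mentioned above; unchecked_tombstone_compaction lets the compaction run without the overlap check that often prevents it from being scheduled.

    from cassandra.cluster import Cluster

    session = Cluster(["127.0.0.1"]).connect()

    # Illustrative values; the defaults are 0.2 for tombstone_threshold,
    # 86400s for tombstone_compaction_interval, and false for
    # unchecked_tombstone_compaction.
    session.execute("""
        ALTER TABLE my_ks.my_cf
        WITH compaction = {
            'class': 'LeveledCompactionStrategy',
            'tombstone_threshold': '0.1',
            'tombstone_compaction_interval': '86400',
            'unchecked_tombstone_compaction': 'true'
        }
    """)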

Re: Incremental repairs leading to unrepaired data

2016-11-01 Thread Stefano Ortolani
sequentially on each node (no overlapping, next node waits for the previous to complete). Regards, Stefano Ortolani On Mon, Oct 31, 2016 at 11:18 PM, kurt Greaves <k...@instaclustr.com> wrote: > Blowing out to 1k SSTables seems a bit full on. What args are you passing to > repair? > >
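
A minimal sketch of that kind of sequential repair driver: invoke nodetool repair against one host at a time and only move to the next once the previous run has exited. The host names and the absence of extra repair flags are placeholders, not details from the thread.

    import subprocess

    HOSTS = ["node1.example.com", "node2.example.com", "node3.example.com"]

    for host in HOSTS:
        print("repairing", host)
        # check=True stops the loop if a repair fails, so a broken node is
        # noticed instead of silently moving on to the next one.
        subprocess.run(["nodetool", "-h", host, "repair"], check=True)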

Re: Incremental repairs leading to unrepaired data

2016-10-31 Thread Stefano Ortolani
have any impact in theory. Nodes do not seem that overloaded either, and I don't see any GC spikes while those mutations are dropped :/ Hitting a dead end here; any idea where else to look? Regards, Stefano On Wed, Aug 10, 2016 at 12:41 PM, Stefano Ortolani <ostef...@gmail.

Re: Import failure for use python cassandra-driver

2016-10-26 Thread Stefano Ortolani
Did you try the workaround they posted (aka, downgrading Cython)? Cheers, Stefano On Wed, Oct 26, 2016 at 10:01 AM, Zao Liu wrote: > Same happen to my ubuntu boxes. > > File >

Re: Repairing without -pr shows unexpected out-of-sync ranges

2016-10-03 Thread Stefano Ortolani
, Sep 27, 2016 at 4:09 PM, Stefano Ortolani <ostef...@gmail.com> wrote: > Didn't know about (2), and I actually have a time drift between the nodes. > Thanks a lot Paulo! > > Regards, > Stefano > > On Thu, Sep 22, 2016 at 6:36 PM, Paulo Motta <pauloricard...@gmail.com>

Re: Repairing without -pr shows unexpected out-of-sync ranges

2016-09-27 Thread Stefano Ortolani
epair will not be marked as > repaired, so nodes with different compaction cadences will have different > data in their unrepaired set, what will cause mismatches in the subsequent > incremental repairs. CASSANDRA-9143 will hopefully fix that limitation. > > 2016-09-22 7:10 GMT-03:

Repairing without -pr shows unexpected out-of-sync ranges

2016-09-22 Thread Stefano Ortolani
Hi, I am seeing something weird while running repairs. I am testing 3.0.9 so I am running the repairs manually, node after node, on a cluster with RF=3. I am using a standard repair command (incremental, parallel, full range), and I just noticed that the third node detected some ranges out of

Re: How to start using incremental repairs?

2016-08-26 Thread Stefano Ortolani
> the repairedAt field is mutated), which is leveraged by full range repair, > which would not work in many cases for partial range repairs, yielding > higher I/O. > > 2016-08-26 10:17 GMT-03:00 Stefano Ortolani <ostef...@gmail.com>: > >> I see. Didn't think about it that way. T

Re: How to start using incremental repairs?

2016-08-26 Thread Stefano Ortolani
> you will not have the problem of re-doing work as in non-inc non-pr repair. > > 2016-08-26 7:57 GMT-03:00 Stefano Ortolani <ostef...@gmail.com>: > >> Hi Paulo, could you elaborate on 2? >> I didn't know incremental repairs were not compatible with -pr >> What is the

Re: How to start using incremental repairs?

2016-08-26 Thread Stefano Ortolani
Hi Paulo, could you elaborate on 2? I didn't know incremental repairs were not compatible with -pr. What is the underlying reason? Regards, Stefano On Fri, Aug 26, 2016 at 1:25 AM, Paulo Motta wrote: > 1. Migration procedure is no longer necessary after CASSANDRA-8004,

Re: JVM Crash on 3.0.6

2016-08-11 Thread Stefano Ortolani
Not really related, but know that on 12.04 I had to disable jemalloc, otherwise nodes would randomly die at startup ( https://issues.apache.org/jira/browse/CASSANDRA-11723) Regards, Stefano On Thu, Aug 11, 2016 at 10:28 AM, Riccardo Ferrari wrote: > Hi C* users, > > In

Re: Incremental repairs leading to unrepaired data

2016-08-10 Thread Stefano Ortolani
The problem might be somewhere else. Generally > dropped mutations are a signal of cluster overload, so if there's nothing > else wrong perhaps you need to increase your capacity. What version are you > in? > > 2016-08-10 8:21 GMT-03:00 Stefano Ortolani <ostef...@gmail.com>: > >

Re: Incremental repairs leading to unrepaired data

2016-08-10 Thread Stefano Ortolani
cassandra.yaml or via nodetool > setcompactionthroughput. Did you try lowering that and checking if that > improves the dropped mutations? > > 2016-08-09 13:32 GMT-03:00 Stefano Ortolani <ostef...@gmail.com>: > >> Hi all, >> >> I am running incremental repairs on a w

Incremental repairs leading to unrepaired data

2016-08-09 Thread Stefano Ortolani
Hi all, I am running incremental repairs on a weekly basis (can't do it every day as one single run takes 36 hours), and every time, I have at least one node dropping mutations as part of the process (this happens almost always during the anticompaction phase). Ironically this leads to a system where

Re: (C)* stable version after 3.5

2016-07-14 Thread Stefano Ortolani
FWIW, I've recently upgraded from 2.1 to 3.0 without issues of any sort, but admittedly I haven't been using anything too fancy. Cheers, Stefano On Wed, Jul 13, 2016 at 10:28 PM, Alain RODRIGUEZ wrote: > Hi Anuj > > From >

Re: Open source equivalents of OpsCenter

2016-07-14 Thread Stefano Ortolani
Replaced OpsCenter with a mix of: * metrics-graphite-3.1.0.jar installed in the same classpath of C* * Custom script to push system metrics (cpu/mem/io) * Grafana to create the dashboard * Custom repairs script Still not optimal but getting there... Stefano On Thu, Jul 14, 2016 at 10:18 AM,
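
As an example of the "custom script to push system metrics" piece, a sketch using Graphite's plaintext protocol (one "path value timestamp" line per datapoint over TCP port 2003). The host, metric path, and reported value are assumptions.

    import os
    import socket
    import time

    GRAPHITE_HOST = "graphite.example.com"  # placeholder
    GRAPHITE_PORT = 2003                    # default plaintext listener

    def push_metric(path, value):
        line = "{} {} {}\n".format(path, value, int(time.time()))
        with socket.create_connection((GRAPHITE_HOST, GRAPHITE_PORT), timeout=5) as sock:
            sock.sendall(line.encode("ascii"))

    # Example: report the host's 1-minute load average.
    push_metric("servers.cassandra01.load.1min", os.getloadavg()[0])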

Re: C* 3.0.7 - Compactions pending after TRUNCATE

2016-06-28 Thread Stefano Ortolani
I am updating the following ticket https://issues.apache.org/jira/browse/CASSANDRA-12100 as I discover new bits. Regards, Stefano On Tue, Jun 28, 2016 at 9:37 AM, Stefano Ortolani <ostef...@gmail.com> wrote: > Hi all, > > I've just updated to C* 3.0.7, and I am now seeing s

C* 3.0.7 - Compactions pending after TRUNCATE

2016-06-28 Thread Stefano Ortolani
ago. I am fairly confident this issue was not there in C* 3.0.5. Any idea? Regards, Stefano Ortolani

Re: Question about sequential repair vs parallel

2016-06-23 Thread Stefano Ortolani
Yes, because you keep a snapshot in the meantime, if I remember correctly. Regards, Stefano On Thu, Jun 23, 2016 at 4:22 PM, Jean Carlo wrote: > Cassandra 2.1.12 > > In the moment of a repair -pr sequential, we are experiencing an > exponential increase in the number of

Re: Incorrect progress percentage while repairing

2016-06-02 Thread Stefano Ortolani
Forgot to add the C* version. That would be 3.0.6. Regards, Stefano Ortolani On Thu, Jun 2, 2016 at 3:55 PM, Stefano Ortolani <ostef...@gmail.com> wrote: > Hi, > > While running incremental (parallel) repairs on the first partition range > (-pr), I rarely see the progress per

Incorrect progress percentage while repairing

2016-06-02 Thread Stefano Ortolani
%) Nodetool does return normally and no error is found in its output or in the cassandra logs. Any idea why? Is this behavior expected? Regards, Stefano Ortolani

Re: Upgrade from 2.1.11 to 3.0.5 leads to unstable nodes

2016-05-06 Thread Stefano Ortolani
, Stefano Ortolani <ostef...@gmail.com> wrote: > Hi, > > I am experiencing some weird behaviors after upgrading 2 nodes (out of 13) > to C* 3.0.5 (from 2.1.11). Basically, after restarting a second time, there > is a small chance that the node will die without outputting a

Upgrade from 2.1.11 to 3.0.5 leads to unstable nodes

2016-05-05 Thread Stefano Ortolani
Hi, I am experiencing some weird behaviors after upgrading 2 nodes (out of 13) to C* 3.0.5 (from 2.1.11). Basically, after restarting a second time, there is a small chance that the node will die without outputting anything to the logs (not even dmesg). This happened on both nodes I upgraded.

Re: [Marketing Mail] Migrating to incremental repairs

2015-11-19 Thread Stefano Ortolani
As far as I know, the docs are quite inconsistent on the matter. Based on some research here and on IRC, recent versions of Cassandra do not require anything specific when migrating to incremental repairs other than the -inc switch, even on LCS. Any confirmation on the matter is more than welcome.

Re: Running Cassandra on Java 8 u60..

2015-09-25 Thread Stefano Ortolani
I think those were referring to Java7 and G1GC (early versions were buggy). Cheers, Stefano On Fri, Sep 25, 2015 at 5:08 PM, Kevin Burton wrote: > Any issues with running Cassandra 2.0.16 on Java 8? I remember there is > long term advice on not changing the GC but not the

Re: Leveled Compaction Strategy with a really intensive delete workload

2015-05-26 Thread Stefano Ortolani
= 0.000508MB/s. 6 total partitions merged to 3. Partition merge counts were {1:2, 4:1, } hth jason On Tue, May 26, 2015 at 6:24 AM, Stefano Ortolani ostef...@gmail.com wrote: Ok, I am reading a bit more about compaction subproperties here ( http://docs.datastax.com/en/cql/3.1/cql/cql_reference

Re: LeveledCompactionStrategy

2015-05-26 Thread Stefano Ortolani
Hi Jean, I am trying to solve a similar problem here. I would say that the only deterministic way is to rebuild the SSTables of that column family via nodetool scrub. Otherwise you'd need to: * decrease tombstone_threshold * wait for gc_grace_time Cheers, Stefano On Tue, May 26, 2015 at

Re: Leveled Compaction Strategy with a really intensive delete workload

2015-05-25 Thread Stefano Ortolani
for read intensive workloads. Depending on your use case, you might be better off with date tiered or size tiered strategy. regards On Sun, May 24, 2015 at 10:50 AM, Stefano Ortolani ostef...@gmail.com wrote: Hi all, I have a question re leveled compaction strategy that has

Re: Leveled Compaction Strategy with a really intensive delete workload

2015-05-25 Thread Stefano Ortolani
) is possible without downtime, and how fast those values are picked up? Cheers, Stefano On Mon, May 25, 2015 at 1:32 PM, Stefano Ortolani ostef...@gmail.com wrote: Hi all, Thanks for your answers! Yes, I agree that a delete intensive workload is not something Cassandra is designed

Leveled Compaction Strategy with a really intensive delete workload

2015-05-24 Thread Stefano Ortolani
took place)? Regards, Stefano Ortolani

Re: Recommissioned a node

2015-02-12 Thread Stefano Ortolani
Definitely, I think exactly the same regarding this issue. On Thu, Feb 12, 2015 at 7:04 AM, Eric Stevens migh...@gmail.com wrote: I definitely find it surprising that a node which was decommissioned is willing to rejoin a cluster. I can't think of any legitimate scenario where you'd want that, and I'm

Re: Recommissioned a node

2015-02-11 Thread Stefano Ortolani
having a consistent view of the data. A safer approach would be to wipe the data directory and bootstrap it as a clean new member. I'm curious what prompted that cycle of decommission then recommission. On Tue, Feb 10, 2015 at 10:13 PM, Stefano Ortolani ostef...@gmail.com wrote: Hi, I