used by the java driver, but I fail to see how
that could be different when using CQLSH (python).
Could anybody more familiar with the read path shed some light on
the stack trace?
Thanks,
Stefano
On Tue, Jan 2, 2018 at 6:44 PM, Stefano Ortolani <ostef...@gmail.com> wrote:
Hi all,
apparently the year started with a node (version 3.0.15) exhibiting some
data corruption (discovered by a spark job enumerating all keys).
The exception is attached below.
The invalid string is a partition key, and it is supposed to be a file
name. If I manually decode the bytes I get
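For what it's worth, a quick way to eyeball suspect key bytes is to decode them with replacement so the corrupt sequence stands out instead of raising; the hex payload below is a made-up illustration, not the actual corrupt key:

```python
# Hypothetical raw partition-key bytes (0xff is not valid UTF-8),
# standing in for the corrupt key from the exception.
raw = bytes.fromhex("66696c65ff2e747874")

# errors="replace" keeps the decode from raising and marks bad bytes
# with U+FFFD, so you can see exactly where the file name breaks.
decoded = raw.decode("utf-8", errors="replace")
print(decoded)  # file\ufffd.txt
```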
of each partition is in rac3, which is
> going to blow up that instance
>
>
>
> --
> Jeff Jirsa
>
>
> On Oct 15, 2017, at 1:42 PM, Stefano Ortolani <ostef...@gmail.com> wrote:
>
> Hi Jeff,
>
> this is my third attempt bootstrapping the node so I tried several tri
> Can you post (anonymize as needed) nodetool status, nodetool netstats,
> nodetool tpstats, and nodetool compactionstats?
>
> --
> Jeff Jirsa
>
>
> On Oct 15, 2017, at 1:14 PM, Stefano Ortolani <ostef...@gmail.com> wrote:
>
> Hi Jeff,
>
> that would be 3.0.15,
Hi Jeff,
that would be 3.0.15, single disk, vnodes enabled (num_tokens 256).
Stefano
On Sun, Oct 15, 2017 at 9:11 PM, Jeff Jirsa <jji...@gmail.com> wrote:
> What version?
>
> Single disk or JBOD?
>
> Vnodes?
>
> --
> Jeff Jirsa
>
>
> On Oct 15, 201
Does anybody know anything else I could try?
Cheers,
Stefano
On Fri, Oct 13, 2017 at 3:58 PM, Stefano Ortolani <ostef...@gmail.com>
wrote:
> Other little update: at the same time I see the number of pending tasks
> stuck (in this case at 1847); restarting the node doesn't help, so
ther nodes.
Feeling more and more puzzled here :S
On Fri, Oct 13, 2017 at 1:28 PM, Stefano Ortolani <ostef...@gmail.com>
wrote:
> I have been trying to add another node to the cluster (after upgrading to
> 3.0.15) and I just noticed through "nodetool netstats" that
I have been trying to add another node to the cluster (after upgrading to
3.0.15) and I just noticed through "nodetool netstats" that all nodes have
been streaming to the joining node approx 1/3 of their SSTables, basically
their whole primary range (using RF=3)?
Is this expected/normal?
I was
You might find this interesting:
https://medium.com/@foundev/synthetic-sharding-in-cassandra-to-deal-with-large-partitions-2124b2fd788b
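The core idea in that post, sketched minimally (the shard count and hashing scheme below are illustrative assumptions, not the article's exact code):

```python
import hashlib

NUM_SHARDS = 10  # assumption: tune so each sub-partition stays small

def shard_for(natural_key: str, clustering_value: str) -> int:
    """Derive a deterministic shard id so the same row always lands in
    the same sub-partition, e.g. PRIMARY KEY ((natural_key, shard), ...)."""
    digest = hashlib.md5(clustering_value.encode()).digest()
    return int.from_bytes(digest[:4], "big") % NUM_SHARDS

# Writers store under (key, shard); readers fan out over all NUM_SHARDS
# sub-partitions and merge the results client-side.
print(shard_for("http://example.com", "inlink-123"))
```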
Cheers,
Stefano
On Mon, Sep 18, 2017 at 5:07 AM, Adam Smith wrote:
> Dear community,
>
> I have a table with inlinks to URLs, i.e.
Hi Kurt,
On Wed, Aug 23, 2017 at 11:32 AM, kurt greaves wrote:
>
> 1) You mean restarting the node in the middle of the bootstrap with
>> join_ring=false? This option would require me to issue a nodetool bootstrap
>> resume, correct? I didn't know you could instruct the
Hi Kurt,
1) You mean restarting the node in the middle of the bootstrap with
join_ring=false? This option would require me to issue a nodetool bootstrap
resume, correct? I didn't know you could instruct the join via JMX. Would
it be the same as the nodetool bootstrap command?
2) Yes, they are
Hi Kurt,
sorry, I forgot to specify. I am on 3.0.14.
Cheers,
Stefano
On Wed, Aug 23, 2017 at 12:11 AM, kurt greaves wrote:
> What version are you running? 2.2 has an improvement that will retain
> levels when streaming and this shouldn't really happen. If you're on 2.1
>
compaction at L0 is done with STCS, but 1 TB is way more than twice the
amount of data the node should own in theory, so something else might be
responsible for the over streaming.
Thanks in advance!
Stefano Ortolani
AM, Varun Gupta <var...@uber.com> wrote:
> We upgraded from 2.2.5 to 3.0.11 and it works fine. I would suggest not
> going with 3.0.13; we are seeing some issues with schema mismatch due to which
> we had to roll back to 3.0.11.
>
> Thanks,
> Varun
>
> On May 19, 2
Here (https://github.com/apache/cassandra/blob/cassandra-3.0/NEWS.txt) it is
stated that the minimum supported version for the 2.2.x branch is 2.2.2.
On Fri, May 19, 2017 at 2:16 PM, Nicolas Guyomar
wrote:
> Hi Xihui,
>
> I was looking for this documentation also, but I
> On 16 May 2017, at 19:40, Stefano Ortolani <ostef...@gmail.com> wrote:
>
> Little update: the following query also times out, which is weird since the
> range tombstone should have been read by then...
>
> SELECT *
> FROM test_cql.test_cf
> WHERE hash
017 at 5:17 PM, Stefano Ortolani <ostef...@gmail.com>
wrote:
> Yes, that was my intention but I wanted to cross-check with the ML and the
> devs keeping an eye on it first.
>
> On Tue, May 16, 2017 at 5:10 PM, Hannu Kröger <hkro...@gmail.com> wrote:
>
>> Well,
> information and the tombstone timestamp it might be possible to skip some
> data but I’m not sure that Cassandra currently does that. Maybe it would be
> worth a JIRA ticket to see what the devs think about it, i.e. whether
> optimizing this case would make sense.
>
> Hannu
>
> On 16 May
16, 2017, at 10:03 AM, Stefano Ortolani <ostef...@gmail.com> wrote:
>
> Hi Hannu,
>
> the piece of data in question is older. In my example the tombstone is the
> newest piece of data.
> Since a range tombstone has information re the clustering key ranges, and
> the data i
> Therefore some partition-level statistics of cell ages would need to be
> kept in the column index for the skipping and that is probably not there.
>
> Hannu
>
> On 16 May 2017, at 17:33, Stefano Ortolani <ostef...@gmail.com> wrote:
>
> That is another way to see the question:
Therefore you will get an immediate answer.
> >
> > Does it make sense?
> >
> > Hannu
> >
> >> On 16 May 2017, at 16:33, Stefano Ortolani <ostef...@gmail.com> wrote:
> >>
> >> Hi all,
> >>
> >> I am seeing inconsist
Hi all,
I am seeing inconsistencies when mixing range tombstones, wide partitions,
and reverse iterators.
I still have to understand whether this behaviour is expected, hence the
message on the mailing list.
The situation is conceptually simple. I am using a table defined as follows:
CREATE
'deleted'
> data from being returned in the read. It's a bit more complicated than
> that, but that's the general idea.
>
>
> On May 12, 2017 at 6:23:01 AM, Stefano Ortolani (ostef...@gmail.com)
> wrote:
>
> Thanks a lot Blake, that definitely helps!
>
> I
e a *lot* of over streaming, so you might want to take a look at how
> much streaming your cluster is doing with full repairs, and incremental
> repairs. It might actually be more efficient to run full repairs.
>
> Hope that helps,
>
> Blake
>
> On May 11, 2017 at 7:16:26
Hi all,
I am trying to wrap my head around how C* evicts tombstones when using LCS.
Based on what I understood reading the docs, if the ratio of garbage-
collectable tombstones exceeds the "tombstone_threshold", C* should start
compacting and evicting.
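As a rough model of that trigger (simplified; Cassandra's actual estimator also accounts for gc_grace and overlapping SSTables):

```python
TOMBSTONE_THRESHOLD = 0.2  # Cassandra's default tombstone_threshold

def droppable_ratio(droppable_tombstones: int, total_cells: int) -> float:
    """Crude stand-in for the 'estimated droppable tombstone ratio':
    tombstones past gc_grace divided by all cells in the SSTable."""
    return droppable_tombstones / total_cells

def triggers_tombstone_compaction(ratio: float) -> bool:
    # LCS/STCS consider a single-SSTable tombstone compaction once the
    # estimated ratio exceeds tombstone_threshold (overlap checks omitted).
    return ratio > TOMBSTONE_THRESHOLD

ratio = droppable_ratio(300, 1000)
print(ratio, triggers_tombstone_compaction(ratio))  # 0.3 True
```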
I am quite puzzled however by what might
sequentially on each node (no overlapping, next node
waits for the previous to complete).
Regards,
Stefano Ortolani
On Mon, Oct 31, 2016 at 11:18 PM, kurt Greaves <k...@instaclustr.com> wrote:
> Blowing out to 1k SSTables seems a bit full on. What args are you passing to
> repair?
>
>
have any impact in theory.
Nodes do not seem that overloaded either, and I don't see any GC spikes
while those mutations are dropped :/
Hitting a dead end here; any idea where to look next?
Regards,
Stefano
On Wed, Aug 10, 2016 at 12:41 PM, Stefano Ortolani <ostef...@gmail.
Did you try the workaround they posted (i.e., downgrading Cython)?
Cheers,
Stefano
On Wed, Oct 26, 2016 at 10:01 AM, Zao Liu wrote:
> Same happen to my ubuntu boxes.
>
> File
>
, Sep 27, 2016 at 4:09 PM, Stefano Ortolani <ostef...@gmail.com> wrote:
> Didn't know about (2), and I actually have a time drift between the nodes.
> Thanks a lot Paulo!
>
> Regards,
> Stefano
>
> On Thu, Sep 22, 2016 at 6:36 PM, Paulo Motta <pauloricard...@gmail.com>
epair will not be marked as
> repaired, so nodes with different compaction cadences will have different
> data in their unrepaired set, what will cause mismatches in the subsequent
> incremental repairs. CASSANDRA-9143 will hopefully fix that limitation.
>
> 2016-09-22 7:10 GMT-03:
Hi,
I am seeing something weird while running repairs.
I am testing 3.0.9 so I am running the repairs manually, node after node,
on a cluster with RF=3. I am using a standard repair command (incremental,
parallel, full range), and I just noticed that the third node detected some
ranges out of
> the repairedAt field is mutated), which is leveraged by full range repair,
> which would not work in many cases for partial range repairs, yielding
> higher I/O.
>
> 2016-08-26 10:17 GMT-03:00 Stefano Ortolani <ostef...@gmail.com>:
>
>> I see. Didn't think about it that way. T
> you will not have the problem of re-doing work as in non-inc non-pr repair.
>
> 2016-08-26 7:57 GMT-03:00 Stefano Ortolani <ostef...@gmail.com>:
>
>> Hi Paulo, could you elaborate on 2?
>> I didn't know incremental repairs were not compatible with -pr
>> What is the
Hi Paulo, could you elaborate on 2?
I didn't know incremental repairs were not compatible with -pr
What is the underlying reason?
Regards,
Stefano
On Fri, Aug 26, 2016 at 1:25 AM, Paulo Motta
wrote:
> 1. Migration procedure is no longer necessary after CASSANDRA-8004,
Not really related, but note that on 12.04 I had to disable jemalloc,
otherwise nodes would randomly die at startup (
https://issues.apache.org/jira/browse/CASSANDRA-11723)
Regards,
Stefano
On Thu, Aug 11, 2016 at 10:28 AM, Riccardo Ferrari
wrote:
> Hi C* users,
>
> In
e problem might be somewhere else. Generally
> dropped mutations is a signal of cluster overload, so if there's nothing
> else wrong perhaps you need to increase your capacity. What version are you
> in?
>
> 2016-08-10 8:21 GMT-03:00 Stefano Ortolani <ostef...@gmail.com>:
>
cassandra.yaml or via nodetool
> setcompactionthroughput. Did you try lowering that and checking if that
> improves the dropped mutations?
>
> 2016-08-09 13:32 GMT-03:00 Stefano Ortolani <ostef...@gmail.com>:
>
>> Hi all,
>>
>> I am running incremental repairs on a w
Hi all,
I am running incremental repairs on a weekly basis (can't do it every day
as one single run takes 36 hours), and every time, I have at least one node
dropping mutations as part of the process (this almost always during the
anticompaction phase). Ironically this leads to a system where
FWIW, I've recently upgraded from 2.1 to 3.0 without issues of any sort,
but admittedly I haven't been using anything too fancy.
Cheers,
Stefano
On Wed, Jul 13, 2016 at 10:28 PM, Alain RODRIGUEZ
wrote:
> Hi Anuj
>
> From
>
Replaced OpsCenter with a mix of:
* metrics-graphite-3.1.0.jar installed in the same classpath of C*
* Custom script to push system metrics (cpu/mem/io)
* Grafana to create the dashboard
* Custom repairs script
Still not optimal but getting there...
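The custom-script half of that setup boils down to Graphite's plaintext protocol, which is just "path value timestamp" lines over TCP. A minimal sketch (host, port, and metric path are hypothetical):

```python
import socket
import time

def graphite_line(path: str, value: float, ts: int) -> bytes:
    """Graphite plaintext protocol: b'metric.path value timestamp\\n'."""
    return f"{path} {value} {ts}\n".encode()

def push_metric(host: str, port: int, path: str, value: float) -> None:
    # Hypothetical endpoint; metrics-graphite covers the C* metrics,
    # a script like this covers the system-level cpu/mem/io side.
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(graphite_line(path, value, int(time.time())))

print(graphite_line("servers.node1.cpu.user", 12.5, 1500000000))
```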
Stefano
On Thu, Jul 14, 2016 at 10:18 AM,
I am updating the following ticket
https://issues.apache.org/jira/browse/CASSANDRA-12100 as I discover new
bits.
Regards,
Stefano
On Tue, Jun 28, 2016 at 9:37 AM, Stefano Ortolani <ostef...@gmail.com>
wrote:
> Hi all,
>
> I've just updated to C* 3.0.7, and I am now seeing s
ago.
I am fairly confident this issue was not there in C* 3.0.5.
Any idea?
Regards,
Stefano Ortolani
Yes, because a snapshot is kept in the meantime, if I remember correctly.
Regards,
Stefano
On Thu, Jun 23, 2016 at 4:22 PM, Jean Carlo
wrote:
> Cassandra 2.1.12
>
> In the moment of a repair -pr sequential, we are experimenting an
> exponential increase of number of
Forgot to add the C* version. That would be 3.0.6.
Regards,
Stefano Ortolani
On Thu, Jun 2, 2016 at 3:55 PM, Stefano Ortolani <ostef...@gmail.com> wrote:
> Hi,
>
> While running incremental (parallel) repairs on the first partition range
> (-pr), I rarely see the progress per
%)
Nodetool does return normally and no error is found in its output or in the
cassandra logs.
Any idea why? Is this behavior expected?
Regards,
Stefano Ortolani
, Stefano Ortolani <ostef...@gmail.com> wrote:
> Hi,
>
> I am experiencing some weird behaviors after upgrading 2 nodes (out of 13)
> to C* 3.0.5 (from 2.1.11). Basically, after restarting a second time, there
> is a small chance that the node will die without outputting a
Hi,
I am experiencing some weird behaviors after upgrading 2 nodes (out of 13)
to C* 3.0.5 (from 2.1.11). Basically, after restarting a second time, there
is a small chance that the node will die without outputting anything to the
logs (not even dmesg).
This happened on both nodes I upgraded.
As far as I know, the docs are quite inconsistent on the matter.
Based on some research here and on IRC, recent versions of Cassandra do not
require anything specific when migrating to incremental repairs beyond the
-inc switch, even on LCS.
Any confirmation on the matter is more than welcome.
I think those were referring to Java7 and G1GC (early versions were buggy).
Cheers,
Stefano
On Fri, Sep 25, 2015 at 5:08 PM, Kevin Burton wrote:
> Any issues with running Cassandra 2.0.16 on Java 8? I remember there is
> long term advice on not changing the GC but not the
= 0.000508MB/s. 6 total
partitions merged to 3. Partition merge counts were {1:2, 4:1, }
hth
jason
On Tue, May 26, 2015 at 6:24 AM, Stefano Ortolani ostef...@gmail.com
wrote:
Ok, I am reading a bit more about compaction subproperties here (
http://docs.datastax.com/en/cql/3.1/cql/cql_reference
Hi Jean,
I am trying to solve a similar problem here. I would say that the only
deterministic way is to rebuild the SSTables of that column family via
nodetool scrub.
Otherwise you'd need to :
* decrease tombstone_threshold
* wait for gc_grace_time
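The gc_grace part of that wait reduces to a simple timestamp check, sketched below (a simplified model; the additional check that no older shadowed data survives in other SSTables is omitted):

```python
GC_GRACE_SECONDS = 864_000  # Cassandra's default gc_grace_seconds: 10 days

def purgeable(deletion_ts: int, now: int,
              gc_grace: int = GC_GRACE_SECONDS) -> bool:
    """A tombstone can be dropped at compaction time only once gc_grace
    has elapsed since the delete was issued."""
    return now >= deletion_ts + gc_grace

print(purgeable(0, 864_000))  # True: exactly gc_grace later
print(purgeable(0, 500_000))  # False: still inside the grace period
```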
Cheers,
Stefano
On Tue, May 26, 2015 at
for read intensive workloads.
Depending on your use case, you might be better off with the date-tiered or
size-tiered strategy.
regards
On Sun, May 24, 2015 at 10:50 AM, Stefano Ortolani ostef...@gmail.com
wrote:
Hi all,
I have a question re leveled compaction strategy that has
) is possible
without downtime, and how fast those values are picked up?
Cheers,
Stefano
On Mon, May 25, 2015 at 1:32 PM, Stefano Ortolani ostef...@gmail.com
wrote:
Hi all,
Thanks for your answers! Yes, I agree that a delete intensive workload is
not something Cassandra is designed
took place)?
Regards,
Stefano Ortolani
Definitely, I think the very same about this issue.
On Thu, Feb 12, 2015 at 7:04 AM, Eric Stevens migh...@gmail.com wrote:
I definitely find it surprising that a node which was decommissioned is
willing to rejoin a cluster. I can't think of any legitimate scenario
where you'd want that, and I'm
having a consistent view of the data.
A safer approach would be to wipe the data directory and bootstrap it as a
clean new member.
I'm curious what prompted that cycle of decommission then recommission.
On Tue, Feb 10, 2015 at 10:13 PM, Stefano Ortolani ostef...@gmail.com
wrote:
Hi,
I