Re: Update: C/C NA Call for Presentations Deadline Extended to April 15th

2024-04-06 Thread Paulo Motta
Paulo Motta wrote: > Hi, > > I wanted to update that the Call for Presentations deadline was extended > by two weeks to April 15th, 2024 for Community Over Code North America > 2024. Find more information on this blog post: > https://news.apache.org/foundation/entry/apache-s

Update: C/C NA Call for Presentations Deadline Extended to April 15th

2024-03-19 Thread Paulo Motta
Hi, I wanted to update that the Call for Presentations deadline was extended by two weeks to April 15th, 2024 for Community Over Code North America 2024. Find more information on this blog post:

Two weeks remaining to submit abstracts to Community Over Code 2024

2024-03-18 Thread Paulo Motta
Hi, I'd like to send a friendly reminder that the deadline for submissions to Community Over Code North America 2024 ends in two weeks on April 1st, 2024. This conference will be held in Denver, Colorado, October 7-10, 2024. We're looking for abstracts in the following areas: * Customizing and

Call for Presentations: Cassandra @ Community Over Code North America 2024

2024-03-11 Thread Paulo Motta
Hi, After a successful experience in ApacheCon 2022, the Cassandra track is back to Community Over Code North America 2024 to be held in Denver, Colorado, October 7-10, 2024. I will be facilitating this track and I would like to request abstract drafts in the following topics to be presented in

Call for Presentations closing soon: Community over Code EU 2024

2024-01-08 Thread Paulo Motta
I wanted to remind everyone that the call for speakers for Community Over Code EU 2024 (formerly ApacheCon EU) will be closing this Friday, 2024/01/12 23:59:59 GMT. If you reside in Europe/EMEA and have an interesting talk proposal about using, deploying or modifying Apache Cassandra, please see details

Fwd: Reminder: CFP for Community Over Code Europe is now open

2023-11-29 Thread Paulo Motta
There will be a Cassandra track at Community Over Code Europe 2024 (formerly known as ApacheCon EU), which will take place in Bratislava, Slovakia on 3 Jun 2024. If you have any talk proposals about using, deploying or modifying Apache Cassandra, please make a submission before 12 Jan 2024 to speak at

Re: Cassandra Summit: Engage those networks!

2023-11-29 Thread Paulo Motta
This Cassandra Summit is going to be epic! Looking forward to meeting the Cassandra community in two weeks! On Wed, 29 Nov 2023 at 18:26 Patrick McFadin wrote: > Hi everyone, > > We are a couple of weeks away from Cassandra Summit. People get busy and > forget to register or miss that there is

[DISCUSS] disk_access_mode setting on cassandra.yaml

2023-09-30 Thread Paulo Motta
Hi, On the dev@ mailing list I proposed updating the default value of the advanced property "disk_access_mode" to a more stable default for typical workloads. See the discussion on [1] for details. I wanted to check if anyone had experiences (good or bad) with overriding this setting in the

Invitation to take the 2022 ASF Community Survey

2022-08-25 Thread Paulo Motta
Hello everyone, The 2022 ASF Community Survey is looking to gather scientific data that allows us to understand our community better, both in its demographic composition, and also in collaboration styles and preferences. We want to find areas where we can continue to do great work, and others

Re: Topology vs RackDC

2022-06-02 Thread Paulo Motta
I think the topology file is better for static clusters, while rackdc is better for dynamic clusters where users can add/remove hosts without needing to update the topology file on all hosts. On Thu, 2 Jun 2022 at 09:13 Marc Hoppins wrote: > Hi all, > > Why is RACKDC preferred for production than TOPOLOGY?

Re: sstables changing in snapshots

2022-03-22 Thread Paulo Motta
> It was my understanding that when the nodetool snapshot process finished, the snapshot was done. This is correct. But snapshots could be partially available when using incremental_backups or snapshot_before_compaction option. If the compression/upload process starts after nodetool snapshot

Re: sstables changing in snapshots

2022-03-22 Thread Paulo Motta
How does the backup process ensure the snapshot is taken before starting to upload it? A snapshot is only safe to use after the "manifest.json" file is written. I wonder if the snapshot is being compressed while the snapshot file is still being created. On Tue, 22 Mar 2022 at 14:17,
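The manifest check mentioned in this message could be sketched roughly as follows. This is only an illustration for a hypothetical backup script: the function name and the directory argument are assumptions, and only the `manifest.json` file name comes from the message itself.

```python
from pathlib import Path


def snapshot_is_complete(snapshot_dir):
    """Return True once manifest.json exists in the given snapshot directory.

    The function name and argument are hypothetical; the only detail taken
    from the thread is that a snapshot is safe to use after manifest.json
    has been written.
    """
    return (Path(snapshot_dir) / "manifest.json").is_file()
```

A backup job could call such a check before it starts compressing or uploading a snapshot directory, and retry later if it returns False.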

Re: Default permissions for /var/lib/cassandra are world-readable

2022-03-22 Thread Paulo Motta
Hi Sebastian, I'm not aware of any reasoning behind this choice (happy to be corrected), but I think it wouldn't hurt to have better default permissions. Feel free to open a JIRA ticket to suggest this change on https://issues.apache.org/jira/projects/CASSANDRA/summary On Tue, 22 Mar

Re: High frequency garbage collection leading to High load average

2022-03-08 Thread Paulo Motta
All these symptoms indicate a potential hotspot in this replica, which can be caused by one or likely multiple "hot" partitions. Finding out which particular partition(s) is responsible for this is tricky, but good candidates are the ones mentioned in the log warning. Ideally you should fix your

Fwd: New Apache Cassandra Group on LinkedIn

2022-03-03 Thread Paulo Motta
Cross-posting announcement to user list -- Forwarded message - From: Benjamin Lerer Date: Thu, 3 Mar 2022 at 08:41 Subject: New Apache Cassandra Group on LinkedIn To: Hi everybody, We just created a new Apache Cassandra group on LinkedIn (

Re: Cassandra 4.0 randomly freezes on heavy load

2022-02-25 Thread Paulo Motta
> I can reproduce this with a huge load using dsbulk, but still can't determine the cause of the problem. Can you get a thread dump (jstack ) when the system freezes? This might be helpful to determine the cause of the freeze. Also, can you reproduce this in a simpler environment (ccm + dsbulk)?

Re: moving from 4.0-alpha4 to 4.0.1

2021-10-09 Thread Paulo Motta
Hi Attila, Minor version upgrades are generally fine to do in-place, unless otherwise specified on NEWS.txt < https://github.com/apache/cassandra/blob/cassandra-4.0.1/NEWS.txt> for the specific versions you're upgrading. Cassandra is designed with this goal in mind, and potentially disruptive

Re: Compatibility between Cassandra 3.11 and cqlsh from Cassandra 4.0 RC1

2021-05-04 Thread Paulo Motta
Hi Bowen, This seems like a bug to me; please file an issue on https://issues.apache.org/jira/projects/CASSANDRA/summary with the provided reproduction steps. Thanks, Paulo On Tue, 4 May 2021 at 18:22, Bowen Song wrote: > Hi all, > > > I was using the cqlsh from Cassandra

Re: Understanding logging in Cassandra

2021-02-17 Thread Paulo Motta
> I don't want to enable DEBUG logs as there are a lot of DiskBoundary messages, which are very high in volume. These messages shouldn't be very high volume as they only appear when there are ring updates, schema changes or node startup. If this is not the case please file a JIRA issue. On Wed,

Re: Increased read latency with Cassandra >= 3.11.7

2020-12-03 Thread Paulo Motta
. On Thu, 3 Dec 2020 at 09:33, Paulo Motta wrote: > I think this could've been caused by > https://issues.apache.org/jira/browse/CASSANDRA-15690 which was > introduced on 3.11.7 and removed an optimization that may cause a > correctness issue when there are partition deletions.

Re: Increased read latency with Cassandra >= 3.11.7

2020-12-03 Thread Paulo Motta
I think this could've been caused by https://issues.apache.org/jira/browse/CASSANDRA-15690 which was introduced on 3.11.7 and removed an optimization that may cause a correctness issue when there are partition deletions. I'd suggest you to open an issue at

Feedback on new Apache Cassandra project website

2020-10-06 Thread Paulo Motta
Hi all, I'd like to invite the user community to give feedback on the design concepts below for a new website for the Apache Cassandra project. Thanks, Paulo -- Forwarded message - From: Melissa Logan Date: Wed, 30 Sep 2020 at 16:43 Subject: Re: [DISCUSS] Updating the C*

Re: Node is UNREACHABLE after decommission

2020-09-17 Thread Paulo Motta
ioned node to the other cluster it is > giving me an error that cluster_name is not matching however cluster name > is correct as per new cluster. > So until i issue assasinate , i am not able to move forward. > > On Thu, Sep 17, 2020 at 1:13 PM Paulo Motta > wrote: > >> After de

Re: Node is UNREACHABLE after decommission

2020-09-17 Thread Paulo Motta
After decommissioning the node remains in gossip for a period of 3 days (if I recall correctly) and it will show up on describecluster during that period, so this is expected behavior. This allows other nodes that eventually were down when the node decommissioned to learn that this node left the

Re: Repeated messages about Removing tokens

2020-08-23 Thread Paulo Motta
These messages should go away when the decommission/removenode is complete. Are you seeing them repeating for the same nodes after they've left or do they eventually stop? If not this is expected behavior but perhaps a bit too verbose if the message is being printed more than once per node

Re: Can "data_file_directories" make use of multiple disks?

2018-04-09 Thread Paulo Motta
> cassandra.yaml states that "Directories where Cassandra should store data on > disk. Cassandra will spread data evenly across them, subject to the > granularity of the configured compaction strategy.". I feel it is not correct > anymore. Is it worth updating the doc? In fact this changed

Re: Repair with –pr stuck in between on Cassandra 3.11.1

2018-01-25 Thread Paulo Motta
Are you using JBOD? A thread dump (jstack ) on the affected nodes would probably help troubleshoot this. 2018-01-25 6:45 GMT-02:00 shini gupta : > Hi, > > > We have upgraded the system from Cassandra 2.1.16 to 3.11.1. After about > 335M of data loading, repair with –pr and

Re: Need help with incremental repair

2017-10-30 Thread Paulo Motta
> This is also the case for full repairs, if I'm not mistaken. Assuming I'm not > missing something here, that should mean that he shouldn't need to mark > sstables as unrepaired? That's right, but he mentioned that he is using reaper which uses subrange repair if I'm not mistaken, which

Re: Need help with incremental repair

2017-10-29 Thread Paulo Motta
> Assuming the situation is just "we accidentally ran incremental repair", you > shouldn't have to do anything. It's not going to hurt anything Once you run incremental repair, your data is permanently marked as repaired, and is no longer compacted with new non-incrementally repaired data. This

Re: Multi-node repair fails after upgrading to 3.0.14

2017-09-19 Thread Paulo Motta
In 4.0 anti-compaction is no longer run after full repairs, so we should probably backport this behavior to 3.0, given there are known limitations with incremental repair on 3.0 and non-incremental users may want to keep running full repairs without the additional cost of anti-compaction.

Re: Cassandra Startup taking very long.

2017-07-25 Thread Paulo Motta
This is addressed in 3.11.1 by CASSANDRA-13641. The workaround for now is probably to truncate system.prepared_statements before restarting the node, as is already being done. 2017-07-25 11:12 GMT-05:00 Taylor Cressy : > +1 bump. > > We are experiencing the same issue > > On Jul 25,

Re: Repairing question

2017-06-25 Thread Paulo Motta
>> as repaired since this require anti-compaction to be run. > > Not sure since what version, but in 3.10 at least (I think its since 3.x > started) full repair does do anti-compactions and marks sstables as > repaired. > > On 23 June 2017 at 06:30, Paulo Motta <pauloricard...@g

Re: Purge data from repair_history table?

2017-03-17 Thread Paulo Motta
It's safe to truncate this table since it's just used to inspect repairs for troubleshooting. You may also set a default TTL to avoid it from growing unbounded (this is going to be done by default on CASSANDRA-12701). 2017-03-17 8:36 GMT-03:00 Gábor Auth : > Hi, > > I've

Re: Incremental Repair

2017-03-13 Thread Paulo Motta
> there are some nasty edge cases when you mix incremental repair and full repair ( https://issues.apache.org/jira/browse/CASSANDRA-13153 ) mixing incremental and full repairs will just make that more likely to happen, but although unlikely it's still possible for a similar condition to happen

Re: incremental repairs with -pr flag?

2017-01-11 Thread Paulo Motta
The objective of non-incremental primary-range repair is to avoid redoing work, but with incremental repair anticompaction will segregate repaired data so no extra work is done on the next repair. You should run nodetool repair [ks] [table] in all nodes sequentially. The more often you run, the
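Running the repair on every node one after another, as suggested in this message, might be scripted along these lines. This is a minimal sketch under stated assumptions: the host names, keyspace and table are placeholders, the helper names are hypothetical, and it assumes `nodetool -h <host>` can reach each node over JMX from the machine running the script.

```python
import subprocess


def repair_commands(hosts, keyspace, table):
    """Build one 'nodetool repair' command per host (hypothetical helper)."""
    return [["nodetool", "-h", host, "repair", keyspace, table] for host in hosts]


def repair_sequentially(hosts, keyspace, table, run=subprocess.run):
    """Run the repair on each host in turn, stopping at the first failure.

    'run' is injectable so the loop can be exercised without a cluster;
    by default it is subprocess.run, which blocks until each command ends
    and raises CalledProcessError when check=True and the command fails.
    """
    for command in repair_commands(hosts, keyspace, table):
        run(command, check=True)
```

Because each call blocks until the previous repair has finished, the nodes are repaired strictly one at a time rather than in parallel.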

Re: Streaming error during repair

2017-01-05 Thread Paulo Motta
Fixed on https://issues.apache.org/jira/browse/CASSANDRA-12905 (still to be released). If you're running repair of all keyspaces or tables in a single command, run each table separately, which should improve things a bit. 2017-01-05 7:54 GMT-02:00 Robert Sicoie : > Hi guys,

Re: Time range for metrics histogram

2016-12-17 Thread Paulo Motta
See CASSANDRA-11752 for 2.2+ histogram. 2016-12-17 21:13 GMT-02:00 Aleksandr Ivanov : > Hi C* experts! > > I'm trying to understand over what time range C* latency metrics histogram > is calculated. > Several sources state that max is calculated from C* start, but on graphs > I

Re: Bootstrap fails on 3.10

2016-11-25 Thread Paulo Motta
If you have an MV table, it seems you're hitting https://issues.apache.org/jira/browse/CASSANDRA-12905. I will bump its priority to critical since it can prevent or hinder bootstrap. Did you try resuming bootstrap with "nodetool bootstrap resume" after the failure? It may eventually succeed,

Re: Extremely large ValidationExecutor.MaxPoolSize in Cassandra 2.1.13

2016-11-24 Thread Paulo Motta
This is not a problem per se, it's just the maximum number of concurrent threads allowed in the validation pool which is Integer.MAX_VALUE, which will limit the maximum number of simultaneous validations the node will handle. It may be too big, but you probably will never reach anywhere close to

Re: Is it *safe* to issue multiple replace-node at the same time?

2016-11-22 Thread Paulo Motta
It's safe, but since the replacement node will stream data from a single replica per local range, it will potentially propagate any inconsistencies from the replica it streams from, so it's recommended to run repair after a replace to reduce entropy, especially when replacing a node with the same IP

Re: Node replacement failed in 2.2

2016-11-20 Thread Paulo Motta
ens field for 2401:db00:2130:4091:face:0:13:0 shows > "TOKENS: not present", on all live nodes. It means tokens are missing, > right? What would cause this? > > Thanks. > Dikang. > > On Fri, Nov 18, 2016 at 11:15 AM, Paulo Motta <pauloricard...@gmail.com> >

Re: Node replacement failed in 2.2

2016-11-18 Thread Paulo Motta
What does nodetool gossipinfo show for endpoint /2401:db00:2130:4091:face:0:13:0 ? Does it contain the TOKENS attribute? If it's missing, is it only missing on this node or other nodes as well? 2016-11-18 17:02 GMT-02:00 Dikang Gu : > Hi, I encountered couple times that I

Re: Is it a must to run Cassandra repair in scheduled time

2016-11-18 Thread Paulo Motta
This is an informative piece on (anti-entropy) repairs: https://cassandra-zone.com/understanding-repairs/ 2016-11-18 8:12 GMT-02:00 wxn...@zjqunshuo.com : > Thanks Ben for the response. It's very helpfull and it's really what I > want. > > > *From:* Ben Dalling

Re: Storing videos in cassandra

2016-11-14 Thread Paulo Motta
For the record, there is an interesting use case of globo.com using Cassandra to store video payload and stream live video at scale (in particular, the FIFA World Cup + Olympics), but it's a pretty non-conventional/advanced use case: -

Re: Slow performance after upgrading from 2.0.9 to 2.1.11

2016-10-28 Thread Paulo Motta
Haven't seen this before, but perhaps it's related to CASSANDRA-10433? This is just a wild guess as it's in a related codepath, but maybe worth trying out the patch available to see if it helps anything... 2016-10-28 15:03 GMT-02:00 Dikang Gu : > We are seeing huge cpu

Re: dtests jolokia fails to attach

2016-10-06 Thread Paulo Motta
I had this problem before but can't remember the root cause, but I think it was related to conflicting JVMs on the same machine. Can you check if you have more than one JVM installed and try to define JAVA_HOME if it's not defined? Maybe this is related:

Re: Repairing without -pr shows unexpected out-of-sync ranges

2016-10-04 Thread Paulo Motta
t Paulo! > > > > Regards, > > Stefano > > > > On Thu, Sep 22, 2016 at 6:36 PM, Paulo Motta <pauloricard...@gmail.com> > > wrote: > >> > >> There are a couple of things that could be happening here: > >> - There will be tim

Re: Repairs at scale in Cassandra 2.1.13

2016-09-28 Thread Paulo Motta
There were a few streaming bugs fixed between 2.1.13 and 2.1.15 (see CHANGES.txt for more details), so I'd recommend you to upgrade to 2.1.15 in order to avoid having those. 2016-09-28 9:08 GMT-03:00 Alain RODRIGUEZ : > Hi Anubhav, > > >> I’m considering doing subrange

Re: New node block in autobootstrap

2016-09-27 Thread Paulo Motta
ve(StreamSession.java:524)* > *at > org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:413)* > *at > org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:245)* > *at java.lang.Thread.run(Thread.java:745)* > > O

Re: New node block in autobootstrap

2016-09-27 Thread Paulo Motta
<laxmikanth...@gmail.com>: > @Paulo Motta > > Even we are facing Streaming timeout exceptions during 'nodetool rebuild' > , I set streaming_socket_timeout_in_ms to 86400000 (24 hours) as suggested > in datastax blog - https://support.datastax.com/h > c/en-us/articles/206

Re: Repairing without -pr shows unexpected out-of-sync ranges

2016-09-22 Thread Paulo Motta
There are a couple of things that could be happening here: - There will be time differences between when nodes participating repair flush, so in write-heavy tables there will always be minor differences during validation, and those could be accentuated by low resolution merkle trees, which will

Re: https://issues.apache.org/jira/browse/CASSANDRA-10961 fix

2016-09-20 Thread Paulo Motta
Hello Zhiyan, Replying to the mailing list since this could help others. I'm not sure what that could be, it's generally related to some kind of corruption, perhaps CASSANDRA-10791. Although the message is similar to #10971, that is restricted to streaming so it's a different issue here. Was this

Re: Upgrade cassandra 2.1.14 to 3.0.7

2016-09-19 Thread Paulo Motta
cing this change and > start wondering about it. > > C*heers, > --- > Alain Rodriguez - @arodream - al...@thelastpickle.com > France > > The Last Pickle - Apache Cassandra Consulting > http://www.thelastpickle.com > > 2016-09-12 13:55 GMT

Re: How to start using incremental repairs?

2016-09-12 Thread Paulo Motta
ion by just marking SSTables as being repaired (which is fast), > but the rest of the nodes will still have to perform anticompaction as they > won't share all of its token ranges. Right ? > > Cheers, > > Alex > > On Mon, 12 Sep 2016 at 13:56, Paulo Motta <pauloricard.

Re: Multiple Network Interfaces in non-EC2

2016-09-12 Thread Paulo Motta
This seems like a bug, it seems we always bind the outgoing socket to the private/listen address. Would you mind opening a JIRA and posting the link here? Thanks 2016-09-12 3:35 GMT-03:00 Amir Dafny-Man : > Hi, > > > > I followed the docs

Re: [ANNOUNCEMENT] Website update

2016-09-12 Thread Paulo Motta
> Are there equivalent JIRAs for the TODOs somewhere? Not that I know of, but I think you can create a github pull request for punctual doc updates and AFAIK a jira ticket will be automatically created from it. Alternatively, feel free to open a JIRA meta-ticket with subtasks for doc TODOs and

Re: Incremental repairs in 3.0

2016-09-12 Thread Paulo Motta
> I truncate a table lcs, Then I inserted one line and I used nodetool flush to have all the sstables. Using a RF 3 I ran a repair -inc directly and I observed that the value of Reaired At was equal 0. Were you able to troubleshoot this? The value of repairedAt should be mutated even when there

Re: How to start using incremental repairs?

2016-09-12 Thread Paulo Motta
PM, Stefano Ortolani <ostef...@gmail.com> > wrote: > >> An extract of this conversation should definitely be posted somewhere. >> Read a lot but never learnt all these bits... >> >> On Fri, Aug 26, 2016 at 2:53 PM, Paulo Motta <pauloricard...@gmail.com> >

Re: Upgrade cassandra 2.1.14 to 3.0.7

2016-09-12 Thread Paulo Motta
Migration procedure is no longer required for incremental repair as of 2.1.4 since CASSANDRA-8004, which was the reason why the migration procedure was required for LCS before. The migration procedure is only useful now to skip validation on already repaired sstables in the first incremental

Re: nodetool repair uses option '-local' and '-pr' togather

2016-09-05 Thread Paulo Motta
0.0.5 > > > ccm node1 nodetool getendpoints replication_test sample bif > > 127.0.0.3 > 127.0.0.5 > 127.0.0.1 > > > ccm node1 nodetool getendpoints replication_test sample biz > > 127.0.0.2 > 127.0.0.3 > 127.0.0.5 > > On Fri, Sep 2, 2016 at 9:41 AM Paul

Re: CASSANDRA-12278

2016-09-02 Thread Paulo Motta
Forwarding to the user@cassandra.apache.org list as this list is specific for cassandra-development, not general cassandra questions. Can you check the repository you built the snapshot from contains the commit 01d5fa8acf05973074482eda497677c161a311ac? Is java 1.8.0_101 on your $env:PATH ? Can

Re: nodetool repair uses option '-local' and '-pr' togather

2016-09-02 Thread Paulo Motta
tions/82414/do-you-have-to-run-nodetool-repair-on-every-node. > > Thanks again. > > George > > On Thu, Sep 1, 2016 at 10:22 AM, Paulo Motta <pauloricard...@gmail.com> > wrote: > >> https://issues.apache.org/jira/browse/CASSANDRA-7450 >> >> 2016-09

Re: nodetool repair uses option '-local' and '-pr' togather

2016-09-01 Thread Paulo Motta
https://issues.apache.org/jira/browse/CASSANDRA-7450 2016-09-01 13:11 GMT-03:00 Li, Guangxing : > Hi, > > I have a cluster running 2.0.9 with 2 data centers. I noticed that > 'nodetool repair -pr keyspace cf' runs very slow (OpsCenter shows that the > node's data size

Re: How to start using incremental repairs?

2016-08-26 Thread Paulo Motta
it that way. Thanks for clarifying! > > > On Fri, Aug 26, 2016 at 2:14 PM, Paulo Motta <pauloricard...@gmail.com> > wrote: > >> > What is the underlying reason? >> >> Basically to minimize the amount of anti-compaction needed, since with >> RF=3 you'd

Re: How to start using incremental repairs?

2016-08-26 Thread Paulo Motta
? > I didn't know incremental repairs were not compatible with -pr > What is the underlying reason? > > Regards, > Stefano > > > On Fri, Aug 26, 2016 at 1:25 AM, Paulo Motta <pauloricard...@gmail.com> > wrote: > >> 1. Migration procedure is no longer nece

Re: How to start using incremental repairs?

2016-08-25 Thread Paulo Motta
1. Migration procedure is no longer necessary after CASSANDRA-8004, and since you never ran repair before this would not make any difference anyway, so just run repair and by default (CASSANDRA-7250) this will already be incremental. 2. Incremental repair is not supported with -pr, -local or

Re: Preferred IP is NULL

2016-08-22 Thread Paulo Motta
public interface is down on a node? My traffic would still fail. > > I want that at least nodes in my local DC should contact at each other on > private IP. I thought preferred IP is for that purpose so focussing on > fixing the null value of preferred IPs. > > > Than

Re: Preferred IP is NULL

2016-08-21 Thread Paulo Motta
See CASSANDRA-9748, I think it might be related. 2016-08-20 15:20 GMT-03:00 Anuj Wadehra : > Hi, > > We use multiple interfaces in multi DC setup.Broadcast address is public > IP while listen address is private IP. > > I dont understand why prefeerred IP in peers table is

Re: full and incremental repair consistency

2016-08-19 Thread Paulo Motta
full one because sstables or not flagged ? > > By the way, I suppose the repair flag don't break sstable file > immutability, so I wonder how it is stored. > > -- > Jérôme Mainaud > jer...@mainaud.com > > 2016-08-19 15:02 GMT+02:00 Paulo Motta <pauloricard...@

Re: full and incremental repair consistency

2016-08-19 Thread Paulo Motta
Running repair with -local flag does not mark sstables as repaired, since you can't guarantee data in other DCs are repaired. In order to support incremental repair, you need to run a full repair without the -local flag, and then in the next time you run repair, previously repaired sstables are

Re: New node block in autobootstrap

2016-08-15 Thread Paulo Motta
What version are you on? This seems like a typical case where there was a problem with streaming (hanging, etc.). Do you have access to the logs? Maybe look for streaming errors? Typically streaming errors are related to timeouts, so you should review your cassandra streaming_socket_timeout_in_ms

Re: nodetool repair with -pr and -dc

2016-08-11 Thread Paulo Motta
o if we want to use -pr option ( which i suppose we should to > prevent duplicate checks) in 2.0 then if we run the repair on all nodes in > a single DC then it should be sufficient and we should not need to run it > on all nodes across DC's ? > > > > On Wed, Aug 10, 2016 at

Re: migrating from 2.1.2 to 3.0.8 log errors

2016-08-10 Thread Paulo Motta
Another thing to note is that according to NEWS.txt upgrade from 2.1.x is only supported from version 2.1.9, so if this is not an effect of that I'm actually surprised upgrade from 2.1.2 worked without any issues. 2016-08-10 15:48 GMT-03:00 Tyler Hobbs : > That just means

Re: Incremental repairs leading to unrepaired data

2016-08-10 Thread Paulo Motta
wrong perhaps you need to increase your capacity. What version are you in? 2016-08-10 8:21 GMT-03:00 Stefano Ortolani <ostef...@gmail.com>: > Not yet. Right now I have it set at 16. > Would halving it more or less double the repair time? > > On Tue, Aug 9, 2016 at 7:58 PM, Paulo

Re: nodetool repair with -pr and -dc

2016-08-10 Thread Paulo Motta
On 2.0, the repair -pr option is not supported together with -local, -hosts or -dc, since it assumes you need to repair all nodes in all DCs, and it will throw an error if you try to run it with nodetool, so perhaps there's something wrong with range_repair options parsing. On 2.1 support was added to

Re: Incremental repairs leading to unrepaired data

2016-08-09 Thread Paulo Motta
Anticompaction throttling can be done by setting the usual compaction_throughput_mb_per_sec knob on cassandra.yaml or via nodetool setcompactionthroughput. Did you try lowering that and checking if that improves the dropped mutations? 2016-08-09 13:32 GMT-03:00 Stefano Ortolani

Re: Adding Materialized View triggers "Mutation Too Large" error.

2016-08-08 Thread Paulo Motta
What happens is that when trying to rebuild the MV, the rebuilder tries to create a very large batch that exceeds commitlog_segment_size_in_mb. This limitation is currently being addressed on CASSANDRA-11670. Two options I can see to workaround this for now: 1) increase

Re: Sync failed between in AntiEntropySessions - Repair

2016-08-05 Thread Paulo Motta
> Jean Carlo > > "The best way to predict the future is to invent it" Alan Kay > > On Fri, Aug 5, 2016 at 5:16 PM, Paulo Motta <pauloricard...@gmail.com> > wrote: > >> you need to check 192.168.0.36/10.234.86.36 for streaming ERRORS >> >> 2016-

Re: Sync failed between in AntiEntropySessions - Repair

2016-08-05 Thread Paulo Motta
hreadPoolExecutor.java:1142) > [na:1.8.0_60] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > [na:1.8.0_60] > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_60] > > I al looking through the changes files to see if it is

Re: Sync failed between in AntiEntropySessions - Repair

2016-08-05 Thread Paulo Motta
It seems you have a streaming error; look for an ERROR statement in the streaming classes before that, which may give you a more specific root cause. In any case, I'd suggest upgrading to 2.1.15 as there were a couple of streaming fixes in this version that might help. 2016-08-05 11:15 GMT-03:00

Re: Node after restart sees other nodes down for 10 minutes

2016-07-27 Thread Paulo Motta
ss Load Tokens Owns (effective) Host ID > Rack > UN 10.4.54.176 127.67 GB 256 47.5% > 7163bf77-2fef-4e33-81c1-0e61038dece1 1b > UN 10.4.43.65 124.19 GB 256 46.2% > 80265afb-8beb-4887-a696-fc9b75956894 1a > UN

Re: Node after restart sees other nodes down for 10 minutes

2016-07-27 Thread Paulo Motta
This looks somewhat related to CASSANDRA-9630. What is the C* version? Can you check with netstats if other nodes keep connections with the stopped node in the CLOSE_WAIT state? And also if the problem disappears if you run nodetool disablegossip before stopping the node? 2016-07-26 16:54

Re: use private ip for internode and public IP for seeds

2016-07-27 Thread Paulo Motta
Were you able to troubleshoot this yet? Private IPs for listen_address, public IP for broadcast_address, and prefer_local=true on cassandra-rackdc.properties should be sufficient to make nodes in the same DC communicate over private address, so something must be going on there. Can you check in

Re: Exception in logs using LCS .

2016-06-28 Thread Paulo Motta
1. Not necessarily data corruption, but it seems compaction is trying to write data in the wrong order most likely due to a temporary race condition/bug a la #9935, but since the compaction fails your original data is probably safe (you can try running scrub to verify/fix corruptions). 2. This is

Re: High Heap Memory usage during nodetool repair in Cassandra 3.0.3

2016-06-20 Thread Paulo Motta
You could also be hitting CASSANDRA-11739, which was fixed on 3.0.7 and could potentially cause OOMs for long-running repairs. 2016-06-20 13:26 GMT-03:00 Robert Stupp : > One possibility might be CASSANDRA-11206 (Support large partitions on the > 3.0 sstable format), which

Re: StreamCoordinator.ConnectionsPerHost set to 1

2016-06-16 Thread Paulo Motta
Increasing the number of threads alone won't help, because you need to add connectionsPerHost-awareness to StreamPlan.requestRanges (otherwise only a single connection per host is created) similar to what was done to StreamPlan.transferFiles by CASSANDRA-3668, but maybe a bit trickier. There's an

Re: Streaming from 1 node only when adding a new DC

2016-06-15 Thread Paulo Motta
For rebuild, replace and -Dcassandra.consistent.rangemovement=false in general we currently pick the closest replica (as indicated by the Snitch) which has the range, which will often map to the same node due to the dynamic snitch, especially when N=RF. This is good for picking a node in the same DC

Re: Error while rebuilding a node: Stream failed

2016-05-27 Thread Paulo Motta
I'm afraid raising streaming_socket_timeout_in_ms won't help much in this case because the incoming connection on the source node is timing out on the network layer, and streaming_socket_timeout_in_ms controls the socket timeout in the app layer and throws SocketTimeoutException (not

Re: Error while rebuilding a node: Stream failed

2016-05-26 Thread Paulo Motta
1.22.104] 2016-05-26 11:08:05,114 > StreamResultFuture.java:207 - [Stream > #74c57bc0-231a-11e6-a698-1b05ac77baf9] Stream failed > > > Streaming does not seem to be resumed again from this node. Shall I just > kill again the entire rebuild process? > > On Thu, May 26,

Re: Error while rebuilding a node: Stream failed

2016-05-25 Thread Paulo Motta
e.deserialize(StreamMessage.java:51) > ~[apache-cassandra-2.1.13.jar:2.1.13] > at > org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:250) > ~[apache-cassandra-2.1.13.jar:2.1.13] > at java.lang.Thread.run(Unknown

Re: Error while rebuilding a node: Stream failed

2016-05-25 Thread Paulo Motta
> Workaround is to set a larger streaming_socket_timeout_in_ms **on the source node**; the new default will be 86400000 ms (1 day). 2016-05-25 17:23 GMT-03:00 Paulo Motta <pauloricard...@gmail.com>: > Was there any other ERROR preceding this on this node (in particular the >
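Since streaming_socket_timeout_in_ms is expressed in milliseconds, the value for a given duration can be worked out with a small conversion. The snippet below is only an arithmetic sketch; the helper name is hypothetical and nothing in it talks to Cassandra.

```python
from datetime import timedelta


def to_millis(delta):
    """Convert a timedelta to whole milliseconds (hypothetical helper)."""
    return int(delta.total_seconds() * 1000)


# One day, the duration mentioned in the message, expressed in milliseconds.
ONE_DAY_MS = to_millis(timedelta(days=1))
print(ONE_DAY_MS)  # 86400000
```

The resulting number is what would go into cassandra.yaml for a one-day timeout.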

Re: Error while rebuilding a node: Stream failed

2016-05-25 Thread Paulo Motta
geHandler.run(ConnectionHandler.java:331) > ~[apache-cassandra-2.1.13.jar:2.1.13] > at java.lang.Thread.run(Unknown Source) [na:1.7.0_79] > > On Wed, May 25, 2016 at 8:49 PM, Paulo Motta <pauloricard...@gmail.com> > wrote: > >> This is the log of the destination/

Re: Error while rebuilding a node: Stream failed

2016-05-25 Thread Paulo Motta
> org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44) > ~[apache-cassandra-2.1.13.jar:2.1.13] > at > org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:351) > [apache-cassandra-2.1.13.jar:2.1.13] > at > org.apache.cassandra.stream

Re: Error while rebuilding a node: Stream failed

2016-05-25 Thread Paulo Motta
The stack trace from the rebuild command does not show the root cause of the rebuild stream error. Can you check the system.log for ERROR logs during streaming and paste them here?

Re: Autobootstrap in Cassandra

2016-05-23 Thread Paulo Motta
You may also check in the system.log, loaded properties are logged on node startup. 2016-05-23 19:55 GMT-03:00 Jonathan Haddad : > > find / -name 'cassandra.yaml' -exec grep -nH auto_bootstrap {} \; > > On Mon, May 23, 2016 at 3:44 PM Rajath Subramanyam

Re: sstableloader: Stream failed

2016-05-23 Thread Paulo Motta
Can you telnet 10.211.55.8 7000? This is the port used for streaming communication with the destination node. If not you should check what is the configured storage_port in the destination node and set that in the cassandra.yaml of the source node so it's picked up by sstableloader. 2016-05-23
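Where telnet is not installed, the same reachability check can be approximated with a few lines of Python. This is a sketch, not part of sstableloader: the function name and the timeout value are assumptions, while the address and port 7000 are the ones quoted in the message.

```python
import socket


def port_is_reachable(host, port=7000, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper; 7000 is the port mentioned in the thread and the
    timeout is an arbitrary choice for illustration.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable hosts and timeouts.
        return False


if __name__ == "__main__":
    # Address taken from the message above.
    print(port_is_reachable("10.211.55.8", 7000))
```

A False result only says the TCP connection could not be opened from this machine; it does not tell whether a firewall or a different configured storage_port is the cause.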

Re: Cassandra causing OOM Killer to strike on new cluster running 3.4

2016-04-20 Thread Paulo Motta
with dummy data so I will throw that jar on the >>> nodes and I'll let you know how things shake out. >>> >>> On Sun, Mar 13, 2016 at 11:02 PM, Paulo Motta <pauloricard...@gmail.com> >>> wrote: >>> >>>> You could be hitting CASSAN

Re: Upgrade cassandra from 2.1.9 to 3.x?

2016-03-31 Thread Paulo Motta
If there isn't anything on NEWS.txt forbidding it, then it *should* be possible. That is the authoritative source for upgrade information. As noted by you, the only known restriction is that you upgrade from at least 2.1.9 as noted in the NEWS.txt entry. But as always, and especially when doing

Re: auto_boorstrap when a node is down

2016-03-30 Thread Paulo Motta
When you add a node it will take over the range of an existing node, and thus it should stream data from it to maintain consistency. If the existing node is unavailable, the new node may fetch the data from a different replica, which may not have some of the data from the node which you are taking the

Re: Rack aware question.

2016-03-23 Thread Paulo Motta
> How come 127.0.0.1 is shown as an endpoint holding the ID when its token range doesn’t contain it ? Does “nodetool ring” shows all token-ranges for a node or just the primary range ? I am thinking its only primary. Can someone confirm ? The primary replica of id=1 is always 127.0.0.3. What
