Re: Data Loss irreparably so

2017-08-02 Thread Peng Xiao
Due to tombstones, we have set GC_GRACE_SECONDS to 6 hours. And for a huge table of 4 TB, repair is a hard thing for us. -- Original Message -- From: "kurt"; Sent: Thursday, 3 August 2017, 12:08 PM; To: "User"; Subject: Re: Data Loss

Re: Data Loss irreparably so

2017-08-02 Thread kurt greaves
You should run repairs every GC_GRACE_SECONDS. If a node is overloaded/goes down, you should run repairs. LOCAL_QUORUM will somewhat maintain consistency within a DC, but certainly doesn't mean you can get away without running repairs. You need to run repairs even if you are using QUORUM or ONE.
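
As a rough illustration of the cadence kurt describes (not from the thread; the keyspace name and schedule are placeholders), a per-node primary-range repair plus a cron entry might look like this, with the interval kept comfortably inside the table's gc_grace_seconds:

    # primary-range repair of one keyspace on this node
    nodetool repair -pr mykeyspace

    # example cron entry: weekly, staggered per node so repairs don't overlap
    0 2 * * 0  nodetool repair -pr mykeyspace >> /var/log/cassandra/repair.log 2>&1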

Re: Data Loss irreparably so

2017-08-02 Thread Peng Xiao
Hi, We are also experiencing the same issue. We have 3 DCs (DC1 RF=3, DC2 RF=3, DC3 RF=1). If we use local_quorum, we are not meant to lose any data, right? If we use local_one, maybe we lose data? Then we need to run repair regularly? Could anyone advise? Thanks -- Original Message
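
For reference, the replication layout described above would normally be declared roughly like this (keyspace and DC names are placeholders, run via cqlsh); with RF=1 in DC3 there is only a single copy of each row in that DC, so a local_quorum or local_one read there depends on a single node:

    # sketch only: keyspace/DC names are placeholders
    cqlsh -e "CREATE KEYSPACE mykeyspace WITH replication = {
      'class': 'NetworkTopologyStrategy', 'DC1': 3, 'DC2': 3, 'DC3': 1 };"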

Re: Cassandra Data migration from 2.2.3 to 3.7

2017-08-02 Thread Jeff Jirsa
Anytime you copy sstables around (sstableloader is excluded here), make sure you don't copy sstables from two nodes to the same directories without checking that the names don't collide, or you'll lose one of the sstables -- Jeff Jirsa > On Aug 2, 2017, at 10:38 AM, Harika Vangapelli -T
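
A hedged sketch of the kind of pre-copy check Jeff is suggesting (paths are placeholders): compare file names between the copied-in sstables and the destination directory before moving anything, and refuse to overwrite:

    # placeholders: point these at the real table directories
    SRC=/mnt/node2-backup/data/mykeyspace/mytable-1a2b3c
    DST=/var/lib/cassandra/data/mykeyspace/mytable-1a2b3c

    # list any sstable file names that exist in both places
    comm -12 <(ls "$SRC" | sort) <(ls "$DST" | sort)

    # copy only when the list above is empty; -n refuses to overwrite existing files
    cp -n "$SRC"/* "$DST"/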

Re: Cassandra isn't compacting old files

2017-08-02 Thread Sotirios Delimanolis
Turns out there are already logs for this in Tracker.java. I enabled those and clearly saw the old files are being tracked. What else can I look at for hints about whether these files are later invalidated/filtered out somehow? On Tuesday, August 1, 2017, 3:29:38 PM PDT, Sotirios Delimanolis

RE: Cassandra Data migration from 2.2.3 to 3.7

2017-08-02 Thread Harika Vangapelli -T (hvangape - AKRAYA INC at Cisco)
Jeff, what is the meaning of this line you mentioned in the email below: 'making sure you don't overwrite any sstables with the same name'? Harika Vangapelli Engineer - IT

Re: Bootstrapping a new Node with Consistency=ONE

2017-08-02 Thread Jeff Jirsa
By the time bootstrap is complete it should be as consistent as the source node - you can change start_native_transport to false to avoid serving clients directly (tcp/9042); it'll still serve reads via the storage service (tcp/7000), but the guarantee is that data should be consistent by
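
A sketch of that approach (the exact flags may vary by version, so treat this as an assumption to verify): start the node with the native transport off so no clients connect on 9042, then open it up once you're satisfied:

    # start without the CQL port; gossip and streaming on 7000 still run
    cassandra -Dcassandra.start_native_transport=false

    # later, when the node is ready to serve clients
    nodetool enablebinary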

Re: Cassandra data loss in some DC

2017-08-02 Thread Jeff Jirsa
Cassandra doesn't guarantee writes make it to all replicas unless you use a sufficiently high consistency level or run nodetool repair. What consistency level did you use for writes? Have you run repair? -- Jeff Jirsa > On Aug 2, 2017, at 6:27 AM, Peng Xiao <2535...@qq.com> wrote: > > Hi

Cassandra data loss in some DC

2017-08-02 Thread Peng Xiao
Hi there, We have a three-DC cluster (two DCs with RF=3, one remote DC with RF=1). We currently find that in DC1/DC2, select count(*) from t returns 1250, while in DC3, select count(*) from t returns 750. It looks like some data is missing in DC3 (the remote DC). There are no nodes down or anything exceptional. We only
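
One way to narrow this down (a sketch only; the keyspace name and host are placeholders) is to read the count from a coordinator in each DC at local consistency and, as Jeff suggests in his reply above, run repair so the remote DC's single replica converges:

    # read locally in the DC you are checking (host name is a placeholder)
    cqlsh dc3-node.example.com -e "CONSISTENCY LOCAL_ONE; SELECT count(*) FROM mykeyspace.t;"

    # if the counts disagree, repair the keyspace across the cluster
    nodetool repair -full mykeyspace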

Re: Bootstrapping a new Node with Consistency=ONE

2017-08-02 Thread kurt greaves
Only in this one case might that work (RF==N).

Re: Bootstrapping a new Node with Consistency=ONE

2017-08-02 Thread Oleksandr Shulgin
On Wed, Aug 2, 2017 at 10:53 AM, Daniel Hölbling-Inzko <daniel.hoelbling-in...@bitmovin.com> wrote: > > Any advice on how to avoid this in the future? Is there a way to start up > a node that does not serve client requests but does replicate data? > Would it not work if you first increase the

Re: UndeclaredThrowableException, C* 3.11

2017-08-02 Thread Micha
ok, thanks, so I'll just start it again... On 02.08.2017 11:51, kurt greaves wrote: > If the repair command failed, repair also failed. Regarding % repaired, > no it's unlikely you will see 100% repaired after a single repair. Maybe > after a few consecutive repairs with no data load you might

Re: Bootstrapping a new Node with Consistency=ONE

2017-08-02 Thread kurt greaves
Can't you just add a new DC and then tell your clients to connect to the new one (after migrating all the data to it, obviously)? If you can't achieve that, you should probably use GossipingPropertyFileSnitch. Your best plan is to have the desired RF/redundancy from the start. Changing RF in production
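
For reference, a sketch of what GossipingPropertyFileSnitch needs on each node (values are placeholders); the node's DC and rack come from cassandra-rackdc.properties and the snitch is selected in cassandra.yaml:

    # declare this node's DC/rack (placeholder values)
    cat > /etc/cassandra/cassandra-rackdc.properties <<'EOF'
    dc=DC1
    rack=rack1
    EOF

    # in cassandra.yaml, then restart the node:
    #   endpoint_snitch: GossipingPropertyFileSnitch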

Re: Bootstrapping a new Node with Consistency=ONE

2017-08-02 Thread Daniel Hölbling-Inzko
Thanks for the pointers Kurt! I did increase the RF to N so that would not have been the issue. DC migration is also a problem since I am using the Google Cloud Snitch. So I'd have to take down the whole DC and restart anew (which would mess with my clients as they only connect to their local

Re: UndeclaredThrowableException, C* 3.11

2017-08-02 Thread kurt greaves
If the repair command failed, repair also failed. Regarding % repaired: no, it's unlikely you will see 100% repaired after a single repair. Maybe after a few consecutive repairs with no data load you might get it to 100%.

Re: Bootstrapping a new Node with Consistency=ONE

2017-08-02 Thread kurt greaves
If you want to change RF on a live system, your best bet is through DC migration (add another DC with the desired # of nodes and RF), and migrate your clients to use that DC. There is a way to boot a node and not join the ring; however, I don't think it will work for new nodes (have not confirmed),
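
A rough sketch of that DC-migration route (keyspace, DC names and RF are placeholders): add the new DC to the keyspace's replication, stream the existing data into it, then point clients at the new DC:

    # include the new DC in replication (placeholders throughout)
    cqlsh -e "ALTER KEYSPACE mykeyspace WITH replication = {
      'class': 'NetworkTopologyStrategy', 'DC1': 3, 'DC2': 3 };"

    # on each node in the new DC2, stream existing data from DC1
    nodetool rebuild -- DC1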

Bootstrapping a new Node with Consistency=ONE

2017-08-02 Thread Daniel Hölbling-Inzko
Hi, It's probably a strange question but I have a heavily read-optimized payload where data integrity is not a big deal. So to keep latencies low I am reading with Consistency ONE from my Multi-DC Cluster. Now the issue I saw is that I needed to add another Cassandra node (for redundancy

UndeclaredThrowableException, C* 3.11

2017-08-02 Thread Micha
Hi, has someone experienced this? I added a fourth node to my cluster; after the bootstrap I changed RF from 2 to 3 and ran nodetool repair on the new node. A few hours later the repair command exited with the UndeclaredThrowableException and the node was down. In the logs I don't see a reason
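
For context, the change described above is usually done along these lines (keyspace and DC names are placeholders); after raising the replication factor, every node needs a full repair so the new replicas actually receive their data, which is presumably what the repair here was for:

    # raise RF (placeholder names)
    cqlsh -e "ALTER KEYSPACE mykeyspace WITH replication = {
      'class': 'NetworkTopologyStrategy', 'DC1': 3 };"

    # then, one node at a time, run a full repair
    nodetool repair -full mykeyspace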