Because of tombstones, we have set GC_GRACE_SECONDS to 6 hours. And for a huge
table of 4 TB, repair is hard for us.
-- Original Message --
From: "kurt";
Date: Thursday, August 3, 2017, 12:08 PM
To: "User";
Subject: Re: Data Loss
Hi,
We are also experiencing the same issue. We have 3 DCs (DC1 RF=3, DC2
RF=3, DC3 RF=1). If we use LOCAL_QUORUM, we are not supposed to lose any data, right?
If we use LOCAL_ONE, we may lose data and then need to run repair regularly?
Could anyone advise?
Thanks
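Not from the thread, but a quick way to sanity-check the overlap guarantee behind the question above (my own sketch; DC names and RFs are the ones quoted in the mail):

```python
def quorum(rf):
    # a quorum is a strict majority of the replicas
    return rf // 2 + 1

# With RF=3 in the local DC, a LOCAL_QUORUM write touches 2 replicas and a
# LOCAL_QUORUM read consults 2 replicas, so every read set overlaps every
# acknowledged write set: quorum(3) + quorum(3) > 3.
assert quorum(3) + quorum(3) > 3

# With LOCAL_ONE (write set 1, read set 1) the sets need not overlap on
# RF=3, so stale reads are possible until repair or read repair runs.
assert 1 + 1 <= 3
```

This is why LOCAL_QUORUM on both sides does not lose acknowledged data within a DC, while LOCAL_ONE needs regular repair.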
-- Original Message --
Hi there,
We have a three-DC cluster (two DCs with RF=3, one remote DC with RF=1). We
currently find that in DC1/DC2 select count(*) from t returns 1250, while in DC3
it returns 750.
It looks like some data is missing in DC3 (the remote DC). There are no nodes down
or anything exceptional.
we only
"anujw_2...@yahoo.co.in"<anujw_2...@yahoo.co.in>;
: "Peng Xiao"<2535...@qq.com>;
: Re: ?? tolerate how many nodes down in the cluster
I've never really understood why Datastax recommends against racks. In those
docs they make it out to be much m
Thanks for the reminder, we will set up a new DC as suggested.
-- Original Message --
From: "kurt greaves";
Date: Wednesday, July 26, 2017, 10:30 AM
To: "User";
Cc: "anujw_2...@yahoo.co.in";
Subject: Re: Re: tolerate
t idea would be to plan failover modes
appropriately and letting cassandra know of the same.
Regards,
Bhuvan
On Mon, Jul 24, 2017 at 3:28 PM, Peng Xiao <2535...@qq.com> wrote:
Hi,
Suppose we have a 30-node cluster in one DC with RF=3,
how many nodes can be down? Can we tolerate 10 nod
Thanks all for your thorough explanation.
-- Original Message --
From: "Anuj Wadehra";<anujw_2...@yahoo.co.in.INVALID>;
Date: July 28, 2017, 0:49
To: "User cassandra.apache.org"<user@cassandra.apache.org>; "Pe
As Brooke suggests, the number of racks should be a multiple of RF.
https://www.youtube.com/watch?v=QrP7G1eeQTI
If we have 6 machines with RF=3, we can set up 6 racks or 3 racks; which
will be better?
Could you please further advise?
Many thanks
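A toy model of rack-aware placement may help with this question (my own sketch, not from the thread): NetworkTopologyStrategy walks the ring and prefers nodes on racks it has not used yet, so when the rack count equals RF, every range gets exactly one replica per rack and load stays balanced.

```python
def rack_aware_replicas(ring, rack_of, start, rf):
    """Pick rf replicas walking clockwise, preferring unseen racks
    (a simplification of NetworkTopologyStrategy placement)."""
    chosen, seen = [], set()
    for k in range(len(ring)):
        node = ring[(start + k) % len(ring)]
        if rack_of[node] not in seen:
            chosen.append(node)
            seen.add(rack_of[node])
        if len(chosen) == rf:
            break
    return chosen

# 6 nodes in 3 racks (node n sits in rack n % 3), RF=3:
ring = list(range(6))
rack_of = {n: n % 3 for n in ring}
for start in range(6):
    picked = rack_aware_replicas(ring, rack_of, start, 3)
    # every range ends up with exactly one replica in each rack
    assert {rack_of[n] for n in picked} == {0, 1, 2}
```

With racks equal to RF, each rack holds one full copy of the data, which is the balance argument behind the "racks a multiple of RF" advice.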
-- Original Message --
From:
One more question: why should the number of racks be equal to RF?
For example, we have 4 machines, each virtualized into 8 VMs. Can we set up 4 racks
with RF=3? I mean one machine per rack.
Thanks
-- Original Message --
From: "我自己的邮箱"<2535...@qq.com>;
Date: Wednesday, July 26, 2017, 10:32 AM
To:
https://datastax-oss.atlassian.net/browse/JAVA-1002
This one says it's a driver issue; we will give it a try.
-- Original --
From: "";<2535...@qq.com>;
Date: Wed, Jul 26, 2017 04:12 PM
To: "user";
Subject: Timeout
Dear All,
We are experiencing a strange issue. Currently we have a cluster with Cassandra
2.1.13.
When the applications start, they print the following warnings, and it takes a
long time for the applications to start.
Could you please advise?
2017-07-26 15:49:20.676 WARN 11706 --- [-]
-- Original Message --
From: "Anuj Wadehra";<anujw_2...@yahoo.co.in.INVALID>;
Date: July 27, 2017, 1:41
To: "Brooke Thorley"<bro...@instaclustr.com>;
"user@cassandra.apache.org"<user@cassandra.apache.org>
Hi,
Suppose we have a 30-node cluster in one DC with RF=3:
how many nodes can be down? Can we tolerate 10 nodes down?
It seems we cannot avoid having all 3 replicas of some data within those 10
nodes,
so can we only tolerate 1 node down even though we have 30 nodes?
Could anyone please
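The intuition here can be checked with a toy ring model (my sketch, assuming SimpleStrategy-style placement and no racks): any single node down never breaks quorum, but two nodes that are replica-adjacent on the ring share ranges, so only some pairs are survivable.

```python
def replicas(ring, i, rf=3):
    # SimpleStrategy-style placement: the rf nodes clockwise from position i
    return {ring[(i + k) % len(ring)] for k in range(rf)}

def quorum_available(ring, down, rf=3):
    # quorum needs a strict majority of replicas alive for every range
    need = rf // 2 + 1
    return all(len(replicas(ring, i, rf) - down) >= need
               for i in range(len(ring)))

ring = list(range(30))
assert quorum_available(ring, {0})         # any single node down is fine
assert quorum_available(ring, {0, 15})     # far-apart pairs are fine too
assert not quorum_available(ring, {0, 1})  # adjacent nodes share ranges
```

So the honest answer is "guaranteed 1, often more, depending on which nodes"; rack-aware placement is what turns that into a stronger guarantee.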
Dear All,
We are currently using Cassandra 2.1.13, and it has grown to 5 TB with 32
nodes in one DC.
For monitoring, OpsCenter does not send alarms and is not free in higher versions,
so we have to use a simple JMX+Zabbix template. And we plan to use
Jolokia+JMX2Graphite to draw the metrics charts.
Hi,
We are experiencing the following issue: the response time sometimes spikes to
15 s. After adjusting the batch size,
it looks better, but we still see the following issue. Could anyone advise?
INFO [GossipTasks:1] 2017-07-07 08:56:33,410 Gossiper.java:1009 - InetAddress
/172.16.xx.39 is now DOWN
on
Hi,
Does message drop mean data loss?
Thanks
-- Original --
From: Akhil Mehra
Date: August 4, 2017, 16:00
To: user
Subject: Re: MUTATION messages were dropped in last 5000 ms for cross
node timeout
Glad I
Dear All,
Any suggestion for an optimal value of native_transport_max_threads?
As per
https://issues.apache.org/jira/browse/CASSANDRA-11363, max_queued_native_transport_requests=4096; how
about native_transport_max_threads?
Thanks,
Peng Xiao
ve_period=0, it looks like the row cache does not work in this
situation?
But we can still see row cache hits.
Row Cache : entries 202787, size 100 MB, capacity 100 MB, 3095293
hits, 6796801 requests, 0.455 recent hit rate, 0 save period in seconds
Could anyone please explain this?
Thanks,
Peng Xiao
not save the cache to disk. But the cache
is still working if it's enabled in the table schema; the cache will just be empty
after restart.
--Dikang.
On Tue, Sep 19, 2017 at 8:27 PM, Peng Xiao <2535...@qq.com> wrote:
And we are using C* 2.1.18.
on this?
Thanks,
Peng Xiao
Hi,
As DataStax suggests, we should bootstrap only one new node at a time,
but can we add new nodes in two DCs at the same time?
Thanks,
Peng Xiao
Dear All,
When we bootstrap a new node, we experience high CPU load, and this
affects the response time. We noticed that most of the cost is in the
PendingRangeCalculator; this did not happen before.
We are using C* 2.1.13.
Could anyone please advise on this?
Thanks,
Peng Xiao
Hi there,
We have two DCs in a Cassandra cluster. If the network is down for less than 3
hours (the default hint window), my understanding is that it will recover
automatically, right? Do we need to run repair manually?
Thanks,
Peng Xiao
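For what it's worth, the arithmetic behind the answer below is simple (my own sketch; 10800000 ms is the stock max_hint_window_in_ms default, i.e. 3 hours):

```python
MAX_HINT_WINDOW_MS = 10_800_000  # cassandra.yaml default: 3 hours

def repair_needed(outage_seconds):
    # hints are only stored while the peer has been down for less than the
    # window; beyond that, writes go unrecorded until a repair runs
    return outage_seconds * 1000 > MAX_HINT_WINDOW_MS

assert not repair_needed(2 * 3600)  # 2h outage: hints replay automatically
assert repair_needed(4 * 3600)      # 4h outage: run a manual repair
```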
Thursday, September 21, 2017 10:32
To: Peng Xiao <2535...@qq.com>; user@cassandra.apache.org
Subject: Re: network down between DCs
Hi,
That’s correct.
You need to run repairs only after a node/DC/connection is down for more than
max_hint_window.
Dear All,
We'd like to migrate one keyspace from one cluster to another; the keyspace is
about 100 GB.
If we use sstableloader, we have to stop the application during the
migration. Any good ideas?
Thanks,
Peng Xiao
Hi there,
We are struggling with hardware selection. We all know that SSDs are good, and
DataStax suggests we use SSDs, but as Cassandra is a CPU-bound database, we are
considering SATA disks. We noticed that the normal I/O throughput is 7 MB/s.
Could anyone give some advice?
Thanks,
Peng Xiao
Dear All,
As for STCS, DataStax suggests keeping half of the disk space free for
compaction. This is not strict; could anyone advise how much space we should
leave free on one node?
Thanks,
Peng Xiao
Dear All,
Can we limit the SSTable file size? As we have a huge cluster, the SSTable files
are too large for ETL to extract. Could you please advise?
Thanks,
Peng Xiao
o way to limit file size in STCS. If you use LCS, it will default to
160MB (except in cases where you have a very large partition - in those cases,
the sstable will scale with your partition size, but you really shouldn't have
partitions larger than 160MB)
On Fri, Sep 29, 2017 at 8:41 PM,
same DC only
--
Jeff Jirsa
> On Sep 28, 2017, at 2:41 AM, Peng Xiao <2535...@qq.com> wrote:
>
> Dear All,
>
> We have a cluster with one DC1: RF=3 and another DC, DC2: RF=1, only for ETL, but we
> found that sometimes we can query records in DC1 while not being able to find th
repair to fix it.
Thanks,
Peng Xiao
???: "user"<user@cassandra.apache.org>;
: Re: data loss in different DC
If you're writing into DC1 with CL = LOCAL_xxx, there is no guarantee that you
can read the same data in DC2. Only repair will help you.
On Thu, Sep 28, 2017 at 11:41 AM, Peng Xiao <2535...@qq.com> wrote:
Dear
,
Peng Xiao
ter with one DC1: RF=3 and another DC, DC2: RF=1, DC2 only for ETL, but we
found that sometimes we can query records in DC1 while not being able to find the
same record in DC2 with LOCAL_QUORUM. How does this happen? It looks like data
loss in DC2. Could anyone please advise?
It looks like we can only run repair to fix it.
Thanks,
Peng Xiao
hi,
nodetool cleanup only removes keys that no longer belong to those
nodes, so theoretically we can run nodetool cleanup in parallel, right? The
documentation suggests running it one node at a time, but that's too slow.
Thanks,
Peng Xiao
increase
compaction throughput to speed the process up.
On 27 Sep. 2017 13:20, "Peng Xiao" <2535...@qq.com> wrote:
hi,
nodetool cleanup will only remove those keys which no longer belong to those
nodes,than theoretically we can run nodetool cleanup in parallel,right?the
docu
Hi,
We want to split one DC out of a cluster and make that DC a new cluster (rename
the DC to a new cluster name).
Could you please advise?
Thanks,
Peng Xiao
Hi there,
Can we add a new node (bootstrap) and run repair on another DC in the cluster
or even run repair in the same DC?
Thanks,
Peng Xiao
Hi there,
We need to repair a huge CF, just want to clarify:
1. nodetool repair -pr keyspace cf
2. nodetool repair -st -et -dc
Which will be better? Or any other advice?
Thanks,
Peng Xiao
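For option 2, the -st/-et token bounds have to be computed by hand; a minimal sketch of slicing the full Murmur3 token range into repairable pieces (my own helper, not a Cassandra tool):

```python
MIN_TOKEN, MAX_TOKEN = -2**63, 2**63 - 1  # Murmur3Partitioner token range

def subranges(n):
    """Split the full token range into n contiguous slices, suitable for
    driving `nodetool repair -st <start> -et <end>` one slice at a time."""
    step = (MAX_TOKEN - MIN_TOKEN) // n
    edges = [MIN_TOKEN + i * step for i in range(n)] + [MAX_TOKEN]
    return list(zip(edges[:-1], edges[1:]))

parts = subranges(4)
assert len(parts) == 4
assert parts[0][0] == MIN_TOKEN and parts[-1][1] == MAX_TOKEN
```

Subrange repair like this keeps each repair session small, which is the usual way to make repair of a huge CF tractable.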
Mon, Nov 13, 2017 06:51 PM
To: "user"<user@cassandra.apache.org>;
Subject: best practice for repair
Hi there,
We need to repair a huge CF, just want to clarify:
1. nodetool repair -pr keyspace cf
2. nodetool repair -st -et -dc
Which will be better? Or any other advice?
Thanks,
Peng Xiao
Hi there,
We know that we need to run repair regularly to keep data consistent. Suppose
we have DC1 & DC2:
if we add a new DC3 and rebuild from DC1, can we assume DC3 is consistent
with DC1, at least at the time the DC3 rebuild completes successfully?
Thanks,
Peng Xiao
Hi there,
We need to rebuild a new DC, but the stream always fails with the following
errors.
We are using C* 2.1.18. Could anyone please advise?
error: null
-- StackTrace --
java.io.EOFException
at java.io.DataInputStream.readByte(DataInputStream.java:267)
at
out. Do you see logs on the servers
indicating that it was ever invoked? Did it calculate a streaming plan? Did it
start sending files?
-- Jeff Jirsa
On Dec 16, 2017, at 4:56 PM, Peng Xiao <2535...@qq.com> wrote:
Hi Jeff,
This is the only information we found from
Hi there,
If we have a Cassandra DC1 with a data size of 60 TB and RF=3, and we rebuild a
new DC2 (RF=3), how much data will stream to DC2? 20 TB or 60 TB?
Thanks,
Peng Xiao
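Back-of-envelope arithmetic for this question (my own sketch): rebuild streams one copy of each range per target replica, so what matters is the raw dataset size times the target DC's RF.

```python
def rebuild_stream_tb(source_on_disk_tb, source_rf, target_rf):
    # strip the source replication to get the raw dataset, then multiply
    # by the target RF, since each target replica receives its own copy
    raw = source_on_disk_tb / source_rf
    return raw * target_rf

# 60 TB on disk in DC1 at RF=3 is ~20 TB of raw data;
# rebuilding DC2 at RF=3 streams ~60 TB, at RF=1 only ~20 TB
assert rebuild_stream_tb(60, 3, 3) == 60.0
assert rebuild_stream_tb(60, 3, 1) == 20.0
```

This ignores compression-ratio differences and compaction overhead, so treat it as an estimate only.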
uild in the new DC always failed
What's the rest of the stack beneath the null pointer exception?
--
Jeff Jirsa
> On Dec 16, 2017, at 4:11 PM, Peng Xiao <2535...@qq.com> wrote:
>
> Hi there,
>
> We need to rebuild a new DC,but the stream is always failed with the
>
Dear All,
We have decommissioned a DC, but from system.log it's still gossiping:
INFO [GossipStage:1] 2017-11-01 17:21:36,310 Gossiper.java:1008 - InetAddress
/x.x.x.x is now DOWN
Could you please advise?
Thanks,
Peng Xiao
g around in gossip for 3-15 days but then should disappear.
As long as it's not showing up in the cluster it should be OK.
On 1 Nov. 2017 20:25, "Peng Xiao" <2535...@qq.com> wrote:
Dear All,
We have decommissioned a DC, but from system.log it's still gossiping:
INFO [GossipStage:1] 2017-11-
1048576K
Could anyone please advise?
Thanks,
Peng Xiao
n.
Best, Oliver
On Thu, Dec 7, 2017 at 7:12 AM, Peng Xiao <2535...@qq.com> wrote:
Dear All,
Can we run Cassandra on physical machines directly?
We all know that VMs can reduce performance. For instance, we have a machine
with 56 cores and 8 SSD disks.
Can we run 8 Cassandra instances on the
Dear All, if we update a record which does not actually exist in Cassandra,
will it generate a new record or not?
UPDATE columnfamily SET data = 'test data' WHERE key = 'row1';
As in CQL, UPDATE and INSERT are semantically the same. Could anyone please
advise?
Thanks,
Peng Xiao
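A toy model of the upsert semantics being asked about (my sketch, not real Cassandra code): in CQL both UPDATE and INSERT write cells keyed by primary key, so an UPDATE against a missing key simply creates the row instead of failing.

```python
# toy table: primary key -> column map, mimicking CQL upsert behaviour
table = {}

def cql_update(key, **cols):
    # UPDATE ... SET col = val WHERE key = ... writes cells unconditionally,
    # creating the row if it does not exist (same as INSERT)
    table.setdefault(key, {}).update(cols)

cql_update('row1', data='test data')  # row1 did not exist: it is created
assert table['row1'] == {'data': 'test data'}

cql_update('row1', data='new data')   # a later UPDATE overwrites the cell
assert table['row1'] == {'data': 'new data'}
```

The exception in real CQL is UPDATE ... IF EXISTS, which uses a lightweight transaction and does nothing when the row is absent.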
,
Peng Xiao
07
??: "user"<user@cassandra.apache.org>;
: Re: rebuild stream issue
The streams fail, the rebuild times out if you??ve set a timeout. Or you??ll
need to restart the nodes if you didn??t set a streaming timeout.
--
Jeff Jirsa
> On Dec 10, 2017, at 9:05 P
Dear All,
We are rebuilding a new DC; if one of the source nodes is restarted, what will
happen to the rebuild?
Thanks,
Peng Xiao
the rebuild - it'll stream some duplicate data but it'll compact
away when it's done.
You can also use subrange repair instead of rebuild if you're short on disk
space.
-- Jeff Jirsa
On Dec 10, 2017, at 9:14 PM, Peng Xiao <2535...@qq.com> wrote:
Then, how can we restore the rebui
Dear All,
We are using C* 2.1.18. When we bootstrap a new node, the response time jumps
when the new node starts up, then returns to normal. Could anyone please advise?
Thanks,
Peng Xiao
bootstrap a new node.
Could you please advise how to tune this?
Many Thanks,
Peng Xiao
Subject: EXT: Re: Tuning bootstrap new node
Do not stop compaction, you will end up with thousands of sstables.
You can increase stream throughput from the default 200 to a higher value if your
network can handle it.
Sent from my iPhone
On Oct 31, 2017, at 6:35 AM, Peng Xiao <2535...@qq.com> wrot
Thanks Kurt, we may still use snapshot and sstableloader to split this
schema out to another cluster.
-- Original Message --
From: "kurt";
Date: Thursday, October 19, 2017, 6:11 PM
To: "User";
Subject: Re: split one DC from a cluster
Hi,
We have a cluster with 48 nodes configured with racks; sometimes it hangs for
up to 2 minutes, and the response time jumps from 300 ms to 15 s.
Could anyone please advise how to identify the root cause?
The following is from the system log
INFO [Service Thread] 2017-10-26 21:45:46,796
for a while - a
reference to the cleaned-up sstables will be held by the rebuild streams,
causing disk usage to increase temporarily until the rebuild finishes streaming.
-- Jeff Jirsa
On Dec 22, 2017, at 4:30 PM, Peng Xiao <2535...@qq.com> wrote:
Hi there,Can we run nodetool cleanup in D
?
Thanks,
Peng Xiao
Hi guys,
Could anyone please help with this simple question?
How do we check C* partition size and related information?
It looks like nodetool ring only shows the token distribution.
Thanks
Thanks Kurt.
-- Original Message --
From: "kurt";
Date: January 11, 2018, 11:46
To: "User";
Subject: Re: secondary index creation causes C* oom
1.not sure if secondary index creation is the same as
Hi there,
We plan to put keyspace1 in DC1 and DC2, and keyspace2 in DC3 and DC4, all still
in the same cluster, to avoid interruptions. Is there any potential risk in this
architecture?
Thanks,
Peng Xiao
Dear All,
We hit OOM on some C* nodes during secondary index creation with C* 2.1.18.
As per https://issues.apache.org/jira/browse/CASSANDRA-12796, the flush writer
will be blocked by index rebuild, but we still have some confusion:
1. not sure if secondary index creation is the same as index
Dear All,
I'm trying to import a CSV file into a table with the COPY command. The question
is: will the COPY command clear all the old data in this table? We only want to
append the CSV file to this table.
Thanks
Check your primary key before playing with the COPY
command.
Thanks
On Tue, Feb 13, 2018 at 12:49 PM, Peng Xiao <2535...@qq.com> wrote:
Dear All,
I'm trying to import a CSV file into a table with the COPY command. The question
is: will the COPY command clear all the old data in this table? We only want
Hi there, can we run nodetool cleanup in DC1 and run rebuild in DC2 against DC1
simultaneously,
in C* 2.1.18?
Thanks,
Peng Xiao
We followed
https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_add_dc_to_cluster_t.html,
but it does not mention changing auto_bootstrap for seed nodes after the rebuild.
Thanks,
Peng Xiao
-- Original Message --
From: "Ali Hubail"
Dear All,
For adding a new DC, we need to set auto_bootstrap: false and then run the
rebuild; finally we need to change auto_bootstrap back to true. But for seed
nodes, it seems we still need to keep auto_bootstrap false?
Could anyone please confirm?
Thanks,
Peng Xiao
-03-12/replace-a-dead-node-in-cassandra.html, we
can replace this dead node. Is it the same as bootstrapping a new node? That
means we don't need to remove the node and rejoin?
Could anyone please advise?
Thanks,
Peng Xiao
Dear All,
We noticed that when bootstrapping a new node, the source nodes are also quite
busy doing compactions, which impacts the response time severely. Is it
reasonable to disable compaction on all the source nodes?
Thanks,
Peng Xiao
is a painful process.
Thanks,
Peng Xiao
-- Original Message --
From: "Anthony Grasso"<anthony.gra...@gmail.com>;
Date: March 22, 2018, 7:13
To: "user"<user@cassandra.apache.org>;
Subject: Re: replace dead node vs re
???: "user"<user@cassandra.apache.org>;
: ?? disable compaction in bootstrap process
Thanks Alain. We are using C* 2.1.18 with 7 cores/30 GB RAM/1.5 TB SSD per node;
as the cluster is growing too fast, bootstrap/rebuild/remove node is painful for us.
Thanks,
Peng Xiao
-- Original Message --
From: "Alain RODRIGUEZ"<arodr...@gmail.com>;
Date: March 22, 2018
Many thanks Alain for the thorough explanation; we will not disable compaction
for now.
Thanks,
Peng Xiao
-- Original Message --
From: "arodrime"<arodr...@gmail.com>;
Date: March 23, 2018, 8:57
To: "Peng Xiao"<2535.