RE: SSTables are not getting removed

2015-11-02 Thread Walsh, Stephen
Thanks to both Nate and Jeff for highlighting the bug and the configuration 
issues.

We've upgraded to 2.1.11
Lowered our memtable_cleanup_threshold to 0.11
Lowered our thrift_framed_transport_size_in_mb to 15

We kicked off another run.

The result was that Cassandra failed after 1 hour.
SSTables grew to about 8 before we lost the JMX connection
(so that's about 32,000 SSTables in total across all nodes).
Major GCs happened every 3-5 minutes.

We then reset for a direct comparison between 2.1.6 & 2.1.11.

There was no difference in the results between 2.1.6 and 2.1.11.




From: Nate McCall [mailto:n...@thelastpickle.com]
Sent: 30 October 2015 22:06
To: Cassandra Users 
Subject: Re: SSTables are not getting removed


memtable_offheap_space_in_mb: 4096
memtable_cleanup_threshold: 0.99

^ What led to this setting? You are basically telling Cassandra not to flush 
the highest-traffic memtable until the memtable space is 99% full. With that 
many tables and keyspaces, you are locking up everything on the flush queue and 
causing substantial back pressure. If you run 'nodetool tpstats' you will 
probably see a massive number of 'All Time Blocked' for FlushWriter and 
'Dropped' for Mutations.

Actually, this is probably why you are seeing a lot of small tables: commit log 
segments are being filled and blocked from flushing due to the above, so they 
have to attempt to flush repeatedly with whatever is there whenever they get 
the chance.
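
(For reference, a quick way to check for the blocked flushes and dropped 
mutations mentioned above; a sketch, and the exact pool names vary slightly 
between versions:)

nodetool tpstats | grep -E 'Pool|Flush|Mutation'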

thrift_framed_transport_size_in_mb: 150

^ This is also a super bad idea. Thrift buffers grow as needed to accommodate 
larger results, but they don't ever shrink. This will lead to a bunch of open 
connections holding onto large, empty byte arrays. This will show up 
immediately in a heap dump inspection.
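
(For comparison, the stock cassandra.yaml ships with a much smaller value; a 
sketch of the default line:)

thrift_framed_transport_size_in_mb: 15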

concurrent_compactors: 4
compaction_throughput_mb_per_sec: 0
endpoint_snitch: GossipingPropertyFileSnitch

This grinds our system to a halt and causes a major GC nearly every second.

So far the only way to get around this is to run a cron job every hour that 
does a "nodetool compact".

What's the output of 'nodetool compactionstats'? CASSANDRA-9882 and 
CASSANDRA-9592 could be to blame (both fixed in recent versions) or this could 
just be a side effect of the memory pressure from the above settings.
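
(For reference, a sketch of the check and of restoring the throttling default 
referred to above; 16 MB/s is the stock compaction_throughput_mb_per_sec, and 
0 disables throttling entirely:)

nodetool compactionstats
nodetool setcompactionthroughput 16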

Start back at the default settings (except snitch - GPFS is always a good place 
to start) and change settings serially and in small increments based on 
feedback gleaned from monitoring runtimes.


--
-
Nate McCall
Austin, TX
@zznate

Co-Founder & Sr. Technical Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com
This email (including any attachments) is proprietary to Aspect Software, Inc. 
and may contain information that is confidential. If you have received this 
message in error, please do not read, copy or forward this message. Please 
notify the sender immediately, delete it from your system and destroy any 
copies. You may not further disclose or distribute this email or its 
attachments.


Work-around needed for issue seen as mentioned in CASSANDRA-8072

2015-11-02 Thread K F
Hi folks,
We are hitting a similar issue to the one described in 
https://issues.apache.org/jira/browse/CASSANDRA-8072. When we try to bootstrap 
a node, it doesn't bootstrap due to the issues described in the JIRA ticket 
above.
We are using Cassandra version 2.0.14.
Is there a work-around for this situation? Please note we have tried a couple 
of things:
1. Increased the ring_delay timeout.
2. Tried to bootstrap this node without the replace_address option, since we 
were replacing a node that had a hardware failure.
So, we would appreciate it if someone who has encountered this situation and 
got around it could share the steps they took.
Thanks.
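
(For reference, the replace flow that CASSANDRA-8072 tends to interfere with 
looks roughly like this; a sketch, where the IP and the delay value are 
placeholders and the flags are set on the replacement node before its first 
start:)

# in cassandra-env.sh on the replacement node
JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address=10.0.0.1"
# optionally give gossip more time to settle (milliseconds)
JVM_OPTS="$JVM_OPTS -Dcassandra.ring_delay_ms=120000"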

Re: Cassandra stalls and dropped messages not due to GC

2015-11-02 Thread Jeff Ferland
Having caught a node in an undesirable state, many of my threads are reading 
like this:
"SharedPool-Worker-5" #875 daemon prio=5 os_prio=0 tid=0x7f3e14196800 
nid=0x96ce waiting on condition [0x7f3ddb835000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
at 
org.apache.cassandra.utils.concurrent.WaitQueue$AbstractSignal.awaitUninterruptibly(WaitQueue.java:283)
at 
org.apache.cassandra.db.commitlog.PeriodicCommitLogService.maybeWaitForSync(PeriodicCommitLogService.java:44)
at 
org.apache.cassandra.db.commitlog.AbstractCommitLogService.finishWriteFor(AbstractCommitLogService.java:152)
at org.apache.cassandra.db.commitlog.CommitLog.add(CommitLog.java:252)
at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:379)
at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:359)
at org.apache.cassandra.db.Mutation.apply(Mutation.java:214)
at 
org.apache.cassandra.db.MutationVerbHandler.doVerb(MutationVerbHandler.java:54)
at 
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at 
org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164)
at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105)
at java.lang.Thread.run(Thread.java:745)

But commit log loading seems evenly spaced and low enough in volume:
/mnt/cassandra/commitlog$ ls -lht | head
total 7.2G
-rw-r--r-- 1 cassandra cassandra 32M Nov  2 18:50 CommitLog-4-1446162051324.log
-rw-r--r-- 1 cassandra cassandra 32M Nov  2 18:50 CommitLog-4-1446162051323.log
-rw-r--r-- 1 cassandra cassandra 32M Nov  2 18:50 CommitLog-4-1446162051322.log
-rw-r--r-- 1 cassandra cassandra 32M Nov  2 18:49 CommitLog-4-1446162051321.log
-rw-r--r-- 1 cassandra cassandra 32M Nov  2 18:49 CommitLog-4-1446162051320.log
-rw-r--r-- 1 cassandra cassandra 32M Nov  2 18:48 CommitLog-4-1446162051319.log
-rw-r--r-- 1 cassandra cassandra 32M Nov  2 18:48 CommitLog-4-1446162051318.log
-rw-r--r-- 1 cassandra cassandra 32M Nov  2 18:47 CommitLog-4-1446162051317.log
-rw-r--r-- 1 cassandra cassandra 32M Nov  2 18:46 CommitLog-4-1446162051316.log

Commit logs are on a 10 second periodic sync setting:
commitlog_sync: periodic
commitlog_sync_period_in_ms: 10000

SSDs are fully trimmed out and mounted with discard since it snuck into my head 
that could be an issue. Still stuck diagnosing this.

> On Oct 30, 2015, at 3:37 PM, Nate McCall  wrote:
> 
> Does tpstats show unusually high counts for blocked flush writers? 

The “All Time Blocked” metric is 0 across my entire cluster.

> As Sebastian suggests, running ttop will paint a clearer picture about what 
> is happening within C*. I would however recommend going back to CMS in this 
> case as that is the devil we all know and more folks will be able to offer 
> advice on seeing its output (and it removes a delta). 

Forgive me, but what is CMS?

> 
> It’s starting to look to me like it’s possibly related to brief IO spikes 
> that are smaller than my usual graphing granularity. It feels surprising to 
> me that these would affect the Gossip threads, but it’s the best current lead 
> I have with my debugging right now. More to come when I learn it.
> 
> Probably not the case since this was a result of an upgrade, but I've seen 
> similar behavior on systems where some kernels had issues with irqbalance 
> doing the right thing and would end up parking most interrupts on CPU0 (like 
> say for the disk and ethernet modules) regardless of the number of cores. 
> Check out proc via 'cat /proc/interrupts' and make sure the interrupts are 
> spread out over CPU cores. You can steer them off manually at runtime if they 
> are not spread out. 

Interrupt loading is even.

> Also, did you upgrade anything besides Cassandra?

No. I’ve tried some mitigations since tuning thread pool sizes and GC, but the 
problem begins with only an upgrade of Cassandra. No other system packages, 
kernels, etc.

-Jeff




FW: Two node cassandra cluster doubts

2015-11-02 Thread Luis Miguel
Hello!
I have set a cassandra cluster with two nodes, Node A  and Node B --> RF=2, 
Read CL=1 and Write CL = 1;
Node A is seed...

At first everything works well: when I add/delete/update entries on Node A, 
everything is replicated on Node B and vice versa, and even if I shut down Node 
A and make new insertions on Node B in the meantime, after I start Node A up 
again Cassandra recovers OK. BUT there is ONE case when this situation fails. 
I am going to describe the process:
Node A and Node B are sync.
Select Count (*) From MYTABLE;---> 10 rows
Shut down Node A.
Made some inserts on Node B.
Select Count (*) From MYTABLE;---> 15 rows
Shut down Node B.
Start Up Node B.
Select Count (*) From MYTABLE;---> 15 rows
(Everything Ok, yet).
Start Up Node A.
Select Count (*) From MYTABLE;---> 10 rows (uhmmm...this is weird...check it again)
Select Count (*) From MYTABLE;---> 15 rows (wow!..this is correct, let's try again)
Select Count (*) From MYTABLE;---> 10 rows (Ok...values are dancing)
If I make the same queries on Node B it behaves the same way, and it is only 
solved with a nodetool repair...but I would prefer an automatic fail-over...
Is there any way to avoid this, or is a nodetool repair execution mandatory?
Thanks in advance!!!
  

Re: Cassandra stalls and dropped messages not due to GC

2015-11-02 Thread Nate McCall
>
>
> Forgive me, but what is CMS?
>

Sorry - ConcurrentMarkSweep garbage collector.


>
> No. I’ve tried some mitigations since tuning thread pool sizes and GC, but
> the problem begins with only an upgrade of Cassandra. No other system
> packages, kernels, etc.
>
>
>
From what 2.0 version did you upgrade? If it was < 2.0.7, you would need to
run 'nodetool upgradesstables'  but I'm not sure the issue would manifest
that way. Otherwise, double check the DSE release notes and upgrade guide.
I've not had any issues like this going from 2.0.x to 2.1.x on vanilla C*.



-- 
-
Nate McCall
Austin, TX
@zznate

Co-Founder & Sr. Technical Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com


Re: memtable flush size with LCS

2015-11-02 Thread Dan Kinder
@Jeff Jirsa thanks, the memtable_* keys were the actual determining factor
for my memtable flushes; they are what I needed to play with.
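
(For reference, a sketch of the cassandra.yaml knobs in question with roughly 
2.1-era defaults; the values shown are illustrative, not recommendations:)

memtable_allocation_type: heap_buffers
memtable_heap_space_in_mb: 2048        # defaults to 1/4 of the heap if unset
memtable_offheap_space_in_mb: 2048     # defaults to 1/4 of the heap if unset
memtable_cleanup_threshold: 0.11       # defaults to 1 / (memtable_flush_writers + 1)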

On Thu, Oct 29, 2015 at 8:23 AM, Ken Hancock 
wrote:

> Or if you're doing a high volume of writes, then your flushed file size
> may be completely determined by other CFs that have consumed the commitlog
> space, forcing any memtables whose commitlog segment is being deleted to be
> flushed to disk.
>
>
> On Wed, Oct 28, 2015 at 2:51 PM, Jeff Jirsa 
> wrote:
>
>> It’s worth mentioning that initial flushed file size is typically
>> determined by memtable_cleanup_threshold and the memtable space options
>> (memtable_heap_space_in_mb, memtable_offheap_space_in_mb, depending on
>> memtable_allocation_type)
>>
>>
>>
>> From: Nate McCall
>> Reply-To: "user@cassandra.apache.org"
>> Date: Wednesday, October 28, 2015 at 11:45 AM
>> To: Cassandra Users
>> Subject: Re: memtable flush size with LCS
>>
>>
>>  do you mean that this property is ignored at memtable flush time, and so
>>> memtables are already allowed to be much larger than sstable_size_in_mb?
>>>
>>
>> Yes, 'sstable_size_in_mb' plays no part in the flush process. Flushing
>> is based solely on runtime activity, and the file size is determined by
>> whatever was in the memtable at that time.
>>
>>
>>
>> --
>> -
>> Nate McCall
>> Austin, TX
>> @zznate
>>
>> Co-Founder & Sr. Technical Consultant
>> Apache Cassandra Consulting
>> http://www.thelastpickle.com
>>
>
>
>
>
>


-- 
Dan Kinder
Senior Software Engineer
Turnitin – www.turnitin.com
dkin...@turnitin.com


Re: FW: Two node cassandra cluster doubts

2015-11-02 Thread ICHIBA Sara
I think that this is normal behaviour, as you shut down your seed and then
rebooted it. You should know that when you start a seed node it doesn't do
the bootstrapping thing, which means it doesn't check whether there are changes
in the contents of the tables. Here, in your tests, you shut down node A
before doing the inserts and started it after, so your node A doesn't have
the new rows you inserted. And yes, it is normal to get different values
for your query each time, because the coordinator node changes and therefore
the query is executed each time on a different node (when node B answers
you've got 15 rows and when node A does you have 10 rows).
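
(For reference, one way to get stable counts with RF=2 is to read and write at 
QUORUM instead of ONE; a minimal sketch in cqlsh, noting that with only two 
nodes QUORUM means both replicas must be up:)

cqlsh> CONSISTENCY QUORUM;
cqlsh> SELECT COUNT(*) FROM MYTABLE;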
On 2 Nov 2015 19:22, "Luis Miguel"  wrote:

> Hello!
>
> I have set a cassandra cluster with two nodes, Node A  and Node B --> RF=2,
> Read CL=1 and Write CL = 1;
>
> Node A is seed...
>
>
> At first everything is working well, when I add/delete/update entries on
> Node A, everything is replicated on Node B and vice-versa, even if I shut
> down node A, and I made new insertions on Node B meanwhile, and After that
> I start up node A again Cassandra recovers OK. BUT there is ONE case when
> this situation fails. I am going to describe the process:
>
> Node A and Node B are sync.
>
> Select Count (*) From MYTABLE;---> 10 rows
>
> Shut down Node A.
>
> Made some inserts on Node B.
>
> Select Count (*) From MYTABLE;---> 15 rows
>
> Shut down Node B.
>
> Start Up Node B.
>
> Select Count (*) From MYTABLE;---> 15 rows
>
> (Everything Ok, yet).
>
> Start Up Node A.
>
> Select Count (*) From MYTABLE;---> 10 rows (uhmmm...this is weird...check
> it again)
> Select Count (*) From MYTABLE;---> 15 rows  (wow!..this is correct, lets
> try again)
> Select Count (*) From MYTABLE;---> 10 rows (Ok...values are dancing)
>
> If I made the same queries on NODE B it Behaves the same way and it
> only is solved with a nodetool repair...but I would prefer an automatic
> fail-over...
>
> is there any way to avoid this??? or a nodetool repair execution is
> mandatory???
>
> Thanks in advance!!!
>


RE: Two node cassandra cluster doubts

2015-11-02 Thread Luis Miguel
Thanks for your answer! 
I thought that bootstrapping is executed only when you add a node to the 
cluster for the first time; after that, I thought that gossip is the method 
used to discover the cluster members again... In my case I thought that it was 
more about a read-repair issue... am I wrong? 

Date: Mon, 2 Nov 2015 21:12:20 +0100
Subject: Re: FW: Two node cassandra cluster doubts
From: ichi.s...@gmail.com
To: user@cassandra.apache.org

I think that this is normal behaviour, as you shut down your seed and then 
rebooted it. You should know that when you start a seed node it doesn't do the 
bootstrapping thing, which means it doesn't check whether there are changes in 
the contents of the tables. Here, in your tests, you shut down node A before 
doing the inserts and started it after, so your node A doesn't have the new 
rows you inserted. And yes, it is normal to get different values for your query 
each time, because the coordinator node changes and therefore the query is 
executed each time on a different node (when node B answers you've got 15 rows 
and when node A does you have 10 rows).
On 2 Nov 2015 19:22, "Luis Miguel"  wrote:



Hello!
I have set a cassandra cluster with two nodes, Node A  and Node B --> RF=2, 
Read CL=1 and Write CL = 1;
Node A is seed...

At first everything is working well, when I add/delete/update entries on Node 
A, everything is replicated on Node B and vice-versa, even if I shut down node 
A, and I made new insertions on Node B meanwhile, and After that I start up 
node A again Cassandra recovers OK. BUT there is ONE case when this situation 
fails. I am going to describe the process:
Node A and Node B are sync.
Select Count (*) From MYTABLE;---> 10 rows
Shut down Node A.
Made some inserts on Node B.
Select Count (*) From MYTABLE;---> 15 rows
Shut down Node B.
Start Up Node B.
Select Count (*) From MYTABLE;---> 15 rows
(Everything Ok, yet).
Start Up Node A.
Select Count (*) From MYTABLE;---> 10 rows (uhmmm...this is weird...check it again)
Select Count (*) From MYTABLE;---> 15 rows (wow!..this is correct, let's try again)
Select Count (*) From MYTABLE;---> 10 rows (Ok...values are dancing)
If I made the same queries on NODE B it Behaves the same way and it only is 
solved with a nodetool repair...but I would prefer an automatic fail-over...
is there any way to avoid this??? or a nodetool repair execution is mandatory???
Thanks in advance!!!
  

  

Doubt regarding consistency-level in Cassandra-2.1.10

2015-11-02 Thread Ajay Garg
Hi All.

I have a 2*2 Network-Topology Replication setup, and I run my application
via DataStax-driver.

I frequently get errors of the type:
*Cassandra timeout during write query at consistency SERIAL (3 replica were
required but only 0 acknowledged the write)*

I have already tried passing a "write-options with LOCAL_QUORUM
consistency-level" in all create/save statements, but I still get this
error.

Does something else need to be changed in /etc/cassandra/cassandra.yaml too?
Or maybe somewhere else?

-- 
Regards,
Ajay


Re: Doubt regarding consistency-level in Cassandra-2.1.10

2015-11-02 Thread Eric Stevens
Serial consistency gets invoked at the protocol level when doing
lightweight transactions such as CAS operations.  If you're expecting that
your topology is RF=2, N=2, it seems like some keyspace has RF=3, and so
there aren't enough nodes available to satisfy serial consistency.

See
http://docs.datastax.com/en/cassandra/2.0/cassandra/dml/dml_ltwt_transaction_c.html
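
(For reference, the kind of statement that invokes SERIAL consistency is a 
lightweight transaction; a sketch with made-up table and column names:)

INSERT INTO users (id, email) VALUES (1, 'a@example.com') IF NOT EXISTS;
UPDATE users SET email = 'b@example.com' WHERE id = 1 IF email = 'a@example.com';

If the LWT is intentional, the DataStax Java driver also lets you set the 
serial consistency per statement, e.g. 
statement.setSerialConsistencyLevel(ConsistencyLevel.LOCAL_SERIAL).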

On Mon, Nov 2, 2015 at 1:29 AM Ajay Garg  wrote:

> Hi All.
>
> I have a 2*2 Network-Topology Replication setup, and I run my application
> via DataStax-driver.
>
> I frequently get the errors of type ::
> *Cassandra timeout during write query at consistency SERIAL (3 replica
> were required but only 0 acknowledged the write)*
>
> I have already tried passing a "write-options with LOCAL_QUORUM
> consistency-level" in all create/save statements, but I still get this
> error.
>
> Does something else need to be changed in /etc/cassandra/cassandra.yaml
> too?
> Or may be some another place?
>
>
> --
> Regards,
> Ajay
>


Keyspaces missing after restarting cassandra service

2015-11-02 Thread Arun Sandu
Hello,

After restarting Cassandra, all of my keyspaces went missing. I can only
see system_traces, system, and dse_system. I didn't make any changes to
cassandra.yaml.

But I can see all the keyspaces' data in my */data/* directory. Is there
any way to access those lost keyspaces through cqlsh?

Can someone help me with a solution?


-- 
Thanks
Arun


Two DC of same cluster migrate to Two different Cluster

2015-11-02 Thread qihuang.zheng
A few weeks ago, we moved a keyspace to another DC, but within the same cluster.
Original: cluster_1: DC1, ks1+ks2
After: cluster_1: DC1, ks1; DC2, ks2


Following http://www.planetcassandra.org/blog/cassandra-migration-to-ec2, our 
steps were:


1. On all new nodes (DC2): 
$ vi /usr/install/cassandra/conf/cassandra.yaml:
 cluster_name: cluster_1
 endpoint_snitch: GossipingPropertyFileSnitch
 - seeds: "DC1 Nodes,DC2 Node"
 auto_bootstrap: false
$ sudo -u admin vi /usr/install/cassandra/conf/cassandra-rackdc.properties
 dc=DC2
 rack=RAC1


2. Add DC2 to ks2:
ALTER KEYSPACE ks2 WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 
'DC2' : 3, 'DC1' : 3};


3. Migrate data from the original DC: 
/usr/install/cassandra/bin/nodetool rebuild DC1


4. Remove DC1 from ks2:
ALTER KEYSPACE ks2 WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 
'DC2' : 3};


After that, I thought it was OK, because ks1 and ks2 now had different DCs to 
keep their data separate. 


But recently we had an issue: nodes in DC1 still talk to DC2.
Using the iftop command, I can see traffic flowing from DC1 nodes to DC2. 
I think the problem is that the seeds on the DC2 nodes still include DC1 nodes 
from step 1. 


Our leader wants to totally separate ks1 and ks2, so my plan is to change the 
DC2 nodes to use another cluster_name.
But after just rebooting, the DC2 nodes fail to start:
org.apache.cassandra.exceptions.ConfigurationException: Saved cluster name 
cluster_1 != configured name cluster_2
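
(For reference, the usual way around that exception, essentially what the 
Stack Overflow link below describes, is to rewrite the saved name before 
changing the yaml; a sketch, run on each DC2 node:)

cqlsh> UPDATE system.local SET cluster_name = 'cluster_2' WHERE key = 'local';
$ nodetool flush system
# then set cluster_name: cluster_2 in cassandra.yaml and restart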


And, following 
http://stackoverflow.com/questions/22006887/cassandra-saved-cluster-name-test-cluster-configured-name,
the node can start, but with a warning: 
WARN 16:41:35,824 ClusterName mismatch from /192.168.47.216 cluster_1!=cluster2


Going this way also causes some problems with nodetool status.
nodetool status on DC1 nodes: 
[qihuang.zheng@spark047219 ~]$ /usr/install/cassandra/bin/nodetool status
Datacenter: DC1
===
--  Address         Load      Tokens  Owns  Host ID                               Rack
UN  192.168.47.219  62.61 GB  256     7.2%  953dda8c-3908-401f-9adb-aa59c4cb92d1  RAC1
Datacenter: DC2
===
--  Address         Load      Tokens  Owns  Host ID                               Rack
DN  192.168.47.223  49.49 GB  256     7.5%  a4b91faf-3e1f-46df-a1cc-39bb267bc683  RAC1


nodetool status on DC2 nodes after changing cluster_name:
[qihuang.zheng@spark047223 ~]$ /usr/install/cassandra/bin/nodetool status
Datacenter: DC1
===
--  Address         Load      Tokens  Owns  Host ID                               Rack
DN  192.168.47.219  ?         256     7.2%  953dda8c-3908-401f-9adb-aa59c4cb92d1  r1
DN  192.168.47.218  ?         256     7.7%  42b8bfae-a6ee-439b-b101-4a0963c9aaa0  r1
...
Datacenter: DC2
===
--  Address         Load      Tokens  Owns  Host ID                               Rack
UN  192.168.47.223  49.49 GB  256     7.5%  a4b91faf-3e1f-46df-a1cc-39bb267bc683  RAC1


As you can see, DC1 thinks DC2 is down, and DC2 thinks DC1 is down (DN) with 
load '?', but actually all nodes are UP.

So, here is the problem:
1. If I delete the DC1 nodes from DC2's seeds but keep the same cluster_name, 
will iftop still show DC1 traffic flowing to DC2?
2. If I want to change the DC2 nodes to a new cluster_name, what should I do next?
I think the first way is easier, but it may not fit our leader's opinion.


TKS,qihuang.zheng

Re: Incremental repair from the get go

2015-11-02 Thread Robert Coli
On Mon, Nov 2, 2015 at 3:02 PM, Maciek Sakrejda  wrote:

> Following up on this older question: as per the docs, one *should* still
> do full repair periodically (the docs say weekly), right? And run
> incremental more often to fill in?
>

Something that amounts to full repair once every gc_grace_seconds, unless
you never do anything that results in a tombstone. In that (very rare)
case, one should probably still occasionally (2x a year?) run repair to
cover bitrot and similar (very rare) cases.

"Something that amounts to full repair" is either a full repair or an
incremental repair that covers 100% of the new data since gc_grace_seconds.

=Rob


Re: Cassandra stalls and dropped messages not due to GC

2015-11-02 Thread Jeff Ferland

> On Nov 2, 2015, at 11:35 AM, Nate McCall  wrote:
> Forgive me, but what is CMS?
> 
> Sorry - ConcurrentMarkSweep garbage collector. 

Ah, my brain was trying to think in terms of something Cassandra specific. I 
have full GC logging on and since moving to G1, I haven’t had any >500ms GC 
cycles and the >200ms logger triggers about once every 2 minutes. I don’t 
intend to roll off that given positive confirmation that the cycles seem to be 
working well and GC logs don’t line up with outages. Also, the issue proved to 
be the same on CMS as on G1.

> 
> No. I’ve tried some mitigations since tuning thread pool sizes and GC, but 
> the problem begins with only an upgrade of Cassandra. No other system 
> packages, kernels, etc.
> 
> 
> 
> From what 2.0 version did you upgrade? If it was < 2.0.7, you would need to 
> run 'nodetool upgradesstables'  but I'm not sure the issue would manifest 
> that way. Otherwise, double check the DSE release notes and upgrade guide. 
> I've not had any issues like this going from 2.0.x to 2.1.x on vanilla C*. 

2.0.14 or higher. I don’t recall what version of DSE I standardized on last, 
but probably 2.0.16. In any case, the format moved from jb to ka.

I checked the related source code and from there grepped my logs, where I'm 
seeing messages like this (most extreme example):

WARN  [PERIODIC-COMMIT-LOG-SYNCER] 2015-11-02 23:10:16,478  
AbstractCommitLogService.java:105 - Out of 38 commit log syncs over the past 
307s with average duration of 2087.32ms, 3 have exceeded the configured commit 
interval by an average of 15515.67ms

I seem to have 3-4 32MB commit logs created per minute. In a quick experiment, 
I’ve run nodetool flush just now and reduced a 5.7G directory to 58M. I’m going 
to flush all the nodes and see if that’s somehow related where it’s just 
holding commit logs too long. (Did I miss the configuration for maximum 
memtable age?)
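
(For what it's worth, the closest thing to a maximum memtable age I'm aware of 
is the per-table memtable_flush_period_in_ms option; a sketch with a made-up 
table name, where 0, the default, means never flush on a timer:)

ALTER TABLE my_ks.my_table WITH memtable_flush_period_in_ms = 3600000;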

-Jeff

Re: Keyspaces missing after restarting cassandra service

2015-11-02 Thread Robert Coli
On Mon, Nov 2, 2015 at 1:37 PM, Arun Sandu  wrote:

> After restarting the cassandra, all of my keyspaces got missing. I can
> only see system_traces, system, dse_system. I didn't make any changes to
> cassandra.yaml.
>
> But, I can see all the keyspaces data in my */data/ *directory. Is there
> anyway to access those lost keyspaces through cqlsh?
>
> Can someone help me with a solution?
>

This sort of problem is best debugged interactively, for example in
#cassandra on freenode IRC.

But... start by :

1) checking permissions for relevant directories
2) looking in system.log to see if Cassandra tried and failed to find your
CFs

=Rob


Re: Incremental repair from the get go

2015-11-02 Thread Maciek Sakrejda
Following up on this older question: as per the docs, one *should* still do
full repair periodically (the docs say weekly), right? And run incremental
more often to fill in?


Re: terrible read/write latency fluctuation

2015-11-02 Thread 曹志富
Thanks all of u.

--
Ranger Tsao

2015-10-30 18:25 GMT+08:00 Anishek Agarwal :

> If it's some sort of time series, DTCS might turn out to be better for
> compaction. Also, some disk monitoring might help to understand if disk is
> the bottleneck.
>
> On Sun, Oct 25, 2015 at 3:47 PM, 曹志富  wrote:
>
>> I will try to trace a read that takes > 20 msec.
>>
>> Just HDD. No deletes, just a 60-day TTL. Value size is small, max length is 140.
>>
>> My data is like a time series: 90% of reads are for data with timestamp < 7 days.
>> Data is almost insert-only, with a little updating.
>>
>
>
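
(For reference, a couple of ways to trace a slow read; a sketch, where the 
query and the probability value are made up:)

cqlsh> TRACING ON;
cqlsh> SELECT * FROM my_ks.my_table WHERE id = 1;  -- the slow query; the trace prints after the rows
$ nodetool settraceprobability 0.001   # or sample a fraction of all requests server-side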


Re: Doubt regarding consistency-level in Cassandra-2.1.10

2015-11-02 Thread Ajay Garg
Hi Eric,

I am sorry, but I don't understand.

If there had been some issue in the configuration, then the consistency issue
would be seen every time (I guess).
As of now, the error is seen only sometimes (probably 30% of the time).

On Mon, Nov 2, 2015 at 10:24 PM, Eric Stevens  wrote:

> Serial consistency gets invoked at the protocol level when doing
> lightweight transactions such as CAS operations.  If you're expecting that
> your topology is RF=2, N=2, it seems like some keyspace has RF=3, and so
> there aren't enough nodes available to satisfy serial consistency.
>
> See
> http://docs.datastax.com/en/cassandra/2.0/cassandra/dml/dml_ltwt_transaction_c.html
>
> On Mon, Nov 2, 2015 at 1:29 AM Ajay Garg  wrote:
>
>> Hi All.
>>
>> I have a 2*2 Network-Topology Replication setup, and I run my application
>> via DataStax-driver.
>>
>> I frequently get the errors of type ::
>> *Cassandra timeout during write query at consistency SERIAL (3 replica
>> were required but only 0 acknowledged the write)*
>>
>> I have already tried passing a "write-options with LOCAL_QUORUM
>> consistency-level" in all create/save statements, but I still get this
>> error.
>>
>> Does something else need to be changed in /etc/cassandra/cassandra.yaml
>> too?
>> Or may be some another place?
>>
>>
>> --
>> Regards,
>> Ajay
>>
>


-- 
Regards,
Ajay