[jira] [Commented] (CASSANDRA-8245) Cassandra nodes periodically die in 2-DC configuration

2014-11-17 Thread Oleg Poleshuk (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14214638#comment-14214638
 ] 

Oleg Poleshuk commented on CASSANDRA-8245:
--

There are 6 nodes total: 3 per DC. 
The time difference between the 2 DCs is 2 seconds; within one DC it is 1 second.
Anyway, I upgraded to 2.1 and the error is gone.

I would recommend adding more debug info to FailureDetector: the hostname 
would be much more useful than just "Ignoring interval time".

 Cassandra nodes periodically die in 2-DC configuration
 --

 Key: CASSANDRA-8245
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8245
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Scientific Linux release 6.5
 java version 1.7.0_51
 Cassandra 2.0.9
Reporter: Oleg Poleshuk
Assignee: Brandon Williams
Priority: Minor
 Attachments: stack1.txt, stack2.txt, stack3.txt, stack4.txt, 
 stack5.txt


 We have 2 DCs with 3 nodes in each.
 Second DC periodically has 1-2 nodes down.
 It looks like it loses connectivity with the other nodes, and then the 
 Gossiper starts to accumulate tasks until Cassandra dies with an OOM.
 WARN [MemoryMeter:1] 2014-08-12 14:34:59,803 Memtable.java (line 470) setting live ratio to maximum of 64.0 instead of Infinity
 WARN [GossipTasks:1] 2014-08-12 14:44:34,866 Gossiper.java (line 637) Gossip stage has 1 pending tasks; skipping status check (no nodes will be marked down)
 WARN [GossipTasks:1] 2014-08-12 14:44:35,968 Gossiper.java (line 637) Gossip stage has 4 pending tasks; skipping status check (no nodes will be marked down)
 WARN [GossipTasks:1] 2014-08-12 14:44:37,070 Gossiper.java (line 637) Gossip stage has 8 pending tasks; skipping status check (no nodes will be marked down)
 WARN [GossipTasks:1] 2014-08-12 14:44:38,171 Gossiper.java (line 637) Gossip stage has 11 pending tasks; skipping status check (no nodes will be marked down)
 ...
 WARN [GossipTasks:1] 2014-10-06 21:42:51,575 Gossiper.java (line 637) Gossip stage has 1014764 pending tasks; skipping status check (no nodes will be marked down)
 WARN [New I/O worker #13] 2014-10-06 21:54:27,010 Slf4JLogger.java (line 76) Unexpected exception in the selector loop.
 java.lang.OutOfMemoryError: Java heap space
 There are also these lines, but I am not sure they are relevant:
 DEBUG [GossipStage:1] 2014-08-12 11:33:18,801 FailureDetector.java (line 338) Ignoring interval time of 2085963047
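
To see how fast the Gossip backlog grows, the pending-task counts can be pulled out of system.log. The following is only a minimal sketch in Python, not a Cassandra tool: the function names are hypothetical, and the pattern assumes the 2.0.x log layout shown in the excerpt above.

```python
import re
from datetime import datetime

# Pattern for the Gossiper warning quoted above. \s+ tolerates the hard line
# wraps that a mail archive may introduce inside a single log entry.
PENDING = re.compile(
    r"WARN\s+\[GossipTasks:\d+\]\s+"
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3})\s+"
    r"Gossiper\.java\s+\(line\s+\d+\)\s+"
    r"Gossip\s+stage\s+has\s+(\d+)\s+pending\s+tasks"
)


def pending_task_samples(log_text):
    """Return (timestamp, pending_count) pairs in the order they appear."""
    samples = []
    for stamp, millis, count in PENDING.findall(log_text):
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S").replace(
            microsecond=int(millis) * 1000
        )
        samples.append((when, int(count)))
    return samples


def growth_per_second(samples):
    """Average backlog growth between the first and the last sample.

    Needs at least two samples with different timestamps.
    """
    (t0, n0), (t1, n1) = samples[0], samples[-1]
    return (n1 - n0) / (t1 - t0).total_seconds()


if __name__ == "__main__":
    excerpt = """
 WARN [GossipTasks:1] 2014-08-12 14:44:34,866 Gossiper.java (line 637) Gossip
 stage has 1 pending tasks; skipping status check (no nodes will be marked down)
 WARN [GossipTasks:1] 2014-08-12 14:44:38,171 Gossiper.java (line 637) Gossip
 stage has 11 pending tasks; skipping status check (no nodes will be marked down)
"""
    samples = pending_task_samples(excerpt)
    print("last count:", samples[-1][1])
    print("tasks/second:", round(growth_per_second(samples), 2))
```

A count that only ever rises between samples would be consistent with the stage not draining at all, as opposed to a burst that later recovers.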



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8256) nodetool cleanup: java.lang.ClassCastException

2014-11-10 Thread Oleg Poleshuk (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14204767#comment-14204767
 ] 

Oleg Poleshuk commented on CASSANDRA-8256:
--

Unfortunately, I can't reproduce it anymore, as I recovered by deleting all 
data and restarting Cassandra.

 nodetool cleanup: java.lang.ClassCastException
 --

 Key: CASSANDRA-8256
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8256
 Project: Cassandra
  Issue Type: Bug
 Environment: Scientific Linux release 6.5
 java version 1.7.0_51
Reporter: Oleg Poleshuk

 $ nodetool cleanup
 error: null
 -- StackTrace --
 java.lang.ClassCastException
 system.log:
 ERROR [CompactionExecutor:2] 2014-11-05 14:10:35,416 CassandraDaemon.java:153 - Exception in thread Thread[CompactionExecutor:2,1,main]
 java.lang.ClassCastException: null





[jira] [Created] (CASSANDRA-8256) nodetool cleanup: java.lang.ClassCastException

2014-11-05 Thread Oleg Poleshuk (JIRA)
Oleg Poleshuk created CASSANDRA-8256:


 Summary: nodetool cleanup: java.lang.ClassCastException
 Key: CASSANDRA-8256
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8256
 Project: Cassandra
  Issue Type: Bug
 Environment: Scientific Linux release 6.5
java version 1.7.0_51

Reporter: Oleg Poleshuk


system.log:

ERROR [CompactionExecutor:2] 2014-11-05 14:10:35,416 CassandraDaemon.java:153 - Exception in thread Thread[CompactionExecutor:2,1,main]
java.lang.ClassCastException: null






[jira] [Updated] (CASSANDRA-8256) nodetool cleanup: java.lang.ClassCastException

2014-11-05 Thread Oleg Poleshuk (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleg Poleshuk updated CASSANDRA-8256:
-
Description: 
$ nodetool cleanup
error: null
-- StackTrace --
java.lang.ClassCastException

system.log:

ERROR [CompactionExecutor:2] 2014-11-05 14:10:35,416 CassandraDaemon.java:153 - Exception in thread Thread[CompactionExecutor:2,1,main]
java.lang.ClassCastException: null


  was:
system.log:

ERROR [CompactionExecutor:2] 2014-11-05 14:10:35,416 CassandraDaemon.java:153 - Exception in thread Thread[CompactionExecutor:2,1,main]
java.lang.ClassCastException: null





[jira] [Updated] (CASSANDRA-8245) Cassandra nodes periodically die in 2-DC configuration

2014-11-04 Thread Oleg Poleshuk (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleg Poleshuk updated CASSANDRA-8245:
-
Attachment: stack3.txt
stack2.txt
stack5.txt
stack4.txt
stack1.txt

So it happened again (not an OOM this time, but Gossip pending tasks).
I ran jstack several times. I am not sure whether the output contains the stack traces the devs need.
If that is not enough, just let me know how I can make a proper dump.
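
For anyone who wants to repeat this, taking several dumps a few seconds apart can be scripted. This is only a rough sketch in Python under stated assumptions: the helper names, the PID, the dump count, and the interval are all hypothetical, and it assumes the JDK's `jstack` is on the PATH and run as the same user as the Cassandra process. The `-l` option asks jstack for the long listing with lock information.

```python
import subprocess
import time
from pathlib import Path


def jstack_command(pid):
    """Build the jstack invocation; -l adds lock information to the dump."""
    return ["jstack", "-l", str(pid)]


def collect_thread_dumps(pid, count=5, interval_s=10.0, out_dir=".",
                         run=subprocess.run, sleep=time.sleep):
    """Write `count` dumps named stack1.txt, stack2.txt, ... into out_dir.

    `run` and `sleep` are injectable so the loop can be exercised without a
    live JVM; by default they are subprocess.run and time.sleep.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for i in range(1, count + 1):
        result = run(jstack_command(pid), capture_output=True, text=True,
                     check=True)
        path = out / f"stack{i}.txt"
        path.write_text(result.stdout)
        written.append(path)
        if i < count:
            sleep(interval_s)  # space the dumps out so stuck threads stand out
    return written


if __name__ == "__main__":
    # 12345 is a placeholder PID, not a value from this report.
    print(jstack_command(12345))
```

Threads that show the same stack in every dump are the usual candidates for whatever is blocking the stage.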



[jira] [Comment Edited] (CASSANDRA-8245) Cassandra nodes periodically die in 2-DC configuration

2014-11-04 Thread Oleg Poleshuk (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14196143#comment-14196143
 ] 

Oleg Poleshuk edited comment on CASSANDRA-8245 at 11/4/14 2:26 PM:
---

So it happened again (not an OOM this time, but Gossip pending tasks).
I ran jstack several times. I am not sure whether the output contains the stack traces the devs need.
If that is not enough, just let me know how I can make a proper dump.

Stacktraces are attached.


was (Author: relgames):
So it happened again (not an OOM this time, but Gossip pending tasks).
I ran jstack several times. I am not sure whether the output contains the stack traces the devs need.
If that is not enough, just let me know how I can make a proper dump.



[jira] [Commented] (CASSANDRA-8245) Cassandra nodes periodically die in 2-DC configuration

2014-11-03 Thread Oleg Poleshuk (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14194724#comment-14194724
 ] 

Oleg Poleshuk commented on CASSANDRA-8245:
--

Processors = 8
Memory     = 15.58 GB

processor  : 7
vendor_id  : GenuineIntel
cpu family : 6
model      : 15
model name : Intel(R) Xeon(R) CPU E5-2665 0 @ 2.40GHz
stepping   : 1
cpu MHz    : 2400.000
cache size : 20480 KB




[jira] [Commented] (CASSANDRA-8245) Cassandra nodes periodically die in 2-DC configuration

2014-11-03 Thread Oleg Poleshuk (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14194725#comment-14194725
 ] 

Oleg Poleshuk commented on CASSANDRA-8245:
--

I will provide a thread dump when it dies next time.



[jira] [Commented] (CASSANDRA-8245) Cassandra nodes periodically die in 2-DC configuration

2014-11-03 Thread Oleg Poleshuk (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14194726#comment-14194726
 ] 

Oleg Poleshuk commented on CASSANDRA-8245:
--

Why was this changed to Minor? A datacenter dies periodically because of 
Gossiper memory leaks; to me this is not Minor.



[jira] [Commented] (CASSANDRA-8245) Cassandra nodes periodically die in 2-DC configuration

2014-11-03 Thread Oleg Poleshuk (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14194729#comment-14194729
 ] 

Oleg Poleshuk commented on CASSANDRA-8245:
--

The dates start in August because it has gone down several times; I probably 
grepped a previous event.
The datacenters communicate properly. In any case, isn't it supposed to survive 
a DC link failure?



[jira] [Created] (CASSANDRA-8245) Cassandra nodes periodically die in 2-DC configuration

2014-11-03 Thread Oleg Poleshuk (JIRA)
Oleg Poleshuk created CASSANDRA-8245:


 Summary: Cassandra nodes periodically die in 2-DC configuration
 Key: CASSANDRA-8245
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8245
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Scientific Linux release 6.5
java version 1.7.0_51
Cassandra 2.0.9
Reporter: Oleg Poleshuk
Priority: Critical


We have 2 DCs with 3 nodes in each.
Second DC periodically has 1-2 nodes down.

It looks like it loses connectivity with the other nodes, and then the 
Gossiper starts to accumulate tasks until Cassandra dies with an OOM.

WARN [MemoryMeter:1] 2014-08-12 14:34:59,803 Memtable.java (line 470) setting live ratio to maximum of 64.0 instead of Infinity
WARN [GossipTasks:1] 2014-08-12 14:44:34,866 Gossiper.java (line 637) Gossip stage has 1 pending tasks; skipping status check (no nodes will be marked down)
WARN [GossipTasks:1] 2014-08-12 14:44:35,968 Gossiper.java (line 637) Gossip stage has 4 pending tasks; skipping status check (no nodes will be marked down)
WARN [GossipTasks:1] 2014-08-12 14:44:37,070 Gossiper.java (line 637) Gossip stage has 8 pending tasks; skipping status check (no nodes will be marked down)
WARN [GossipTasks:1] 2014-08-12 14:44:38,171 Gossiper.java (line 637) Gossip stage has 11 pending tasks; skipping status check (no nodes will be marked down)
...
WARN [GossipTasks:1] 2014-10-06 21:42:51,575 Gossiper.java (line 637) Gossip stage has 1014764 pending tasks; skipping status check (no nodes will be marked down)
WARN [New I/O worker #13] 2014-10-06 21:54:27,010 Slf4JLogger.java (line 76) Unexpected exception in the selector loop.
java.lang.OutOfMemoryError: Java heap space

There are also these lines, but I am not sure they are relevant:
DEBUG [GossipStage:1] 2014-08-12 11:33:18,801 FailureDetector.java (line 338) Ignoring interval time of 2085963047
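
The FailureDetector line does not state a unit. Assuming the value is a duration in nanoseconds (an assumption, not something the log confirms), a quick conversion in Python puts the ignored interval at just over two seconds:

```python
NANOS_PER_SECOND = 1_000_000_000


def nanos_to_seconds(nanos):
    """Convert a nanosecond count to seconds (unit assumed, see above)."""
    return nanos / NANOS_PER_SECOND


# Value copied from the "Ignoring interval time of ..." log line.
print(f"{nanos_to_seconds(2085963047):.3f} s")  # prints "2.086 s"
```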







[jira] [Commented] (CASSANDRA-5797) DC-local CAS

2014-04-10 Thread Oleg Poleshuk (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13965394#comment-13965394
 ] 

Oleg Poleshuk commented on CASSANDRA-5797:
--

Is there a way to change the serial consistency level from CQL or the Java driver?
http://stackoverflow.com/questions/22666911/cassandra-cql-consistency-during-cas-operations

 DC-local CAS
 

 Key: CASSANDRA-5797
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5797
 Project: Cassandra
  Issue Type: Bug
  Components: API
Affects Versions: 2.0 beta 1
Reporter: Jonathan Ellis
Assignee: Sylvain Lebresne
Priority: Minor
 Fix For: 2.0 rc1

 Attachments: 0001-Thrift-generated-files.txt, 
 0002-Add-LOCAL_SERIAL-CL.txt, 0003-CQL-and-native-protocol-changes.txt


 For two-datacenter deployments where the second DC is strictly for disaster 
 failover, it would be useful to restrict CAS to a single DC to avoid cross-DC 
 round trips.
 (This would require manually truncating {{system.paxos}} when failing over.)



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Issue Comment Deleted] (CASSANDRA-5797) DC-local CAS

2014-04-10 Thread Oleg Poleshuk (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleg Poleshuk updated CASSANDRA-5797:
-

Comment: was deleted

(was: Is there a way to change serial consistency level from CQL / Java driver?
http://stackoverflow.com/questions/22666911/cassandra-cql-consistency-during-cas-operations)
