[jira] [Commented] (CASSANDRA-11345) Assertion Errors "Memory was freed" during streaming

2016-04-25 Thread Jean-Francois Gosselin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15256291#comment-15256291
 ] 

Jean-Francois Gosselin commented on CASSANDRA-11345:


During a sequential repair.
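
(For clarity, "sequential" here means the default non-parallel mode in 2.1, i.e. a plain invocation along the lines of the sketch below; the keyspace name is only a placeholder.)

{code}
# default (sequential) repair in 2.1 -- keyspace name is illustrative
nodetool repair my_keyspace
{code}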

> Assertion Errors "Memory was freed" during streaming
> 
>
> Key: CASSANDRA-11345
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11345
> Project: Cassandra
>  Issue Type: Bug
>  Components: Streaming and Messaging
>Reporter: Jean-Francois Gosselin
>Assignee: Paulo Motta
>
> We encountered the following AssertionError (twice on the same node) during a repair:
> On node /172.16.63.41
> {noformat}
> INFO  [STREAM-IN-/10.174.216.160] 2016-03-09 02:38:13,900 StreamResultFuture.java:180 - [Stream #f6980580-e55f-11e5-8f08-ef9e099ce99e] Session with /10.174.216.160 is complete
> WARN  [STREAM-IN-/10.174.216.160] 2016-03-09 02:38:13,900 StreamResultFuture.java:207 - [Stream #f6980580-e55f-11e5-8f08-ef9e099ce99e] Stream failed
> ERROR [STREAM-OUT-/10.174.216.160] 2016-03-09 02:38:13,906 StreamSession.java:505 - [Stream #f6980580-e55f-11e5-8f08-ef9e099ce99e] Streaming error occurred
> java.lang.AssertionError: Memory was freed
>         at org.apache.cassandra.io.util.SafeMemory.checkBounds(SafeMemory.java:97) ~[apache-cassandra-2.1.13.jar:2.1.13]
>         at org.apache.cassandra.io.util.Memory.getLong(Memory.java:249) ~[apache-cassandra-2.1.13.jar:2.1.13]
>         at org.apache.cassandra.io.compress.CompressionMetadata.getTotalSizeForSections(CompressionMetadata.java:247) ~[apache-cassandra-2.1.13.jar:2.1.13]
>         at org.apache.cassandra.streaming.messages.FileMessageHeader.size(FileMessageHeader.java:112) ~[apache-cassandra-2.1.13.jar:2.1.13]
>         at org.apache.cassandra.streaming.StreamSession.fileSent(StreamSession.java:546) ~[apache-cassandra-2.1.13.jar:2.1.13]
>         at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:50) ~[apache-cassandra-2.1.13.jar:2.1.13]
>         at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:41) ~[apache-cassandra-2.1.13.jar:2.1.13]
>         at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:45) ~[apache-cassandra-2.1.13.jar:2.1.13]
>         at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:351) ~[apache-cassandra-2.1.13.jar:2.1.13]
>         at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:331) ~[apache-cassandra-2.1.13.jar:2.1.13]
>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
> {noformat} 
> On node /10.174.216.160
>  
> {noformat}   
> ERROR [STREAM-OUT-/172.16.63.41] 2016-03-09 02:38:14,140 StreamSession.java:505 - [Stream #f6980580-e55f-11e5-8f08-ef9e099ce99e] Streaming error occurred
> java.io.IOException: Connection reset by peer
>         at sun.nio.ch.FileDispatcherImpl.write0(Native Method) ~[na:1.7.0_65]
>         at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47) ~[na:1.7.0_65]
>         at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) ~[na:1.7.0_65]
>         at sun.nio.ch.IOUtil.write(IOUtil.java:65) ~[na:1.7.0_65]
>         at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:487) ~[na:1.7.0_65]
> 

[jira] [Commented] (CASSANDRA-10769) "received out of order wrt DecoratedKey" after scrub

2016-03-23 Thread Jean-Francois Gosselin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15208597#comment-15208597
 ] 

Jean-Francois Gosselin commented on CASSANDRA-10769:


Based on the comments in CASSANDRA-9935, the "AssertionError: row DecoratedKey" error is still present in 2.1.13.

> "received out of order wrt DecoratedKey" after scrub
> 
>
> Key: CASSANDRA-10769
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10769
> Project: Cassandra
>  Issue Type: Bug
> Environment: C* 2.1.11, Debian Wheezy
>Reporter: mlowicki
>
> After running scrub and cleanup on all nodes in single data center I'm 
> getting:
> {code}
> ERROR [ValidationExecutor:103] 2015-11-25 06:28:21,530 Validator.java:245 - 
> Failed creating a merkle tree for [repair 
> #89fa2b70-933d-11e5-b036-75bb514ae072 on sync/entity_by_id2, 
> (-5867793819051725444,-5865919628027816979]], /10.210.3.221 (see log for 
> details)
> ERROR [ValidationExecutor:103] 2015-11-25 06:28:21,531 
> CassandraDaemon.java:227 - Exception in thread 
> Thread[ValidationExecutor:103,1,main]
> java.lang.AssertionError: row DecoratedKey(-5867787467868737053, 
> 000932373633313036313204808800) received out of order wrt 
> DecoratedKey(-5865937851627253360, 000933313230313737333204c3c700)
> at org.apache.cassandra.repair.Validator.add(Validator.java:127) 
> ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:1010)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:94)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.compaction.CompactionManager$9.call(CompactionManager.java:622)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> ~[na:1.7.0_80]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  ~[na:1.7.0_80]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_80]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
> {code}
> What I did is to run repair on other node:
> {code}
> time nodetool repair --in-local-dc
> {code}
> Corresponding log on the node where repair has been started:
> {code}
> ERROR [AntiEntropySessions:414] 2015-11-25 06:28:21,533 
> RepairSession.java:303 - [repair #89fa2b70-933d-11e5-b036-75bb514ae072] 
> session completed with the following error
> org.apache.cassandra.exceptions.RepairException: [repair 
> #89fa2b70-933d-11e5-b036-75bb514ae072 on sync/entity_by_id2, 
> (-5867793819051725444,-5865919628027816979]] Validation failed in 
> /10.210.3.117
> at 
> org.apache.cassandra.repair.RepairSession.validationComplete(RepairSession.java:166)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.service.ActiveRepairService.handleMessage(ActiveRepairService.java:406)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:134)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) 
> ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_80]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_80]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
> INFO  [AntiEntropySessions:415] 2015-11-25 06:28:21,533 
> RepairSession.java:260 - [repair #b9458fa0-933d-11e5-b036-75bb514ae072] new 
> session: will sync /10.210.3.221, /10.210.3.118, /10.210.3.117 on range 
> (7119703141488009983,7129744584776466802] for sync.[device_token, entity2, 
> user_stats, user_device, user_quota, user_store, user_device_progress, 
> entity_by_id2]
> ERROR [AntiEntropySessions:414] 2015-11-25 06:28:21,533 
> CassandraDaemon.java:227 - Exception in thread 
> Thread[AntiEntropySessions:414,5,RMI Runtime]
> java.lang.RuntimeException: org.apache.cassandra.exceptions.RepairException: 
> [repair #89fa2b70-933d-11e5-b036-75bb514ae072 on sync/entity_by_id2, 
> (-5867793819051725444,-5865919628027816979]] Validation failed in 
> /10.210.3.117
> at com.google.common.base.Throwables.propagate(Throwables.java:160) 
> ~[guava-16.0.jar:na]
> at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32) 
> ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> ~[na:1.7.0_80]
> at 

[jira] [Commented] (CASSANDRA-11374) LEAK DETECTED during repair

2016-03-20 Thread Jean-Francois Gosselin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15199874#comment-15199874
 ] 

Jean-Francois Gosselin commented on CASSANDRA-11374:


Same issue as CASSANDRA-9117, but not fixed in 2.1.x?

> LEAK DETECTED during repair
> ---
>
> Key: CASSANDRA-11374
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11374
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jean-Francois Gosselin
>
> When running a range repair we are seeing the following LEAK DETECTED errors:
> {noformat}
> ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,261 Ref.java:179 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@5ee90b43) to class 
> org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$1@367168611:[[OffHeapBitSet]]
>  was not released before the reference was garbage collected
> ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,262 Ref.java:179 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@4ea9d4a7) to class 
> org.apache.cassandra.io.util.SafeMemory$MemoryTidy@1875396681:Memory@[7f34b905fd10..7f34b9060b7a)
>  was not released before the reference was garbage collected
> ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,262 Ref.java:179 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@27a6b614) to class 
> org.apache.cassandra.io.util.SafeMemory$MemoryTidy@838594402:Memory@[7f34bae11ce0..7f34bae11d84)
>  was not released before the reference was garbage collected
> ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,263 Ref.java:179 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@64e7b566) to class 
> org.apache.cassandra.io.util.SafeMemory$MemoryTidy@674656075:Memory@[7f342deab4e0..7f342deb7ce0)
>  was not released before the reference was garbage collected
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (CASSANDRA-11374) LEAK DETECTED during repair

2016-03-19 Thread Jean-Francois Gosselin (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Francois Gosselin updated CASSANDRA-11374:
---
Comment: was deleted

(was: We are using the Reaper https://nuance.webex.com/join/sylvain_boily, so a 
subrange repair (we are not using incremental repair).)

> LEAK DETECTED during repair
> ---
>
> Key: CASSANDRA-11374
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11374
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jean-Francois Gosselin
>Assignee: Marcus Eriksson
>
> When running a range repair we are seeing the following LEAK DETECTED errors:
> {noformat}
> ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,261 Ref.java:179 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@5ee90b43) to class 
> org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$1@367168611:[[OffHeapBitSet]]
>  was not released before the reference was garbage collected
> ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,262 Ref.java:179 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@4ea9d4a7) to class 
> org.apache.cassandra.io.util.SafeMemory$MemoryTidy@1875396681:Memory@[7f34b905fd10..7f34b9060b7a)
>  was not released before the reference was garbage collected
> ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,262 Ref.java:179 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@27a6b614) to class 
> org.apache.cassandra.io.util.SafeMemory$MemoryTidy@838594402:Memory@[7f34bae11ce0..7f34bae11d84)
>  was not released before the reference was garbage collected
> ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,263 Ref.java:179 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@64e7b566) to class 
> org.apache.cassandra.io.util.SafeMemory$MemoryTidy@674656075:Memory@[7f342deab4e0..7f342deb7ce0)
>  was not released before the reference was garbage collected
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-11374) LEAK DETECTED during repair

2016-03-19 Thread Jean-Francois Gosselin (JIRA)
Jean-Francois Gosselin created CASSANDRA-11374:
--

 Summary: LEAK DETECTED during repair
 Key: CASSANDRA-11374
 URL: https://issues.apache.org/jira/browse/CASSANDRA-11374
 Project: Cassandra
  Issue Type: Bug
Reporter: Jean-Francois Gosselin


When running a range repair we are seeing the following LEAK DETECTED errors:

{noformat}
ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,261 Ref.java:179 - LEAK 
DETECTED: a reference 
(org.apache.cassandra.utils.concurrent.Ref$State@5ee90b43) to class 
org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$1@367168611:[[OffHeapBitSet]]
 was not released before the reference was garbage collected
ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,262 Ref.java:179 - LEAK 
DETECTED: a reference 
(org.apache.cassandra.utils.concurrent.Ref$State@4ea9d4a7) to class 
org.apache.cassandra.io.util.SafeMemory$MemoryTidy@1875396681:Memory@[7f34b905fd10..7f34b9060b7a)
 was not released before the reference was garbage collected
ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,262 Ref.java:179 - LEAK 
DETECTED: a reference 
(org.apache.cassandra.utils.concurrent.Ref$State@27a6b614) to class 
org.apache.cassandra.io.util.SafeMemory$MemoryTidy@838594402:Memory@[7f34bae11ce0..7f34bae11d84)
 was not released before the reference was garbage collected
ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,263 Ref.java:179 - LEAK 
DETECTED: a reference 
(org.apache.cassandra.utils.concurrent.Ref$State@64e7b566) to class 
org.apache.cassandra.io.util.SafeMemory$MemoryTidy@674656075:Memory@[7f342deab4e0..7f342deb7ce0)
 was not released before the reference was garbage collected
{noformat}




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11374) LEAK DETECTED during repair

2016-03-19 Thread Jean-Francois Gosselin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15201920#comment-15201920
 ] 

Jean-Francois Gosselin commented on CASSANDRA-11374:


We are using the Reaper from Spotify (https://github.com/spotify/cassandra-reaper), so subrange repair. We are not using incremental repair.
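
For reference, the subrange repairs Reaper schedules are essentially what nodetool does when given an explicit start/end token; a minimal sketch (token values and keyspace name are only illustrative):

{code}
# subrange repair of a single token range -- values are placeholders
nodetool repair -st -5525881226490706160 -et -5525442713957813067 my_keyspace
{code}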

> LEAK DETECTED during repair
> ---
>
> Key: CASSANDRA-11374
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11374
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jean-Francois Gosselin
>Assignee: Marcus Eriksson
>
> When running a range repair we are seeing the following LEAK DETECTED errors:
> {noformat}
> ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,261 Ref.java:179 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@5ee90b43) to class 
> org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$1@367168611:[[OffHeapBitSet]]
>  was not released before the reference was garbage collected
> ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,262 Ref.java:179 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@4ea9d4a7) to class 
> org.apache.cassandra.io.util.SafeMemory$MemoryTidy@1875396681:Memory@[7f34b905fd10..7f34b9060b7a)
>  was not released before the reference was garbage collected
> ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,262 Ref.java:179 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@27a6b614) to class 
> org.apache.cassandra.io.util.SafeMemory$MemoryTidy@838594402:Memory@[7f34bae11ce0..7f34bae11d84)
>  was not released before the reference was garbage collected
> ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,263 Ref.java:179 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@64e7b566) to class 
> org.apache.cassandra.io.util.SafeMemory$MemoryTidy@674656075:Memory@[7f342deab4e0..7f342deb7ce0)
>  was not released before the reference was garbage collected
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11374) LEAK DETECTED during repair

2016-03-19 Thread Jean-Francois Gosselin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15201915#comment-15201915
 ] 

Jean-Francois Gosselin commented on CASSANDRA-11374:


We are using the Reaper https://nuance.webex.com/join/sylvain_boily, so a 
subrange repair (we are not using incremental repair).

> LEAK DETECTED during repair
> ---
>
> Key: CASSANDRA-11374
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11374
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jean-Francois Gosselin
>Assignee: Marcus Eriksson
>
> When running a range repair we are seeing the following LEAK DETECTED errors:
> {noformat}
> ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,261 Ref.java:179 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@5ee90b43) to class 
> org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$1@367168611:[[OffHeapBitSet]]
>  was not released before the reference was garbage collected
> ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,262 Ref.java:179 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@4ea9d4a7) to class 
> org.apache.cassandra.io.util.SafeMemory$MemoryTidy@1875396681:Memory@[7f34b905fd10..7f34b9060b7a)
>  was not released before the reference was garbage collected
> ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,262 Ref.java:179 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@27a6b614) to class 
> org.apache.cassandra.io.util.SafeMemory$MemoryTidy@838594402:Memory@[7f34bae11ce0..7f34bae11d84)
>  was not released before the reference was garbage collected
> ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,263 Ref.java:179 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@64e7b566) to class 
> org.apache.cassandra.io.util.SafeMemory$MemoryTidy@674656075:Memory@[7f342deab4e0..7f342deb7ce0)
>  was not released before the reference was garbage collected
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-11345) Assertion Errors "Memory was freed" during streaming

2016-03-11 Thread Jean-Francois Gosselin (JIRA)
Jean-Francois Gosselin created CASSANDRA-11345:
--

 Summary: Assertion Errors "Memory was freed" during streaming
 Key: CASSANDRA-11345
 URL: https://issues.apache.org/jira/browse/CASSANDRA-11345
 Project: Cassandra
  Issue Type: Bug
  Components: Streaming and Messaging
Reporter: Jean-Francois Gosselin


We encountered the following AssertionError (twice on the same node) during a repair:

On node /172.16.63.41

{noformat}
INFO  [STREAM-IN-/10.174.216.160] 2016-03-09 02:38:13,900 StreamResultFuture.java:180 - [Stream #f6980580-e55f-11e5-8f08-ef9e099ce99e] Session with /10.174.216.160 is complete
WARN  [STREAM-IN-/10.174.216.160] 2016-03-09 02:38:13,900 StreamResultFuture.java:207 - [Stream #f6980580-e55f-11e5-8f08-ef9e099ce99e] Stream failed
ERROR [STREAM-OUT-/10.174.216.160] 2016-03-09 02:38:13,906 StreamSession.java:505 - [Stream #f6980580-e55f-11e5-8f08-ef9e099ce99e] Streaming error occurred
java.lang.AssertionError: Memory was freed
        at org.apache.cassandra.io.util.SafeMemory.checkBounds(SafeMemory.java:97) ~[apache-cassandra-2.1.13.jar:2.1.13]
        at org.apache.cassandra.io.util.Memory.getLong(Memory.java:249) ~[apache-cassandra-2.1.13.jar:2.1.13]
        at org.apache.cassandra.io.compress.CompressionMetadata.getTotalSizeForSections(CompressionMetadata.java:247) ~[apache-cassandra-2.1.13.jar:2.1.13]
        at org.apache.cassandra.streaming.messages.FileMessageHeader.size(FileMessageHeader.java:112) ~[apache-cassandra-2.1.13.jar:2.1.13]
        at org.apache.cassandra.streaming.StreamSession.fileSent(StreamSession.java:546) ~[apache-cassandra-2.1.13.jar:2.1.13]
        at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:50) ~[apache-cassandra-2.1.13.jar:2.1.13]
        at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:41) ~[apache-cassandra-2.1.13.jar:2.1.13]
        at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:45) ~[apache-cassandra-2.1.13.jar:2.1.13]
        at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:351) ~[apache-cassandra-2.1.13.jar:2.1.13]
        at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:331) ~[apache-cassandra-2.1.13.jar:2.1.13]
        at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
{noformat} 

On node /10.174.216.160
 
{noformat}   
ERROR [STREAM-OUT-/172.16.63.41] 2016-03-09 02:38:14,140 StreamSession.java:505 - [Stream #f6980580-e55f-11e5-8f08-ef9e099ce99e] Streaming error occurred
java.io.IOException: Connection reset by peer
        at sun.nio.ch.FileDispatcherImpl.write0(Native Method) ~[na:1.7.0_65]
        at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47) ~[na:1.7.0_65]
        at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) ~[na:1.7.0_65]
        at sun.nio.ch.IOUtil.write(IOUtil.java:65) ~[na:1.7.0_65]
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:487) ~[na:1.7.0_65]
        at org.apache.cassandra.io.util.DataOutputStreamAndChannel.write(DataOutputStreamAndChannel.java:48) ~[apache-cassandra-2.1.13.jar:2.1.13]
        at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44) ~[apache-cassandra-2.1.13.jar:2.1.13]
        at

[jira] [Commented] (CASSANDRA-9625) GraphiteReporter not reporting

2016-02-22 Thread Jean-Francois Gosselin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157028#comment-15157028
 ] 

Jean-Francois Gosselin commented on CASSANDRA-9625:
---

I will give it a try.

> GraphiteReporter not reporting
> --
>
> Key: CASSANDRA-9625
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9625
> Project: Cassandra
>  Issue Type: Bug
> Environment: Debian Jessie, 7u79-2.5.5-1~deb8u1, Cassandra 2.1.3
>Reporter: Eric Evans
>Assignee: T Jake Luciani
> Attachments: metrics.yaml, thread-dump.log
>
>
> When upgrading from 2.1.3 to 2.1.6, the Graphite metrics reporter stops 
> working.  The usual startup is logged, and one batch of samples is sent, but 
> the reporting interval comes and goes, and no other samples are ever sent.  
> The logs are free from errors.
> Frustratingly, metrics reporting works in our smaller (staging) environment 
> on 2.1.6; We are able to reproduce this on all 6 of production nodes, but not 
> on a 3 node (otherwise identical) staging cluster (maybe it takes a certain 
> level of concurrency?).
> Attached is a thread dump, and our metrics.yaml.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException

2016-02-19 Thread Jean-Francois Gosselin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15154789#comment-15154789
 ] 

Jean-Francois Gosselin commented on CASSANDRA-9935:
---

We are doing range repair with https://github.com/spotify/cassandra-reaper. We don't use incremental repair. We also see the issue with: nodetool repair -pr

> Repair fails with RuntimeException
> --
>
> Key: CASSANDRA-9935
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9935
> Project: Cassandra
>  Issue Type: Bug
> Environment: C* 2.1.8, Debian Wheezy
>Reporter: mlowicki
>Assignee: Yuki Morishita
> Fix For: 2.1.x
>
> Attachments: db1.sync.lati.osa.cassandra.log, 
> db5.sync.lati.osa.cassandra.log, system.log.10.210.3.117, 
> system.log.10.210.3.221, system.log.10.210.3.230
>
>
> We had problems with slow repair in 2.1.7 (CASSANDRA-9702) but after upgrade 
> to 2.1.8 it started to work faster but now it fails with:
> {code}
> ...
> [2015-07-29 20:44:03,956] Repair session 23a811b0-3632-11e5-a93e-4963524a8bde 
> for range (-5474076923322749342,-5468600594078911162] finished
> [2015-07-29 20:44:03,957] Repair session 336f8740-3632-11e5-a93e-4963524a8bde 
> for range (-8631877858109464676,-8624040066373718932] finished
> [2015-07-29 20:44:03,957] Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde 
> for range (-5372806541854279315,-5369354119480076785] finished
> [2015-07-29 20:44:03,957] Repair session 59f129f0-3632-11e5-a93e-4963524a8bde 
> for range (8166489034383821955,8168408930184216281] finished
> [2015-07-29 20:44:03,957] Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde 
> for range (6084602890817326921,6088328703025510057] finished
> [2015-07-29 20:44:03,957] Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde 
> for range (-781874602493000830,-781745173070807746] finished
> [2015-07-29 20:44:03,957] Repair command #4 finished
> error: nodetool failed, check server logs
> -- StackTrace --
> java.lang.RuntimeException: nodetool failed, check server logs
> at 
> org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:290)
> at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:202)
> {code}
> After running:
> {code}
> nodetool repair --partitioner-range --parallel --in-local-dc sync
> {code}
> Last records in logs regarding repair are:
> {code}
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 09ff9e40-3632-11e5-a93e-4963524a8bde for range 
> (-7695808664784761779,-7693529816291585568] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 17d8d860-3632-11e5-a93e-4963524a8bde for range 
> (806371695398849,8065203836608925992] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 23a811b0-3632-11e5-a93e-4963524a8bde for range 
> (-5474076923322749342,-5468600594078911162] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 336f8740-3632-11e5-a93e-4963524a8bde for range 
> (-8631877858109464676,-8624040066373718932] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde for range 
> (-5372806541854279315,-5369354119480076785] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 59f129f0-3632-11e5-a93e-4963524a8bde for range 
> (8166489034383821955,8168408930184216281] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde for range 
> (6084602890817326921,6088328703025510057] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde for range 
> (-781874602493000830,-781745173070807746] finished
> {code}
> but a bit above I see (at least two times in attached log):
> {code}
> ERROR [Thread-173887] 2015-07-29 20:44:03,853 StorageService.java:2959 - 
> Repair session 1b07ea50-3608-11e5-a93e-4963524a8bde for range 
> (5765414319217852786,5781018794516851576] failed with error 
> org.apache.cassandra.exceptions.RepairException: [repair 
> #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
> (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
> org.apache.cassandra.exceptions.RepairException: [repair 
> #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
> (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
> at java.util.concurrent.FutureTask.report(FutureTask.java:122) 
> [na:1.7.0_80]
> at 

[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException

2016-02-12 Thread Jean-Francois Gosselin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15145336#comment-15145336
 ] 

Jean-Francois Gosselin commented on CASSANDRA-9935:
---

[~yukim] What's the next step to troubleshoot this issue? Is there any specific log we could enable at DEBUG?
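
For example, would it help to bump the repair and validation-compaction packages at runtime on the affected nodes, something like:

{code}
nodetool setlogginglevel org.apache.cassandra.repair DEBUG
nodetool setlogginglevel org.apache.cassandra.db.compaction DEBUG
{code}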

> Repair fails with RuntimeException
> --
>
> Key: CASSANDRA-9935
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9935
> Project: Cassandra
>  Issue Type: Bug
> Environment: C* 2.1.8, Debian Wheezy
>Reporter: mlowicki
>Assignee: Yuki Morishita
> Fix For: 2.1.x
>
> Attachments: db1.sync.lati.osa.cassandra.log, 
> db5.sync.lati.osa.cassandra.log, system.log.10.210.3.117, 
> system.log.10.210.3.221, system.log.10.210.3.230
>
>
> We had problems with slow repair in 2.1.7 (CASSANDRA-9702) but after upgrade 
> to 2.1.8 it started to work faster but now it fails with:
> {code}
> ...
> [2015-07-29 20:44:03,956] Repair session 23a811b0-3632-11e5-a93e-4963524a8bde 
> for range (-5474076923322749342,-5468600594078911162] finished
> [2015-07-29 20:44:03,957] Repair session 336f8740-3632-11e5-a93e-4963524a8bde 
> for range (-8631877858109464676,-8624040066373718932] finished
> [2015-07-29 20:44:03,957] Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde 
> for range (-5372806541854279315,-5369354119480076785] finished
> [2015-07-29 20:44:03,957] Repair session 59f129f0-3632-11e5-a93e-4963524a8bde 
> for range (8166489034383821955,8168408930184216281] finished
> [2015-07-29 20:44:03,957] Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde 
> for range (6084602890817326921,6088328703025510057] finished
> [2015-07-29 20:44:03,957] Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde 
> for range (-781874602493000830,-781745173070807746] finished
> [2015-07-29 20:44:03,957] Repair command #4 finished
> error: nodetool failed, check server logs
> -- StackTrace --
> java.lang.RuntimeException: nodetool failed, check server logs
> at 
> org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:290)
> at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:202)
> {code}
> After running:
> {code}
> nodetool repair --partitioner-range --parallel --in-local-dc sync
> {code}
> Last records in logs regarding repair are:
> {code}
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 09ff9e40-3632-11e5-a93e-4963524a8bde for range 
> (-7695808664784761779,-7693529816291585568] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 17d8d860-3632-11e5-a93e-4963524a8bde for range 
> (806371695398849,8065203836608925992] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 23a811b0-3632-11e5-a93e-4963524a8bde for range 
> (-5474076923322749342,-5468600594078911162] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 336f8740-3632-11e5-a93e-4963524a8bde for range 
> (-8631877858109464676,-8624040066373718932] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde for range 
> (-5372806541854279315,-5369354119480076785] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 59f129f0-3632-11e5-a93e-4963524a8bde for range 
> (8166489034383821955,8168408930184216281] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde for range 
> (6084602890817326921,6088328703025510057] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde for range 
> (-781874602493000830,-781745173070807746] finished
> {code}
> but a bit above I see (at least two times in attached log):
> {code}
> ERROR [Thread-173887] 2015-07-29 20:44:03,853 StorageService.java:2959 - 
> Repair session 1b07ea50-3608-11e5-a93e-4963524a8bde for range 
> (5765414319217852786,5781018794516851576] failed with error 
> org.apache.cassandra.exceptions.RepairException: [repair 
> #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
> (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
> org.apache.cassandra.exceptions.RepairException: [repair 
> #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
> (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
> at java.util.concurrent.FutureTask.report(FutureTask.java:122) 
> [na:1.7.0_80]
> at java.util.concurrent.FutureTask.get(FutureTask.java:188) 
> [na:1.7.0_80]
> at 
> 

[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException

2016-02-11 Thread Jean-Francois Gosselin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143400#comment-15143400
 ] 

Jean-Francois Gosselin commented on CASSANDRA-9935:
---

OK, from 172.16.63.39, same error "received out of order wrt DecoratedKey":

{noformat}
ERROR [ValidationExecutor:118] 2016-02-11 17:21:27,512 Validator.java:245 - 
Failed creating a merkle tree for [repair #d78e02b0-d0e3-11e5-a04a-4ffa10ef584b 
on foo/bar, (-5525881226490706160,-5525442713957813067]], /10.174.216.158 (see 
log for details)
ERROR [ValidationExecutor:118] 2016-02-11 17:21:27,516 CassandraDaemon.java:223 
- Exception in thread Thread[ValidationExecutor:118,1,main]
java.lang.AssertionError: row DecoratedKey(-5525725068665570338, 
0010e3a74bf82717394598e2b7421c89382e250265336137346266382d323731372d333934352d393865322d62373432316338393338326510f64b1c2b7d1c3ff893b70c24c5dbdc6b00)
 received out of order wrt DecoratedKey(-5525444669477674618, 
0010581499f0b99337e1bf468611fd0233e4250235383134393966302d623939332d333765312d626634362d3836313166643032653410f64b1c2b7d1c3ff893b70c24c5dbdc6b00)
at org.apache.cassandra.repair.Validator.add(Validator.java:126) 
~[apache-cassandra-2.1.9.jar:2.1.9]
at 
org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:1003)
 ~[apache-cassandra-2.1.9.jar:2.1.9]
at 
org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:94)
 ~[apache-cassandra-2.1.9.jar:2.1.9]
at 
org.apache.cassandra.db.compaction.CompactionManager$9.call(CompactionManager.java:615)
 ~[apache-cassandra-2.1.9.jar:2.1.9]
at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
~[na:1.7.0_65]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
~[na:1.7.0_65]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_65]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
{noformat}

> Repair fails with RuntimeException
> --
>
> Key: CASSANDRA-9935
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9935
> Project: Cassandra
>  Issue Type: Bug
> Environment: C* 2.1.8, Debian Wheezy
>Reporter: mlowicki
>Assignee: Yuki Morishita
> Fix For: 2.1.x
>
> Attachments: db1.sync.lati.osa.cassandra.log, 
> db5.sync.lati.osa.cassandra.log, system.log.10.210.3.117, 
> system.log.10.210.3.221, system.log.10.210.3.230
>
>
> We had problems with slow repair in 2.1.7 (CASSANDRA-9702) but after upgrade 
> to 2.1.8 it started to work faster but now it fails with:
> {code}
> ...
> [2015-07-29 20:44:03,956] Repair session 23a811b0-3632-11e5-a93e-4963524a8bde 
> for range (-5474076923322749342,-5468600594078911162] finished
> [2015-07-29 20:44:03,957] Repair session 336f8740-3632-11e5-a93e-4963524a8bde 
> for range (-8631877858109464676,-8624040066373718932] finished
> [2015-07-29 20:44:03,957] Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde 
> for range (-5372806541854279315,-5369354119480076785] finished
> [2015-07-29 20:44:03,957] Repair session 59f129f0-3632-11e5-a93e-4963524a8bde 
> for range (8166489034383821955,8168408930184216281] finished
> [2015-07-29 20:44:03,957] Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde 
> for range (6084602890817326921,6088328703025510057] finished
> [2015-07-29 20:44:03,957] Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde 
> for range (-781874602493000830,-781745173070807746] finished
> [2015-07-29 20:44:03,957] Repair command #4 finished
> error: nodetool failed, check server logs
> -- StackTrace --
> java.lang.RuntimeException: nodetool failed, check server logs
> at 
> org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:290)
> at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:202)
> {code}
> After running:
> {code}
> nodetool repair --partitioner-range --parallel --in-local-dc sync
> {code}
> Last records in logs regarding repair are:
> {code}
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 09ff9e40-3632-11e5-a93e-4963524a8bde for range 
> (-7695808664784761779,-7693529816291585568] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 17d8d860-3632-11e5-a93e-4963524a8bde for range 
> (806371695398849,8065203836608925992] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 23a811b0-3632-11e5-a93e-4963524a8bde for range 
> (-5474076923322749342,-5468600594078911162] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 336f8740-3632-11e5-a93e-4963524a8bde for range 
> (-8631877858109464676,-8624040066373718932] finished

[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException

2016-02-11 Thread Jean-Francois Gosselin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143270#comment-15143270
 ] 

Jean-Francois Gosselin commented on CASSANDRA-9935:
---

[~yukim] Yesterday we ran nodetool scrub on all the nodes and restarted them. No luck, we're still getting "received out of order wrt DecoratedKey". Any suggestions for the next step?
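
(The only other thing we can think of trying is an offline scrub with the node stopped, along the lines of the sketch below; the keyspace/table names are placeholders.)

{code}
# run with the Cassandra process stopped -- names are illustrative
sstablescrub my_keyspace my_table
{code}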

> Repair fails with RuntimeException
> --
>
> Key: CASSANDRA-9935
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9935
> Project: Cassandra
>  Issue Type: Bug
> Environment: C* 2.1.8, Debian Wheezy
>Reporter: mlowicki
>Assignee: Yuki Morishita
> Fix For: 2.1.x
>
> Attachments: db1.sync.lati.osa.cassandra.log, 
> db5.sync.lati.osa.cassandra.log, system.log.10.210.3.117, 
> system.log.10.210.3.221, system.log.10.210.3.230
>
>
> We had problems with slow repair in 2.1.7 (CASSANDRA-9702) but after upgrade 
> to 2.1.8 it started to work faster but now it fails with:
> {code}
> ...
> [2015-07-29 20:44:03,956] Repair session 23a811b0-3632-11e5-a93e-4963524a8bde 
> for range (-5474076923322749342,-5468600594078911162] finished
> [2015-07-29 20:44:03,957] Repair session 336f8740-3632-11e5-a93e-4963524a8bde 
> for range (-8631877858109464676,-8624040066373718932] finished
> [2015-07-29 20:44:03,957] Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde 
> for range (-5372806541854279315,-5369354119480076785] finished
> [2015-07-29 20:44:03,957] Repair session 59f129f0-3632-11e5-a93e-4963524a8bde 
> for range (8166489034383821955,8168408930184216281] finished
> [2015-07-29 20:44:03,957] Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde 
> for range (6084602890817326921,6088328703025510057] finished
> [2015-07-29 20:44:03,957] Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde 
> for range (-781874602493000830,-781745173070807746] finished
> [2015-07-29 20:44:03,957] Repair command #4 finished
> error: nodetool failed, check server logs
> -- StackTrace --
> java.lang.RuntimeException: nodetool failed, check server logs
> at 
> org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:290)
> at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:202)
> {code}
> After running:
> {code}
> nodetool repair --partitioner-range --parallel --in-local-dc sync
> {code}
> Last records in logs regarding repair are:
> {code}
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 09ff9e40-3632-11e5-a93e-4963524a8bde for range 
> (-7695808664784761779,-7693529816291585568] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 17d8d860-3632-11e5-a93e-4963524a8bde for range 
> (806371695398849,8065203836608925992] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 23a811b0-3632-11e5-a93e-4963524a8bde for range 
> (-5474076923322749342,-5468600594078911162] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 336f8740-3632-11e5-a93e-4963524a8bde for range 
> (-8631877858109464676,-8624040066373718932] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde for range 
> (-5372806541854279315,-5369354119480076785] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 59f129f0-3632-11e5-a93e-4963524a8bde for range 
> (8166489034383821955,8168408930184216281] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde for range 
> (6084602890817326921,6088328703025510057] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde for range 
> (-781874602493000830,-781745173070807746] finished
> {code}
> but a bit above I see (at least two times in attached log):
> {code}
> ERROR [Thread-173887] 2015-07-29 20:44:03,853 StorageService.java:2959 - 
> Repair session 1b07ea50-3608-11e5-a93e-4963524a8bde for range 
> (5765414319217852786,5781018794516851576] failed with error 
> org.apache.cassandra.exceptions.RepairException: [repair 
> #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
> (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
> org.apache.cassandra.exceptions.RepairException: [repair 
> #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
> (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
> at java.util.concurrent.FutureTask.report(FutureTask.java:122) 
> [na:1.7.0_80]
> at 

[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException

2016-02-11 Thread Jean-Francois Gosselin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143307#comment-15143307
 ] 

Jean-Francois Gosselin commented on CASSANDRA-9935:
---

Here's a new one with no clear message from the exception:

{noformat}
INFO  [AntiEntropyStage:1] 2016-02-11 17:21:20,947 RepairSession.java:171 - 
[repair #d78e02b0-d0e3-11e5-a04a-4ffa10ef584b] Received merkle tree for bar 
from /10.53.10.30
ERROR [AntiEntropySessions:28] 2016-02-11 17:21:21,033 RepairSession.java:303 - 
[repair #d78e02b0-d0e3-11e5-a04a-4ffa10ef584b] session completed with the 
following error
org.apache.cassandra.exceptions.RepairException: [repair 
#d78e02b0-d0e3-11e5-a04a-4ffa10ef584b on foo/bar, 
(-5525881226490706160,-5525442713957813067]] Validation failed in /172.16.63.39
at 
org.apache.cassandra.repair.RepairSession.validationComplete(RepairSession.java:166)
 ~[apache-cassandra-2.1.9.jar:2.1.9]
at 
org.apache.cassandra.service.ActiveRepairService.handleMessage(ActiveRepairService.java:406)
 ~[apache-cassandra-2.1.9.jar:2.1.9]
at 
org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:134)
 ~[apache-cassandra-2.1.9.jar:2.1.9]
at 
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) 
~[apache-cassandra-2.1.9.jar:2.1.9]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
[na:1.7.0_65]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_65]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
ERROR [AntiEntropySessions:28] 2016-02-11 17:21:21,034 CassandraDaemon.java:223 
- Exception in thread Thread[AntiEntropySessions:28,5,RMI Runtime]
java.lang.RuntimeException: org.apache.cassandra.exceptions.RepairException: 
[repair #d78e02b0-d0e3-11e5-a04a-4ffa10ef584b on foo/bar, 
(-5525881226490706160,-5525442713957813067]] Validation failed in /172.16.63.39
at com.google.common.base.Throwables.propagate(Throwables.java:160) 
~[guava-16.0.jar:na]
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32) 
~[apache-cassandra-2.1.9.jar:2.1.9]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
~[na:1.7.0_65]
at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
~[na:1.7.0_65]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
~[na:1.7.0_65]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_65]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
Caused by: org.apache.cassandra.exceptions.RepairException: [repair 
#d78e02b0-d0e3-11e5-a04a-4ffa10ef584b on foo/bar, 
(-5525881226490706160,-5525442713957813067]] Validation failed in /172.16.63.39
at 
org.apache.cassandra.repair.RepairSession.validationComplete(RepairSession.java:166)
 ~[apache-cassandra-2.1.9.jar:2.1.9]
at 
org.apache.cassandra.service.ActiveRepairService.handleMessage(ActiveRepairService.java:406)
 ~[apache-cassandra-2.1.9.jar:2.1.9]
at 
org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:134)
 ~[apache-cassandra-2.1.9.jar:2.1.9]
at 
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) 
~[apache-cassandra-2.1.9.jar:2.1.9]
... 3 common frames omitted
ERROR [Thread-20728] 2016-02-11 17:21:21,034 StorageService.java:2966 - Repair 
session d78e02b0-d0e3-11e5-a04a-4ffa10ef584b for range 
(-5525881226490706160,-5525442713957813067] failed with error 
org.apache.cassandra.exceptions.RepairException: [repair 
#d78e02b0-d0e3-11e5-a04a-4ffa10ef584b on foo/bar, 
(-5525881226490706160,-5525442713957813067]] Validation failed in /172.16.63.39
java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
org.apache.cassandra.exceptions.RepairException: [repair 
#d78e02b0-d0e3-11e5-a04a-4ffa10ef584b on foo/bar, 
(-5525881226490706160,-5525442713957813067]] Validation failed in /172.16.63.39
at java.util.concurrent.FutureTask.report(FutureTask.java:122) 
[na:1.7.0_65]
at java.util.concurrent.FutureTask.get(FutureTask.java:188) 
[na:1.7.0_65]
at 
org.apache.cassandra.service.StorageService$4.runMayThrow(StorageService.java:2957)
 ~[apache-cassandra-2.1.9.jar:2.1.9]
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
[apache-cassandra-2.1.9.jar:2.1.9]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
[na:1.7.0_65]
at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
[na:1.7.0_65]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
Caused by: java.lang.RuntimeException: 
org.apache.cassandra.exceptions.RepairException: [repair 
#d78e02b0-d0e3-11e5-a04a-4ffa10ef584b on foo/bar, 

[jira] [Commented] (CASSANDRA-10769) "received out of order wrt DecoratedKey" after scrub

2016-02-10 Thread Jean-Francois Gosselin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15140965#comment-15140965
 ] 

Jean-Francois Gosselin commented on CASSANDRA-10769:


We are also seeing this issue in our multi-datacenter cluster (3 DCs), on C* 2.1.9 (and using LCS). We ran nodetool scrub on all the nodes, but the error keeps coming back.

How can we get into this state?

{noformat}
ERROR [ValidationExecutor:5884] 2016-02-03 09:27:41,703 Validator.java:245 - 
Failed creating a merkle tree for [repair #a8f3f040-ca58-11e5-9dda-130298de45de 
on keyspace1/xyz, (5126461213031423923,5128334161692376535]], /10.174.216.163 
(see log for details)
ERROR [ValidationExecutor:5884] 2016-02-03 09:27:41,704 
CassandraDaemon.java:223 - Exception in thread 
Thread[ValidationExecutor:5884,1,main]
java.lang.AssertionError: row DecoratedKey(5126475305931285312, 
00103cee13c2c0ea38328138fcad86515eef250233636565313363322d633065612d333833322d383133382d666361643836353135656566105cc950f02b6239f0bf9af60ac7dd452400)
 received out of order wrt DecoratedKey(5128167525973821686, 
00105fe2e7db8810387a9a2955a07ecfa7d3250235666532653764622d383831302d333837612d396132392d35356130376563666137643310f64b1c2b7d1c3ff893b70c24c5dbdc6b00)
at org.apache.cassandra.repair.Validator.add(Validator.java:126) 
~[apache-cassandra-2.1.9.jar:2.1.9]
at 
org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:1003)
 ~[apache-cassandra-2.1.9.jar:2.1.9]
at 
org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:94)
 ~[apache-cassandra-2.1.9.jar:2.1.9]
at 
org.apache.cassandra.db.compaction.CompactionManager$9.call(CompactionManager.java:615)
 ~[apache-cassandra-2.1.9.jar:2.1.9]
at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
~[na:1.7.0_65]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
~[na:1.7.0_65]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_65]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
{noformat}

> "received out of order wrt DecoratedKey" after scrub
> 
>
> Key: CASSANDRA-10769
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10769
> Project: Cassandra
>  Issue Type: Bug
> Environment: C* 2.1.11, Debian Wheezy
>Reporter: mlowicki
>
> After running scrub and cleanup on all nodes in single data center I'm 
> getting:
> {code}
> ERROR [ValidationExecutor:103] 2015-11-25 06:28:21,530 Validator.java:245 - 
> Failed creating a merkle tree for [repair 
> #89fa2b70-933d-11e5-b036-75bb514ae072 on sync/entity_by_id2, 
> (-5867793819051725444,-5865919628027816979]], /10.210.3.221 (see log for 
> details)
> ERROR [ValidationExecutor:103] 2015-11-25 06:28:21,531 
> CassandraDaemon.java:227 - Exception in thread 
> Thread[ValidationExecutor:103,1,main]
> java.lang.AssertionError: row DecoratedKey(-5867787467868737053, 
> 000932373633313036313204808800) received out of order wrt 
> DecoratedKey(-5865937851627253360, 000933313230313737333204c3c700)
> at org.apache.cassandra.repair.Validator.add(Validator.java:127) 
> ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:1010)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:94)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.compaction.CompactionManager$9.call(CompactionManager.java:622)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> ~[na:1.7.0_80]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  ~[na:1.7.0_80]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_80]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
> {code}
> What I did is to run repair on other node:
> {code}
> time nodetool repair --in-local-dc
> {code}
> Corresponding log on the node where repair has been started:
> {code}
> ERROR [AntiEntropySessions:414] 2015-11-25 06:28:21,533 
> RepairSession.java:303 - [repair #89fa2b70-933d-11e5-b036-75bb514ae072] 
> session completed with the following error
> org.apache.cassandra.exceptions.RepairException: [repair 
> #89fa2b70-933d-11e5-b036-75bb514ae072 on sync/entity_by_id2, 
> (-5867793819051725444,-5865919628027816979]] Validation failed in 
> /10.210.3.117
> at 
> org.apache.cassandra.repair.RepairSession.validationComplete(RepairSession.java:166)
>  

[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException

2016-02-10 Thread Jean-Francois Gosselin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15141245#comment-15141245
 ] 

Jean-Francois Gosselin commented on CASSANDRA-9935:
---

[~yukim] Should the WARN message appear in the C* log, or on the stdout of nodetool?

> Repair fails with RuntimeException
> --
>
> Key: CASSANDRA-9935
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9935
> Project: Cassandra
>  Issue Type: Bug
> Environment: C* 2.1.8, Debian Wheezy
>Reporter: mlowicki
>Assignee: Yuki Morishita
> Fix For: 2.1.x
>
> Attachments: db1.sync.lati.osa.cassandra.log, 
> db5.sync.lati.osa.cassandra.log, system.log.10.210.3.117, 
> system.log.10.210.3.221, system.log.10.210.3.230
>
>
> We had problems with slow repair in 2.1.7 (CASSANDRA-9702) but after upgrade 
> to 2.1.8 it started to work faster but now it fails with:
> {code}
> ...
> [2015-07-29 20:44:03,956] Repair session 23a811b0-3632-11e5-a93e-4963524a8bde 
> for range (-5474076923322749342,-5468600594078911162] finished
> [2015-07-29 20:44:03,957] Repair session 336f8740-3632-11e5-a93e-4963524a8bde 
> for range (-8631877858109464676,-8624040066373718932] finished
> [2015-07-29 20:44:03,957] Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde 
> for range (-5372806541854279315,-5369354119480076785] finished
> [2015-07-29 20:44:03,957] Repair session 59f129f0-3632-11e5-a93e-4963524a8bde 
> for range (8166489034383821955,8168408930184216281] finished
> [2015-07-29 20:44:03,957] Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde 
> for range (6084602890817326921,6088328703025510057] finished
> [2015-07-29 20:44:03,957] Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde 
> for range (-781874602493000830,-781745173070807746] finished
> [2015-07-29 20:44:03,957] Repair command #4 finished
> error: nodetool failed, check server logs
> -- StackTrace --
> java.lang.RuntimeException: nodetool failed, check server logs
> at 
> org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:290)
> at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:202)
> {code}
> After running:
> {code}
> nodetool repair --partitioner-range --parallel --in-local-dc sync
> {code}
> Last records in logs regarding repair are:
> {code}
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 09ff9e40-3632-11e5-a93e-4963524a8bde for range 
> (-7695808664784761779,-7693529816291585568] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 17d8d860-3632-11e5-a93e-4963524a8bde for range 
> (806371695398849,8065203836608925992] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 23a811b0-3632-11e5-a93e-4963524a8bde for range 
> (-5474076923322749342,-5468600594078911162] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 336f8740-3632-11e5-a93e-4963524a8bde for range 
> (-8631877858109464676,-8624040066373718932] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde for range 
> (-5372806541854279315,-5369354119480076785] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 59f129f0-3632-11e5-a93e-4963524a8bde for range 
> (8166489034383821955,8168408930184216281] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde for range 
> (6084602890817326921,6088328703025510057] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde for range 
> (-781874602493000830,-781745173070807746] finished
> {code}
> but a bit above I see (at least two times in attached log):
> {code}
> ERROR [Thread-173887] 2015-07-29 20:44:03,853 StorageService.java:2959 - 
> Repair session 1b07ea50-3608-11e5-a93e-4963524a8bde for range 
> (5765414319217852786,5781018794516851576] failed with error 
> org.apache.cassandra.exceptions.RepairException: [repair 
> #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
> (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
> org.apache.cassandra.exceptions.RepairException: [repair 
> #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
> (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
> at java.util.concurrent.FutureTask.report(FutureTask.java:122) 
> [na:1.7.0_80]
> at java.util.concurrent.FutureTask.get(FutureTask.java:188) 
> [na:1.7.0_80]
> at 
> 

[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException

2016-02-10 Thread Jean-Francois Gosselin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15141341#comment-15141341
 ] 

Jean-Francois Gosselin commented on CASSANDRA-9935:
---

No, we haven't seen this WARN. The only thing we haven't tried is a node 
restart (based on your comment above: "... The latter may be fixed by restarting 
the node."), although I'm not sure it will fix the problem since we've used 
C* 2.1.9 from the beginning.


> Repair fails with RuntimeException
> --
>
> Key: CASSANDRA-9935
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9935
> Project: Cassandra
>  Issue Type: Bug
> Environment: C* 2.1.8, Debian Wheezy
>Reporter: mlowicki
>Assignee: Yuki Morishita
> Fix For: 2.1.x
>
> Attachments: db1.sync.lati.osa.cassandra.log, 
> db5.sync.lati.osa.cassandra.log, system.log.10.210.3.117, 
> system.log.10.210.3.221, system.log.10.210.3.230
>
>
> We had problems with slow repair in 2.1.7 (CASSANDRA-9702); after upgrading 
> to 2.1.8 it started to work faster, but now it fails with:
> {code}
> ...
> [2015-07-29 20:44:03,956] Repair session 23a811b0-3632-11e5-a93e-4963524a8bde 
> for range (-5474076923322749342,-5468600594078911162] finished
> [2015-07-29 20:44:03,957] Repair session 336f8740-3632-11e5-a93e-4963524a8bde 
> for range (-8631877858109464676,-8624040066373718932] finished
> [2015-07-29 20:44:03,957] Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde 
> for range (-5372806541854279315,-5369354119480076785] finished
> [2015-07-29 20:44:03,957] Repair session 59f129f0-3632-11e5-a93e-4963524a8bde 
> for range (8166489034383821955,8168408930184216281] finished
> [2015-07-29 20:44:03,957] Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde 
> for range (6084602890817326921,6088328703025510057] finished
> [2015-07-29 20:44:03,957] Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde 
> for range (-781874602493000830,-781745173070807746] finished
> [2015-07-29 20:44:03,957] Repair command #4 finished
> error: nodetool failed, check server logs
> -- StackTrace --
> java.lang.RuntimeException: nodetool failed, check server logs
> at 
> org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:290)
> at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:202)
> {code}
> After running:
> {code}
> nodetool repair --partitioner-range --parallel --in-local-dc sync
> {code}
> Last records in logs regarding repair are:
> {code}
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 09ff9e40-3632-11e5-a93e-4963524a8bde for range 
> (-7695808664784761779,-7693529816291585568] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 17d8d860-3632-11e5-a93e-4963524a8bde for range 
> (806371695398849,8065203836608925992] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 23a811b0-3632-11e5-a93e-4963524a8bde for range 
> (-5474076923322749342,-5468600594078911162] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 336f8740-3632-11e5-a93e-4963524a8bde for range 
> (-8631877858109464676,-8624040066373718932] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde for range 
> (-5372806541854279315,-5369354119480076785] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 59f129f0-3632-11e5-a93e-4963524a8bde for range 
> (8166489034383821955,8168408930184216281] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde for range 
> (6084602890817326921,6088328703025510057] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde for range 
> (-781874602493000830,-781745173070807746] finished
> {code}
> but a bit above I see (at least two times in attached log):
> {code}
> ERROR [Thread-173887] 2015-07-29 20:44:03,853 StorageService.java:2959 - 
> Repair session 1b07ea50-3608-11e5-a93e-4963524a8bde for range 
> (5765414319217852786,5781018794516851576] failed with error 
> org.apache.cassandra.exceptions.RepairException: [repair 
> #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
> (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
> org.apache.cassandra.exceptions.RepairException: [repair 
> #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
> (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
> at 

[jira] [Commented] (CASSANDRA-9625) GraphiteReporter not reporting

2016-02-09 Thread Jean-Francois Gosselin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15139500#comment-15139500
 ] 

Jean-Francois Gosselin commented on CASSANDRA-9625:
---

It's not fixed. I ended up adding a catch for the AssertionError in the 
GraphiteReporter as a workaround.
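
For illustration, here is a library-agnostic sketch of that workaround (the names 
here are hypothetical, not the actual GraphiteReporter or yammer-metrics code): each 
gauge read is wrapped in a try/catch so a single AssertionError skips that metric 
instead of killing the reporting thread.

{code:title=SafeGaugeRead.java|borderStyle=solid}
public class SafeGaugeRead
{
    // Hypothetical stand-in for a metrics gauge; not the yammer-metrics API.
    interface GaugeValue<T> { T value(); }

    // Evaluate a gauge defensively: on AssertionError, skip it and return null.
    static <T> T readOrNull(GaugeValue<T> gauge)
    {
        try
        {
            return gauge.value();
        }
        catch (AssertionError e)
        {
            // e.g. "Memory was freed" from a released off-heap buffer: log and move on.
            System.err.println("Skipping gauge: " + e.getMessage());
            return null;
        }
    }

    public static void main(String[] args)
    {
        Long v = readOrNull(new GaugeValue<Long>()
        {
            public Long value() { throw new AssertionError("Memory was freed"); }
        });
        System.out.println("gauge value: " + v); // prints "gauge value: null" instead of dying
    }
}
{code}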

> GraphiteReporter not reporting
> --
>
> Key: CASSANDRA-9625
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9625
> Project: Cassandra
>  Issue Type: Bug
> Environment: Debian Jessie, 7u79-2.5.5-1~deb8u1, Cassandra 2.1.3
>Reporter: Eric Evans
> Attachments: metrics.yaml, thread-dump.log
>
>
> When upgrading from 2.1.3 to 2.1.6, the Graphite metrics reporter stops 
> working.  The usual startup is logged, and one batch of samples is sent, but 
> the reporting interval comes and goes, and no other samples are ever sent.  
> The logs are free from errors.
> Frustratingly, metrics reporting works in our smaller (staging) environment 
> on 2.1.6; we are able to reproduce this on all 6 of our production nodes, but not 
> on a 3-node (otherwise identical) staging cluster (maybe it takes a certain 
> level of concurrency?).
> Attached is a thread dump, and our metrics.yaml.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException

2016-02-05 Thread Jean-Francois Gosselin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15134437#comment-15134437
 ] 

Jean-Francois Gosselin commented on CASSANDRA-9935:
---

[~yukim] We are also seeing this issue in our multi-datacenter cluster (3 
DCs), on C* 2.1.9 (and using LCS). We ran nodetool scrub on all the nodes (an 
example invocation is sketched after the log below), but the error keeps 
coming back.

We did have some network glitches; as [~mlowicki] was saying, could this be 
related to network issues?

{noformat}
ERROR [ValidationExecutor:5884] 2016-02-03 09:27:41,703 Validator.java:245 - 
Failed creating a merkle tree for [repair #a8f3f040-ca58-11e5-9dda-130298de45de 
on keyspace1/xyz, (5126461213031423923,5128334161692376535]], /10.174.216.163 
(see log for details)
ERROR [ValidationExecutor:5884] 2016-02-03 09:27:41,704 
CassandraDaemon.java:223 - Exception in thread 
Thread[ValidationExecutor:5884,1,main]
java.lang.AssertionError: row DecoratedKey(5126475305931285312, 
00103cee13c2c0ea38328138fcad86515eef250233636565313363322d633065612d333833322d383133382d666361643836353135656566105cc950f02b6239f0bf9af60ac7dd452400)
 received out of order wrt DecoratedKey(5128167525973821686, 
00105fe2e7db8810387a9a2955a07ecfa7d3250235666532653764622d383831302d333837612d396132392d35356130376563666137643310f64b1c2b7d1c3ff893b70c24c5dbdc6b00)
at org.apache.cassandra.repair.Validator.add(Validator.java:126) 
~[apache-cassandra-2.1.9.jar:2.1.9]
at 
org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:1003)
 ~[apache-cassandra-2.1.9.jar:2.1.9]
at 
org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:94)
 ~[apache-cassandra-2.1.9.jar:2.1.9]
at 
org.apache.cassandra.db.compaction.CompactionManager$9.call(CompactionManager.java:615)
 ~[apache-cassandra-2.1.9.jar:2.1.9]
at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
~[na:1.7.0_65]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
~[na:1.7.0_65]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_65]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
{noformat}
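
For reference, a hedged example of the kind of scrub invocation mentioned above 
(the keyspace/table names are taken from the validation error in the log; whether 
these exact flags were used is an assumption):

{code}
# Run on each node; keyspace1/xyz is the table named in the validation error.
nodetool scrub keyspace1 xyz

# 2.1 also accepts -s / --skip-corrupted if corrupt rows abort the scrub.
nodetool scrub --skip-corrupted keyspace1 xyz
{code}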


> Repair fails with RuntimeException
> --
>
> Key: CASSANDRA-9935
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9935
> Project: Cassandra
>  Issue Type: Bug
> Environment: C* 2.1.8, Debian Wheezy
>Reporter: mlowicki
>Assignee: Yuki Morishita
> Fix For: 2.1.x
>
> Attachments: db1.sync.lati.osa.cassandra.log, 
> db5.sync.lati.osa.cassandra.log, system.log.10.210.3.117, 
> system.log.10.210.3.221, system.log.10.210.3.230
>
>
> We had problems with slow repair in 2.1.7 (CASSANDRA-9702); after upgrading 
> to 2.1.8 it started to work faster, but now it fails with:
> {code}
> ...
> [2015-07-29 20:44:03,956] Repair session 23a811b0-3632-11e5-a93e-4963524a8bde 
> for range (-5474076923322749342,-5468600594078911162] finished
> [2015-07-29 20:44:03,957] Repair session 336f8740-3632-11e5-a93e-4963524a8bde 
> for range (-8631877858109464676,-8624040066373718932] finished
> [2015-07-29 20:44:03,957] Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde 
> for range (-5372806541854279315,-5369354119480076785] finished
> [2015-07-29 20:44:03,957] Repair session 59f129f0-3632-11e5-a93e-4963524a8bde 
> for range (8166489034383821955,8168408930184216281] finished
> [2015-07-29 20:44:03,957] Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde 
> for range (6084602890817326921,6088328703025510057] finished
> [2015-07-29 20:44:03,957] Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde 
> for range (-781874602493000830,-781745173070807746] finished
> [2015-07-29 20:44:03,957] Repair command #4 finished
> error: nodetool failed, check server logs
> -- StackTrace --
> java.lang.RuntimeException: nodetool failed, check server logs
> at 
> org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:290)
> at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:202)
> {code}
> After running:
> {code}
> nodetool repair --partitioner-range --parallel --in-local-dc sync
> {code}
> Last records in logs regarding repair are:
> {code}
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 09ff9e40-3632-11e5-a93e-4963524a8bde for range 
> (-7695808664784761779,-7693529816291585568] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 17d8d860-3632-11e5-a93e-4963524a8bde for range 
> (806371695398849,8065203836608925992] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 23a811b0-3632-11e5-a93e-4963524a8bde for range 
> 

[jira] [Commented] (CASSANDRA-10502) Cassandra query degradation with high frequency updated tables

2016-01-06 Thread Jean-Francois Gosselin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086025#comment-15086025
 ] 

Jean-Francois Gosselin commented on CASSANDRA-10502:


[~thobbs] Have you tried to dump the data for this key with sstable2json?
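
For reference, a hedged sketch of the kind of sstable2json invocation meant here 
(the sstable path, key and output file are placeholders, not values from this 
ticket; the key encoding accepted by -k depends on the partition key type):

{code}
# List the partition keys present in the sstable.
sstable2json /var/lib/cassandra/data/ks/tbl/ks-tbl-ka-1-Data.db -e

# Dump only the suspect key to JSON for inspection.
sstable2json /var/lib/cassandra/data/ks/tbl/ks-tbl-ka-1-Data.db -k <key> > key-dump.json
{code}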

> Cassandra query degradation with high frequency updated tables
> --
>
> Key: CASSANDRA-10502
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10502
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Dodong Juan
>Priority: Minor
>  Labels: perfomance, query, triage
> Fix For: 2.2.x
>
>
> Hi,
> So we are developing a system that computes a profile of the things that it 
> observes. The observations come in the form of events. Each thing that it 
> observes has an id, and each thing has a set of subthings in it, each with a 
> measurement of some kind. Roughly there are about 500 subthings within each 
> thing. We receive events containing measurements of these 500 subthings every 
> 10 seconds or so.
> So as we receive events, we read the old profile value, calculate the new 
> profile based on the new value, and save it back.
> One of the things we observe is the set of processes running on the server.
> We use the following schema to hold the profile. 
> {noformat}
> CREATE TABLE processinfometric_profile (
> profilecontext text,
> id text,
> month text,
> day text,
> hour text,
> minute text,
> command text,
> cpu map,
> majorfaults map,
> minorfaults map,
> nice map,
> pagefaults map,
> pid map,
> ppid map,
> priority map,
> resident map,
> rss map,
> sharesize map,
> size map,
> starttime map,
> state map,
> threads map,
> user map,
> vsize map,
> PRIMARY KEY ((profilecontext, id, month, day, hour, minute), command)
> ) WITH CLUSTERING ORDER BY (command ASC)
> AND bloom_filter_fp_chance = 0.1
> AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
> AND comment = ''
> AND compaction = {'class': 
> 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'}
> AND compression = {'sstable_compression': 
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
> AND dclocal_read_repair_chance = 0.1
> AND default_time_to_live = 0
> AND gc_grace_seconds = 864000
> AND max_index_interval = 2048
> AND memtable_flush_period_in_ms = 0
> AND min_index_interval = 128
> AND read_repair_chance = 0.0
> AND speculative_retry = '99.0PERCENTILE';
> {noformat}
> This profile will then be used for certain analytics, either in the context of 
> the 'thing' or in the context of a specific thing and subthing.
> A profile can be defined as monthly, daily, or hourly. So in the case of monthly, 
> the month will be set to the current month (i.e. 'Oct') and the day and hour will 
> be set to the empty '' string.
> The problem that we have observed is that over time (actually in just a 
> matter of hours) we see a huge degradation in query response time for the 
> monthly profile. At the start it responds in 10-100 ms, and after a couple of 
> hours it goes to 2000-3000 ms. If you leave it for a couple of days you start 
> experiencing read timeouts. The query is basically just:
> {noformat}
> select * from myprofile where id='1' and month='Oct' and day='' and hour='' 
> and minute=''
> {noformat}
> This will have only about 500 rows or so.
> We were using Cassandra 2.2.1, but upgraded to 2.2.2 to see if it fixed the 
> issue, to no avail. And since this is a test, we are running on a single node.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9625) GraphiteReporter not reporting

2015-10-05 Thread Jean-Francois Gosselin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14943468#comment-14943468
 ] 

Jean-Francois Gosselin commented on CASSANDRA-9625:
---

I ran into another assert that breaks the GraphiteReporter on 2.1.9. When 
SSTableReader.getApproximateKeyCount is called, how can I get into a state 
where the CompactionMetadata is null?

{code:title=SSTableReader.java|borderStyle=solid}
276try
278{
279CompactionMetadata metadata = (CompactionMetadata) 
sstable.descriptor.getMetadataSerializer().deserialize(sstable.descriptor, 
MetadataType.COMPACTION);
280assert metadata != null : sstable.getFilename();
281if (cardinality == null)
{code}

{noformat}
at 
org.apache.cassandra.io.sstable.SSTableReader.getApproximateKeyCount(SSTableReader.java:279)
 
at 
org.apache.cassandra.metrics.ColumnFamilyMetrics$9.value(ColumnFamilyMetrics.java:296)
 
at 
org.apache.cassandra.metrics.ColumnFamilyMetrics$9.value(ColumnFamilyMetrics.java:290)
 
at 
com.yammer.metrics.reporting.GraphiteReporter.processGauge(GraphiteReporter.java:292)
 
at 
com.yammer.metrics.reporting.GraphiteReporter.processGauge(GraphiteReporter.java:27)
 
at com.yammer.metrics.core.Gauge.processWith(Gauge.java:28) 
at 
com.yammer.metrics.reporting.GraphiteReporter.printRegularMetrics(GraphiteReporter.java:235)
 
at 
com.yammer.metrics.reporting.GraphiteReporter.run(GraphiteReporter.java:199) 
{noformat}




> GraphiteReporter not reporting
> --
>
> Key: CASSANDRA-9625
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9625
> Project: Cassandra
>  Issue Type: Bug
> Environment: Debian Jessie, 7u79-2.5.5-1~deb8u1, Cassandra 2.1.3
>Reporter: Eric Evans
> Attachments: metrics.yaml, thread-dump.log
>
>
> When upgrading from 2.1.3 to 2.1.6, the Graphite metrics reporter stops 
> working.  The usual startup is logged, and one batch of samples is sent, but 
> the reporting interval comes and goes, and no other samples are ever sent.  
> The logs are free from errors.
> Frustratingly, metrics reporting works in our smaller (staging) environment 
> on 2.1.6; we are able to reproduce this on all 6 of our production nodes, but not 
> on a 3-node (otherwise identical) staging cluster (maybe it takes a certain 
> level of concurrency?).
> Attached is a thread dump, and our metrics.yaml.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-9625) GraphiteReporter not reporting

2015-10-05 Thread Jean-Francois Gosselin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14943468#comment-14943468
 ] 

Jean-Francois Gosselin edited comment on CASSANDRA-9625 at 10/5/15 2:51 PM:


I ran into another assert that breaks the GraphiteReporter on 2.1.9. When 
SSTableReader.getApproximateKeyCount is called, how can I get into a state 
where the CompactionMetadata is null?

{code:title=SSTableReader.java|borderStyle=solid}
276  try
278  {
279 CompactionMetadata metadata = (CompactionMetadata) 
sstable.descriptor.getMetadataSerializer().deserialize(sstable.descriptor, 
MetadataType.COMPACTION);
280 assert metadata != null : sstable.getFilename();
281 if (cardinality == null)
{code}

{noformat}
at 
org.apache.cassandra.io.sstable.SSTableReader.getApproximateKeyCount(SSTableReader.java:279)
 
at 
org.apache.cassandra.metrics.ColumnFamilyMetrics$9.value(ColumnFamilyMetrics.java:296)
 
at 
org.apache.cassandra.metrics.ColumnFamilyMetrics$9.value(ColumnFamilyMetrics.java:290)
 
at 
com.yammer.metrics.reporting.GraphiteReporter.processGauge(GraphiteReporter.java:292)
 
at 
com.yammer.metrics.reporting.GraphiteReporter.processGauge(GraphiteReporter.java:27)
 
at com.yammer.metrics.core.Gauge.processWith(Gauge.java:28) 
at 
com.yammer.metrics.reporting.GraphiteReporter.printRegularMetrics(GraphiteReporter.java:235)
 
at 
com.yammer.metrics.reporting.GraphiteReporter.run(GraphiteReporter.java:199) 
{noformat}





was (Author: jfgosselin):
I ran into another assert that breaks the GraphiteReporter on 2.1.9 . When 
SSTableReader.getApproximateKeyCount is called, how can I get in a state where 
the CompactionMetadata is null ?

{code:title=SSTableReader.java|borderStyle=solid}
276try
278{
279CompactionMetadata metadata = (CompactionMetadata) 
sstable.descriptor.getMetadataSerializer().deserialize(sstable.descriptor, 
MetadataType.COMPACTION);
280assert metadata != null : sstable.getFilename();
281if (cardinality == null)
{code}

{noformat}
at 
org.apache.cassandra.io.sstable.SSTableReader.getApproximateKeyCount(SSTableReader.java:279)
 
at 
org.apache.cassandra.metrics.ColumnFamilyMetrics$9.value(ColumnFamilyMetrics.java:296)
 
at 
org.apache.cassandra.metrics.ColumnFamilyMetrics$9.value(ColumnFamilyMetrics.java:290)
 
at 
com.yammer.metrics.reporting.GraphiteReporter.processGauge(GraphiteReporter.java:292)
 
at 
com.yammer.metrics.reporting.GraphiteReporter.processGauge(GraphiteReporter.java:27)
 
at com.yammer.metrics.core.Gauge.processWith(Gauge.java:28) 
at 
com.yammer.metrics.reporting.GraphiteReporter.printRegularMetrics(GraphiteReporter.java:235)
 
at 
com.yammer.metrics.reporting.GraphiteReporter.run(GraphiteReporter.java:199) 
{noformat}




> GraphiteReporter not reporting
> --
>
> Key: CASSANDRA-9625
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9625
> Project: Cassandra
>  Issue Type: Bug
> Environment: Debian Jessie, 7u79-2.5.5-1~deb8u1, Cassandra 2.1.3
>Reporter: Eric Evans
> Attachments: metrics.yaml, thread-dump.log
>
>
> When upgrading from 2.1.3 to 2.1.6, the Graphite metrics reporter stops 
> working.  The usual startup is logged, and one batch of samples is sent, but 
> the reporting interval comes and goes, and no other samples are ever sent.  
> The logs are free from errors.
> Frustratingly, metrics reporting works in our smaller (staging) environment 
> on 2.1.6; we are able to reproduce this on all 6 of our production nodes, but not 
> on a 3-node (otherwise identical) staging cluster (maybe it takes a certain 
> level of concurrency?).
> Attached is a thread dump, and our metrics.yaml.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9625) GraphiteReporter not reporting

2015-08-26 Thread Jean-Francois Gosselin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14715194#comment-14715194
 ] 

Jean-Francois Gosselin commented on CASSANDRA-9625:
---

[~benedict] Can this issue be reopened?

 GraphiteReporter not reporting
 --

 Key: CASSANDRA-9625
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9625
 Project: Cassandra
  Issue Type: Bug
 Environment: Debian Jessie, 7u79-2.5.5-1~deb8u1, Cassandra 2.1.3
Reporter: Eric Evans
Assignee: T Jake Luciani
 Attachments: metrics.yaml, thread-dump.log


 When upgrading from 2.1.3 to 2.1.6, the Graphite metrics reporter stops 
 working.  The usual startup is logged, and one batch of samples is sent, but 
 the reporting interval comes and goes, and no other samples are ever sent.  
 The logs are free from errors.
 Frustratingly, metrics reporting works in our smaller (staging) environment 
 on 2.1.6; we are able to reproduce this on all 6 of our production nodes, but not 
 on a 3-node (otherwise identical) staging cluster (maybe it takes a certain 
 level of concurrency?).
 Attached is a thread dump, and our metrics.yaml.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9625) GraphiteReporter not reporting

2015-08-25 Thread Jean-Francois Gosselin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14711887#comment-14711887
 ] 

Jean-Francois Gosselin commented on CASSANDRA-9625:
---

[~tjake] I think that I've found the issue. When the Gauge metric for 
CompressionMetadataOffHeapMemoryUsed is called, the following method is called 
in org.apache.cassandra.io.util.Memory:

{code:title=org.apache.cassandra.io.util.Memory.java|borderStyle=solid}
public long size()
{
assert peer != 0;
return size;
}
{code}
and for some reason peer was 0. After the AssertionError, the Graphite metrics 
reporter thread is no longer executed.
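
If the reporter runs its sampling task on a ScheduledExecutorService (an assumption 
about the scheduling mechanism, not something confirmed from the Cassandra or 
yammer-metrics source), that would explain the silence: a periodic task that throws 
has all subsequent executions suppressed, with nothing written to the logs. A 
minimal, plain-JDK sketch of the effect:

{code:title=ScheduledTaskErrorDemo.java|borderStyle=solid}
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class ScheduledTaskErrorDemo
{
    public static void main(String[] args) throws InterruptedException
    {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(new Runnable()
        {
            public void run()
            {
                System.out.println("reporting metrics...");
                // Simulates Memory.size() tripping "assert peer != 0" on a freed buffer.
                throw new AssertionError("Memory was freed");
            }
        }, 0, 1, TimeUnit.SECONDS);

        // Only one "reporting metrics..." line is printed; the AssertionError cancels
        // every later run, and the executor itself logs nothing about it.
        Thread.sleep(5000);
        scheduler.shutdownNow();
    }
}
{code}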

 GraphiteReporter not reporting
 --

 Key: CASSANDRA-9625
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9625
 Project: Cassandra
  Issue Type: Bug
 Environment: Debian Jessie, 7u79-2.5.5-1~deb8u1, Cassandra 2.1.3
Reporter: Eric Evans
Assignee: T Jake Luciani
 Attachments: metrics.yaml, thread-dump.log


 When upgrading from 2.1.3 to 2.1.6, the Graphite metrics reporter stops 
 working.  The usual startup is logged, and one batch of samples is sent, but 
 the reporting interval comes and goes, and no other samples are ever sent.  
 The logs are free from errors.
 Frustratingly, metrics reporting works in our smaller (staging) environment 
 on 2.1.6; we are able to reproduce this on all 6 of our production nodes, but not 
 on a 3-node (otherwise identical) staging cluster (maybe it takes a certain 
 level of concurrency?).
 Attached is a thread dump, and our metrics.yaml.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9625) GraphiteReporter not reporting

2015-08-13 Thread Jean-Francois Gosselin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14695853#comment-14695853
 ] 

Jean-Francois Gosselin commented on CASSANDRA-9625:
---

[~tjake] I can easily reproduce the issue after ~12h; com.yammer.metrics.reporting 
at DEBUG didn't provide anything. Any specific places where I should add traces 
in GraphiteReporter?

 GraphiteReporter not reporting
 --

 Key: CASSANDRA-9625
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9625
 Project: Cassandra
  Issue Type: Bug
 Environment: Debian Jessie, 7u79-2.5.5-1~deb8u1, Cassandra 2.1.3
Reporter: Eric Evans
Assignee: T Jake Luciani
 Attachments: metrics.yaml, thread-dump.log


 When upgrading from 2.1.3 to 2.1.6, the Graphite metrics reporter stops 
 working.  The usual startup is logged, and one batch of samples is sent, but 
 the reporting interval comes and goes, and no other samples are ever sent.  
 The logs are free from errors.
 Frustratingly, metrics reporting works in our smaller (staging) environment 
 on 2.1.6; we are able to reproduce this on all 6 of our production nodes, but not 
 on a 3-node (otherwise identical) staging cluster (maybe it takes a certain 
 level of concurrency?).
 Attached is a thread dump, and our metrics.yaml.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9625) GraphiteReporter not reporting

2015-08-11 Thread Jean-Francois Gosselin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14682117#comment-14682117
 ] 

Jean-Francois Gosselin commented on CASSANDRA-9625:
---

We are seeing this issue on 2.1.8.

 GraphiteReporter not reporting
 --

 Key: CASSANDRA-9625
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9625
 Project: Cassandra
  Issue Type: Bug
 Environment: Debian Jessie, 7u79-2.5.5-1~deb8u1, Cassandra 2.1.3
Reporter: Eric Evans
Assignee: T Jake Luciani
 Fix For: 2.1.x

 Attachments: metrics.yaml, thread-dump.log


 When upgrading from 2.1.3 to 2.1.6, the Graphite metrics reporter stops 
 working.  The usual startup is logged, and one batch of samples is sent, but 
 the reporting interval comes and goes, and no other samples are ever sent.  
 The logs are free from errors.
 Frustratingly, metrics reporting works in our smaller (staging) environment 
 on 2.1.6; we are able to reproduce this on all 6 of our production nodes, but not 
 on a 3-node (otherwise identical) staging cluster (maybe it takes a certain 
 level of concurrency?).
 Attached is a thread dump, and our metrics.yaml.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)