[jira] [Commented] (CASSANDRA-10992) Hanging streaming sessions

2016-06-24 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15348394#comment-15348394
 ] 

mlowicki commented on CASSANDRA-10992:
--

We've been using C* 2.1.14 for a couple of weeks now and no hanging streaming 
sessions so far.
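
For anyone double-checking after such an upgrade, a quick way to confirm nothing is 
still listed as streaming (a sketch; it just counts the per-session summary lines 
that {{netstats}} prints):
{code}
# 0 means no streaming sessions are currently listed on this node
nodetool netstats -H | grep -c -E 'Receiving [0-9]+ files|Sending [0-9]+ files'
{code}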

> Hanging streaming sessions
> --
>
> Key: CASSANDRA-10992
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10992
> Project: Cassandra
>  Issue Type: Bug
> Environment: C* 2.1.12, Debian Wheezy
>Reporter: mlowicki
>Assignee: Paulo Motta
> Fix For: 2.1.12
>
> Attachments: apache-cassandra-2.1.12-SNAPSHOT.jar, db1.ams.jstack, 
> db6.analytics.jstack
>
>
> I've recently started running repair using [Cassandra 
> Reaper|https://github.com/spotify/cassandra-reaper] (built-in {{nodetool 
> repair}} doesn't work for me - CASSANDRA-9935). It behaves fine, but I've 
> noticed hanging streaming sessions:
> {code}
> root@db1:~# date
> Sat Jan  9 16:43:00 UTC 2016
> root@db1:~# nt netstats -H | grep total
> Receiving 5 files, 46.59 MB total. Already received 1 files, 11.32 MB 
> total
> Sending 7 files, 46.28 MB total. Already sent 7 files, 46.28 MB total
> Receiving 6 files, 64.15 MB total. Already received 1 files, 12.14 MB 
> total
> Sending 5 files, 61.15 MB total. Already sent 5 files, 61.15 MB total
> Receiving 4 files, 7.75 MB total. Already received 3 files, 7.58 MB 
> total
> Sending 4 files, 4.29 MB total. Already sent 4 files, 4.29 MB total
> Receiving 12 files, 13.79 MB total. Already received 11 files, 7.66 
> MB total
> Sending 5 files, 15.32 MB total. Already sent 5 files, 15.32 MB total
> Receiving 8 files, 20.35 MB total. Already received 1 files, 13.63 MB 
> total
> Sending 38 files, 125.34 MB total. Already sent 38 files, 125.34 MB 
> total
> root@db1:~# date
> Sat Jan  9 17:45:42 UTC 2016
> root@db1:~# nt netstats -H | grep total
> Receiving 5 files, 46.59 MB total. Already received 1 files, 11.32 MB 
> total
> Sending 7 files, 46.28 MB total. Already sent 7 files, 46.28 MB total
> Receiving 6 files, 64.15 MB total. Already received 1 files, 12.14 MB 
> total
> Sending 5 files, 61.15 MB total. Already sent 5 files, 61.15 MB total
> Receiving 4 files, 7.75 MB total. Already received 3 files, 7.58 MB 
> total
> Sending 4 files, 4.29 MB total. Already sent 4 files, 4.29 MB total
> Receiving 12 files, 13.79 MB total. Already received 11 files, 7.66 
> MB total
> Sending 5 files, 15.32 MB total. Already sent 5 files, 15.32 MB total
> Receiving 8 files, 20.35 MB total. Already received 1 files, 13.63 MB 
> total
> Sending 38 files, 125.34 MB total. Already sent 38 files, 125.34 MB 
> total
> {code}
> Such sessions are left even when the repair job has long since finished 
> (confirmed by checking Reaper's and Cassandra's logs). 
> {{streaming_socket_timeout_in_ms}} in cassandra.yaml is set to the default 
> value (360).
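
A hedged sketch of automating the check quoted above (two {{nodetool netstats}} 
snapshots taken an hour apart and diffed; paths and interval are arbitrary):
{code}
# identical snapshots an hour apart mean the listed sessions made no progress
nodetool netstats -H | grep total > /tmp/netstats.1
sleep 3600
nodetool netstats -H | grep total > /tmp/netstats.2
diff /tmp/netstats.1 /tmp/netstats.2 && echo "no progress in the last hour"
{code}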





[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException

2016-06-05 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15315855#comment-15315855
 ] 

mlowicki commented on CASSANDRA-9935:
-

[~pauloricardomg] any ETA for the 2.1.15 release?

> Repair fails with RuntimeException
> --
>
> Key: CASSANDRA-9935
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9935
> Project: Cassandra
>  Issue Type: Bug
> Environment: C* 2.1.8, Debian Wheezy
>Reporter: mlowicki
>Assignee: Paulo Motta
> Fix For: 2.1.15, 3.6, 3.0.6, 2.2.7
>
> Attachments: 9935.patch, db1.sync.lati.osa.cassandra.log, 
> db5.sync.lati.osa.cassandra.log, system.log.10.210.3.117, 
> system.log.10.210.3.221, system.log.10.210.3.230
>
>
> We had problems with slow repair in 2.1.7 (CASSANDRA-9702); after upgrading 
> to 2.1.8 it started to work faster, but now it fails with:
> {code}
> ...
> [2015-07-29 20:44:03,956] Repair session 23a811b0-3632-11e5-a93e-4963524a8bde 
> for range (-5474076923322749342,-5468600594078911162] finished
> [2015-07-29 20:44:03,957] Repair session 336f8740-3632-11e5-a93e-4963524a8bde 
> for range (-8631877858109464676,-8624040066373718932] finished
> [2015-07-29 20:44:03,957] Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde 
> for range (-5372806541854279315,-5369354119480076785] finished
> [2015-07-29 20:44:03,957] Repair session 59f129f0-3632-11e5-a93e-4963524a8bde 
> for range (8166489034383821955,8168408930184216281] finished
> [2015-07-29 20:44:03,957] Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde 
> for range (6084602890817326921,6088328703025510057] finished
> [2015-07-29 20:44:03,957] Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde 
> for range (-781874602493000830,-781745173070807746] finished
> [2015-07-29 20:44:03,957] Repair command #4 finished
> error: nodetool failed, check server logs
> -- StackTrace --
> java.lang.RuntimeException: nodetool failed, check server logs
> at 
> org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:290)
> at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:202)
> {code}
> After running:
> {code}
> nodetool repair --partitioner-range --parallel --in-local-dc sync
> {code}
> Last records in logs regarding repair are:
> {code}
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 09ff9e40-3632-11e5-a93e-4963524a8bde for range 
> (-7695808664784761779,-7693529816291585568] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 17d8d860-3632-11e5-a93e-4963524a8bde for range 
> (806371695398849,8065203836608925992] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 23a811b0-3632-11e5-a93e-4963524a8bde for range 
> (-5474076923322749342,-5468600594078911162] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 336f8740-3632-11e5-a93e-4963524a8bde for range 
> (-8631877858109464676,-8624040066373718932] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde for range 
> (-5372806541854279315,-5369354119480076785] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 59f129f0-3632-11e5-a93e-4963524a8bde for range 
> (8166489034383821955,8168408930184216281] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde for range 
> (6084602890817326921,6088328703025510057] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde for range 
> (-781874602493000830,-781745173070807746] finished
> {code}
> but a bit above that I see (at least twice in the attached log):
> {code}
> ERROR [Thread-173887] 2015-07-29 20:44:03,853 StorageService.java:2959 - 
> Repair session 1b07ea50-3608-11e5-a93e-4963524a8bde for range 
> (5765414319217852786,5781018794516851576] failed with error 
> org.apache.cassandra.exceptions.RepairException: [repair 
> #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
> (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
> org.apache.cassandra.exceptions.RepairException: [repair 
> #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
> (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
> at java.util.concurrent.FutureTask.report(FutureTask.java:122) 
> [na:1.7.0_80]
> at java.util.concurrent.FutureTask.get(FutureTask.java:188) 
> [na:1.7.0_80]
> at 
> 

[jira] [Commented] (CASSANDRA-10992) Hanging streaming sessions

2016-05-27 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15303849#comment-15303849
 ] 

mlowicki commented on CASSANDRA-10992:
--

[~pauloricardomg] yes:
{code}
WARN  [Thread-154755] 2016-05-26 17:36:20,625 CompressedInputStream.java:190 - 
Error while reading compressed input stream.
WARN  [STREAM-IN-/10.210.59.151] 2016-05-26 17:36:20,625 
CompressedStreamReader.java:115 - [Stream 14d2bb50-2366-11e6-aff3-094ba808857e] 
Error while reading partition DecoratedKey(-8649238600224809230, 
000933303034383932393204934600) from stream on ks='sync' and 
table='entity_by_id2'.
WARN  [Thread-156292] 2016-05-26 19:52:29,073 CompressedInputStream.java:190 - 
Error while reading compressed input stream.
WARN  [STREAM-IN-/10.210.59.84] 2016-05-26 19:52:29,073 
CompressedStreamReader.java:115 - [Stream 040b4041-2379-11e6-a363-41a0407f7ce6] 
Error while reading partition DecoratedKey(-3970687134714418221, 
000933303533393631373204000276d800) from stream on ks='sync' and 
table='entity_by_id2'.
WARN  [Thread-157643] 2016-05-26 23:17:09,393 CompressedInputStream.java:190 - 
Error while reading compressed input stream.
WARN  [STREAM-IN-/10.210.59.86] 2016-05-26 23:17:09,393 
CompressedStreamReader.java:115 - [Stream 97753900-2395-11e6-b5a2-b9dde4344a60] 
Error while reading partition DecoratedKey(2694075662350043685, 
00093238313135323204808800) from stream on ks='sync' and 
table='entity_by_id2'.
{code}
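
A rough way to pull the affected stream sessions out of the logs (assuming the 
default Debian package log location {{/var/log/cassandra/system.log}}):
{code}
# count occurrences per stream id that hit read errors while streaming
grep -E 'Error while reading (compressed input stream|partition)' /var/log/cassandra/system.log \
  | grep -o 'Stream [0-9a-f-]*' | sort | uniq -c
{code}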






[jira] [Updated] (CASSANDRA-10992) Hanging streaming sessions

2016-05-18 Thread mlowicki (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mlowicki updated CASSANDRA-10992:
-
Attachment: db6.analytics.jstack
db1.ams.jstack






[jira] [Commented] (CASSANDRA-10992) Hanging streaming sessions

2016-05-18 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15288768#comment-15288768
 ] 

mlowicki commented on CASSANDRA-10992:
--

We have 3 datacenters (ams, lati and analytics, which is a virtual datacenter on 
OpenStack). In OpsCenter's list of active streams I've observed that in each pair 
one node is always from OpenStack (the analytics cluster), but since I've restarted 
all analytics nodes and there are still lots of hanging sessions, it's not purely 
related to them.

Attaching jstack output from two nodes.

Also, I've doubled the timeout (to 2 hours) and will soon start a new repair run.
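
For reference, the timeout change boils down to one line in cassandra.yaml plus a 
node restart (a sketch; the path is the Debian package default and the sed assumes 
the setting is present and uncommented):
{code}
# show the current value, then raise it to 2 hours (7200000 ms)
grep streaming_socket_timeout_in_ms /etc/cassandra/cassandra.yaml
sudo sed -i 's/^streaming_socket_timeout_in_ms:.*/streaming_socket_timeout_in_ms: 7200000/' /etc/cassandra/cassandra.yaml
{code}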






[jira] [Commented] (CASSANDRA-10992) Hanging streaming sessions

2016-05-14 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283451#comment-15283451
 ] 

mlowicki commented on CASSANDRA-10992:
--

Upgrading to 2.1.14 didn't help. Even almost 12h after the end of the repair run 
(using Cassandra Reaper) I still have active streams (all with progress set to 
100%). {{streaming_socket_timeout_in_ms}} has the default value (360).






[jira] [Commented] (CASSANDRA-10992) Hanging streaming sessions

2016-03-04 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15179905#comment-15179905
 ] 

mlowicki commented on CASSANDRA-10992:
--

Repair finished successfully using Cassandra Reaper. Throughout the process (it 
took a couple of days) Reaper terminated some sessions due to timeout (I saw that 
while live-watching the logs).






[jira] [Commented] (CASSANDRA-10992) Hanging streaming sessions

2016-03-04 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15179850#comment-15179850
 ] 

mlowicki commented on CASSANDRA-10992:
--

The same case after the upgrade. Hanging streaming sessions are visible in 
OpsCenter and returned by {{nodetool netstats}}. I've waited 2 hours since the 
repair finished.
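
A quick filter for the receive sides that are short of the expected file count (a 
sketch; it assumes the 2.1 {{netstats}} line format quoted in this ticket, where 
field 2 is the expected number of files and field 9 the number received):
{code}
# print only "Receiving" summaries where fewer files were received than expected
nodetool netstats -H | grep 'Receiving' | awk '$2 != $9 { print "incomplete: " $0 }'
{code}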






[jira] [Commented] (CASSANDRA-10992) Hanging streaming sessions

2016-02-29 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15171656#comment-15171656
 ] 

mlowicki commented on CASSANDRA-10992:
--

We've started rolling out the upgrade today, so within a couple of days I should 
have some feedback.






[jira] [Created] (CASSANDRA-11174) org.apache.cassandra.metrics:type=Streaming,name=ActiveOutboundStreams is always zero

2016-02-17 Thread mlowicki (JIRA)
mlowicki created CASSANDRA-11174:


 Summary: 
org.apache.cassandra.metrics:type=Streaming,name=ActiveOutboundStreams is 
always zero
 Key: CASSANDRA-11174
 URL: https://issues.apache.org/jira/browse/CASSANDRA-11174
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.12, Debian Wheezy
Reporter: mlowicki
 Attachments: streams.png

{{org.apache.cassandra.metrics:type=Streaming,name=TotalIncomingBytes}} and 
{{org.apache.cassandra.metrics:type=Streaming,name=TotalOutgoingBytes}} work 
fine but 
{{org.apache.cassandra.metrics:type=Streaming,name=ActiveOutboundStreams}} is 
always 0.
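
For comparison, this is roughly how the value can be read straight off the MBean (a 
sketch assuming a jmxterm uber-jar is available locally, JMX listens on the default 
port 7199, and the counter is exposed under the usual {{Count}} attribute):
{code}
# read ActiveOutboundStreams directly over JMX and compare with TotalOutgoingBytes
echo "get -b org.apache.cassandra.metrics:type=Streaming,name=ActiveOutboundStreams Count" \
  | java -jar jmxterm-uber.jar -l localhost:7199 -n
{code}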





[jira] [Commented] (CASSANDRA-10992) Hanging streaming sessions

2016-02-16 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149380#comment-15149380
 ] 

mlowicki commented on CASSANDRA-10992:
--

We'll upgrade our cluster this week or next (we've been waiting a bit after the 
release to make sure no critical issues have been introduced). Will let you know 
here when it's done.






[jira] [Commented] (CASSANDRA-10991) Cleanup OpsCenter keyspace fails - node thinks that didn't joined the ring yet

2016-01-19 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15106872#comment-15106872
 ] 

mlowicki commented on CASSANDRA-10991:
--

{code}
cqlsh> desc keyspace "OpsCenter";

CREATE KEYSPACE "OpsCenter" WITH replication = {'class': 
'NetworkTopologyStrategy', 'Amsterdam': '1', 'Ashburn': '1'}  AND 
durable_writes = true;

CREATE TABLE "OpsCenter".events_timeline (
key text,
column1 bigint,
value blob,
PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE
AND CLUSTERING ORDER BY (column1 ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
AND comment = '{"info": "OpsCenter management data.", "version": [5, 2, 1]}'
AND compaction = {'class': 
'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'}
AND compression = {'sstable_compression': 
'org.apache.cassandra.io.compress.LZ4Compressor'}
AND dclocal_read_repair_chance = 0.0
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.25
AND speculative_retry = 'NONE';

CREATE TABLE "OpsCenter".settings (
key blob,
column1 blob,
value blob,
PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE
AND CLUSTERING ORDER BY (column1 ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
AND comment = '{"info": "OpsCenter management data.", "version": [5, 2, 1]}'
AND compaction = {'class': 
'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'}
AND compression = {'sstable_compression': 
'org.apache.cassandra.io.compress.LZ4Compressor'}
AND dclocal_read_repair_chance = 0.0
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 1.0
AND speculative_retry = 'NONE';

...
{code}

Ah, I see that "Analytics" is missing from {{replication}}.
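
If OpsCenter data is actually meant to be replicated to that DC too, the fix would 
be along these lines (the DC name and RF here are assumptions; the keyspace would 
need a repair afterwards):
{code}
# add the missing datacenter to the keyspace's replication settings
cqlsh -e "ALTER KEYSPACE \"OpsCenter\" WITH replication = {'class': 'NetworkTopologyStrategy', 'Amsterdam': '1', 'Ashburn': '1', 'Analytics': '1'};"
{code}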

> Cleanup OpsCenter keyspace fails - node thinks that didn't joined the ring yet
> --
>
> Key: CASSANDRA-10991
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10991
> Project: Cassandra
>  Issue Type: Bug
> Environment: C* 2.1.12, Debian Wheezy
>Reporter: mlowicki
>Assignee: Marcus Eriksson
> Fix For: 2.1.x, 2.2.x, 3.0.x, 3.x
>
>
> I have a C* cluster spread across 3 DCs. Running {{cleanup}} on all nodes in 
> one DC always fails:
> {code}
> root@db1:~# nt cleanup system
> root@db1:~# nt cleanup sync
> root@db1:~# nt cleanup OpsCenter
> Aborted cleaning up atleast one column family in keyspace OpsCenter, check 
> server logs for more information.
> error: nodetool failed, check server logs
> -- StackTrace --
> java.lang.RuntimeException: nodetool failed, check server logs
> at 
> org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:292)
> at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:204)
> root@db1:~# 
> {code}
> Checked two other DCs and running cleanup there works fine (it didn't fail 
> immediately).
> Output from {{nodetool status}} from one node in problematic DC:
> {code}
> root@db1:~# nt status
> Datacenter: Amsterdam
> =
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  AddressLoad   Tokens  OwnsHost ID 
>   Rack
> UN  10.210.3.162   518.54 GB  256 ?   
> 50e606f5-e893-4a3b-86d3-1e5986dceea9  RAC1
> UN  10.210.3.230   532.63 GB  256 ?   
> 7b8fc988-8a6a-4d94-ae84-ab9da9ab01e8  RAC1
> UN  10.210.3.161   538.82 GB  256 ?   
> d44b0f6d-7933-4a7c-ba7b-f8648e038f85  RAC1
> UN  10.210.3.160   497.6 GB   256 ?   
> e7332179-a47e-471d-bcd4-08c638ab9ea4  RAC1
> UN  10.210.3.224   334.25 GB  256 ?   
> 92b0bd8c-0a5a-446a-83ea-2feea4988fe3  RAC1
> UN  10.210.3.118   518.34 GB  256 ?   
> ebddeaf3-1433-4372-a4ca-9c7ba3d4a26b  RAC1
> UN  10.210.3.221   516.57 GB  256 ?   
> 44d67a49-5310-4ab5-b448-a44be350abf5  RAC1
> UN  10.210.3.117   493.83 GB  256 ?   
> aae92956-82d6-421e-8f3f-22393ac7e5f7  RAC1
> Datacenter: Analytics
> =
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  AddressLoad   Tokens  OwnsHost ID 
>   Rack
> UN  10.210.59.124  392.83 GB  320 ?   
> f770a8cc-b7bf-44ac-8cc0-214d9228dfcd  RAC1
> UN  10.210.59.151  411.9 GB   320 ?   
> 3cc87422-0e43-4cd1-91bf-484f121be072  RAC1
> UN  10.210.58.132  309.8 GB   256 ?   
> 84d94d13-28d3-4b49-a3d9-557ab47e79b9  RAC1
> UN  10.210.58.133  281.82 

[jira] [Commented] (CASSANDRA-10992) Hanging streaming sessions

2016-01-15 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15102147#comment-15102147
 ] 

mlowicki commented on CASSANDRA-10992:
--

[~pauloricardomg] has CASSANDRA-10961 been released for 2.1? We would need to 
replace our production cluster with the attached build, which is a solid amount of 
work, so if it'll be fixed in an upcoming 2.1.x release then we would test the 
patch after upgrading the nodes.
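
For what it's worth, confirming which build a node actually runs after swapping 
jars is cheap (a minimal sketch):
{code}
# both should report the expected version on every node after the rolling restart
nodetool version
cqlsh -e "SELECT release_version FROM system.local;"
{code}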






[jira] [Commented] (CASSANDRA-10992) Hanging streaming sessions

2016-01-12 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15094919#comment-15094919
 ] 

mlowicki commented on CASSANDRA-10992:
--

Some IO errors I've found in the logs:
{code}
ERROR [Thread-518762] 2016-01-12 14:36:11,130 CassandraDaemon.java:227 - 
Exception in thread Thread[Thread-518762,5,main]
java.lang.RuntimeException: java.io.IOException: Connection timed out
at com.google.common.base.Throwables.propagate(Throwables.java:160) 
~[guava-16.0.jar:na]
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32) 
~[apache-cassandra-2.1.12.jar:2.1.12]
at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_66]
Caused by: java.io.IOException: Connection timed out
at sun.nio.ch.FileDispatcherImpl.read0(Native Method) ~[na:1.8.0_66]
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) 
~[na:1.8.0_66]
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) 
~[na:1.8.0_66]
at sun.nio.ch.IOUtil.read(IOUtil.java:197) ~[na:1.8.0_66]
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380) 
~[na:1.8.0_66]
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:59) 
~[na:1.8.0_66]
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:109) 
~[na:1.8.0_66]
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103) 
~[na:1.8.0_66]
at 
org.apache.cassandra.streaming.compress.CompressedInputStream$Reader.runMayThrow(CompressedInputStream.java:178)
 ~[apache-cassandra-2.1.12.jar:2.1.12]
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
~[apache-cassandra-2.1.12.jar:2.1.12]
... 1 common frames omitted
{code}

{code}
ERROR [STREAM-IN-/10.210.58.133] 2016-01-12 15:01:39,450 StreamSession.java:505 
- [Stream #193dd5c0-b93b-11e5-a713-8fe7d1d062ea] Streaming error occurred
java.io.IOException: Connection timed out
at sun.nio.ch.FileDispatcherImpl.read0(Native Method) ~[na:1.8.0_66]
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) 
~[na:1.8.0_66]
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) 
~[na:1.8.0_66]
at sun.nio.ch.IOUtil.read(IOUtil.java:197) ~[na:1.8.0_66]
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380) 
~[na:1.8.0_66]
at 
org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:51)
 ~[apache-cassandra-2.1.12.jar:2.1.12]
at 
org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:250)
 ~[apache-cassandra-2.1.12.jar:2.1.12]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_66]
INFO  [STREAM-IN-/10.210.58.133] 2016-01-12 15:01:39,451 
StreamResultFuture.java:180 - [Stream #193dd5c0-b93b-11e5-a713-8fe7d1d062ea] 
Session with /10.210.58.133 is complete
WARN  [STREAM-IN-/10.210.58.133] 2016-01-12 15:01:39,451 
StreamResultFuture.java:207 - [Stream #193dd5c0-b93b-11e5-a713-8fe7d1d062ea] 
Stream failed
{code}

{code}
ERROR [Thread-404196] 2016-01-12 14:44:05,532 CassandraDaemon.java:227 - 
Exception in thread Thread[Thread-404196,5,main]
java.lang.RuntimeException: java.nio.channels.AsynchronousCloseException
at com.google.common.base.Throwables.propagate(Throwables.java:160) 
~[guava-16.0.jar:na]
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32) 
~[apache-cassandra-2.1.12.jar:2.1.12]
at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_66]
Caused by: java.nio.channels.AsynchronousCloseException: null
at 
java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:205)
 ~[na:1.8.0_66]
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:407) 
~[na:1.8.0_66]
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:59) 
~[na:1.8.0_66]
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:109) 
~[na:1.8.0_66]
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103) 
~[na:1.8.0_66]
at 
org.apache.cassandra.streaming.compress.CompressedInputStream$Reader.runMayThrow(CompressedInputStream.java:178)
 ~[apache-cassandra-2.1.12.jar:2.1.12]
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
~[apache-cassandra-2.1.12.jar:2.1.12]
... 1 common frames omitted
{code}
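
One thing worth checking while sessions sit like this is whether the underlying 
sockets are even still established (a sketch; it assumes the default 
storage/streaming port 7000 and that {{ss}} is installed):
{code}
# list TCP connections on the storage port used for streaming
ss -tnp | grep ':7000'
{code}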

{code}
ERROR [STREAM-OUT-/10.210.3.224] 2016-01-12 14:44:12,114 StreamSession.java:505 
- [Stream #e7af3850-b93a-11e5-bebc-2f019a24a954] Streaming error occurred
java.io.IOException: Broken pipe
at sun.nio.ch.FileChannelImpl.transferTo0(Native Method) ~[na:1.8.0_66]
at 
sun.nio.ch.FileChannelImpl.transferToDirectlyInternal(FileChannelImpl.java:427) 
~[na:1.8.0_66]
at 
sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:492) 
~[na:1.8.0_66]
at 

[jira] [Created] (CASSANDRA-10991) Cleanup OpsCenter keyspace fails - node thinks that didn't joined the ring yet

2016-01-09 Thread mlowicki (JIRA)
mlowicki created CASSANDRA-10991:


 Summary: Cleanup OpsCenter keyspace fails - node thinks that 
didn't joined the ring yet
 Key: CASSANDRA-10991
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10991
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.12, Debian Wheezy
Reporter: mlowicki
 Fix For: 2.1.12


I have a C* cluster spread across 3 DCs. Running {{cleanup}} on all nodes in one 
DC always fails:
{code}
root@db1:~# nt cleanup system
root@db1:~# nt cleanup sync
root@db1:~# nt cleanup OpsCenter
Aborted cleaning up atleast one column family in keyspace OpsCenter, check 
server logs for more information.
error: nodetool failed, check server logs
-- StackTrace --
java.lang.RuntimeException: nodetool failed, check server logs
at 
org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:292)
at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:204)

root@db1:~# 
{code}

Checked two other DCs and running cleanup there works fine (it didn't fail 
immediately).

Output from {{nodetool status}} from one node in problematic DC:
{code}
root@db1:~# nt status
Datacenter: Amsterdam
=
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  AddressLoad   Tokens  OwnsHost ID   
Rack
UN  10.210.3.162   518.54 GB  256 ?   
50e606f5-e893-4a3b-86d3-1e5986dceea9  RAC1
UN  10.210.3.230   532.63 GB  256 ?   
7b8fc988-8a6a-4d94-ae84-ab9da9ab01e8  RAC1
UN  10.210.3.161   538.82 GB  256 ?   
d44b0f6d-7933-4a7c-ba7b-f8648e038f85  RAC1
UN  10.210.3.160   497.6 GB   256 ?   
e7332179-a47e-471d-bcd4-08c638ab9ea4  RAC1
UN  10.210.3.224   334.25 GB  256 ?   
92b0bd8c-0a5a-446a-83ea-2feea4988fe3  RAC1
UN  10.210.3.118   518.34 GB  256 ?   
ebddeaf3-1433-4372-a4ca-9c7ba3d4a26b  RAC1
UN  10.210.3.221   516.57 GB  256 ?   
44d67a49-5310-4ab5-b448-a44be350abf5  RAC1
UN  10.210.3.117   493.83 GB  256 ?   
aae92956-82d6-421e-8f3f-22393ac7e5f7  RAC1
Datacenter: Analytics
=
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  AddressLoad   Tokens  OwnsHost ID   
Rack
UN  10.210.59.124  392.83 GB  320 ?   
f770a8cc-b7bf-44ac-8cc0-214d9228dfcd  RAC1
UN  10.210.59.151  411.9 GB   320 ?   
3cc87422-0e43-4cd1-91bf-484f121be072  RAC1
UN  10.210.58.132  309.8 GB   256 ?   
84d94d13-28d3-4b49-a3d9-557ab47e79b9  RAC1
UN  10.210.58.133  281.82 GB  256 ?   
02bd2d02-41c5-4193-81b0-dee434adb0da  RAC1
UN  10.210.59.86   285.84 GB  256 ?   
bc6422ea-22e9-431a-ac16-c4c040f0c4e5  RAC1
UN  10.210.59.84   331.06 GB  256 ?   
a798e6b0-3a84-4ec2-82bb-8474086cb315  RAC1
UN  10.210.59.85   366.26 GB  256 ?   
52699077-56cf-4c1e-b308-bf79a1644b7e  RAC1
Datacenter: Ashburn
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  AddressLoad   Tokens  OwnsHost ID   
Rack
UN  10.195.15.176  534.51 GB  256 ?   
c6ac22df-c43a-4b25-b3b5-5e12ce9c69da  RAC1
UN  10.195.15.177  313.73 GB  256 ?   
eafa2a72-84a2-4cdc-a634-3c660acc6af8  RAC1
UN  10.195.15.163  470.92 GB  256 ?   
bcd2a534-94c4-4406-8d16-c1fc26b41844  RAC1
UN  10.195.15.162  539.82 GB  256 ?   
bb649cef-21de-4077-a35f-994319011a06  RAC1
UN  10.195.15.182  499.64 GB  256 ?   
6ce2d14d-9fb8-4494-8e97-3add05bd35de  RAC1
UN  10.195.15.167  508.48 GB  256 ?   
6f359675-852a-4842-9ff2-bdc69e6b04a2  RAC1
UN  10.195.15.166  490.28 GB  256 ?   
1ec5d0c5-e8bd-4973-96d9-523de91d08c5  RAC1
UN  10.195.15.183  447.78 GB  256 ?   
824165b0-1f1b-40e8-9695-e2f596cb8611  RAC1

Note: Non-system keyspaces don't have the same replication settings, effective 
ownership information is meaningless
{code}

Logs from one of the nodes where {{cleanup}} fails:
{code}
INFO  [RMI TCP Connection(158004)-10.210.59.86] 2016-01-09 15:58:33,942 
CompactionManager.java:388 - Cleanup cannot run before a node has joined the 
ring
INFO  [RMI TCP Connection(158004)-10.210.59.86] 2016-01-09 15:58:33,970 
CompactionManager.java:388 - Cleanup cannot run before a node has joined the 
ring
INFO  [RMI TCP Connection(158004)-10.210.59.86] 2016-01-09 15:58:34,000 
CompactionManager.java:388 - Cleanup cannot run before a node has joined the 
ring
INFO  [RMI TCP Connection(158004)-10.210.59.86] 2016-01-09 15:58:34,027 
CompactionManager.java:388 - Cleanup cannot run before a node has joined the 
ring
INFO  [RMI TCP Connection(158004)-10.210.59.86] 2016-01-09 15:58:34,053 
CompactionManager.java:388 - Cleanup cannot run before a node has joined the 
ring
INFO  [RMI TCP Connection(158004)-10.210.59.86] 2016-01-09 15:58:34,082 
CompactionManager.java:388 - Cleanup cannot run before a node has joined the 
ring
INFO  

[jira] [Created] (CASSANDRA-10992) Hanging streaming sessions

2016-01-09 Thread mlowicki (JIRA)
mlowicki created CASSANDRA-10992:


 Summary: Hanging streaming sessions
 Key: CASSANDRA-10992
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10992
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.12, Debian Wheezy
Reporter: mlowicki
 Fix For: 2.1.12


I've recently started running repair using [Cassandra 
Reaper|https://github.com/spotify/cassandra-reaper] (built-in {{nodetool 
repair}} doesn't work for me - CASSANDRA-9935). It behaves fine, but I've 
noticed hanging streaming sessions:
{code}
root@db1:~# date
Sat Jan  9 16:43:00 UTC 2016
root@db1:~# nt netstats -H | grep total
Receiving 5 files, 46.59 MB total. Already received 1 files, 11.32 MB 
total
Sending 7 files, 46.28 MB total. Already sent 7 files, 46.28 MB total
Receiving 6 files, 64.15 MB total. Already received 1 files, 12.14 MB 
total
Sending 5 files, 61.15 MB total. Already sent 5 files, 61.15 MB total
Receiving 4 files, 7.75 MB total. Already received 3 files, 7.58 MB 
total
Sending 4 files, 4.29 MB total. Already sent 4 files, 4.29 MB total
Receiving 12 files, 13.79 MB total. Already received 11 files, 7.66 MB 
total
Sending 5 files, 15.32 MB total. Already sent 5 files, 15.32 MB total
Receiving 8 files, 20.35 MB total. Already received 1 files, 13.63 MB 
total
Sending 38 files, 125.34 MB total. Already sent 38 files, 125.34 MB 
total

root@db1:~# date
Sat Jan  9 17:45:42 UTC 2016
root@db1:~# nt netstats -H | grep total
Receiving 5 files, 46.59 MB total. Already received 1 files, 11.32 MB 
total
Sending 7 files, 46.28 MB total. Already sent 7 files, 46.28 MB total
Receiving 6 files, 64.15 MB total. Already received 1 files, 12.14 MB 
total
Sending 5 files, 61.15 MB total. Already sent 5 files, 61.15 MB total
Receiving 4 files, 7.75 MB total. Already received 3 files, 7.58 MB 
total
Sending 4 files, 4.29 MB total. Already sent 4 files, 4.29 MB total
Receiving 12 files, 13.79 MB total. Already received 11 files, 7.66 MB 
total
Sending 5 files, 15.32 MB total. Already sent 5 files, 15.32 MB total
Receiving 8 files, 20.35 MB total. Already received 1 files, 13.63 MB 
total
Sending 38 files, 125.34 MB total. Already sent 38 files, 125.34 MB 
total
{code}

Such sessions are left even when the repair job has long since finished (confirmed 
by checking Reaper's and Cassandra's logs). {{streaming_socket_timeout_in_ms}} in 
cassandra.yaml is set to the default value (360).





[jira] [Commented] (CASSANDRA-10991) Cleanup OpsCenter keyspace fails - node thinks that didn't joined the ring yet

2016-01-09 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15090699#comment-15090699
 ] 

mlowicki commented on CASSANDRA-10991:
--

According to {{nodetool status}} and the metrics, the node joined the ring many 
weeks ago.
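
A sketch of cross-checking the node's own view of its state (exact output labels 
vary slightly between versions):
{code}
# operation mode should be NORMAL, not JOINING, and gossip should be active
nodetool netstats | head -n 1
nodetool info | grep -i gossip
{code}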


[jira] [Commented] (CASSANDRA-10991) Cleanup OpsCenter keyspace fails - node thinks that didn't joined the ring yet

2016-01-09 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15090702#comment-15090702
 ] 

mlowicki commented on CASSANDRA-10991:
--

Yes, all nodes in that single DC have this problem. {{nodetool status}} looks the 
same in all DCs.

> Cleanup OpsCenter keyspace fails - node thinks that didn't joined the ring yet
> --
>
> Key: CASSANDRA-10991
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10991
> Project: Cassandra
>  Issue Type: Bug
> Environment: C* 2.1.12, Debian Wheezy
>Reporter: mlowicki
> Fix For: 2.1.12
>
>
> I have a C* cluster spread across 3 DCs. Running {{cleanup}} on all nodes in one 
> DC always fails:
> {code}
> root@db1:~# nt cleanup system
> root@db1:~# nt cleanup sync
> root@db1:~# nt cleanup OpsCenter
> Aborted cleaning up atleast one column family in keyspace OpsCenter, check 
> server logs for more information.
> error: nodetool failed, check server logs
> -- StackTrace --
> java.lang.RuntimeException: nodetool failed, check server logs
> at 
> org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:292)
> at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:204)
> root@db1:~# 
> {code}
> Checked two other DCs and running cleanup there works fine (it didn't fail 
> immediately).
> Output from {{nodetool status}} from one node in problematic DC:
> {code}
> root@db1:~# nt status
> Datacenter: Amsterdam
> =
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  AddressLoad   Tokens  OwnsHost ID 
>   Rack
> UN  10.210.3.162   518.54 GB  256 ?   
> 50e606f5-e893-4a3b-86d3-1e5986dceea9  RAC1
> UN  10.210.3.230   532.63 GB  256 ?   
> 7b8fc988-8a6a-4d94-ae84-ab9da9ab01e8  RAC1
> UN  10.210.3.161   538.82 GB  256 ?   
> d44b0f6d-7933-4a7c-ba7b-f8648e038f85  RAC1
> UN  10.210.3.160   497.6 GB   256 ?   
> e7332179-a47e-471d-bcd4-08c638ab9ea4  RAC1
> UN  10.210.3.224   334.25 GB  256 ?   
> 92b0bd8c-0a5a-446a-83ea-2feea4988fe3  RAC1
> UN  10.210.3.118   518.34 GB  256 ?   
> ebddeaf3-1433-4372-a4ca-9c7ba3d4a26b  RAC1
> UN  10.210.3.221   516.57 GB  256 ?   
> 44d67a49-5310-4ab5-b448-a44be350abf5  RAC1
> UN  10.210.3.117   493.83 GB  256 ?   
> aae92956-82d6-421e-8f3f-22393ac7e5f7  RAC1
> Datacenter: Analytics
> =
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  AddressLoad   Tokens  OwnsHost ID 
>   Rack
> UN  10.210.59.124  392.83 GB  320 ?   
> f770a8cc-b7bf-44ac-8cc0-214d9228dfcd  RAC1
> UN  10.210.59.151  411.9 GB   320 ?   
> 3cc87422-0e43-4cd1-91bf-484f121be072  RAC1
> UN  10.210.58.132  309.8 GB   256 ?   
> 84d94d13-28d3-4b49-a3d9-557ab47e79b9  RAC1
> UN  10.210.58.133  281.82 GB  256 ?   
> 02bd2d02-41c5-4193-81b0-dee434adb0da  RAC1
> UN  10.210.59.86   285.84 GB  256 ?   
> bc6422ea-22e9-431a-ac16-c4c040f0c4e5  RAC1
> UN  10.210.59.84   331.06 GB  256 ?   
> a798e6b0-3a84-4ec2-82bb-8474086cb315  RAC1
> UN  10.210.59.85   366.26 GB  256 ?   
> 52699077-56cf-4c1e-b308-bf79a1644b7e  RAC1
> Datacenter: Ashburn
> ===
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  AddressLoad   Tokens  OwnsHost ID 
>   Rack
> UN  10.195.15.176  534.51 GB  256 ?   
> c6ac22df-c43a-4b25-b3b5-5e12ce9c69da  RAC1
> UN  10.195.15.177  313.73 GB  256 ?   
> eafa2a72-84a2-4cdc-a634-3c660acc6af8  RAC1
> UN  10.195.15.163  470.92 GB  256 ?   
> bcd2a534-94c4-4406-8d16-c1fc26b41844  RAC1
> UN  10.195.15.162  539.82 GB  256 ?   
> bb649cef-21de-4077-a35f-994319011a06  RAC1
> UN  10.195.15.182  499.64 GB  256 ?   
> 6ce2d14d-9fb8-4494-8e97-3add05bd35de  RAC1
> UN  10.195.15.167  508.48 GB  256 ?   
> 6f359675-852a-4842-9ff2-bdc69e6b04a2  RAC1
> UN  10.195.15.166  490.28 GB  256 ?   
> 1ec5d0c5-e8bd-4973-96d9-523de91d08c5  RAC1
> UN  10.195.15.183  447.78 GB  256 ?   
> 824165b0-1f1b-40e8-9695-e2f596cb8611  RAC1
> Note: Non-system keyspaces don't have the same replication settings, 
> effective ownership information is meaningless
> {code}
> Logs from one of the nodes where {{cleanup}} fails:
> {code}
> INFO  [RMI TCP Connection(158004)-10.210.59.86] 2016-01-09 15:58:33,942 
> CompactionManager.java:388 - Cleanup cannot run before a node has joined the 
> ring
> INFO  [RMI TCP Connection(158004)-10.210.59.86] 2016-01-09 15:58:33,970 
> CompactionManager.java:388 - Cleanup cannot run before a node has joined the 
> ring
> INFO  [RMI TCP Connection(158004)-10.210.59.86] 2016-01-09 15:58:34,000 
> CompactionManager.java:388 - 

[jira] [Created] (CASSANDRA-10823) LEAK DETECTED (org.apache.cassandra.utils.concurrent.Ref$State@)

2015-12-07 Thread mlowicki (JIRA)
mlowicki created CASSANDRA-10823:


 Summary: LEAK DETECTED 
(org.apache.cassandra.utils.concurrent.Ref$State@)
 Key: CASSANDRA-10823
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10823
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.11, Debian Wheezy
Reporter: mlowicki


{code}
ERROR [Reference-Reaper:1] 2015-12-07 14:09:30,455 Ref.java:179 - LEAK 
DETECTED: a reference 
(org.apache.cassandra.utils.concurrent.Ref$State@66909a93) to class 
org.apache.cassandra.io.util.MmappedSegmentedFile$Cleanup@529816960:/var/lib/cassandra/data2/sync/user_quota-fe54df20770e11e4a0a975bb514ae072/sync-user_quota-ka-61776-Index.db
 was not released before the reference was garbage collected
ERROR [Reference-Reaper:1] 2015-12-07 14:09:30,456 Ref.java:179 - LEAK 
DETECTED: a reference 
(org.apache.cassandra.utils.concurrent.Ref$State@45868eb2) to class 
org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$1@84044743:[[OffHeapBitSet]]
 was not released before the reference was garbage collected
ERROR [Reference-Reaper:1] 2015-12-07 14:09:30,456 Ref.java:179 - LEAK 
DETECTED: a reference 
(org.apache.cassandra.utils.concurrent.Ref$State@61f1d862) to class 
org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$1@1286945834:[[OffHeapBitSet]]
 was not released before the reference was garbage collected
ERROR [Reference-Reaper:1] 2015-12-07 14:09:30,456 Ref.java:179 - LEAK 
DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@e8110be) 
to class 
org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$1@997339490:[[OffHeapBitSet]]
 was not released before the reference was garbage collected
ERROR [Reference-Reaper:1] 2015-12-07 14:09:30,456 Ref.java:179 - LEAK 
DETECTED: a reference 
(org.apache.cassandra.utils.concurrent.Ref$State@4608376b) to class 
org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$1@1164867000:[[OffHeapBitSet]]
 was not released before the reference was garbage collectedERROR 
[Reference-Reaper:1] 2015-12-07 14:09:30,456 Ref.java:179 - LEAK DETECTED: a 
reference (org.apache.cassandra.utils.concurrent.Ref$State@56f2a6a4) to class 
org.apache.cassandra
.utils.concurrent.WrappedSharedCloseable$1@1419412884:[[OffHeapBitSet]] was not 
released before the reference was garbage collectedERROR [Reference-Reaper:1] 
2015-12-07 14:09:30,456 Ref.java:179 - LEAK DETECTED: a reference 
(org.apache.cassandra.utils.concurrent.Ref$State@6cb7e2f0) to class 
org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$1@479474259:[Memory@[0..4),
 Memory@[0..11)] was not released before the reference was garbage collected
ERROR [Reference-Reaper:1] 2015-12-07 14:09:30,457 Ref.java:179 - LEAK 
DETECTED: a reference 
(org.apache.cassandra.utils.concurrent.Ref$State@4573f5cd) to class 
org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$1@1074694490:[[OffHeapBitSet]]
 was not released before the reference was garbage collectedERROR 
[Reference-Reaper:1] 2015-12-07 14:09:30,457 Ref.java:179 - LEAK DETECTED: a 
reference (org.apache.cassandra.utils.concurrent.Ref$State@7a5b9490) to class 
org.apache.cassandra
.utils.concurrent.WrappedSharedCloseable$1@309770418:[[OffHeapBitSet]] was not 
released before the reference was garbage collectedERROR [Reference-Reaper:1] 
2015-12-07 14:09:30,457 Ref.java:179 - LEAK DETECTED: a reference 
(org.apache.cassandra.utils.concurrent.Ref$State@3057b796) to class 
org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$1@1322643877:[[OffHeapBitSet]]
 was not released before the reference was garbage collected
ERROR [Reference-Reaper:1] 2015-12-07 14:09:30,498 Ref.java:179 - LEAK 
DETECTED: a reference 
(org.apache.cassandra.utils.concurrent.Ref$State@3febb012) to class 
org.apache.cassandra.io.sstable.SSTableReader$DescriptorTypeTidy@175410823:/var/lib/cassandra/data2/sync/entity2-e24b5040199b11e5a30f75bb514ae072/sync-entity2-tmplink-ka-1175811
 was not released before the reference was garbage collected
ERROR [Reference-Reaper:1] 2015-12-07 14:09:30,498 Ref.java:179 - LEAK 
DETECTED: a reference 
(org.apache.cassandra.utils.concurrent.Ref$State@6a39466d) to class 
org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$1@1446958230:[[OffHeapBitSet]]
 was not released before the reference was garbage collected
ERROR [Reference-Reaper:1] 2015-12-07 14:09:30,499 Ref.java:179 - LEAK 
DETECTED: a reference 
(org.apache.cassandra.utils.concurrent.Ref$State@36f6f016) to class 
org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$1@235688075:[[OffHeapBitSet]]
 was not released before the reference was garbage collected
ERROR [Reference-Reaper:1] 2015-12-07 14:09:30,499 Ref.java:179 - LEAK 
DETECTED: a reference 
(org.apache.cassandra.utils.concurrent.Ref$State@4a7bdce1) to class 
org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$1@165830139:[Memory@[0..4),
 Memory@[0..11)] was not 

[jira] [Commented] (CASSANDRA-10823) LEAK DETECTED (org.apache.cassandra.utils.concurrent.Ref$State@)

2015-12-07 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15045717#comment-15045717
 ] 

mlowicki commented on CASSANDRA-10823:
--

[~tjake] it happened while running drain, so it's probably a dupe of 
CASSANDRA-10079, which I just found.
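
A quick way to back that up, sketched here assuming the usual DRAINED log line 
and the default Debian log path, is to check whether the LEAK DETECTED lines 
cluster around the {{nodetool drain}} invocation:
{code}
# Sketch only: log path is an assumption; the timestamp is the minute from the
# LEAK DETECTED lines above.
grep 'DRAINED' /var/log/cassandra/system.log | tail -n 1
grep 'LEAK DETECTED' /var/log/cassandra/system.log | grep '2015-12-07 14:09' | wc -l
{code}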

> LEAK DETECTED (org.apache.cassandra.utils.concurrent.Ref$State@)
> 
>
> Key: CASSANDRA-10823
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10823
> Project: Cassandra
>  Issue Type: Bug
> Environment: C* 2.1.11, Debian Wheezy
>Reporter: mlowicki
>
> {code}
> ERROR [Reference-Reaper:1] 2015-12-07 14:09:30,455 Ref.java:179 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@66909a93) to class 
> org.apache.cassandra.io.util.MmappedSegmentedFile$Cleanup@529816960:/var/lib/cassandra/data2/sync/user_quota-fe54df20770e11e4a0a975bb514ae072/sync-user_quota-ka-61776-Index.db
>  was not released before the reference was garbage collected
> ERROR [Reference-Reaper:1] 2015-12-07 14:09:30,456 Ref.java:179 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@45868eb2) to class 
> org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$1@84044743:[[OffHeapBitSet]]
>  was not released before the reference was garbage collected
> ERROR [Reference-Reaper:1] 2015-12-07 14:09:30,456 Ref.java:179 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@61f1d862) to class 
> org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$1@1286945834:[[OffHeapBitSet]]
>  was not released before the reference was garbage collected
> ERROR [Reference-Reaper:1] 2015-12-07 14:09:30,456 Ref.java:179 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@e8110be) to class 
> org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$1@997339490:[[OffHeapBitSet]]
>  was not released before the reference was garbage collected
> ERROR [Reference-Reaper:1] 2015-12-07 14:09:30,456 Ref.java:179 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@4608376b) to class 
> org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$1@1164867000:[[OffHeapBitSet]]
>  was not released before the reference was garbage collectedERROR 
> [Reference-Reaper:1] 2015-12-07 14:09:30,456 Ref.java:179 - LEAK DETECTED: a 
> reference (org.apache.cassandra.utils.concurrent.Ref$State@56f2a6a4) to class 
> org.apache.cassandra
> .utils.concurrent.WrappedSharedCloseable$1@1419412884:[[OffHeapBitSet]] was 
> not released before the reference was garbage collectedERROR 
> [Reference-Reaper:1] 2015-12-07 14:09:30,456 Ref.java:179 - LEAK DETECTED: a 
> reference (org.apache.cassandra.utils.concurrent.Ref$State@6cb7e2f0) to class 
> org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$1@479474259:[Memory@[0..4),
>  Memory@[0..11)] was not released before the reference was garbage collected
> ERROR [Reference-Reaper:1] 2015-12-07 14:09:30,457 Ref.java:179 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@4573f5cd) to class 
> org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$1@1074694490:[[OffHeapBitSet]]
>  was not released before the reference was garbage collectedERROR 
> [Reference-Reaper:1] 2015-12-07 14:09:30,457 Ref.java:179 - LEAK DETECTED: a 
> reference (org.apache.cassandra.utils.concurrent.Ref$State@7a5b9490) to class 
> org.apache.cassandra
> .utils.concurrent.WrappedSharedCloseable$1@309770418:[[OffHeapBitSet]] was 
> not released before the reference was garbage collectedERROR 
> [Reference-Reaper:1] 2015-12-07 14:09:30,457 Ref.java:179 - LEAK DETECTED: a 
> reference (org.apache.cassandra.utils.concurrent.Ref$State@3057b796) to class 
> org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$1@1322643877:[[OffHeapBitSet]]
>  was not released before the reference was garbage collected
> ERROR [Reference-Reaper:1] 2015-12-07 14:09:30,498 Ref.java:179 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@3febb012) to class 
> org.apache.cassandra.io.sstable.SSTableReader$DescriptorTypeTidy@175410823:/var/lib/cassandra/data2/sync/entity2-e24b5040199b11e5a30f75bb514ae072/sync-entity2-tmplink-ka-1175811
>  was not released before the reference was garbage collected
> ERROR [Reference-Reaper:1] 2015-12-07 14:09:30,498 Ref.java:179 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@6a39466d) to class 
> org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$1@1446958230:[[OffHeapBitSet]]
>  was not released before the reference was garbage collected
> ERROR [Reference-Reaper:1] 2015-12-07 14:09:30,499 Ref.java:179 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@36f6f016) to class 

[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException

2015-11-30 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15031827#comment-15031827
 ] 

mlowicki commented on CASSANDRA-9935:
-

[~yukim]: any chance this is related to network issues? Over the weekend I 
monitored it carefully and repair failed at the same time as a drop in the 
number of requests sent to the C* cluster in this datacenter. I decided to run 
repair for smaller tables, where it takes 1-4 hours to complete, and it failed 
once (launched on 6 nodes), again when such a drop appeared.

Tried a 2nd time and now it works (and I don't see any anomalies in the metrics).
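
A sketch of the per-table approach (the table names below are simply the ones 
visible in the logs in this thread, and the flag set mirrors the command from 
the issue description; this is not presented as a recommended procedure):
{code}
# Sketch only: repair keyspace 'sync' one table at a time and record wall-clock time.
for t in entity2 entity_by_id2 user_stats; do
  echo "=== repairing sync.$t at $(date -u) ==="
  time nodetool repair --partitioner-range --parallel --in-local-dc sync "$t"
done
{code}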

> Repair fails with RuntimeException
> --
>
> Key: CASSANDRA-9935
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9935
> Project: Cassandra
>  Issue Type: Bug
> Environment: C* 2.1.8, Debian Wheezy
>Reporter: mlowicki
>Assignee: Yuki Morishita
> Fix For: 2.1.x
>
> Attachments: db1.sync.lati.osa.cassandra.log, 
> db5.sync.lati.osa.cassandra.log, system.log.10.210.3.117, 
> system.log.10.210.3.221, system.log.10.210.3.230
>
>
> We had problems with slow repair in 2.1.7 (CASSANDRA-9702); after upgrading 
> to 2.1.8 it started to work faster, but now it fails with:
> {code}
> ...
> [2015-07-29 20:44:03,956] Repair session 23a811b0-3632-11e5-a93e-4963524a8bde 
> for range (-5474076923322749342,-5468600594078911162] finished
> [2015-07-29 20:44:03,957] Repair session 336f8740-3632-11e5-a93e-4963524a8bde 
> for range (-8631877858109464676,-8624040066373718932] finished
> [2015-07-29 20:44:03,957] Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde 
> for range (-5372806541854279315,-5369354119480076785] finished
> [2015-07-29 20:44:03,957] Repair session 59f129f0-3632-11e5-a93e-4963524a8bde 
> for range (8166489034383821955,8168408930184216281] finished
> [2015-07-29 20:44:03,957] Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde 
> for range (6084602890817326921,6088328703025510057] finished
> [2015-07-29 20:44:03,957] Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde 
> for range (-781874602493000830,-781745173070807746] finished
> [2015-07-29 20:44:03,957] Repair command #4 finished
> error: nodetool failed, check server logs
> -- StackTrace --
> java.lang.RuntimeException: nodetool failed, check server logs
> at 
> org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:290)
> at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:202)
> {code}
> After running:
> {code}
> nodetool repair --partitioner-range --parallel --in-local-dc sync
> {code}
> Last records in logs regarding repair are:
> {code}
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 09ff9e40-3632-11e5-a93e-4963524a8bde for range 
> (-7695808664784761779,-7693529816291585568] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 17d8d860-3632-11e5-a93e-4963524a8bde for range 
> (806371695398849,8065203836608925992] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 23a811b0-3632-11e5-a93e-4963524a8bde for range 
> (-5474076923322749342,-5468600594078911162] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 336f8740-3632-11e5-a93e-4963524a8bde for range 
> (-8631877858109464676,-8624040066373718932] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde for range 
> (-5372806541854279315,-5369354119480076785] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 59f129f0-3632-11e5-a93e-4963524a8bde for range 
> (8166489034383821955,8168408930184216281] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde for range 
> (6084602890817326921,6088328703025510057] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde for range 
> (-781874602493000830,-781745173070807746] finished
> {code}
> but a bit above I see (at least two times in attached log):
> {code}
> ERROR [Thread-173887] 2015-07-29 20:44:03,853 StorageService.java:2959 - 
> Repair session 1b07ea50-3608-11e5-a93e-4963524a8bde for range 
> (5765414319217852786,5781018794516851576] failed with error 
> org.apache.cassandra.exceptions.RepairException: [repair 
> #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
> (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
> org.apache.cassandra.exceptions.RepairException: [repair 
> 

[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException

2015-11-28 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030436#comment-15030436
 ] 

mlowicki commented on CASSANDRA-9935:
-

Tried to run repair once again after an online scrub and cleanup on all nodes. 
It failed with the same error. This is what I found in the logs:
{code}
ERROR [ValidationExecutor:1089] 2015-11-28 04:33:15,865 Validator.java:245 - 
Failed creating a merkle tree for [repair #0f9c5530-9589-11e5-b036-75bb514ae072 
on sync/entity2, (-6842825601551036942,-6841068234348096268]], /10.210.3.221 
(see log for details)
ERROR [ValidationExecutor:1089] 2015-11-28 04:33:15,866 
CassandraDaemon.java:227 - Exception in thread 
Thread[ValidationExecutor:1089,1,main]
java.lang.AssertionError: row DecoratedKey(-6842806631972123001, 
000932383331343239333204c3c700) received out of order wrt 
DecoratedKey(-6841074726771668561, 000932313637353230343404c3c700)
at org.apache.cassandra.repair.Validator.add(Validator.java:127) 
~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:1010)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:94)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.db.compaction.CompactionManager$9.call(CompactionManager.java:622)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
~[na:1.7.0_80]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
~[na:1.7.0_80]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_80]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
ERROR [AntiEntropySessions:1957] 2015-11-28 04:33:15,868 RepairSession.java:303 
- [repair #0f9c5530-9589-11e5-b036-75bb514ae072] session completed with the 
following error
org.apache.cassandra.exceptions.RepairException: [repair 
#0f9c5530-9589-11e5-b036-75bb514ae072 on sync/entity2, 
(-6842825601551036942,-6841068234348096268]] Validation failed in /10.210.3.221
at 
org.apache.cassandra.repair.RepairSession.validationComplete(RepairSession.java:166)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.service.ActiveRepairService.handleMessage(ActiveRepairService.java:406)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:134)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) 
~[apache-cassandra-2.1.11.jar:2.1.11]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
[na:1.7.0_80]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_80]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
{code}

{code}
ERROR [AntiEntropySessions:1957] 2015-11-28 04:33:15,869 
CassandraDaemon.java:227 - Exception in thread 
Thread[AntiEntropySessions:1957,5,RMI Runtime]
java.lang.RuntimeException: org.apache.cassandra.exceptions.RepairException: 
[repair #0f9c5530-9589-11e5-b036-75bb514ae072 on sync/entity2, 
(-6842825601551036942,-6841068234348096268]] Validation failed in /10.210.3.221
at com.google.common.base.Throwables.propagate(Throwables.java:160) 
~[guava-16.0.jar:na]
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32) 
~[apache-cassandra-2.1.11.jar:2.1.11]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
~[na:1.7.0_80]
at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
~[na:1.7.0_80]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
~[na:1.7.0_80]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_80]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
Caused by: org.apache.cassandra.exceptions.RepairException: [repair 
#0f9c5530-9589-11e5-b036-75bb514ae072 on sync/entity2, 
(-6842825601551036942,-6841068234348096268]] Validation failed in /10.210.3.221
at 
org.apache.cassandra.repair.RepairSession.validationComplete(RepairSession.java:166)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.service.ActiveRepairService.handleMessage(ActiveRepairService.java:406)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:134)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) 
~[apache-cassandra-2.1.11.jar:2.1.11]
... 3 common frames omitted
{code}
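
When a validation compaction hits out-of-order rows like this, one hedged 
avenue (a sketch only, not the resolution recorded on this ticket, and the 
init-script name is an assumption for this Debian install) is to stop the node 
and run the offline scrubber against the affected table so the SSTables get 
rewritten in key order:
{code}
# Sketch only: run on the node that reported the assertion, with Cassandra stopped.
nodetool drain
sudo service cassandra stop
sstablescrub sync entity2      # offline scrub of the keyspace/table named in the error
sudo service cassandra start
{code}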

{code}
ERROR 

[jira] [Updated] (CASSANDRA-9935) Repair fails with RuntimeException

2015-11-28 Thread mlowicki (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mlowicki updated CASSANDRA-9935:

Attachment: system.log.10.210.3.117

> Repair fails with RuntimeException
> --
>
> Key: CASSANDRA-9935
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9935
> Project: Cassandra
>  Issue Type: Bug
> Environment: C* 2.1.8, Debian Wheezy
>Reporter: mlowicki
>Assignee: Yuki Morishita
> Fix For: 2.1.x
>
> Attachments: db1.sync.lati.osa.cassandra.log, 
> db5.sync.lati.osa.cassandra.log, system.log.10.210.3.117, 
> system.log.10.210.3.221, system.log.10.210.3.230
>
>
> We had problems with slow repair in 2.1.7 (CASSANDRA-9702); after upgrading 
> to 2.1.8 it started to work faster, but now it fails with:
> {code}
> ...
> [2015-07-29 20:44:03,956] Repair session 23a811b0-3632-11e5-a93e-4963524a8bde 
> for range (-5474076923322749342,-5468600594078911162] finished
> [2015-07-29 20:44:03,957] Repair session 336f8740-3632-11e5-a93e-4963524a8bde 
> for range (-8631877858109464676,-8624040066373718932] finished
> [2015-07-29 20:44:03,957] Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde 
> for range (-5372806541854279315,-5369354119480076785] finished
> [2015-07-29 20:44:03,957] Repair session 59f129f0-3632-11e5-a93e-4963524a8bde 
> for range (8166489034383821955,8168408930184216281] finished
> [2015-07-29 20:44:03,957] Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde 
> for range (6084602890817326921,6088328703025510057] finished
> [2015-07-29 20:44:03,957] Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde 
> for range (-781874602493000830,-781745173070807746] finished
> [2015-07-29 20:44:03,957] Repair command #4 finished
> error: nodetool failed, check server logs
> -- StackTrace --
> java.lang.RuntimeException: nodetool failed, check server logs
> at 
> org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:290)
> at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:202)
> {code}
> After running:
> {code}
> nodetool repair --partitioner-range --parallel --in-local-dc sync
> {code}
> Last records in logs regarding repair are:
> {code}
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 09ff9e40-3632-11e5-a93e-4963524a8bde for range 
> (-7695808664784761779,-7693529816291585568] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 17d8d860-3632-11e5-a93e-4963524a8bde for range 
> (806371695398849,8065203836608925992] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 23a811b0-3632-11e5-a93e-4963524a8bde for range 
> (-5474076923322749342,-5468600594078911162] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 336f8740-3632-11e5-a93e-4963524a8bde for range 
> (-8631877858109464676,-8624040066373718932] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde for range 
> (-5372806541854279315,-5369354119480076785] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 59f129f0-3632-11e5-a93e-4963524a8bde for range 
> (8166489034383821955,8168408930184216281] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde for range 
> (6084602890817326921,6088328703025510057] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde for range 
> (-781874602493000830,-781745173070807746] finished
> {code}
> but a bit above I see (at least two times in attached log):
> {code}
> ERROR [Thread-173887] 2015-07-29 20:44:03,853 StorageService.java:2959 - 
> Repair session 1b07ea50-3608-11e5-a93e-4963524a8bde for range 
> (5765414319217852786,5781018794516851576] failed with error 
> org.apache.cassandra.exceptions.RepairException: [repair 
> #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
> (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
> org.apache.cassandra.exceptions.RepairException: [repair 
> #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
> (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
> at java.util.concurrent.FutureTask.report(FutureTask.java:122) 
> [na:1.7.0_80]
> at java.util.concurrent.FutureTask.get(FutureTask.java:188) 
> [na:1.7.0_80]
> at 
> org.apache.cassandra.service.StorageService$4.runMayThrow(StorageService.java:2950)
>  ~[apache-cassandra-2.1.8.jar:2.1.8]
> at 

[jira] [Updated] (CASSANDRA-9935) Repair fails with RuntimeException

2015-11-28 Thread mlowicki (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mlowicki updated CASSANDRA-9935:

Attachment: system.log.10.210.3.230
system.log.10.210.3.221

> Repair fails with RuntimeException
> --
>
> Key: CASSANDRA-9935
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9935
> Project: Cassandra
>  Issue Type: Bug
> Environment: C* 2.1.8, Debian Wheezy
>Reporter: mlowicki
>Assignee: Yuki Morishita
> Fix For: 2.1.x
>
> Attachments: db1.sync.lati.osa.cassandra.log, 
> db5.sync.lati.osa.cassandra.log, system.log.10.210.3.221, 
> system.log.10.210.3.230
>
>
> We had problems with slow repair in 2.1.7 (CASSANDRA-9702); after upgrading 
> to 2.1.8 it started to work faster, but now it fails with:
> {code}
> ...
> [2015-07-29 20:44:03,956] Repair session 23a811b0-3632-11e5-a93e-4963524a8bde 
> for range (-5474076923322749342,-5468600594078911162] finished
> [2015-07-29 20:44:03,957] Repair session 336f8740-3632-11e5-a93e-4963524a8bde 
> for range (-8631877858109464676,-8624040066373718932] finished
> [2015-07-29 20:44:03,957] Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde 
> for range (-5372806541854279315,-5369354119480076785] finished
> [2015-07-29 20:44:03,957] Repair session 59f129f0-3632-11e5-a93e-4963524a8bde 
> for range (8166489034383821955,8168408930184216281] finished
> [2015-07-29 20:44:03,957] Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde 
> for range (6084602890817326921,6088328703025510057] finished
> [2015-07-29 20:44:03,957] Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde 
> for range (-781874602493000830,-781745173070807746] finished
> [2015-07-29 20:44:03,957] Repair command #4 finished
> error: nodetool failed, check server logs
> -- StackTrace --
> java.lang.RuntimeException: nodetool failed, check server logs
> at 
> org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:290)
> at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:202)
> {code}
> After running:
> {code}
> nodetool repair --partitioner-range --parallel --in-local-dc sync
> {code}
> Last records in logs regarding repair are:
> {code}
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 09ff9e40-3632-11e5-a93e-4963524a8bde for range 
> (-7695808664784761779,-7693529816291585568] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 17d8d860-3632-11e5-a93e-4963524a8bde for range 
> (806371695398849,8065203836608925992] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 23a811b0-3632-11e5-a93e-4963524a8bde for range 
> (-5474076923322749342,-5468600594078911162] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 336f8740-3632-11e5-a93e-4963524a8bde for range 
> (-8631877858109464676,-8624040066373718932] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde for range 
> (-5372806541854279315,-5369354119480076785] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 59f129f0-3632-11e5-a93e-4963524a8bde for range 
> (8166489034383821955,8168408930184216281] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde for range 
> (6084602890817326921,6088328703025510057] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde for range 
> (-781874602493000830,-781745173070807746] finished
> {code}
> but a bit above I see (at least two times in attached log):
> {code}
> ERROR [Thread-173887] 2015-07-29 20:44:03,853 StorageService.java:2959 - 
> Repair session 1b07ea50-3608-11e5-a93e-4963524a8bde for range 
> (5765414319217852786,5781018794516851576] failed with error 
> org.apache.cassandra.exceptions.RepairException: [repair 
> #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
> (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
> org.apache.cassandra.exceptions.RepairException: [repair 
> #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
> (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
> at java.util.concurrent.FutureTask.report(FutureTask.java:122) 
> [na:1.7.0_80]
> at java.util.concurrent.FutureTask.get(FutureTask.java:188) 
> [na:1.7.0_80]
> at 
> org.apache.cassandra.service.StorageService$4.runMayThrow(StorageService.java:2950)
>  

[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException

2015-11-28 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030439#comment-15030439
 ] 

mlowicki commented on CASSANDRA-9935:
-

Also, if I run repair for the range where I got this "Endpoint X died" error, it 
works fine:
{code}
root@db1:~# time nodetool repair --in-local-dc -st 8066543735336862962 -et 
8074446636728465478
[2015-11-28 08:55:19,048] Nothing to repair for keyspace 'system'
[2015-11-28 08:55:19,069] Starting repair command #6, repairing 1 ranges for 
keyspace OpsCenter (parallelism=SEQUENTIAL, full=true)
[2015-11-28 08:55:19,176] Repair command #6 finished
[2015-11-28 08:55:19,188] Starting repair command #7, repairing 1 ranges for 
keyspace sync (parallelism=SEQUENTIAL, full=true)
[2015-11-28 09:03:49,529] Repair session c054ec60-95ad-11e5-b036-75bb514ae072 
for range (8066543735336862962,8074446636728465478] finished
[2015-11-28 09:03:49,529] Repair command #7 finished
[2015-11-28 09:03:49,544] Starting repair command #8, repairing 1 ranges for 
keyspace system_traces (parallelism=SEQUENTIAL, full=true)
[2015-11-28 09:03:49,562] Repair command #8 finished

real    8m32.356s
user    0m2.784s
sys     0m0.224s
{code}
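
Since a single subrange repair succeeds, a sketch of iterating over subranges 
(in practice a tool like Reaper computes the boundaries from the ring; the 
first two values below are the range from the command above and the third is a 
placeholder, not a real token from this cluster):
{code}
# Sketch only: TOKENS holds consecutive ring boundaries for the local node.
TOKENS=(8066543735336862962 8074446636728465478 8074446636728465479)
for ((i = 0; i < ${#TOKENS[@]} - 1; i++)); do
  nodetool repair --in-local-dc -st "${TOKENS[i]}" -et "${TOKENS[i+1]}" sync
done
{code}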

> Repair fails with RuntimeException
> --
>
> Key: CASSANDRA-9935
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9935
> Project: Cassandra
>  Issue Type: Bug
> Environment: C* 2.1.8, Debian Wheezy
>Reporter: mlowicki
>Assignee: Yuki Morishita
> Fix For: 2.1.x
>
> Attachments: db1.sync.lati.osa.cassandra.log, 
> db5.sync.lati.osa.cassandra.log, system.log.10.210.3.117, 
> system.log.10.210.3.221, system.log.10.210.3.230
>
>
> We had problems with slow repair in 2.1.7 (CASSANDRA-9702); after upgrading 
> to 2.1.8 it started to work faster, but now it fails with:
> {code}
> ...
> [2015-07-29 20:44:03,956] Repair session 23a811b0-3632-11e5-a93e-4963524a8bde 
> for range (-5474076923322749342,-5468600594078911162] finished
> [2015-07-29 20:44:03,957] Repair session 336f8740-3632-11e5-a93e-4963524a8bde 
> for range (-8631877858109464676,-8624040066373718932] finished
> [2015-07-29 20:44:03,957] Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde 
> for range (-5372806541854279315,-5369354119480076785] finished
> [2015-07-29 20:44:03,957] Repair session 59f129f0-3632-11e5-a93e-4963524a8bde 
> for range (8166489034383821955,8168408930184216281] finished
> [2015-07-29 20:44:03,957] Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde 
> for range (6084602890817326921,6088328703025510057] finished
> [2015-07-29 20:44:03,957] Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde 
> for range (-781874602493000830,-781745173070807746] finished
> [2015-07-29 20:44:03,957] Repair command #4 finished
> error: nodetool failed, check server logs
> -- StackTrace --
> java.lang.RuntimeException: nodetool failed, check server logs
> at 
> org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:290)
> at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:202)
> {code}
> After running:
> {code}
> nodetool repair --partitioner-range --parallel --in-local-dc sync
> {code}
> Last records in logs regarding repair are:
> {code}
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 09ff9e40-3632-11e5-a93e-4963524a8bde for range 
> (-7695808664784761779,-7693529816291585568] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 17d8d860-3632-11e5-a93e-4963524a8bde for range 
> (806371695398849,8065203836608925992] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 23a811b0-3632-11e5-a93e-4963524a8bde for range 
> (-5474076923322749342,-5468600594078911162] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 336f8740-3632-11e5-a93e-4963524a8bde for range 
> (-8631877858109464676,-8624040066373718932] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde for range 
> (-5372806541854279315,-5369354119480076785] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 59f129f0-3632-11e5-a93e-4963524a8bde for range 
> (8166489034383821955,8168408930184216281] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde for range 
> (6084602890817326921,6088328703025510057] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde for range 
> (-781874602493000830,-781745173070807746] finished
> {code}
> but a bit above I see (at least two times in attached log):
> {code}
> ERROR 

[jira] [Created] (CASSANDRA-10782) AssertionError at getApproximateKeyCount

2015-11-28 Thread mlowicki (JIRA)
mlowicki created CASSANDRA-10782:


 Summary: AssertionError at getApproximateKeyCount
 Key: CASSANDRA-10782
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10782
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.11, Debian Wheezy
Reporter: mlowicki


{code}
ERROR [CompactionExecutor:9797] 2015-11-28 09:20:10,361 
CassandraDaemon.java:227 - Exception in thread 
Thread[CompactionExecutor:9797,1,main]
java.lang.AssertionError: 
/var/lib/cassandra/data/system/sstable_activity-5a1ff267ace03f128563cfae6103c65e/system-sstable_activity-ka-6335-Data.db
at 
org.apache.cassandra.io.sstable.SSTableReader.getApproximateKeyCount(SSTableReader.java:268)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:151)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:73)
 ~[apache-cassandra-2.1.11.jar:2.1.11]at 
org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
 ~[apache-cassandra-2.1.11.jar:2.1.11]at 
org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:236)
 ~[apache-cassandra-2.1.11.jar:2.1.11]at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
~[na:1.7.0_80]at 
java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_80]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
~[na:1.7.0_80]at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_80]at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
{code}
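
A first, hedged check (a sketch, not a verified fix for this assertion): see 
whether the SSTable named in the error is actually present and readable, and 
let an online scrub rewrite the small {{system.sstable_activity}} table.
{code}
# Sketch only: the path pattern comes from the assertion message above.
ls -l /var/lib/cassandra/data/system/sstable_activity-*/system-sstable_activity-ka-6335-*
nodetool scrub system sstable_activity
{code}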



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-10782) AssertionError at getApproximateKeyCount

2015-11-28 Thread mlowicki (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mlowicki updated CASSANDRA-10782:
-
Description: 
{code}
ERROR [CompactionExecutor:9845] 2015-11-28 09:26:10,525 
CassandraDaemon.java:227 - Exception in thread 
Thread[CompactionExecutor:9845,1,main]
java.lang.AssertionError: 
/var/lib/cassandra/data/system/sstable_activity-5a1ff267ace03f128563cfae6103c65e/system-sstable_activity-ka-6335-Data.db
at 
org.apache.cassandra.io.sstable.SSTableReader.getApproximateKeyCount(SSTableReader.java:268)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:151)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:73)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:236)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
~[na:1.7.0_80]
at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
~[na:1.7.0_80]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
~[na:1.7.0_80]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_80]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
{code}

  was:
{code}
ERROR [CompactionExecutor:9797] 2015-11-28 09:20:10,361 
CassandraDaemon.java:227 - Exception in thread 
Thread[CompactionExecutor:9797,1,main]
java.lang.AssertionError: 
/var/lib/cassandra/data/system/sstable_activity-5a1ff267ace03f128563cfae6103c65e/system-sstable_activity-ka-6335-Data.db
at 
org.apache.cassandra.io.sstable.SSTableReader.getApproximateKeyCount(SSTableReader.java:268)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:151)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:73)
 ~[apache-cassandra-2.1.11.jar:2.1.11]at 
org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
 ~[apache-cassandra-2.1.11.jar:2.1.11]at 
org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:236)
 ~[apache-cassandra-2.1.11.jar:2.1.11]at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
~[na:1.7.0_80]at 
java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_80]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
~[na:1.7.0_80]at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_80]at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
{code}


> AssertionError at getApproximateKeyCount
> 
>
> Key: CASSANDRA-10782
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10782
> Project: Cassandra
>  Issue Type: Bug
> Environment: C* 2.1.11, Debian Wheezy
>Reporter: mlowicki
>
> {code}
> ERROR [CompactionExecutor:9845] 2015-11-28 09:26:10,525 
> CassandraDaemon.java:227 - Exception in thread 
> Thread[CompactionExecutor:9845,1,main]
> java.lang.AssertionError: 
> /var/lib/cassandra/data/system/sstable_activity-5a1ff267ace03f128563cfae6103c65e/system-sstable_activity-ka-6335-Data.db
> at 
> org.apache.cassandra.io.sstable.SSTableReader.getApproximateKeyCount(SSTableReader.java:268)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:151)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
> ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:73)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:236)
>  

[jira] [Created] (CASSANDRA-10780) Exception encountered during startup

2015-11-27 Thread mlowicki (JIRA)
mlowicki created CASSANDRA-10780:


 Summary: Exception encountered during startup
 Key: CASSANDRA-10780
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10780
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.11 on Debian Wheezy
Reporter: mlowicki


{code}
ERROR [main] 2015-11-27 12:39:42,659 CassandraDaemon.java:579 - Exception 
encountered during startup
org.apache.cassandra.io.FSReadError: java.lang.NullPointerException
at 
org.apache.cassandra.db.ColumnFamilyStore.removeUnfinishedCompactionLeftovers(ColumnFamilyStore.java:663)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:306) 
[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:562) 
[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:651) 
[apache-cassandra-2.1.11.jar:2.1.11]
Caused by: java.lang.NullPointerException: null
at 
org.apache.cassandra.db.ColumnFamilyStore.removeUnfinishedCompactionLeftovers(ColumnFamilyStore.java:655)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
... 3 common frames omitted
{code}
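
The NPE comes out of the startup code that reconciles unfinished compactions, 
so before restarting it may help to see what leftover temporary SSTables and 
compactions_in_progress entries exist. A read-only sketch (paths are 
assumptions based on the data directories visible elsewhere in these logs):
{code}
# Sketch only, read-only checks against the layout seen on this cluster
# (/var/lib/cassandra/data and /var/lib/cassandra/data2).
find /var/lib/cassandra/data* \( -name '*tmp*Data.db' -o -name '*tmplink*Data.db' \) | head
ls -d /var/lib/cassandra/data/system/compactions_in_progress-*/ 2>/dev/null
{code}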



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-10780) Exception encountered during startup

2015-11-27 Thread mlowicki (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mlowicki updated CASSANDRA-10780:
-
Reproduced In: 2.1.11

> Exception encountered during startup
> 
>
> Key: CASSANDRA-10780
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10780
> Project: Cassandra
>  Issue Type: Bug
> Environment: C* 2.1.11 on Debian Wheezy
>Reporter: mlowicki
>
> {code}
> ERROR [main] 2015-11-27 12:39:42,659 CassandraDaemon.java:579 - Exception 
> encountered during startup
> org.apache.cassandra.io.FSReadError: java.lang.NullPointerException
> at 
> org.apache.cassandra.db.ColumnFamilyStore.removeUnfinishedCompactionLeftovers(ColumnFamilyStore.java:663)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:306) 
> [apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:562)
>  [apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:651) 
> [apache-cassandra-2.1.11.jar:2.1.11]
> Caused by: java.lang.NullPointerException: null
> at 
> org.apache.cassandra.db.ColumnFamilyStore.removeUnfinishedCompactionLeftovers(ColumnFamilyStore.java:655)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> ... 3 common frames omitted
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10769) "received out of order wrt DecoratedKey" after scrub

2015-11-26 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15028727#comment-15028727
 ] 

mlowicki commented on CASSANDRA-10769:
--

Another error from today (10.210.3.221 is the node where I started the repair - 
it is still in progress):
{code}
ERROR [ValidationExecutor:588] 2015-11-26 12:48:02,877 Validator.java:245 - 
Failed creating a merkle tree for [repair #72d57040-943b-11e5-b036-75bb514ae072 
on sync/entity2, (-2928915626059257529,-2921716383005026147]], /10.210.3.221 
(see log for details)
ERROR [ValidationExecutor:588] 2015-11-26 12:48:02,878 CassandraDaemon.java:227 
- Exception in thread Thread[ValidationExecutor:588,1,main]
java.lang.AssertionError: row DecoratedKey(-2928866306571865615, 
000932383734343432313204b33100) received out of order wrt 
DecoratedKey(-2921918599167375595, 000933313439393634373204c3c700)
at org.apache.cassandra.repair.Validator.add(Validator.java:127) 
~[apache-cassandra-2.1.11.jar:2.1.11]at 
org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:1010)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:94)
 ~[apache-cassandra-2.1.11.jar:2.1.11]at 
org.apache.cassandra.db.compaction.CompactionManager$9.call(CompactionManager.java:622)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
~[na:1.7.0_80]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
~[na:1.7.0_80]at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_80]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
ERROR [AntiEntropySessions:1061] 2015-11-26 12:48:02,880 RepairSession.java:303 
- [repair #72d57040-943b-11e5-b036-75bb514ae072] session completed with the 
following error
org.apache.cassandra.exceptions.RepairException: [repair 
#72d57040-943b-11e5-b036-75bb514ae072 on sync/entity2, 
(-2928915626059257529,-2921716383005026147]] Validation failed in /10.210.3.221
at 
org.apache.cassandra.repair.RepairSession.validationComplete(RepairSession.java:166)
 ~[apache-cassandra-2.1.11.jar:2.1.11]at 
org.apache.cassandra.service.ActiveRepairService.handleMessage(ActiveRepairService.java:406)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:134)
 ~[apache-cassandra-2.1.11.jar:2.1.11]at 
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) 
~[apache-cassandra-2.1.11.jar:2.1.11]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
[na:1.7.0_80]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_80]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
INFO  [AntiEntropySessions:1062] 2015-11-26 12:48:02,880 RepairSession.java:260 
- [repair #ee6d25e0-943b-11e5-b036-75bb514ae072] new session: will sync 
/10.210.3.221, /10.210.3.224, /10.210.3.117 on range 
(-4713086263421125450,-4709745913912183602] for sync.[device_token, entity2, 
user_stats, user_device, user_quota, user_store, user_device_progress, 
entity_by_id2]
ERROR [AntiEntropySessions:1061] 2015-11-26 12:48:02,881 
CassandraDaemon.java:227 - Exception in thread 
Thread[AntiEntropySessions:1061,5,RMI Runtime]
java.lang.RuntimeException: org.apache.cassandra.exceptions.RepairException: 
[repair #72d57040-943b-11e5-b036-75bb514ae072 on sync/entity2, 
(-2928915626059257529,-2921716383005026147]] Validation failed in /10.210.3.221
at com.google.common.base.Throwables.propagate(Throwables.java:160) 
~[guava-16.0.jar:na]
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32) 
~[apache-cassandra-2.1.11.jar:2.1.11]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
~[na:1.7.0_80]
at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
~[na:1.7.0_80]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
~[na:1.7.0_80]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_80]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
Caused by: org.apache.cassandra.exceptions.RepairException: [repair 
#72d57040-943b-11e5-b036-75bb514ae072 on sync/entity2, 
(-2928915626059257529,-2921716383005026147]] Validation failed in /10.210.3.221
at 
org.apache.cassandra.repair.RepairSession.validationComplete(RepairSession.java:166)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.service.ActiveRepairService.handleMessage(ActiveRepairService.java:406)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 

[jira] [Created] (CASSANDRA-10769) "received out of order wrt DecoratedKey" after scrub

2015-11-25 Thread mlowicki (JIRA)
mlowicki created CASSANDRA-10769:


 Summary: "received out of order wrt DecoratedKey" after scrub
 Key: CASSANDRA-10769
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10769
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.11, Debian Wheezy
Reporter: mlowicki


After running scrub and cleanup on all nodes in a single data center I'm getting:
{code}
ERROR [ValidationExecutor:103] 2015-11-25 06:28:21,530 Validator.java:245 - 
Failed creating a merkle tree for [repair #89fa2b70-933d-11e5-b036-75bb514ae072 
on sync/entity_by_id2, (-5867793819051725444,-5865919628027816979]], 
/10.210.3.221 (see log for details)
ERROR [ValidationExecutor:103] 2015-11-25 06:28:21,531 CassandraDaemon.java:227 
- Exception in thread Thread[ValidationExecutor:103,1,main]
java.lang.AssertionError: row DecoratedKey(-5867787467868737053, 
000932373633313036313204808800) received out of order wrt 
DecoratedKey(-5865937851627253360, 000933313230313737333204c3c700)
at org.apache.cassandra.repair.Validator.add(Validator.java:127) 
~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:1010)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:94)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.db.compaction.CompactionManager$9.call(CompactionManager.java:622)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
~[na:1.7.0_80]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
~[na:1.7.0_80]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_80]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
{code}

What I did was to run repair on another node:
{code}
time nodetool repair --in-local-dc
{code}
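
For context, the assertion above comes from the merkle-tree validation phase of repair: each replica feeds its partitions to the validator in token order, and any key that does not sort strictly after the previous one aborts the validation. The sketch below is only an illustration of that invariant under assumed names (it is not the Cassandra source; {{OrderCheckingValidator}} and its fields are made up for the example):

{code}
// Minimal sketch (not Cassandra code) of the ordering invariant behind the
// "received out of order wrt DecoratedKey" assertion: every key handed to the
// merkle-tree builder must sort strictly after the previously seen key.
import java.math.BigInteger;

public class OrderCheckingValidator {
    private BigInteger lastToken; // token of the previously added partition key

    public void add(BigInteger token, String key) {
        // Mis-sorted or corrupt SSTable data surfaces here as an AssertionError.
        if (lastToken != null && token.compareTo(lastToken) <= 0) {
            throw new AssertionError("row " + key
                    + " received out of order wrt token " + lastToken);
        }
        lastToken = token;
        // ...hash the row into the merkle tree for its token range here...
    }

    public static void main(String[] args) {
        OrderCheckingValidator v = new OrderCheckingValidator();
        v.add(BigInteger.valueOf(-5867787467868737053L), "row-a");
        v.add(BigInteger.valueOf(-5865937851627253360L), "row-b"); // ok, larger token
        v.add(BigInteger.valueOf(-5867000000000000000L), "row-c"); // throws: out of order
    }
}
{code}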



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10769) "received out of order wrt DecoratedKey" after scrub

2015-11-25 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026472#comment-15026472
 ] 

mlowicki commented on CASSANDRA-10769:
--

Found this error on another node as well:
{code}
ERROR [ValidationExecutor:78] 2015-11-24 22:35:52,652 Validator.java:245 - 
Failed creating a merkle tree for [repair #93837260-92fb-11e5-b036-75bb514ae072 
on sync/entity2, (-6012485790753833422,-6009995015166063234]], /10.210.3.221 
(see log for details)
ERROR [ValidationExecutor:78] 2015-11-24 22:35:52,652 CassandraDaemon.java:227 
- Exception in thread Thread[ValidationExecutor:78,1,main]
java.lang.AssertionError: row DecoratedKey(-6012437544863914154, 
000932373632373537303204c3c700) received out of order wrt 
DecoratedKey(-6009997709246787268, 000932373538333034303204c3c700)
at org.apache.cassandra.repair.Validator.add(Validator.java:127) 
~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:1010)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:94)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.db.compaction.CompactionManager$9.call(CompactionManager.java:622)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
~[na:1.7.0_80]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
~[na:1.7.0_80]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_80]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
{code}

> "received out of order wrt DecoratedKey" after scrub
> 
>
> Key: CASSANDRA-10769
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10769
> Project: Cassandra
>  Issue Type: Bug
> Environment: C* 2.1.11, Debian Wheezy
>Reporter: mlowicki
>
> After running scrub and cleanup on all nodes in single data center I'm 
> getting:
> {code}
> ERROR [ValidationExecutor:103] 2015-11-25 06:28:21,530 Validator.java:245 - 
> Failed creating a merkle tree for [repair 
> #89fa2b70-933d-11e5-b036-75bb514ae072 on sync/entity_by_id2, 
> (-5867793819051725444,-5865919628027816979]], /10.210.3.221 (see log for 
> details)
> ERROR [ValidationExecutor:103] 2015-11-25 06:28:21,531 
> CassandraDaemon.java:227 - Exception in thread 
> Thread[ValidationExecutor:103,1,main]
> java.lang.AssertionError: row DecoratedKey(-5867787467868737053, 
> 000932373633313036313204808800) received out of order wrt 
> DecoratedKey(-5865937851627253360, 000933313230313737333204c3c700)
> at org.apache.cassandra.repair.Validator.add(Validator.java:127) 
> ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:1010)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:94)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.compaction.CompactionManager$9.call(CompactionManager.java:622)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> ~[na:1.7.0_80]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  ~[na:1.7.0_80]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_80]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
> {code}
> What I did is to run repair on other node:
> {code}
> time nodetool repair --in-local-dc
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-10769) "received out of order wrt DecoratedKey" after scrub

2015-11-25 Thread mlowicki (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mlowicki updated CASSANDRA-10769:
-
Description: 
After running scrub and cleanup on all nodes in a single data center I'm getting:
{code}
ERROR [ValidationExecutor:103] 2015-11-25 06:28:21,530 Validator.java:245 - 
Failed creating a merkle tree for [repair #89fa2b70-933d-11e5-b036-75bb514ae072 
on sync/entity_by_id2, (-5867793819051725444,-5865919628027816979]], 
/10.210.3.221 (see log for details)
ERROR [ValidationExecutor:103] 2015-11-25 06:28:21,531 CassandraDaemon.java:227 
- Exception in thread Thread[ValidationExecutor:103,1,main]
java.lang.AssertionError: row DecoratedKey(-5867787467868737053, 
000932373633313036313204808800) received out of order wrt 
DecoratedKey(-5865937851627253360, 000933313230313737333204c3c700)
at org.apache.cassandra.repair.Validator.add(Validator.java:127) 
~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:1010)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:94)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.db.compaction.CompactionManager$9.call(CompactionManager.java:622)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
~[na:1.7.0_80]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
~[na:1.7.0_80]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_80]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
{code}

What I did was to run repair on another node:
{code}
time nodetool repair --in-local-dc
{code}

Corresponding log on the node where the repair was started:
{code}
ERROR [AntiEntropySessions:414] 2015-11-25 06:28:21,533 RepairSession.java:303 
- [repair #89fa2b70-933d-11e5-b036-75bb514ae072] session completed with the 
following error
org.apache.cassandra.exceptions.RepairException: [repair 
#89fa2b70-933d-11e5-b036-75bb514ae072 on sync/entity_by_id2, 
(-5867793819051725444,-5865919628027816979]] Validation failed in /10.210.3.117
at 
org.apache.cassandra.repair.RepairSession.validationComplete(RepairSession.java:166)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.service.ActiveRepairService.handleMessage(ActiveRepairService.java:406)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:134)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) 
~[apache-cassandra-2.1.11.jar:2.1.11]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
[na:1.7.0_80]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_80]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
INFO  [AntiEntropySessions:415] 2015-11-25 06:28:21,533 RepairSession.java:260 
- [repair #b9458fa0-933d-11e5-b036-75bb514ae072] new session: will sync 
/10.210.3.221, /10.210.3.118, /10.210.3.117 on range 
(7119703141488009983,7129744584776466802] for sync.[device_token, entity2, 
user_stats, user_device, user_quota, user_store, user_device_progress, 
entity_by_id2]
ERROR [AntiEntropySessions:414] 2015-11-25 06:28:21,533 
CassandraDaemon.java:227 - Exception in thread 
Thread[AntiEntropySessions:414,5,RMI Runtime]
java.lang.RuntimeException: org.apache.cassandra.exceptions.RepairException: 
[repair #89fa2b70-933d-11e5-b036-75bb514ae072 on sync/entity_by_id2, 
(-5867793819051725444,-5865919628027816979]] Validation failed in /10.210.3.117
at com.google.common.base.Throwables.propagate(Throwables.java:160) 
~[guava-16.0.jar:na]
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32) 
~[apache-cassandra-2.1.11.jar:2.1.11]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
~[na:1.7.0_80]
at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
~[na:1.7.0_80]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
~[na:1.7.0_80]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_80]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
Caused by: org.apache.cassandra.exceptions.RepairException: [repair 
#89fa2b70-933d-11e5-b036-75bb514ae072 on sync/entity_by_id2, 
(-5867793819051725444,-5865919628027816979]] Validation failed in /10.210.3.117
at 
org.apache.cassandra.repair.RepairSession.validationComplete(RepairSession.java:166)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 

[jira] [Updated] (CASSANDRA-10769) "received out of order wrt DecoratedKey" after scrub

2015-11-25 Thread mlowicki (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mlowicki updated CASSANDRA-10769:
-
Reproduced In: 2.1.11

> "received out of order wrt DecoratedKey" after scrub
> 
>
> Key: CASSANDRA-10769
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10769
> Project: Cassandra
>  Issue Type: Bug
> Environment: C* 2.1.11, Debian Wheezy
>Reporter: mlowicki
>
> After running scrub and cleanup on all nodes in single data center I'm 
> getting:
> {code}
> ERROR [ValidationExecutor:103] 2015-11-25 06:28:21,530 Validator.java:245 - 
> Failed creating a merkle tree for [repair 
> #89fa2b70-933d-11e5-b036-75bb514ae072 on sync/entity_by_id2, 
> (-5867793819051725444,-5865919628027816979]], /10.210.3.221 (see log for 
> details)
> ERROR [ValidationExecutor:103] 2015-11-25 06:28:21,531 
> CassandraDaemon.java:227 - Exception in thread 
> Thread[ValidationExecutor:103,1,main]
> java.lang.AssertionError: row DecoratedKey(-5867787467868737053, 
> 000932373633313036313204808800) received out of order wrt 
> DecoratedKey(-5865937851627253360, 000933313230313737333204c3c700)
> at org.apache.cassandra.repair.Validator.add(Validator.java:127) 
> ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:1010)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:94)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.compaction.CompactionManager$9.call(CompactionManager.java:622)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> ~[na:1.7.0_80]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  ~[na:1.7.0_80]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_80]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
> {code}
> What I did is to run repair on other node:
> {code}
> time nodetool repair --in-local-dc
> {code}
> Corresponding log on the node where repair has been started:
> {code}
> ERROR [AntiEntropySessions:414] 2015-11-25 06:28:21,533 
> RepairSession.java:303 - [repair #89fa2b70-933d-11e5-b036-75bb514ae072] 
> session completed with the following error
> org.apache.cassandra.exceptions.RepairException: [repair 
> #89fa2b70-933d-11e5-b036-75bb514ae072 on sync/entity_by_id2, 
> (-5867793819051725444,-5865919628027816979]] Validation failed in 
> /10.210.3.117
> at 
> org.apache.cassandra.repair.RepairSession.validationComplete(RepairSession.java:166)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.service.ActiveRepairService.handleMessage(ActiveRepairService.java:406)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:134)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) 
> ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_80]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_80]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
> INFO  [AntiEntropySessions:415] 2015-11-25 06:28:21,533 
> RepairSession.java:260 - [repair #b9458fa0-933d-11e5-b036-75bb514ae072] new 
> session: will sync /10.210.3.221, /10.210.3.118, /10.210.3.117 on range 
> (7119703141488009983,7129744584776466802] for sync.[device_token, entity2, 
> user_stats, user_device, user_quota, user_store, user_device_progress, 
> entity_by_id2]
> ERROR [AntiEntropySessions:414] 2015-11-25 06:28:21,533 
> CassandraDaemon.java:227 - Exception in thread 
> Thread[AntiEntropySessions:414,5,RMI Runtime]
> java.lang.RuntimeException: org.apache.cassandra.exceptions.RepairException: 
> [repair #89fa2b70-933d-11e5-b036-75bb514ae072 on sync/entity_by_id2, 
> (-5867793819051725444,-5865919628027816979]] Validation failed in 
> /10.210.3.117
> at com.google.common.base.Throwables.propagate(Throwables.java:160) 
> ~[guava-16.0.jar:na]
> at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32) 
> ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> ~[na:1.7.0_80]
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> ~[na:1.7.0_80]
> at 
> 

[jira] [Commented] (CASSANDRA-10769) "received out of order wrt DecoratedKey" after scrub

2015-11-25 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026601#comment-15026601
 ] 

mlowicki commented on CASSANDRA-10769:
--

Yeah, found CASSANDRA-9126 as well but decided to file a separate ticket as 
scrub didn't help in my case.

> "received out of order wrt DecoratedKey" after scrub
> 
>
> Key: CASSANDRA-10769
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10769
> Project: Cassandra
>  Issue Type: Bug
> Environment: C* 2.1.11, Debian Wheezy
>Reporter: mlowicki
>
> After running scrub and cleanup on all nodes in single data center I'm 
> getting:
> {code}
> ERROR [ValidationExecutor:103] 2015-11-25 06:28:21,530 Validator.java:245 - 
> Failed creating a merkle tree for [repair 
> #89fa2b70-933d-11e5-b036-75bb514ae072 on sync/entity_by_id2, 
> (-5867793819051725444,-5865919628027816979]], /10.210.3.221 (see log for 
> details)
> ERROR [ValidationExecutor:103] 2015-11-25 06:28:21,531 
> CassandraDaemon.java:227 - Exception in thread 
> Thread[ValidationExecutor:103,1,main]
> java.lang.AssertionError: row DecoratedKey(-5867787467868737053, 
> 000932373633313036313204808800) received out of order wrt 
> DecoratedKey(-5865937851627253360, 000933313230313737333204c3c700)
> at org.apache.cassandra.repair.Validator.add(Validator.java:127) 
> ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:1010)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:94)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.compaction.CompactionManager$9.call(CompactionManager.java:622)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> ~[na:1.7.0_80]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  ~[na:1.7.0_80]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_80]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
> {code}
> What I did is to run repair on other node:
> {code}
> time nodetool repair --in-local-dc
> {code}
> Corresponding log on the node where repair has been started:
> {code}
> ERROR [AntiEntropySessions:414] 2015-11-25 06:28:21,533 
> RepairSession.java:303 - [repair #89fa2b70-933d-11e5-b036-75bb514ae072] 
> session completed with the following error
> org.apache.cassandra.exceptions.RepairException: [repair 
> #89fa2b70-933d-11e5-b036-75bb514ae072 on sync/entity_by_id2, 
> (-5867793819051725444,-5865919628027816979]] Validation failed in 
> /10.210.3.117
> at 
> org.apache.cassandra.repair.RepairSession.validationComplete(RepairSession.java:166)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.service.ActiveRepairService.handleMessage(ActiveRepairService.java:406)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:134)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) 
> ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_80]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_80]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
> INFO  [AntiEntropySessions:415] 2015-11-25 06:28:21,533 
> RepairSession.java:260 - [repair #b9458fa0-933d-11e5-b036-75bb514ae072] new 
> session: will sync /10.210.3.221, /10.210.3.118, /10.210.3.117 on range 
> (7119703141488009983,7129744584776466802] for sync.[device_token, entity2, 
> user_stats, user_device, user_quota, user_store, user_device_progress, 
> entity_by_id2]
> ERROR [AntiEntropySessions:414] 2015-11-25 06:28:21,533 
> CassandraDaemon.java:227 - Exception in thread 
> Thread[AntiEntropySessions:414,5,RMI Runtime]
> java.lang.RuntimeException: org.apache.cassandra.exceptions.RepairException: 
> [repair #89fa2b70-933d-11e5-b036-75bb514ae072 on sync/entity_by_id2, 
> (-5867793819051725444,-5865919628027816979]] Validation failed in 
> /10.210.3.117
> at com.google.common.base.Throwables.propagate(Throwables.java:160) 
> ~[guava-16.0.jar:na]
> at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32) 
> ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> ~[na:1.7.0_80]
> at 

[jira] [Commented] (CASSANDRA-10769) "received out of order wrt DecoratedKey" after scrub

2015-11-25 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026657#comment-15026657
 ] 

mlowicki commented on CASSANDRA-10769:
--

I'm struggling with CASSANDRA-9935.

> "received out of order wrt DecoratedKey" after scrub
> 
>
> Key: CASSANDRA-10769
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10769
> Project: Cassandra
>  Issue Type: Bug
> Environment: C* 2.1.11, Debian Wheezy
>Reporter: mlowicki
>
> After running scrub and cleanup on all nodes in single data center I'm 
> getting:
> {code}
> ERROR [ValidationExecutor:103] 2015-11-25 06:28:21,530 Validator.java:245 - 
> Failed creating a merkle tree for [repair 
> #89fa2b70-933d-11e5-b036-75bb514ae072 on sync/entity_by_id2, 
> (-5867793819051725444,-5865919628027816979]], /10.210.3.221 (see log for 
> details)
> ERROR [ValidationExecutor:103] 2015-11-25 06:28:21,531 
> CassandraDaemon.java:227 - Exception in thread 
> Thread[ValidationExecutor:103,1,main]
> java.lang.AssertionError: row DecoratedKey(-5867787467868737053, 
> 000932373633313036313204808800) received out of order wrt 
> DecoratedKey(-5865937851627253360, 000933313230313737333204c3c700)
> at org.apache.cassandra.repair.Validator.add(Validator.java:127) 
> ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:1010)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:94)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.compaction.CompactionManager$9.call(CompactionManager.java:622)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> ~[na:1.7.0_80]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  ~[na:1.7.0_80]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_80]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
> {code}
> What I did is to run repair on other node:
> {code}
> time nodetool repair --in-local-dc
> {code}
> Corresponding log on the node where repair has been started:
> {code}
> ERROR [AntiEntropySessions:414] 2015-11-25 06:28:21,533 
> RepairSession.java:303 - [repair #89fa2b70-933d-11e5-b036-75bb514ae072] 
> session completed with the following error
> org.apache.cassandra.exceptions.RepairException: [repair 
> #89fa2b70-933d-11e5-b036-75bb514ae072 on sync/entity_by_id2, 
> (-5867793819051725444,-5865919628027816979]] Validation failed in 
> /10.210.3.117
> at 
> org.apache.cassandra.repair.RepairSession.validationComplete(RepairSession.java:166)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.service.ActiveRepairService.handleMessage(ActiveRepairService.java:406)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:134)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) 
> ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_80]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_80]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
> INFO  [AntiEntropySessions:415] 2015-11-25 06:28:21,533 
> RepairSession.java:260 - [repair #b9458fa0-933d-11e5-b036-75bb514ae072] new 
> session: will sync /10.210.3.221, /10.210.3.118, /10.210.3.117 on range 
> (7119703141488009983,7129744584776466802] for sync.[device_token, entity2, 
> user_stats, user_device, user_quota, user_store, user_device_progress, 
> entity_by_id2]
> ERROR [AntiEntropySessions:414] 2015-11-25 06:28:21,533 
> CassandraDaemon.java:227 - Exception in thread 
> Thread[AntiEntropySessions:414,5,RMI Runtime]
> java.lang.RuntimeException: org.apache.cassandra.exceptions.RepairException: 
> [repair #89fa2b70-933d-11e5-b036-75bb514ae072 on sync/entity_by_id2, 
> (-5867793819051725444,-5865919628027816979]] Validation failed in 
> /10.210.3.117
> at com.google.common.base.Throwables.propagate(Throwables.java:160) 
> ~[guava-16.0.jar:na]
> at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32) 
> ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> ~[na:1.7.0_80]
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> ~[na:1.7.0_80]
> at 
> 

[jira] [Comment Edited] (CASSANDRA-9935) Repair fails with RuntimeException

2015-11-24 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025432#comment-15025432
 ] 

mlowicki edited comment on CASSANDRA-9935 at 11/24/15 9:23 PM:
---

Did find these session IDs on other nodes:
* 
https://www.dropbox.com/s/qtx5rzmqzl9zj47/Screenshot%202015-11-24%2022.22.03.png?dl=0
* 
https://www.dropbox.com/s/o7k0cfhscd1au50/Screenshot%202015-11-24%2022.22.19.png?dl=0


was (Author: mlowicki):
Did find these session IDs on other nodes.

> Repair fails with RuntimeException
> --
>
> Key: CASSANDRA-9935
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9935
> Project: Cassandra
>  Issue Type: Bug
> Environment: C* 2.1.8, Debian Wheezy
>Reporter: mlowicki
>Assignee: Yuki Morishita
> Fix For: 2.1.x
>
> Attachments: db1.sync.lati.osa.cassandra.log, 
> db5.sync.lati.osa.cassandra.log
>
>
> We had problems with slow repair in 2.1.7 (CASSANDRA-9702) but after upgrade 
> to 2.1.8 it started to work faster but now it fails with:
> {code}
> ...
> [2015-07-29 20:44:03,956] Repair session 23a811b0-3632-11e5-a93e-4963524a8bde 
> for range (-5474076923322749342,-5468600594078911162] finished
> [2015-07-29 20:44:03,957] Repair session 336f8740-3632-11e5-a93e-4963524a8bde 
> for range (-8631877858109464676,-8624040066373718932] finished
> [2015-07-29 20:44:03,957] Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde 
> for range (-5372806541854279315,-5369354119480076785] finished
> [2015-07-29 20:44:03,957] Repair session 59f129f0-3632-11e5-a93e-4963524a8bde 
> for range (8166489034383821955,8168408930184216281] finished
> [2015-07-29 20:44:03,957] Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde 
> for range (6084602890817326921,6088328703025510057] finished
> [2015-07-29 20:44:03,957] Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde 
> for range (-781874602493000830,-781745173070807746] finished
> [2015-07-29 20:44:03,957] Repair command #4 finished
> error: nodetool failed, check server logs
> -- StackTrace --
> java.lang.RuntimeException: nodetool failed, check server logs
> at 
> org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:290)
> at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:202)
> {code}
> After running:
> {code}
> nodetool repair --partitioner-range --parallel --in-local-dc sync
> {code}
> Last records in logs regarding repair are:
> {code}
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 09ff9e40-3632-11e5-a93e-4963524a8bde for range 
> (-7695808664784761779,-7693529816291585568] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 17d8d860-3632-11e5-a93e-4963524a8bde for range 
> (806371695398849,8065203836608925992] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 23a811b0-3632-11e5-a93e-4963524a8bde for range 
> (-5474076923322749342,-5468600594078911162] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 336f8740-3632-11e5-a93e-4963524a8bde for range 
> (-8631877858109464676,-8624040066373718932] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde for range 
> (-5372806541854279315,-5369354119480076785] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 59f129f0-3632-11e5-a93e-4963524a8bde for range 
> (8166489034383821955,8168408930184216281] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde for range 
> (6084602890817326921,6088328703025510057] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde for range 
> (-781874602493000830,-781745173070807746] finished
> {code}
> but a bit above I see (at least two times in attached log):
> {code}
> ERROR [Thread-173887] 2015-07-29 20:44:03,853 StorageService.java:2959 - 
> Repair session 1b07ea50-3608-11e5-a93e-4963524a8bde for range 
> (5765414319217852786,5781018794516851576] failed with error 
> org.apache.cassandra.exceptions.RepairException: [repair 
> #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
> (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
> org.apache.cassandra.exceptions.RepairException: [repair 
> #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
> (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
> at 

[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException

2015-11-24 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025314#comment-15025314
 ] 

mlowicki commented on CASSANDRA-9935:
-

Launched repair and got the same exception after a couple of days; grepping 
through the logs I found:

{code}
ERROR [Thread-7155] 2015-11-24 17:38:24,895 StorageService.java:2999 - Repair 
session 3c9f7d40-8e19-11e5-bda4-0d9c8928349f for range 
(-1741218705797202342,-1741060704162047213] failed with error 
java.io.IOException: Failed during snapshot creation.
java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
java.io.IOException: Failed during snapshot creation.
at java.util.concurrent.FutureTask.report(FutureTask.java:122) 
[na:1.7.0_80]
at java.util.concurrent.FutureTask.get(FutureTask.java:188) 
[na:1.7.0_80]
at 
org.apache.cassandra.service.StorageService$4.runMayThrow(StorageService.java:2990)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
[apache-cassandra-2.1.11.jar:2.1.11]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
[na:1.7.0_80]
at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
[na:1.7.0_80]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
Caused by: java.lang.RuntimeException: java.io.IOException: Failed during 
snapshot creation.
at com.google.common.base.Throwables.propagate(Throwables.java:160) 
~[guava-16.0.jar:na]
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32) 
[apache-cassandra-2.1.11.jar:2.1.11]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
[na:1.7.0_80]
at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
[na:1.7.0_80]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
~[na:1.7.0_80]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
~[na:1.7.0_80]
... 1 common frames omitted
Caused by: java.io.IOException: Failed during snapshot creation.
at 
org.apache.cassandra.repair.RepairSession.failedSnapshot(RepairSession.java:344)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.repair.RepairJob$2.onFailure(RepairJob.java:146) 
~[apache-cassandra-2.1.11.jar:2.1.11]
at com.google.common.util.concurrent.Futures$4.run(Futures.java:1172) 
~[guava-16.0.jar:na]
... 3 common frames omitted
{code}

Additionally:
{code}
ERROR [Thread-7155] 2015-11-24 17:38:24,907 StorageService.java:2999 - Repair 
session b55b4930-8e73-11e5-bda4-0d9c8928349f for range 
(5801873202797297113,5802832998541920530] failed with error 
org.apache.cassandra.exceptions.RepairException: [repair 
#b55b4930-8e73-11e5-bda4-0d9c8928349f on sync/entity2, 
(5801873202797297113,5802832998541920530]] Validation failed in /10.195.15.167
java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
org.apache.cassandra.exceptions.RepairException: [repair 
#b55b4930-8e73-11e5-bda4-0d9c8928349f on sync/entity2, 
(5801873202797297113,5802832998541920530]] Validation failed in /10.195.15.167
at java.util.concurrent.FutureTask.report(FutureTask.java:122) 
[na:1.7.0_80]
at java.util.concurrent.FutureTask.get(FutureTask.java:188) 
[na:1.7.0_80]
at 
org.apache.cassandra.service.StorageService$4.runMayThrow(StorageService.java:2990)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
[apache-cassandra-2.1.11.jar:2.1.11]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
[na:1.7.0_80]
at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
[na:1.7.0_80]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
Caused by: java.lang.RuntimeException: 
org.apache.cassandra.exceptions.RepairException: [repair 
#b55b4930-8e73-11e5-bda4-0d9c8928349f on sync/entity2, 
(5801873202797297113,5802832998541920530]] Validation failed in /10.195.15.167
at com.google.common.base.Throwables.propagate(Throwables.java:160) 
~[guava-16.0.jar:na]
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32) 
[apache-cassandra-2.1.11.jar:2.1.11]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
[na:1.7.0_80]
at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
[na:1.7.0_80]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
~[na:1.7.0_80]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
~[na:1.7.0_80]
... 1 common frames omitted
Caused by: org.apache.cassandra.exceptions.RepairException: [repair 
#b55b4930-8e73-11e5-bda4-0d9c8928349f on sync/entity2, 

[jira] [Comment Edited] (CASSANDRA-9935) Repair fails with RuntimeException

2015-11-24 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025432#comment-15025432
 ] 

mlowicki edited comment on CASSANDRA-9935 at 11/24/15 9:25 PM:
---

Didn't find these session IDs on other nodes:
* 
https://www.dropbox.com/s/qtx5rzmqzl9zj47/Screenshot%202015-11-24%2022.22.03.png?dl=0
* 
https://www.dropbox.com/s/o7k0cfhscd1au50/Screenshot%202015-11-24%2022.22.19.png?dl=0


was (Author: mlowicki):
Did find these session IDs on other nodes:
* 
https://www.dropbox.com/s/qtx5rzmqzl9zj47/Screenshot%202015-11-24%2022.22.03.png?dl=0
* 
https://www.dropbox.com/s/o7k0cfhscd1au50/Screenshot%202015-11-24%2022.22.19.png?dl=0

> Repair fails with RuntimeException
> --
>
> Key: CASSANDRA-9935
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9935
> Project: Cassandra
>  Issue Type: Bug
> Environment: C* 2.1.8, Debian Wheezy
>Reporter: mlowicki
>Assignee: Yuki Morishita
> Fix For: 2.1.x
>
> Attachments: db1.sync.lati.osa.cassandra.log, 
> db5.sync.lati.osa.cassandra.log
>
>
> We had problems with slow repair in 2.1.7 (CASSANDRA-9702) but after upgrade 
> to 2.1.8 it started to work faster but now it fails with:
> {code}
> ...
> [2015-07-29 20:44:03,956] Repair session 23a811b0-3632-11e5-a93e-4963524a8bde 
> for range (-5474076923322749342,-5468600594078911162] finished
> [2015-07-29 20:44:03,957] Repair session 336f8740-3632-11e5-a93e-4963524a8bde 
> for range (-8631877858109464676,-8624040066373718932] finished
> [2015-07-29 20:44:03,957] Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde 
> for range (-5372806541854279315,-5369354119480076785] finished
> [2015-07-29 20:44:03,957] Repair session 59f129f0-3632-11e5-a93e-4963524a8bde 
> for range (8166489034383821955,8168408930184216281] finished
> [2015-07-29 20:44:03,957] Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde 
> for range (6084602890817326921,6088328703025510057] finished
> [2015-07-29 20:44:03,957] Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde 
> for range (-781874602493000830,-781745173070807746] finished
> [2015-07-29 20:44:03,957] Repair command #4 finished
> error: nodetool failed, check server logs
> -- StackTrace --
> java.lang.RuntimeException: nodetool failed, check server logs
> at 
> org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:290)
> at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:202)
> {code}
> After running:
> {code}
> nodetool repair --partitioner-range --parallel --in-local-dc sync
> {code}
> Last records in logs regarding repair are:
> {code}
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 09ff9e40-3632-11e5-a93e-4963524a8bde for range 
> (-7695808664784761779,-7693529816291585568] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 17d8d860-3632-11e5-a93e-4963524a8bde for range 
> (806371695398849,8065203836608925992] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 23a811b0-3632-11e5-a93e-4963524a8bde for range 
> (-5474076923322749342,-5468600594078911162] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 336f8740-3632-11e5-a93e-4963524a8bde for range 
> (-8631877858109464676,-8624040066373718932] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde for range 
> (-5372806541854279315,-5369354119480076785] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 59f129f0-3632-11e5-a93e-4963524a8bde for range 
> (8166489034383821955,8168408930184216281] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde for range 
> (6084602890817326921,6088328703025510057] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde for range 
> (-781874602493000830,-781745173070807746] finished
> {code}
> but a bit above I see (at least two times in attached log):
> {code}
> ERROR [Thread-173887] 2015-07-29 20:44:03,853 StorageService.java:2959 - 
> Repair session 1b07ea50-3608-11e5-a93e-4963524a8bde for range 
> (5765414319217852786,5781018794516851576] failed with error 
> org.apache.cassandra.exceptions.RepairException: [repair 
> #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
> (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
> org.apache.cassandra.exceptions.RepairException: [repair 
> 

[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException

2015-11-24 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025516#comment-15025516
 ] 

mlowicki commented on CASSANDRA-9935:
-

Nothing found. I checked system.log.1.zip in /var/log/cassandra on each box, 
but found those session IDs only on db8.lati (where the repair was started).

> Repair fails with RuntimeException
> --
>
> Key: CASSANDRA-9935
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9935
> Project: Cassandra
>  Issue Type: Bug
> Environment: C* 2.1.8, Debian Wheezy
>Reporter: mlowicki
>Assignee: Yuki Morishita
> Fix For: 2.1.x
>
> Attachments: db1.sync.lati.osa.cassandra.log, 
> db5.sync.lati.osa.cassandra.log
>
>
> We had problems with slow repair in 2.1.7 (CASSANDRA-9702) but after upgrade 
> to 2.1.8 it started to work faster but now it fails with:
> {code}
> ...
> [2015-07-29 20:44:03,956] Repair session 23a811b0-3632-11e5-a93e-4963524a8bde 
> for range (-5474076923322749342,-5468600594078911162] finished
> [2015-07-29 20:44:03,957] Repair session 336f8740-3632-11e5-a93e-4963524a8bde 
> for range (-8631877858109464676,-8624040066373718932] finished
> [2015-07-29 20:44:03,957] Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde 
> for range (-5372806541854279315,-5369354119480076785] finished
> [2015-07-29 20:44:03,957] Repair session 59f129f0-3632-11e5-a93e-4963524a8bde 
> for range (8166489034383821955,8168408930184216281] finished
> [2015-07-29 20:44:03,957] Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde 
> for range (6084602890817326921,6088328703025510057] finished
> [2015-07-29 20:44:03,957] Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde 
> for range (-781874602493000830,-781745173070807746] finished
> [2015-07-29 20:44:03,957] Repair command #4 finished
> error: nodetool failed, check server logs
> -- StackTrace --
> java.lang.RuntimeException: nodetool failed, check server logs
> at 
> org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:290)
> at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:202)
> {code}
> After running:
> {code}
> nodetool repair --partitioner-range --parallel --in-local-dc sync
> {code}
> Last records in logs regarding repair are:
> {code}
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 09ff9e40-3632-11e5-a93e-4963524a8bde for range 
> (-7695808664784761779,-7693529816291585568] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 17d8d860-3632-11e5-a93e-4963524a8bde for range 
> (806371695398849,8065203836608925992] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 23a811b0-3632-11e5-a93e-4963524a8bde for range 
> (-5474076923322749342,-5468600594078911162] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 336f8740-3632-11e5-a93e-4963524a8bde for range 
> (-8631877858109464676,-8624040066373718932] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde for range 
> (-5372806541854279315,-5369354119480076785] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 59f129f0-3632-11e5-a93e-4963524a8bde for range 
> (8166489034383821955,8168408930184216281] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde for range 
> (6084602890817326921,6088328703025510057] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde for range 
> (-781874602493000830,-781745173070807746] finished
> {code}
> but a bit above I see (at least two times in attached log):
> {code}
> ERROR [Thread-173887] 2015-07-29 20:44:03,853 StorageService.java:2959 - 
> Repair session 1b07ea50-3608-11e5-a93e-4963524a8bde for range 
> (5765414319217852786,5781018794516851576] failed with error 
> org.apache.cassandra.exceptions.RepairException: [repair 
> #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
> (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
> org.apache.cassandra.exceptions.RepairException: [repair 
> #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
> (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
> at java.util.concurrent.FutureTask.report(FutureTask.java:122) 
> [na:1.7.0_80]
> at java.util.concurrent.FutureTask.get(FutureTask.java:188) 
> [na:1.7.0_80]
> at 
> 

[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException

2015-11-24 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025432#comment-15025432
 ] 

mlowicki commented on CASSANDRA-9935:
-

Did find these session IDs on other nodes.

> Repair fails with RuntimeException
> --
>
> Key: CASSANDRA-9935
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9935
> Project: Cassandra
>  Issue Type: Bug
> Environment: C* 2.1.8, Debian Wheezy
>Reporter: mlowicki
>Assignee: Yuki Morishita
> Fix For: 2.1.x
>
> Attachments: db1.sync.lati.osa.cassandra.log, 
> db5.sync.lati.osa.cassandra.log
>
>
> We had problems with slow repair in 2.1.7 (CASSANDRA-9702) but after upgrade 
> to 2.1.8 it started to work faster but now it fails with:
> {code}
> ...
> [2015-07-29 20:44:03,956] Repair session 23a811b0-3632-11e5-a93e-4963524a8bde 
> for range (-5474076923322749342,-5468600594078911162] finished
> [2015-07-29 20:44:03,957] Repair session 336f8740-3632-11e5-a93e-4963524a8bde 
> for range (-8631877858109464676,-8624040066373718932] finished
> [2015-07-29 20:44:03,957] Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde 
> for range (-5372806541854279315,-5369354119480076785] finished
> [2015-07-29 20:44:03,957] Repair session 59f129f0-3632-11e5-a93e-4963524a8bde 
> for range (8166489034383821955,8168408930184216281] finished
> [2015-07-29 20:44:03,957] Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde 
> for range (6084602890817326921,6088328703025510057] finished
> [2015-07-29 20:44:03,957] Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde 
> for range (-781874602493000830,-781745173070807746] finished
> [2015-07-29 20:44:03,957] Repair command #4 finished
> error: nodetool failed, check server logs
> -- StackTrace --
> java.lang.RuntimeException: nodetool failed, check server logs
> at 
> org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:290)
> at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:202)
> {code}
> After running:
> {code}
> nodetool repair --partitioner-range --parallel --in-local-dc sync
> {code}
> Last records in logs regarding repair are:
> {code}
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 09ff9e40-3632-11e5-a93e-4963524a8bde for range 
> (-7695808664784761779,-7693529816291585568] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 17d8d860-3632-11e5-a93e-4963524a8bde for range 
> (806371695398849,8065203836608925992] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 23a811b0-3632-11e5-a93e-4963524a8bde for range 
> (-5474076923322749342,-5468600594078911162] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 336f8740-3632-11e5-a93e-4963524a8bde for range 
> (-8631877858109464676,-8624040066373718932] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde for range 
> (-5372806541854279315,-5369354119480076785] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 59f129f0-3632-11e5-a93e-4963524a8bde for range 
> (8166489034383821955,8168408930184216281] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde for range 
> (6084602890817326921,6088328703025510057] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde for range 
> (-781874602493000830,-781745173070807746] finished
> {code}
> but a bit above I see (at least two times in attached log):
> {code}
> ERROR [Thread-173887] 2015-07-29 20:44:03,853 StorageService.java:2959 - 
> Repair session 1b07ea50-3608-11e5-a93e-4963524a8bde for range 
> (5765414319217852786,5781018794516851576] failed with error 
> org.apache.cassandra.exceptions.RepairException: [repair 
> #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
> (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
> org.apache.cassandra.exceptions.RepairException: [repair 
> #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
> (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
> at java.util.concurrent.FutureTask.report(FutureTask.java:122) 
> [na:1.7.0_80]
> at java.util.concurrent.FutureTask.get(FutureTask.java:188) 
> [na:1.7.0_80]
> at 
> org.apache.cassandra.service.StorageService$4.runMayThrow(StorageService.java:2950)
>  ~[apache-cassandra-2.1.8.jar:2.1.8]
> at 
> 

[jira] [Created] (CASSANDRA-10744) Option to monitor pending compaction tasks per type

2015-11-20 Thread mlowicki (JIRA)
mlowicki created CASSANDRA-10744:


 Summary: Option to monitor pending compaction tasks per type
 Key: CASSANDRA-10744
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10744
 Project: Cassandra
  Issue Type: Wish
Reporter: mlowicki
 Attachments: compaction_monitoring.png

There is 
{{org.apache.cassandra.metrics:type=ColumnFamily,name=PendingCompactions}} 
which can help visualise the number of pending compaction tasks (see attached 
screenshot). Unfortunately there is no way to distinguish what kind of tasks 
sit in this queue, e.g. how many SCRUB, COMPACTION, VALIDATION, CLEANUP or 
INDEX_BUILD tasks are there.
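
For reference, the existing aggregate gauge can already be polled over JMX; it 
only reports a single number, which is why a per-type breakdown would need 
additional metrics. A minimal sketch is below, assuming the default JMX port 
7199, no authentication, and that the gauge exposes its reading via a {{Value}} 
attribute (adjust for your setup):

{code}
// Minimal sketch, not an official tool. Assumptions: Cassandra JMX on
// localhost:7199 without auth, and the gauge readable via its "Value" attribute.
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class PendingCompactionsProbe {
    public static void main(String[] args) throws Exception {
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:7199/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            ObjectName gauge = new ObjectName(
                    "org.apache.cassandra.metrics:type=ColumnFamily,name=PendingCompactions");
            // Single aggregate number only; the SCRUB/COMPACTION/VALIDATION/...
            // breakdown requested in this ticket is not available from it.
            Object pending = mbs.getAttribute(gauge, "Value");
            System.out.println("Pending compaction tasks: " + pending);
        }
    }
}
{code}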



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-10697) Leak detected while running offline scrub

2015-11-12 Thread mlowicki (JIRA)
mlowicki created CASSANDRA-10697:


 Summary: Leak detected while running offline scrub
 Key: CASSANDRA-10697
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10697
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.9 on Debian Wheezy
Reporter: mlowicki
Priority: Critical


I got a couple of these:
{code}
ERROR 05:09:15 LEAK DETECTED: a reference 
(org.apache.cassandra.utils.concurrent.Ref$State@3b60e162) to class 
org.apache.cassandra.io.sstable.SSTableReader$InstanceTidier@1433208674:/var/lib/cassandra/data/sync/entity2-e24b5040199b11e5a30f75bb514ae072/sync-entity2-ka-405434
 was not released before the reference was garbage collected
{code}

and then:
{code}
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space

at 
org.apache.cassandra.io.compress.CompressedRandomAccessReader.decompressChunk(CompressedRandomAccessReader.java:99)

at 
org.apache.cassandra.io.compress.CompressedRandomAccessReader.reBuffer(CompressedRandomAccessReader.java:81)

at 
org.apache.cassandra.io.util.RandomAccessReader.read(RandomAccessReader.java:353)

at java.io.RandomAccessFile.readFully(RandomAccessFile.java:444)

at java.io.RandomAccessFile.readFully(RandomAccessFile.java:424)

at 
org.apache.cassandra.io.util.RandomAccessReader.readBytes(RandomAccessReader.java:378)

at 
org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:348)

at 
org.apache.cassandra.utils.ByteBufferUtil.readWithShortLength(ByteBufferUtil.java:327)

at 
org.apache.cassandra.db.composites.AbstractCType$Serializer.deserialize(AbstractCType.java:397)

at 
org.apache.cassandra.db.composites.AbstractCType$Serializer.deserialize(AbstractCType.java:381)

at 
org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:75)

at 
org.apache.cassandra.db.AbstractCell$1.computeNext(AbstractCell.java:52)

at 
org.apache.cassandra.db.AbstractCell$1.computeNext(AbstractCell.java:46)

at 
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)

at 
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)

at 
org.apache.cassandra.io.sstable.SSTableIdentityIterator.hasNext(SSTableIdentityIterator.java:120)

at 
org.apache.cassandra.utils.MergeIterator$OneToOne.computeNext(MergeIterator.java:202)

at 
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)

at 
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)

at com.google.common.collect.Iterators$7.computeNext(Iterators.java:645)

at 
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)

at 
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)

at 
org.apache.cassandra.db.ColumnIndex$Builder.buildForCompaction(ColumnIndex.java:165)

at 
org.apache.cassandra.db.compaction.LazilyCompactedRow.write(LazilyCompactedRow.java:121)

at 
org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:192)

at 
org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTableRewriter.java:127)

at 
org.apache.cassandra.io.sstable.SSTableRewriter.tryAppend(SSTableRewriter.java:158)

at org.apache.cassandra.db.compaction.Scrubber.scrub(Scrubber.java:220)

at 
org.apache.cassandra.tools.StandaloneScrubber.main(StandaloneScrubber.java:116)
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-10689) java.lang.OutOfMemoryError: Direct buffer memory

2015-11-11 Thread mlowicki (JIRA)
mlowicki created CASSANDRA-10689:


 Summary: java.lang.OutOfMemoryError: Direct buffer memory
 Key: CASSANDRA-10689
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10689
 Project: Cassandra
  Issue Type: Bug
Reporter: mlowicki
 Fix For: 2.1.11


{code}
ERROR [SharedPool-Worker-63] 2015-11-11 17:53:16,161 
JVMStabilityInspector.java:117 - JVM state determined to be unstable.  Exiting 
forcefully due to:

java.lang.OutOfMemoryError: Direct buffer memory

at java.nio.Bits.reserveMemory(Bits.java:658) ~[na:1.7.0_80]

at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123) 
~[na:1.7.0_80]

at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306) 
~[na:1.7.0_80]

at sun.nio.ch.Util.getTemporaryDirectBuffer(Util.java:174) 
~[na:1.7.0_80]

at sun.nio.ch.IOUtil.read(IOUtil.java:195) ~[na:1.7.0_80]

at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:149) 
~[na:1.7.0_80]

at 
org.apache.cassandra.io.compress.CompressedRandomAccessReader.decompressChunk(CompressedRandomAccessReader.java:104)
 ~[apache-cassandra-2.1.11.jar:2.1.11]

at 
org.apache.cassandra.io.compress.CompressedRandomAccessReader.reBuffer(CompressedRandomAccessReader.java:81)
 ~[apache-cassandra-2.1.11.jar:2.1.11]  

at 
org.apache.cassandra.io.util.RandomAccessReader.seek(RandomAccessReader.java:310)
 ~[apache-cassandra-2.1.11.jar:2.1.11]

at 
org.apache.cassandra.io.util.PoolingSegmentedFile.getSegment(PoolingSegmentedFile.java:64)
 ~[apache-cassandra-2.1.11.jar:2.1.11]

at 
org.apache.cassandra.io.sstable.SSTableReader.getFileDataInput(SSTableReader.java:1894)
 ~[apache-cassandra-2.1.11.jar:2.1.11]

at 
org.apache.cassandra.db.columniterator.IndexedSliceReader.setToRowStart(IndexedSliceReader.java:107)
 ~[apache-cassandra-2.1.11.jar:2.1.11]

at 
org.apache.cassandra.db.columniterator.IndexedSliceReader.<init>(IndexedSliceReader.java:83)
 ~[apache-cassandra-2.1.11.jar:2.1.11]

at 
org.apache.cassandra.db.columniterator.SSTableSliceIterator.createReader(SSTableSliceIterator.java:65)
 ~[apache-cassandra-2.1.11.jar:2.1.11]

at 
org.apache.cassandra.db.columniterator.SSTableSliceIterator.<init>(SSTableSliceIterator.java:42)
 ~[apache-cassandra-2.1.11.jar:2.1.11]

at 
org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:246)
 ~[apache-cassandra-2.1.11.jar:2.1.11]

at 
org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:62)
 ~[apache-cassandra-2.1.11.jar:2.1.11]

at 
org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:270)
 ~[apache-cassandra-2.1.11.jar:2.1.11]

at 
org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:62)
 ~[apache-cassandra-2.1.11.jar:2.1.11]

at 
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1994)
 ~[apache-cassandra-2.1.11.jar:2.1.11]

at 
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1837)
 ~[apache-cassandra-2.1.11.jar:2.1.11]

at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:353) 
~[apache-cassandra-2.1.11.jar:2.1.11]

at 
org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:85)
 ~[apache-cassandra-2.1.11.jar:2.1.11]

at 
org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:47) 
~[apache-cassandra-2.1.11.jar:2.1.11]

at 
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) 
~[apache-cassandra-2.1.11.jar:2.1.11]

at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
~[na:1.7.0_80]

at 
org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164)
 ~[apache-cassandra-2.1.11.jar:2.1.11]

at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
[apache-cassandra-2.1.11.jar:2.1.11]

at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10689) java.lang.OutOfMemoryError: Direct buffer memory

2015-11-11 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15000901#comment-15000901
 ] 

mlowicki commented on CASSANDRA-10689:
--

After upgrading from 2.1.9 to 2.1.11 two days ago I'm getting lots of:
{code}
WARN  [SharedPool-Worker-28] 2015-11-11 19:01:22,409 
AbstractTracingAwareExecutorService.java:169 - Uncaught exception on thread 
Thread[SharedPool-Worker-28,5,main]: {}
org.apache.cassandra.io.sstable.CorruptSSTableException: 
org.apache.cassandra.io.compress.CorruptBlockException: 
(/var/lib/cassandra/data2/sync/entity2-e24b5040199b11e5a30f75bb514ae072/sync-entity2-ka-392603-Data.db):
 corruption detected, chunk at 11612338 of length 156219476.
at 
org.apache.cassandra.io.compress.CompressedRandomAccessReader.reBuffer(CompressedRandomAccessReader.java:85)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.io.util.RandomAccessReader.seek(RandomAccessReader.java:310)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.io.util.PoolingSegmentedFile.getSegment(PoolingSegmentedFile.java:64)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.io.sstable.SSTableReader.getFileDataInput(SSTableReader.java:1894)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.db.columniterator.IndexedSliceReader.setToRowStart(IndexedSliceReader.java:107)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.db.columniterator.IndexedSliceReader.<init>(IndexedSliceReader.java:83)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.db.columniterator.SSTableSliceIterator.createReader(SSTableSliceIterator.java:65)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.db.columniterator.SSTableSliceIterator.<init>(SSTableSliceIterator.java:42)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:246)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:62)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:270)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:62)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1994)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1837)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:353) 
~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:85)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:47) 
~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) 
~[apache-cassandra-2.1.11.jar:2.1.11]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
~[na:1.7.0_80]
at 
org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
[apache-cassandra-2.1.11.jar:2.1.11]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
Caused by: org.apache.cassandra.io.compress.CorruptBlockException: 
(/var/lib/cassandra/data2/sync/entity2-e24b5040199b11e5a30f75bb514ae072/sync-entity2-ka-392603-Data.db):
 corruption detected, chunk at 11612338 of length 156219476.
at 
org.apache.cassandra.io.compress.CompressedRandomAccessReader.decompressChunk(CompressedRandomAccessReader.java:116)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.io.compress.CompressedRandomAccessReader.reBuffer(CompressedRandomAccessReader.java:81)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
... 21 common frames omitted
Caused by: java.io.IOException: Compressed lengths mismatch
at 
org.apache.cassandra.io.compress.LZ4Compressor.uncompress(LZ4Compressor.java:98)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.io.compress.CompressedRandomAccessReader.decompressChunk(CompressedRandomAccessReader.java:112)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
... 22 common frames omitted
{code}

This happens on 3 out of 7 nodes in one data center.
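
For reference, a quick way to see which SSTables a node is reporting as corrupt, and to scrub only the affected table, is sketched below; the paths and the sync/entity2 table come from the trace above, and this is a minimal sketch rather than a prescribed procedure:
{code}
# list SSTable files named in CorruptBlockException messages
grep -o '/var/lib/cassandra/[^)]*-Data.db' /var/log/cassandra/system.log | sort -u

# online scrub limited to the affected keyspace/table
nodetool scrub sync entity2
{code}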

> java.lang.OutOfMemoryError: Direct buffer memory
> 
>
> Key: CASSANDRA-10689
> URL: 

[jira] [Updated] (CASSANDRA-10689) java.lang.OutOfMemoryError: Direct buffer memory

2015-11-11 Thread mlowicki (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mlowicki updated CASSANDRA-10689:
-
Reproduced In: 2.1.11
Fix Version/s: (was: 2.1.11)

> java.lang.OutOfMemoryError: Direct buffer memory
> 
>
> Key: CASSANDRA-10689
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10689
> Project: Cassandra
>  Issue Type: Bug
>Reporter: mlowicki
>
> {code}
> ERROR [SharedPool-Worker-63] 2015-11-11 17:53:16,161 
> JVMStabilityInspector.java:117 - JVM state determined to be unstable.  
> Exiting forcefully due to:
> java.lang.OutOfMemoryError: Direct buffer memory
> at java.nio.Bits.reserveMemory(Bits.java:658) ~[na:1.7.0_80]
> at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123) 
> ~[na:1.7.0_80]
> at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306) 
> ~[na:1.7.0_80]
> at sun.nio.ch.Util.getTemporaryDirectBuffer(Util.java:174) 
> ~[na:1.7.0_80]
> at sun.nio.ch.IOUtil.read(IOUtil.java:195) ~[na:1.7.0_80]
> at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:149) 
> ~[na:1.7.0_80]
> at 
> org.apache.cassandra.io.compress.CompressedRandomAccessReader.decompressChunk(CompressedRandomAccessReader.java:104)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.io.compress.CompressedRandomAccessReader.reBuffer(CompressedRandomAccessReader.java:81)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]  
> at 
> org.apache.cassandra.io.util.RandomAccessReader.seek(RandomAccessReader.java:310)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.io.util.PoolingSegmentedFile.getSegment(PoolingSegmentedFile.java:64)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.io.sstable.SSTableReader.getFileDataInput(SSTableReader.java:1894)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.columniterator.IndexedSliceReader.setToRowStart(IndexedSliceReader.java:107)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.columniterator.IndexedSliceReader.<init>(IndexedSliceReader.java:83)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.columniterator.SSTableSliceIterator.createReader(SSTableSliceIterator.java:65)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.columniterator.SSTableSliceIterator.<init>(SSTableSliceIterator.java:42)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:246)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:62)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:270)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:62)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1994)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1837)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:353) 
> ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:85)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:47) 
> ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) 
> ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> ~[na:1.7.0_80]
> at 
> org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
> [apache-cassandra-2.1.11.jar:2.1.11]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10689) java.lang.OutOfMemoryError: Direct buffer memory

2015-11-11 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15001037#comment-15001037
 ] 

mlowicki commented on CASSANDRA-10689:
--

Running {{scrub}} on nodes with corrupted blocks gives:
{code}
root@db7:~# time nodetool scrub sync entity2



error: null
-- StackTrace --
java.io.EOFException
at java.io.DataInputStream.readByte(DataInputStream.java:267)
at 
sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:214)
at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:161)
at com.sun.jmx.remote.internal.PRef.invoke(Unknown Source)
at javax.management.remote.rmi.RMIConnectionImpl_Stub.invoke(Unknown 
Source)
at 
javax.management.remote.rmi.RMIConnector$RemoteMBeanServerConnection.invoke(RMIConnector.java:1022)
at 
javax.management.MBeanServerInvocationHandler.invoke(MBeanServerInvocationHandler.java:292)
at com.sun.proxy.$Proxy7.scrub(Unknown Source)
at org.apache.cassandra.tools.NodeProbe.scrub(NodeProbe.java:247)
at org.apache.cassandra.tools.NodeProbe.scrub(NodeProbe.java:266)
at org.apache.cassandra.tools.NodeTool$Scrub.execute(NodeTool.java:1277)
at 
org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:289)
at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:203)


real    11m38.347s
user    0m2.356s
sys     0m0.168s
{code}
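
The EOFException above comes from the RMI/JMX layer (sun.rmi.transport and RMIConnector in the trace), which suggests the nodetool client lost its JMX connection rather than the scrub itself failing; a long scrub can easily outlive the client. A minimal sketch for checking whether it is still running server-side, assuming the default log path:
{code}
# scrub tasks run through the compaction manager, so they should show up here while in progress
nodetool compactionstats

# check the server log for scrub progress or errors
grep -i scrub /var/log/cassandra/system.log | tail -20
{code}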

> java.lang.OutOfMemoryError: Direct buffer memory
> 
>
> Key: CASSANDRA-10689
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10689
> Project: Cassandra
>  Issue Type: Bug
>Reporter: mlowicki
>
> {code}
> ERROR [SharedPool-Worker-63] 2015-11-11 17:53:16,161 
> JVMStabilityInspector.java:117 - JVM state determined to be unstable.  
> Exiting forcefully due to:
> java.lang.OutOfMemoryError: Direct buffer memory
> at java.nio.Bits.reserveMemory(Bits.java:658) ~[na:1.7.0_80]
> at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123) 
> ~[na:1.7.0_80]
> at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306) 
> ~[na:1.7.0_80]
> at sun.nio.ch.Util.getTemporaryDirectBuffer(Util.java:174) 
> ~[na:1.7.0_80]
> at sun.nio.ch.IOUtil.read(IOUtil.java:195) ~[na:1.7.0_80]
> at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:149) 
> ~[na:1.7.0_80]
> at 
> org.apache.cassandra.io.compress.CompressedRandomAccessReader.decompressChunk(CompressedRandomAccessReader.java:104)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.io.compress.CompressedRandomAccessReader.reBuffer(CompressedRandomAccessReader.java:81)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]  
> at 
> org.apache.cassandra.io.util.RandomAccessReader.seek(RandomAccessReader.java:310)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.io.util.PoolingSegmentedFile.getSegment(PoolingSegmentedFile.java:64)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.io.sstable.SSTableReader.getFileDataInput(SSTableReader.java:1894)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.columniterator.IndexedSliceReader.setToRowStart(IndexedSliceReader.java:107)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.columniterator.IndexedSliceReader.<init>(IndexedSliceReader.java:83)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.columniterator.SSTableSliceIterator.createReader(SSTableSliceIterator.java:65)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.columniterator.SSTableSliceIterator.<init>(SSTableSliceIterator.java:42)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:246)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:62)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:270)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:62)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1994)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1837)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:353) 
> ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:85)

[jira] [Updated] (CASSANDRA-10676) AssertionError in CompactionExecutor

2015-11-09 Thread mlowicki (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mlowicki updated CASSANDRA-10676:
-
Fix Version/s: 2.1.9

> AssertionError in CompactionExecutor
> 
>
> Key: CASSANDRA-10676
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10676
> Project: Cassandra
>  Issue Type: Bug
> Environment: C* 2.1.9
>Reporter: mlowicki
> Fix For: 2.1.9
>
>
> {code}
> ERROR [CompactionExecutor:33329] 2015-11-09 08:16:22,759 
> CassandraDaemon.java:223 - Exception in thread 
> Thread[CompactionExecutor:33329,1,main]
> java.lang.AssertionError: 
> /var/lib/cassandra/data/system/compactions_in_progress-55080ab05d9c388690a4acb25fe1f77b/system-compactions_in_progress-ka-888705-Data.db
>   at 
> org.apache.cassandra.io.sstable.SSTableReader.getApproximateKeyCount(SSTableReader.java:279)
>  ~[apache-cassandra-2.1.9.jar:2.1.9]
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:151)
>  ~[apache-cassandra-2.1.9.jar:2.1.9]
>   at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
> ~[apache-cassandra-2.1.9.jar:2.1.9]
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:73)
>  ~[apache-cassandra-2.1.9.jar:2.1.9]
>   at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
>  ~[apache-cassandra-2.1.9.jar:2.1.9]
>   at 
> org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:236)
>  ~[apache-cassandra-2.1.9.jar:2.1.9]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> ~[na:1.7.0_80]
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> ~[na:1.7.0_80]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  ~[na:1.7.0_80]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_80]
>   at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
> ^C
> root@db1:~# tail -f /var/log/cassandra/system.log
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:151)
>  ~[apache-cassandra-2.1.9.jar:2.1.9]
>   at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
> ~[apache-cassandra-2.1.9.jar:2.1.9]
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:73)
>  ~[apache-cassandra-2.1.9.jar:2.1.9]
>   at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
>  ~[apache-cassandra-2.1.9.jar:2.1.9]
>   at 
> org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:236)
>  ~[apache-cassandra-2.1.9.jar:2.1.9]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> ~[na:1.7.0_80]
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> ~[na:1.7.0_80]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  ~[na:1.7.0_80]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_80]
>   at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-10676) AssertionError in CompactionExecutor

2015-11-09 Thread mlowicki (JIRA)
mlowicki created CASSANDRA-10676:


 Summary: AssertionError in CompactionExecutor
 Key: CASSANDRA-10676
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10676
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.9
Reporter: mlowicki


{code}
ERROR [CompactionExecutor:33329] 2015-11-09 08:16:22,759 
CassandraDaemon.java:223 - Exception in thread 
Thread[CompactionExecutor:33329,1,main]
java.lang.AssertionError: 
/var/lib/cassandra/data/system/compactions_in_progress-55080ab05d9c388690a4acb25fe1f77b/system-compactions_in_progress-ka-888705-Data.db
at 
org.apache.cassandra.io.sstable.SSTableReader.getApproximateKeyCount(SSTableReader.java:279)
 ~[apache-cassandra-2.1.9.jar:2.1.9]
at 
org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:151)
 ~[apache-cassandra-2.1.9.jar:2.1.9]
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
~[apache-cassandra-2.1.9.jar:2.1.9]
at 
org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:73)
 ~[apache-cassandra-2.1.9.jar:2.1.9]
at 
org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
 ~[apache-cassandra-2.1.9.jar:2.1.9]
at 
org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:236)
 ~[apache-cassandra-2.1.9.jar:2.1.9]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
~[na:1.7.0_80]
at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
~[na:1.7.0_80]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
~[na:1.7.0_80]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_80]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
^C
root@db1:~# tail -f /var/log/cassandra/system.log
at 
org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:151)
 ~[apache-cassandra-2.1.9.jar:2.1.9]
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
~[apache-cassandra-2.1.9.jar:2.1.9]
at 
org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:73)
 ~[apache-cassandra-2.1.9.jar:2.1.9]
at 
org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
 ~[apache-cassandra-2.1.9.jar:2.1.9]
at 
org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:236)
 ~[apache-cassandra-2.1.9.jar:2.1.9]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
~[na:1.7.0_80]
at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
~[na:1.7.0_80]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
~[na:1.7.0_80]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_80]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-10676) AssertionError in CompactionExecutor

2015-11-09 Thread mlowicki (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mlowicki updated CASSANDRA-10676:
-
Environment: C* 2.1.9 on Debian Wheezy  (was: C* 2.1.9)

> AssertionError in CompactionExecutor
> 
>
> Key: CASSANDRA-10676
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10676
> Project: Cassandra
>  Issue Type: Bug
> Environment: C* 2.1.9 on Debian Wheezy
>Reporter: mlowicki
> Fix For: 2.1.9
>
>
> {code}
> ERROR [CompactionExecutor:33329] 2015-11-09 08:16:22,759 
> CassandraDaemon.java:223 - Exception in thread 
> Thread[CompactionExecutor:33329,1,main]
> java.lang.AssertionError: 
> /var/lib/cassandra/data/system/compactions_in_progress-55080ab05d9c388690a4acb25fe1f77b/system-compactions_in_progress-ka-888705-Data.db
>   at 
> org.apache.cassandra.io.sstable.SSTableReader.getApproximateKeyCount(SSTableReader.java:279)
>  ~[apache-cassandra-2.1.9.jar:2.1.9]
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:151)
>  ~[apache-cassandra-2.1.9.jar:2.1.9]
>   at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
> ~[apache-cassandra-2.1.9.jar:2.1.9]
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:73)
>  ~[apache-cassandra-2.1.9.jar:2.1.9]
>   at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
>  ~[apache-cassandra-2.1.9.jar:2.1.9]
>   at 
> org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:236)
>  ~[apache-cassandra-2.1.9.jar:2.1.9]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> ~[na:1.7.0_80]
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> ~[na:1.7.0_80]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  ~[na:1.7.0_80]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_80]
>   at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
> ^C
> root@db1:~# tail -f /var/log/cassandra/system.log
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:151)
>  ~[apache-cassandra-2.1.9.jar:2.1.9]
>   at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
> ~[apache-cassandra-2.1.9.jar:2.1.9]
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:73)
>  ~[apache-cassandra-2.1.9.jar:2.1.9]
>   at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
>  ~[apache-cassandra-2.1.9.jar:2.1.9]
>   at 
> org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:236)
>  ~[apache-cassandra-2.1.9.jar:2.1.9]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> ~[na:1.7.0_80]
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> ~[na:1.7.0_80]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  ~[na:1.7.0_80]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_80]
>   at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8696) nodetool repair on cassandra 2.1.2 keyspaces return java.lang.RuntimeException: Could not create snapshot

2015-09-16 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790743#comment-14790743
 ] 

mlowicki commented on CASSANDRA-8696:
-

[~folex] [~yukim] looks like this is the same as CASSANDRA-9935. In my C* 2.1.8 
cluster it's 100% reproducible.

> nodetool repair on cassandra 2.1.2 keyspaces return 
> java.lang.RuntimeException: Could not create snapshot
> -
>
> Key: CASSANDRA-8696
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8696
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jeff Liu
>Assignee: Yuki Morishita
> Fix For: 2.1.x
>
> Attachments: Logs.zip
>
>
> When trying to run nodetool repair -pr on cassandra node ( 2.1.2), cassandra 
> throw java exceptions: cannot create snapshot. 
> the error log from system.log:
> {noformat}
> INFO  [STREAM-IN-/10.97.9.110] 2015-01-28 02:07:28,815 
> StreamResultFuture.java:166 - [Stream #692c1450-a692-11e4-9973-070e938df227 
> ID#0] Prepare completed. Receiving 2 files(221187 bytes), sending 5 
> files(632105 bytes)
> INFO  [STREAM-IN-/10.97.9.110] 2015-01-28 02:07:29,046 
> StreamResultFuture.java:180 - [Stream #692c1450-a692-11e4-9973-070e938df227] 
> Session with /10.97.9.110 is complete
> INFO  [STREAM-IN-/10.97.9.110] 2015-01-28 02:07:29,046 
> StreamResultFuture.java:212 - [Stream #692c1450-a692-11e4-9973-070e938df227] 
> All sessions completed
> INFO  [STREAM-IN-/10.97.9.110] 2015-01-28 02:07:29,047 
> StreamingRepairTask.java:96 - [repair #685e3d00-a692-11e4-9973-070e938df227] 
> streaming task succeed, returning response to /10.98.194.68
> INFO  [RepairJobTask:1] 2015-01-28 02:07:29,065 StreamResultFuture.java:86 - 
> [Stream #692c6270-a692-11e4-9973-070e938df227] Executing streaming plan for 
> Repair
> INFO  [StreamConnectionEstablisher:4] 2015-01-28 02:07:29,065 
> StreamSession.java:213 - [Stream #692c6270-a692-11e4-9973-070e938df227] 
> Starting streaming to /10.66.187.201
> INFO  [StreamConnectionEstablisher:4] 2015-01-28 02:07:29,070 
> StreamCoordinator.java:209 - [Stream #692c6270-a692-11e4-9973-070e938df227, 
> ID#0] Beginning stream session with /10.66.187.201
> INFO  [STREAM-IN-/10.66.187.201] 2015-01-28 02:07:29,465 
> StreamResultFuture.java:166 - [Stream #692c6270-a692-11e4-9973-070e938df227 
> ID#0] Prepare completed. Receiving 5 files(627994 bytes), sending 5 
> files(632105 bytes)
> INFO  [StreamReceiveTask:22] 2015-01-28 02:07:31,971 
> StreamResultFuture.java:180 - [Stream #692c6270-a692-11e4-9973-070e938df227] 
> Session with /10.66.187.201 is complete
> INFO  [StreamReceiveTask:22] 2015-01-28 02:07:31,972 
> StreamResultFuture.java:212 - [Stream #692c6270-a692-11e4-9973-070e938df227] 
> All sessions completed
> INFO  [StreamReceiveTask:22] 2015-01-28 02:07:31,972 
> StreamingRepairTask.java:96 - [repair #685e3d00-a692-11e4-9973-070e938df227] 
> streaming task succeed, returning response to /10.98.194.68
> ERROR [RepairJobTask:1] 2015-01-28 02:07:39,444 RepairJob.java:127 - Error 
> occurred during snapshot phase
> java.lang.RuntimeException: Could not create snapshot at /10.97.9.110
> at 
> org.apache.cassandra.repair.SnapshotTask$SnapshotCallback.onFailure(SnapshotTask.java:77)
>  ~[apache-cassandra-2.1.2.jar:2.1.2]
> at 
> org.apache.cassandra.net.MessagingService$5$1.run(MessagingService.java:347) 
> ~[apache-cassandra-2.1.2.jar:2.1.2]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> ~[na:1.7.0_45]
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> ~[na:1.7.0_45]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_45]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_45]
> at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
> INFO  [AntiEntropySessions:6] 2015-01-28 02:07:39,445 RepairSession.java:260 
> - [repair #6f85e740-a692-11e4-9973-070e938df227] new session: will sync 
> /10.98.194.68, /10.66.187.201, /10.226.218.135 on range 
> (12817179804668051873746972069086
> 2638799,12863540308359254031520865977436165] for events.[bigint0text, 
> bigint0boolean, bigint0int, dataset_catalog, column_categories, 
> bigint0double, bigint0bigint]
> ERROR [AntiEntropySessions:5] 2015-01-28 02:07:39,445 RepairSession.java:303 
> - [repair #685e3d00-a692-11e4-9973-070e938df227] session completed with the 
> following error
> java.io.IOException: Failed during snapshot creation.
> at 
> org.apache.cassandra.repair.RepairSession.failedSnapshot(RepairSession.java:344)
>  ~[apache-cassandra-2.1.2.jar:2.1.2]
> at 
> org.apache.cassandra.repair.RepairJob$2.onFailure(RepairJob.java:128) 
> 

[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException

2015-09-04 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14731416#comment-14731416
 ] 

mlowicki commented on CASSANDRA-9935:
-

[~yukim] I've launched repair for all keyspaces with {{nodetool repair --in-local-dc 
--parallel}}. #1 was for "OpsCenter", #2 for sync (the keyspace mentioned above in 
this thread), #3 for system_traces. Part of the output is at 
https://cpaste.org/plvyleda5. Interestingly, it says:
{code}
[2015-09-04 18:07:55,588] Repair command #2 finished
{code}

Maybe the assertion error happens while outputting the results, since repair for 
the sync keyspace always fails after a similar time period?
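
One way to cross-check a "Repair command #N finished" line against session-level failures is to count both in the server log; a minimal sketch, assuming the default log path:
{code}
# sessions that failed validation are logged at ERROR even when the command prints "finished"
grep 'Repair session' /var/log/cassandra/system.log | grep -c 'failed with error'
grep 'Repair session' /var/log/cassandra/system.log | grep -c 'finished'
{code}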

> Repair fails with RuntimeException
> --
>
> Key: CASSANDRA-9935
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9935
> Project: Cassandra
>  Issue Type: Bug
> Environment: C* 2.1.8, Debian Wheezy
>Reporter: mlowicki
>Assignee: Yuki Morishita
> Attachments: db1.sync.lati.osa.cassandra.log, 
> db5.sync.lati.osa.cassandra.log
>
>
> We had problems with slow repair in 2.1.7 (CASSANDRA-9702) but after upgrade 
> to 2.1.8 it started to work faster but now it fails with:
> {code}
> ...
> [2015-07-29 20:44:03,956] Repair session 23a811b0-3632-11e5-a93e-4963524a8bde 
> for range (-5474076923322749342,-5468600594078911162] finished
> [2015-07-29 20:44:03,957] Repair session 336f8740-3632-11e5-a93e-4963524a8bde 
> for range (-8631877858109464676,-8624040066373718932] finished
> [2015-07-29 20:44:03,957] Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde 
> for range (-5372806541854279315,-5369354119480076785] finished
> [2015-07-29 20:44:03,957] Repair session 59f129f0-3632-11e5-a93e-4963524a8bde 
> for range (8166489034383821955,8168408930184216281] finished
> [2015-07-29 20:44:03,957] Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde 
> for range (6084602890817326921,6088328703025510057] finished
> [2015-07-29 20:44:03,957] Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde 
> for range (-781874602493000830,-781745173070807746] finished
> [2015-07-29 20:44:03,957] Repair command #4 finished
> error: nodetool failed, check server logs
> -- StackTrace --
> java.lang.RuntimeException: nodetool failed, check server logs
> at 
> org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:290)
> at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:202)
> {code}
> After running:
> {code}
> nodetool repair --partitioner-range --parallel --in-local-dc sync
> {code}
> Last records in logs regarding repair are:
> {code}
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 09ff9e40-3632-11e5-a93e-4963524a8bde for range 
> (-7695808664784761779,-7693529816291585568] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 17d8d860-3632-11e5-a93e-4963524a8bde for range 
> (806371695398849,8065203836608925992] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 23a811b0-3632-11e5-a93e-4963524a8bde for range 
> (-5474076923322749342,-5468600594078911162] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 336f8740-3632-11e5-a93e-4963524a8bde for range 
> (-8631877858109464676,-8624040066373718932] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde for range 
> (-5372806541854279315,-5369354119480076785] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 59f129f0-3632-11e5-a93e-4963524a8bde for range 
> (8166489034383821955,8168408930184216281] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde for range 
> (6084602890817326921,6088328703025510057] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde for range 
> (-781874602493000830,-781745173070807746] finished
> {code}
> but a bit above I see (at least two times in attached log):
> {code}
> ERROR [Thread-173887] 2015-07-29 20:44:03,853 StorageService.java:2959 - 
> Repair session 1b07ea50-3608-11e5-a93e-4963524a8bde for range 
> (5765414319217852786,5781018794516851576] failed with error 
> org.apache.cassandra.exceptions.RepairException: [repair 
> #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
> (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
> org.apache.cassandra.exceptions.RepairException: [repair 
> #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
> (5765414319217852786,5781018794516851576]] 

[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException

2015-09-04 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14731811#comment-14731811
 ] 

mlowicki commented on CASSANDRA-9935:
-

[~yukim] how can I detect that repair succeeded?

We restarted all nodes a couple of days ago, so that didn't help.
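
A minimal way to script the check from the client side, assuming nodetool exits non-zero when it reports "nodetool failed, check server logs":
{code}
if nodetool repair --partitioner-range --parallel --in-local-dc sync; then
    echo "repair command returned success"
else
    echo "repair command failed; check system.log for sessions that 'failed with error'"
fi
{code}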

> Repair fails with RuntimeException
> --
>
> Key: CASSANDRA-9935
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9935
> Project: Cassandra
>  Issue Type: Bug
> Environment: C* 2.1.8, Debian Wheezy
>Reporter: mlowicki
>Assignee: Yuki Morishita
> Attachments: db1.sync.lati.osa.cassandra.log, 
> db5.sync.lati.osa.cassandra.log
>
>
> We had problems with slow repair in 2.1.7 (CASSANDRA-9702) but after upgrade 
> to 2.1.8 it started to work faster but now it fails with:
> {code}
> ...
> [2015-07-29 20:44:03,956] Repair session 23a811b0-3632-11e5-a93e-4963524a8bde 
> for range (-5474076923322749342,-5468600594078911162] finished
> [2015-07-29 20:44:03,957] Repair session 336f8740-3632-11e5-a93e-4963524a8bde 
> for range (-8631877858109464676,-8624040066373718932] finished
> [2015-07-29 20:44:03,957] Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde 
> for range (-5372806541854279315,-5369354119480076785] finished
> [2015-07-29 20:44:03,957] Repair session 59f129f0-3632-11e5-a93e-4963524a8bde 
> for range (8166489034383821955,8168408930184216281] finished
> [2015-07-29 20:44:03,957] Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde 
> for range (6084602890817326921,6088328703025510057] finished
> [2015-07-29 20:44:03,957] Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde 
> for range (-781874602493000830,-781745173070807746] finished
> [2015-07-29 20:44:03,957] Repair command #4 finished
> error: nodetool failed, check server logs
> -- StackTrace --
> java.lang.RuntimeException: nodetool failed, check server logs
> at 
> org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:290)
> at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:202)
> {code}
> After running:
> {code}
> nodetool repair --partitioner-range --parallel --in-local-dc sync
> {code}
> Last records in logs regarding repair are:
> {code}
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 09ff9e40-3632-11e5-a93e-4963524a8bde for range 
> (-7695808664784761779,-7693529816291585568] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 17d8d860-3632-11e5-a93e-4963524a8bde for range 
> (806371695398849,8065203836608925992] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 23a811b0-3632-11e5-a93e-4963524a8bde for range 
> (-5474076923322749342,-5468600594078911162] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 336f8740-3632-11e5-a93e-4963524a8bde for range 
> (-8631877858109464676,-8624040066373718932] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde for range 
> (-5372806541854279315,-5369354119480076785] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 59f129f0-3632-11e5-a93e-4963524a8bde for range 
> (8166489034383821955,8168408930184216281] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde for range 
> (6084602890817326921,6088328703025510057] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde for range 
> (-781874602493000830,-781745173070807746] finished
> {code}
> but a bit above I see (at least two times in attached log):
> {code}
> ERROR [Thread-173887] 2015-07-29 20:44:03,853 StorageService.java:2959 - 
> Repair session 1b07ea50-3608-11e5-a93e-4963524a8bde for range 
> (5765414319217852786,5781018794516851576] failed with error 
> org.apache.cassandra.exceptions.RepairException: [repair 
> #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
> (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
> org.apache.cassandra.exceptions.RepairException: [repair 
> #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
> (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
> at java.util.concurrent.FutureTask.report(FutureTask.java:122) 
> [na:1.7.0_80]
> at java.util.concurrent.FutureTask.get(FutureTask.java:188) 
> [na:1.7.0_80]
> at 
> org.apache.cassandra.service.StorageService$4.runMayThrow(StorageService.java:2950)
>  ~[apache-cassandra-2.1.8.jar:2.1.8]
>

[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException

2015-08-11 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681935#comment-14681935
 ] 

mlowicki commented on CASSANDRA-9935:
-

[~yukim] any updates?

 Repair fails with RuntimeException
 --

 Key: CASSANDRA-9935
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9935
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.8, Debian Wheezy
Reporter: mlowicki
Assignee: Yuki Morishita
 Attachments: db1.sync.lati.osa.cassandra.log, 
 db5.sync.lati.osa.cassandra.log


 We had problems with slow repair in 2.1.7 (CASSANDRA-9702) but after upgrade 
 to 2.1.8 it started to work faster but now it fails with:
 {code}
 ...
 [2015-07-29 20:44:03,956] Repair session 23a811b0-3632-11e5-a93e-4963524a8bde 
 for range (-5474076923322749342,-5468600594078911162] finished
 [2015-07-29 20:44:03,957] Repair session 336f8740-3632-11e5-a93e-4963524a8bde 
 for range (-8631877858109464676,-8624040066373718932] finished
 [2015-07-29 20:44:03,957] Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde 
 for range (-5372806541854279315,-5369354119480076785] finished
 [2015-07-29 20:44:03,957] Repair session 59f129f0-3632-11e5-a93e-4963524a8bde 
 for range (8166489034383821955,8168408930184216281] finished
 [2015-07-29 20:44:03,957] Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde 
 for range (6084602890817326921,6088328703025510057] finished
 [2015-07-29 20:44:03,957] Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde 
 for range (-781874602493000830,-781745173070807746] finished
 [2015-07-29 20:44:03,957] Repair command #4 finished
 error: nodetool failed, check server logs
 -- StackTrace --
 java.lang.RuntimeException: nodetool failed, check server logs
 at 
 org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:290)
 at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:202)
 {code}
 After running:
 {code}
 nodetool repair --partitioner-range --parallel --in-local-dc sync
 {code}
 Last records in logs regarding repair are:
 {code}
 INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
 Repair session 09ff9e40-3632-11e5-a93e-4963524a8bde for range 
 (-7695808664784761779,-7693529816291585568] finished
 INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
 Repair session 17d8d860-3632-11e5-a93e-4963524a8bde for range 
 (806371695398849,8065203836608925992] finished
 INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
 Repair session 23a811b0-3632-11e5-a93e-4963524a8bde for range 
 (-5474076923322749342,-5468600594078911162] finished
 INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
 Repair session 336f8740-3632-11e5-a93e-4963524a8bde for range 
 (-8631877858109464676,-8624040066373718932] finished
 INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
 Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde for range 
 (-5372806541854279315,-5369354119480076785] finished
 INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
 Repair session 59f129f0-3632-11e5-a93e-4963524a8bde for range 
 (8166489034383821955,8168408930184216281] finished
 INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
 Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde for range 
 (6084602890817326921,6088328703025510057] finished
 INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
 Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde for range 
 (-781874602493000830,-781745173070807746] finished
 {code}
 but a bit above I see (at least two times in attached log):
 {code}
 ERROR [Thread-173887] 2015-07-29 20:44:03,853 StorageService.java:2959 - 
 Repair session 1b07ea50-3608-11e5-a93e-4963524a8bde for range 
 (5765414319217852786,5781018794516851576] failed with error 
 org.apache.cassandra.exceptions.RepairException: [repair 
 #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
 (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
 java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
 org.apache.cassandra.exceptions.RepairException: [repair 
 #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
 (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
 at java.util.concurrent.FutureTask.report(FutureTask.java:122) 
 [na:1.7.0_80]
 at java.util.concurrent.FutureTask.get(FutureTask.java:188) 
 [na:1.7.0_80]
 at 
 org.apache.cassandra.service.StorageService$4.runMayThrow(StorageService.java:2950)
  ~[apache-cassandra-2.1.8.jar:2.1.8]
 at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
 [apache-cassandra-2.1.8.jar:2.1.8]
 at 
 

[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException

2015-08-06 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14659666#comment-14659666
 ] 

mlowicki commented on CASSANDRA-9935:
-

[~yukim] I've launched repair in the 2nd DC to get more repair logs - 
https://gist.github.com/mlowicki/43e3074f46f12737577e.

I've found two exceptions:

{code}
[2015-08-06 03:03:33,231] Repair session d4f0d420-3baa-11e5-9ec3-75bb514ae072 
for range (-144620433819156,-1424504876804571443] failed with error 
org.apache.cassandra.exceptions.RepairException: [repair 
#d4f0d420-3baa-11e5-9ec3-75bb514ae072 on sync/entity2, 
(-144620433819156,-1424504876804571443]] Validation failed in /10.210.3.162
{code}


and
{code}
[2015-08-06 03:03:33,239] Repair session 967ca730-3bb1-11e5-9ec3-75bb514ae072 
for range (3125697280560263437,3131751716701120659] failed with error 
org.apache.cassandra.exceptions.RepairException: [repair 
#967ca730-3bb1-11e5-9ec3-75bb514ae072 on sync/entity_by_id2, 
(3125697280560263437,3131751716701120659]] Validation failed in /10.210.3.221
{code}

10.210.3.162 = db6.sync.ams.osa
10.210.3.221 = db1.sync.ams.osa

Repair was started on db1.sync.ams.osa.

I see no errors on db6.sync.ams.osa in system.log between 2015-08-06 
00:24:16,322 and 2015-08-06 08:04:58,283 (no ERROR string there).

On db1.sync.ams.osa I've found two errors - 
https://gist.github.com/mlowicki/3bf39f9f9ad0d4e202e5.

I've launched {{nodetool scrub}} on db6.sync.ams.osa and will send logs when it 
finishes.
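
To tie the /10.x address in a RepairException back to a host and find the matching validation attempt on the remote side, a minimal sketch (assuming the default log path and a resolvable hosts entry; the session id is the one quoted above):
{code}
# resolve the address reported in "Validation failed in /10.210.3.162"
getent hosts 10.210.3.162

# on that node, search for the failing repair session and recent validation errors
grep 'd4f0d420-3baa-11e5-9ec3-75bb514ae072' /var/log/cassandra/system.log
grep -i 'validation' /var/log/cassandra/system.log | tail -20
{code}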

 Repair fails with RuntimeException
 --

 Key: CASSANDRA-9935
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9935
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.8, Debian Wheezy
Reporter: mlowicki
Assignee: Yuki Morishita
 Attachments: db1.sync.lati.osa.cassandra.log, 
 db5.sync.lati.osa.cassandra.log


 We had problems with slow repair in 2.1.7 (CASSANDRA-9702) but after upgrade 
 to 2.1.8 it started to work faster but now it fails with:
 {code}
 ...
 [2015-07-29 20:44:03,956] Repair session 23a811b0-3632-11e5-a93e-4963524a8bde 
 for range (-5474076923322749342,-5468600594078911162] finished
 [2015-07-29 20:44:03,957] Repair session 336f8740-3632-11e5-a93e-4963524a8bde 
 for range (-8631877858109464676,-8624040066373718932] finished
 [2015-07-29 20:44:03,957] Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde 
 for range (-5372806541854279315,-5369354119480076785] finished
 [2015-07-29 20:44:03,957] Repair session 59f129f0-3632-11e5-a93e-4963524a8bde 
 for range (8166489034383821955,8168408930184216281] finished
 [2015-07-29 20:44:03,957] Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde 
 for range (6084602890817326921,6088328703025510057] finished
 [2015-07-29 20:44:03,957] Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde 
 for range (-781874602493000830,-781745173070807746] finished
 [2015-07-29 20:44:03,957] Repair command #4 finished
 error: nodetool failed, check server logs
 -- StackTrace --
 java.lang.RuntimeException: nodetool failed, check server logs
 at 
 org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:290)
 at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:202)
 {code}
 After running:
 {code}
 nodetool repair --partitioner-range --parallel --in-local-dc sync
 {code}
 Last records in logs regarding repair are:
 {code}
 INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
 Repair session 09ff9e40-3632-11e5-a93e-4963524a8bde for range 
 (-7695808664784761779,-7693529816291585568] finished
 INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
 Repair session 17d8d860-3632-11e5-a93e-4963524a8bde for range 
 (806371695398849,8065203836608925992] finished
 INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
 Repair session 23a811b0-3632-11e5-a93e-4963524a8bde for range 
 (-5474076923322749342,-5468600594078911162] finished
 INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
 Repair session 336f8740-3632-11e5-a93e-4963524a8bde for range 
 (-8631877858109464676,-8624040066373718932] finished
 INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
 Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde for range 
 (-5372806541854279315,-5369354119480076785] finished
 INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
 Repair session 59f129f0-3632-11e5-a93e-4963524a8bde for range 
 (8166489034383821955,8168408930184216281] finished
 INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
 Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde for range 
 (6084602890817326921,6088328703025510057] finished
 INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
 Repair session 

[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException

2015-08-06 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660075#comment-14660075
 ] 

mlowicki commented on CASSANDRA-9935:
-

Logs from db6.sync.ams.osa where scrub was started - 
https://drive.google.com/file/d/0B_8mc_afWmd2NjZXZGJRRnI4TzA/view?usp=sharing

 Repair fails with RuntimeException
 --

 Key: CASSANDRA-9935
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9935
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.8, Debian Wheezy
Reporter: mlowicki
Assignee: Yuki Morishita
 Attachments: db1.sync.lati.osa.cassandra.log, 
 db5.sync.lati.osa.cassandra.log


 We had problems with slow repair in 2.1.7 (CASSANDRA-9702) but after upgrade 
 to 2.1.8 it started to work faster but now it fails with:
 {code}
 ...
 [2015-07-29 20:44:03,956] Repair session 23a811b0-3632-11e5-a93e-4963524a8bde 
 for range (-5474076923322749342,-5468600594078911162] finished
 [2015-07-29 20:44:03,957] Repair session 336f8740-3632-11e5-a93e-4963524a8bde 
 for range (-8631877858109464676,-8624040066373718932] finished
 [2015-07-29 20:44:03,957] Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde 
 for range (-5372806541854279315,-5369354119480076785] finished
 [2015-07-29 20:44:03,957] Repair session 59f129f0-3632-11e5-a93e-4963524a8bde 
 for range (8166489034383821955,8168408930184216281] finished
 [2015-07-29 20:44:03,957] Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde 
 for range (6084602890817326921,6088328703025510057] finished
 [2015-07-29 20:44:03,957] Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde 
 for range (-781874602493000830,-781745173070807746] finished
 [2015-07-29 20:44:03,957] Repair command #4 finished
 error: nodetool failed, check server logs
 -- StackTrace --
 java.lang.RuntimeException: nodetool failed, check server logs
 at 
 org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:290)
 at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:202)
 {code}
 After running:
 {code}
 nodetool repair --partitioner-range --parallel --in-local-dc sync
 {code}
 Last records in logs regarding repair are:
 {code}
 INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
 Repair session 09ff9e40-3632-11e5-a93e-4963524a8bde for range 
 (-7695808664784761779,-7693529816291585568] finished
 INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
 Repair session 17d8d860-3632-11e5-a93e-4963524a8bde for range 
 (806371695398849,8065203836608925992] finished
 INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
 Repair session 23a811b0-3632-11e5-a93e-4963524a8bde for range 
 (-5474076923322749342,-5468600594078911162] finished
 INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
 Repair session 336f8740-3632-11e5-a93e-4963524a8bde for range 
 (-8631877858109464676,-8624040066373718932] finished
 INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
 Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde for range 
 (-5372806541854279315,-5369354119480076785] finished
 INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
 Repair session 59f129f0-3632-11e5-a93e-4963524a8bde for range 
 (8166489034383821955,8168408930184216281] finished
 INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
 Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde for range 
 (6084602890817326921,6088328703025510057] finished
 INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
 Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde for range 
 (-781874602493000830,-781745173070807746] finished
 {code}
 but a bit above I see (at least two times in attached log):
 {code}
 ERROR [Thread-173887] 2015-07-29 20:44:03,853 StorageService.java:2959 - 
 Repair session 1b07ea50-3608-11e5-a93e-4963524a8bde for range 
 (5765414319217852786,5781018794516851576] failed with error 
 org.apache.cassandra.exceptions.RepairException: [repair 
 #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
 (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
 java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
 org.apache.cassandra.exceptions.RepairException: [repair 
 #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
 (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
 at java.util.concurrent.FutureTask.report(FutureTask.java:122) 
 [na:1.7.0_80]
 at java.util.concurrent.FutureTask.get(FutureTask.java:188) 
 [na:1.7.0_80]
 at 
 org.apache.cassandra.service.StorageService$4.runMayThrow(StorageService.java:2950)
  ~[apache-cassandra-2.1.8.jar:2.1.8]
 at 
 

[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException

2015-08-05 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654970#comment-14654970
 ] 

mlowicki commented on CASSANDRA-9935:
-

[~yukim] the same error after ~12 hours:
{code}
[2015-08-05 06:35:07,340] Repair session 18f8c020-3b3c-11e5-a93e-4963524a8bde for range (-781874602493000830,-781745173070807746] finished
[2015-08-05 06:35:07,340] Repair command #6 finished
error: nodetool failed, check server logs
-- StackTrace --
java.lang.RuntimeException: nodetool failed, check server logs
at org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:290)
at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:202)
{code}

Logs from db1.sync.lati.osa (10.195.15.162) - 
https://drive.google.com/file/d/0B_8mc_afWmd2LWcxRWRPWTFnMlk/view?usp=sharing 
Logs from db4.sync.lati.osa (10.195.15.167) - 
https://drive.google.com/file/d/0B_8mc_afWmd2ejVnR24tVm5OZUk/view?usp=sharing

 Repair fails with RuntimeException
 --

 Key: CASSANDRA-9935
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9935
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.8, Debian Wheezy
Reporter: mlowicki
Assignee: Yuki Morishita
 Attachments: db1.sync.lati.osa.cassandra.log, 
 db5.sync.lati.osa.cassandra.log


 We had problems with slow repair in 2.1.7 (CASSANDRA-9702) but after upgrade 
 to 2.1.8 it started to work faster but now it fails with:
 {code}
 ...
 [2015-07-29 20:44:03,956] Repair session 23a811b0-3632-11e5-a93e-4963524a8bde 
 for range (-5474076923322749342,-5468600594078911162] finished
 [2015-07-29 20:44:03,957] Repair session 336f8740-3632-11e5-a93e-4963524a8bde 
 for range (-8631877858109464676,-8624040066373718932] finished
 [2015-07-29 20:44:03,957] Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde 
 for range (-5372806541854279315,-5369354119480076785] finished
 [2015-07-29 20:44:03,957] Repair session 59f129f0-3632-11e5-a93e-4963524a8bde 
 for range (8166489034383821955,8168408930184216281] finished
 [2015-07-29 20:44:03,957] Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde 
 for range (6084602890817326921,6088328703025510057] finished
 [2015-07-29 20:44:03,957] Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde 
 for range (-781874602493000830,-781745173070807746] finished
 [2015-07-29 20:44:03,957] Repair command #4 finished
 error: nodetool failed, check server logs
 -- StackTrace --
 java.lang.RuntimeException: nodetool failed, check server logs
 at 
 org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:290)
 at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:202)
 {code}
 After running:
 {code}
 nodetool repair --partitioner-range --parallel --in-local-dc sync
 {code}
 Last records in logs regarding repair are:
 {code}
 INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
 Repair session 09ff9e40-3632-11e5-a93e-4963524a8bde for range 
 (-7695808664784761779,-7693529816291585568] finished
 INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
 Repair session 17d8d860-3632-11e5-a93e-4963524a8bde for range 
 (806371695398849,8065203836608925992] finished
 INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
 Repair session 23a811b0-3632-11e5-a93e-4963524a8bde for range 
 (-5474076923322749342,-5468600594078911162] finished
 INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
 Repair session 336f8740-3632-11e5-a93e-4963524a8bde for range 
 (-8631877858109464676,-8624040066373718932] finished
 INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
 Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde for range 
 (-5372806541854279315,-5369354119480076785] finished
 INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
 Repair session 59f129f0-3632-11e5-a93e-4963524a8bde for range 
 (8166489034383821955,8168408930184216281] finished
 INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
 Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde for range 
 (6084602890817326921,6088328703025510057] finished
 INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
 Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde for range 
 (-781874602493000830,-781745173070807746] finished
 {code}
 but a bit above I see (at least two times in attached log):
 {code}
 ERROR [Thread-173887] 2015-07-29 20:44:03,853 StorageService.java:2959 - 
 Repair session 1b07ea50-3608-11e5-a93e-4963524a8bde for range 
 (5765414319217852786,5781018794516851576] failed with error 
 org.apache.cassandra.exceptions.RepairException: [repair 
 #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
 (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
 

[jira] [Commented] (CASSANDRA-9702) Repair running really slow

2015-08-05 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654981#comment-14654981
 ] 

mlowicki commented on CASSANDRA-9702:
-

After upgrading to 2.1.8 we're seeing CASSANDRA-9935 instead.

 Repair running really slow
 --

 Key: CASSANDRA-9702
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9702
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.7, Debian Wheezy
Reporter: mlowicki
 Fix For: 2.1.x

 Attachments: db1.system.log


 We've been using 2.1.x since the very beginning and have always had problems with 
 failing or slow repair. In one data center we haven't been able to finish repair 
 for many weeks (partly because of CASSANDRA-9681, as we needed to reboot nodes 
 periodically).
 I launched it this morning (12 hours ago now) and monitor it using 
 https://github.com/spotify/cassandra-opstools/blob/master/bin/spcassandra-repairstats.
 For the first hour it progressed to 9.43%, but then it took ~10 hours to 
 reach 9.44%. Logs related to repair appear only rarely (every 15-20 minutes, but 
 sometimes nothing new for an hour).
 Repair launched with:
 {code}
 nodetool repair --partitioner-range --parallel --in-local-dc {keyspace}
 {code}
 Attached log file from today.
 We have ~4.1TB of data across 12 nodes with RF set to 3 (2 DCs with 6 nodes each).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException

2015-08-05 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658878#comment-14658878
 ] 

mlowicki commented on CASSANDRA-9935:
-

It didn't print anything to the console on any of the nodes. I can grep through 
system.log or attach the logs from each box if that helps.


[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException

2015-08-04 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654169#comment-14654169
 ] 

mlowicki commented on CASSANDRA-9935:
-

Just finished running {{nodetool scrub}} on all nodes in a single DC (it took 
~12 hours) and started repair.
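
For reference, a run like the one above (scrub on every node of one DC, one node 
at a time) can be scripted; the sketch below is only illustrative, the host names 
are placeholders, and the keyspace argument follows the {{nodetool scrub sync}} 
invocations quoted elsewhere in this ticket:
{code}
#!/usr/bin/env bash
# Sketch only: run 'nodetool scrub' for the sync keyspace on every node of one
# DC, one node after another. Host names are placeholders, not the real cluster.
set -e
HOSTS="db1 db2 db3 db4 db5 db6"
for h in $HOSTS; do
    echo "$(date -u) scrubbing $h"
    ssh "$h" nodetool scrub sync
done
{code}
Scrub rewrites sstables, so going node by node keeps the extra disk I/O confined 
to a single replica at any given moment.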


[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException

2015-08-03 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14652680#comment-14652680
 ] 

mlowicki commented on CASSANDRA-9935:
-

Yes, I'm using LCS. I'll run scrub on these nodes and then repair. Will let you 
know about the result.


[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException

2015-08-03 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14652638#comment-14652638
 ] 

mlowicki commented on CASSANDRA-9935:
-

[~yukim] ping.


[jira] [Commented] (CASSANDRA-8821) Errors in JVM_OPTS and cassandra_parms environment vars

2015-08-03 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14651743#comment-14651743
 ] 

mlowicki commented on CASSANDRA-8821:
-

Because of this bug, for example, {{cassandra service status}} doesn't work.
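
A quick way to confirm the duplication behind this (JVM_OPTS entries applied 
twice because cassandra-env.sh is sourced twice) is to inspect the flags of the 
running Java process; a rough sketch, assuming the process is visible to ps:
{code}
# Show JVM flags that occur more than once on the running Cassandra process,
# the visible symptom of cassandra-env.sh being sourced twice.
ps -o args= -C java | tr ' ' '\n' | grep '^-X' | sort | uniq -cd
{code}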

 Errors in JVM_OPTS and cassandra_parms environment vars
 ---

 Key: CASSANDRA-8821
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8821
 Project: Cassandra
  Issue Type: Bug
 Environment: Ubuntu 14.04 LTS amd64
Reporter: Terry Moschou
Assignee: Michael Shuler
Priority: Minor
 Fix For: 2.1.x, 2.2.x

 Attachments: 8821_2.0.txt, 8821_2.1.txt


 Repos:
 deb http://www.apache.org/dist/cassandra/debian 21x main
 deb-src http://www.apache.org/dist/cassandra/debian 21x main
 The cassandra init script
   /etc/init.d/cassandra
 is sourcing the environment file
   /etc/cassandra/cassandra-env.sh
 twice. Once directly from the init script, and again inside
   /usr/sbin/cassandra
 The result is arguments in JVM_OPTS are duplicated.
 Further the JVM opt
   -XX:CMSWaitDuration=1
 is defined twice if jvm = 1.7.60.
 Also, for the environment variable CASSANDRA_CONF used in this context
   -XX:CompileCommandFile=$CASSANDRA_CONF/hotspot_compiler
 is undefined when
   /etc/cassandra/cassandra-env.sh
 is sourced from the init script.
 Lastly the variable cassandra_storagedir is undefined in
   /usr/sbin/cassandra
 when used in this context
   -Dcassandra.storagedir=$cassandra_storagedir



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException

2015-07-31 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14649575#comment-14649575
 ] 

mlowicki commented on CASSANDRA-9935:
-

Failed with the same error after ~13 hours:
{code}
[2015-07-31 16:57:43,909] Repair command #5 finished
error: nodetool failed, check server logs
-- StackTrace --
java.lang.RuntimeException: nodetool failed, check server logs
at 
org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:290)
at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:202)
{code}

Log file - 
https://drive.google.com/file/d/0B_8mc_afWmd2OV96RDZBclRNSFE/view?usp=sharing.
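
Since nodetool itself only reports "nodetool failed, check server logs", the 
failing ranges have to be pulled out of system.log on the server side; a possible 
grep, assuming the Debian package's default log location and the message format 
quoted in this ticket:
{code}
# List repair sessions that failed and the validation errors behind them.
# Adjust the path if logs are kept somewhere other than the package default.
grep -E 'Repair session .* failed with error|Validation failed' \
    /var/log/cassandra/system.log | tail -n 40
{code}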


[jira] [Comment Edited] (CASSANDRA-9935) Repair fails with RuntimeException

2015-07-31 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14649575#comment-14649575
 ] 

mlowicki edited comment on CASSANDRA-9935 at 7/31/15 6:19 PM:
--

Failed with the same error after ~13 hours:
{code}
[2015-07-31 16:57:43,909] Repair command #5 finished
error: nodetool failed, check server logs
-- StackTrace --
java.lang.RuntimeException: nodetool failed, check server logs
at 
org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:290)
at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:202)
{code}

Log file - 
https://drive.google.com/file/d/0B_8mc_afWmd2OV96RDZBclRNSFE/view?usp=sharing.

I tried yesterday to run repair in another DC, but got the same error.


was (Author: mlowicki):
Failed with the same error after ~13 hours:
{code}
[2015-07-31 16:57:43,909] Repair command #5 finished
error: nodetool failed, check server logs
-- StackTrace --
java.lang.RuntimeException: nodetool failed, check server logs
at 
org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:290)
at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:202)
{code}

Log file - 
https://drive.google.com/file/d/0B_8mc_afWmd2OV96RDZBclRNSFE/view?usp=sharing.


[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException

2015-07-30 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14648760#comment-14648760
 ] 

mlowicki commented on CASSANDRA-9935:
-

{{nodetool scrub sync}} finished on db1.sync.lati.osa and db5.sync.lati.osa. 
Just launched repair but it can take up to 10-12 hours before it crashes. Will 
keep you updated.
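
One rough way to watch a repair of this length, instead of waiting 10-12 hours 
blind, is to count finished per-range sessions in the log and peek at streaming 
and validation activity; a sketch based on the log lines quoted in this ticket 
(the log path is an assumption):
{code}
# Finished per-range repair sessions so far (each range logs one "finished" line).
grep -c 'Repair session .* finished' /var/log/cassandra/system.log
# Streams currently open for repair, skipping the idle "Not sending" lines.
nodetool netstats | grep -v 'Not sending'
# Validation compactions triggered by repair show up here as well.
nodetool compactionstats
{code}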


[jira] [Created] (CASSANDRA-9935) Repair fails with RuntimeException

2015-07-30 Thread mlowicki (JIRA)
mlowicki created CASSANDRA-9935:
---

 Summary: Repair fails with RuntimeException
 Key: CASSANDRA-9935
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9935
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.8, Debian Wheezy
Reporter: mlowicki
 Attachments: db1.sync.lati.osa.cassandra.log

We had problems with slow repair in 2.1.7 (CASSANDRA-9702). After upgrading to 
2.1.8 it started to work faster, but now it fails with:
{code}
...
[2015-07-29 20:44:03,956] Repair session 23a811b0-3632-11e5-a93e-4963524a8bde 
for range (-5474076923322749342,-5468600594078911162] finished
[2015-07-29 20:44:03,957] Repair session 336f8740-3632-11e5-a93e-4963524a8bde 
for range (-8631877858109464676,-8624040066373718932] finished
[2015-07-29 20:44:03,957] Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde 
for range (-5372806541854279315,-5369354119480076785] finished
[2015-07-29 20:44:03,957] Repair session 59f129f0-3632-11e5-a93e-4963524a8bde 
for range (8166489034383821955,8168408930184216281] finished
[2015-07-29 20:44:03,957] Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde 
for range (6084602890817326921,6088328703025510057] finished
[2015-07-29 20:44:03,957] Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde 
for range (-781874602493000830,-781745173070807746] finished
[2015-07-29 20:44:03,957] Repair command #4 finished
error: nodetool failed, check server logs
-- StackTrace --
java.lang.RuntimeException: nodetool failed, check server logs
at 
org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:290)
at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:202)
{code}

After running:
{code}
nodetool repair --partitioner-range --parallel --in-local-dc sync
{code}

Last records in logs regarding repair are:
{code}
INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - Repair 
session 09ff9e40-3632-11e5-a93e-4963524a8bde for range 
(-7695808664784761779,-7693529816291585568] finished
INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - Repair 
session 17d8d860-3632-11e5-a93e-4963524a8bde for range 
(806371695398849,8065203836608925992] finished
INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - Repair 
session 23a811b0-3632-11e5-a93e-4963524a8bde for range 
(-5474076923322749342,-5468600594078911162] finished
INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - Repair 
session 336f8740-3632-11e5-a93e-4963524a8bde for range 
(-8631877858109464676,-8624040066373718932] finished
INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - Repair 
session 4ccd8430-3632-11e5-a93e-4963524a8bde for range 
(-5372806541854279315,-5369354119480076785] finished
INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - Repair 
session 59f129f0-3632-11e5-a93e-4963524a8bde for range 
(8166489034383821955,8168408930184216281] finished
INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - Repair 
session 6ae7a9a0-3632-11e5-a93e-4963524a8bde for range 
(6084602890817326921,6088328703025510057] finished
INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - Repair 
session 8938e4a0-3632-11e5-a93e-4963524a8bde for range 
(-781874602493000830,-781745173070807746] finished
{code}

but a bit above that I see (at least twice in the attached log):
{code}
ERROR [Thread-173887] 2015-07-29 20:44:03,853 StorageService.java:2959 - Repair 
session 1b07ea50-3608-11e5-a93e-4963524a8bde for range 
(5765414319217852786,5781018794516851576] failed with error 
org.apache.cassandra.exceptions.RepairException: [repair 
#1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
(5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
org.apache.cassandra.exceptions.RepairException: [repair 
#1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
(5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
at java.util.concurrent.FutureTask.report(FutureTask.java:122) 
[na:1.7.0_80]
at java.util.concurrent.FutureTask.get(FutureTask.java:188) 
[na:1.7.0_80]
at 
org.apache.cassandra.service.StorageService$4.runMayThrow(StorageService.java:2950)
 ~[apache-cassandra-2.1.8.jar:2.1.8]
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
[apache-cassandra-2.1.8.jar:2.1.8]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
[na:1.7.0_80]
at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
[na:1.7.0_80]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
Caused by: java.lang.RuntimeException: 
org.apache.cassandra.exceptions.RepairException: [repair 
#1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 

[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException

2015-07-30 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14647925#comment-14647925
 ] 

mlowicki commented on CASSANDRA-9935:
-

 ping db1.sync.lati.osa
PING a10-05-07.lati.osa (10.195.15.162): 56 data bytes

So you've got the log attached to this ticket.
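
For anyone mapping the address from the "Validation failed in /10.195.15.162" 
error back to a host, a reverse lookup or the cluster view from nodetool gives 
the same answer as the ping above; a small sketch:
{code}
# Identify which node /10.195.15.162 is: reverse DNS plus the ring view.
dig -x 10.195.15.162 +short
nodetool status | grep 10.195.15.162
{code}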


[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException

2015-07-30 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14648101#comment-14648101
 ] 

mlowicki commented on CASSANDRA-9935:
-

Should I run {{nodetool scrub sync}} on db1.sync.lati.osa and db5.sync.lati.osa 
or on all nodes inside this data center?


[jira] [Updated] (CASSANDRA-9935) Repair fails with RuntimeException

2015-07-30 Thread mlowicki (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mlowicki updated CASSANDRA-9935:

Attachment: db5.sync.lati.osa.cassandra.log

Attached log from 10.195.15.176 (db5.sync.lati.osa).

Older ones are available at 
https://drive.google.com/file/d/0B_8mc_afWmd2Vnk4ZE5kS3J6OE0/view?usp=sharing 
and  
https://drive.google.com/file/d/0B_8mc_afWmd2UElxUEZQUmtsaFk/view?usp=sharing 
(They are bigger than 10MB).


[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException

2015-07-30 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14647981#comment-14647981
 ] 

mlowicki commented on CASSANDRA-9935:
-

More logs from db1.sync.lati.osa (10.195.15.162) are available at 
https://drive.google.com/file/d/0B_8mc_afWmd2QVk2VVRTRVl1ZDQ/view?usp=sharing 
and 
https://drive.google.com/file/d/0B_8mc_afWmd2MHREM2hzUlNjd0E/view?usp=sharing.

 Repair fails with RuntimeException
 --

 Key: CASSANDRA-9935
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9935
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.8, Debian Wheezy
Reporter: mlowicki
Assignee: Yuki Morishita
 Attachments: db1.sync.lati.osa.cassandra.log, 
 db5.sync.lati.osa.cassandra.log


 We had problems with slow repair in 2.1.7 (CASSANDRA-9702) but after upgrade 
 to 2.1.8 it started to work faster but now it fails with:
 {code}
 ...
 [2015-07-29 20:44:03,956] Repair session 23a811b0-3632-11e5-a93e-4963524a8bde 
 for range (-5474076923322749342,-5468600594078911162] finished
 [2015-07-29 20:44:03,957] Repair session 336f8740-3632-11e5-a93e-4963524a8bde 
 for range (-8631877858109464676,-8624040066373718932] finished
 [2015-07-29 20:44:03,957] Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde 
 for range (-5372806541854279315,-5369354119480076785] finished
 [2015-07-29 20:44:03,957] Repair session 59f129f0-3632-11e5-a93e-4963524a8bde 
 for range (8166489034383821955,8168408930184216281] finished
 [2015-07-29 20:44:03,957] Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde 
 for range (6084602890817326921,6088328703025510057] finished
 [2015-07-29 20:44:03,957] Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde 
 for range (-781874602493000830,-781745173070807746] finished
 [2015-07-29 20:44:03,957] Repair command #4 finished
 error: nodetool failed, check server logs
 -- StackTrace --
 java.lang.RuntimeException: nodetool failed, check server logs
 at 
 org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:290)
 at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:202)
 {code}
 After running:
 {code}
 nodetool repair --partitioner-range --parallel --in-local-dc sync
 {code}
 Last records in logs regarding repair are:
 {code}
 INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
 Repair session 09ff9e40-3632-11e5-a93e-4963524a8bde for range 
 (-7695808664784761779,-7693529816291585568] finished
 INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
 Repair session 17d8d860-3632-11e5-a93e-4963524a8bde for range 
 (806371695398849,8065203836608925992] finished
 INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
 Repair session 23a811b0-3632-11e5-a93e-4963524a8bde for range 
 (-5474076923322749342,-5468600594078911162] finished
 INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
 Repair session 336f8740-3632-11e5-a93e-4963524a8bde for range 
 (-8631877858109464676,-8624040066373718932] finished
 INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
 Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde for range 
 (-5372806541854279315,-5369354119480076785] finished
 INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
 Repair session 59f129f0-3632-11e5-a93e-4963524a8bde for range 
 (8166489034383821955,8168408930184216281] finished
 INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
 Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde for range 
 (6084602890817326921,6088328703025510057] finished
 INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
 Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde for range 
 (-781874602493000830,-781745173070807746] finished
 {code}
 but a bit above I see (at least two times in attached log):
 {code}
 ERROR [Thread-173887] 2015-07-29 20:44:03,853 StorageService.java:2959 - 
 Repair session 1b07ea50-3608-11e5-a93e-4963524a8bde for range 
 (5765414319217852786,5781018794516851576] failed with error 
 org.apache.cassandra.exceptions.RepairException: [repair 
 #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
 (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
 java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
 org.apache.cassandra.exceptions.RepairException: [repair 
 #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
 (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
 at java.util.concurrent.FutureTask.report(FutureTask.java:122) 
 [na:1.7.0_80]
 at java.util.concurrent.FutureTask.get(FutureTask.java:188) 
 [na:1.7.0_80]
 at 
 org.apache.cassandra.service.StorageService$4.runMayThrow(StorageService.java:2950)
  

[jira] [Commented] (CASSANDRA-9702) Repair running really slow

2015-07-02 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14611607#comment-14611607
 ] 

mlowicki commented on CASSANDRA-9702:
-

After another ~12 hours it progressed to 10.21%.

 Repair running really slow
 --

 Key: CASSANDRA-9702
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9702
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.7, Debian Wheezy
Reporter: mlowicki
 Attachments: db1.system.log


 We're using 2.1.x since the very beginning and we always had problem with 
 failing or slow repair. In one data center we aren't able to finish repair 
 for many weeks (partially because CASSANDRA-9681 as we needed to reboot nodes 
 periodically).
 I've launched it today morning (12 hours now) and monitor using 
 https://github.com/spotify/cassandra-opstools/blob/master/bin/spcassandra-repairstats.
  For the first hour it progressed to 9.43% but then it took ~10 hours to 
 reach 9.44%. I see very rarely logs related to repair (each 15-20 minutes but 
 sometimes nothing new for 1 hour).
 Repair launched with:
 {code}
 nodetool repair --partitioner-range --parallel --in-local-dc {keyspace}
 {code}
 Attached log file from today.
 We've ~4.1TB of data in 12 nodes with RF set to 3 (2 DC with 6 nodes each).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-9702) Repair running really slow

2015-07-02 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14611607#comment-14611607
 ] 

mlowicki edited comment on CASSANDRA-9702 at 7/2/15 1:55 PM:
-

After another ~12 hours it progressed to 10.21%. Six hours later it's at 10.52%.


was (Author: mlowicki):
After another ~12 hours it progressed to 10.21%.

 Repair running really slow
 --

 Key: CASSANDRA-9702
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9702
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.7, Debian Wheezy
Reporter: mlowicki
 Fix For: 2.1.x

 Attachments: db1.system.log


 We're using 2.1.x since the very beginning and we always had problem with 
 failing or slow repair. In one data center we aren't able to finish repair 
 for many weeks (partially because CASSANDRA-9681 as we needed to reboot nodes 
 periodically).
 I've launched it today morning (12 hours now) and monitor using 
 https://github.com/spotify/cassandra-opstools/blob/master/bin/spcassandra-repairstats.
  For the first hour it progressed to 9.43% but then it took ~10 hours to 
 reach 9.44%. I see very rarely logs related to repair (each 15-20 minutes but 
 sometimes nothing new for 1 hour).
 Repair launched with:
 {code}
 nodetool repair --partitioner-range --parallel --in-local-dc {keyspace}
 {code}
 Attached log file from today.
 We've ~4.1TB of data in 12 nodes with RF set to 3 (2 DC with 6 nodes each).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9681) Memtable heap size grows and many long GC pauses are triggered

2015-07-01 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14609848#comment-14609848
 ] 

mlowicki commented on CASSANDRA-9681:
-

After a couple of hours it's still fine - 
https://www.dropbox.com/s/ox5xzxqbojyv7wz/Screenshot%202015-07-01%2011.49.53.png?dl=0.
 It always started to grow right after a restart, so we can assume that this 
problem is fixed.

 Memtable heap size grows and many long GC pauses are triggered
 --

 Key: CASSANDRA-9681
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9681
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.7, Debian Wheezy
Reporter: mlowicki
Assignee: Benedict
Priority: Critical
 Fix For: 2.1.x

 Attachments: cassandra.yaml, db5.system.log, db5.system.log.1.zip, 
 db5.system.log.2.zip, db5.system.log.3.zip, schema.cql, system.log.6.zip, 
 system.log.7.zip, system.log.8.zip, system.log.9.zip


 C* 2.1.7 cluster is behaving really bad after 1-2 days. 
 {{gauges.cassandra.jmx.org.apache.cassandra.metrics.ColumnFamily.AllMemtablesHeapSize.Value}}
  jumps to 7 GB 
 (https://www.dropbox.com/s/vraggy292erkzd2/Screenshot%202015-06-29%2019.12.53.png?dl=0)
  on 3/6 nodes in each data center and then there are many long GC pauses. 
 Cluster is using default heap size values ({{-Xms8192M -Xmx8192M -Xmn2048M}})
 Before C* 2.1.5 memtables heap size was basically constant ~500MB 
 (https://www.dropbox.com/s/fjdywik5lojstvn/Screenshot%202015-06-29%2019.30.00.png?dl=0)
 After restarting all nodes is behaves stable for 1-2days. Today I've done 
 that and long GC pauses are gone (~18:00 
 https://www.dropbox.com/s/7vo3ynz505rsfq3/Screenshot%202015-06-29%2019.28.37.png?dl=0).
  The only pattern we've found so far is that long GC  pauses are happening 
 basically at the same time on all nodes in the same data center - even on the 
 ones where memtables heap size is not growing.
 Cliffs on the graphs are nodes restarts.
 Used memory on boxes where {{AllMemtabelesHeapSize}} grows, stays at the same 
 level - 
 https://www.dropbox.com/s/tes9abykixs86rf/Screenshot%202015-06-29%2019.37.52.png?dl=0.
 Replication factor is set to 3.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-9702) Repair running really slow

2015-07-01 Thread mlowicki (JIRA)
mlowicki created CASSANDRA-9702:
---

 Summary: Repair running really slow
 Key: CASSANDRA-9702
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9702
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.7, Debian Wheezy
Reporter: mlowicki
 Attachments: db1.system.log

We've been using 2.1.x since the very beginning and we've always had problems with 
failing or slow repair. In one data center we haven't been able to finish repair for 
many weeks (partially because of CASSANDRA-9681, as we needed to reboot nodes 
periodically).

I launched it this morning (12 hours ago now) and I'm monitoring it using 
https://github.com/spotify/cassandra-opstools/blob/master/bin/spcassandra-repairstats.
 For the first hour it progressed to 9.43%, but then it took ~10 hours to reach 
9.44%. I very rarely see repair-related log entries (every 15-20 minutes, but 
sometimes nothing new for an hour).
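
A rough way to cross-check that progress from the node itself is to count the 
per-range repair sessions reported in the log (a sketch - the log path and the exact 
message patterns are assumptions based on the messages quoted in this ticket):
{code}
# repair sessions that have completed so far on this node
grep -c 'Repair session .* finished' /var/log/cassandra/system.log

# repair sessions that failed (e.g. validation errors)
grep -c 'failed with error' /var/log/cassandra/system.log
{code}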

Repair launched with:
{code}
nodetool repair --partitioner-range --parallel --in-local-dc {keyspace}
{code}

Attached log file from today.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9702) Repair running really slow

2015-07-01 Thread mlowicki (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mlowicki updated CASSANDRA-9702:

Description: 
We're using 2.1.x since the very beginning and we always had problem with 
failing or slow repair. In one data center we aren't able to finish repair for 
many weeks (partially because CASSANDRA-9681 as we needed to reboot nodes 
periodically).

I've launched it today morning (12 hours now) and monitor using 
https://github.com/spotify/cassandra-opstools/blob/master/bin/spcassandra-repairstats.
 For the first hour it progressed to 9.43% but then it took ~10 hours to reach 
9.44%. I see very rarely logs related to repair (each 15-20 minutes but 
sometimes nothing new for 1 hour).

Repair launched with:
{code}
nodetool repair --partitioner-range --parallel --in-local-dc {keyspace}
{code}

Attached log file from today.

We've ~4.1TB of data in 12 nodes with RF set to 3 (2 DC with 6 nodes each).

  was:
We're using 2.1.x since the very beginning and we always had problem with 
failing or slow repair. In one data center we aren't able to finish repair for 
many weeks (partially because CASSANDRA-9681 as we needed to reboot nodes 
periodically).

I've launched it today morning (12 hours now) and monitor using 
https://github.com/spotify/cassandra-opstools/blob/master/bin/spcassandra-repairstats.
 For the first hour it progressed to 9.43% but then it took ~10 hours to reach 
9.44%. I see very rarely logs related to repair (each 15-20 minutes but 
sometimes nothing new for 1 hour).

Repair launched with:
{code}
nodetool repair --partitioner-range --parallel --in-local-dc {keyspace}
{code}

Attached log file from today.


 Repair running really slow
 --

 Key: CASSANDRA-9702
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9702
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.7, Debian Wheezy
Reporter: mlowicki
 Attachments: db1.system.log


 We're using 2.1.x since the very beginning and we always had problem with 
 failing or slow repair. In one data center we aren't able to finish repair 
 for many weeks (partially because CASSANDRA-9681 as we needed to reboot nodes 
 periodically).
 I've launched it today morning (12 hours now) and monitor using 
 https://github.com/spotify/cassandra-opstools/blob/master/bin/spcassandra-repairstats.
  For the first hour it progressed to 9.43% but then it took ~10 hours to 
 reach 9.44%. I see very rarely logs related to repair (each 15-20 minutes but 
 sometimes nothing new for 1 hour).
 Repair launched with:
 {code}
 nodetool repair --partitioner-range --parallel --in-local-dc {keyspace}
 {code}
 Attached log file from today.
 We've ~4.1TB of data in 12 nodes with RF set to 3 (2 DC with 6 nodes each).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (CASSANDRA-9681) Memtable heap size grows and many long GC pauses are triggered

2015-07-01 Thread mlowicki (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mlowicki updated CASSANDRA-9681:

Comment: was deleted

(was: Great. I'm not a Java guy, so what is the best way to patch the jar file I've 
installed from the DataStax repo?)

 Memtable heap size grows and many long GC pauses are triggered
 --

 Key: CASSANDRA-9681
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9681
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.7, Debian Wheezy
Reporter: mlowicki
Assignee: Benedict
Priority: Critical
 Fix For: 2.1.x

 Attachments: cassandra.yaml, db5.system.log, db5.system.log.1.zip, 
 db5.system.log.2.zip, db5.system.log.3.zip, schema.cql, system.log.6.zip, 
 system.log.7.zip, system.log.8.zip, system.log.9.zip


 C* 2.1.7 cluster is behaving really bad after 1-2 days. 
 {{gauges.cassandra.jmx.org.apache.cassandra.metrics.ColumnFamily.AllMemtablesHeapSize.Value}}
  jumps to 7 GB 
 (https://www.dropbox.com/s/vraggy292erkzd2/Screenshot%202015-06-29%2019.12.53.png?dl=0)
  on 3/6 nodes in each data center and then there are many long GC pauses. 
 Cluster is using default heap size values ({{-Xms8192M -Xmx8192M -Xmn2048M}})
 Before C* 2.1.5 memtables heap size was basically constant ~500MB 
 (https://www.dropbox.com/s/fjdywik5lojstvn/Screenshot%202015-06-29%2019.30.00.png?dl=0)
 After restarting all nodes is behaves stable for 1-2days. Today I've done 
 that and long GC pauses are gone (~18:00 
 https://www.dropbox.com/s/7vo3ynz505rsfq3/Screenshot%202015-06-29%2019.28.37.png?dl=0).
  The only pattern we've found so far is that long GC  pauses are happening 
 basically at the same time on all nodes in the same data center - even on the 
 ones where memtables heap size is not growing.
 Cliffs on the graphs are nodes restarts.
 Used memory on boxes where {{AllMemtabelesHeapSize}} grows, stays at the same 
 level - 
 https://www.dropbox.com/s/tes9abykixs86rf/Screenshot%202015-06-29%2019.37.52.png?dl=0.
 Replication factor is set to 3.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-9681) Memtable heap size grows and many long GC pauses are triggered

2015-07-01 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14609685#comment-14609685
 ] 

mlowicki edited comment on CASSANDRA-9681 at 7/1/15 7:34 AM:
-

So far so good - 
https://www.dropbox.com/s/ad8te1g6iz2wofe/Screenshot%202015-07-01%2009.31.00.png?dl=0.
 I'll let you know whether or not it degrades. The GC pauses we talked about 
yesterday were probably caused by misbehaving Logstash or Kibana; I've checked 
with jstat and gc.log that everything is fine on these boxes.

All nodes in the cluster were patched around 7am.


was (Author: mlowicki):
So far so good - 
https://www.dropbox.com/s/ad8te1g6iz2wofe/Screenshot%202015-07-01%2009.31.00.png?dl=0.
 I'll let you know whether or not it degrades. The GC pauses we talked about 
yesterday were probably caused by misbehaving Logstash or Kibana; I've checked 
with jstat and gc.log that everything is fine on these boxes.

 Memtable heap size grows and many long GC pauses are triggered
 --

 Key: CASSANDRA-9681
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9681
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.7, Debian Wheezy
Reporter: mlowicki
Assignee: Benedict
Priority: Critical
 Fix For: 2.1.x

 Attachments: cassandra.yaml, db5.system.log, db5.system.log.1.zip, 
 db5.system.log.2.zip, db5.system.log.3.zip, schema.cql, system.log.6.zip, 
 system.log.7.zip, system.log.8.zip, system.log.9.zip


 C* 2.1.7 cluster is behaving really bad after 1-2 days. 
 {{gauges.cassandra.jmx.org.apache.cassandra.metrics.ColumnFamily.AllMemtablesHeapSize.Value}}
  jumps to 7 GB 
 (https://www.dropbox.com/s/vraggy292erkzd2/Screenshot%202015-06-29%2019.12.53.png?dl=0)
  on 3/6 nodes in each data center and then there are many long GC pauses. 
 Cluster is using default heap size values ({{-Xms8192M -Xmx8192M -Xmn2048M}})
 Before C* 2.1.5 memtables heap size was basically constant ~500MB 
 (https://www.dropbox.com/s/fjdywik5lojstvn/Screenshot%202015-06-29%2019.30.00.png?dl=0)
 After restarting all nodes is behaves stable for 1-2days. Today I've done 
 that and long GC pauses are gone (~18:00 
 https://www.dropbox.com/s/7vo3ynz505rsfq3/Screenshot%202015-06-29%2019.28.37.png?dl=0).
  The only pattern we've found so far is that long GC  pauses are happening 
 basically at the same time on all nodes in the same data center - even on the 
 ones where memtables heap size is not growing.
 Cliffs on the graphs are nodes restarts.
 Used memory on boxes where {{AllMemtabelesHeapSize}} grows, stays at the same 
 level - 
 https://www.dropbox.com/s/tes9abykixs86rf/Screenshot%202015-06-29%2019.37.52.png?dl=0.
 Replication factor is set to 3.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9681) Memtable heap size grows and many long GC pauses are triggered

2015-07-01 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14609685#comment-14609685
 ] 

mlowicki commented on CASSANDRA-9681:
-

So far so good - 
https://www.dropbox.com/s/ad8te1g6iz2wofe/Screenshot%202015-07-01%2009.31.00.png?dl=0.
 I'll let you know whether or not it degrades. The GC pauses we talked about 
yesterday were probably caused by misbehaving Logstash or Kibana; I've checked 
with jstat and gc.log that everything is fine on these boxes.

 Memtable heap size grows and many long GC pauses are triggered
 --

 Key: CASSANDRA-9681
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9681
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.7, Debian Wheezy
Reporter: mlowicki
Assignee: Benedict
Priority: Critical
 Fix For: 2.1.x

 Attachments: cassandra.yaml, db5.system.log, db5.system.log.1.zip, 
 db5.system.log.2.zip, db5.system.log.3.zip, schema.cql, system.log.6.zip, 
 system.log.7.zip, system.log.8.zip, system.log.9.zip


 C* 2.1.7 cluster is behaving really bad after 1-2 days. 
 {{gauges.cassandra.jmx.org.apache.cassandra.metrics.ColumnFamily.AllMemtablesHeapSize.Value}}
  jumps to 7 GB 
 (https://www.dropbox.com/s/vraggy292erkzd2/Screenshot%202015-06-29%2019.12.53.png?dl=0)
  on 3/6 nodes in each data center and then there are many long GC pauses. 
 Cluster is using default heap size values ({{-Xms8192M -Xmx8192M -Xmn2048M}})
 Before C* 2.1.5 memtables heap size was basically constant ~500MB 
 (https://www.dropbox.com/s/fjdywik5lojstvn/Screenshot%202015-06-29%2019.30.00.png?dl=0)
 After restarting all nodes is behaves stable for 1-2days. Today I've done 
 that and long GC pauses are gone (~18:00 
 https://www.dropbox.com/s/7vo3ynz505rsfq3/Screenshot%202015-06-29%2019.28.37.png?dl=0).
  The only pattern we've found so far is that long GC  pauses are happening 
 basically at the same time on all nodes in the same data center - even on the 
 ones where memtables heap size is not growing.
 Cliffs on the graphs are nodes restarts.
 Used memory on boxes where {{AllMemtabelesHeapSize}} grows, stays at the same 
 level - 
 https://www.dropbox.com/s/tes9abykixs86rf/Screenshot%202015-06-29%2019.37.52.png?dl=0.
 Replication factor is set to 3.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9681) Memtable heap size grows and many long GC pauses are triggered

2015-06-30 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14607913#comment-14607913
 ] 

mlowicki commented on CASSANDRA-9681:
-

https://www.dropbox.com/s/cnv36bbdznbwc0g/Screenshot%202015-06-30%2010.07.27.png?dl=0
 - this is a chart from the box where I was creating the heap dump. Please keep in 
mind that the metric changes rapidly - it can grow from ~300MB to over 1GB within 3 
minutes. I'll prepare a heap dump again today.

I'm using jmap:
{code}
root@db5:/var# jmap -F -dump:file=cassandra.bin 19189
{code}
and this C* node is dead to the rest of the cluster for ~40 minutes. Can this 
be avoided?
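
One way this can usually be avoided (a sketch, not a guaranteed fix): {{jmap -F}} goes 
through the serviceability agent and keeps the target JVM frozen for the entire dump, 
while a non-forced dump or {{jcmd GC.heap_dump}} asks the JVM to write the dump itself 
and is typically much faster. Both must be run as the user owning the Cassandra 
process; the pid is taken from the command above, the output paths and the assumption 
that this JDK 7 build ships {{GC.heap_dump}} are illustrative:
{code}
# ask the JVM to dump its own heap, run as the process owner
sudo -u cassandra jcmd 19189 GC.heap_dump /var/lib/cassandra/cassandra.hprof

# or a non-forced jmap dump, same requirement on the user
sudo -u cassandra jmap -dump:format=b,file=/var/lib/cassandra/cassandra.bin 19189
{code}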



 Memtable heap size grows and many long GC pauses are triggered
 --

 Key: CASSANDRA-9681
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9681
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.7, Debian Wheezy
Reporter: mlowicki
Assignee: Benedict
Priority: Critical
 Fix For: 2.1.x

 Attachments: cassandra.yaml, system.log.6.zip, system.log.7.zip, 
 system.log.8.zip, system.log.9.zip


 C* 2.1.7 cluster is behaving really bad after 1-2 days. 
 {{gauges.cassandra.jmx.org.apache.cassandra.metrics.ColumnFamily.AllMemtablesHeapSize.Value}}
  jumps to 7 GB 
 (https://www.dropbox.com/s/vraggy292erkzd2/Screenshot%202015-06-29%2019.12.53.png?dl=0)
  on 3/6 nodes in each data center and then there are many long GC pauses. 
 Cluster is using default heap size values ({{-Xms8192M -Xmx8192M -Xmn2048M}})
 Before C* 2.1.5 memtables heap size was basically constant ~500MB 
 (https://www.dropbox.com/s/fjdywik5lojstvn/Screenshot%202015-06-29%2019.30.00.png?dl=0)
 After restarting all nodes is behaves stable for 1-2days. Today I've done 
 that and long GC pauses are gone (~18:00 
 https://www.dropbox.com/s/7vo3ynz505rsfq3/Screenshot%202015-06-29%2019.28.37.png?dl=0).
  The only pattern we've found so far is that long GC  pauses are happening 
 basically at the same time on all nodes in the same data center - even on the 
 ones where memtables heap size is not growing.
 Cliffs on the graphs are nodes restarts.
 Used memory on boxes where {{AllMemtabelesHeapSize}} grows, stays at the same 
 level - 
 https://www.dropbox.com/s/tes9abykixs86rf/Screenshot%202015-06-29%2019.37.52.png?dl=0.
 Replication factor is set to 3.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-9681) Memtable heap size grows and many long GC pauses are triggered

2015-06-30 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14607942#comment-14607942
 ] 

mlowicki edited comment on CASSANDRA-9681 at 6/30/15 8:31 AM:
--

Attaching logs from db5:
{code}
-rw-r--r--  1 cassandra cassandra 3.3M Jun 30 07:58 system.log
-rw-r--r--  1 cassandra cassandra 854K Jun 29 14:19 system.log.1.zip
-rw-r--r--  1 cassandra cassandra 1.3M Jun 27 22:31 system.log.2.zip
-rw-r--r--  1 cassandra cassandra 1.8M Jun 24 11:43 system.log.3.zip
{code}

Memtable heap size on these boxes behaves as shown in the chart - 
https://www.dropbox.com/s/l9cgch2hlguco85/Screenshot%202015-06-30%2010.30.59.png?dl=0


was (Author: mlowicki):
Attaching logs from db5:
{code}
-rw-r--r--  1 cassandra cassandra 3.3M Jun 30 07:58 system.log
-rw-r--r--  1 cassandra cassandra 854K Jun 29 14:19 system.log.1.zip
-rw-r--r--  1 cassandra cassandra 1.3M Jun 27 22:31 system.log.2.zip
-rw-r--r--  1 cassandra cassandra 1.8M Jun 24 11:43 system.log.3.zip
{code}

 Memtable heap size grows and many long GC pauses are triggered
 --

 Key: CASSANDRA-9681
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9681
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.7, Debian Wheezy
Reporter: mlowicki
Assignee: Benedict
Priority: Critical
 Fix For: 2.1.x

 Attachments: cassandra.yaml, db5.system.log, db5.system.log.1.zip, 
 db5.system.log.2.zip, db5.system.log.3.zip, system.log.6.zip, 
 system.log.7.zip, system.log.8.zip, system.log.9.zip


 C* 2.1.7 cluster is behaving really bad after 1-2 days. 
 {{gauges.cassandra.jmx.org.apache.cassandra.metrics.ColumnFamily.AllMemtablesHeapSize.Value}}
  jumps to 7 GB 
 (https://www.dropbox.com/s/vraggy292erkzd2/Screenshot%202015-06-29%2019.12.53.png?dl=0)
  on 3/6 nodes in each data center and then there are many long GC pauses. 
 Cluster is using default heap size values ({{-Xms8192M -Xmx8192M -Xmn2048M}})
 Before C* 2.1.5 memtables heap size was basically constant ~500MB 
 (https://www.dropbox.com/s/fjdywik5lojstvn/Screenshot%202015-06-29%2019.30.00.png?dl=0)
 After restarting all nodes is behaves stable for 1-2days. Today I've done 
 that and long GC pauses are gone (~18:00 
 https://www.dropbox.com/s/7vo3ynz505rsfq3/Screenshot%202015-06-29%2019.28.37.png?dl=0).
  The only pattern we've found so far is that long GC  pauses are happening 
 basically at the same time on all nodes in the same data center - even on the 
 ones where memtables heap size is not growing.
 Cliffs on the graphs are nodes restarts.
 Used memory on boxes where {{AllMemtabelesHeapSize}} grows, stays at the same 
 level - 
 https://www.dropbox.com/s/tes9abykixs86rf/Screenshot%202015-06-29%2019.37.52.png?dl=0.
 Replication factor is set to 3.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9681) Memtable heap size grows and many long GC pauses are triggered

2015-06-30 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14607934#comment-14607934
 ] 

mlowicki commented on CASSANDRA-9681:
-

Without -F it gives:
{code}
root@db5:/var# jmap -dump:file=cassandra.bin 19189
19189: Unable to open socket file: target process not responding or HotSpot VM 
not loaded
The -F option can be used when the target process is not responding
{code}

I've started dumping the heap while the metric shows 1.7GB. Will attach it soon. 
Logs will be available shortly.
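
That "Unable to open socket file" error typically just means the attach could not 
happen because jmap was run as a different user (root) than the one owning the 
Cassandra JVM; the attach API requires the users to match, which is why only the 
forced (-F) mode worked. A sketch of the non-forced invocation under that assumption 
(output path is arbitrary):
{code}
# run jmap as the user that owns pid 19189 so the attach socket can be opened
sudo -u cassandra jmap -dump:format=b,file=/tmp/cassandra.hprof 19189
{code}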

 Memtable heap size grows and many long GC pauses are triggered
 --

 Key: CASSANDRA-9681
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9681
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.7, Debian Wheezy
Reporter: mlowicki
Assignee: Benedict
Priority: Critical
 Fix For: 2.1.x

 Attachments: cassandra.yaml, system.log.6.zip, system.log.7.zip, 
 system.log.8.zip, system.log.9.zip


 C* 2.1.7 cluster is behaving really bad after 1-2 days. 
 {{gauges.cassandra.jmx.org.apache.cassandra.metrics.ColumnFamily.AllMemtablesHeapSize.Value}}
  jumps to 7 GB 
 (https://www.dropbox.com/s/vraggy292erkzd2/Screenshot%202015-06-29%2019.12.53.png?dl=0)
  on 3/6 nodes in each data center and then there are many long GC pauses. 
 Cluster is using default heap size values ({{-Xms8192M -Xmx8192M -Xmn2048M}})
 Before C* 2.1.5 memtables heap size was basically constant ~500MB 
 (https://www.dropbox.com/s/fjdywik5lojstvn/Screenshot%202015-06-29%2019.30.00.png?dl=0)
 After restarting all nodes is behaves stable for 1-2days. Today I've done 
 that and long GC pauses are gone (~18:00 
 https://www.dropbox.com/s/7vo3ynz505rsfq3/Screenshot%202015-06-29%2019.28.37.png?dl=0).
  The only pattern we've found so far is that long GC  pauses are happening 
 basically at the same time on all nodes in the same data center - even on the 
 ones where memtables heap size is not growing.
 Cliffs on the graphs are nodes restarts.
 Used memory on boxes where {{AllMemtabelesHeapSize}} grows, stays at the same 
 level - 
 https://www.dropbox.com/s/tes9abykixs86rf/Screenshot%202015-06-29%2019.37.52.png?dl=0.
 Replication factor is set to 3.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-9681) Memtable heap size grows and many long GC pauses are triggered

2015-06-30 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14607913#comment-14607913
 ] 

mlowicki edited comment on CASSANDRA-9681 at 6/30/15 8:12 AM:
--

https://www.dropbox.com/s/cnv36bbdznbwc0g/Screenshot%202015-06-30%2010.07.27.png?dl=0
 - this is a chart from the box where I was creating the heap dump. Please keep in 
mind that the metric changes rapidly - it can grow from ~300MB to over 1GB within 3 
minutes. I'll prepare a heap dump again today.

I'm using jmap:
{code}
root@db5:/var# jmap -F -dump:file=cassandra.bin 19189
{code}
and this C* node is dead to the rest of the cluster for ~40 minutes 
(https://gist.github.com/mlowicki/7645963e2a1ac4563578). Can this be avoided?




was (Author: mlowicki):
https://www.dropbox.com/s/cnv36bbdznbwc0g/Screenshot%202015-06-30%2010.07.27.png?dl=0
 - this is a chart from the box where I was creating the heap dump. Please keep in 
mind that the metric changes rapidly - it can grow from ~300MB to over 1GB within 3 
minutes. I'll prepare a heap dump again today.

I'm using jmap:
{code}
root@db5:/var# jmap -F -dump:file=cassandra.bin 19189
{code}
and this C* node is dead to the rest of the cluster for ~40 minutes. Can this 
be avoided?



 Memtable heap size grows and many long GC pauses are triggered
 --

 Key: CASSANDRA-9681
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9681
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.7, Debian Wheezy
Reporter: mlowicki
Assignee: Benedict
Priority: Critical
 Fix For: 2.1.x

 Attachments: cassandra.yaml, system.log.6.zip, system.log.7.zip, 
 system.log.8.zip, system.log.9.zip


 C* 2.1.7 cluster is behaving really bad after 1-2 days. 
 {{gauges.cassandra.jmx.org.apache.cassandra.metrics.ColumnFamily.AllMemtablesHeapSize.Value}}
  jumps to 7 GB 
 (https://www.dropbox.com/s/vraggy292erkzd2/Screenshot%202015-06-29%2019.12.53.png?dl=0)
  on 3/6 nodes in each data center and then there are many long GC pauses. 
 Cluster is using default heap size values ({{-Xms8192M -Xmx8192M -Xmn2048M}})
 Before C* 2.1.5 memtables heap size was basically constant ~500MB 
 (https://www.dropbox.com/s/fjdywik5lojstvn/Screenshot%202015-06-29%2019.30.00.png?dl=0)
 After restarting all nodes is behaves stable for 1-2days. Today I've done 
 that and long GC pauses are gone (~18:00 
 https://www.dropbox.com/s/7vo3ynz505rsfq3/Screenshot%202015-06-29%2019.28.37.png?dl=0).
  The only pattern we've found so far is that long GC  pauses are happening 
 basically at the same time on all nodes in the same data center - even on the 
 ones where memtables heap size is not growing.
 Cliffs on the graphs are nodes restarts.
 Used memory on boxes where {{AllMemtabelesHeapSize}} grows, stays at the same 
 level - 
 https://www.dropbox.com/s/tes9abykixs86rf/Screenshot%202015-06-29%2019.37.52.png?dl=0.
 Replication factor is set to 3.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9681) Memtable heap size grows and many long GC pauses are triggered

2015-06-30 Thread mlowicki (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mlowicki updated CASSANDRA-9681:

Attachment: db5.system.log.3.zip
db5.system.log.2.zip
db5.system.log.1.zip
db5.system.log

Attaching logs from db5:
{code}
-rw-r--r--  1 cassandra cassandra 3.3M Jun 30 07:58 system.log
-rw-r--r--  1 cassandra cassandra 854K Jun 29 14:19 system.log.1.zip
-rw-r--r--  1 cassandra cassandra 1.3M Jun 27 22:31 system.log.2.zip
-rw-r--r--  1 cassandra cassandra 1.8M Jun 24 11:43 system.log.3.zip
{code}

 Memtable heap size grows and many long GC pauses are triggered
 --

 Key: CASSANDRA-9681
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9681
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.7, Debian Wheezy
Reporter: mlowicki
Assignee: Benedict
Priority: Critical
 Fix For: 2.1.x

 Attachments: cassandra.yaml, db5.system.log, db5.system.log.1.zip, 
 db5.system.log.2.zip, db5.system.log.3.zip, system.log.6.zip, 
 system.log.7.zip, system.log.8.zip, system.log.9.zip


 C* 2.1.7 cluster is behaving really bad after 1-2 days. 
 {{gauges.cassandra.jmx.org.apache.cassandra.metrics.ColumnFamily.AllMemtablesHeapSize.Value}}
  jumps to 7 GB 
 (https://www.dropbox.com/s/vraggy292erkzd2/Screenshot%202015-06-29%2019.12.53.png?dl=0)
  on 3/6 nodes in each data center and then there are many long GC pauses. 
 Cluster is using default heap size values ({{-Xms8192M -Xmx8192M -Xmn2048M}})
 Before C* 2.1.5 memtables heap size was basically constant ~500MB 
 (https://www.dropbox.com/s/fjdywik5lojstvn/Screenshot%202015-06-29%2019.30.00.png?dl=0)
 After restarting all nodes is behaves stable for 1-2days. Today I've done 
 that and long GC pauses are gone (~18:00 
 https://www.dropbox.com/s/7vo3ynz505rsfq3/Screenshot%202015-06-29%2019.28.37.png?dl=0).
  The only pattern we've found so far is that long GC  pauses are happening 
 basically at the same time on all nodes in the same data center - even on the 
 ones where memtables heap size is not growing.
 Cliffs on the graphs are nodes restarts.
 Used memory on boxes where {{AllMemtabelesHeapSize}} grows, stays at the same 
 level - 
 https://www.dropbox.com/s/tes9abykixs86rf/Screenshot%202015-06-29%2019.37.52.png?dl=0.
 Replication factor is set to 3.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9681) Memtable heap size grows and many long GC pauses are triggered

2015-06-30 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14608231#comment-14608231
 ] 

mlowicki commented on CASSANDRA-9681:
-

Cool. If more logs / dumps / cheers are needed, just let me know.

 Memtable heap size grows and many long GC pauses are triggered
 --

 Key: CASSANDRA-9681
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9681
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.7, Debian Wheezy
Reporter: mlowicki
Assignee: Benedict
Priority: Critical
 Fix For: 2.1.x

 Attachments: cassandra.yaml, db5.system.log, db5.system.log.1.zip, 
 db5.system.log.2.zip, db5.system.log.3.zip, schema.cql, system.log.6.zip, 
 system.log.7.zip, system.log.8.zip, system.log.9.zip


 C* 2.1.7 cluster is behaving really bad after 1-2 days. 
 {{gauges.cassandra.jmx.org.apache.cassandra.metrics.ColumnFamily.AllMemtablesHeapSize.Value}}
  jumps to 7 GB 
 (https://www.dropbox.com/s/vraggy292erkzd2/Screenshot%202015-06-29%2019.12.53.png?dl=0)
  on 3/6 nodes in each data center and then there are many long GC pauses. 
 Cluster is using default heap size values ({{-Xms8192M -Xmx8192M -Xmn2048M}})
 Before C* 2.1.5 memtables heap size was basically constant ~500MB 
 (https://www.dropbox.com/s/fjdywik5lojstvn/Screenshot%202015-06-29%2019.30.00.png?dl=0)
 After restarting all nodes is behaves stable for 1-2days. Today I've done 
 that and long GC pauses are gone (~18:00 
 https://www.dropbox.com/s/7vo3ynz505rsfq3/Screenshot%202015-06-29%2019.28.37.png?dl=0).
  The only pattern we've found so far is that long GC  pauses are happening 
 basically at the same time on all nodes in the same data center - even on the 
 ones where memtables heap size is not growing.
 Cliffs on the graphs are nodes restarts.
 Used memory on boxes where {{AllMemtabelesHeapSize}} grows, stays at the same 
 level - 
 https://www.dropbox.com/s/tes9abykixs86rf/Screenshot%202015-06-29%2019.37.52.png?dl=0.
 Replication factor is set to 3.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9681) Memtable heap size grows and many long GC pauses are triggered

2015-06-30 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14608278#comment-14608278
 ] 

mlowicki commented on CASSANDRA-9681:
-

Sure, just let me know and we'll try to apply the patch.

 Memtable heap size grows and many long GC pauses are triggered
 --

 Key: CASSANDRA-9681
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9681
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.7, Debian Wheezy
Reporter: mlowicki
Assignee: Benedict
Priority: Critical
 Fix For: 2.1.x

 Attachments: cassandra.yaml, db5.system.log, db5.system.log.1.zip, 
 db5.system.log.2.zip, db5.system.log.3.zip, schema.cql, system.log.6.zip, 
 system.log.7.zip, system.log.8.zip, system.log.9.zip


 C* 2.1.7 cluster is behaving really bad after 1-2 days. 
 {{gauges.cassandra.jmx.org.apache.cassandra.metrics.ColumnFamily.AllMemtablesHeapSize.Value}}
  jumps to 7 GB 
 (https://www.dropbox.com/s/vraggy292erkzd2/Screenshot%202015-06-29%2019.12.53.png?dl=0)
  on 3/6 nodes in each data center and then there are many long GC pauses. 
 Cluster is using default heap size values ({{-Xms8192M -Xmx8192M -Xmn2048M}})
 Before C* 2.1.5 memtables heap size was basically constant ~500MB 
 (https://www.dropbox.com/s/fjdywik5lojstvn/Screenshot%202015-06-29%2019.30.00.png?dl=0)
 After restarting all nodes is behaves stable for 1-2days. Today I've done 
 that and long GC pauses are gone (~18:00 
 https://www.dropbox.com/s/7vo3ynz505rsfq3/Screenshot%202015-06-29%2019.28.37.png?dl=0).
  The only pattern we've found so far is that long GC  pauses are happening 
 basically at the same time on all nodes in the same data center - even on the 
 ones where memtables heap size is not growing.
 Cliffs on the graphs are nodes restarts.
 Used memory on boxes where {{AllMemtabelesHeapSize}} grows, stays at the same 
 level - 
 https://www.dropbox.com/s/tes9abykixs86rf/Screenshot%202015-06-29%2019.37.52.png?dl=0.
 Replication factor is set to 3.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9681) Memtable heap size grows and many long GC pauses are triggered

2015-06-30 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14608082#comment-14608082
 ] 

mlowicki commented on CASSANDRA-9681:
-

Heap dump - 
https://drive.google.com/file/d/0B_8mc_afWmd2bGhpd0p2Ql9UMkU/view?usp=sharing.

 Memtable heap size grows and many long GC pauses are triggered
 --

 Key: CASSANDRA-9681
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9681
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.7, Debian Wheezy
Reporter: mlowicki
Assignee: Benedict
Priority: Critical
 Fix For: 2.1.x

 Attachments: cassandra.yaml, db5.system.log, db5.system.log.1.zip, 
 db5.system.log.2.zip, db5.system.log.3.zip, schema.cql, system.log.6.zip, 
 system.log.7.zip, system.log.8.zip, system.log.9.zip


 C* 2.1.7 cluster is behaving really bad after 1-2 days. 
 {{gauges.cassandra.jmx.org.apache.cassandra.metrics.ColumnFamily.AllMemtablesHeapSize.Value}}
  jumps to 7 GB 
 (https://www.dropbox.com/s/vraggy292erkzd2/Screenshot%202015-06-29%2019.12.53.png?dl=0)
  on 3/6 nodes in each data center and then there are many long GC pauses. 
 Cluster is using default heap size values ({{-Xms8192M -Xmx8192M -Xmn2048M}})
 Before C* 2.1.5 memtables heap size was basically constant ~500MB 
 (https://www.dropbox.com/s/fjdywik5lojstvn/Screenshot%202015-06-29%2019.30.00.png?dl=0)
 After restarting all nodes is behaves stable for 1-2days. Today I've done 
 that and long GC pauses are gone (~18:00 
 https://www.dropbox.com/s/7vo3ynz505rsfq3/Screenshot%202015-06-29%2019.28.37.png?dl=0).
  The only pattern we've found so far is that long GC  pauses are happening 
 basically at the same time on all nodes in the same data center - even on the 
 ones where memtables heap size is not growing.
 Cliffs on the graphs are nodes restarts.
 Used memory on boxes where {{AllMemtabelesHeapSize}} grows, stays at the same 
 level - 
 https://www.dropbox.com/s/tes9abykixs86rf/Screenshot%202015-06-29%2019.37.52.png?dl=0.
 Replication factor is set to 3.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9681) Memtable heap size grows and many long GC pauses are triggered

2015-06-30 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14608449#comment-14608449
 ] 

mlowicki commented on CASSANDRA-9681:
-

Great. I'm not a Java guy, so what is the best way to patch the jar file I've 
installed from the DataStax repo?
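
One common approach, sketched under the assumption that the patch applies to the 
cassandra-2.1 source tree and that the package put the jar under 
/usr/share/cassandra/ (the paths, the patch file name and the exact jar name are 
illustrative and depend on your build and packaging):
{code}
# build a patched jar from source (requires git, ant and a JDK)
git clone https://github.com/apache/cassandra.git && cd cassandra
git checkout cassandra-2.1.7
git apply /path/to/9681.patch        # hypothetical patch file name
ant jar                              # produces build/apache-cassandra-*.jar

# on each node: back up the packaged jar, swap in the patched one, restart
cp /usr/share/cassandra/apache-cassandra-2.1.7.jar{,.orig}
cp build/apache-cassandra-*.jar /usr/share/cassandra/apache-cassandra-2.1.7.jar
service cassandra restart
{code}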

 Memtable heap size grows and many long GC pauses are triggered
 --

 Key: CASSANDRA-9681
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9681
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.7, Debian Wheezy
Reporter: mlowicki
Assignee: Benedict
Priority: Critical
 Fix For: 2.1.x

 Attachments: cassandra.yaml, db5.system.log, db5.system.log.1.zip, 
 db5.system.log.2.zip, db5.system.log.3.zip, schema.cql, system.log.6.zip, 
 system.log.7.zip, system.log.8.zip, system.log.9.zip


 C* 2.1.7 cluster is behaving really bad after 1-2 days. 
 {{gauges.cassandra.jmx.org.apache.cassandra.metrics.ColumnFamily.AllMemtablesHeapSize.Value}}
  jumps to 7 GB 
 (https://www.dropbox.com/s/vraggy292erkzd2/Screenshot%202015-06-29%2019.12.53.png?dl=0)
  on 3/6 nodes in each data center and then there are many long GC pauses. 
 Cluster is using default heap size values ({{-Xms8192M -Xmx8192M -Xmn2048M}})
 Before C* 2.1.5 memtables heap size was basically constant ~500MB 
 (https://www.dropbox.com/s/fjdywik5lojstvn/Screenshot%202015-06-29%2019.30.00.png?dl=0)
 After restarting all nodes is behaves stable for 1-2days. Today I've done 
 that and long GC pauses are gone (~18:00 
 https://www.dropbox.com/s/7vo3ynz505rsfq3/Screenshot%202015-06-29%2019.28.37.png?dl=0).
  The only pattern we've found so far is that long GC  pauses are happening 
 basically at the same time on all nodes in the same data center - even on the 
 ones where memtables heap size is not growing.
 Cliffs on the graphs are nodes restarts.
 Used memory on boxes where {{AllMemtabelesHeapSize}} grows, stays at the same 
 level - 
 https://www.dropbox.com/s/tes9abykixs86rf/Screenshot%202015-06-29%2019.37.52.png?dl=0.
 Replication factor is set to 3.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9681) Memtable heap size grows and many long GC pauses are triggered

2015-06-30 Thread mlowicki (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mlowicki updated CASSANDRA-9681:

Attachment: schema.cql

Attaching our schema.

We're using LCS and we aren't using secondary indexes.

The heap dump is uploading to Google Drive, so it should be available soon.
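
While the full dump uploads, a quick class histogram can already hint at what is 
holding the memtable heap (a sketch; run as the process owner, the pid is taken from 
the earlier jmap commands, and note that {{:live}} triggers a full GC):
{code}
# top object types on the heap, without writing a full dump
sudo -u cassandra jmap -histo:live 19189 | head -n 30
{code}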

 Memtable heap size grows and many long GC pauses are triggered
 --

 Key: CASSANDRA-9681
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9681
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.7, Debian Wheezy
Reporter: mlowicki
Assignee: Benedict
Priority: Critical
 Fix For: 2.1.x

 Attachments: cassandra.yaml, db5.system.log, db5.system.log.1.zip, 
 db5.system.log.2.zip, db5.system.log.3.zip, schema.cql, system.log.6.zip, 
 system.log.7.zip, system.log.8.zip, system.log.9.zip


 C* 2.1.7 cluster is behaving really bad after 1-2 days. 
 {{gauges.cassandra.jmx.org.apache.cassandra.metrics.ColumnFamily.AllMemtablesHeapSize.Value}}
  jumps to 7 GB 
 (https://www.dropbox.com/s/vraggy292erkzd2/Screenshot%202015-06-29%2019.12.53.png?dl=0)
  on 3/6 nodes in each data center and then there are many long GC pauses. 
 Cluster is using default heap size values ({{-Xms8192M -Xmx8192M -Xmn2048M}})
 Before C* 2.1.5 memtables heap size was basically constant ~500MB 
 (https://www.dropbox.com/s/fjdywik5lojstvn/Screenshot%202015-06-29%2019.30.00.png?dl=0)
 After restarting all nodes is behaves stable for 1-2days. Today I've done 
 that and long GC pauses are gone (~18:00 
 https://www.dropbox.com/s/7vo3ynz505rsfq3/Screenshot%202015-06-29%2019.28.37.png?dl=0).
  The only pattern we've found so far is that long GC  pauses are happening 
 basically at the same time on all nodes in the same data center - even on the 
 ones where memtables heap size is not growing.
 Cliffs on the graphs are nodes restarts.
 Used memory on boxes where {{AllMemtabelesHeapSize}} grows, stays at the same 
 level - 
 https://www.dropbox.com/s/tes9abykixs86rf/Screenshot%202015-06-29%2019.37.52.png?dl=0.
 Replication factor is set to 3.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-9681) Memtable heap size grows and many long GC pauses are triggered

2015-06-29 Thread mlowicki (JIRA)
mlowicki created CASSANDRA-9681:
---

 Summary: Memtable heap size grows and many long GC pauses are 
triggered
 Key: CASSANDRA-9681
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9681
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.7, Debian Wheezy
Reporter: mlowicki
Priority: Critical


The C* 2.1.7 cluster starts behaving really badly after 1-2 days. 
{{gauges.cassandra.jmx.org.apache.cassandra.metrics.ColumnFamily.AllMemtablesHeapSize.Value}}
 jumps to 7 GB 
(https://www.dropbox.com/s/vraggy292erkzd2/Screenshot%202015-06-29%2019.12.53.png?dl=0)
 on 3/6 nodes in each data center, and then there are many long GC pauses. 
The cluster is using the default heap size values ({{-Xms8192M -Xmx8192M -Xmn2048M}}).

Before C* 2.1.5 the memtables heap size was basically constant at ~500MB 
(https://www.dropbox.com/s/fjdywik5lojstvn/Screenshot%202015-06-29%2019.30.00.png?dl=0).

After restarting all nodes it behaves stably for 1-2 days. Today I've done that 
and the long GC pauses are gone (~18:00, 
https://www.dropbox.com/s/7vo3ynz505rsfq3/Screenshot%202015-06-29%2019.28.37.png?dl=0).
 The only pattern we've found so far is that the long GC pauses happen 
basically at the same time on all nodes in the same data center - even on the 
ones where the memtable heap size is not growing.
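
The pauses themselves can be pulled straight from Cassandra's own log, since 
GCInspector records collections it considers long (a sketch; the log path is an 
assumption, adjust it to your install):
{code}
# recent GC pauses as seen by Cassandra
grep 'GCInspector' /var/log/cassandra/system.log | tail -n 20
{code}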



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9681) Memtable heap size grows and many long GC pauses are triggered

2015-06-29 Thread mlowicki (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mlowicki updated CASSANDRA-9681:

Description: 
C* 2.1.7 cluster is behaving really bad after 1-2 days. 
{{gauges.cassandra.jmx.org.apache.cassandra.metrics.ColumnFamily.AllMemtablesHeapSize.Value}}
 jumps to 7 GB 
(https://www.dropbox.com/s/vraggy292erkzd2/Screenshot%202015-06-29%2019.12.53.png?dl=0)
 on 3/6 nodes in each data center and then there are many long GC pauses. 
Cluster is using default heap size values ({{-Xms8192M -Xmx8192M -Xmn2048M}})

Before C* 2.1.5 memtables heap size was basically constant ~500MB 
(https://www.dropbox.com/s/fjdywik5lojstvn/Screenshot%202015-06-29%2019.30.00.png?dl=0)

After restarting all nodes is behaves stable for 1-2days. Today I've done that 
and long GC pauses are gone (~18:00 
https://www.dropbox.com/s/7vo3ynz505rsfq3/Screenshot%202015-06-29%2019.28.37.png?dl=0).
 The only pattern we've found so far is that long GC  pauses are happening 
basically at the same time on all nodes in the same data center - even on the 
ones where memtables heap size is not growing.

Cliffs on the graphs are nodes restarts.

  was:
C* 2.1.7 cluster is behaving really bad after 1-2 days. 
{{gauges.cassandra.jmx.org.apache.cassandra.metrics.ColumnFamily.AllMemtablesHeapSize.Value}}
 jumps to 7 GB 
(https://www.dropbox.com/s/vraggy292erkzd2/Screenshot%202015-06-29%2019.12.53.png?dl=0)
 on 3/6 nodes in each data center and then there are many long GC pauses. 
Cluster is using default heap size values ({{-Xms8192M -Xmx8192M -Xmn2048M}})

Before C* 2.1.5 memtables heap size was basically constant ~500MB 
(https://www.dropbox.com/s/fjdywik5lojstvn/Screenshot%202015-06-29%2019.30.00.png?dl=0)

After restarting all nodes is behaves stable for 1-2days. Today I've done that 
and long GC pauses are gone (~18:00 
https://www.dropbox.com/s/7vo3ynz505rsfq3/Screenshot%202015-06-29%2019.28.37.png?dl=0).
 The only pattern we've found so far is that long GC  pauses are happening 
basically at the same time on all nodes in the same data center - even on the 
ones where memtables heap size is not growing.


 Memtable heap size grows and many long GC pauses are triggered
 --

 Key: CASSANDRA-9681
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9681
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.7, Debian Wheezy
Reporter: mlowicki
Priority: Critical

 C* 2.1.7 cluster is behaving really bad after 1-2 days. 
 {{gauges.cassandra.jmx.org.apache.cassandra.metrics.ColumnFamily.AllMemtablesHeapSize.Value}}
  jumps to 7 GB 
 (https://www.dropbox.com/s/vraggy292erkzd2/Screenshot%202015-06-29%2019.12.53.png?dl=0)
  on 3/6 nodes in each data center and then there are many long GC pauses. 
 Cluster is using default heap size values ({{-Xms8192M -Xmx8192M -Xmn2048M}})
 Before C* 2.1.5 memtables heap size was basically constant ~500MB 
 (https://www.dropbox.com/s/fjdywik5lojstvn/Screenshot%202015-06-29%2019.30.00.png?dl=0)
 After restarting all nodes is behaves stable for 1-2days. Today I've done 
 that and long GC pauses are gone (~18:00 
 https://www.dropbox.com/s/7vo3ynz505rsfq3/Screenshot%202015-06-29%2019.28.37.png?dl=0).
  The only pattern we've found so far is that long GC  pauses are happening 
 basically at the same time on all nodes in the same data center - even on the 
 ones where memtables heap size is not growing.
 Cliffs on the graphs are nodes restarts.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9681) Memtable heap size grows and many long GC pauses are triggered

2015-06-29 Thread mlowicki (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mlowicki updated CASSANDRA-9681:

Description: 
C* 2.1.7 cluster is behaving really bad after 1-2 days. 
{{gauges.cassandra.jmx.org.apache.cassandra.metrics.ColumnFamily.AllMemtablesHeapSize.Value}}
 jumps to 7 GB 
(https://www.dropbox.com/s/vraggy292erkzd2/Screenshot%202015-06-29%2019.12.53.png?dl=0)
 on 3/6 nodes in each data center and then there are many long GC pauses. 
Cluster is using default heap size values ({{-Xms8192M -Xmx8192M -Xmn2048M}})

Before C* 2.1.5 memtables heap size was basically constant ~500MB 
(https://www.dropbox.com/s/fjdywik5lojstvn/Screenshot%202015-06-29%2019.30.00.png?dl=0)

After restarting all nodes is behaves stable for 1-2days. Today I've done that 
and long GC pauses are gone (~18:00 
https://www.dropbox.com/s/7vo3ynz505rsfq3/Screenshot%202015-06-29%2019.28.37.png?dl=0).
 The only pattern we've found so far is that long GC  pauses are happening 
basically at the same time on all nodes in the same data center - even on the 
ones where memtables heap size is not growing.

Cliffs on the graphs are nodes restarts.

Used memory on boxes where {{AllMemtabelesHeapSize}} grows, stays at the same 
level - 
https://www.dropbox.com/s/tes9abykixs86rf/Screenshot%202015-06-29%2019.37.52.png?dl=0.

  was:
C* 2.1.7 cluster is behaving really bad after 1-2 days. 
{{gauges.cassandra.jmx.org.apache.cassandra.metrics.ColumnFamily.AllMemtablesHeapSize.Value}}
 jumps to 7 GB 
(https://www.dropbox.com/s/vraggy292erkzd2/Screenshot%202015-06-29%2019.12.53.png?dl=0)
 on 3/6 nodes in each data center and then there are many long GC pauses. 
Cluster is using default heap size values ({{-Xms8192M -Xmx8192M -Xmn2048M}})

Before C* 2.1.5 memtables heap size was basically constant ~500MB 
(https://www.dropbox.com/s/fjdywik5lojstvn/Screenshot%202015-06-29%2019.30.00.png?dl=0)

After restarting all nodes is behaves stable for 1-2days. Today I've done that 
and long GC pauses are gone (~18:00 
https://www.dropbox.com/s/7vo3ynz505rsfq3/Screenshot%202015-06-29%2019.28.37.png?dl=0).
 The only pattern we've found so far is that long GC  pauses are happening 
basically at the same time on all nodes in the same data center - even on the 
ones where memtables heap size is not growing.

Cliffs on the graphs are nodes restarts.


 Memtable heap size grows and many long GC pauses are triggered
 --

 Key: CASSANDRA-9681
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9681
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.7, Debian Wheezy
Reporter: mlowicki
Priority: Critical

 C* 2.1.7 cluster is behaving really bad after 1-2 days. 
 {{gauges.cassandra.jmx.org.apache.cassandra.metrics.ColumnFamily.AllMemtablesHeapSize.Value}}
  jumps to 7 GB 
 (https://www.dropbox.com/s/vraggy292erkzd2/Screenshot%202015-06-29%2019.12.53.png?dl=0)
  on 3/6 nodes in each data center and then there are many long GC pauses. 
 Cluster is using default heap size values ({{-Xms8192M -Xmx8192M -Xmn2048M}})
 Before C* 2.1.5 memtables heap size was basically constant ~500MB 
 (https://www.dropbox.com/s/fjdywik5lojstvn/Screenshot%202015-06-29%2019.30.00.png?dl=0)
 After restarting all nodes is behaves stable for 1-2days. Today I've done 
 that and long GC pauses are gone (~18:00 
 https://www.dropbox.com/s/7vo3ynz505rsfq3/Screenshot%202015-06-29%2019.28.37.png?dl=0).
  The only pattern we've found so far is that long GC  pauses are happening 
 basically at the same time on all nodes in the same data center - even on the 
 ones where memtables heap size is not growing.
 Cliffs on the graphs are nodes restarts.
 Used memory on boxes where {{AllMemtabelesHeapSize}} grows, stays at the same 
 level - 
 https://www.dropbox.com/s/tes9abykixs86rf/Screenshot%202015-06-29%2019.37.52.png?dl=0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (CASSANDRA-9681) Memtable heap size grows and many long GC pauses are triggered

2015-06-29 Thread mlowicki (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mlowicki updated CASSANDRA-9681:

Comment: was deleted

(was: I'll get heap dump probably tomorrow then as nodes have been restarted ~2 
hours ago.)

 Memtable heap size grows and many long GC pauses are triggered
 --

 Key: CASSANDRA-9681
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9681
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.7, Debian Wheezy
Reporter: mlowicki
Assignee: Benedict
Priority: Critical
 Fix For: 2.1.x

 Attachments: cassandra.yaml, system.log.6.zip, system.log.7.zip, 
 system.log.8.zip, system.log.9.zip


 C* 2.1.7 cluster is behaving really badly after 1-2 days. 
 {{gauges.cassandra.jmx.org.apache.cassandra.metrics.ColumnFamily.AllMemtablesHeapSize.Value}}
 jumps to 7 GB 
 (https://www.dropbox.com/s/vraggy292erkzd2/Screenshot%202015-06-29%2019.12.53.png?dl=0)
 on 3 of 6 nodes in each data center and then there are many long GC pauses. 
 The cluster is using the default heap size values ({{-Xms8192M -Xmx8192M -Xmn2048M}}).
 Before C* 2.1.5 the memtable heap size was basically constant at ~500 MB 
 (https://www.dropbox.com/s/fjdywik5lojstvn/Screenshot%202015-06-29%2019.30.00.png?dl=0).
 After restarting all nodes it behaves stably for 1-2 days. Today I've done 
 that and the long GC pauses are gone (~18:00, 
 https://www.dropbox.com/s/7vo3ynz505rsfq3/Screenshot%202015-06-29%2019.28.37.png?dl=0).
 The only pattern we've found so far is that the long GC pauses happen at 
 basically the same time on all nodes in the same data center, even on the 
 ones where the memtable heap size is not growing.
 Cliffs on the graphs are node restarts.
 Used memory on boxes where {{AllMemtablesHeapSize}} grows stays at the same 
 level - 
 https://www.dropbox.com/s/tes9abykixs86rf/Screenshot%202015-06-29%2019.37.52.png?dl=0.
 Replication factor is set to 3.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9681) Memtable heap size grows and many long GC pauses are triggered

2015-06-29 Thread mlowicki (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mlowicki updated CASSANDRA-9681:

Attachment: cassandra.yaml

 Memtable heap size grows and many long GC pauses are triggered
 --

 Key: CASSANDRA-9681
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9681
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.7, Debian Wheezy
Reporter: mlowicki
Assignee: Benedict
Priority: Critical
 Fix For: 2.1.x

 Attachments: cassandra.yaml


 C* 2.1.7 cluster is behaving really badly after 1-2 days. 
 {{gauges.cassandra.jmx.org.apache.cassandra.metrics.ColumnFamily.AllMemtablesHeapSize.Value}}
 jumps to 7 GB 
 (https://www.dropbox.com/s/vraggy292erkzd2/Screenshot%202015-06-29%2019.12.53.png?dl=0)
 on 3 of 6 nodes in each data center and then there are many long GC pauses. 
 The cluster is using the default heap size values ({{-Xms8192M -Xmx8192M -Xmn2048M}}).
 Before C* 2.1.5 the memtable heap size was basically constant at ~500 MB 
 (https://www.dropbox.com/s/fjdywik5lojstvn/Screenshot%202015-06-29%2019.30.00.png?dl=0).
 After restarting all nodes it behaves stably for 1-2 days. Today I've done 
 that and the long GC pauses are gone (~18:00, 
 https://www.dropbox.com/s/7vo3ynz505rsfq3/Screenshot%202015-06-29%2019.28.37.png?dl=0).
 The only pattern we've found so far is that the long GC pauses happen at 
 basically the same time on all nodes in the same data center, even on the 
 ones where the memtable heap size is not growing.
 Cliffs on the graphs are node restarts.
 Used memory on boxes where {{AllMemtablesHeapSize}} grows stays at the same 
 level - 
 https://www.dropbox.com/s/tes9abykixs86rf/Screenshot%202015-06-29%2019.37.52.png?dl=0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9681) Memtable heap size grows and many long GC pauses are triggered

2015-06-29 Thread mlowicki (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mlowicki updated CASSANDRA-9681:

Attachment: system.log.9.zip
system.log.8.zip
system.log.7.zip
system.log.6.zip

{code}
-rw-r--r--  1 cassandra cassandra 1.3M Jun 12 14:32 system.log.6.zip
-rw-r--r--  1 cassandra cassandra 1.9M Jun 10 13:11 system.log.7.zip
-rw-r--r--  1 cassandra cassandra 1.9M Jun  6 21:55 system.log.8.zip
-rw-r--r--  1 cassandra cassandra 1.9M Jun  4 01:29 system.log.9.zip
{code}

Logs from the time when it basically started. If more are needed, just let me know.

 Memtable heap size grows and many long GC pauses are triggered
 --

 Key: CASSANDRA-9681
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9681
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.7, Debian Wheezy
Reporter: mlowicki
Assignee: Benedict
Priority: Critical
 Fix For: 2.1.x

 Attachments: cassandra.yaml, system.log.6.zip, system.log.7.zip, 
 system.log.8.zip, system.log.9.zip


 C* 2.1.7 cluster is behaving really badly after 1-2 days. 
 {{gauges.cassandra.jmx.org.apache.cassandra.metrics.ColumnFamily.AllMemtablesHeapSize.Value}}
 jumps to 7 GB 
 (https://www.dropbox.com/s/vraggy292erkzd2/Screenshot%202015-06-29%2019.12.53.png?dl=0)
 on 3 of 6 nodes in each data center and then there are many long GC pauses. 
 The cluster is using the default heap size values ({{-Xms8192M -Xmx8192M -Xmn2048M}}).
 Before C* 2.1.5 the memtable heap size was basically constant at ~500 MB 
 (https://www.dropbox.com/s/fjdywik5lojstvn/Screenshot%202015-06-29%2019.30.00.png?dl=0).
 After restarting all nodes it behaves stably for 1-2 days. Today I've done 
 that and the long GC pauses are gone (~18:00, 
 https://www.dropbox.com/s/7vo3ynz505rsfq3/Screenshot%202015-06-29%2019.28.37.png?dl=0).
 The only pattern we've found so far is that the long GC pauses happen at 
 basically the same time on all nodes in the same data center, even on the 
 ones where the memtable heap size is not growing.
 Cliffs on the graphs are node restarts.
 Used memory on boxes where {{AllMemtablesHeapSize}} grows stays at the same 
 level - 
 https://www.dropbox.com/s/tes9abykixs86rf/Screenshot%202015-06-29%2019.37.52.png?dl=0.
 Replication factor is set to 3.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9681) Memtable heap size grows and many long GC pauses are triggered

2015-06-29 Thread mlowicki (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mlowicki updated CASSANDRA-9681:

Description: 
C* 2.1.7 cluster is behaving really badly after 1-2 days. 
{{gauges.cassandra.jmx.org.apache.cassandra.metrics.ColumnFamily.AllMemtablesHeapSize.Value}}
jumps to 7 GB 
(https://www.dropbox.com/s/vraggy292erkzd2/Screenshot%202015-06-29%2019.12.53.png?dl=0)
on 3 of 6 nodes in each data center and then there are many long GC pauses. 
The cluster is using the default heap size values ({{-Xms8192M -Xmx8192M -Xmn2048M}}).

Before C* 2.1.5 the memtable heap size was basically constant at ~500 MB 
(https://www.dropbox.com/s/fjdywik5lojstvn/Screenshot%202015-06-29%2019.30.00.png?dl=0).

After restarting all nodes it behaves stably for 1-2 days. Today I've done that 
and the long GC pauses are gone (~18:00, 
https://www.dropbox.com/s/7vo3ynz505rsfq3/Screenshot%202015-06-29%2019.28.37.png?dl=0).
The only pattern we've found so far is that the long GC pauses happen at 
basically the same time on all nodes in the same data center, even on the 
ones where the memtable heap size is not growing.

Cliffs on the graphs are node restarts.

Used memory on boxes where {{AllMemtablesHeapSize}} grows stays at the same 
level - 
https://www.dropbox.com/s/tes9abykixs86rf/Screenshot%202015-06-29%2019.37.52.png?dl=0.

Replication factor is set to 3.

  was:
C* 2.1.7 cluster is behaving really badly after 1-2 days. 
{{gauges.cassandra.jmx.org.apache.cassandra.metrics.ColumnFamily.AllMemtablesHeapSize.Value}}
jumps to 7 GB 
(https://www.dropbox.com/s/vraggy292erkzd2/Screenshot%202015-06-29%2019.12.53.png?dl=0)
on 3 of 6 nodes in each data center and then there are many long GC pauses. 
The cluster is using the default heap size values ({{-Xms8192M -Xmx8192M -Xmn2048M}}).

Before C* 2.1.5 the memtable heap size was basically constant at ~500 MB 
(https://www.dropbox.com/s/fjdywik5lojstvn/Screenshot%202015-06-29%2019.30.00.png?dl=0).

After restarting all nodes it behaves stably for 1-2 days. Today I've done that 
and the long GC pauses are gone (~18:00, 
https://www.dropbox.com/s/7vo3ynz505rsfq3/Screenshot%202015-06-29%2019.28.37.png?dl=0).
The only pattern we've found so far is that the long GC pauses happen at 
basically the same time on all nodes in the same data center, even on the 
ones where the memtable heap size is not growing.

Cliffs on the graphs are node restarts.

Used memory on boxes where {{AllMemtablesHeapSize}} grows stays at the same 
level - 
https://www.dropbox.com/s/tes9abykixs86rf/Screenshot%202015-06-29%2019.37.52.png?dl=0.


 Memtable heap size grows and many long GC pauses are triggered
 --

 Key: CASSANDRA-9681
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9681
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.7, Debian Wheezy
Reporter: mlowicki
Assignee: Benedict
Priority: Critical
 Fix For: 2.1.x

 Attachments: cassandra.yaml


 C* 2.1.7 cluster is behaving really badly after 1-2 days. 
 {{gauges.cassandra.jmx.org.apache.cassandra.metrics.ColumnFamily.AllMemtablesHeapSize.Value}}
 jumps to 7 GB 
 (https://www.dropbox.com/s/vraggy292erkzd2/Screenshot%202015-06-29%2019.12.53.png?dl=0)
 on 3 of 6 nodes in each data center and then there are many long GC pauses. 
 The cluster is using the default heap size values ({{-Xms8192M -Xmx8192M -Xmn2048M}}).
 Before C* 2.1.5 the memtable heap size was basically constant at ~500 MB 
 (https://www.dropbox.com/s/fjdywik5lojstvn/Screenshot%202015-06-29%2019.30.00.png?dl=0).
 After restarting all nodes it behaves stably for 1-2 days. Today I've done 
 that and the long GC pauses are gone (~18:00, 
 https://www.dropbox.com/s/7vo3ynz505rsfq3/Screenshot%202015-06-29%2019.28.37.png?dl=0).
 The only pattern we've found so far is that the long GC pauses happen at 
 basically the same time on all nodes in the same data center, even on the 
 ones where the memtable heap size is not growing.
 Cliffs on the graphs are node restarts.
 Used memory on boxes where {{AllMemtablesHeapSize}} grows stays at the same 
 level - 
 https://www.dropbox.com/s/tes9abykixs86rf/Screenshot%202015-06-29%2019.37.52.png?dl=0.
 Replication factor is set to 3.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9681) Memtable heap size grows and many long GC pauses are triggered

2015-06-29 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14606057#comment-14606057
 ] 

mlowicki commented on CASSANDRA-9681:
-

It started around 04.06 (at about the same time on all affected boxes - 
https://www.dropbox.com/s/9c6p2xdmncktbnu/Screenshot%202015-06-29%2020.16.02.png?dl=0,
 
https://www.dropbox.com/s/gs8bztzr394icz0/Screenshot%202015-06-29%2020.16.24.png?dl=0).
 Will attach logs soon.

 Memtable heap size grows and many long GC pauses are triggered
 --

 Key: CASSANDRA-9681
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9681
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.7, Debian Wheezy
Reporter: mlowicki
Assignee: Benedict
Priority: Critical
 Fix For: 2.1.x

 Attachments: cassandra.yaml


 C* 2.1.7 cluster is behaving really badly after 1-2 days. 
 {{gauges.cassandra.jmx.org.apache.cassandra.metrics.ColumnFamily.AllMemtablesHeapSize.Value}}
 jumps to 7 GB 
 (https://www.dropbox.com/s/vraggy292erkzd2/Screenshot%202015-06-29%2019.12.53.png?dl=0)
 on 3 of 6 nodes in each data center and then there are many long GC pauses. 
 The cluster is using the default heap size values ({{-Xms8192M -Xmx8192M -Xmn2048M}}).
 Before C* 2.1.5 the memtable heap size was basically constant at ~500 MB 
 (https://www.dropbox.com/s/fjdywik5lojstvn/Screenshot%202015-06-29%2019.30.00.png?dl=0).
 After restarting all nodes it behaves stably for 1-2 days. Today I've done 
 that and the long GC pauses are gone (~18:00, 
 https://www.dropbox.com/s/7vo3ynz505rsfq3/Screenshot%202015-06-29%2019.28.37.png?dl=0).
 The only pattern we've found so far is that the long GC pauses happen at 
 basically the same time on all nodes in the same data center, even on the 
 ones where the memtable heap size is not growing.
 Cliffs on the graphs are node restarts.
 Used memory on boxes where {{AllMemtablesHeapSize}} grows stays at the same 
 level - 
 https://www.dropbox.com/s/tes9abykixs86rf/Screenshot%202015-06-29%2019.37.52.png?dl=0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9681) Memtable heap size grows and many long GC pauses are triggered

2015-06-29 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14606084#comment-14606084
 ] 

mlowicki commented on CASSANDRA-9681:
-

I'll get a heap dump probably tomorrow then, as the nodes were restarted ~2 
hours ago.

 Memtable heap size grows and many long GC pauses are triggered
 --

 Key: CASSANDRA-9681
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9681
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.7, Debian Wheezy
Reporter: mlowicki
Assignee: Benedict
Priority: Critical
 Fix For: 2.1.x

 Attachments: cassandra.yaml, system.log.6.zip, system.log.7.zip, 
 system.log.8.zip, system.log.9.zip


 C* 2.1.7 cluster is behaving really badly after 1-2 days. 
 {{gauges.cassandra.jmx.org.apache.cassandra.metrics.ColumnFamily.AllMemtablesHeapSize.Value}}
 jumps to 7 GB 
 (https://www.dropbox.com/s/vraggy292erkzd2/Screenshot%202015-06-29%2019.12.53.png?dl=0)
 on 3 of 6 nodes in each data center and then there are many long GC pauses. 
 The cluster is using the default heap size values ({{-Xms8192M -Xmx8192M -Xmn2048M}}).
 Before C* 2.1.5 the memtable heap size was basically constant at ~500 MB 
 (https://www.dropbox.com/s/fjdywik5lojstvn/Screenshot%202015-06-29%2019.30.00.png?dl=0).
 After restarting all nodes it behaves stably for 1-2 days. Today I've done 
 that and the long GC pauses are gone (~18:00, 
 https://www.dropbox.com/s/7vo3ynz505rsfq3/Screenshot%202015-06-29%2019.28.37.png?dl=0).
 The only pattern we've found so far is that the long GC pauses happen at 
 basically the same time on all nodes in the same data center, even on the 
 ones where the memtable heap size is not growing.
 Cliffs on the graphs are node restarts.
 Used memory on boxes where {{AllMemtablesHeapSize}} grows stays at the same 
 level - 
 https://www.dropbox.com/s/tes9abykixs86rf/Screenshot%202015-06-29%2019.37.52.png?dl=0.
 Replication factor is set to 3.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9681) Memtable heap size grows and many long GC pauses are triggered

2015-06-29 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14606399#comment-14606399
 ] 

mlowicki commented on CASSANDRA-9681:
-

https://www.dropbox.com/s/nhgudkyxwjdrq0f/cassandra.bin?dl=0 - this dump was 
created when the memtable heap size was ~800 MB (on unaffected boxes it's 
~500 MB).
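
A minimal sketch of triggering such a heap dump on a node remotely via the HotSpot diagnostic MBean over JMX, in case more dumps are needed at the moment the gauge spikes. This assumes a HotSpot JVM with JMX reachable on the default 7199 port; the output path is just an example, and the file is written on the node's own filesystem:

{code}
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class TriggerHeapDump
{
    public static void main(String[] args) throws Exception
    {
        // Default Cassandra JMX port; point this at the node whose memtable heap is growing.
        JMXServiceURL url = new JMXServiceURL("service:jmx:rmi:///jndi/rmi://localhost:7199/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try
        {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            // HotSpot-only diagnostic MBean; the boolean "true" restricts the dump to live objects.
            ObjectName diag = new ObjectName("com.sun.management:type=HotSpotDiagnostic");
            mbs.invoke(diag, "dumpHeap",
                       new Object[] { "/tmp/cassandra-heap.hprof", true },
                       new String[] { "java.lang.String", "boolean" });
            System.out.println("Heap dump written on the remote node to /tmp/cassandra-heap.hprof");
        }
        finally
        {
            connector.close();
        }
    }
}
{code}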

 Memtable heap size grows and many long GC pauses are triggered
 --

 Key: CASSANDRA-9681
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9681
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.7, Debian Wheezy
Reporter: mlowicki
Assignee: Benedict
Priority: Critical
 Fix For: 2.1.x

 Attachments: cassandra.yaml, system.log.6.zip, system.log.7.zip, 
 system.log.8.zip, system.log.9.zip


 C* 2.1.7 cluster is behaving really badly after 1-2 days. 
 {{gauges.cassandra.jmx.org.apache.cassandra.metrics.ColumnFamily.AllMemtablesHeapSize.Value}}
 jumps to 7 GB 
 (https://www.dropbox.com/s/vraggy292erkzd2/Screenshot%202015-06-29%2019.12.53.png?dl=0)
 on 3 of 6 nodes in each data center and then there are many long GC pauses. 
 The cluster is using the default heap size values ({{-Xms8192M -Xmx8192M -Xmn2048M}}).
 Before C* 2.1.5 the memtable heap size was basically constant at ~500 MB 
 (https://www.dropbox.com/s/fjdywik5lojstvn/Screenshot%202015-06-29%2019.30.00.png?dl=0).
 After restarting all nodes it behaves stably for 1-2 days. Today I've done 
 that and the long GC pauses are gone (~18:00, 
 https://www.dropbox.com/s/7vo3ynz505rsfq3/Screenshot%202015-06-29%2019.28.37.png?dl=0).
 The only pattern we've found so far is that the long GC pauses happen at 
 basically the same time on all nodes in the same data center, even on the 
 ones where the memtable heap size is not growing.
 Cliffs on the graphs are node restarts.
 Used memory on boxes where {{AllMemtablesHeapSize}} grows stays at the same 
 level - 
 https://www.dropbox.com/s/tes9abykixs86rf/Screenshot%202015-06-29%2019.37.52.png?dl=0.
 Replication factor is set to 3.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9612) Assertion error while running `nodetool cfstats`

2015-06-18 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14592309#comment-14592309
 ] 

mlowicki commented on CASSANDRA-9612:
-

[~mambocab] yes.

 Assertion error while running `nodetool cfstats`
 

 Key: CASSANDRA-9612
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9612
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.6
Reporter: mlowicki

  nodetool cfstats sync.entity
 {code}
 Keyspace: sync
   Read Count: 2916573
   Read Latency: 0.26340278573517617 ms.
   Write Count: 2356495
   Write Latency: 0.03296340242606074 ms.
   Pending Flushes: 0
   Table: entity
   SSTable count: 919
   SSTables in each level: [50/4, 11/10, 101/100, 756, 0, 0, 0, 0, 
 0]
   Space used (live): 146265014558
   Space used (total): 146265014558
   Space used by snapshots (total): 0
   Off heap memory used (total): 97950899
   SSTable Compression Ratio: 0.1870809135227128
 error: 
 /var/lib/cassandra/data2/sync/entity-f73d1360770e11e49f1d673dc3e50a5f/sync-entity-tmplink-ka-516810-Data.db
 -- StackTrace --
 java.lang.AssertionError: 
 /var/lib/cassandra/data2/sync/entity-f73d1360770e11e49f1d673dc3e50a5f/sync-entity-tmplink-ka-516810-Data.db
   at 
 org.apache.cassandra.io.sstable.SSTableReader.getApproximateKeyCount(SSTableReader.java:270)
   at 
 org.apache.cassandra.metrics.ColumnFamilyMetrics$9.value(ColumnFamilyMetrics.java:296)
   at 
 org.apache.cassandra.metrics.ColumnFamilyMetrics$9.value(ColumnFamilyMetrics.java:290)
   at 
 com.yammer.metrics.reporting.JmxReporter$Gauge.getValue(JmxReporter.java:63)
   at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75)
   at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279)
   at 
 com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
   at 
 com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
   at 
 com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
   at 
 com.sun.jmx.mbeanserver.PerInterface.getAttribute(PerInterface.java:83)
   at 
 com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(MBeanSupport.java:206)
   at 
 com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:647)
   at 
 com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:678)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1464)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97)
   at 
 javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.getAttribute(RMIConnectionImpl.java:657)
   at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322)
   at sun.rmi.transport.Transport$2.run(Transport.java:202)
   at sun.rmi.transport.Transport$2.run(Transport.java:199)
   at java.security.AccessController.doPrivileged(Native Method)
   at sun.rmi.transport.Transport.serviceCall(Transport.java:198)
   at 
 sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:567)
   at 
 sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:828)
   at 
 sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.access$400(TCPTransport.java:619)
   at 
 sun.rmi.transport.tcp.TCPTransport$ConnectionHandler$1.run(TCPTransport.java:684)
   at 
 sun.rmi.transport.tcp.TCPTransport$ConnectionHandler$1.run(TCPTransport.java:681)
   at java.security.AccessController.doPrivileged(Native Method)
   at 
 sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:681)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 
