[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-03-23 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15208474#comment-15208474
 ] 

Marcus Eriksson commented on CASSANDRA-11172:
-

[~marco.zattoo] I think your problem will be fixed with CASSANDRA-11373

> Infinite loop bug adding high-level SSTableReader in compaction
> ---
>
> Key: CASSANDRA-11172
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11172
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: DSE 4.x / Cassandra 2.1.11.969
>Reporter: Jeff Ferland
>Assignee: Marcus Eriksson
> Fix For: 2.1.14, 2.2.6, 3.0.4, 3.4
>
> Attachments: beep.txt, tpstats.txt, tpstats_compaction.txt, 
> trapped_in_compaction.txt, trapped_in_compaction_mixed.txt
>
>
> Observed that after a large repair on LCS that sometimes the system will 
> enter an infinite loop with vast amounts of logs lines recording, "Adding 
> high-level (L${LEVEL}) SSTableReader(path='${TABLE}') to candidates"
> This results in an outage of the node and eventual crashing. The log spam 
> quickly rotates out possibly useful earlier debugging.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-03-01 Thread Marco Cadetg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15173441#comment-15173441
 ] 

Marco Cadetg commented on CASSANDRA-11172:
--

I had to bring the cluster back into a healthy state, so I deleted the whole 
keyspace that was affected. Note that it seemed to me that the problem spread: 
at first it occurred on only one node, and then it appeared on other nodes. 
Also, rate-limiting how often this log message gets logged would be nice, so 
that issues like this can still be debugged. 

Since the deletion of the whole keyspace I haven't seen any issues.
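
For illustration, the rate limiting suggested above could look roughly like the sketch below. It is not taken from Cassandra or from any patch on this ticket; the class name {{LogThrottle}} and its methods are hypothetical, and the caller is assumed to pass in a monotonic timestamp such as {{System.nanoTime()}}.

```java
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical helper (not part of Cassandra): lets at most one log
 * message through per interval and counts the ones it suppresses.
 */
final class LogThrottle {
    private final long intervalNanos;
    private long lastEmitNanos;
    private boolean emittedOnce = false;
    private long suppressed = 0;

    LogThrottle(long interval, TimeUnit unit) {
        this.intervalNanos = unit.toNanos(interval);
    }

    /** Returns true if the caller should log now; otherwise counts a suppressed message. */
    synchronized boolean shouldLog(long nowNanos) {
        if (!emittedOnce || nowNanos - lastEmitNanos >= intervalNanos) {
            emittedOnce = true;
            lastEmitNanos = nowNanos;
            return true;
        }
        suppressed++;
        return false;
    }

    /** Returns how many messages were suppressed since the last call and resets the counter. */
    synchronized long drainSuppressed() {
        long count = suppressed;
        suppressed = 0;
        return count;
    }
}
```

A caller would guard the noisy log statement with {{if (throttle.shouldLog(System.nanoTime()))}} and could include {{drainSuppressed()}} in the message, so that the number of skipped lines is still visible while earlier log entries are not rotated out.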



[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-29 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15171829#comment-15171829
 ] 

Marcus Eriksson commented on CASSANDRA-11172:
-

If that is the cause, then it is another bug (I will have a look)

Unless you have already fixed this, could you try an offline scrub before you 
start?



[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-26 Thread Marco Cadetg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15169058#comment-15169058
 ] 

Marco Cadetg commented on CASSANDRA-11172:
--

Here is the exception that occurs just before the log spamming starts:
{code}
ERROR [CompactionExecutor:5] 2016-02-26 14:05:26,622 CassandraDaemon.java:195 - Exception in thread Thread[CompactionExecutor:5,1,main]
java.lang.AssertionError: null
    at org.apache.cassandra.db.rows.BufferCell.<init>(BufferCell.java:49) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.db.rows.BufferCell.tombstone(BufferCell.java:88) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.db.rows.BufferCell.tombstone(BufferCell.java:83) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.db.rows.BufferCell.purge(BufferCell.java:175) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.db.rows.ComplexColumnData.lambda$purge$107(ComplexColumnData.java:165) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.db.rows.ComplexColumnData$$Lambda$126/379481423.apply(Unknown Source) ~[na:na]
    at org.apache.cassandra.utils.btree.BTree$FiltrationTracker.apply(BTree.java:650) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.utils.btree.BTree.transformAndFilter(BTree.java:693) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.utils.btree.BTree.transformAndFilter(BTree.java:668) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.db.rows.ComplexColumnData.transformAndFilter(ComplexColumnData.java:170) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.db.rows.ComplexColumnData.purge(ComplexColumnData.java:165) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.db.rows.ComplexColumnData.purge(ComplexColumnData.java:43) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.db.rows.BTreeRow.lambda$purge$102(BTreeRow.java:333) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.db.rows.BTreeRow$$Lambda$125/1572342504.apply(Unknown Source) ~[na:na]
    at org.apache.cassandra.utils.btree.BTree$FiltrationTracker.apply(BTree.java:650) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.utils.btree.BTree.transformAndFilter(BTree.java:693) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.utils.btree.BTree.transformAndFilter(BTree.java:668) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.db.rows.BTreeRow.transformAndFilter(BTreeRow.java:338) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.db.rows.BTreeRow.purge(BTreeRow.java:333) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.db.partitions.PurgeFunction.applyToRow(PurgeFunction.java:88) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:116) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.db.ColumnIndex$Builder.build(ColumnIndex.java:120) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.db.ColumnIndex.writeAndBuildIndex(ColumnIndex.java:57) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.io.sstable.format.big.BigTableWriter.append(BigTableWriter.java:153) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTableRewriter.java:118) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.db.compaction.writers.MaxSSTableSizeWriter.realAppend(MaxSSTableSizeWriter.java:74) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.db.compaction.writers.CompactionAwareWriter.append(CompactionAwareWriter.java:132) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:182) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:78) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:264) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_45]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_45]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_45]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_45]
{code}

[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-26 Thread Marco Cadetg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15169049#comment-15169049
 ] 

Marco Cadetg commented on CASSANDRA-11172:
--

I've hit the bug again without doing any repair. Again the logs filled up with 
the message so quickly that I was unable to capture anything logged before the 
LCS message.
{code}
root@cassandra1:/var/log/cassandra# nodetool compactionstats
pending tasks: 41
- ham.raw_sessions: 41
{code}



[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-26 Thread Marco Cadetg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168960#comment-15168960
 ] 

Marco Cadetg commented on CASSANDRA-11172:
--

[~krummas] I've added nodetool tpstats output. Would you like to see some other 
output from nodetool?



[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-26 Thread Marco Cadetg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168954#comment-15168954
 ] 

Marco Cadetg commented on CASSANDRA-11172:
--

No, I didn't start any repair after restarting the nodes. In fact, I just 
restarted one node and will try to catch the exception thrown in the thread 
doing the compaction. I hope that helps to figure out the root cause.



[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-26 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168939#comment-15168939
 ] 

Marcus Eriksson commented on CASSANDRA-11172:
-

no, it should fix itself on restart, you sure you are not running repairs at 
all?



[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-26 Thread Marco Cadetg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168938#comment-15168938
 ] 

Marco Cadetg commented on CASSANDRA-11172:
--

Unfortunately I can't provide any exception trace from before the issue 
started, because all 20 zipped logs are already full of the log line above. I 
also started a {{nodetool scrub ham raw_sessions}}, but I'm wondering whether 
it will do any good; there is a lot of data, so it might take a while until it 
reaches the sstables causing the issue.



[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-26 Thread Marco Cadetg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168934#comment-15168934
 ] 

Marco Cadetg commented on CASSANDRA-11172:
--

[~krummas] I've restarted the nodes many times and the issue just comes up 
again. Does this mean that it is not related to the fix you've done? The issue 
happens during normal compactions on a just-restarted node after some time (I 
think when it tries to do its first compaction), with the result that 
system.log gets spammed with an endless stream of this message:
{code}
INFO  [CompactionExecutor:5] 2016-02-26 05:45:56,479 LeveledManifest.java:438 - Adding high-level (L3) BigTableReader(path='/var/lib/cassandra/data2/ham/raw_sessions-417de7c0bb4711e4972d05e7bd5b0c2f/ma-45159-big-Data.db') to candidates
{code}



[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-26 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168895#comment-15168895
 ] 

Marcus Eriksson commented on CASSANDRA-11172:
-

[~marco.zattoo] just restart the affected nodes and it is fixed



[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-26 Thread Marco Cadetg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168706#comment-15168706
 ] 

Marco Cadetg commented on CASSANDRA-11172:
--

[~krummas] OK, I guess that is to prevent it from occurring, but we've already 
done this. Is there a way to fix the issue afterwards, e.g. would scrubbing the 
tables help?



[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-25 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168526#comment-15168526
 ] 

Marcus Eriksson commented on CASSANDRA-11172:
-

yes, don't do {{-full}} repairs



[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-25 Thread Marco Cadetg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168435#comment-15168435
 ] 

Marco Cadetg commented on CASSANDRA-11172:
--

We are seeing this issue on multiple nodes and it brings down our Cassandra 
cluster. Is there some way to mitigate the issue until 3.4 is out?



[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-23 Thread Branimir Lambov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158646#comment-15158646
 ] 

Branimir Lambov commented on CASSANDRA-11172:
-

Thank you, +1 on the patch.

> Infinite loop bug adding high-level SSTableReader in compaction
> ---
>
> Key: CASSANDRA-11172
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11172
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: DSE 4.x / Cassandra 2.1.11.969
>Reporter: Jeff Ferland
>Assignee: Marcus Eriksson
> Fix For: 2.2.x, 3.0.x, 3.x
>
> Attachments: beep.txt, tpstats_compaction.txt, 
> trapped_in_compaction.txt, trapped_in_compaction_mixed.txt
>
>
> Observed that after a large repair on LCS that sometimes the system will 
> enter an infinite loop with vast amounts of logs lines recording, "Adding 
> high-level (L${LEVEL}) SSTableReader(path='${TABLE}') to candidates"
> This results in an outage of the node and eventual crashing. The log spam 
> quickly rotates out possibly useful earlier debugging.





[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-23 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158636#comment-15158636
 ] 

Marcus Eriksson commented on CASSANDRA-11172:
-

pushed an updated dtest which finishes in 300s with the patch and never without 
the patch
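
Because the failure mode here is a hang, a test of this kind only fails if it runs under a deadline. The sketch below shows that general idea in plain Java; it is not the dtest mentioned above, and the names {{BoundedRun}} and {{finishesWithin}} are hypothetical.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper: turns "never finishes" into an observable result. */
final class BoundedRun {
    /**
     * Runs the task on a daemon thread and returns true if it completed
     * within the timeout, or false if it was still running (treated as a hang).
     */
    static boolean finishesWithin(Runnable task, long timeout, TimeUnit unit) {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "bounded-run");
            t.setDaemon(true); // a stuck task must not keep the JVM alive
            return t;
        });
        Future<?> future = executor.submit(task);
        try {
            future.get(timeout, unit);
            return true;
        } catch (TimeoutException e) {
            future.cancel(true); // best effort: interrupt the stuck task
            return false;
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed instead of hanging", e.getCause());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for task", e);
        } finally {
            executor.shutdownNow();
        }
    }
}
```

With a helper like this, a 300s run would be asserted as {{finishesWithin(task, 300, TimeUnit.SECONDS)}} plus some safety margin, so an unpatched build fails at the deadline instead of blocking the test run indefinitely.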



[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-22 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158492#comment-15158492
 ] 

Marcus Eriksson commented on CASSANDRA-11172:
-

I will try



[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-22 Thread Branimir Lambov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158490#comment-15158490
 ] 

Branimir Lambov commented on CASSANDRA-11172:
-

Wouldn't it also hang with a 10x smaller stress size and {{sstable_size_in_mb}}?



[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-22 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158454#comment-15158454
 ] 

Marcus Eriksson commented on CASSANDRA-11172:
-

[~kohlisankalp] we have no way of triggering this in 2.1, it only happens if we 
repair (and anticompact) an already repaired sstable, and we can't do that in 
2.1 - the only way to trigger anticompaction in 2.1 is with "-inc" and then we 
only include non-repaired sstables. But I guess I can commit it to 2.1 for 
correctness etc
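
The trigger condition described above can be modelled with a small sketch. Everything in it is hypothetical and is not Cassandra's code: {{SSTable}} is a stand-in type, and treating {{repairedAt == 0}} as "never repaired" is an assumption made for this illustration. It only shows why a selection limited to unrepaired sstables, as described for {{-inc}} in 2.1, can never anticompact an already repaired sstable.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of which sstables a repair would anticompact. */
final class RepairSelection {
    /** Stand-in for an sstable; only the fields this sketch needs. */
    static final class SSTable {
        final String name;
        final long repairedAt; // assumption: 0 means "never repaired"

        SSTable(String name, long repairedAt) {
            this.name = name;
            this.repairedAt = repairedAt;
        }

        boolean isRepaired() {
            return repairedAt != 0;
        }
    }

    /** Models the 2.1 "-inc" behaviour described above: unrepaired sstables only. */
    static List<SSTable> incrementalCandidates(List<SSTable> all) {
        List<SSTable> unrepaired = new ArrayList<>();
        for (SSTable sstable : all) {
            if (!sstable.isRepaired()) {
                unrepaired.add(sstable);
            }
        }
        return unrepaired;
    }

    /** True if the selection contains an already repaired sstable (the trigger condition). */
    static boolean touchesRepaired(List<SSTable> selection) {
        for (SSTable sstable : selection) {
            if (sstable.isRepaired()) {
                return true;
            }
        }
        return false;
    }
}
```

In this model, {{touchesRepaired(incrementalCandidates(all))}} is false for any input, which matches the statement that the bug cannot be triggered on 2.1.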



[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-22 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158081#comment-15158081
 ] 

Jeff Jirsa commented on CASSANDRA-11172:


+1 vote for inclusion in 2.1



[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-22 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157952#comment-15157952
 ] 

sankalp kohli commented on CASSANDRA-11172:
---

[~krummas] Fix version should also include 2.1? 



[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-22 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157735#comment-15157735
 ] 

Marcus Eriksson commented on CASSANDRA-11172:
-

I used [this|https://github.com/krummas/cassandra-dtest/commits/marcuse/11172] 
to reproduce locally, but I was a bit skeptical about committing it, as it takes 
quite a long time to run and the failure case is that it hangs. But I guess it 
is a good test to run anyway.



[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-22 Thread Branimir Lambov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157275#comment-15157275
 ] 

Branimir Lambov commented on CASSANDRA-11172:
-

The code looks good, but it needs a regression test.



[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-22 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15156721#comment-15156721
 ] 

Marcus Eriksson commented on CASSANDRA-11172:
-

And the problem is not the "Adding high-level ..." message itself; it is the 
fact that we keep sstables around in the wrong compaction strategy instance.



[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-22 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15156720#comment-15156720
 ] 

Marcus Eriksson commented on CASSANDRA-11172:
-

I think I found the issue: this happens if you run nodetool repair -full. When 
replacing repaired sstables, there was an assumption that the sstables were 
unrepaired before, but that is not the case when you do -full repairs.

||branch||testall||dtest||
|[marcuse/11172|https://github.com/krummas/cassandra/tree/marcuse/11172]|[testall|http://cassci.datastax.com/view/Dev/view/krummas/job/krummas-marcuse-11172-testall]|[dtest|http://cassci.datastax.com/view/Dev/view/krummas/job/krummas-marcuse-11172-dtest]|
|[marcuse/11172-3.0|https://github.com/krummas/cassandra/tree/marcuse/11172-3.0]|[testall|http://cassci.datastax.com/view/Dev/view/krummas/job/krummas-marcuse-11172-3.0-testall]|[dtest|http://cassci.datastax.com/view/Dev/view/krummas/job/krummas-marcuse-11172-3.0-dtest]|
|[marcuse/11172-trunk|https://github.com/krummas/cassandra/tree/marcuse/11172-trunk]|[testall|http://cassci.datastax.com/view/Dev/view/krummas/job/krummas-marcuse-11172-trunk-testall]|[dtest|http://cassci.datastax.com/view/Dev/view/krummas/job/krummas-marcuse-11172-trunk-dtest]|
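
The assumption described above can be illustrated with a toy model. This is a 
minimal sketch, not Cassandra's code: the class StrategyTrackingSketch, its two 
sets, and both method names are hypothetical, and sstables are represented as 
plain strings. It only shows how removing an sstable from the unrepaired set, 
when it was actually tracked as repaired, leaves a stale entry behind.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model only: names and structure are hypothetical, not Cassandra's classes. */
class StrategyTrackingSketch {
    final Set<String> unrepaired = new HashSet<>();
    final Set<String> repaired = new HashSet<>();

    // Variant with the assumption: the replaced sstable is expected to be unrepaired.
    void replaceAssumingUnrepaired(String oldTable, String newTable) {
        unrepaired.remove(oldTable); // no-op if oldTable was tracked as repaired
        repaired.add(newTable);
    }

    // Defensive variant: drop the old sstable from whichever set holds it.
    void replaceChecked(String oldTable, String newTable) {
        unrepaired.remove(oldTable);
        repaired.remove(oldTable);
        repaired.add(newTable);
    }

    public static void main(String[] args) {
        // Model a -full repair: the sstable being replaced is already repaired.
        StrategyTrackingSketch assuming = new StrategyTrackingSketch();
        assuming.repaired.add("ma-1");
        assuming.replaceAssumingUnrepaired("ma-1", "ma-2");
        System.out.println("stale entry with assumption: " + assuming.repaired.contains("ma-1"));

        StrategyTrackingSketch checked = new StrategyTrackingSketch();
        checked.repaired.add("ma-1");
        checked.replaceChecked("ma-1", "ma-2");
        System.out.println("stale entry with check:      " + checked.repaired.contains("ma-1"));
    }
}
```

In this model the stale "ma-1" entry is what stands in for an sstable kept 
around in the wrong compaction strategy instance.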




[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-20 Thread Pete Ehlke (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15155763#comment-15155763
 ] 

Pete Ehlke commented on CASSANDRA-11172:


Just got hit with this today on 2.2.5.

> Infinite loop bug adding high-level SSTableReader in compaction
> ---
>
> Key: CASSANDRA-11172
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11172
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: DSE 4.x / Cassandra 2.1.11.969
>Reporter: Jeff Ferland
>Assignee: Marcus Eriksson
> Attachments: beep.txt, tpstats_compaction.txt, 
> trapped_in_compaction.txt, trapped_in_compaction_mixed.txt
>
>
> Observed that after a large repair on LCS that sometimes the system will 
> enter an infinite loop with vast amounts of logs lines recording, "Adding 
> high-level (L${LEVEL}) SSTableReader(path='${TABLE}') to candidates"
> This results in an outage of the node and eventual crashing. The log spam 
> quickly rotates out possibly useful earlier debugging.





[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-17 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15150140#comment-15150140
 ] 

Marcus Eriksson commented on CASSANDRA-11172:
-

The problem is (probably) that we have an sstable in the wrong level in the 
manifest.

I have not been able to reproduce this locally, but it would make sense (i.e., 
the sstable would not get removed from the manifest and we could end up in an 
infinite loop). I think we should see an exception before things explode like 
this, though, so it would be really helpful if anyone has logs from an hour or 
so *before* this happens.
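
The suspected failure mode can be sketched as follows. This is a toy model 
under stated assumptions, not the LeveledManifest implementation: the class 
ManifestLoopSketch, its list of per-level sets, and the maxRounds cap are all 
hypothetical. It shows why a removal keyed on the wrong level never shrinks the 
level that candidates are picked from, so the same candidate is selected again 
and again.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model only: hypothetical names, not Cassandra's LeveledManifest. */
class ManifestLoopSketch {
    final List<Set<String>> levels = new ArrayList<>();

    ManifestLoopSketch(int levelCount) {
        for (int i = 0; i < levelCount; i++) {
            levels.add(new HashSet<>());
        }
    }

    // Removes the sstable from the level the caller believes it is in.
    boolean remove(String sstable, int believedLevel) {
        return levels.get(believedLevel).remove(sstable);
    }

    // Picks candidates from 'level' until it is empty, giving up after maxRounds.
    // Returns the number of selection rounds that were needed.
    int drainLevel(int level, int believedLevel, int maxRounds) {
        int rounds = 0;
        while (!levels.get(level).isEmpty() && rounds < maxRounds) {
            String candidate = levels.get(level).iterator().next();
            remove(candidate, believedLevel); // no-op when believedLevel is wrong
            rounds++;
        }
        return rounds;
    }

    public static void main(String[] args) {
        ManifestLoopSketch consistent = new ManifestLoopSketch(4);
        consistent.levels.get(3).add("ma-4586");
        System.out.println("rounds, correct level: " + consistent.drainLevel(3, 3, 1000));

        ManifestLoopSketch inconsistent = new ManifestLoopSketch(4);
        inconsistent.levels.get(3).add("ma-4586");
        System.out.println("rounds, wrong level:   " + inconsistent.drainLevel(3, 0, 1000));
    }
}
```

Without the cap, the second case would never terminate, which is consistent 
with the endless repetition of the same log line for one sstable.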



[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-16 Thread Will Hayworth (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15150049#comment-15150049
 ] 

Will Hayworth commented on CASSANDRA-11172:
---

debug.log? I can go hunt through my compressed files but this quickly pushes 
the other logs out of rotation. I've watched it happen. Thousands upon 
thousands of lines. And it's happened for a bunch of different nodes. The 
solution has been to kill them and restart but it's temporary.

I just checked--all 20 zipped system.logs are filled with 40,000+ lines like 
the ones in my sample file. I was going to upload some to you but they're just 
identical, including the earliest lines I have:
{noformat}
INFO  [CompactionExecutor:35] 2016-02-17 01:59:03,111 LeveledManifest.java:438 
- Adding high-level (L0) 
BigTableReader(path='/var/lib/cassandra/data/segmentation/domain_events_by_event_domain_time-e81d74f0cd3a11e5aad8e7b84e29e52f/ma-3663-big-Data.db')
 to candidates
INFO  [CompactionExecutor:37] 2016-02-17 01:59:03,112 LeveledManifest.java:438 
- Adding high-level (L3) 
BigTableReader(path='/var/lib/cassandra/data/segmentation/times_by_event_domain_user-e6112a30cd3a11e5ba896547d15a24f6/ma-4586-big-Data.db')
 to candidates
INFO  [CompactionExecutor:35] 2016-02-17 01:59:03,276 LeveledManifest.java:438 
- Adding high-level (L0) 
BigTableReader(path='/var/lib/cassandra/data/segmentation/domain_events_by_event_domain_time-e81d74f0cd3a11e5aad8e7b84e29e52f/ma-3663-big-Data.db')
 to candidates
INFO  [CompactionExecutor:37] 2016-02-17 01:59:03,276 LeveledManifest.java:438 
- Adding high-level (L3) 
BigTableReader(path='/var/lib/cassandra/data/segmentation/times_by_event_domain_user-e6112a30cd3a11e5ba896547d15a24f6/ma-4586-big-Data.db')
 to candidates
{noformat}
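
When the rotated logs contain tens of thousands of these lines, counting them 
per sstable shows quickly which files are stuck. The following is a minimal 
sketch, assuming each log entry is on a single line in the format shown above; 
the class name HighLevelLogCounter and the shortened sample paths are 
hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: counts "Adding high-level" log lines per level and path. */
class HighLevelLogCounter {
    // Captures the level number and the sstable path from one log line.
    private static final Pattern LINE =
        Pattern.compile("Adding high-level \\(L(\\d+)\\) \\w+\\(path='([^']+)'\\)");

    static Map<String, Integer> count(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher m = LINE.matcher(line);
            if (m.find()) {
                String key = "L" + m.group(1) + " " + m.group(2);
                counts.merge(key, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Sample lines with shortened, made-up paths.
        List<String> sample = Arrays.asList(
            "INFO  [CompactionExecutor:35] 2016-02-17 01:59:03,111 LeveledManifest.java:438 - "
                + "Adding high-level (L0) BigTableReader(path='/data/ks/t1/ma-3663-big-Data.db') to candidates",
            "INFO  [CompactionExecutor:37] 2016-02-17 01:59:03,112 LeveledManifest.java:438 - "
                + "Adding high-level (L3) BigTableReader(path='/data/ks/t2/ma-4586-big-Data.db') to candidates",
            "INFO  [CompactionExecutor:35] 2016-02-17 01:59:03,276 LeveledManifest.java:438 - "
                + "Adding high-level (L0) BigTableReader(path='/data/ks/t1/ma-3663-big-Data.db') to candidates");
        count(sample).forEach((key, n) -> System.out.println(n + "  " + key));
    }
}
```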

> Infinite loop bug adding high-level SSTableReader in compaction
> ---
>
> Key: CASSANDRA-11172
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11172
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: DSE 4.x / Cassandra 2.1.11.969
>Reporter: Jeff Ferland
>Assignee: Marcus Eriksson
> Attachments: beep.txt
>
>
> Observed that after a large repair on LCS that sometimes the system will 
> enter an infinite loop with vast amounts of logs lines recording, "Adding 
> high-level (L${LEVEL}) SSTableReader(path='${TABLE}') to candidates"
> This results in an outage of the node and eventual crashing. The log spam 
> quickly rotates out possibly useful earlier debugging.





[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-16 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149983#comment-15149983
 ] 

Marcus Eriksson commented on CASSANDRA-11172:
-

Grepping the logs for ma-3663 would be helpful as well.



[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-16 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149978#comment-15149978
 ] 

Marcus Eriksson commented on CASSANDRA-11172:
-

[~_wsh] do you have the logs leading up to the breakage? The debug.log would be 
helpful

Does it happen every time for this node?



[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-16 Thread Will Hayworth (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149812#comment-15149812
 ] 

Will Hayworth commented on CASSANDRA-11172:
---

I'm seeing this exact problem with full repairs on C* 3.3.

> Infinite loop bug adding high-level SSTableReader in compaction
> ---
>
> Key: CASSANDRA-11172
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11172
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: DSE 4.x / Cassandra 2.1.11.969
>Reporter: Jeff Ferland
>Assignee: Marcus Eriksson
>
> Observed that after a large repair on LCS that sometimes the system will 
> enter an infinite loop with vast amounts of logs lines recording, "Adding 
> high-level (L${LEVEL}) SSTableReader(path='${TABLE}') to candidates"
> This results in an outage of the node and eventual crashing. The log spam 
> quickly rotates out possibly useful earlier debugging.





[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-16 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149188#comment-15149188
 ] 

Marcus Eriksson commented on CASSANDRA-11172:
-

You should definitely upgrade.

I have never seen this happen, but LCS with incremental repair was very broken 
before CASSANDRA-10831.



[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-16 Thread Jeff Ferland (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149009#comment-15149009
 ] 

Jeff Ferland commented on CASSANDRA-11172:
--

Yes. It's after incremental repair that I'm seeing this. Next time around I'll 
check that the file listed doesn't exist before restart, but I think this is a 
duplicate.

Alternatively, though, I'm also seeing the gossip thread lock up at times after 
incremental repair without any mention of higher-level sstables, so that might 
be a new ticket to file next time around.



[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-16 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149005#comment-15149005
 ] 

Marcus Eriksson commented on CASSANDRA-11172:
-

do you run incremental repair?



[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-16 Thread Jeff Ferland (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148994#comment-15148994
 ] 

Jeff Ferland commented on CASSANDRA-11172:
--

Possible duplicate of https://issues.apache.org/jira/browse/CASSANDRA-10831 ?



[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-16 Thread Jeff Ferland (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148988#comment-15148988
 ] 

Jeff Ferland commented on CASSANDRA-11172:
--

It's this: `INFO  [CompactionExecutor:12750] 2016-02-16 17:47:48,642  
LeveledManifest.java:415 - Adding high-level (L0) 
SSTableReader(path='/mnt/cassandra/data/youtube/youtube_videos-2d16275b7ff93269bea0aac894e1abaa/youtube-youtube_videos-ka-104968-Data.db')
 to candidates` repeating endlessly. That one's extra special this time because 
of the L0. I'll look in our aggregation server and try to pull the logs from 
the time when it starts.



[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-15 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15147693#comment-15147693
 ] 

Marcus Eriksson commented on CASSANDRA-11172:
-

could you post logs?

> Infinite loop bug adding high-level SSTableReader in compaction
> ---
>
> Key: CASSANDRA-11172
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11172
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: DSE 4.x / Cassandra 2.1.11.969
>Reporter: Jeff Ferland
>
> Observed that after a large repair on LCS that sometimes the system will 
> enter an infinite loop with vast amounts of logs lines recording, "Adding 
> high-level (L${LEVEL}) SSTableReader(path='${TABLE}') to candidates"
> This results in an outage of the node and eventual crashing. The log spam 
> quickly rotates out possibly useful earlier debugging.


