[jira] [Commented] (CASSANDRA-12655) Incremental repair & compaction hang on random nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-12655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15500435#comment-15500435 ] Navjyot Nishant commented on CASSANDRA-12655: - Hello Wei, thanks for responding. It is actually an issue with compaction getting blocked; anticompaction moves through without any issue. Let me explain in detail:
1. We run incremental repair on one node at a time.
2. When repair starts it shows completion progress; for a large keyspace, after showing 100% it takes a couple of minutes to move forward with the next keyspace. When we verified, it is actually waiting for anticompaction to complete on all the relevant replicas. The moment anticompaction completes on all replicas, it moves on to the next keyspace.
3. Then compaction starts, followed by anticompaction, which sometimes hangs on random replicas. That replica becomes unresponsive, which impacts the repair running on the next keyspace/node, so the repair also becomes unresponsive.
I am able to avoid this blocking behavior if I disable autocompaction before starting the repair, but when I re-enable autocompaction after the repair it gets blocked on a random node, and the only way to resolve it is to bounce the node, which doesn't seem practical. For now I am able to work around this issue by not using -dcpar. So far I had been using -dcpar to speed up the repair, but the moment I removed it the cluster stopped complaining and compaction also goes through. This spares us some time to plan an upgrade early next year directly to 3.x. -dcpar works fine in our other non-prod environments, but it seems to have a problem with one of the largest keyspaces, which has tables of size 3-4GB. If you can relate the above issues & resolution, that would be great. Thanks!
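The workaround described in this comment can be sketched as a small shell script. The `nodetool` subcommands used (`disableautocompaction`, `repair`, `enableautocompaction`) are real, but the keyspace list and the dry-run wrapper are illustrative assumptions, not part of the original report:

```shell
#!/bin/sh
# Sketch of the workaround above: disable autocompaction, run a sequential
# (no -dcpar) incremental repair one keyspace at a time, then re-enable.
# DRY_RUN=1 (the default here) only records and prints the nodetool commands.
DRY_RUN=${DRY_RUN:-1}
PLAN=""
run() {
  PLAN="$PLAN nodetool $*;"              # keep a record of what would run
  if [ "$DRY_RUN" = "1" ]; then
    echo "nodetool $*"
  else
    nodetool "$@"
  fi
}

KEYSPACES="gccatlgsvcks"                 # example keyspace from the logs; adjust

for ks in $KEYSPACES; do
  run disableautocompaction "$ks"
  run repair "$ks"                       # plain repair is incremental by default on 2.2; no -dcpar
  run enableautocompaction "$ks"
done
```

Running it with `DRY_RUN=0` on one node at a time mirrors the sequence the reporter describes; the dry-run default is there so the plan can be reviewed first.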
> Incremental repair & compaction hang on random nodes
>
> Key: CASSANDRA-12655
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12655
> Project: Cassandra
> Issue Type: Bug
> Components: Compaction
> Environment: CentOS Linux release 7.1.1503 (Core)
> RAM - 64GB
> HEAP - 16GB
> Load on each node - ~5GB
> Cassandra Version - 2.2.5
> Reporter: Navjyot Nishant
> Priority: Blocker
>
> Hi, we are setting up incremental repair on our 18-node cluster. Avg load on
> each node is ~5GB. The repair runs fine on a couple of nodes and suddenly gets
> stuck on random nodes. Upon checking the system.log of the impacted node we
> don't see much information.
> The following lines appear in system.log from the point the repair stops making progress:
> {code}
> INFO [CompactionExecutor:3490] 2016-09-16 11:14:44,236 CompactionManager.java:1221 - Anticompacting
> [BigTableReader(path='/cassandra/data/gccatlgsvcks/message_backup-cab0485008ed11e5bfed452cdd54652d/la-30832-big-Data.db'),
> BigTableReader(path='/cassandra/data/gccatlgsvcks/message_backup-cab0485008ed11e5bfed452cdd54652d/la-30811-big-Data.db')]
> INFO [IndexSummaryManager:1] 2016-09-16 11:14:49,954 IndexSummaryRedistribution.java:74 - Redistributing index summaries
> INFO [IndexSummaryManager:1] 2016-09-16 12:14:49,961 IndexSummaryRedistribution.java:74 - Redistributing index summaries
> {code}
> When we try to see pending compactions by executing {code}nodetool compactionstats{code}
> it hangs as well and doesn't return anything. However, {code}nodetool tpstats{code}
> shows active and pending compactions which never come down and keep increasing.
> {code}
> Pool Name                    Active   Pending      Completed   Blocked  All time blocked
> MutationStage                     0         0         221208         0                 0
> ReadStage                         0         0        1288839         0                 0
> RequestResponseStage              0         0         104356         0                 0
> ReadRepairStage                   0         0             72         0                 0
> CounterMutationStage              0         0              0         0                 0
> HintedHandoff                     0         0             46         0                 0
> MiscStage                         0         0              0         0                 0
> CompactionExecutor                8        66          68124         0                 0
> MemtableReclaimMemory             0         0            166         0                 0
> PendingRangeCalculator            0         0             38         0                 0
> GossipStage                       0         0         242455         0                 0
> MigrationStage                    0         0              0         0                 0
> MemtablePostFlush                 0         0           3682         0                 0
> ValidationExecutor                0         0           2246         0                 0
> Sampler                           0         0              0         0                 0
> MemtableFlushWriter               0         0            166         0                 0
> InternalResponseStage             0         0           8866         0                 0
> AntiEntropyStage                  0         0          15417         0                 0
> Repair#7                          0         0            160         0                 0
> CacheCleanupExecutor              0         0              0         0                 0
> Native-Transport-Requests         0         0         327334         0                 0
>
> Message type           Dropped
> READ                         0
> RANGE_SLICE                  0
> _TRACE                       0
> MUTATION                     0
> COUNTER_MUTATION             0
> REQUEST_RESPONSE             0
> PAGED_RANGE                  0
> READ_REPAIR                  0
> {code}
[jira] [Commented] (CASSANDRA-12655) Incremental repair & compaction hang on random nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-12655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15496384#comment-15496384 ] Navjyot Nishant commented on CASSANDRA-12655: - Thanks Marcus. Well, all those issues have a point of identification: either the error is logged in system.log, or the user gets some sort of error, so the issue can be related to a cause. In our case we don't see any ERROR, WARN, timeout, failure, etc. in system.log, so the issue leaves no clue. We wanted to understand what is causing this. Where is the loophole?
[jira] [Updated] (CASSANDRA-12655) Incremental repair & compaction hang on random nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-12655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navjyot Nishant updated CASSANDRA-12655:
Description:
Hi, we are setting up incremental repair on our 18-node cluster. Avg load on each node is ~5GB. The repair runs fine on a couple of nodes and suddenly gets stuck on random nodes. Upon checking the system.log of the impacted node we don't see much information. The following lines appear in system.log from the point the repair stops making progress:
{code}
INFO [CompactionExecutor:3490] 2016-09-16 11:14:44,236 CompactionManager.java:1221 - Anticompacting [BigTableReader(path='/cassandra/data/gccatlgsvcks/message_backup-cab0485008ed11e5bfed452cdd54652d/la-30832-big-Data.db'), BigTableReader(path='/cassandra/data/gccatlgsvcks/message_backup-cab0485008ed11e5bfed452cdd54652d/la-30811-big-Data.db')]
INFO [IndexSummaryManager:1] 2016-09-16 11:14:49,954 IndexSummaryRedistribution.java:74 - Redistributing index summaries
INFO [IndexSummaryManager:1] 2016-09-16 12:14:49,961 IndexSummaryRedistribution.java:74 - Redistributing index summaries
{code}
When we try to see pending compactions by executing {code}nodetool compactionstats{code} it hangs as well and doesn't return anything. However, {code}nodetool tpstats{code} shows active and pending compactions which never come down and keep increasing.
{code}
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
MutationStage                     0         0         221208         0                 0
ReadStage                         0         0        1288839         0                 0
RequestResponseStage              0         0         104356         0                 0
ReadRepairStage                   0         0             72         0                 0
CounterMutationStage              0         0              0         0                 0
HintedHandoff                     0         0             46         0                 0
MiscStage                         0         0              0         0                 0
CompactionExecutor                8        66          68124         0                 0
MemtableReclaimMemory             0         0            166         0                 0
PendingRangeCalculator            0         0             38         0                 0
GossipStage                       0         0         242455         0                 0
MigrationStage                    0         0              0         0                 0
MemtablePostFlush                 0         0           3682         0                 0
ValidationExecutor                0         0           2246         0                 0
Sampler                           0         0              0         0                 0
MemtableFlushWriter               0         0            166         0                 0
InternalResponseStage             0         0           8866         0                 0
AntiEntropyStage                  0         0          15417         0                 0
Repair#7                          0         0            160         0                 0
CacheCleanupExecutor              0         0              0         0                 0
Native-Transport-Requests         0         0         327334         0                 0

Message type           Dropped
READ                         0
RANGE_SLICE                  0
_TRACE                       0
MUTATION                     0
COUNTER_MUTATION             0
REQUEST_RESPONSE             0
PAGED_RANGE                  0
READ_REPAIR                  0
{code}
{code}nodetool netstats{code} shows some pending messages which never get processed, and nothing in progress:
{code}
Mode: NORMAL
Not sending any streams.
Read Repair Statistics:
Attempted: 15585
Mismatch (Blocking): 0
Mismatch (Background): 0
Pool Name                    Active   Pending      Completed
Large messages                  n/a         1           2562
Small messages                  n/a         0         999779
Gossip messages                 n/a         0         264394
{code}
The only solution we have is to bounce the node; then all the pending compactions start getting processed immediately and finish in 5-10 minutes. This is a roadblock issue for us, and any help in this matter would be highly appreciated.
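A quick way to spot the wedged state described above is to check the CompactionExecutor line of `nodetool tpstats` for a growing pending count. A minimal awk sketch; the embedded sample line and the threshold of 50 are illustrative assumptions (in production you would pipe live `nodetool tpstats` output in):

```shell
# Flag a CompactionExecutor backlog in `nodetool tpstats` output.
# A sample tpstats line (illustrative values) is embedded so the
# script is self-contained; replace it with real nodetool output.
THRESHOLD=50
SAMPLE='CompactionExecutor                8        66          68124         0                 0'
STUCK=$(printf '%s\n' "$SAMPLE" |
  awk -v t="$THRESHOLD" '$1 == "CompactionExecutor" && $3 + 0 > t {
    print $1 " pending=" $3
  }')
echo "${STUCK:-ok}"
```

Run periodically (e.g. from cron), this would catch the "pending never comes down" symptom long before `nodetool compactionstats` itself hangs.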
[jira] [Commented] (CASSANDRA-12655) Incremental repair & compaction hang on random nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-12655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15496334#comment-15496334 ] Navjyot Nishant commented on CASSANDRA-12655: - Hi Marcus, is that the only solution? This is an issue we are getting in our production environment, and an upgrade will not be straightforward or quick. We will have to follow several processes and get sign-off from several stakeholders in order to move forward with that, and we will also have to explain the reason for the upgrade in detail. We have a plan anyway to move to version 3 early next year. In the meanwhile, if we could fix the issue while keeping the same version, that would be great. Would you please explain the root cause, or guide me to the actual issue which is driving this? We are clueless; we are not sure what we are supposed to check, and where.
[jira] [Created] (CASSANDRA-12655) Incremental repair & compaction hang on random nodes
Navjyot Nishant created CASSANDRA-12655:
---
Summary: Incremental repair & compaction hang on random nodes
Key: CASSANDRA-12655
URL: https://issues.apache.org/jira/browse/CASSANDRA-12655
Project: Cassandra
Issue Type: Bug
Components: Compaction
Environment: CentOS Linux release 7.1.1503 (Core)
RAM - 64GB
HEAP - 16GB
Load on each node - ~5GB
Cassandra Version - 2.2.5
Reporter: Navjyot Nishant
Priority: Blocker

Hi, we are setting up incremental repair on our 18-node cluster. Avg load on each node is ~5GB. The repair runs fine on a couple of nodes and suddenly gets stuck on random nodes. Upon checking the system.log of the impacted node we don't see much information. The following lines appear in system.log from the point the repair stops making progress:
{code}
INFO [CompactionExecutor:3490] 2016-09-16 11:14:44,236 CompactionManager.java:1221 - Anticompacting [BigTableReader(path='/cassandra/data/gccatlgsvcks/message_backup-cab0485008ed11e5bfed452cdd54652d/la-30832-big-Data.db'), BigTableReader(path='/cassandra/data/gccatlgsvcks/message_backup-cab0485008ed11e5bfed452cdd54652d/la-30811-big-Data.db')]
INFO [IndexSummaryManager:1] 2016-09-16 11:14:49,954 IndexSummaryRedistribution.java:74 - Redistributing index summaries
INFO [IndexSummaryManager:1] 2016-09-16 12:14:49,961 IndexSummaryRedistribution.java:74 - Redistributing index summaries
{code}
When we try to see pending compactions by executing {code}nodetool compactionstats{code} it hangs as well and doesn't return anything. However, {code}nodetool tpstats{code} shows active and pending compactions which never come down and keep increasing.
{code}
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
MutationStage                     0         0         221208         0                 0
ReadStage                         0         0        1288839         0                 0
RequestResponseStage              0         0         104356         0                 0
ReadRepairStage                   0         0             72         0                 0
CounterMutationStage              0         0              0         0                 0
HintedHandoff                     0         0             46         0                 0
MiscStage                         0         0              0         0                 0
CompactionExecutor                8        66          68124         0                 0
MemtableReclaimMemory             0         0            166         0                 0
PendingRangeCalculator            0         0             38         0                 0
GossipStage                       0         0         242455         0                 0
MigrationStage                    0         0              0         0                 0
MemtablePostFlush                 0         0           3682         0                 0
ValidationExecutor                0         0           2246         0                 0
Sampler                           0         0              0         0                 0
MemtableFlushWriter               0         0            166         0                 0
InternalResponseStage             0         0           8866         0                 0
AntiEntropyStage                  0         0          15417         0                 0
Repair#7                          0         0            160         0                 0
CacheCleanupExecutor              0         0              0         0                 0
Native-Transport-Requests         0         0         327334         0                 0

Message type           Dropped
READ                         0
RANGE_SLICE                  0
_TRACE                       0
MUTATION                     0
COUNTER_MUTATION             0
REQUEST_RESPONSE             0
PAGED_RANGE                  0
READ_REPAIR                  0
{code}
The only solution we have is to bounce the node; then all the pending compactions start getting processed immediately and finish in 5-10 minutes. This is a roadblock issue for us, and any help in this matter would be highly appreciated.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
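The "bounce" that clears the stuck compactions is typically a drain followed by a service restart. A hedged sketch of that procedure; the service name `cassandra` and the dry-run guard are assumptions about the environment, not details from the report:

```shell
#!/bin/sh
# Sketch of the node "bounce" used above to clear stuck compactions.
# DRY_RUN=1 (the default) just prints what would run; set DRY_RUN=0 on a real node.
DRY_RUN=${DRY_RUN:-1}
STEPS=""
step() {
  STEPS="$STEPS $*;"                        # record the planned command
  if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

step nodetool drain                         # flush memtables, stop accepting writes
step sudo systemctl restart cassandra       # service name is an assumption
step nodetool tpstats                       # verify the compaction backlog drains afterwards
```

Draining first keeps the restart clean; without it, commitlog replay on startup delays the point at which the backlog starts moving again.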
[jira] [Commented] (CASSANDRA-10057) RepairMessageVerbHandler.java:95 - Cannot start multiple repair sessions over the same sstables
[ https://issues.apache.org/jira/browse/CASSANDRA-10057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15136158#comment-15136158 ] Navjyot Nishant commented on CASSANDRA-10057: - Guys, is this fix also applicable to v2.2.4? We are running into the same issues, in addition to the following error:
{code}
[2016-02-07 01:00:17,477] Repair session 26c3c380-cd36-11e5-abde-1d9681013b36 for range (2881082214758500134,2995213907575973035] failed with error [repair #26c3c380-cd36-11e5-abde-1d9681013b36 on /, (2881082214758500134,2995213907575973035]] Validation failed in / (progress: 74%)
---
---
[2016-02-07 01:00:17,815] Some repair failed.
{code}
We tried bouncing the nodes, which helped with the "Cannot start multiple repair sessions" error, but we still face the "Validation failed" error on a specific keyspace. As per https://support.datastax.com/hc/en-us/articles/205256895--Validation-failed-when-running-a-nodetool-repair we did try to scrub the table on the node it was complaining about, but then it started complaining about some other nodes. Any help in this regard would be much appreciated. Thanks!!
> RepairMessageVerbHandler.java:95 - Cannot start multiple repair sessions over
> the same sstables
> ---
>
> Key: CASSANDRA-10057
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10057
> Project: Cassandra
> Issue Type: Bug
> Components: Streaming and Messaging
> Environment: Amazon Linux: 3.14.48-33.39.amzn1.x86_64
> java version "1.7.0_85"
> OpenJDK Runtime Environment (amzn-2.6.1.3.61.amzn1-x86_64 u85-b01)
> OpenJDK 64-Bit Server VM (build 24.85-b03, mixed mode)
> Cassandra RPM: cassandra22-2.2.0-1.noarch
> Reporter: Victor Trac
> Assignee: Yuki Morishita
> Fix For: 2.2.2, 3.0 beta 2
>
> I bootstrapped a DC2 by restoring the snapshots from DC1 into equivalent
> nodes in DC2.
Everything comes up just fine, but when I tried to run a > {code}repair -dcpar -j4{code} in DC2, I got this error: > {code} > [root@cassandra-i-2677cce3 ~]# nodetool repair -dcpar -j4 > [2015-08-12 15:56:05,682] Nothing to repair for keyspace 'system_auth' > [2015-08-12 15:56:05,949] Starting repair command #4, repairing keyspace > crawl with repair options (parallelism: dc_parallel, primary range: false, > incremental: true, job threads: 4, ColumnFamilies: [], dataCenters: [], > hosts: [], # of ranges: 2275) > [2015-08-12 15:59:33,050] Repair session 1b8d7810-410b-11e5-b71c-71288cf05b1d > for range (-1630840392403060839,-1622173360499444177] finished (progress: 0%) > [2015-08-12 15:59:33,284] Repair session 1b92a830-410b-11e5-b71c-71288cf05b1d > for range (-2766833977081486018,-2766120936176524808] failed with error Could > not create snapshot at /10.20.144.15 (progress: 0%) > [2015-08-12 15:59:35,543] Repair session 1b8fe910-410b-11e5-b71c-71288cf05b1d > for range (5127720400742928658,5138864412691114632] finished (progress: 0%) > [2015-08-12 15:59:36,040] Repair session 1b960390-410b-11e5-b71c-71288cf05b1d > for range (749871306972906628,751065038788146229] failed with error Could not > create snapshot at /10.20.144.15 (progress: 0%) > [2015-08-12 15:59:36,454] Repair session 1b9455e0-410b-11e5-b71c-71288cf05b1d > for range (-8769666365699147423,-8767955202550789015] finished (progress: 0%) > [2015-08-12 15:59:38,765] Repair session 1b97b140-410b-11e5-b71c-71288cf05b1d > for range (-4434580467371714601,-4433394767535421669] finished (progress: 0%) > [2015-08-12 15:59:41,520] Repair session 1b99d420-410b-11e5-b71c-71288cf05b1d > for range (-1085112943862424751,-1083156277882030877] finished (progress: 0%) > [2015-08-12 15:59:43,806] Repair session 1b9da4b0-410b-11e5-b71c-71288cf05b1d > for range (2125359121242932804,2126816999370470831] failed with error Could > not create snapshot at /10.20.144.15 (progress: 0%) > [2015-08-12 15:59:43,874] Repair session 
1b9ba8e0-410b-11e5-b71c-71288cf05b1d > for range (-7469857353178912795,-7459624955099554284] finished (progress: 0%) > [2015-08-12 15:59:48,384] Repair session 1b9fa080-410b-11e5-b71c-71288cf05b1d > for range (-8005238987831093686,-8005057803798566519] finished (progress: 0%) > [2015-08-12 15:59:48,392] Repair session 1ba17540-410b-11e5-b71c-71288cf05b1d > for range (7291056720707652994,7292508243124389877] failed with error Could > not create snapshot at /10.20.144.15 (progress: 0%) > {code} > It seems like now that all 4 threads ran into an error, the repair process > just sits forever. > Looking at 10.20.144.15, I see this: > {code} > ERROR [AntiEntropyStage:2] 2015-08-12 15:59:35,965 > RepairMessageVerbHandler.java:95 - Cannot start multiple repair sessions over > the same sstables > ERROR [AntiEntropyStage:2] 2015-08-12 15:59:35,966 > RepairMessageVerbHandler.java:153 - Got error, removing parent repair
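When repairs wedge like this, grepping system.log on each replica for the two error signatures discussed in this thread is a quick triage step. A self-contained sketch; the embedded two-line sample log is illustrative and stands in for a real `/var/log/cassandra/system.log`:

```shell
# Count the repair error signatures discussed above in a system.log.
# SAMPLE_LOG is an illustrative stand-in for the real log file;
# in practice: grep -c -E '...' /var/log/cassandra/system.log
SAMPLE_LOG='ERROR [AntiEntropyStage:2] RepairMessageVerbHandler.java:95 - Cannot start multiple repair sessions over the same sstables
ERROR [AntiEntropyStage:2] Repair session failed: Validation failed in /10.20.144.15'
HITS=$(printf '%s\n' "$SAMPLE_LOG" |
  grep -c -E 'Cannot start multiple repair sessions|Validation failed')
echo "matching error lines: $HITS"
```

Running the same grep across all replicas (e.g. via ssh in a loop) identifies which node is holding the stale repair session and therefore which node to bounce.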
[jira] [Commented] (CASSANDRA-11063) Unable to compute ceiling for max when histogram overflowed
[ https://issues.apache.org/jira/browse/CASSANDRA-11063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15115648#comment-15115648 ] Navjyot Nishant commented on CASSANDRA-11063: - Thanks Marcus. Yes, we did identify this data modelling issue and are working to fix it, but as you also mentioned, it shouldn't fail like this anyway. Do we have a workaround? Our system.log is full of this error; the errors are unstoppable even when no compaction is running. At this point we just want to stop this spam.
> Unable to compute ceiling for max when histogram overflowed
> ---
>
> Key: CASSANDRA-11063
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11063
> Project: Cassandra
> Issue Type: Bug
> Components: Compaction
> Environment: Cassandra 2.1.9 on RHEL
> Reporter: Navjyot Nishant
> Labels: Compaction, thread
>
> Issue https://issues.apache.org/jira/browse/CASSANDRA-8028 seems related to the
> error we are getting, but we are getting this with Cassandra 2.1.9: when
> autocompaction is running it keeps throwing the following errors. We are unsure
> whether this is a bug or can be resolved; please suggest.
> {code} > WARN [CompactionExecutor:3] 2016-01-23 13:30:40,907 SSTableWriter.java:240 - > Compacting large partition gccatlgsvcks/category_name_dedup:66611300 > (138152195 bytes) > ERROR [CompactionExecutor:1] 2016-01-23 13:30:50,267 CassandraDaemon.java:223 > - Exception in thread Thread[CompactionExecutor:1,1,main] > java.lang.IllegalStateException: Unable to compute ceiling for max when > histogram overflowed > at > org.apache.cassandra.utils.EstimatedHistogram.mean(EstimatedHistogram.java:203) > ~[apache-cassandra-2.1.9.jar:2.1.9] > at > org.apache.cassandra.io.sstable.metadata.StatsMetadata.getEstimatedDroppableTombstoneRatio(StatsMetadata.java:98) > ~[apache-cassandra-2.1.9.jar:2.1.9] > at > org.apache.cassandra.io.sstable.SSTableReader.getEstimatedDroppableTombstoneRatio(SSTableReader.java:1987) > ~[apache-cassandra-2.1.9.jar:2.1.9] > at > org.apache.cassandra.db.compaction.AbstractCompactionStrategy.worthDroppingTombstones(AbstractCompactionStrategy.java:370) > ~[apache-cassandra-2.1.9.jar:2.1.9] > at > org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy.getNextBackgroundSSTables(SizeTieredCompactionStrategy.java:96) > ~[apache-cassandra-2.1.9.jar:2.1.9] > at > org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy.getNextBackgroundTask(SizeTieredCompactionStrategy.java:179) > ~[apache-cassandra-2.1.9.jar:2.1.9] > at > org.apache.cassandra.db.compaction.WrappingCompactionStrategy.getNextBackgroundTask(WrappingCompactionStrategy.java:84) > ~[apache-cassandra-2.1.9.jar:2.1.9] > at > org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:230) > ~[apache-cassandra-2.1.9.jar:2.1.9] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > ~[na:1.7.0_51] > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > ~[na:1.7.0_51] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > ~[na:1.7.0_51] > at > 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > [na:1.7.0_51] > at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51] > {code} > h3. Additional info: > *cfstats is running fine for that table...* > {code} > ~ $ nodetool cfstats gccatlgsvcks.category_name_dedup > Keyspace: gccatlgsvcks > Read Count: 0 > Read Latency: NaN ms. > Write Count: 0 > Write Latency: NaN ms. > Pending Flushes: 0 > Table: category_name_dedup > SSTable count: 6 > Space used (live): 836314727 > Space used (total): 836314727 > Space used by snapshots (total): 3621519 > Off heap memory used (total): 6930368 > SSTable Compression Ratio: 0.03725358753117693 > Number of keys (estimate): 3004 > Memtable cell count: 0 > Memtable data size: 0 > Memtable off heap memory used: 0 > Memtable switch count: 0 > Local read count: 0 > Local read latency: NaN ms > Local write count: 0 > Local write latency: NaN ms > Pending flushes: 0 > Bloom filter false positives: 0 > Bloom filter false ratio: 0.0 > Bloom filter space used: 5240 > Bloom filter off heap memory used: 5192 > Index summary off heap memory used: 1200 >
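The overflow in the stack trace above stems from a partition far larger than the histogram's largest bucket; note the ~30 GB "Compacted partition maximum bytes" in the cfstats output. A hedged sketch that flags such partitions from cfstats output; the 1 GiB threshold and the embedded sample line are illustrative choices, not Cassandra defaults (Cassandra's own `compaction_large_partition_warning_threshold_mb` setting serves a similar purpose):

```shell
# Flag oversized partitions from `nodetool cfstats` output.
# The sample line echoes the cfstats excerpt above; the 1 GiB
# threshold is an illustrative choice, not a Cassandra default.
LIMIT=1073741824   # 1 GiB
SAMPLE='Compacted partition maximum bytes: 30753941057'
WARN=$(printf '%s\n' "$SAMPLE" |
  awk -v limit="$LIMIT" -F': ' '/Compacted partition maximum bytes/ && $2 + 0 > limit {
    print "partition of " $2 " bytes exceeds " limit
  }')
echo "${WARN:-ok}"
```

Run per table, this gives an early warning that a data model is heading toward the histogram-overflow territory described in this issue.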
[jira] [Updated] (CASSANDRA-11063) Unable to compute ceiling for max when histogram overflowed
[ https://issues.apache.org/jira/browse/CASSANDRA-11063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navjyot Nishant updated CASSANDRA-11063: Labels: Compaction thread (was: )
[jira] [Commented] (CASSANDRA-11063) Unable to compute ceiling for max when histogram overflowed
[ https://issues.apache.org/jira/browse/CASSANDRA-11063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15114280#comment-15114280 ]

Navjyot Nishant commented on CASSANDRA-11063:

*Update:* We ran a major compaction against this table, which completed successfully. At this point no compaction is running, yet this error is continuously being logged in system.log. We tried bouncing the node, which did not help at all.
{code}
~ $ nodetool compactionstats
pending tasks: 0
{code}

> Unable to compute ceiling for max when histogram overflowed
> ---
>
> Key: CASSANDRA-11063
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11063
> Project: Cassandra
> Issue Type: Bug
> Components: Compaction
> Environment: Cassandra 2.1.9 on RHEL
> Reporter: Navjyot Nishant
> Labels: Compaction, thread
>
> Issue https://issues.apache.org/jira/browse/CASSANDRA-8028 seems related to the error we are getting, but we are hitting it on Cassandra 2.1.9: while autocompaction is running, it keeps throwing the following errors. We are unsure whether this is a bug or something we can resolve; please advise.
[jira] [Updated] (CASSANDRA-11063) Unable to compute ceiling for max when histogram overflowed
[ https://issues.apache.org/jira/browse/CASSANDRA-11063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navjyot Nishant updated CASSANDRA-11063:
Description:

Issue https://issues.apache.org/jira/browse/CASSANDRA-8028 seems related to the error we are getting, but we are hitting it on Cassandra 2.1.9: while autocompaction is running, it keeps throwing the following errors. We are unsure whether this is a bug or something we can resolve; please advise.

{code}
WARN [CompactionExecutor:3] 2016-01-23 13:30:40,907 SSTableWriter.java:240 - Compacting large partition gccatlgsvcks/category_name_dedup:66611300 (138152195 bytes)
ERROR [CompactionExecutor:1] 2016-01-23 13:30:50,267 CassandraDaemon.java:223 - Exception in thread Thread[CompactionExecutor:1,1,main]
java.lang.IllegalStateException: Unable to compute ceiling for max when histogram overflowed
at org.apache.cassandra.utils.EstimatedHistogram.mean(EstimatedHistogram.java:203) ~[apache-cassandra-2.1.9.jar:2.1.9]
at org.apache.cassandra.io.sstable.metadata.StatsMetadata.getEstimatedDroppableTombstoneRatio(StatsMetadata.java:98) ~[apache-cassandra-2.1.9.jar:2.1.9]
at org.apache.cassandra.io.sstable.SSTableReader.getEstimatedDroppableTombstoneRatio(SSTableReader.java:1987) ~[apache-cassandra-2.1.9.jar:2.1.9]
at org.apache.cassandra.db.compaction.AbstractCompactionStrategy.worthDroppingTombstones(AbstractCompactionStrategy.java:370) ~[apache-cassandra-2.1.9.jar:2.1.9]
at org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy.getNextBackgroundSSTables(SizeTieredCompactionStrategy.java:96) ~[apache-cassandra-2.1.9.jar:2.1.9]
at org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy.getNextBackgroundTask(SizeTieredCompactionStrategy.java:179) ~[apache-cassandra-2.1.9.jar:2.1.9]
at org.apache.cassandra.db.compaction.WrappingCompactionStrategy.getNextBackgroundTask(WrappingCompactionStrategy.java:84) ~[apache-cassandra-2.1.9.jar:2.1.9]
at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:230) ~[apache-cassandra-2.1.9.jar:2.1.9]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_51]
at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_51]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_51]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_51]
at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51]
{code}

h3. Additional info:

*cfstats is running fine for that table...*
{code}
~ $ nodetool cfstats gccatlgsvcks.category_name_dedup
Keyspace: gccatlgsvcks
Read Count: 0
Read Latency: NaN ms.
Write Count: 0
Write Latency: NaN ms.
Pending Flushes: 0
Table: category_name_dedup
SSTable count: 6
Space used (live): 836314727
Space used (total): 836314727
Space used by snapshots (total): 3621519
Off heap memory used (total): 6930368
SSTable Compression Ratio: 0.03725358753117693
Number of keys (estimate): 3004
Memtable cell count: 0
Memtable data size: 0
Memtable off heap memory used: 0
Memtable switch count: 0
Local read count: 0
Local read latency: NaN ms
Local write count: 0
Local write latency: NaN ms
Pending flushes: 0
Bloom filter false positives: 0
Bloom filter false ratio: 0.0
Bloom filter space used: 5240
Bloom filter off heap memory used: 5192
Index summary off heap memory used: 1200
Compression metadata off heap memory used: 6923976
Compacted partition minimum bytes: 125
Compacted partition maximum bytes: 30753941057
Compacted partition mean bytes: 8352388
Average live cells per slice (last five minutes): 0.0
Maximum live cells per slice (last five minutes): 0.0
Average tombstones per slice (last five minutes): 0.0
Maximum tombstones per slice (last five minutes): 0.0
{code}

*cfhistograms is also running fine...*
{code}
~ $ nodetool cfhistograms gccatlgsvcks category_name_dedup
gccatlgsvcks/category_name_dedup histograms
Percentile  SSTables  Write Latency  Read Latency  Partition Size  Cell Count
                      (micros)       (micros)      (bytes)
50%         0.00      0.00           0.00          1109            20
75%         0.00      0.00           0.00
{code}
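The cfstats output above reports a compacted partition maximum of 30753941057 bytes (~28.6 GiB). Cassandra tracks such sizes in an EstimatedHistogram, whose bucket boundaries grow by roughly 20% per bucket over a fixed range; a sample past the last boundary is counted in an overflow bucket, and an overflowed histogram is what makes mean() throw in the stack trace. The Python sketch below is illustrative only: the ~1.2 growth factor mirrors EstimatedHistogram's bucket spacing, but the bucket count of 90 is an assumed figure, and the real counts differ per metric and version.

```python
# Hypothetical sketch of exponentially spaced histogram boundaries, modeled
# loosely on Cassandra's EstimatedHistogram (growth factor and bucket count
# here are assumptions, not the exact values Cassandra uses per metric).

def bucket_offsets(n):
    """Return n strictly increasing boundaries, each ~1.2x the previous."""
    offsets = [1]
    while len(offsets) < n:
        last = offsets[-1]
        offsets.append(max(last + 1, int(round(last * 1.2))))
    return offsets

def overflows(value, offsets):
    """A sample beyond the last boundary would land in the overflow bucket."""
    return value > offsets[-1]

offsets = bucket_offsets(90)
huge_partition = 30753941057  # "Compacted partition maximum bytes" from cfstats
print(offsets[-1])                         # range covered by 90 buckets
print(overflows(huge_partition, offsets))  # a ~28.6 GiB sample falls far outside it
```

With ~1.2x spacing, 90 buckets cover only on the order of tens of millions, so a multi-gigabyte partition cannot be bucketed; once such a sample is recorded, every later aggregate over the histogram is suspect, which is why the error persists until the offending statistics are rewritten.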
[jira] [Commented] (CASSANDRA-8028) Unable to compute when histogram overflowed
[ https://issues.apache.org/jira/browse/CASSANDRA-8028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15113715#comment-15113715 ]

Navjyot Nishant commented on CASSANDRA-8028:

Hi All, we are getting a similar issue while autocompaction is running on a few of our nodes. Following is the error being logged; can someone please suggest what is causing this and how to resolve it? We use Cassandra 2.1.9. Please let me know if further information is required.

Error:
{code}
ERROR [CompactionExecutor:3] 2016-01-23 11:54:50,198 CassandraDaemon.java:223 - Exception in thread Thread[CompactionExecutor:3,1,main]
java.lang.IllegalStateException: Unable to compute ceiling for max when histogram overflowed
at org.apache.cassandra.utils.EstimatedHistogram.mean(EstimatedHistogram.java:203) ~[apache-cassandra-2.1.9.jar:2.1.9]
at org.apache.cassandra.io.sstable.metadata.StatsMetadata.getEstimatedDroppableTombstoneRatio(StatsMetadata.java:98) ~[apache-cassandra-2.1.9.jar:2.1.9]
at org.apache.cassandra.io.sstable.SSTableReader.getEstimatedDroppableTombstoneRatio(SSTableReader.java:1987) ~[apache-cassandra-2.1.9.jar:2.1.9]
at org.apache.cassandra.db.compaction.AbstractCompactionStrategy.worthDroppingTombstones(AbstractCompactionStrategy.java:370) ~[apache-cassandra-2.1.9.jar:2.1.9]
at org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy.getNextBackgroundSSTables(SizeTieredCompactionStrategy.java:96) ~[apache-cassandra-2.1.9.jar:2.1.9]
at org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy.getNextBackgroundTask(SizeTieredCompactionStrategy.java:179) ~[apache-cassandra-2.1.9.jar:2.1.9]
at org.apache.cassandra.db.compaction.WrappingCompactionStrategy.getNextBackgroundTask(WrappingCompactionStrategy.java:84) ~[apache-cassandra-2.1.9.jar:2.1.9]
at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:230) ~[apache-cassandra-2.1.9.jar:2.1.9]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_51]
at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_51]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_51]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_51]
at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51]
{code}

> Unable to compute when histogram overflowed
> ---
>
> Key: CASSANDRA-8028
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8028
> Project: Cassandra
> Issue Type: Bug
> Components: Tools
> Environment: Linux
> Reporter: Gianluca Borello
> Assignee: Carl Yeksigian
> Fix For: 2.1.3
>
> Attachments: 8028-2.1-clean.txt, 8028-2.1-v2.txt, 8028-2.1.txt, 8028-trunk.txt, sstable-histogrambuster.tar.bz2
>
> It seems like with 2.1.0 histograms can't be computed most of the time:
> $ nodetool cfhistograms draios top_files_by_agent1
> nodetool: Unable to compute when histogram overflowed
> See 'nodetool help' or 'nodetool help '.
> I can probably find a way to attach a .cql script to reproduce it, but I suspect it must be obvious to replicate, as it happens on more than 50% of my column families.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11063) Unable to compute ceiling for max when histogram overflowed
[ https://issues.apache.org/jira/browse/CASSANDRA-11063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navjyot Nishant updated CASSANDRA-11063:
Description:

Issue https://issues.apache.org/jira/browse/CASSANDRA-8028 seems related to the error we are getting, but we are hitting it on Cassandra 2.1.9: while autocompaction is running, it keeps throwing the following errors. We are unsure whether this is a bug or something we can resolve; please advise.

{code}
WARN [CompactionExecutor:3] 2016-01-23 13:30:40,907 SSTableWriter.java:240 - Compacting large partition gccatlgsvcks/category_name_dedup:66611300 (138152195 bytes)
ERROR [CompactionExecutor:1] 2016-01-23 13:30:50,267 CassandraDaemon.java:223 - Exception in thread Thread[CompactionExecutor:1,1,main]
java.lang.IllegalStateException: Unable to compute ceiling for max when histogram overflowed
at org.apache.cassandra.utils.EstimatedHistogram.mean(EstimatedHistogram.java:203) ~[apache-cassandra-2.1.9.jar:2.1.9]
at org.apache.cassandra.io.sstable.metadata.StatsMetadata.getEstimatedDroppableTombstoneRatio(StatsMetadata.java:98) ~[apache-cassandra-2.1.9.jar:2.1.9]
at org.apache.cassandra.io.sstable.SSTableReader.getEstimatedDroppableTombstoneRatio(SSTableReader.java:1987) ~[apache-cassandra-2.1.9.jar:2.1.9]
at org.apache.cassandra.db.compaction.AbstractCompactionStrategy.worthDroppingTombstones(AbstractCompactionStrategy.java:370) ~[apache-cassandra-2.1.9.jar:2.1.9]
at org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy.getNextBackgroundSSTables(SizeTieredCompactionStrategy.java:96) ~[apache-cassandra-2.1.9.jar:2.1.9]
at org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy.getNextBackgroundTask(SizeTieredCompactionStrategy.java:179) ~[apache-cassandra-2.1.9.jar:2.1.9]
at org.apache.cassandra.db.compaction.WrappingCompactionStrategy.getNextBackgroundTask(WrappingCompactionStrategy.java:84) ~[apache-cassandra-2.1.9.jar:2.1.9]
at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:230) ~[apache-cassandra-2.1.9.jar:2.1.9]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_51]
at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_51]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_51]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_51]
at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51]
{code}

Additional info:

cfstats is running fine for that table...
{code}
~ $ nodetool cfstats gccatlgsvcks.category_name_dedup
Keyspace: gccatlgsvcks
Read Count: 0
Read Latency: NaN ms.
Write Count: 0
Write Latency: NaN ms.
Pending Flushes: 0
Table: category_name_dedup
SSTable count: 6
Space used (live): 836089073
Space used (total): 836089073
Space used by snapshots (total): 3621519
Off heap memory used (total): 6925736
SSTable Compression Ratio: 0.03725398763856016
Number of keys (estimate): 3004
Memtable cell count: 0
Memtable data size: 0
Memtable off heap memory used: 0
Memtable switch count: 0
Local read count: 0
Local read latency: NaN ms
Local write count: 0
Local write latency: NaN ms
Pending flushes: 0
Bloom filter false positives: 0
Bloom filter false ratio: 0.0
Bloom filter space used: 5240
Bloom filter off heap memory used: 5192
Index summary off heap memory used: 1200
Compression metadata off heap memory used: 6919344
Compacted partition minimum bytes: 125
Compacted partition maximum bytes: 30753941057
Compacted partition mean bytes: 8352388
Average live cells per slice (last five minutes): 0.0
Maximum live cells per slice (last five minutes): 0.0
Average tombstones per slice (last five minutes): 0.0
Maximum tombstones per slice (last five minutes): 0.0
{code}

was:
Issue https://issues.apache.org/jira/browse/CASSANDRA-8028 seems related with error we are getting. But we are getting this with Cassandra 2.1.9 when autocompaction is running it keeps throwing following errors, we are unsure if its a bug or can be resolved, please suggest.
ERROR [CompactionExecutor:3] 2016-01-23 11:52:50,197 CassandraDaemon.java:223 - Exception in thread Thread[CompactionExecutor:3,1,main]
java.lang.IllegalStateException: Unable to compute ceiling for max when histogram overflowed
[jira] [Commented] (CASSANDRA-8028) Unable to compute when histogram overflowed
[ https://issues.apache.org/jira/browse/CASSANDRA-8028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15113748#comment-15113748 ]

Navjyot Nishant commented on CASSANDRA-8028:

I have created https://issues.apache.org/jira/browse/CASSANDRA-11063 to track this issue.
[jira] [Created] (CASSANDRA-11063) Unable to compute ceiling for max when histogram overflowed
Navjyot Nishant created CASSANDRA-11063:
---

Summary: Unable to compute ceiling for max when histogram overflowed
Key: CASSANDRA-11063
URL: https://issues.apache.org/jira/browse/CASSANDRA-11063
Project: Cassandra
Issue Type: Bug
Components: Compaction
Environment: Cassandra 2.1.9 on RHEL
Reporter: Navjyot Nishant

Issue https://issues.apache.org/jira/browse/CASSANDRA-8028 seems related to the error we are getting, but we are hitting it on Cassandra 2.1.9: while autocompaction is running, it keeps throwing the following errors. We are unsure whether this is a bug or something we can resolve; please advise.

{code}
ERROR [CompactionExecutor:3] 2016-01-23 11:52:50,197 CassandraDaemon.java:223 - Exception in thread Thread[CompactionExecutor:3,1,main]
java.lang.IllegalStateException: Unable to compute ceiling for max when histogram overflowed
at org.apache.cassandra.utils.EstimatedHistogram.mean(EstimatedHistogram.java:203) ~[apache-cassandra-2.1.9.jar:2.1.9]
at org.apache.cassandra.io.sstable.metadata.StatsMetadata.getEstimatedDroppableTombstoneRatio(StatsMetadata.java:98) ~[apache-cassandra-2.1.9.jar:2.1.9]
at org.apache.cassandra.io.sstable.SSTableReader.getEstimatedDroppableTombstoneRatio(SSTableReader.java:1987) ~[apache-cassandra-2.1.9.jar:2.1.9]
at org.apache.cassandra.db.compaction.AbstractCompactionStrategy.worthDroppingTombstones(AbstractCompactionStrategy.java:370) ~[apache-cassandra-2.1.9.jar:2.1.9]
at org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy.getNextBackgroundSSTables(SizeTieredCompactionStrategy.java:96) ~[apache-cassandra-2.1.9.jar:2.1.9]
at org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy.getNextBackgroundTask(SizeTieredCompactionStrategy.java:179) ~[apache-cassandra-2.1.9.jar:2.1.9]
at org.apache.cassandra.db.compaction.WrappingCompactionStrategy.getNextBackgroundTask(WrappingCompactionStrategy.java:84) ~[apache-cassandra-2.1.9.jar:2.1.9]
at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:230) ~[apache-cassandra-2.1.9.jar:2.1.9]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_51]
at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_51]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_51]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_51]
at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51]
{code}
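The top frame of the trace, EstimatedHistogram.mean(), throws precisely when the histogram has overflowed, i.e. when at least one sample fell past the last bucket boundary, so no per-bucket ceiling exists for it. A minimal Python sketch of that failure mode (the class name, boundaries, and estimate below are illustrative assumptions, not Cassandra's implementation):

```python
import bisect

class TinyEstimatedHistogram:
    """Toy fixed-range histogram: samples beyond the last boundary go to an
    overflow counter, after which mean() refuses to answer, mirroring the
    IllegalStateException in the Cassandra log. Illustrative sketch only."""

    def __init__(self, offsets):
        self.offsets = list(offsets)            # ascending bucket boundaries
        self.buckets = [0] * len(self.offsets)  # count per bucket
        self.overflow = 0

    def add(self, value):
        i = bisect.bisect_left(self.offsets, value)
        if i == len(self.offsets):
            self.overflow += 1                  # too large for any bucket
        else:
            self.buckets[i] += 1

    def mean(self):
        if self.overflow:
            raise ValueError(
                "Unable to compute ceiling for max when histogram overflowed")
        total = sum(self.buckets)
        if total == 0:
            return 0
        # estimate each sample by its bucket's upper boundary (its "ceiling")
        return sum(n * off for n, off in zip(self.buckets, self.offsets)) // total

h = TinyEstimatedHistogram([10, 100, 1000])
h.add(5)
h.add(50)
print(h.mean())  # -> 55: fine while nothing has overflowed
h.add(5000)      # past the last boundary: increments the overflow counter
# h.mean() would now raise, as compaction's mean() call does in the trace
```

Because the overflow state here lives with the histogram itself, recomputing aggregates keeps failing until the histogram is rebuilt; this matches the reporter's observation that the error persisted across a node bounce.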