Hi,

This thread is old. Do you still need help with this? Sorry we missed it.


   1. What Cassandra version do you use?
   2. What does "nodetool tpstats" show? Any dropped or pending messages?
   3. Is your error a heap memory issue or a native memory one?
   4. Which settings did you change from the defaults that you think might
   be related to this (memtable size, GC config, bloom filters...)?
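
For question 2, here is a minimal Python sketch of how a thread pool table could be scanned for backed-up stages. It assumes the whitespace-separated column layout visible in the StatusLogger output quoted further down (pool name, active, pending, completed, blocked, all time blocked). The function name flag_busy_pools and the max_pending cutoff are hypothetical, chosen only for illustration; they are not part of nodetool.

```python
import re

# Pool name followed by five integer columns:
# active, pending, completed, blocked, all-time blocked.
ROW = re.compile(r"^(\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$")


def flag_busy_pools(tpstats_text, max_pending=0):
    """Return {pool: (pending, all_time_blocked)} for pools that look backed up."""
    busy = {}
    for line in tpstats_text.splitlines():
        match = ROW.match(line.strip())
        if not match:
            continue  # header, blank line, or a row in another layout
        name = match.group(1)
        _active, pending, _completed, _blocked, all_blocked = map(int, match.groups()[1:])
        if pending > max_pending or all_blocked > 0:
            busy[name] = (pending, all_blocked)
    return busy


# Rows copied from the log excerpt quoted in this thread.
sample = """\
Pool Name                    Active   Pending      Completed   Blocked  All Time Blocked
MutationStage                    32   4881705    17826061870         0                 0
ReadStage                         0         0        3266887         0                 0
Native-Transport-Requests         3         2      212872837         0              1796
"""

print(flag_busy_pools(sample))
```

On the sample rows this flags MutationStage (millions of pending mutations) and Native-Transport-Requests (non-zero all-time blocked), which are the two pools worth looking at first.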


> the cluster starts flapping between being down and up


On AWS, and even more so with small instances, make sure to set
phi_convict_threshold to about 12, just in case. This should keep nodes
from flapping as much.
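
To see why a higher threshold reduces flapping, here is a rough Python sketch of the conviction rule. It assumes a simplified phi accrual model in which heartbeat gaps are exponentially distributed, so phi grows linearly with the time since the last heartbeat. The 1-second mean heartbeat interval and the function name are illustrative assumptions, not values measured on this cluster. The setting itself lives in cassandra.yaml.

```python
import math


def seconds_until_conviction(threshold, mean_interval_s=1.0):
    """Silence (in seconds) after which phi exceeds the threshold.

    Simplified model: phi = (elapsed / mean_interval) / ln(10),
    so phi > threshold once elapsed > threshold * ln(10) * mean_interval.
    """
    return threshold * math.log(10) * mean_interval_s


for threshold in (8, 12):  # 8 is the default, 12 the value suggested above
    print(threshold, round(seconds_until_conviction(threshold), 1))
```

Under these assumptions, going from 8 to 12 gives a node that is stalled (for example by a long GC pause) about 50% more time to answer before its peers mark it down.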

C*heers,
-----------------------
Alain Rodriguez - al...@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2016-04-12 19:53 GMT+02:00 Bo Finnerup Madsen <bo.gunder...@gmail.com>:

> Hi,
>
> We have an application that reads data from a set of external sources and
> loads them into our cassandra cluster. The load goes ok for some time
> (~24h) and then some servers in the cluster start flapping between being
> down and up, and finally they go out of memory.
> The cluster consists of 5 m4.xlarge machines with 16gb memory, cassandra
> has an 8gb heap. All machines have a high load while data is being written,
> with a load between 6 and 20.
>
> I have tried sifting through the information available from nodetool, but
> I am unable to find anything helping me determine what is causing the oom.
> I am quite new to cassandra, so I might very well overlook the obvious. So
> any pointers on how to proceed with identifying the problem will be much
> appreciated :)
>
> In the following I have included information from 10.61.70.110 when it was
> flapping.
>
> Status for ddp keyspace(only keyspace containing any real data):
>
> -----------------------------------------------------------------------------
> Datacenter: datacenter1
> =======================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address       Load      Tokens  Owns (effective)  Host ID                               Rack
> UN  10.61.70.108  65.97 GB  256     59,1%             de79a554-9296-4575-8b79-2089f92069cd  rack1
> UN  10.61.70.110  58.95 GB  256     63,3%             310460f6-b7ce-45a7-be63-a7dd409f6b17  rack1
> UN  10.61.70.72   58.17 GB  256     60,3%             44fd4f8e-18cd-4487-8174-3a22fb9ed24f  rack1
> UN  10.61.70.107  58.69 GB  256     58,5%             f8118fc2-e340-45db-a06e-a5842107d6c8  rack1
> UN  10.61.70.64   68 GB     256     58,7%             84bee9fe-2adc-48aa-915c-f43d972f5a2f  rack1
>
> -----------------------------------------------------------------------------
>
>
> Snippet from system.log:
>
> -----------------------------------------------------------------------------
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,547 MessagingService.java:980
> - MUTATION messages were dropped in last 5000 ms: 5776 for internal timeout
> and 0 for cross node timeout
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,547 StatusLogger.java:52 -
> Pool Name                    Active   Pending      Completed   Blocked  All
> Time Blocked
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,551 StatusLogger.java:56 -
> MutationStage                    32   4881705    17826061870         0
>               0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,552 StatusLogger.java:56 -
> ViewMutationStage                 0         0              0         0
>             0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,552 StatusLogger.java:56 -
> ReadStage                         0         0        3266887         0
>             0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,553 StatusLogger.java:56 -
> RequestResponseStage              0         0      389429305         0
>             0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,553 StatusLogger.java:56 -
> ReadRepairStage                   0         0         322804         0
>             0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,554 StatusLogger.java:56 -
> CounterMutationStage              0         0              0         0
>             0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,554 StatusLogger.java:56 -
> MiscStage                         0         0              0         0
>             0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,554 StatusLogger.java:56 -
> CompactionExecutor                4        54          31305         0
>             0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,555 StatusLogger.java:56 -
> MemtableReclaimMemory             0         0           3310         0
>             0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,555 StatusLogger.java:56 -
> PendingRangeCalculator            0         0             10         0
>             0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,555 StatusLogger.java:56 -
> GossipStage                       0         0         338170         0
>             0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,555 StatusLogger.java:56 -
> SecondaryIndexManagement          0         0              0         0
>             0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,556 StatusLogger.java:56 -
> HintsDispatcher                   1         4           6264         0
>             0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,558 StatusLogger.java:56 -
> MigrationStage                    0         0              0         0
>             0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,559 StatusLogger.java:56 -
> MemtablePostFlush                 0         0           3451         0
>             0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,560 StatusLogger.java:56 -
> ValidationExecutor                0         0              0         0
>             0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,561 StatusLogger.java:56 -
> Sampler                           0         0              0         0
>             0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,563 StatusLogger.java:56 -
> MemtableFlushWriter               0         0           3310         0
>             0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,564 StatusLogger.java:56 -
> InternalResponseStage             0         0        1873184         0
>             0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,565 StatusLogger.java:56 -
> AntiEntropyStage                  0         0              0         0
>             0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,565 StatusLogger.java:56 -
> CacheCleanupExecutor              0         0              0         0
>             0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,566 StatusLogger.java:56 -
> Native-Transport-Requests         3         2      212872837         0
>          1796
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,566 StatusLogger.java:66 -
> CompactionManager                 4        32
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,566 StatusLogger.java:78 -
> MessagingService                n/a       0/0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,567 StatusLogger.java:88 -
> Cache Type                     Size                 Capacity
> KeysToSave
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,568 StatusLogger.java:90 -
> KeyCache                   88654200                104857600
>        all
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,568 StatusLogger.java:96 -
> RowCache                          0                        0
>        all
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,568 StatusLogger.java:103 -
> Table                       Memtable ops,data
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,569 StatusLogger.java:106 -
> ddp.fingerprint_by_content_uuid_mv      99624,17726363
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,569 StatusLogger.java:106 -
> ddp.sync                            4747,1551
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,569 StatusLogger.java:106 -
> ddp.log_portal                            0,0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,570 StatusLogger.java:106 -
> ddp.meta_data                    2083,3829424
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,570 StatusLogger.java:106 -
> ddp.configuration                   199,13915
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,570 StatusLogger.java:106 -
> ddp.uuids_by_related_uuid     414431,17721698
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,570 StatusLogger.java:106 -
> ddp.file_by_file_id                       0,0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,571 StatusLogger.java:106 -
> ddp.fingerprint_by_content_type_mv       83007,3317002
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,571 StatusLogger.java:106 -
> ddp.heartbeat                     35855,13407
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,574 StatusLogger.java:106 -
> ddp.concept                               0,0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,575 StatusLogger.java:106 -
> ddp.classification_scheme                 0,0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,576 StatusLogger.java:106 -
> ddp.rendering_relations_mv        9665,1201123
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,576 StatusLogger.java:106 -
> ddp.file                             41,19826
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,576 StatusLogger.java:106 -
> ddp.rendering_relations       276049,13465909
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,576 StatusLogger.java:106 -
> ddp.semantic_group                        0,0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,577 StatusLogger.java:106 -
> ddp.rendering                  23344,68861340
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,577 StatusLogger.java:106 -
> ddp.thesauri                             44,8
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,577 StatusLogger.java:106 -
> ddp.file_download                         0,0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,577 StatusLogger.java:106 -
> ddp.uuids_by_related_uuid_mv     172394,14524712
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,578 StatusLogger.java:106 -
> ddp.fingerprint               124884,54955730
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,578 StatusLogger.java:106 -
> ddp.ddp_status                            0,0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,578 StatusLogger.java:106 -
> ddp.log_portal_mv                         0,0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,578 StatusLogger.java:106 -
> system_distributed.parent_repair_history                 0,0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,579 StatusLogger.java:106 -
> system_distributed.repair_history                 0,0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,579 StatusLogger.java:106 -
> system.compaction_history             20,4595
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,580 StatusLogger.java:106 -
> system.hints                              0,0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,580 StatusLogger.java:106 -
> system.schema_aggregates                  0,0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,580 StatusLogger.java:106 -
> system.IndexInfo                          0,0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,581 StatusLogger.java:106 -
> system.schema_columnfamilies                 0,0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,581 StatusLogger.java:106 -
> system.schema_triggers                    0,0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,582 StatusLogger.java:106 -
> system.size_estimates            50400,764904
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,582 StatusLogger.java:106 -
> system.schema_functions                   0,0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,582 StatusLogger.java:106 -
> system.paxos                              0,0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,582 StatusLogger.java:106 -
> system.views_builds_in_progress                 0,0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,583 StatusLogger.java:106 -
> system.built_views                        0,0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,583 StatusLogger.java:106 -
> system.peer_events                        0,0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,583 StatusLogger.java:106 -
> system.range_xfers                        0,0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,583 StatusLogger.java:106 -
> system.peers                              0,0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,584 StatusLogger.java:106 -
> system.batches                188441,33975764
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,584 StatusLogger.java:106 -
> system.schema_keyspaces                   0,0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,585 StatusLogger.java:106 -
> system.schema_usertypes                   0,0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,585 StatusLogger.java:106 -
> system.local                              0,0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,585 StatusLogger.java:106 -
> system.sstable_activity            1376,24807
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,587 StatusLogger.java:106 -
> system.available_ranges                   0,0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,587 StatusLogger.java:106 -
> system.batchlog                           0,0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,588 StatusLogger.java:106 -
> system.schema_columns                     0,0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,589 StatusLogger.java:106 -
> system_schema.columns                     0,0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,589 StatusLogger.java:106 -
> system_schema.types                       0,0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,590 StatusLogger.java:106 -
> system_schema.indexes                     0,0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,590 StatusLogger.java:106 -
> system_schema.keyspaces                   0,0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,590 StatusLogger.java:106 -
> system_schema.dropped_columns                 0,0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,592 StatusLogger.java:106 -
> system_schema.aggregates                  0,0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,592 StatusLogger.java:106 -
> system_schema.triggers                    0,0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,592 StatusLogger.java:106 -
> system_schema.tables                      0,0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,592 StatusLogger.java:106 -
> system_schema.views                       0,0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,593 StatusLogger.java:106 -
> system_schema.functions                   0,0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,593 StatusLogger.java:106 -
> system_auth.roles                         0,0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,593 StatusLogger.java:106 -
> system_auth.role_members                  0,0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,593 StatusLogger.java:106 -
> system_auth.resource_role_permissons_index                 0,0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,594 StatusLogger.java:106 -
> system_auth.role_permissions                 0,0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,594 StatusLogger.java:106 -
> system_traces.sessions                    0,0
> INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,594 StatusLogger.java:106 -
> system_traces.events                      0,0
>
> -----------------------------------------------------------------------------
>
> compactionstats:
>
> -----------------------------------------------------------------------------
> pending tasks: 32
> id                                    compaction type  keyspace  table                  completed  total      unit   progress
> 3bf89310-00d1-11e6-a5a3-f125ce747d55  Compaction       ddp       rendering_relations    1,09 GB    5,43 GB    bytes  20,10%
> 9c5bd6b1-00d4-11e6-a5a3-f125ce747d55  Compaction       ddp       meta_data              243,64 MB  338,56 MB  bytes  71,96%
> 50308cb0-00d2-11e6-a5a3-f125ce747d55  Compaction       ddp       uuids_by_related_uuid  1,37 GB    2,17 GB    bytes  63,11%
> e8965d90-00c3-11e6-a5a3-f125ce747d55  Compaction       ddp       rendering              19,02 GB   24,09 GB   bytes  78,96%
>
> -----------------------------------------------------------------------------
>
> Thank you in advance :)
>
> Yours sincerely,
>   Bo Madsen
>
