All,

We have a Cassandra cluster that seems to be struggling a bit. One node crashes 
continually, and others crash sporadically. When they crash, it's with the JVM 
reporting that it couldn't allocate memory, even though there's plenty of memory 
available. I suspect the cause is one very large table (500GB) which has on the 
order of 500K-700K files in its directory. When I deleted the directory contents 
on the crashing node and ran a repair, the nodes around it crashed while 
streaming the data. The relevant bits from the crash file and environment are 
below.

Any help would be appreciated.

#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 12288 bytes for committing 
reserved memory.
# Possible reasons:
#   The system is out of physical RAM or swap space
#   In 32 bit mode, the process size limit was hit
# Possible solutions:
#   Reduce memory load on the system
#   Increase physical memory or swap space
#   Check if swap backing store is full
#   Use 64 bit Java on a 64 bit OS
#   Decrease Java heap size (-Xmx/-Xms)
#   Decrease number of Java threads
#   Decrease Java thread stack sizes (-Xss)
#   Set larger code cache with -XX:ReservedCodeCacheSize=
# This output file may be truncated or incomplete.
#
#  Out of Memory Error (os_linux.cpp:2671), pid=1104, tid=139950342317824
#
# JRE version: Java(TM) SE Runtime Environment (8.0_20-b26) (build 1.8.0_20-b26)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.20-b23 mixed mode linux-amd64 
compressed oops)
# Failed to write core dump. Core dumps have been disabled. To enable core 
dumping, try "ulimit -c unlimited" before starting Java again
#

---------------  T H R E A D  ---------------

Current thread (0x00007f4acabb1800):  JavaThread "Thread-13" [_thread_new, 
id=19171, stack(0x00007f48ba6ca000,0x00007f48ba70b000)]

Stack: [0x00007f48ba6ca000,0x00007f48ba70b000],  sp=0x00007f48ba709a50,  free 
space=254k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0xa76cea]  VMError::report_and_die()+0x2ca
V  [libjvm.so+0x4e52fb]  report_vm_out_of_memory(char const*, int, unsigned 
long, VMErrorType, char const*)+0x8b
V  [libjvm.so+0x8e4ec3]  os::Linux::commit_memory_impl(char*, unsigned long, 
bool)+0x103
V  [libjvm.so+0x8e4f8c]  os::pd_commit_memory(char*, unsigned long, bool)+0xc
V  [libjvm.so+0x8dce4a]  os::commit_memory(char*, unsigned long, bool)+0x2a
V  [libjvm.so+0x8e33af]  os::pd_create_stack_guard_pages(char*, unsigned 
long)+0x7f
V  [libjvm.so+0xa21bde]  JavaThread::create_stack_guard_pages()+0x5e
V  [libjvm.so+0xa29954]  JavaThread::run()+0x34
V  [libjvm.so+0x8e75f8]  java_start(Thread*)+0x108
C  [libpthread.so.0+0x79d1]


Memory: 4k page, physical 131988232k(694332k free), swap 37748728k(37748728k 
free)

vm_info: Java HotSpot(TM) 64-Bit Server VM (25.20-b23) for linux-amd64 JRE 
(1.8.0_20-b26), built on Jul 30 2014 13:13:52 by "java_re" with gcc 4.3.0 
20080428 (Red Hat 4.3.0-8)

time: Fri Dec 19 14:37:29 2014
elapsed time: 2303 seconds (0d 0h 38m 23s)

OS:Red Hat Enterprise Linux Server release 6.5 (Santiago)

uname:Linux 2.6.32-431.5.1.el6.x86_64 #1 SMP Fri Jan 10 14:46:43 EST 2014 x86_64
libc:glibc 2.12 NPTL 2.12
rlimit: STACK 10240k, CORE 0k, NPROC 8192, NOFILE 65536, AS infinity
load average:4.18 4.79 4.54

/proc/meminfo:
MemTotal:       131988232 kB
MemFree:          694332 kB
Buffers:          837584 kB
Cached:         51002896 kB
SwapCached:            0 kB
Active:         93953028 kB
Inactive:       32850628 kB
Active(anon):   70851112 kB
Inactive(anon):  4713848 kB
Active(file):   23101916 kB
Inactive(file): 28136780 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:      37748728 kB
SwapFree:       37748728 kB
Dirty:             75752 kB
Writeback:             0 kB
AnonPages:      74963768 kB
Mapped:           739884 kB
Shmem:            601592 kB
Slab:            3460252 kB
SReclaimable:    3170124 kB
SUnreclaim:       290128 kB
KernelStack:       36224 kB
PageTables:       189772 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    169736960 kB
Committed_AS:   92208740 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      492032 kB
VmallocChunk:   34291733296 kB
HardwareCorrupted:     0 kB
AnonHugePages:  67717120 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:        5056 kB
DirectMap2M:     2045952 kB
DirectMap1G:    132120576 kB
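
Given that the failed allocation is an mmap of only 12K with that much memory 
free, I'm wondering if we're running into a per-process limit on memory map 
areas rather than genuine memory exhaustion, which would fit with the huge 
number of files in that table's directory. If it would help, I can grab the 
kernel's map-area limit and the number of mappings the Cassandra process is 
holding next time it's up (<pid> below is a placeholder for the Cassandra 
process id):

> sysctl vm.max_map_count        # kernel limit on map areas per process
> wc -l /proc/<pid>/maps         # mappings currently held by the Cassandra JVM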

Before you say it's a ulimit issue:
[501]> ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 1030998
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 8192
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 8192
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
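
Those are the limits for my login shell, though; if it's useful I can also pull 
the limits the Cassandra JVM actually inherited, in case the init script sets 
something different (<pid> again being a placeholder for the Cassandra process 
id):

> cat /proc/<pid>/limits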

Here's the file count on one of the nodes for this very big table:
loosterw@NODE:/env/datacache/data/cassandra/data/datastore/bigtable-e58925706a3c11e4ba63adfbd009c4d6
> ls | wc -l
588636
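
If a breakdown of that count is useful, something along these lines (run in the 
same directory) should show how many of each SSTable component are in there, 
assuming the usual <table>-<version>-<generation>-<Component> file naming:

> ls | sed 's/.*-//' | sort | uniq -c | sort -rn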

Thanks,

Leon


