I am also getting this error message on one node when the other host is down.

ceph -s
Traceback (most recent call last):
  File "/usr/bin/ceph", line 130, in <module>
    import rados
ImportError: libceph-common.so.0: cannot map zero-fill pages
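
That ImportError looks like the same memory exhaustion: the dynamic loader
could not mmap the anonymous (zero-fill) pages it needs while loading
libceph-common.so.0. A quick, Ceph-agnostic check on that node is to look at
free memory and at the kernel log for the OOM killer:

free -m
dmesg -T | grep -iE 'oom|out of memory'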



On Tue, Sep 10, 2019 at 4:39 PM Amudhan P <[email protected]> wrote:

> It's a test cluster; each node has a single OSD and 4 GB of RAM.
>
> On Tue, Sep 10, 2019 at 3:42 PM Ashley Merrick <[email protected]>
> wrote:
>
>> What are the specs of the machines?
>>
>> Recovery work will use more memory than general clean operation, and it
>> looks like you are maxing out the available memory on the machines while
>> Ceph is trying to recover.
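>>
>> One knob that may help on nodes this small is osd_memory_target (a minimal
>> sketch, assuming BlueStore OSDs on 13.2.3 or later, where osd_memory_target
>> exists and defaults to 4 GiB, leaving nothing for the OS on a 4 GB machine).
>> Lowering it tells each OSD to try to keep its memory use under that value:
>>
>> sudo ceph config set osd osd_memory_target 1610612736
>> # or put osd_memory_target = 1610612736 under [osd] in /etc/ceph/ceph.conf
>> # on each OSD node and restart the OSDs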
>>
>>
>>
>> ---- On Tue, 10 Sep 2019 18:10:50 +0800 [email protected]
>> <[email protected]> wrote ----
>>
>> I have also found the errors below in dmesg.
>>
>> [332884.028810] systemd-journald[6240]: Failed to parse kernel command
>> line, ignoring: Cannot allocate memory
>> [332885.054147] systemd-journald[6240]: Out of memory.
>> [332894.844765] systemd[1]: systemd-journald.service: Main process
>> exited, code=exited, status=1/FAILURE
>> [332897.199736] systemd[1]: systemd-journald.service: Failed with result
>> 'exit-code'.
>> [332906.503076] systemd[1]: Failed to start Journal Service.
>> [332937.909198] systemd[1]: ceph-crash.service: Main process exited,
>> code=exited, status=1/FAILURE
>> [332939.308341] systemd[1]: ceph-crash.service: Failed with result
>> 'exit-code'.
>> [332949.545907] systemd[1]: systemd-journald.service: Service has no
>> hold-off time, scheduling restart.
>> [332949.546631] systemd[1]: systemd-journald.service: Scheduled restart
>> job, restart counter is at 7.
>> [332949.546781] systemd[1]: Stopped Journal Service.
>> [332949.566402] systemd[1]: Starting Journal Service...
>> [332950.190332] systemd[1]: [email protected]: Main process exited,
>> code=killed, status=6/ABRT
>> [332950.190477] systemd[1]: [email protected]: Failed with result
>> 'signal'.
>> [332950.842297] systemd-journald[6249]: File
>> /var/log/journal/8f2559099bf54865adc95e5340d04447/system.journal corrupted
>> or uncleanly shut down, renaming and replacing.
>> [332951.019531] systemd[1]: Started Journal Service.
>>
>> On Tue, Sep 10, 2019 at 3:04 PM Amudhan P <[email protected]> wrote:
>>
>> Hi,
>>
>> I am using Ceph version 13.2.6 (Mimic) on a test setup, trying out CephFS.
>>
>> My current setup:
>> 3 nodes; one node contains two bricks and the other two nodes contain a
>> single brick each.
>>
>> The volume is 3-way replicated, and I am trying to simulate a node failure.
>>
>> I powered down one host and started getting the message
>> "-bash: fork: Cannot allocate memory" on the other systems whenever I ran
>> any command, and the systems stopped responding.
>>
>> What could be the reason for this?
>> At this stage, I can still read some of the data stored in the volume,
>> while other reads are just waiting on IO.
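>>
>> A side note for this kind of planned test: setting the noout flag before
>> powering a host down, and clearing it afterwards, keeps Ceph from marking
>> the stopped OSDs out and triggering extra recovery while the node is away.
>> A minimal sketch:
>>
>> sudo ceph osd set noout
>> # ... power the host down, run the test, power it back up ...
>> sudo ceph osd unset noout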
>>
>> Output from "sudo ceph -s":
>>   cluster:
>>     id:     7c138e13-7b98-4309-b591-d4091a1742b4
>>     health: HEALTH_WARN
>>             1 osds down
>>             2 hosts (3 osds) down
>>             Degraded data redundancy: 5313488/7970232 objects degraded
>> (66.667%), 64 pgs degraded
>>
>>   services:
>>     mon: 1 daemons, quorum mon01
>>     mgr: mon01(active)
>>     mds: cephfs-tst-1/1/1 up  {0=mon01=up:active}
>>     osd: 4 osds: 1 up, 2 in
>>
>>   data:
>>     pools:   2 pools, 64 pgs
>>     objects: 2.66 M objects, 206 GiB
>>     usage:   421 GiB used, 3.2 TiB / 3.6 TiB avail
>>     pgs:     5313488/7970232 objects degraded (66.667%)
>>              64 active+undersized+degraded
>>
>>   io:
>>     client:   79 MiB/s rd, 24 op/s rd, 0 op/s wr
>>
>> Output from "sudo ceph osd df":
>> ID CLASS WEIGHT  REWEIGHT SIZE    USE     AVAIL   %USE  VAR  PGS
>>  0   hdd 1.81940        0     0 B     0 B     0 B     0    0   0
>>  3   hdd 1.81940        0     0 B     0 B     0 B     0    0   0
>>  1   hdd 1.81940  1.00000 1.8 TiB 211 GiB 1.6 TiB 11.34 1.00   0
>>  2   hdd 1.81940  1.00000 1.8 TiB 210 GiB 1.6 TiB 11.28 1.00  64
>>                     TOTAL 3.6 TiB 421 GiB 3.2 TiB 11.31
>> MIN/MAX VAR: 1.00/1.00  STDDEV: 0.03
>>
>> regards
>> Amudhan
>>
>>
_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
