Re: [ceph-users] Ceph inside Docker containers inside VirtualBox

2019-04-18 Thread Siegfried Höllrigl

Hi !

I am not 100% sure, but I think --net=host does not propagate /dev/
inside the container.


From the error message:

2019-04-18 07:30:06  /opt/ceph-container/bin/entrypoint.sh: ERROR- The
device pointed by OSD_DEVICE (/dev/vdd) doesn't exist !


I would say you should add something like --device=/dev/vdd to the
docker run command for the OSD.
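
Something like this (untested, just your OSD command from below with the
device flag added):

docker run -d --net=host --pid=host --privileged=true \
  --device=/dev/vdd \
  -v /etc/ceph:/etc/ceph -v /var/lib/ceph/:/var/lib/ceph/ -v /dev/:/dev/ \
  -e OSD_DEVICE=/dev/vdd ceph/daemon osd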


Br

On 18.04.2019 14:46, Varun Singh wrote:

Hi,
I am trying to set up Ceph through Docker inside a VM. My host machine
is a Mac. My VM is an Ubuntu 18.04. The Docker version is 18.09.5, build
e8ff056.
I am following the documentation on the ceph/daemon Docker Hub
page. The idea is that if I spawn Docker containers as described on the
page, I should get a Ceph setup without a KV store. I am not worried
about the KV store as I just want to try it out. These are the
commands I am running to bring the containers up:

Monitor:
docker run -d --net=host -v /etc/ceph:/etc/ceph -v
/var/lib/ceph/:/var/lib/ceph/ -e MON_IP=10.0.2.15 -e
CEPH_PUBLIC_NETWORK=10.0.2.0/24 ceph/daemon mon

Manager:
docker run -d --net=host -v /etc/ceph:/etc/ceph -v
/var/lib/ceph/:/var/lib/ceph/ ceph/daemon mgr

OSD:
docker run -d --net=host --pid=host --privileged=true -v
/etc/ceph:/etc/ceph -v /var/lib/ceph/:/var/lib/ceph/ -v /dev/:/dev/ -e
OSD_DEVICE=/dev/vdd ceph/daemon osd

From the above commands I am able to spawn the monitor and manager
properly. I verified this by running this command in both the monitor and
manager containers:
sudo docker exec d1ab985 ceph -s

I get the following output for both:

  cluster:
    id:     14a6e40a-8e54-4851-a881-661a84b3441c
    health: HEALTH_OK

  services:
    mon: 1 daemons, quorum serverceph-VirtualBox (age 62m)
    mgr: serverceph-VirtualBox(active, since 56m)
    osd: 0 osds: 0 up, 0 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:

However, when I try to bring up the OSD using the above command, it doesn't
work. The Docker logs show this output:
2019-04-18 07:30:06  /opt/ceph-container/bin/entrypoint.sh: static:
does not generate config
2019-04-18 07:30:06  /opt/ceph-container/bin/entrypoint.sh: ERROR- The
device pointed by OSD_DEVICE (/dev/vdd) doesn't exist !

I am not sure why the doc asks to pass /dev/vdd to the OSD_DEVICE env var.
I know there are five different ways of spawning the OSD, but I am not
able to figure out which one would be suitable for a simple
deployment. If you could please let me know how to spawn OSDs using
Docker, it would help a lot.





Re: [ceph-users] v12.2.11 Luminous released

2019-02-13 Thread Siegfried Höllrigl

Hi !

We have now successfully upgraded (from 12.2.10) to 12.2.11.

It seems to be quite stable (using RBD, CephFS and RadosGW).

Most of our OSDs are still on Filestore.

Should we set the "pglog_hardlimit" flag (given that it cannot be unset again)?

What exactly does this limit?

Are there any risks?

Are any pre-checks recommended?
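
(If we read the 12.2.11 release notes correctly, setting it would simply be

ceph osd set pglog_hardlimit

once every daemon runs 12.2.11 or later - but please correct us if that is wrong.)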

Br,




[ceph-users] Ubuntu18 and RBD Kernel Module

2018-08-28 Thread Siegfried Höllrigl

Hi !

We are running a Ceph 12.2.7 cluster and use it for RBDs.

We now have a few new servers installed with Ubuntu 18.
The default kernel version is 4.15.0.

When we create a new rbd and map/xfs-format/mount it, everything looks fine.
But if we want to map/mount an rbd that already has data in it, it takes a
very long time (>5 minutes) - sometimes to map, sometimes to mount it.

There seems to be a process taking 100% of a CPU core during that "hang":
 3103 root  20   0   0  0  0 R 100.0  0.0   0:04.65 
kworker/11:1


With the "ukuu" tool, we have tested some other kernel versions :
v4.16.18 - same behavior
v4.18.5  - same behavior

And then an older kernel :
4.4.152-0404152-generic - rbd map/mount/umount/unmap - looks fine !

In the ceph.conf the line "rbd default features = 3" is already present
(on all servers).


Is there a need to debug this further, or did we miss some parameter/feature
that needs to be set differently on newer kernels?
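
For reference, this is roughly how we would check what an existing image
actually has enabled, and what we would try if the kernel client dislikes
one of the features (pool/image names are just examples):

rbd info rbd/testimage
rbd feature disable rbd/testimage object-map fast-diff deep-flatten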

Br,




Re: [ceph-users] Ceph Luminous - OSD constantly crashing caused by corrupted placement group

2018-05-23 Thread Siegfried Höllrigl

Hi !

We have now deleted all snapshots of the pool in question.

With "ceph pg dump" we can see that pg 5.9b has a SNAPTRIMQ_LEN of 27826.

All other PGs have 0.

It looks like this value does not decrease. LAST_SCRUB and
LAST_DEEP_SCRUB are both from 2018-04-24, almost one month ago.
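
A simple way to watch the value for just this pg (in case it is useful to
anyone):

watch -n 60 'ceph pg dump pgs 2>/dev/null | grep "^5\.9b "'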



The OSD still crashes a while after we start it. OSD log:

*** Caught signal (Aborted) **

and

/build/ceph-12.2.5/src/osd/PrimaryLogPG.cc: 358: FAILED assert(p != 
recovery_info.ss.clone_snaps.end())



Any ideas how to fix this? Is there a way to "force" the snaptrim of the
pg in question? Or any other way to "clean" this pg?


We have searched a lot in the mail archives but couldn't find anything
that could help us in this case.



Br,



On 17.05.2018 00:12, Gregory Farnum wrote:
On Wed, May 16, 2018 at 6:49 AM Siegfried Höllrigl
<siegfried.hoellr...@xidras.com> wrote:


Hi Greg !

Thank you for your fast reply.

We have now deleted the PG on OSD.130 like you suggested and
started it:

ceph-s-06 # ceph-objectstore-tool --data-path
/var/lib/ceph/osd/ceph-130/ --pgid 5.9b --op remove --force
  marking collection for removal
setting '_remove' omap key
finish_remove_pgs 5.9b_head removing 5.9b
Remove successful
ceph-s-06 # systemctl start ceph-osd@130.service

The cluster recovered again until it came to PG 5.9b. Then OSD.130
crashed again. -> No change.

So we wanted to go the other way and export the PG from the primary
(healthy) OSD (OSD.19), but that fails:

root@ceph-s-03:/tmp5.9b# ceph-objectstore-tool --op export --pgid 5.9b \
  --data-path /var/lib/ceph/osd/ceph-19 --file /tmp5.9b/5.9b.export
OSD has the store locked

But we don't want to stop OSD.19 on this server because this pool has
size=3 and min_size=2.
(this would make pg 5.9b inaccessible)
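
(For completeness, the sequence that would presumably work is to stop the
OSD first, roughly:

ceph osd set noout
systemctl stop ceph-osd@19.service
ceph-objectstore-tool --op export --pgid 5.9b \
  --data-path /var/lib/ceph/osd/ceph-19 --file /tmp5.9b/5.9b.export
systemctl start ceph-osd@19.service
ceph osd unset noout

but as said, with only two copies of the PG left we do not want to take
OSD.19 down.)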


I'm a bit confused. Are you saying that
1) the ceph-objectstore-tool you pasted there successfully removed pg 
5.9b from osd.130 (as it appears), AND
2) pg 5.9b was active with one of the other nodes as primary, so all 
data remained available, AND
3) when pg 5.9b got backfilled into osd.130, osd.130 crashed again? 
(But the other OSDs kept the PG fully available, without crashing?)


That sequence of events is *deeply* confusing and I really don't 
understand how it might happen.


Sadly I don't think you can grab a PG for export without stopping the 
OSD in question.



When we query the pg, we can see a lot of "snap_trimq".
Can this be cleaned somehow, even if the pg is undersized and
degraded ?


I *think* the PG will keep trimming snapshots even if 
undersized+degraded (though I don't remember for sure), but snapshot 
trimming is often heavily throttled and I'm not aware of any way to 
specifically push one PG to the front. If you're interested in 
speeding snaptrimming up you can search the archives or check the docs 
for the appropriate config options.

-Greg




Re: [ceph-users] Ceph Luminous - OSD constantly crashing caused by corrupted placement group

2018-05-17 Thread Siegfried Höllrigl

On 17.05.2018 00:12, Gregory Farnum wrote:



I'm a bit confused. Are you saying that
1) the ceph-objectstore-tool you pasted there successfully removed pg 
5.9b from osd.130 (as it appears), AND

Yes. The process ceph-osd for osd.130 was not running in that phase.
2) pg 5.9b was active with one of the other OSDs as primary, so all 
data remained available, AND
Yes. pg 5.9b is active all of the time (on two other OSDs). I think 
OSD.19 is the primary for that pg.

"ceph pg 5.9b query" thells me :
.
    "up": [
    19,
    166
    ],
    "acting": [
    19,
    166
    ],
    "actingbackfill": [
    "19",
    "166"
    ],


3) when pg 5.9b got backfilled into osd.130, osd.130 crashed again? 
(But the other OSDs kept the PG fully available, without crashing?)

Yes.

It crashes again with the following lines in the OSD log:
    -2> 2018-05-16 11:11:59.639980 7fe812ffd700  5 -- 
10.7.2.141:6800/173031 >> 10.7.2.49:6836/3920 conn(0x5619ed76c000 :-1 
s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=24047 cs=1 l=0). rx 
osd.19 seq 24 0x5619eebd6d00 pg_backfill(progress 5.9b e 505567/505567 
lb 5:d97d84eb:::rbd_data.112913b238e1f29.0ba3:56c06) v3
    -1> 2018-05-16 11:11:59.639995 7fe812ffd700  1 -- 
10.7.2.141:6800/173031 <== osd.19 10.7.2.49:6836/3920 24  
pg_backfill(progress 5.9b e 505567/505567 lb 
5:d97d84eb:::rbd_data.112913b238e1f29.0ba3:56c06) v3  
955+0+0 (3741758263 0 0) 0x5619eebd6d00 con 0x5619ed76c000
 0> 2018-05-16 11:11:59.645952 7fe7fe7eb700 -1 
/build/ceph-12.2.5/src/osd/PrimaryLogPG.cc: In function 'virtual void 
PrimaryLogPG::on_local_recover(const hobject_t&, const 
ObjectRecoveryInfo&, ObjectContextRef, bool, ObjectStore::Transaction*)' 
thread 7fe7fe7eb700 time 2018-05-16 11:11:59.640238
/build/ceph-12.2.5/src/osd/PrimaryLogPG.cc: 358: FAILED assert(p != 
recovery_info.ss.clone_snaps.end())


 ceph version 12.2.5 (cad919881333ac92274171586c827e01f554a70a) 
luminous (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char 
const*)+0x102) [0x5619c11b1a02]
 2: (PrimaryLogPG::on_local_recover(hobject_t const&, 
ObjectRecoveryInfo const&, std::shared_ptr, bool, 
ObjectStore::Transaction*)+0xd63) [0x5619c0d1f873]
 3: (ReplicatedBackend::handle_push(pg_shard_t, PushOp const&, 
PushReplyOp*, ObjectStore::Transaction*)+0x2da) [0x5619c0eb15ca]
 4: 
(ReplicatedBackend::_do_push(boost::intrusive_ptr)+0x12e) 
[0x5619c0eb17fe]
 5: 
(ReplicatedBackend::_handle_message(boost::intrusive_ptr)+0x2c1) 
[0x5619c0ec0d71]
 6: (PGBackend::handle_message(boost::intrusive_ptr)+0x50) 
[0x5619c0dcc440]
 7: (PrimaryLogPG::do_request(boost::intrusive_ptr&, 
ThreadPool::TPHandle&)+0x543) [0x5619c0d30853]
 8: (OSD::dequeue_op(boost::intrusive_ptr, 
boost::intrusive_ptr, ThreadPool::TPHandle&)+0x3a9) 
[0x5619c0ba7539]
 9: (PGQueueable::RunVis::operator()(boost::intrusive_ptr 
const&)+0x57) [0x5619c0e50f37]
 10: (OSD::ShardedOpWQ::_process(unsigned int, 
ceph::heartbeat_handle_d*)+0x1047) [0x5619c0bd5847]
 11: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x884) 
[0x5619c11b67f4]

 12: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x5619c11b9830]
 13: (()+0x76ba) [0x7fe8173746ba]
 14: (clone()+0x6d) [0x7fe8163eb41d]
 NOTE: a copy of the executable, or `objdump -rdS ` is 
needed to interpret this.




That sequence of events is *deeply* confusing and I really don't 
understand how it might happen.


Sadly I don't think you can grab a PG for export without stopping the 
OSD in question.



When we query the pg, we can see a lot of "snap_trimq".
Can this be cleaned somehow, even if the pg is undersized and
degraded ?


I *think* the PG will keep trimming snapshots even if 
undersized+degraded (though I don't remember for sure), but snapshot 
trimming is often heavily throttled and I'm not aware of any way to 
specifically push one PG to the front. If you're interested in 
speeding snaptrimming up you can search the archives or check the docs 
for the appropriate config options.

-Greg


Ok. I think we should try that next.
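
For the archives, the knobs we found so far (Luminous option names - please
correct us if these are the wrong ones) are osd_pg_max_concurrent_snap_trims
and osd_snap_trim_priority, e.g.:

ceph tell 'osd.*' injectargs '--osd_pg_max_concurrent_snap_trims 4 --osd_snap_trim_priority 10'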

Thank you !







[ceph-users] Ceph Luminous - OSD constantly crashing caused by corrupted placement group

2018-05-15 Thread Siegfried Höllrigl



Hi !

We have upgraded our Ceph cluster (3 mon servers, 9 OSD servers, 190
OSDs total) from 10.2.10 to Ceph 12.2.4 and then to 12.2.5.
(A mixture of Ubuntu 14 and 16 with the repos from
https://download.ceph.com/debian-luminous/)


Now we have the problem that one OSD is crashing again and again
(approx. once per day). systemd restarts it.


We could now probably identify the problem. It looks like one placement
group (5.9b) causes the crash.
It seems like it doesn't matter whether it is running on a Filestore or a
Bluestore OSD.

We could even break it down to some RBDs that were in this pool.
They are already deleted, but it looks like there are some objects left on
the OSD, and we can't delete them:



rados -p rbd ls > radosrbdls.txt
cat radosrbdls.txt | grep -vE "($(rados -p rbd ls | grep rbd_header |
grep -o "\.[0-9a-f]*" | sed -e :a -e '$!N; s/\n/|/; ta' -e
's/\./\\./g'))" | grep -E '(rbd_data|journal|rbd_object_map)'

rbd_data.112913b238e1f29.0e3f
rbd_data.112913b238e1f29.09d2
rbd_data.112913b238e1f29.0ba3

rados -p rbd rm rbd_data.112913b238e1f29.0e3f
error removing rbd>rbd_data.112913b238e1f29.0e3f: (2) No 
such file or directory

rados -p rbd rm rbd_data.112913b238e1f29.09d2
error removing rbd>rbd_data.112913b238e1f29.09d2: (2) No 
such file or directory

rados -p rbd rm rbd_data.112913b238e1f29.0ba3
error removing rbd>rbd_data.112913b238e1f29.0ba3: (2) No 
such file or directory


In the "current" directory of the osd there are a lot more files with 
this rbd prefix.
Is there any chance to delete these obviously orpahed stuff before the 
pg becomes healthy ?

(it is currently running on only 2 of 3 OSDs)
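
One idea we had, but have not dared to try yet: with the affected OSD
stopped, something like

ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<id> --pgid 5.9b --op list | grep 112913b238e1f29

should at least list the leftover objects with that prefix (<id> being the
OSD in question). Whether removing them one by one with
ceph-objectstore-tool would be safe here, we don't know.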

What else could cause such a crash?


We attach (hopefully) all of the relevant logs.



  -103> 2018-05-14 13:01:50.514850 7f389894c700  5 -- 10.7.2.141:6801/139719 >> 
10.7.2.49:0/2866 conn(0x55a13fd0d000 :6801 
s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=453 cs=1 l=1). rx osd.60 seq 
2720 0x55a13e7bac00 osd_ping(ping e502962 stamp 2018-05-14 13:01:50.511610) v4
  -102> 2018-05-14 13:01:50.514878 7f389894c700  1 -- 10.7.2.141:6801/139719 
<== osd.60 10.7.2.49:0/2866 2720  osd_ping(ping e502962 stamp 2018-05-14 
13:01:50.511610) v4  2004+0+0 (1134770966 0 0) 0x55a13e7bac00 con 
0x55a13fd0d000
  -101> 2018-05-14 13:01:50.514896 7f389894c700  1 -- 10.7.2.141:6801/139719 
--> 10.7.2.49:0/2866 -- osd_ping(ping_reply e502962 stamp 2018-05-14 
13:01:50.511610) v4 -- 0x55a13fd27200 con 0
  -100> 2018-05-14 13:01:50.525876 7f389894c700  5 -- 10.7.2.141:6801/139719 >> 
10.7.2.144:0/2988 conn(0x55a13f2dd000 :6801 
s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=865 cs=1 l=1). rx osd.179 seq 
2652 0x55a13e442600 osd_ping(ping e502962 stamp 2018-05-14 13:01:50.531899) v4
   -99> 2018-05-14 13:01:50.525902 7f389894c700  1 -- 10.7.2.141:6801/139719 
<== osd.179 10.7.2.144:0/2988 2652  osd_ping(ping e502962 stamp 2018-05-14 
13:01:50.531899) v4  2004+0+0 (3454691771 0 0) 0x55a13e442600 con 
0x55a13f2dd000
   -98> 2018-05-14 13:01:50.525917 7f389894c700  1 -- 10.7.2.141:6801/139719 
--> 10.7.2.144:0/2988 -- osd_ping(ping_reply e502962 stamp 2018-05-14 
13:01:50.531899) v4 -- 0x55a13fd27200 con 0
   -97> 2018-05-14 13:01:50.526649 7f389914d700  5 -- 10.0.0.28:6801/139719 >> 
10.0.0.24:0/2988 conn(0x55a13f2de800 :6801 
s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=869 cs=1 l=1). rx osd.179 seq 
2652 0x55a17bd8a200 osd_ping(ping e502962 stamp 2018-05-14 13:01:50.531899) v4
   -96> 2018-05-14 13:01:50.526675 7f389914d700  1 -- 10.0.0.28:6801/139719 <== 
osd.179 10.0.0.24:0/2988 2652  osd_ping(ping e502962 stamp 2018-05-14 
13:01:50.531899) v4  2004+0+0 (3454691771 0 0) 0x55a17bd8a200 con 
0x55a13f2de800
   -95> 2018-05-14 13:01:50.526688 7f389914d700  1 -- 10.0.0.28:6801/139719 --> 
10.0.0.24:0/2988 -- osd_ping(ping_reply e502962 stamp 2018-05-14 
13:01:50.531899) v4 -- 0x55a13e43ec00 con 0
   -94> 2018-05-14 13:01:50.546508 7f389994e700  5 -- 10.7.2.141:6800/139719 >> 
10.7.2.50:6802/2519 conn(0x55a13e724000 :-1 
s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=18716 cs=1 l=0). rx osd.47 
seq 4894 0x55a13ec9d000 MOSDScrubReserve(3.111 REQUEST e502962) v1
   -93> 2018-05-14 13:01:50.546537 7f389994e700  1 -- 10.7.2.141:6800/139719 
<== osd.47 10.7.2.50:6802/2519 4894  MOSDScrubReserve(3.111 REQUEST 
e502962) v1  43+0+0 (327031511 0 0) 0x55a13ec9d000 con 0x55a13e724000
   -92> 2018-05-14 13:01:50.546655 7f3883138700  1 -- 10.7.2.141:6800/139719 
--> 10.7.2.50:6802/2519 -- MOSDScrubReserve(3.111 REJECT e502962) v1 -- 
0x55a13e8fd200 con 0
   -91> 2018-05-14 13:01:50.547685 7f389994e700  5 -- 10.7.2.141:6800/139719 >> 
10.7.2.50:6802/2519 conn(0x55a13e724000 :-1 
s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=18716 cs=1 l=0). rx osd.47 
seq 4895 0x55a13e8fd200 MOSDScrubReserve(3.111 RELEASE e502962) v1
   -90> 2018-05-14 13:01:50.547714 7f389994e700  1 --