Re: [ceph-users] Sudden RADOS Gateway issues caused by missing xattrs
On 02/16/2014 09:22 PM, Sage Weil wrote:

Hi Wido,

On Sun, 16 Feb 2014, Wido den Hollander wrote:

On 02/16/2014 06:49 PM, Gregory Farnum wrote: Did you maybe upgrade that box to v0.67.6? This sounds like one of the bugs Sage mentioned in it.

No, I checked it again. The version is: ceph version 0.67.5 (a60ac9194718083a4b6a225fc17cad6096c69bd1). All machines in the cluster are on that version.

Are you sure none of the running ceph-osd processes are 0.67.6? Maybe check 'ceph daemon osd.NNN version'...

I double-verified it again, but they are all running 0.67.5. Since for example osd.25 is down right now I can't run 'ceph daemon', but the md5sum of /usr/bin/ceph-osd is the same as on the other machines, which are all on 0.67.5. Automatic updates with Apt are not enabled, so there is no way these machines could be running 0.67.6. So I'm still confused.

Wido

sage

-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com

On Sun, Feb 16, 2014 at 4:23 AM, Wido den Hollander w...@42on.com wrote:

Hi,

Yesterday I got a notification that an RGW setup was having issues, with objects suddenly returning errors (403 and 404) when clients tried to access them. I started digging, and after cranking up the logs with 'debug rados' and 'debug rgw' set to 20 I found what caused RGW to throw an error:

    librados: Objecter returned from getxattrs r=-2

Using 'ceph osd map .rgw.buckets <object>' I found which OSDs were primary for that object's PG, and I saw that they all came from one machine which had gotten a clean shutdown and start just 24 hours before that. After taking that machine out of production the other OSDs took over and RGW started serving the objects again, but I'm confused.

The underlying filesystem is XFS and all 6 filesystems were clean and healthy. Like I said, the machine only got a clean shutdown 24 hours before that due to a physical migration, but that's all. Did anybody see this before? Suddenly the xattrs for those objects were gone.

This was with Ceph 0.67.5.

--
Wido den Hollander
42on B.V.
Phone: +31 (0)20 700 9902
Skype: contact42on
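For anyone chasing the same getxattrs r=-2 (ENOENT) symptom, the mapping and on-disk check described above look roughly like this; a sketch only, with an illustrative object name, OSD id, and filestore path layout:

    # find the PG and acting OSDs for a suspect RGW object
    ceph osd map .rgw.buckets default.4567.1_myobject

    # on the primary OSD's host, locate the object file inside the PG directory
    find /var/lib/ceph/osd/ceph-25/current/ -name '*myobject*'

    # dump its xattrs; a healthy filestore replica should show user.ceph._ and friends
    getfattr -d -m '.*' /var/lib/ceph/osd/ceph-25/current/<pg>_head/<object-file>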
Re: [ceph-users] Block Devices and OpenStack
Hi,

Can I see your ceph.conf? I suspect that the [client.cinder] and [client.glance] sections are missing.

Cheers.

Sébastien Han
Cloud Engineer

"Always give 100%. Unless you're giving blood."

Phone: +33 (0)1 49 70 99 72
Mail: sebastien@enovance.com
Address: 10, rue de la Victoire - 75009 Paris
Web: www.enovance.com - Twitter: @enovance

On 16 Feb 2014, at 06:55, Ashish Chandra mail.ashishchan...@gmail.com wrote:

Hi Jean, here is the output of ceph auth list for client.cinder:

    client.cinder
        key: AQCKaP9ScNgiMBAAwWjFnyL69rBfMzQRSHOfoQ==
        caps: [mon] allow r
        caps: [osd] allow class-read object_prefix rbd_children, allow rwx pool=volumes, allow rx pool=images

Here is the output of ceph -s:

    ashish@ceph-client:~$ ceph -s
        cluster afa13fcd-f662-4778-8389-85047645d034
        health HEALTH_OK
        monmap e1: 1 mons at {ceph-node1=10.0.1.11:6789/0}, election epoch 1, quorum 0 ceph-node1
        osdmap e37: 3 osds: 3 up, 3 in
        pgmap v84: 576 pgs, 6 pools, 0 bytes data, 0 objects
        106 MB used, 9076 MB / 9182 MB avail
        576 active+clean

I created all the keyrings and copied them as suggested by the guide.

On Sun, Feb 16, 2014 at 3:08 AM, Jean-Charles LOPEZ jc.lo...@inktank.com wrote:

Hi,

What do you get when you run a 'ceph auth list' command for the user name (client.cinder) you created for cinder? Are the caps and the key for this user correct? No typo in the hostname in the cinder.conf file (host=)? Did you copy the keyring to the node running cinder (I can't really tell from your output, and there is no 'ceph -s' output to check the monitor names)? It could just be a typo in the ceph auth get-or-create command that's causing it.

Rgds
JC

On Feb 15, 2014, at 10:35, Ashish Chandra mail.ashishchan...@gmail.com wrote:

Hi Cephers,

I am trying to configure ceph rbd as the backend for cinder and glance by following the steps mentioned in http://ceph.com/docs/master/rbd/rbd-openstack/

Before I start, all OpenStack services are running normally and ceph cluster health shows HEALTH_OK. But once I am done with all the steps and restart the OpenStack services, cinder-volume fails to start and throws an error.
    2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd Traceback (most recent call last):
    2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd   File "/opt/stack/cinder/cinder/volume/drivers/rbd.py", line 262, in check_for_setup_error
    2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd     with RADOSClient(self):
    2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd   File "/opt/stack/cinder/cinder/volume/drivers/rbd.py", line 234, in __init__
    2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd     self.cluster, self.ioctx = driver._connect_to_rados(pool)
    2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd   File "/opt/stack/cinder/cinder/volume/drivers/rbd.py", line 282, in _connect_to_rados
    2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd     client.connect()
    2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd   File "/usr/lib/python2.7/dist-packages/rados.py", line 185, in connect
    2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd     raise make_ex(ret, "error calling connect")
    2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd Error: error calling connect: error code 95
    2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd
    2014-02-16 00:01:42.591 ERROR cinder.volume.manager [req-8134a4d7-53f8-4ada-b4b5-4d96d7cad4bc None None] Error encountered during initialization of driver: RBDDriver
    2014-02-16 00:01:42.592 ERROR cinder.volume.manager [req-8134a4d7-53f8-4ada-b4b5-4d96d7cad4bc None None] Bad or unexpected response from the storage volume backend API: error connecting to ceph cluster
    2014-02-16 00:01:42.592 TRACE cinder.volume.manager Traceback (most recent call last):
    2014-02-16 00:01:42.592 TRACE cinder.volume.manager   File "/opt/stack/cinder/cinder/volume/manager.py", line 190, in init_host
    2014-02-16 00:01:42.592 TRACE cinder.volume.manager     self.driver.check_for_setup_error()
    2014-02-16 00:01:42.592 TRACE cinder.volume.manager   File "/opt/stack/cinder/cinder/volume/drivers/rbd.py", line 267, in check_for_setup_error
    2014-02-16 00:01:42.592 TRACE cinder.volume.manager     raise exception.VolumeBackendAPIException(data=msg)
    2014-02-16 00:01:42.592 TRACE cinder.volume.manager VolumeBackendAPIException: Bad or unexpected response from the storage volume backend API: error connecting to ceph cluster

Here is the content of my /etc/ceph on the openstack node:

    ashish@ubuntu:/etc/ceph$ ls -lrt
    total 16
    -rw-r--r-- 1 cinder cinder 229 Feb 15 23:45 ceph.conf
    -rw-r--r-- 1 glance glance  65 Feb 15 23:46 ceph.client.glance.keyring
    -rw-r--r-- 1 cinder cinder  65 Feb 15 23:47 ceph.client.cinder.keyring
    -rw-r--r-- 1 cinder cinder  72 Feb 15 23:47
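A quick way to check whether the cinder credentials themselves can reach the cluster (error code 95 is EOPNOTSUPP, returned by the connect call) is to talk to RADOS directly as client.cinder; a sketch, assuming the keyring path shown in the listing above:

    rados -p volumes ls --id cinder --keyring /etc/ceph/ceph.client.cinder.keyring

If that also fails, the problem is in the ceph.conf/keyring on the OpenStack node rather than in cinder itself.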
[ceph-users] Journal thoughts
Hi All,

I've been looking, but haven't been able to find any detailed documentation about journal usage on OSDs. Does anyone have any detailed docs they could share? My initial questions are:

Is the journal always write-only (except under recovery)?

I'm using BTRFS in the default layout, which I think is very inefficient, as it basically forces the discs to seek all the time (journal partition at the start of the disc). Is there a documented process to relocate the journal without re-creating the OSD? What have other people done to optimize the journal without purchasing SSDs?

On another point, I'm running on HP Microservers (slow CPU - two cores) with 5 discs: 1x OS, 4x OSD. I currently have separate OSDs, but high load due to having more OSDs than cores in the box. I'm thinking of JBOD'ing the OSD discs into pairs using LVM (different sized disks) so I have only two OSDs; does anyone have any opinions on the merits of this? Also, has anyone seen any CPU usage comparisons of XFS vs EXT4 vs BTRFS?

Obviously I know I'm running an enterprise system on a shoestring, but I'm keen to use this as a test bed to get comfortable with ceph before recommending it in a real production environment, and I think optimizing and understanding it here could have great benefits when I scale out.

Lots of questions, and as ever any insight would be appreciated on any of the points!

Regards,
Alex
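On relocating the journal without rebuilding the OSD: the usual sequence is to flush the journal, repoint it, and recreate it. A sketch, assuming OSD id 3 and a new journal device /dev/sda5 (both illustrative):

    # stop the OSD, then flush any uncommitted journal entries to the object store
    service ceph stop osd.3
    ceph-osd -i 3 --flush-journal

    # repoint the journal: either a symlink in the OSD data dir...
    ln -sf /dev/sda5 /var/lib/ceph/osd/ceph-3/journal
    # ...or an 'osd journal = /path' entry in the [osd.3] section of ceph.conf

    # create the new journal and restart
    ceph-osd -i 3 --mkjournal
    service ceph start osd.3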
[ceph-users] Error Adding Keyring Entries
Could someone help me with the following error when I try to add keyring entries?

    # ceph -k /etc/ceph/ceph.client.admin.keyring auth add client.radosgw.gateway -i /etc/ceph/keyring.radosgw.gateway
    Error EINVAL: entity client.radosgw.gateway exists but key does not match
    #

Best,
G.
Re: [ceph-users] ReAsk: how to tell ceph-mon to listen on a specific address only
Hi,

That's by design. The monitor always listens on the public side only, if a public network is defined. If you want everything on the cluster network, just don't specify a separate public/cluster network. But that's all documented in great detail at http://ceph.com/docs/master/rados/configuration/network-config-ref/

Best regards,
Kurt

Ron Gage <r...@rongage.org> wrote on 16 February 2014, 19:57:

Hi everyone:

I am still trying unsuccessfully to implement a test array for a POC. It is still failing to set up - specifically, the admin keyring is not getting set up. The setup is 4x OSD, 1x Mon/Mgr. The Mon machine is the only one that is multi-homed - eth0 on a private subnet for internal Ceph communications and eth1 on a so-called public interface. The problem is that ceph-mon is creating the listener on the public interface while ceph-deploy is trying to talk to it on the private interface.

    [ceph@cm my-cluster]$ sudo netstat -ln
    Active Internet connections (only servers)
    Proto Recv-Q Send-Q Local Address          Foreign Address  State
    tcp        0      0 0.0.0.0:22             0.0.0.0:*        LISTEN
    tcp        0      0 172.24.12.91:6789      0.0.0.0:*        LISTEN
    tcp        0      0 :::22                  :::*             LISTEN
    udp        0      0 0.0.0.0:68             0.0.0.0:*
    Active UNIX domain sockets (only servers)
    Proto RefCnt Flags   Type   State      I-Node Path
    unix  2      [ ACC ] STREAM LISTENING  6706   @/com/ubuntu/upstart
    unix  2      [ ACC ] STREAM LISTENING  20490  /var/run/ceph/ceph-mon.cm.asok

    [ceph@cm my-cluster]$ cat ceph.conf
    [global]
    auth_service_required = cephx
    filestore_xattr_use_omap = true
    auth_client_required = cephx
    auth_cluster_required = cephx
    mon_host = 10.0.0.6
    mon_initial_members = cm
    fsid = a7e0fd33-1f75-46f0-be00-152601c8fbf2
    mon host = 10.0.0.6
    cluster network = 10.0.0.0/24
    public network = 172.24.0.0/16
    debug mon = 10
    debug ms = 1

    [mon.a]
    host = cm
    mon addr = 10.0.0.6:6789
    [ceph@cm my-cluster]$

How can I fix this? Thanks!

Ron
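Concretely, Kurt's point means the monitor addresses in ceph.conf have to agree with where the monitor will actually bind. One way to reconcile the config above, assuming ceph-deploy is run from the 172.24.0.0/16 side (a sketch, not a drop-in file):

    [global]
    # point clients and ceph-deploy at the address the mon really listens on
    mon_host = 172.24.12.91
    public network = 172.24.0.0/16
    cluster network = 10.0.0.0/24   # OSD replication traffic stays private

    [mon.a]
    host = cm
    mon addr = 172.24.12.91:6789

Alternatively, drop the public/cluster split entirely and keep everything on 10.0.0.0/24.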
Re: [ceph-users] Error Adding Keyring Entries
I managed to solve my problem by deleting the key from the list and re-adding it!

Best,
G.

On Mon, 17 Feb 2014 10:46:36 +0200, Georgios Dimitrakakis wrote:

Could someone help me with the following error when I try to add keyring entries?

    # ceph -k /etc/ceph/ceph.client.admin.keyring auth add client.radosgw.gateway -i /etc/ceph/keyring.radosgw.gateway
    Error EINVAL: entity client.radosgw.gateway exists but key does not match

Best,
G.
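For the archives, the delete-and-re-add sequence described here would look something like this (a sketch of the implied commands):

    # remove the stale entity whose key no longer matches the keyring file
    ceph auth del client.radosgw.gateway

    # re-add it from the keyring file so cluster and file agree
    ceph -k /etc/ceph/ceph.client.admin.keyring auth add client.radosgw.gateway -i /etc/ceph/keyring.radosgw.gateway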
Re: [ceph-users] Problem starting RADOS Gateway
Could someone check this: http://pastebin.com/DsCh5YPm and let me know what I am doing wrong?

Best,
G.

On Sat, 15 Feb 2014 20:27:16 +0200, Georgios Dimitrakakis wrote:

1) ceph -s is working as expected:

    # ceph -s
        cluster c465bdb2-e0a5-49c8-8305-efb4234ac88a
        health HEALTH_OK
        monmap e1: 1 mons at {master=192.168.0.10:6789/0}, election epoch 1, quorum 0 master
        mdsmap e111: 1/1/1 up {0=master=up:active}
        osdmap e114: 2 osds: 2 up, 2 in
        pgmap v414: 1200 pgs, 14 pools, 10596 bytes data, 67 objects
        500 GB used, 1134 GB / 1722 GB avail
        1200 active+clean

2) In /etc/ceph I have the following files:

    # ls -l
    total 20
    -rw-r--r-- 1 root root  64 Feb 14 17:10 ceph.client.admin.keyring
    -rw-r--r-- 1 root root 401 Feb 15 16:57 ceph.conf
    -rw-r--r-- 1 root root 196 Feb 14 20:26 ceph.log
    -rw-r--r-- 1 root root 120 Feb 15 11:08 keyring.radosgw.gateway
    -rwxr-xr-x 1 root root  92 Dec 21 00:47 rbdmap

3) ceph.conf content is the following:

    # cat ceph.conf
    [global]
    auth_service_required = cephx
    filestore_xattr_use_omap = true
    auth_client_required = cephx
    auth_cluster_required = cephx
    mon_host = 192.168.0.10
    mon_initial_members = master
    fsid = c465bdb2-e0a5-49c8-8305-efb4234ac88a

    [client.radosgw.gateway]
    host = master
    keyring = /etc/ceph/keyring.radosgw.gateway
    rgw socket path = /tmp/radosgw.sock
    log file = /var/log/ceph/radosgw.log

4) And all the keys that exist are the following:

    # ceph auth list
    installed auth entries:
    mds.master
        key: xx==
        caps: [mds] allow
        caps: [mon] allow profile mds
        caps: [osd] allow rwx
    osd.0
        key: xx==
        caps: [mon] allow profile osd
        caps: [osd] allow *
    osd.1
        key: xx==
        caps: [mon] allow profile osd
        caps: [osd] allow *
    client.admin
        key: xx==
        caps: [mds] allow
        caps: [mon] allow *
        caps: [osd] allow *
    client.bootstrap-mds
        key: xx==
        caps: [mon] allow profile bootstrap-mds
    client.bootstrap-osd
        key: AQBWLf5SGBAyBRAAzLwi5OXsAuR5vdo8hs+2zw==
        caps: [mon] allow profile bootstrap-osd
    client.radosgw.gateway
        key: xx==
        caps: [mon] allow rw
        caps: [osd] allow rwx

I still don't get what is wrong...

G.

On Sat, 15 Feb 2014 16:27:41 +0100, Udo Lembke wrote:

Hi,

Does ceph -s also get stuck on a missing keyring? Do you have a keyring like:

    cat /etc/ceph/keyring
    [client.admin]
    key = AQCdkHZR2NBYMBAATe/rqIwCI96LTuyS3gmMXp==

Or do you have another keyring defined in ceph.conf (global section: keyring = /etc/ceph/keyring)?

The key is in ceph - see:

    ceph auth get-key client.admin
    AQCdkHZR2NBYMBAATe/rqIwCI96LTuyS3gmMXp==

or 'ceph auth list' for all keys.

Key generation is done with get-or-create-key like this (but in this case for bootstrap-osd):

    ceph auth get-or-create-key client.bootstrap-osd mon 'allow profile bootstrap-osd'

Udo

On 15.02.2014 15:35, Georgios Dimitrakakis wrote:

Dear all,

I am following this guide http://ceph.com/docs/master/radosgw/config/ to set up Object Storage on CentOS 6.5. My problem is that when I try to start the service as indicated here: http://ceph.com/docs/master/radosgw/config/#restart-services-and-start-the-gateway I get nothing:

    # service ceph-radosgw start
    Starting radosgw instance(s)...

and if I check whether the service is running, obviously it is not:

    # service ceph-radosgw status
    /usr/bin/radosgw is not running.
If I try to start it manually without using the service command I get the following:

    # /usr/bin/radosgw -d -c /etc/ceph/ceph.conf --debug_ms 10
    2014-02-15 16:03:38.709235 7fb65ba64820  0 ceph version 0.72.2 (a913ded2ff138aefb8cb84d347d72164099cfd60), process radosgw, pid 24619
    2014-02-15 16:03:38.709249 7fb65ba64820 -1 WARNING: libcurl doesn't support curl_multi_wait()
    2014-02-15 16:03:38.709252 7fb65ba64820 -1 WARNING: cross zone / region transfer performance may be affected
    2014-02-15 16:03:38.713898 7fb65ba64820 10 -- :/0 ready :/0
    2014-02-15 16:03:38.714323 7fb65ba64820  1 -- :/0 messenger.start
    2014-02-15 16:03:38.714434 7fb65ba64820 -1 monclient(hunting): ERROR: missing keyring, cannot use cephx for authentication
    2014-02-15 16:03:38.714440 7fb65ba64820  0 librados: client.admin initialization error (2) No such file or directory
    2014-02-15 16:03:38.714463 7fb65ba64820 10 -- :/1024619 shutdown :/1024619
    2014-02-15 16:03:38.714468 7fb65ba64820  1 -- :/1024619 mark_down_all
    2014-02-15 16:03:38.714477 7fb65ba64820 10 -- :/1024619 wait: waiting for dispatch queue
    2014-02-15 16:03:38.714406 7fb64b5fe700 10 -- :/1024619 reaper_entry start
    2014-02-15 16:03:38.714506 7fb64b5fe700 10 -- :/1024619 reaper
    2014-02-15 16:03:38.714522 7fb64b5fe700 10 -- :/1024619 reaper done
    2014-02-15 16:03:38.714764 7fb65ba64820 10 --
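Note that the manual invocation above connects as client.admin (hence the "client.admin initialization error"), while the init script runs the gateway under its own identity. To reproduce what the service does, it may help to name the gateway user explicitly; a sketch:

    /usr/bin/radosgw -d -c /etc/ceph/ceph.conf -n client.radosgw.gateway --debug_ms 10

With -n, radosgw should pick up the 'keyring = /etc/ceph/keyring.radosgw.gateway' line from the [client.radosgw.gateway] section instead of hunting for an admin keyring.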
[ceph-users] Important note for sender (was: ceph-users Digest, Vol 13, Issue 14)
Dear sender,

If you wish me to read and respond to this e-mail reliably, please build the subject like KUDRYAVTSEV/Who wrote/Subject; for example, KUDRYAVTSEV/Bitworks/Some subject there...

Best wishes,
Ivan Kudryavtsev
[ceph-users] eu.ceph.com rsync issues resolved
Hi all,

I just noticed that eu.ceph.com had some stale data since rsync wasn't running with the --delete option. I've just added it to the sync script and it's syncing right now; it shouldn't take that much time and should finish within the hour.

Btw, nice to see that ceph.com now also has an AAAA record, which means that both locations are available over IPv6 :-)

--
Wido den Hollander
42on B.V.
Phone: +31 (0)20 700 9902
Skype: contact42on
Re: [ceph-users] slow requests from rados bench with small writes
On 02/16/2014 05:18 PM, Sage Weil wrote:

Good catch! It sounds like what is needed here is for the deb and rpm packages to add /var/lib/ceph to the PRUNEPATHS in /etc/updatedb.conf. Unfortunately there isn't an /etc/updatedb.conf.d type mechanism, so that promises to be annoying. Has anyone done this before?

No, I haven't, but I've seen this before. With Puppet I also overwrite this file. Btw, I suggest we also contact Canonical to add 'ceph' to PRUNEFS, otherwise clients will start indexing CephFS filesystems later.

Wido

sage

On Sun, 16 Feb 2014, Dan van der Ster wrote:

After some further digging I realized that updatedb was running over the PGs, indexing all the objects. (According to iostat, updatedb was keeping the indexed disk 100% busy!) Oops!

Since the disks are using the deadline elevator (which by default prioritizes reads over writes, and gives writes a deadline of 5 seconds!), it is perhaps conceivable (yet still surprising) that the queues on a few disks were so full of reads that the writes were starved for many tens of seconds. I've killed updatedb everywhere and now the rados bench below isn't triggering slow requests.

So now I'm planning to tune deadline so it doesn't prioritize reads so much, namely by decreasing write_expire to equal read_expire at 500ms, and setting writes_starved to 1. Initial tests show that this further decreases latency a bit -- my hope is that this will eliminate the possibility of a very long tail of writes. I hope that someone will chip in if they've already been down this path and has advice/warnings.

Cheers, dan

-- Dan van der Ster || Data Storage Services || CERN IT Department --

On Sat, Feb 15, 2014 at 11:48 PM, Dan van der Ster daniel.vanders...@cern.ch wrote:

Dear Ceph experts,

We've found that a single client running rados bench can drive other users, e.g. RBD users, into slow requests. Starting with a cluster that is not particularly busy, e.g.:

    2014-02-15 23:14:33.714085 mon.0 xx:6789/0 725224 : [INF] pgmap v6561996: 27952 pgs: 27952 active+clean; 66303 GB data, 224 TB used, 2850 TB / 3075 TB avail; 4880KB/s rd, 28632KB/s wr, 271op/s

we then start a rados bench writing many small objects:

    rados bench -p test 60 write -t 500 -b 1024 --no-cleanup

which gives these results (note the 60s max latency!!):

    Total time run:         86.351424
    Total writes made:      91425
    Write size:             1024
    Bandwidth (MB/sec):     1.034
    Stddev Bandwidth:       1.26486
    Max bandwidth (MB/sec): 7.14941
    Min bandwidth (MB/sec): 0
    Average Latency:        0.464847
    Stddev Latency:         3.04961
    Max latency:            66.4363
    Min latency:            0.003188

30 seconds into this bench we start seeing slow requests, not only from bench writes but also from some poor RBD clients, e.g.:

    2014-02-15 23:16:02.820507 osd.483 xx:6804/46799 2201 : [WRN] slow request 30.195634 seconds old, received at 2014-02-15 23:15:32.624641: osd_sub_op(client.18535427.0:3922272 4.d42 4eb00d42/rbd_data.11371325138b774.6577/head//4 [] v 42083'71453 snapset=0=[]:[] snapc=0=[]) v7 currently commit sent

During a longer, many-hour instance of this small-write test, some of these RBD slow writes became very user-visible, with disk flushes being blocked long enough (120s) for the VM kernels to start complaining.

A rados bench from a 10Gig-e client writing 4MB objects doesn't have the same long tail of latency, namely:

    # rados bench -p test 60 write -t 500 --no-cleanup
    ...
    Total time run:         62.811466
    Total writes made:      8553
    Write size:             4194304
    Bandwidth (MB/sec):     544.678
    Stddev Bandwidth:       173.163
    Max bandwidth (MB/sec): 1000
    Min bandwidth (MB/sec): 0
    Average Latency:        3.50719
    Stddev Latency:         0.309876
    Max latency:            8.04493
    Min latency:            0.166138

and there are zero slow requests, at least during this 60s duration.

While the vast majority of small writes are completing with a reasonable sub-second latency, what is causing the very long tail seen by a few writes? -- 60-120s!! Can someone advise us where to look in the perf dump, etc... to find which resource/queue is being exhausted during these tests?

Oh yeah, we're running the latest dumpling stable, 0.67.5, on the servers.

Best Regards, Thanks in advance!
Dan

-- Dan van der Ster || Data Storage Services || CERN IT Department --

--
Wido den Hollander
42on B.V.
Phone: +31 (0)20 700 9902
Skype: contact42on
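For anyone wanting to apply both fixes discussed above, the changes would look roughly like this; a sketch assuming stock Ubuntu paths and that sdb is an OSD data disk (both illustrative):

    # /etc/updatedb.conf: append to the existing lists so locate skips OSD data and CephFS
    PRUNEFS="<existing entries> ceph"
    PRUNEPATHS="<existing entries> /var/lib/ceph"

    # deadline tuning Dan describes: stop favouring reads so heavily (per data disk)
    echo 500 > /sys/block/sdb/queue/iosched/write_expire   # default 5000 ms
    echo 1   > /sys/block/sdb/queue/iosched/writes_starved # default 2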
[ceph-users] Unable to add monitor nodes
Hi,

I am a new user of ceph. I have installed a three-node cluster following the ceph documentation. I have added OSDs and the initial monitor, but while adding additional monitors I am receiving the error shown below.

    user1@cephadmin:~/my-cluster$ ceph-deploy mon create cephnode2
    [ceph_deploy.cli][INFO ] Invoked (1.3.5): /usr/bin/ceph-deploy mon create cephnode2
    [ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts cephnode2
    [ceph_deploy.mon][DEBUG ] detecting platform for host cephnode2 ...
    [cephnode2][DEBUG ] connected to host: cephnode2
    [cephnode2][DEBUG ] detect platform information from remote host
    [cephnode2][DEBUG ] detect machine type
    [ceph_deploy.mon][INFO ] distro info: Ubuntu 12.04 precise
    [cephnode2][DEBUG ] determining if provided host has same hostname in remote
    [cephnode2][DEBUG ] get remote short hostname
    [cephnode2][DEBUG ] deploying mon to cephnode2
    [cephnode2][DEBUG ] get remote short hostname
    [cephnode2][DEBUG ] remote hostname: cephnode2
    [cephnode2][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
    [cephnode2][DEBUG ] create the mon path if it does not exist
    [cephnode2][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-cephnode2/done
    [cephnode2][DEBUG ] done path does not exist: /var/lib/ceph/mon/ceph-cephnode2/done
    [cephnode2][INFO ] creating keyring file: /var/lib/ceph/tmp/ceph-cephnode2.mon.keyring
    [cephnode2][DEBUG ] create the monitor keyring file
    [cephnode2][INFO ] Running command: sudo ceph-mon --cluster ceph --mkfs -i cephnode2 --keyring /var/lib/ceph/tmp/ceph-cephnode2.mon.keyring
    [cephnode2][DEBUG ] ceph-mon: set fsid to b3d4e423-25a2-4380-8595-8b3fae4f8806
    [cephnode2][DEBUG ] ceph-mon: created monfs at /var/lib/ceph/mon/ceph-cephnode2 for mon.cephnode2
    [cephnode2][INFO ] unlinking keyring file /var/lib/ceph/tmp/ceph-cephnode2.mon.keyring
    [cephnode2][DEBUG ] create a done file to avoid re-doing the mon deployment
    [cephnode2][DEBUG ] create the init path if it does not exist
    [cephnode2][DEBUG ] locating the `service` executable...
    [cephnode2][INFO ] Running command: sudo initctl emit ceph-mon cluster=ceph id=cephnode2
    [cephnode2][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.cephnode2.asok mon_status
    [cephnode2][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
    [cephnode2][WARNIN] monitor: mon.cephnode2, might not be running yet
    [cephnode2][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.cephnode2.asok mon_status
    [cephnode2][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
    [cephnode2][WARNIN] cephnode2 is not defined in `mon initial members`
    [cephnode2][WARNIN] monitor cephnode2 does not exist in monmap
    [cephnode2][WARNIN] neither `public_addr` nor `public_network` keys are defined for monitors
    [cephnode2][WARNIN] monitors may not be able to form quorum

What is this error about?

Thanks,
Kumar
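The last three warnings point at the usual cause: a monitor added after the initial deployment needs a public network (or public addr) defined so it can pick an address and join the monmap. A sketch of one fix, assuming the monitors live on 10.0.1.0/24 (illustrative):

    # on the admin node, add to ceph.conf under [global]
    public network = 10.0.1.0/24

    # push the updated conf and retry
    ceph-deploy --overwrite-conf config push cephnode2
    ceph-deploy mon create cephnode2   # newer ceph-deploy releases use 'mon add' for extra monitors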
Re: [ceph-users] Unable to add monitor nodes
Re: [ceph-users] Block Devices and OpenStack
Hi Sebastian, Jean;

This is what my ceph.conf looks like; it was auto-generated using ceph-deploy:

    [global]
    fsid = afa13fcd-f662-4778-8389-85047645d034
    mon_initial_members = ceph-node1
    mon_host = 10.0.1.11
    auth_cluster_required = cephx
    auth_service_required = cephx
    auth_client_required = cephx
    filestore_xattr_use_omap = true

If I provide the admin.keyring file to the OpenStack node (in /etc/ceph) it works fine and the issue is gone.

Thanks,
Ashish

On Mon, Feb 17, 2014 at 2:03 PM, Sebastien Han sebastien@enovance.com wrote:

Hi, can I see your ceph.conf? I suspect that the [client.cinder] and [client.glance] sections are missing. [...]
Re: [ceph-users] Block Devices and OpenStack
Hi,

If cinder-volume fails to connect but putting the admin keyring in place works, it means that cinder is not configured properly. Please also try to add the following:

    [client.cinder]
    keyring = path-to-keyring

Same for Glance. Btw: ceph.conf doesn't need to be owned by Cinder; just make it readable (chmod +r) and keep root as the owner.

Sébastien Han
Cloud Engineer

"Always give 100%. Unless you're giving blood."

Phone: +33 (0)1 49 70 99 72
Mail: sebastien@enovance.com
Address: 10, rue de la Victoire - 75009 Paris
Web: www.enovance.com - Twitter: @enovance

On 17 Feb 2014, at 14:48, Ashish Chandra mail.ashishchan...@gmail.com wrote:

Hi Sebastian, Jean; this is what my ceph.conf looks like... [...]
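Putting Sebastien's advice together, the OpenStack node would end up with something like this; a sketch with illustrative paths, and cinder.conf option names taken from the rbd-openstack guide linked earlier in the thread:

    # /etc/ceph/ceph.conf on the OpenStack node
    [client.cinder]
    keyring = /etc/ceph/ceph.client.cinder.keyring

    [client.glance]
    keyring = /etc/ceph/ceph.client.glance.keyring

    # relevant cinder.conf settings
    volume_driver = cinder.volume.drivers.rbd.RBDDriver
    rbd_pool = volumes
    rbd_user = cinder
    rbd_secret_uuid = <uuid registered with libvirt via virsh secret-define>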
[ceph-users] ceph hadoop using ambari
I posted this to ceph-devel-owner before seeing that this is the correct place to post.

My company is trying to evaluate virtualized hdfs clusters using ceph as a drop-in replacement for staging and development, following http://ceph.com/docs/master/cephfs/hadoop/. We deploy clusters with ambari 1.3.2. I spun up a 10 node cluster with 3 datanodes, name, secondary, 3 zookeepers, ambari master, and accumulo master.

Our process is (this was likely the cause of the shutdown errors):

1. Run ambari install
2. Shut down all ambari services
3. Push the modified core-site.xml to the datanodes, name, and secondary nodes
4. Restart ambari services

I am getting errors "/usr/lib/hadoop/bin/hadoop-daemon.sh: Permission denied" in the ambari console error log, from the command:

    su - hdfs -c 'export HADOOP_LIBEXEC_DIR=/usr/lib/hadoop/libexec /usr/lib/hadoop/bin/hadoop-daemon.sh --config /etc/hadoop/conf start datanode'

I think this is an ambari issue, but I'm wondering:

1. Is there a detailed guide to using ambari with ceph-hadoop, or has anyone tried it?
2. Is there a script or list of log files useful for debugging ceph issues in general?

thanks,
kesten

ps. I have opened a gist at https://gist.github.com/darKoram/9051450 and an issue on the horton forums at http://hortonworks.com/community/forums/topic/ambari-restart-services-give-bash-usrlibhadoopbinhadoop-daemon-sh-permiss/#post-48793
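For reference, the core-site.xml changes the guide at http://ceph.com/docs/master/cephfs/hadoop/ calls for look roughly like this (the monitor address and paths here are illustrative):

    <property>
      <name>fs.default.name</name>
      <value>ceph://10.0.0.1:6789/</value>
    </property>
    <property>
      <name>fs.ceph.impl</name>
      <value>org.apache.hadoop.fs.ceph.CephFileSystem</value>
    </property>
    <property>
      <name>ceph.conf.file</name>
      <value>/etc/ceph/ceph.conf</value>
    </property>

The guide also documents ceph.auth.id and ceph.auth.keyring for clusters using cephx.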
Re: [ceph-users] ceph hadoop using ambari
Hi Kesten,

It's a little difficult to tell what the source of the problem is, but looking at the gist you referenced, I don't see anything that would indicate that Ceph is causing the issue. For instance, hadoop-mapred-tasktracker-xxx-yyy-hdfs01.log looks like the Hadoop daemons are having problems connecting to each other. Finding out what command in hadoop-daemon.sh is causing the permission errors might be informative, but I don't have any experience with Ambari.

On Mon, Feb 17, 2014 at 9:23 AM, Kesten Broughton kbrough...@21ct.com wrote:

I posted this to ceph-devel-owner before seeing that this is the correct place to post. My company is trying to evaluate virtualized hdfs clusters using ceph as a drop-in replacement for staging and development, following http://ceph.com/docs/master/cephfs/hadoop/. [...]
[ceph-users] OSD flapping during recovery
I had some issues with OSD flapping after 2 days of recovery. It appears to be related to swapping, even though I have plenty of RAM for the number of OSDs I have. The cluster was completely unusable, and I ended up rebooting all the nodes. It's been great ever since, but I'm assuming it will happen again. Details are below, but I'm wondering if anybody has any idea what happened?

I noticed some lumpy data distribution on my OSDs. Following the advice on the mailing list, I increased pg_num and pgp_num to the values from the formula. .rgw.buckets is the only large pool, so I increased pg_num and pgp_num from 128 to 2048 on that one pool. Cluster status changed to HEALTH_WARN, there were 1920 PGs in state active+remapped+wait_backfill, and 32% of the objects were degraded.

Recovery was slow, and we were having some performance issues. I lowered osd_max_backfills from 10 to 2, and osd_recovery_op_priority from 10 to 2. This didn't slow the recovery down much, but made my application much more responsive. My journals are on the OSD disks (no SSDs). I believe the osd_max_backfills change was the more important one, but it's much slower to test than the osd_recovery_op_priority change. Aside from those two, my notes say I changed and reverted osd_disk_threads, osd_op_threads, and osd_recovery_threads. All changes were pushed out using:

    ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok config set osd_max_backfills 2

I watched the cluster on and off over the weekend. Ceph was steadily recovering. It was down to ~900 PGs in active+remapped+wait_backfill, with 17% of objects degraded. A few OSDs had been marked down and recovered, so a few tens of PGs were in state active+degraded+remapped+wait_backfill and active+degraded+remapped+backfilling.

I was poking around, and I noticed kswapd was using between 5% and 30% CPU on all nodes. It was bursty, peaking at 30% CPU usage for about 5 sec out of every 30 sec. Swap usage wasn't increasing, and kswapd appeared to be doing a lot of nothing. My machines have 8 OSDs and 36GB of RAM. top said that all machines were caching 30GB of data. The ceph-osd daemons were each using 0.5GB to 1.2GB of RAM. I don't have the exact numbers, but I believe they were using about 5GB for all 8 ceph-osd daemons.

A few hours later, the OSDs really started flapping. They were being voted unresponsive and marked down faster than they could rejoin. At one point, a third of the OSDs were marked down. ceph -w was complaining about hundreds of slow requests greater than 900 seconds. Most RGW accesses were failing with HTTP timeouts. kswapd was using a consistent 33% CPU on all nodes, with no variance that I could see. To add insult to injury, the cluster was running a scrub and a deep scrub.

I eventually rebooted all nodes in the cluster, one at a time. Once quorum was reestablished, recovery proceeded at the original speed. The OSDs are responding, and all my RGW requests are returning in a reasonable amount of time. There are no complaints of slow requests in ceph -w. kswapd is using 0% of the CPU.

I'm running Ceph 0.72.2 on Ubuntu 12.04.4, with kernel 3.5.0-37-generic #58~precise1-Ubuntu SMP. I monitor the running version as well as the installed version, so I know that all daemons were restarted after the 0.72.1 -> 0.72.2 upgrade. That happened on Jan 22nd.

Any idea what happened? I'm assuming it will happen again if recovery takes long enough.

--
Craig Lewis
Senior Systems Engineer, Central Desktop
Office +1.714.602.1309
Email cle...@centraldesktop.com
Re: [ceph-users] OSD flapping during recovery
On Mon, 17 Feb 2014 11:24:42 -0800 Craig Lewis wrote:

[kswapd going berserk]

Any idea what happened? I'm assuming it will happen again if recovery takes long enough.

You're running into a well-known but poorly rectified (if at all) kernel problem; Ceph has little to do with it other than doing what it is supposed to: moving large amounts of data around.

Check out https://bugzilla.redhat.com/show_bug.cgi?id=712019 and, linked from there, https://lkml.org/lkml/2013/3/17/50

Ubuntu has this bug logged as well: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/721896

Regards,
Christian

--
Christian Balzer        Network/Systems Engineer
ch...@gol.com           Global OnLine Japan/Fusion Communications
http://www.gol.com/
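Workarounds commonly tried for this class of kswapd spin (my suggestions for things to experiment with, not taken from the bug reports above; values are illustrative):

    # raise the free-memory watermark so kswapd stops thrashing right at the edge
    sysctl -w vm.min_free_kbytes=262144

    # one-off relief without a reboot: drop page cache, dentries and inodes
    sync; echo 3 > /proc/sys/vm/drop_caches

    # transparent hugepage compaction is a frequent culprit on kernels of this era
    echo never > /sys/kernel/mm/transparent_hugepage/enabled

Measure kswapd behaviour before and after on a loaded node.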
[ceph-users] Maximum realistic latency between Mons, MDS, and OSDs
Hi all,

I've been playing with Ceph across high-latency, high-speed links, with a range of results. In general, Ceph MDS, monitors, and OSDs are solid across thousand-kilometre network links. Jitter is low, latency is predictable, and the capacity of the network is well beyond what the servers can push. I get the obvious expected slowdowns to do with quorum and other communications, with the cluster remaining reliable and in sync.

What is the generally accepted maximum latency between nodes for a usable Ceph cluster? I realise it's quite an open-ended question, with quite a number of ifs and buts about it. I am however interested to hear what people have done in production and accepted, and what tweaks have been done.

For reference, I have an experimental cluster running across 12ms between the nodes, and while the IOPS figure is down, I'm able to write in at a few hundred Mbps, which would cover many use cases. The latency was the obvious issue.

David Jericho
Senior Systems Administrator, AARNet Pty Ltd
t. +61 (0) 7 3317 9576  m. +61 423 027 185
e. david.jeri...@aarnet.edu.au  w. www.aarnet.edu.au
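For anyone experimenting along the same lines, the settings most sensitive to inter-node RTT are the monitor lease and the OSD heartbeat timings; a sketch of where one might start loosening them (my suggestion, not from the post above, with dumpling-era defaults noted in the comments):

    [global]
    mon lease = 10                # default 5 s; the paxos lease should comfortably exceed RTT jitter

    [osd]
    osd heartbeat interval = 6    # default; how often OSDs ping their peers
    osd heartbeat grace = 40      # default 20 s; raise to tolerate slow WAN peers

The trade-off is slower detection of genuinely failed daemons.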