[ceph-users] How do you replace an OSD?
I just got my small Ceph cluster running. I run 6 OSDs on the same server, basically to replace mdraid. I have tried to simulate a hard drive (OSD) failure: I removed the OSD (out + stop), zapped it, and then prepared and activated it. It worked, but I ended up with one extra OSD (and the old one still showing in the ceph -w output). I guess this is not how I am supposed to do it? The documentation recommends manually editing the configuration; however, there are no osd entries in my /etc/ceph/ceph.conf. So what would be the best way to replace a failed OSD? Dmitry ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
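For reference, the usual way to avoid the leftover entry is to remove the failed OSD from the CRUSH map, the auth database and the OSD map before preparing its replacement, so the freed id gets reused. A rough, untested sketch (osd.5, ceph001 and sdf are placeholders for the failed OSD, its host and the new disk):

ceph osd out osd.5
service ceph stop osd.5              # or on Ubuntu/upstart: stop ceph-osd id=5
ceph osd crush remove osd.5
ceph auth del osd.5
ceph osd rm osd.5
ceph-deploy disk zap ceph001:sdf
ceph-deploy osd create ceph001:sdf   # or ceph-disk-prepare / activate by hand

Because the old id is deleted before the new disk is prepared, the replacement should come up with the same OSD number instead of adding an extra one, and no osd sections in ceph.conf are needed for this.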
Re: [ceph-users] ceph-deploy and journal on separate disk
Hi. Yes, I zapped all disks before. More about my situation: sdaa is one of the data disks, 3 TB with a GPT partition table; sda is an SSD drive with manually created ~10 GB partitions for the journals, with an MBR partition table.
===
fdisk -l /dev/sda
Disk /dev/sda: 480.1 GB, 480103981056 bytes
255 heads, 63 sectors/track, 58369 cylinders, total 937703088 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00033624
Device Boot      Start        End    Blocks  Id System
/dev/sda1         2048   19531775   9764864  83 Linux
/dev/sda2     19531776   39061503   9764864  83 Linux
/dev/sda3     39061504   58593279   9765888  83 Linux
/dev/sda4     78125056   97656831   9765888  83 Linux
===
If I execute ceph-deploy osd prepare without the journal option, it works:
ceph@ceph-admin:~$ ceph-deploy disk zap ceph001:sdaa ceph001:sda1
[ceph_deploy.osd][DEBUG ] zapping /dev/sdaa on ceph001
[ceph_deploy.osd][DEBUG ] zapping /dev/sda1 on ceph001
ceph@ceph-admin:~$ ceph-deploy osd prepare ceph001:sdaa
[ceph_deploy.osd][DEBUG ] Preparing cluster ceph disks ceph001:/dev/sdaa:
[ceph_deploy.osd][DEBUG ] Deploying osd to ceph001
[ceph_deploy.osd][DEBUG ] Host ceph001 is now ready for osd use.
[ceph_deploy.osd][DEBUG ] Preparing host ceph001 disk /dev/sdaa journal None activate False
root@ceph001:~# gdisk -l /dev/sdaa
GPT fdisk (gdisk) version 0.8.1
Partition table scan:
  MBR: protective
  BSD: not present
  APM: not present
  GPT: present
Found valid GPT with protective MBR; using GPT.
Disk /dev/sdaa: 5860533168 sectors, 2.7 TiB
Logical sector size: 512 bytes
Disk identifier (GUID): 575ACF17-756D-47EC-828B-2E0A0B8ED757
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 5860533134
Partitions will be aligned on 2048-sector boundaries
Total free space is 4061 sectors (2.0 MiB)
Number  Start (sector)  End (sector)  Size        Code  Name
   1          2099200    5860533134   2.7 TiB           ceph data
   2             2048       2097152   1023.0 MiB        ceph journal
Problems start when I try to create the journal on a separate drive:
ceph@ceph-admin:~$ ceph-deploy disk zap ceph001:sdaa ceph001:sda1
[ceph_deploy.osd][DEBUG ] zapping /dev/sdaa on ceph001
[ceph_deploy.osd][DEBUG ] zapping /dev/sda1 on ceph001
ceph@ceph-admin:~$ ceph-deploy osd prepare ceph001:sdaa:sda1
[ceph_deploy.osd][DEBUG ] Preparing cluster ceph disks ceph001:/dev/sdaa:/dev/sda1
[ceph_deploy.osd][DEBUG ] Deploying osd to ceph001
[ceph_deploy.osd][DEBUG ] Host ceph001 is now ready for osd use.
[ceph_deploy.osd][DEBUG ] Preparing host ceph001 disk /dev/sdaa journal /dev/sda1 activate False
[ceph_deploy.osd][ERROR ] ceph-disk-prepare -- /dev/sdaa /dev/sda1 returned 1
Information: Moved requested sector from 34 to 2048 in order to align on 2048-sector boundaries.
The operation has completed successfully.
meta-data=/dev/sdaa1  isize=2048  agcount=32, agsize=22892700 blks
         =            sectsz=512  attr=2, projid32bit=0
data     =            bsize=4096  blocks=732566385, imaxpct=5
         =            sunit=0     swidth=0 blks
naming   =version 2   bsize=4096  ascii-ci=0
log      =internal log bsize=4096 blocks=357698, version=2
         =            sectsz=512  sunit=0 blks, lazy-count=1
realtime =none        extsz=4096  blocks=0, rtextents=0
WARNING:ceph-disk:OSD will not be hot-swappable if journal is not the same device as the osd data
mount: /dev/sdaa1: more filesystems detected. This should not happen, use -t <type> to explicitly specify the filesystem type or use wipefs(8) to clean up the device.
mount: you must specify the filesystem type
ceph-disk: Mounting filesystem failed: Command '['mount', '-o', 'noatime', '--', '/dev/sdaa1', '/var/lib/ceph/tmp/mnt.fZQxiz']' returned non-zero exit status 32
ceph-deploy: Failed to create 1 OSDs
-Original Message-
From: Samuel Just [mailto:sam.j...@inktank.com]
Sent: Monday, August 12, 2013 11:39 PM
To: Pavel Timoschenkov
Cc: ceph-us...@ceph.com
Subject: Re: [ceph-users] ceph-deploy and journal on separate disk
Did you try using ceph-deploy disk zap ceph001:sdaa first? -Sam
On Mon, Aug 12, 2013 at 6:21 AM, Pavel Timoschenkov pa...@bayonetteas.onmicrosoft.com wrote: Hi. I have some problems creating the journal on a separate disk using the ceph-deploy osd prepare command. When I execute the following command: ceph-deploy osd prepare ceph001:sdaa:sda1 (where sdaa is the disk for ceph data and sda1 is a partition on the SSD drive for the journal), I get the following errors: ceph@ceph-admin:~$ ceph-deploy osd prepare ceph001:sdaa:sda1
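The "more filesystems detected" failure usually means stale filesystem signatures are still present on the data partition, and zap does not always clear them. A possible cleanup before re-running prepare (untested here; device names are taken from this thread, and whether it resolves this particular mount failure is an assumption) is what the error message itself suggests:

wipefs -a /dev/sdaa1                         # remove leftover filesystem signatures
# or, more bluntly, zero the start of the data disk:
dd if=/dev/zero of=/dev/sdaa bs=1M count=10
ceph-deploy disk zap ceph001:sdaa ceph001:sda1
ceph-deploy osd prepare ceph001:sdaa:sda1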
Re: [ceph-users] Ceph instead of RAID
On 08/13/2013 09:23 AM, Jeffrey 'jf' Lim wrote: Anyway, I thought what if instead of RAID-10 I use ceph? All 6 disks will be local, so I could simply create 6 local OSDs + a monitor, right? Is there anything I need to watch out for in such configuration? You can do that. Although it's nice to play with and everything, I wouldn't recommend doing it. It will give you more pain than pleasure. How so? Care to elaborate? Ceph is a complex system, built for clusters. It does some stuff in software that is otherwhise done in hardware (raid controllers). The nature of the complexity of a cluster system is a lot of overhead compared to a local raid [whatever] system, and latency of disk i/o will naturally suffer a bit. An OSD needs about 300 MB of RAM (may vary on your PGs), times 6 is a waste of nearly 2 GB of RAM (compared to a local RAID). Also ceph is young, and it does indeed have some bugs. RAID is old, and very mature. Although I rely on ceph on a productive cluster, too, it is way harder to maintain than a simple local raid. When a disk fails in ceph you don't have to worry about your data, which is a good thing, but you have to worry about the rebuilding (which isn't too hard, but at least you need to know SOMETHING about ceph), with (hardware) RAID you simply replace the disk, and it will be rebuilt. Others will find more reasons why this is not the best idea for a production system. Don't get me wrong, I'm a big supporter of ceph, but only for clusters, not for single systems. wogri -jf -- He who settles on the idea of the intelligent man as a static entity only shows himself to be a fool. Every nonfree program has a lord, a master -- and if you use the program, he is your master. --Richard Stallman -- DI (FH) Wolfgang Hennerbichler Software Development Unit Advanced Computing Technologies RISC Software GmbH A company of the Johannes Kepler University Linz IT-Center Softwarepark 35 4232 Hagenberg Austria Phone: +43 7236 3343 245 Fax: +43 7236 3343 250 wolfgang.hennerbich...@risc-software.at http://www.risc-software.at ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Ceph instead of RAID
This will be a single server configuration, the goal is to replace mdraid, hence I tried to use localhost (nothing more will be added to the cluster). Are you saying it will be less fault tolerant than a RAID-10? Ceph is a distributed object store. If you stay within a single machine, keep using a local RAID solution (hardware or software). Why would you want to make this switch? I do not think RAID-10 on 6 3TB disks is going to be reliable at all. I have simulated several failures, and it looks like a rebuild will take a lot of time. Funnily, during one of these experiments, another drive failed, and I had lost the entire array. Good luck recovering from that... I feel that Ceph is better than mdraid because: 1) When ceph cluster is far from being full, 'rebuilding' will be much faster vs mdraid 2) You can easily change the number of replicas 3) When multiple disks have bad sectors, I suspect ceph will be much easier to recover data from than from mdraid which will simply never finish rebuilding. 4) If we need to migrate data over to a different server with no downtime, we just add more OSDs, wait, and then remove the old ones :-) This is my initial observation though, so please correct me if I am wrong. Dmitry ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Ceph instead of RAID
On 08/13/2013 09:47 AM, Dmitry Postrigan wrote: Why would you want to make this switch? I do not think RAID-10 on 6 3TB disks is going to be reliable at all. I have simulated several failures, and it looks like a rebuild will take a lot of time. Funnily, during one of these experiments, another drive failed, and I had lost the entire array. Good luck recovering from that... good point. I feel that Ceph is better than mdraid because: 1) When ceph cluster is far from being full, 'rebuilding' will be much faster vs mdraid true 2) You can easily change the number of replicas true 3) When multiple disks have bad sectors, I suspect ceph will be much easier to recover data from than from mdraid which will simply never finish rebuilding. maybe not true. also if you have one disk that is starting to be slow (because of upcoming failure), ceph will slow down drastically, and you need to find the failing disk. 4) If we need to migrate data over to a different server with no downtime, we just add more OSDs, wait, and then remove the old ones :-) true. but maybe not as easy and painless as you would expect it to be. also bear in mind that ceph needs a monitor up and running all time. This is my initial observation though, so please correct me if I am wrong. ceph is easier to maintain than most distributed systems I know, but still harder than a local RAID. Keep that in mind. Dmitry Wolfgang ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com -- DI (FH) Wolfgang Hennerbichler Software Development Unit Advanced Computing Technologies RISC Software GmbH A company of the Johannes Kepler University Linz IT-Center Softwarepark 35 4232 Hagenberg Austria Phone: +43 7236 3343 245 Fax: +43 7236 3343 250 wolfgang.hennerbich...@risc-software.at http://www.risc-software.at ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] mounting a pool via fuse
Georg Höllrigl writes: I'm using ceph 0.61.7. When using ceph-fuse, I couldn't find a way to mount only one pool. Is there a way to mount a pool - or is it simply not supported? Do you mean mounting it as a filesystem? It is the same as kernel-level cephfs (fuse cephfs = same instance). You cannot mount a pool, but you can mount the filesystem and map a pool to any point of the filesystem (a file or a directory), including the root. First, mount ceph via the kernel client - mount -t ceph (just for cephfs tool syntax compatibility) - for example to /mnt/ceph. Then run ceph df and look up the pool number (not the name!); for example, say the pool number is 10. And last: mkdir -p /mnt/ceph/pools/pool1 cephfs /mnt/ceph/pools/pool1 set_layout -p 10 or just (for ceph's root): cephfs /mnt/ceph set_layout -p 10 Next you can unmount the kernel-level mount and mount this point via fuse. PS For ceph developers: trying this for quota (with ceph osd pool set-quota) is semi-working: on quota overflow nothing is limited, but ceph health shows a warning. If there is no other way to enforce quotas, this may qualify as a bug; it is only of limited urgency while the large-number-of-pools performance limitation exists. So, FYI. -- WBR, Dzianis Kahanovich AKA Denis Kaganovich, http://mahatma.bspu.unibel.by/ ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
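To check that the layout actually took effect before switching over to fuse, the same cephfs tool can read it back; a quick sketch (path from the example above, untested):

cephfs /mnt/ceph/pools/pool1 show_layout

Only files created after the layout is set land in pool 10; files that already existed keep whatever layout (and pool) they were written with.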
[ceph-users] Wheezy machine died with problems on osdmap
Hi all, my Debian 7 wheezy machine died with the following in the logs: http://pastebin.ubuntu.com/5981058/ It's using kvm and ceph as an rbd device. ceph version 0.61.7 (8f010aff684e820ecc837c25ac77c7a05d7191ff) Can you please give me some advice? Thanks, Giuseppe ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] one pg stuck with 2 unfound pieces
We have a cluster with 10 servers, 64 OSDs and 5 Mons on them. The OSDs are 3TB disk, formatted with btrfs and the servers are either on Ubuntu 12.10 or 13.04. Recently one of the servers (13.04) stood still (due to problems with btrfs - something we have seen a few times). I decided to not try to recover the disks, but reformat them with XFS. I removed the OSDs, reformatted, and re-created them (they got the same OSD numbers) I redid this twice (because I wrongly partioned the disks in the first place) and I ended up with 2 unfound pieces in one pg: root@s2:~# ceph health details HEALTH_WARN 1 pgs degraded; 1 pgs recovering; 1 pgs stuck unclean; recovery 4448/28915270 degraded (0.015%); 2/9854766 unfound (0.000%) pg 0.cfa is stuck unclean for 1004252.309704, current state active+recovering+degraded+remapped, last acting [23,50] pg 0.cfa is active+recovering+degraded+remapped, acting [23,50], 2 unfound recovery 4448/28915270 degraded (0.015%); 2/9854766 unfound (0.000%) root@s2:~# ceph pg 0.cfa query { state: active+recovering+degraded+remapped, epoch: 28197, up: [ 23, 50, 18], acting: [ 23, 50], info: { pgid: 0.cfa, last_update: 28082'7774, last_complete: 23686'7083, log_tail: 14360'4061, last_backfill: MAX, purged_snaps: [], history: { epoch_created: 1, last_epoch_started: 28197, last_epoch_clean: 24810, last_epoch_split: 0, same_up_since: 28195, same_interval_since: 28196, same_primary_since: 26036, last_scrub: 20585'6801, last_scrub_stamp: 2013-07-28 15:40:53.298786, last_deep_scrub: 20585'6801, last_deep_scrub_stamp: 2013-07-28 15:40:53.298786, last_clean_scrub_stamp: 2013-07-28 15:40:53.298786}, stats: { version: 28082'7774, reported: 28197'41950, state: active+recovering+degraded+remapped, last_fresh: 2013-08-13 14:34:33.057271, last_change: 2013-08-13 14:34:33.057271, last_active: 2013-08-13 14:34:33.057271, last_clean: 2013-08-01 23:50:18.414082, last_became_active: 2013-05-29 13:10:51.366237, last_unstale: 2013-08-13 14:34:33.057271, mapping_epoch: 28195, log_start: 14360'4061, ondisk_log_start: 14360'4061, created: 1, last_epoch_clean: 24810, parent: 0.0, parent_split_bits: 0, last_scrub: 20585'6801, last_scrub_stamp: 2013-07-28 15:40:53.298786, last_deep_scrub: 20585'6801, last_deep_scrub_stamp: 2013-07-28 15:40:53.298786, last_clean_scrub_stamp: 2013-07-28 15:40:53.298786, log_size: 0, ondisk_log_size: 0, stats_invalid: 0, stat_sum: { num_bytes: 145307402, num_objects: 2234, num_object_clones: 0, num_object_copies: 0, num_objects_missing_on_primary: 0, num_objects_degraded: 0, num_objects_unfound: 0, num_read: 744, num_read_kb: 410184, num_write: 7774, num_write_kb: 1155438, num_scrub_errors: 0, num_shallow_scrub_errors: 0, num_deep_scrub_errors: 0, num_objects_recovered: 3998, num_bytes_recovered: 278803622, num_keys_recovered: 0}, stat_cat_sum: {}, up: [ 23, 50, 18], acting: [ 23, 50]}, empty: 0, dne: 0, incomplete: 0, last_epoch_started: 28197}, recovery_state: [ { name: Started\/Primary\/Active, enter_time: 2013-08-13 14:34:33.026698, might_have_unfound: [ { osd: 9, status: querying}, { osd: 18, status: querying}, { osd: 50, status: already probed}], recovery_progress: { backfill_target: 50, waiting_on_backfill: 0, backfill_pos: 96220cfa\/1799e82.\/head\/\/0, backfill_info: { begin: 0\/\/0\/\/-1, end: 0\/\/0\/\/-1, objects: []}, peer_backfill_info: { begin: 0\/\/0\/\/-1, end: 0\/\/0\/\/-1, objects: []}, backfills_in_flight: [], pull_from_peer: [], pushing: []}, scrub: { scrubber.epoch_start: 0, scrubber.active: 0, scrubber.block_writes: 0, scrubber.finalizing: 0, 
scrubber.waiting_on: 0, scrubber.waiting_on_whom: []}}, { name: Started, enter_time: 2013-08-13 14:34:32.024282}]} I have tried to mark those two pieces as lost, but ceph wouldn't let me (due to the fact that it is
Re: [ceph-users] Ceph instead of RAID
On 08/13/2013 02:56 AM, Dmitry Postrigan wrote: I am currently installing some backup servers with 6x3TB drives in them. I played with RAID-10 but I was not impressed at all with how it performs during a recovery. Anyway, I thought what if instead of RAID-10 I use ceph? All 6 disks will be local, so I could simply create 6 local OSDs + a monitor, right? Is there anything I need to watch out for in such configuration? You can do that. Although it's nice to play with and everything, I wouldn't recommend doing it. It will give you more pain than pleasure. Any specific reason? I just got it up and running, an after simulating some failures, I like it much better than mdraid. Again, this only applies to large arrays (6x3TB in my case). I would not use ceph to replace a RAID-1 array of course, but it looks like a good idea to replace a large RAID10 array with a local ceph installation. The only thing I do not enjoy about ceph is performance. Probably need to do more tweaking, but so far numbers are not very impressive. I have two exactly same servers running same OS, kernel, etc. Each server has 6x 3TB drives (same model and firmware #). Server 1 runs ceph (2 replicas) Server 2 runs mdraid (raid-10) I ran some very basic benchmarks on both servers: dd if=/dev/zero of=/storage/test.bin bs=1M count=10 Ceph: 113 MB/s mdraid: 467 MB/s dd if=/storage/test.bin of=/dev/null bs=1M Ceph: 114 MB/s mdraid: 550 MB/s As you can see, mdraid is by far faster than ceph. It could be by design, or perhaps I am not doing it right. Even despite such difference in speed, I would still go with ceph because *I think* it is more reliable. couple of things: 1) Ceph is doing full data journal writes so is going to eat (at least) half of your write performance right there. 2) Ceph tends to like lots of concurrency. You'll probably see higher numbers with multiple dd reads/writes going at once. 3) Ceph is a lot more complex than something like mdraid. It gives you a lot more power and flexibility but the cost is greater complexity. There are probably things you can tune to get your numbers up, but it could take some work. Having said all of this, my primary test box is a single server and I can get 90MB/s+ per drive out of Ceph (with 24 drives!), but if I was building a production box and never planned to expand to multiple servers, I'd certainly be looking into zfs or btrfs RAID. Mark Dmitry ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
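To see the concurrency effect Mark describes, it is worth benchmarking with several writers at once rather than a single dd stream. Two illustrative (untested) variants, with the path and pool name as placeholders:

for i in 1 2 3 4; do
    dd if=/dev/zero of=/storage/test$i.bin bs=1M count=2048 conv=fdatasync &
done
wait

rados -p rbd bench 60 write -t 16    # 16 concurrent 4 MB object writes straight into RADOS

Aggregate throughput over several streams is usually much closer to what the disks can sustain, while a single stream pays the full journal-write and replication latency on every block.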
Re: [ceph-users] Wheezy machine died with problems on osdmap
On Tue, 13 Aug 2013, Giuseppe 'Gippa' Paterno' wrote: Hi all, my Debian 7 wheezy machine died with the following in the logs: http://pastebin.ubuntu.com/5981058/ It's using kvm and ceph as an rbd device. ceph version 0.61.7 (8f010aff684e820ecc837c25ac77c7a05d7191ff) Can you please give me some advice? What kernel version is this? It looks like an old kernel bug. Generally speaking you should be using 3.4 at the very least if you are using the kernel client. sage ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] mounting a pool via fuse
Thank you for the explanation. By mounting as a filesystem I'm talking about something similar to this: http://www.sebastien-han.fr/blog/2013/02/11/mount-a-specific-pool-with-cephfs/ Using the kernel module, I can mount a subdirectory into my directory tree - a directory where I have assigned a pool. Using fuse, I can't mount a subdirectory? By the way, setting the layout seems to have a bug: # cephfs /mnt/macm01 set_layout -p 4 Error setting layout: Invalid argument I have to add the -u option, then it works: # cephfs /mnt/mailstore set_layout -p 5 -u 4194304 Kind Regards, Georg On 13.08.2013 12:09, Dzianis Kahanovich wrote: Georg Höllrigl writes: I'm using ceph 0.61.7. When using ceph-fuse, I couldn't find a way to mount only one pool. Is there a way to mount a pool - or is it simply not supported? Do you mean mounting it as a filesystem? It is the same as kernel-level cephfs (fuse cephfs = same instance). You cannot mount a pool, but you can mount the filesystem and map a pool to any point of the filesystem (a file or a directory), including the root. First, mount ceph via the kernel client - mount -t ceph (just for cephfs tool syntax compatibility) - for example to /mnt/ceph. Then run ceph df and look up the pool number (not the name!); for example, say the pool number is 10. And last: mkdir -p /mnt/ceph/pools/pool1 cephfs /mnt/ceph/pools/pool1 set_layout -p 10 or just (for ceph's root): cephfs /mnt/ceph set_layout -p 10 Next you can unmount the kernel-level mount and mount this point via fuse. PS For ceph developers: trying this for quota (with ceph osd pool set-quota) is semi-working: on quota overflow nothing is limited, but ceph health shows a warning. If there is no other way to enforce quotas, this may qualify as a bug; it is only of limited urgency while the large-number-of-pools performance limitation exists. So, FYI. -- Dipl.-Ing. (FH) Georg Höllrigl Technik Xidras GmbH Stockern 47 3744 Stockern Austria Tel: +43 (0) 2983 201 - 30505 Fax: +43 (0) 2983 201 - 930505 Email: georg.hoellr...@xidras.com Web: http://www.xidras.com FN 317036 f | Landesgericht Krems | ATU64485024 CONFIDENTIAL! This email contains confidential information and is intended for the authorised recipient only. If you are not an authorised recipient, please return the email to us and then delete it from your computer and mail-server. You may neither use nor edit any such emails including attachments, nor make them accessible to third parties in any manner whatsoever. Thank you for your cooperation ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] mounting a pool via fuse
On Tue, 13 Aug 2013, Georg H?llrigl wrote: Thank you for the explaination. By mounting as filesystem I'm talking about something similar to this: http://www.sebastien-han.fr/blog/2013/02/11/mount-a-specific-pool-with-cephfs/ Using the kernel module, I can mount a subdirectory into my directory tree - a directory, where I have assigned a pool. Using fuse, I can't mount a subdirectory? ceph-fuse --mount-root /some/path /mnt/ceph should do the trick. By the way setting the layout seems to have a bug: # cephfs /mnt/macm01 set_layout -p 4 Error setting layout: Invalid argument I have to add the -u option, then it works: # cephfs /mnt/mailstore set_layout -p 5 -u 4194304 Curious. Opened a bug! sage Kind Regards, Georg On 13.08.2013 12:09, Dzianis Kahanovich wrote: Georg H?llrigl ?: I'm using ceph 0.61.7. When using ceph-fuse, I couldn't find a way, to only mount one pool. Is there a way to mount a pool - or is it simply not supported? This mean mount as fs? Same as kernel-level cephfs (fuse cephfs = same instance). You cannot mount pool, but can mount filesystem and can map pool to any point of filesystem (file or directory), include root. First, mount ceph via kernel - mount -t ceph (just for cephfs tool syntax compatibility). For example - to /mnt/ceph. Then say ceph df and lookup pool number (not name!), for example pool number is 10. And last: mkdir -p /mnt/ceph/pools/pool1 cephfs /mnt/ceph/pools/pool1 set_layout -p 10 or just (for ceph's root): cephfs /mnt/ceph set_layout -p 10 Next you can unmount kernel-level and mount this point via fuse. PS For ceph developers: trying this for qouta (with ceph osd pool set-quota) semi-working: on quota overflow - nothing limited, but ceph health show warning. In case of no other ways to quota, it may qualified as bug and not too actual only while big number of pools performance limitation. So, FYI. -- Dipl.-Ing. (FH) Georg H?llrigl Technik Xidras GmbH Stockern 47 3744 Stockern Austria Tel: +43 (0) 2983 201 - 30505 Fax: +43 (0) 2983 201 - 930505 Email: georg.hoellr...@xidras.com Web: http://www.xidras.com FN 317036 f | Landesgericht Krems | ATU64485024 VERTRAULICHE INFORMATIONEN! Diese eMail enth?lt vertrauliche Informationen und ist nur f?r den berechtigten Empf?nger bestimmt. Wenn diese eMail nicht f?r Sie bestimmt ist, bitten wir Sie, diese eMail an uns zur?ckzusenden und anschlie?end auf Ihrem Computer und Mail-Server zu l?schen. Solche eMails und Anlagen d?rfen Sie weder nutzen, noch verarbeiten oder Dritten zug?nglich machen, gleich in welcher Form. Wir danken f?r Ihre Kooperation! CONFIDENTIAL! This email contains confidential information and is intended for the authorised recipient only. If you are not an authorised recipient, please return the email to us and then delete it from your computer and mail-server. You may neither use nor edit any such emails including attachments, nor make them accessible to third parties in any manner whatsoever. Thank you for your cooperation ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Why is my mon store.db is 220GB?
Thanks Joao, Is there a doc somewhere on the dependencies? I assume I'll need to set up the toolchain to compile? ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Why is my mon store.db is 220GB?
Is there an easy way I can find the age and/or expiration of the service ticket on a particular osd? Is that a file or just kept in ram? -Original Message- From: Sage Weil [mailto:s...@inktank.com] Sent: Tuesday, August 13, 2013 9:01 AM To: Jeppesen, Nelson Cc: ceph-users@lists.ceph.com Subject: RE: [ceph-users] Why is my mon store.db is 220GB? On Tue, 13 Aug 2013, Jeppesen, Nelson wrote: Interesting, So if I change ' auth service ticket ttl' to 172,800, in theory I could go without a monitor for 48 hours? If there are no up/down events, no new clients need to start, no osd recovery going on, then I *think* so. I may be forgetting something. sage -Original Message- From: Sage Weil [mailto:s...@inktank.com] Sent: Monday, August 12, 2013 9:50 PM To: Jeppesen, Nelson Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Why is my mon store.db is 220GB? On Mon, 12 Aug 2013, Jeppesen, Nelson wrote: Joao, (log file uploaded to http://pastebin.com/Ufrxn6fZ) I had some good luck and some bad luck. I copied the store.db to a new monitor, injected a modified monmap and started it up (This is all on the same host.) Very quickly it reached quorum (as far as I can tell) but didn't respond. Running 'ceph -w' just hung, no timeouts or errors. Same thing when restarting an OSD. The last lines of the log file '...ms_verify_authorizer..' are from 'ceph -w' attempts. I restarted everything again and it sat there synchronizing. IO stat reported about 100MB/s, but just reads. I let it sit there for 7 min but nothing happened. Can you do this again with --debug-mon 20 --debug-ms 1? It looks as though the main dispatch thread is blocked (7f71a1aa5700 does nothing after winning the election). It would also be helpful to gdb attach to the running ceph-mon and capture the output from 'thread apply all bt'. Side question, how long can a ceph cluster run without a monitor? I was able to upload files via rados gateway without issue even when the monitor was down. Quite a while, as long as no new processes need to authenticate, and no nodes go up or down. Eventually the authentication keys are going to time out, though (1 hour is the default). sage ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Why is my mon store.db is 220GB?
On Tue, 13 Aug 2013, Jeppesen, Nelson wrote: Is there an easy way I can find the age and/or expiration of the service ticket on a particular osd? Is that a file or just kept in ram? It's just in ram. If you crank up debug auth = 10 you will periodically see it dump the rotating keys and expirations. Ideally the middle one will remain valid, but things won't grind to a halt until they are all expired. sage -Original Message- From: Sage Weil [mailto:s...@inktank.com] Sent: Tuesday, August 13, 2013 9:01 AM To: Jeppesen, Nelson Cc: ceph-users@lists.ceph.com Subject: RE: [ceph-users] Why is my mon store.db is 220GB? On Tue, 13 Aug 2013, Jeppesen, Nelson wrote: Interesting, So if I change ' auth service ticket ttl' to 172,800, in theory I could go without a monitor for 48 hours? If there are no up/down events, no new clients need to start, no osd recovery going on, then I *think* so. I may be forgetting something. sage -Original Message- From: Sage Weil [mailto:s...@inktank.com] Sent: Monday, August 12, 2013 9:50 PM To: Jeppesen, Nelson Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Why is my mon store.db is 220GB? On Mon, 12 Aug 2013, Jeppesen, Nelson wrote: Joao, (log file uploaded to http://pastebin.com/Ufrxn6fZ) I had some good luck and some bad luck. I copied the store.db to a new monitor, injected a modified monmap and started it up (This is all on the same host.) Very quickly it reached quorum (as far as I can tell) but didn't respond. Running 'ceph -w' just hung, no timeouts or errors. Same thing when restarting an OSD. The last lines of the log file '...ms_verify_authorizer..' are from 'ceph -w' attempts. I restarted everything again and it sat there synchronizing. IO stat reported about 100MB/s, but just reads. I let it sit there for 7 min but nothing happened. Can you do this again with --debug-mon 20 --debug-ms 1? It looks as though the main dispatch thread is blocked (7f71a1aa5700 does nothing after winning the election). It would also be helpful to gdb attach to the running ceph-mon and capture the output from 'thread apply all bt'. Side question, how long can a ceph cluster run without a monitor? I was able to upload files via rados gateway without issue even when the monitor was down. Quite a while, as long as no new processes need to authenticate, and no nodes go up or down. Eventually the authentication keys are going to time out, though (1 hour is the default). sage ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
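For reference, both knobs discussed here are ordinary config options; one way they might be set (values taken from this thread, untested):

[global]
    auth service ticket ttl = 172800    ; 48 hours instead of the default 3600
    debug auth = 10                     ; periodically dumps the rotating keys and their expirations

ceph tell osd.0 injectargs '--debug-auth 10'    # inject into a running OSD without a restart

The longer ttl only helps daemons and clients that already hold valid tickets; anything that needs to (re)authenticate still requires a reachable monitor.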
Re: [ceph-users] Why is my mon store.db is 220GB?
On 13/08/13 09:19, Jeppesen, Nelson wrote: Thanks Joao, Is there a doc somewhere on the dependencies? I assume I’ll need to setup the tool chain to compile? README on the ceph repo has the dependencies. You could also try getting it from the gitbuilders [1], but I'm not sure how you'd go about doing that without installing other packages. [1] - http://gitbuilder.ceph.com/ -- Joao Eduardo Luis Software Engineer | http://inktank.com | http://ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
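For completeness, the build at this point is a plain autotools build from the git repo; a rough, untested sketch (branch name from this thread, dependencies per the README Joao mentions):

git clone --recursive https://github.com/ceph/ceph.git
cd ceph
git checkout wip-monstore-copy
./autogen.sh
./configure --with-debug
make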
Re: [ceph-users] Start Stop OSD
Adding back ceph-users; try not to turn public threads into private ones when the problem hasn't been resolved. On 08/13/2013 04:42 AM, Joshua Young wrote: So I put the journals on their own partitions and they worked just fine. All night they were up doing normal operations. When running initctl list | grep ceph I would get ... ceph-mds-all-starter stop/waiting ceph-mds-all start/running ceph-osd-all start/running ceph-osd-all-starter stop/waiting ceph-all start/running ceph-mon-all start/running ceph-mon-all-starter stop/waiting ceph-mon (ceph/cloud3) start/running, process 1864 ceph-create-keys stop/waiting ceph-osd (ceph/8) start/running, process 2136 ceph-osd (ceph/20) start/running, process 5281 ceph-osd (ceph/15) start/running, process 5292 ceph-osd (ceph/14) start/running, process 2135 ceph-mds stop/waiting This is correct. There are 4 OSDs on this server. Now I have come in today and running ceph -s still says all of my OSDS are up. When I run the same command as above I only see OSD 14. When I go into the logs of one of the others (OSD 15 ) I see this... Does ps agree that only one OSD is left running? 2013-08-13 06:37:48.414775 7ffa2099a7c0 0 ceph version 0.61.7 (8f010aff684e820ecc837c25ac77c7a05d7191ff), process ceph-osd, pid 16597 2013-08-13 06:37:48.421208 7ffa2099a7c0 0 filestore(/var/lib/ceph/osd/ceph-15) lock_fsid failed to lock /var/lib/ceph/osd/ceph-15/fsid, is another ceph-osd still running? (11) Resource temporarily unavailable 2013-08-13 06:37:48.421246 7ffa2099a7c0 -1 filestore(/var/lib/ceph/osd/ceph-15) FileStore::mount: lock_fsid failed 2013-08-13 06:37:48.421274 7ffa2099a7c0 -1 ^[[0;31m ** ERROR: error converting store /var/lib/ceph/osd/ceph-15: (16) Device or resource busy^[[0m 2013-08-13 06:37:48.445927 7f0fbb6687c0 0 ceph version 0.61.7 (8f010aff684e820ecc837c25ac77c7a05d7191ff), process ceph-osd, pid 16659 2013-08-13 06:37:48.447470 7f0fbb6687c0 0 filestore(/var/lib/ceph/osd/ceph-15) lock_fsid failed to lock /var/lib/ceph/osd/ceph-15/fsid, is another ceph-osd still running? (11) Resource temporarily unavailable 2013-08-13 06:37:48.447480 7f0fbb6687c0 -1 filestore(/var/lib/ceph/osd/ceph-15) FileStore::mount: lock_fsid failed 2013-08-13 06:37:48.447500 7f0fbb6687c0 -1 ^[[0;31m ** ERROR: error converting store /var/lib/ceph/osd/ceph-15: (16) Device or resource busy^[[0m 2013-08-13 06:37:48.474852 7f28f332c7c0 0 ceph version 0.61.7 (8f010aff684e820ecc837c25ac77c7a05d7191ff), process ceph-osd, pid 16752 2013-08-13 06:37:48.476695 7f28f332c7c0 0 filestore(/var/lib/ceph/osd/ceph-15) lock_fsid failed to lock /var/lib/ceph/osd/ceph-15/fsid, is another ceph-osd still running? (11) Resource temporarily unavailable 2013-08-13 06:37:48.476707 7f28f332c7c0 -1 filestore(/var/lib/ceph/osd/ceph-15) FileStore::mount: lock_fsid failed 2013-08-13 06:37:48.476728 7f28f332c7c0 -1 ^[[0;31m ** ERROR: error converting store /var/lib/ceph/osd/ceph-15: (16) Device or resource busy^[[0m 2013-08-13 06:37:48.501723 7f84618467c0 0 ceph version 0.61.7 (8f010aff684e820ecc837c25ac77c7a05d7191ff), process ceph-osd, pid 16845 2013-08-13 06:37:48.503919 7f84618467c0 0 filestore(/var/lib/ceph/osd/ceph-15) lock_fsid failed to lock /var/lib/ceph/osd/ceph-15/fsid, is another ceph-osd still running? 
(11) Resource temporarily unavailable 2013-08-13 06:37:48.503932 7f84618467c0 -1 filestore(/var/lib/ceph/osd/ceph-15) FileStore::mount: lock_fsid failed 2013-08-13 06:37:48.503955 7f84618467c0 -1 ^[[0;31m ** ERROR: error converting store /var/lib/ceph/osd/ceph-15: (16) Device or resource busy^[[0m 2013-08-13 06:37:48.529665 7f29c2a367c0 0 ceph version 0.61.7 (8f010aff684e820ecc837c25ac77c7a05d7191ff), process ceph-osd, pid 16944 2013-08-13 06:37:48.531227 7f29c2a367c0 0 filestore(/var/lib/ceph/osd/ceph-15) lock_fsid failed to lock /var/lib/ceph/osd/ceph-15/fsid, is another ceph-osd still running? (11) Resource temporarily unavailable 2013-08-13 06:37:48.531239 7f29c2a367c0 -1 filestore(/var/lib/ceph/osd/ceph-15) FileStore::mount: lock_fsid failed 2013-08-13 06:37:48.531260 7f29c2a367c0 -1 ^[[0;31m ** ERROR: error converting store /var/lib/ceph/osd/ceph-15: (16) Device or resource busy^[[0m So the OSD can't get a lock on its data. You aren't attempting to share devices/partitions for OSD storage as well, are you? What is your cluster configuration? Any idea? Thanks -Original Message- From: Dan Mick [mailto:dan.m...@inktank.com] Sent: Monday, August 12, 2013 5:50 PM To: Joshua Young Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Start Stop OSD On 08/12/2013 04:49 AM, Joshua Young wrote: I have 2 issues that I can not find a solution to. First: I am unable to stop / start any osd by command. I have deployed with ceph-deploy on Ubuntu 13.04 and everything seems to be working find. I have 5 hosts 5 mons and 20 osds. Using initctl list | grep ceph gives me ceph-osd (ceph/15) start/running, process 2122 The fact that only one is output means
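The lock_fsid error means something still holds an exclusive lock on that OSD's fsid file - typically a ceph-osd process that upstart has lost track of, or a duplicate mount. Some generic commands (illustrative, not specific to this cluster) that should show who owns it:

ps aux | grep ceph-osd                       # is a daemon for osd.15 still alive?
lsof /var/lib/ceph/osd/ceph-15/fsid          # which pid holds the lock file open?
mount | grep ceph-15                         # is the data partition mounted more than once?
stop ceph-osd id=15                          # upstart: stop an orphaned daemon, or kill <pid> as a last resort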
Re: [ceph-users] Ceph instead of RAID
Hi, I'd just like to echo what Wolfgang said about ceph being a complex system. I initially started out testing ceph with a setup much like yours. And while it overall performed ok, it was not as good as sw raid on the same machine. Also, as Mark said you'll have at very best half write speeds because of how the journaling works if you do larger continuous writes. Ceph really shines with multiple servers multiple concurrency. My testmachine was running for ½ a year+ (going from argonaut - cuttlefish) and in that process I came to realize that mixing types of disk (and size) was a bad idea (some enterprise SATA, some fast desktop and some green disks) - as speed will be determined by the slowest drive in your setup (that's why they're advocating using similar hw if at all possible I guess). I also experienced all the challenging issues having to deal with a very young technology; osds suddenly refusing to start, pg's going into various incomplete/down/inconsistent states, monitor leveldb running full, monitor dying at weird times and well - I think it is good for a learning experience, but like Wolfgang said I think it is too much hassle for too little gain when you have something like raid10/zfs around. But, by all means, don't let us discourage you if you want to go this route - ceph's unique self-healing ability was what drew me into running a single machine in the first place. Cheers, Martin On Tue, Aug 13, 2013 at 9:32 AM, Wolfgang Hennerbichler wolfgang.hennerbich...@risc-software.at wrote: On 08/13/2013 09:23 AM, Jeffrey 'jf' Lim wrote: Anyway, I thought what if instead of RAID-10 I use ceph? All 6 disks will be local, so I could simply create 6 local OSDs + a monitor, right? Is there anything I need to watch out for in such configuration? You can do that. Although it's nice to play with and everything, I wouldn't recommend doing it. It will give you more pain than pleasure. How so? Care to elaborate? Ceph is a complex system, built for clusters. It does some stuff in software that is otherwhise done in hardware (raid controllers). The nature of the complexity of a cluster system is a lot of overhead compared to a local raid [whatever] system, and latency of disk i/o will naturally suffer a bit. An OSD needs about 300 MB of RAM (may vary on your PGs), times 6 is a waste of nearly 2 GB of RAM (compared to a local RAID). Also ceph is young, and it does indeed have some bugs. RAID is old, and very mature. Although I rely on ceph on a productive cluster, too, it is way harder to maintain than a simple local raid. When a disk fails in ceph you don't have to worry about your data, which is a good thing, but you have to worry about the rebuilding (which isn't too hard, but at least you need to know SOMETHING about ceph), with (hardware) RAID you simply replace the disk, and it will be rebuilt. Others will find more reasons why this is not the best idea for a production system. Don't get me wrong, I'm a big supporter of ceph, but only for clusters, not for single systems. wogri -jf -- He who settles on the idea of the intelligent man as a static entity only shows himself to be a fool. Every nonfree program has a lord, a master -- and if you use the program, he is your master. 
--Richard Stallman -- DI (FH) Wolfgang Hennerbichler Software Development Unit Advanced Computing Technologies RISC Software GmbH A company of the Johannes Kepler University Linz IT-Center Softwarepark 35 4232 Hagenberg Austria Phone: +43 7236 3343 245 Fax: +43 7236 3343 250 wolfgang.hennerbich...@risc-software.at http://www.risc-software.at ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] one pg stuck with 2 unfound pieces
You can run 'ceph pg 0.cfa mark_unfound_lost revert'. (Revert Lost section of http://ceph.com/docs/master/rados/operations/placement-groups/). -Sam On Tue, Aug 13, 2013 at 6:50 AM, Jens-Christian Fischer jens-christian.fisc...@switch.ch wrote: We have a cluster with 10 servers, 64 OSDs and 5 Mons on them. The OSDs are 3TB disk, formatted with btrfs and the servers are either on Ubuntu 12.10 or 13.04. Recently one of the servers (13.04) stood still (due to problems with btrfs - something we have seen a few times). I decided to not try to recover the disks, but reformat them with XFS. I removed the OSDs, reformatted, and re-created them (they got the same OSD numbers) I redid this twice (because I wrongly partioned the disks in the first place) and I ended up with 2 unfound pieces in one pg: root@s2:~# ceph health details HEALTH_WARN 1 pgs degraded; 1 pgs recovering; 1 pgs stuck unclean; recovery 4448/28915270 degraded (0.015%); 2/9854766 unfound (0.000%) pg 0.cfa is stuck unclean for 1004252.309704, current state active+recovering+degraded+remapped, last acting [23,50] pg 0.cfa is active+recovering+degraded+remapped, acting [23,50], 2 unfound recovery 4448/28915270 degraded (0.015%); 2/9854766 unfound (0.000%) root@s2:~# ceph pg 0.cfa query { state: active+recovering+degraded+remapped, epoch: 28197, up: [ 23, 50, 18], acting: [ 23, 50], info: { pgid: 0.cfa, last_update: 28082'7774, last_complete: 23686'7083, log_tail: 14360'4061, last_backfill: MAX, purged_snaps: [], history: { epoch_created: 1, last_epoch_started: 28197, last_epoch_clean: 24810, last_epoch_split: 0, same_up_since: 28195, same_interval_since: 28196, same_primary_since: 26036, last_scrub: 20585'6801, last_scrub_stamp: 2013-07-28 15:40:53.298786, last_deep_scrub: 20585'6801, last_deep_scrub_stamp: 2013-07-28 15:40:53.298786, last_clean_scrub_stamp: 2013-07-28 15:40:53.298786}, stats: { version: 28082'7774, reported: 28197'41950, state: active+recovering+degraded+remapped, last_fresh: 2013-08-13 14:34:33.057271, last_change: 2013-08-13 14:34:33.057271, last_active: 2013-08-13 14:34:33.057271, last_clean: 2013-08-01 23:50:18.414082, last_became_active: 2013-05-29 13:10:51.366237, last_unstale: 2013-08-13 14:34:33.057271, mapping_epoch: 28195, log_start: 14360'4061, ondisk_log_start: 14360'4061, created: 1, last_epoch_clean: 24810, parent: 0.0, parent_split_bits: 0, last_scrub: 20585'6801, last_scrub_stamp: 2013-07-28 15:40:53.298786, last_deep_scrub: 20585'6801, last_deep_scrub_stamp: 2013-07-28 15:40:53.298786, last_clean_scrub_stamp: 2013-07-28 15:40:53.298786, log_size: 0, ondisk_log_size: 0, stats_invalid: 0, stat_sum: { num_bytes: 145307402, num_objects: 2234, num_object_clones: 0, num_object_copies: 0, num_objects_missing_on_primary: 0, num_objects_degraded: 0, num_objects_unfound: 0, num_read: 744, num_read_kb: 410184, num_write: 7774, num_write_kb: 1155438, num_scrub_errors: 0, num_shallow_scrub_errors: 0, num_deep_scrub_errors: 0, num_objects_recovered: 3998, num_bytes_recovered: 278803622, num_keys_recovered: 0}, stat_cat_sum: {}, up: [ 23, 50, 18], acting: [ 23, 50]}, empty: 0, dne: 0, incomplete: 0, last_epoch_started: 28197}, recovery_state: [ { name: Started\/Primary\/Active, enter_time: 2013-08-13 14:34:33.026698, might_have_unfound: [ { osd: 9, status: querying}, { osd: 18, status: querying}, { osd: 50, status: already probed}], recovery_progress: { backfill_target: 50, waiting_on_backfill: 0, backfill_pos: 96220cfa\/1799e82.\/head\/\/0, backfill_info: { begin: 0\/\/0\/\/-1, end: 0\/\/0\/\/-1, objects: []}, 
peer_backfill_info: { begin: 0\/\/0\/\/-1, end: 0\/\/0\/\/-1, objects: []}, backfills_in_flight: [], pull_from_peer: [], pushing: []}, scrub: { scrubber.epoch_start: 0,
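Before giving the objects up, it can be worth listing exactly what is unfound; a possible sequence (pg id taken from this thread):

ceph pg 0.cfa list_missing
ceph pg 0.cfa mark_unfound_lost revert

list_missing shows the missing object names and which OSDs were probed; revert then rolls each unfound object back to its previous version (or forgets it entirely if it was new), so whatever was written only to the lost btrfs OSDs is abandoned.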
Re: [ceph-users] pgs stuck unclean -- how to fix? (fwd)
Cool! -Sam On Tue, Aug 13, 2013 at 4:49 AM, Jeff Moskow j...@rtr.com wrote: Sam, Thanks that did it :-) health HEALTH_OK monmap e17: 5 mons at {a=172.16.170.1:6789/0,b=172.16.170.2:6789/0,c=172.16.170.3:6789/0,d=172.16.170.4:6789/0,e=172.16.170.5:6789/0}, election epoch 9794, quorum 0,1,2,3,4 a,b,c,d,e osdmap e23445: 14 osds: 13 up, 13 in pgmap v13552855: 2102 pgs: 2102 active+clean; 531 GB data, 1564 GB used, 9350 GB / 10914 GB avail; 13104KB/s rd, 4007KB/s wr, 560op/s mdsmap e3: 0/0/1 up -- ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] basic single node set up issue on rhel6
Hi Sijo
On Mon, Aug 12, 2013 at 12:26 PM, Mathew, Sijo (KFRM 1) sijo.mat...@credit-suisse.com wrote:
Hi,
I have been trying to get ceph installed on a single node. But I'm stuck with the following error.
[host]$ ceph-deploy -v mon create ceph-server-299
Deploying mon, cluster ceph hosts ceph-server-299
Deploying mon to ceph-server-299
Distro RedHatEnterpriseServer codename Santiago, will use sysvinit
Traceback (most recent call last):
  File "/usr/bin/ceph-deploy", line 21, in <module>
    main()
  File "/usr/lib/python2.6/site-packages/ceph_deploy/cli.py", line 112, in main
    return args.func(args)
  File "/usr/lib/python2.6/site-packages/ceph_deploy/mon.py", line 234, in mon
    mon_create(args)
  File "/usr/lib/python2.6/site-packages/ceph_deploy/mon.py", line 138, in mon_create
    init=init,
  File "/usr/lib/python2.6/site-packages/pushy/protocol/proxy.py", line 255, in <lambda>
    (conn.operator(type_, self, args, kwargs))
  File "/usr/lib/python2.6/site-packages/pushy/protocol/connection.py", line 66, in operator
    return self.send_request(type_, (object, args, kwargs))
  File "/usr/lib/python2.6/site-packages/pushy/protocol/baseconnection.py", line 323, in send_request
    return self.__handle(m)
  File "/usr/lib/python2.6/site-packages/pushy/protocol/baseconnection.py", line 639, in __handle
    raise e
pushy.protocol.proxy.ExceptionProxy: [Errno 2] No such file or directory
This looks like a very old version of ceph-deploy. Can you attempt to install a newer version? We released version 1.2 to the Python Package Index and to our repos here: http://ceph.com/packages/ceph-extras/rpm/ If you are familiar with Python install tools you could simply do: `sudo pip install ceph-deploy`, otherwise could you try with the RPM packages? But, you mention the lack of internet connection, so that would mean that for `pip` it would be quite the headache to meet all of ceph-deploy's dependencies. Can you try with the RPMs for version 1.2 and run again? 1.2 had a massive amount of bug fixes and it includes much better logging output. Once you do, paste back the output here so I can take a look.
I saw a similar thread in the archives, but the solution given there doesn't seem to be that clear.
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2013-June/002344.html
I had to install all the rpms separately as the machines that I work with don't have internet access and "ceph-deploy install" needs internet access. Could someone suggest what might be wrong here?
Environment: RHEL 6.4, ceph 0.61
Thanks, Sijo Mathew
== Please access the attached hyperlink for an important electronic communications disclaimer: http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html ==
___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
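For an air-gapped RHEL host, one workable approach (a sketch only; the exact RPM file name is hypothetical) is to download the ceph-deploy 1.2 package and its few Python dependencies from the ceph-extras repo above on a connected machine, copy them across, and install locally:

yum localinstall ./ceph-deploy-1.2*.noarch.rpm    # resolves remaining deps from local repos/media
# or, if no usable yum repos exist: rpm -Uvh ./ceph-deploy-1.2*.noarch.rpm plus its dependency RPMs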
Re: [ceph-users] Why is my mon store.db is 220GB?
I built the wip-monstore-copy branch with './configure --with-rest-bench --with-debug' and 'make'. It worked and I get all the usual stuff but ceph-monstore-tool is missing. I see code in ./src/tools/. Did I miss something? ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] ceph-deploy and journal on separate disk
On Tue, Aug 13, 2013 at 3:21 AM, Pavel Timoschenkov pa...@bayonetteas.onmicrosoft.com wrote: Hi. Yes, i'm zapped all disks before. More about my situation: sdaa - one of disk for data: 3 TB with GPT partition table. sda - ssd drive with manual created partitions (10 GB) for journal with MBR partition table. === fdisk -l /dev/sda Disk /dev/sda: 480.1 GB, 480103981056 bytes 255 heads, 63 sectors/track, 58369 cylinders, total 937703088 sectors Units = sectors of 1 * 512 = 512 bytes Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes Disk identifier: 0x00033624 Device Boot Start End Blocks Id System /dev/sda1204819531775 9764864 83 Linux /dev/sda21953177639061503 9764864 83 Linux /dev/sda33906150458593279 9765888 83 Linux /dev/sda47812505697656831 9765888 83 Linux === If i'm executed ceph-deploy osd prepare without journal options - it's ok: ceph@ceph-admin:~$ ceph-deploy disk zap ceph001:sdaa ceph001:sda1 [ceph_deploy.osd][DEBUG ] zapping /dev/sdaa on ceph001 [ceph_deploy.osd][DEBUG ] zapping /dev/sda1 on ceph001 ceph@ceph-admin:~$ ceph-deploy osd prepare ceph001:sdaa [ceph_deploy.osd][DEBUG ] Preparing cluster ceph disks ceph001:/dev/sdaa: [ceph_deploy.osd][DEBUG ] Deploying osd to ceph001 [ceph_deploy.osd][DEBUG ] Host ceph001 is now ready for osd use. [ceph_deploy.osd][DEBUG ] Preparing host ceph001 disk /dev/sdaa journal None activate False root@ceph001:~# gdisk -l /dev/sdaa GPT fdisk (gdisk) version 0.8.1 Partition table scan: MBR: protective BSD: not present APM: not present GPT: present Found valid GPT with protective MBR; using GPT. Disk /dev/sdaa: 5860533168 sectors, 2.7 TiB Logical sector size: 512 bytes Disk identifier (GUID): 575ACF17-756D-47EC-828B-2E0A0B8ED757 Partition table holds up to 128 entries First usable sector is 34, last usable sector is 5860533134 Partitions will be aligned on 2048-sector boundaries Total free space is 4061 sectors (2.0 MiB) Number Start (sector)End (sector) Size Code Name 1 2099200 5860533134 2.7 TiB ceph data 22048 2097152 1023.0 MiB ceph journal Problems start, when i'm try create journal on separate drive: ceph@ceph-admin:~$ ceph-deploy disk zap ceph001:sdaa ceph001:sda1 [ceph_deploy.osd][DEBUG ] zapping /dev/sdaa on ceph001 [ceph_deploy.osd][DEBUG ] zapping /dev/sda1 on ceph001 ceph@ceph-admin:~$ ceph-deploy osd prepare ceph001:sdaa:sda1 [ceph_deploy.osd][DEBUG ] Preparing cluster ceph disks ceph001:/dev/sdaa:/dev/sda1 [ceph_deploy.osd][DEBUG ] Deploying osd to ceph001 [ceph_deploy.osd][DEBUG ] Host ceph001 is now ready for osd use. [ceph_deploy.osd][DEBUG ] Preparing host ceph001 disk /dev/sdaa journal /dev/sda1 activate False [ceph_deploy.osd][ERROR ] ceph-disk-prepare -- /dev/sdaa /dev/sda1 returned 1 Information: Moved requested sector from 34 to 2048 in order to align on 2048-sector boundaries. The operation has completed successfully. meta-data=/dev/sdaa1 isize=2048 agcount=32, agsize=22892700 blks = sectsz=512 attr=2, projid32bit=0 data = bsize=4096 blocks=732566385, imaxpct=5 = sunit=0 swidth=0 blks naming =version 2 bsize=4096 ascii-ci=0 log =internal log bsize=4096 blocks=357698, version=2 = sectsz=512 sunit=0 blks, lazy-count=1 realtime =none extsz=4096 blocks=0, rtextents=0 WARNING:ceph-disk:OSD will not be hot-swappable if journal is not the same device as the osd data mount: /dev/sdaa1: more filesystems detected. This should not happen, use -t type to explicitly specify the filesystem type or use wipefs(8) to clean up the device. 
mount: you must specify the filesystem type ceph-disk: Mounting filesystem failed: Command '['mount', '-o', 'noatime', '--', '/dev/sdaa1', '/var/lib/ceph/tmp/mnt.fZQxiz']' returned non-zero exit status 32 ceph-deploy: Failed to create 1 OSDs It looks like at some point the filesystem is not passed to the options. Would you mind running the `ceph-disk-prepare` command again but with the --verbose flag? I think that from the output above (correct it if I am mistaken) that would be something like: ceph-disk-prepare --verbose -- /dev/sdaa /dev/sda1 And paste the results back so we can take a look? -Original Message- From: Samuel Just [mailto:sam.j...@inktank.com] Sent: Monday, August 12, 2013 11:39 PM To: Pavel Timoschenkov Cc: ceph-us...@ceph.com Subject: Re: [ceph-users] ceph-deploy and journal on separate disk Did you try using ceph-deploy disk zap ceph001:sdaa first? -Sam On Mon, Aug 12, 2013
Re: [ceph-users] Why is my mon store.db is 220GB?
Hmm. This sounds very similar to the problem I reported (with debug-mon = 20 and debug ms = 1 logs as of today) on our support site (ticket #438) - Sage, please take a look. On Mon, Aug 12, 2013 at 9:49 PM, Sage Weil s...@inktank.com wrote: On Mon, 12 Aug 2013, Jeppesen, Nelson wrote: Joao, (log file uploaded to http://pastebin.com/Ufrxn6fZ) I had some good luck and some bad luck. I copied the store.db to a new monitor, injected a modified monmap and started it up (This is all on the same host.) Very quickly it reached quorum (as far as I can tell) but didn't respond. Running 'ceph -w' just hung, no timeouts or errors. Same thing when restarting an OSD. The last lines of the log file '...ms_verify_authorizer..' are from 'ceph -w' attempts. I restarted everything again and it sat there synchronizing. IO stat reported about 100MB/s, but just reads. I let it sit there for 7 min but nothing happened. Can you do this again with --debug-mon 20 --debug-ms 1? It looks as though the main dispatch thread is blocked (7f71a1aa5700 does nothing after winning the election). It would also be helpful to gdb attach to the running ceph-mon and capture the output from 'thread apply all bt'. Side question, how long can a ceph cluster run without a monitor? I was able to upload files via rados gateway without issue even when the monitor was down. Quite a while, as long as no new processes need to authenticate, and no nodes go up or down. Eventually the authentication keys are going to time out, though (1 hour is the default). sage ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Why is my mon store.db is 220GB?
Never mind, I removed --with-rest-bench and it worked. I built the wip-monstore-copy branch with './configure --with-rest-bench --with-debug' and 'make'. It worked and I get all the usual stuff but ceph- monstore-tool is missing. I see code in ./src/tools/. Did I miss something? ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] Designing an application with Ceph
Hi, I am planning to use Ceph as database storage for a webmail client/server application, and I am thinking of storing the data as key/value pairs instead of using any RDBMS, for speed. The webmail will manage companies, and each company will have many users; users will send/receive emails and store them in their inboxes, kind of like Gmail, but per company. The server will be developed in C, the client code in HTML/Javascript, and the binary client (standalone app) in C++. So, my question is, how would you recommend I design the backend? I have thought of these choices: 1. Use Ceph as a filesystem and BerkeleyDB as the database engine. Berkeley DB uses 2 files per table, so I will have 1 directory per company and 2 files per table; I think there will be no more than 20 tables in my whole app. Ceph will be used here as a remote filesystem and BerkeleyDB will do all the data organization. The RADOS interface of Ceph (to store key/value pairs) will not be used, since Berkeley DB will write and read to the OSDs directly and Berkeley DB is a key/value pair database. But I have never used a DB on a remote filesystem, so I am not sure if it will work well. Advantages of this architecture: quick and easy. Disadvantages: lower performance (overhead in CephFS and BerkeleyDB), and I will not be able to write plugins for RADOS in C++ to combine many data modifications in a single call to the server. 2. Use the librados C API and write all the 'queries' hardcoded in C specifically for the application. Since the application is pretty standard and is not supposed to change much, I can do this. I would create a RADOS object for each application object (for example a 'user' record, an 'email' record, a 'chat message' record, etc.). Advantages: high performance. Disadvantages: a bit more to code, especially the data search functions. I am interested in performance, so I am thinking of going for option 2, what do you think? Can RADOS fully replace a database engine? (I mean a NoSQL engine, like Berkeley for example.) Will appreciate very much your comments. TIA Nulik ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
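On option 2, it may help to prototype the access pattern with the rados CLI before writing any C: the object name acts as the key and the object body as the value, which is exactly what the librados C API (rados_write_full, rados_read and friends) exposes programmatically. A rough sketch, with the pool name, PG count and object naming purely illustrative:

ceph osd pool create webmail 128
echo "subject=hello" > /tmp/email.12345
rados -p webmail put email:12345 /tmp/email.12345    # write the "record"
rados -p webmail get email:12345 /tmp/readback       # read it back
rados -p webmail ls                                  # enumerate objects in the pool

What RADOS does not give you is secondary indexes or ad-hoc queries: lookups are strictly by object name, so searches either have to be modelled as extra index objects you maintain yourself or pushed server-side as RADOS class plugins. For anything query-heavy, a conventional database in front of RADOS may still end up being the simpler design.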
Re: [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process, [Qemu-devel] [Bug 1207686]
On Mon, 5 Aug 2013, Mike Dawson wrote: Josh, Logs are uploaded to cephdrop with the file name mikedawson-rbd-qemu-deadlock. - At about 2013-08-05 19:46 or 47, we hit the issue, traffic went to 0 - At about 2013-08-05 19:53:51, ran a 'virsh screenshot' Environment is: - Ceph 0.61.7 (client is co-mingled with three OSDs) - rbd cache = true and cache=writeback - qemu 1.4.0 1.4.0+dfsg-1expubuntu4 - Ubuntu Raring with 3.8.0-25-generic This issue is reproducible in my environment, and I'm willing to run any wip branch you need. What else can I provide to help? This looks like a different issue than Oliver's. I see one anomaly in the log, where a rbd io completion is triggered a second time for no apparent reason. I opened a separate bug http://tracker.ceph.com/issues/5955 and pushed wip-5955 that will hopefully shine some light on the weird behavior I saw. Can you reproduce with this branch and debug objectcacher = 20 debug ms = 1 debug rbd = 20 debug finisher = 20 Thanks! sage Thanks, Mike Dawson On 8/5/2013 3:48 AM, Stefan Hajnoczi wrote: On Sun, Aug 04, 2013 at 03:36:52PM +0200, Oliver Francke wrote: Am 02.08.2013 um 23:47 schrieb Mike Dawson mike.daw...@cloudapt.com: We can un-wedge the guest by opening a NoVNC session or running a 'virsh screenshot' command. After that, the guest resumes and runs as expected. At that point we can examine the guest. Each time we'll see: If virsh screenshot works then this confirms that QEMU itself is still responding. Its main loop cannot be blocked since it was able to process the screendump command. This supports Josh's theory that a callback is not being invoked. The virtio-blk I/O request would be left in a pending state. Now here is where the behavior varies between configurations: On a Windows guest with 1 vCPU, you may see the symptom that the guest no longer responds to ping. On a Linux guest with multiple vCPUs, you may see the hung task message from the guest kernel because other vCPUs are still making progress. Just the vCPU that issued the I/O request and whose task is in UNINTERRUPTIBLE state would really be stuck. Basically, the symptoms depend not just on how QEMU is behaving but also on the guest kernel and how many vCPUs you have configured. I think this can explain how both problems you are observing, Oliver and Mike, are a result of the same bug. At least I hope they are :). Stefan ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
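For reference, one way to switch on the logging requested above is to add the settings to the [client] section of ceph.conf on the qemu host and then restart the guest. A minimal sketch follows; the log file path is an arbitrary choice, and you may prefer to edit the file by hand:
===
cat >> /etc/ceph/ceph.conf <<'EOF'
[client]
    debug objectcacher = 20
    debug ms = 1
    debug rbd = 20
    debug finisher = 20
    log file = /var/log/ceph/qemu-rbd-client.log
EOF
===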
Re: [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process, [Qemu-devel] [Bug 1207686]
Hi Oliver, (Posted this on the bug too, but:) Your last log revealed a bug in the librados aio flush. A fix is pushed to wip-librados-aio-flush (bobtail) and wip-5919 (master); can you retest please (with caching off again)? Thanks! sage On Fri, 9 Aug 2013, Oliver Francke wrote: Hi Josh, just opened http://tracker.ceph.com/issues/5919 with all collected information incl. debug-log. Hope it helps, Oliver. On 08/08/2013 07:01 PM, Josh Durgin wrote: On 08/08/2013 05:40 AM, Oliver Francke wrote: Hi Josh, I have a session logged with: debug_ms=1:debug_rbd=20:debug_objectcacher=30 as you requested from Mike, even if I think, we do have another story here, anyway. Host-kernel is: 3.10.0-rc7, qemu-client 1.6.0-rc2, client-kernel is 3.2.0-51-amd... Do you want me to open a ticket for that stuff? I have about 5MB compressed logfile waiting for you ;) Yes, that'd be great. If you could include the time when you saw the guest hang that'd be ideal. I'm not sure if this is one or two bugs, but it seems likely it's a bug in rbd and not qemu. Thanks! Josh Thnx in advance, Oliver. On 08/05/2013 09:48 AM, Stefan Hajnoczi wrote: On Sun, Aug 04, 2013 at 03:36:52PM +0200, Oliver Francke wrote: Am 02.08.2013 um 23:47 schrieb Mike Dawson mike.daw...@cloudapt.com: We can un-wedge the guest by opening a NoVNC session or running a 'virsh screenshot' command. After that, the guest resumes and runs as expected. At that point we can examine the guest. Each time we'll see: If virsh screenshot works then this confirms that QEMU itself is still responding. Its main loop cannot be blocked since it was able to process the screendump command. This supports Josh's theory that a callback is not being invoked. The virtio-blk I/O request would be left in a pending state. Now here is where the behavior varies between configurations: On a Windows guest with 1 vCPU, you may see the symptom that the guest no longer responds to ping. On a Linux guest with multiple vCPUs, you may see the hung task message from the guest kernel because other vCPUs are still making progress. Just the vCPU that issued the I/O request and whose task is in UNINTERRUPTIBLE state would really be stuck. Basically, the symptoms depend not just on how QEMU is behaving but also on the guest kernel and how many vCPUs you have configured. I think this can explain how both problems you are observing, Oliver and Mike, are a result of the same bug. At least I hope they are :). Stefan -- Oliver Francke filoo GmbH Moltkestra?e 25a 0 G?tersloh HRB4355 AG G?tersloh Gesch?ftsf?hrer: J.Rehp?hler | C.Kunz Folgen Sie uns auf Twitter: http://twitter.com/filoogmbh ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
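To be explicit about the "caching off again" part, the retest presumably means running the guest with the librbd cache disabled. A sketch of the two usual knobs (the image spec on the qemu command line is made up for illustration):
===
# ceph.conf on the qemu host
cat >> /etc/ceph/ceph.conf <<'EOF'
[client]
    rbd cache = false
EOF

# and/or make sure the guest's drive is not using writeback, e.g.
#   -drive file=rbd:rbd/vm-disk,if=virtio,cache=none
===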
Re: [ceph-users] Why is my mon store.db is 220GB?
Joao, ceph-monstore-tool --mon-store-path /var/lib/ceph/mon/ceph-2 --out /var/lib/ceph/mon/ceph-1 --command store-copy is running now. It hit 52MB very quickly, then nothing more, just lots of disk reads, which is what I'd expect. It's reading fast and I expect it to finish in 35min. Just to make sure, this won't add a new monitor, just clean it up? So, when it's done I should do the following: mv /var/lib/ceph/mon/ceph-2 /var/lib/ceph/mon/ceph-2.old mv /var/lib/ceph/mon/ceph-1 /var/lib/ceph/mon/ceph-2 service ceph start mon.2 ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Why is my mon store.db is 220GB?
On 13/08/13 14:46, Jeppesen, Nelson wrote: Joao, ceph-monstore-tool --mon-store-path /var/lib/ceph/mon/ceph-2 --out /var/lib/ceph/mon/ceph-1 --command store-copy is running now. It hit 52MB very quickly then nothing with lots of disk read, which is what I’d expect. Its reading fast and expect it to finish in 35min. Just to make sure, this won’t add a new monitor, just clean it up. So, when it’s done I should do the following: mv /var/lib/ceph/mon/ceph-2 /var/lib/ceph/mon/ceph-2.old mv /var/lib/ceph/mon/ceph-1 /var/lib/ceph/mon/ceph-2 service ceph start mon.2 Correct. The tool just extracts whatever is on one mon store and copies it to another store. The contents should be the same and the monitor should come back to life as if nothing had happened. If for some reason that is not the case, you'll still have the original store readily to be used. Let me know if that happens and I'll be happy to help. -Joao -- Joao Eduardo Luis Software Engineer | http://inktank.com | http://ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Why is my mon store.db is 220GB?
On 13/08/13 14:46, Jeppesen, Nelson wrote: Joao, ceph-monstore-tool --mon-store-path /var/lib/ceph/mon/ceph-2 --out /var/lib/ceph/mon/ceph-1 --command store-copy is running now. It hit 52MB very quickly then nothing with lots of disk read, which is what I’d expect. Its reading fast and expect it to finish in 35min. Just to make sure, this won’t add a new monitor, just clean it up. So, when it’s done I should do the following: mv /var/lib/ceph/mon/ceph-2 /var/lib/ceph/mon/ceph-2.old mv /var/lib/ceph/mon/ceph-1 /var/lib/ceph/mon/ceph-2 service ceph start mon.2 Sage pointed out that you'll also need to copy the 'keyring' file from the original mon data dir to the new mon data dir. So that would be 'cp /var/lib/ceph/mon/ceph-2/keyring /var/lib/ceph/mon/ceph-1/' You should be good to go then. -Joao -- Joao Eduardo Luis Software Engineer | http://inktank.com | http://ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
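Putting the whole procedure from this thread in one place, the swap looks roughly like this. It is a sketch using the paths and mon id from the messages above; stopping the monitor before copying is an extra precaution on my part, not something stated in the thread:
===
service ceph stop mon.2
ceph-monstore-tool --mon-store-path /var/lib/ceph/mon/ceph-2 \
    --out /var/lib/ceph/mon/ceph-1 --command store-copy
cp /var/lib/ceph/mon/ceph-2/keyring /var/lib/ceph/mon/ceph-1/
mv /var/lib/ceph/mon/ceph-2 /var/lib/ceph/mon/ceph-2.old
mv /var/lib/ceph/mon/ceph-1 /var/lib/ceph/mon/ceph-2
service ceph start mon.2
===
The old store stays around as ceph-2.old until you are happy the monitor is healthy again.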
Re: [ceph-users] Designing an application with Ceph
2 is certainly an intriguing option. RADOS isn't really a database engine (even a nosql one), but should be able to serve your needs here. Have you seen the omap api available in librados? It allows you to efficiently store key/value pairs attached to a librados object (uses leveldb on the OSDs to actually handle the key/value mapping). One caveat is that the C api is somewhat less complete than the C++ api. That would be pretty easily remedied if there were demand though. -Sam On Tue, Aug 13, 2013 at 2:01 PM, Nulik Nol nulik...@gmail.com wrote: Hi, I am planning to use Ceph as a database storage for a webmail client/server application, and I am thinking to store the data as key/value pair instead of using any RDBMSs, for speed. The webmail will manage companies, and each company will have many users, users will end/receive emails and store them in their inboxes, kind of like Gmail, but per company. The server will be developed in C, client code in HTML/Javascript and binary client (standalone app) in C++ So, my question is, how would you recommend me to design the backend ? I have thought of these choices: 1. Use Ceph as filesystem and BerkeleyDB as the database engine. Berekley DB uses 2 files per table, so I will have 1 directory per company and a 2 files per each table, I think there will be no more than 20 tables in my whole app. Ceph will be used here as a remote filesystem where BerkeleyDB will do all the data organization. The RADOS interface of Ceph (to store key/pair values) will be not used, since Berkeley DB will write and read to the OSDs directly and Berkeley DB is a key/value pair database. But I have never used a DB one a remote filesystem not sure if it will work well. Advantages of this architecture: quick easy. Disadvantages: lower performance (overhead in CephFS and BerkeleyDB), also I will not be able to write plugins for RADOS in C++ to combine many data modifications in a single call to the server. 2. Use librados C api and write all the 'queries' hardcoded in C specifically for the application. Since the application is pretty standard and is not supposed to change much, I can do this. I would create a RADOS object for each application object (like for example 'user' record, 'email' record, 'chat message' record, etc...). Advantages: high performance. Disadvantages: a bit more to code , specially the data search functions. I am interested in performance, so I am thinking to go for the option 2, what do you think? Can RADOS fully replace a database engine ? (I mean, NoSQL engine, like Berkeley for example) Will appreciate very much your comments. TIA Nulik ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
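If you want to play with omap before committing to the librados API, the rados CLI exposes the same key/value machinery, which makes it easy to sketch the data model. The pool and object names below are invented for illustration, and a rados binary new enough to have the omap subcommands is assumed:
===
rados mkpool webmail
# one object per mailbox; each message becomes an omap key/value pair on it
rados -p webmail create company1.user42.inbox
rados -p webmail setomapval company1.user42.inbox msg-0001 "raw message or a pointer to it"
rados -p webmail listomapvals company1.user42.inbox
rados -p webmail getomapval company1.user42.inbox msg-0001
===
The real application would do the equivalent through the librados omap calls (C or C++) so that reads, writes, and searches can be batched per object.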
Re: [ceph-users] Why is my mon store.db is 220GB?
Success! It was pretty quick too, maybe 20-30min. It’s now at 100MB. In a matter of minutes I was able to add two monitors and now I’m back to three monitors. Thank you again, Joao and Sage! I can sleep at night now knowing that a single node won't take down the cluster anymore ☺ ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Why is my mon store.db is 220GB?
On 13/08/13 16:13, Jeppesen, Nelson wrote: Success! It was pretty quick too, maybe 20-30min. It’s now at 100MB. In a matter of min I was able to add two monitors and now I’m back to three monitors. Thank you again, Joao and Sage! I can sleep at night now knowing that a single node won't take down the cluster anymore ☺ Hooray! Glad to know everything worked out! :-) -Joao -- Joao Eduardo Luis Software Engineer | http://inktank.com | http://ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] rbd map issues: no such file or directory (ENOENT) AND map wrong image
On Aug 12, 2013, at 7:41 PM, Josh Durgin josh.dur...@inktank.com wrote: On 08/12/2013 07:18 PM, PJ wrote: If the target rbd device is only mapped on one virtual machine, format it as ext4 and mount it in two places: mount /dev/rbd0 /nfs -- for nfs server usage mount /dev/rbd0 /ftp -- for ftp server usage nfs and ftp servers run on the same virtual machine. Will the file system (ext4) help handle the simultaneous access from nfs and ftp? I doubt that'll work perfectly on a normal disk, although rbd should behave the same in this case. Consider what happens when the same files are modified at once by the ftp and nfs servers; there are going to be some issues. You could run ftp on an nfs client on a different machine safely. Modern Linux kernels will do a bind mount when a block device is mounted on 2 different directories. Think directory hard links. Simultaneous access will NOT corrupt ext4, but as Josh said, modifying the same file at once via ftp and nfs isn't going to produce good results. With file locking, 2 nfs clients could coordinate using advisory locking. David Zafman Senior Developer http://www.inktank.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
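To make the setup being discussed concrete, it is essentially the following (a sketch; the pool/image names and size are placeholders, and as noted above it is only safe as long as the nfs and ftp trees do not modify the same files at the same time):
===
rbd create datapool/shared-disk --size 102400   # 100 GB image; pool and name are examples
rbd map datapool/shared-disk                    # appears as e.g. /dev/rbd0
mkfs.ext4 /dev/rbd0
mkdir -p /nfs /ftp
mount /dev/rbd0 /nfs        # first mount
mount /dev/rbd0 /ftp        # second mount of the same device behaves like a bind mount
===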
Re: [ceph-users] basic single node set up issue on rhel6
On Tue, Aug 13, 2013 at 4:20 PM, Mathew, Sijo (KFRM 1) sijo.mat...@credit-suisse.com wrote: Hi, Installed ceph-deploy_1.2.1 via rpm but it looks like it needs pushy>=0.5.2, which I couldn’t find in the repository. Please advise. Can you try again? It seems we left the new pushy requirement out and that should be fixed. [host]$ ceph-deploy mon create ceph-server-299 Traceback (most recent call last): File /usr/bin/ceph-deploy, line 21, in <module> main() File /usr/lib/python2.6/site-packages/ceph_deploy/util/decorators.py, line 83, in newfunc return f(*a, **kw) File /usr/lib/python2.6/site-packages/ceph_deploy/cli.py, line 85, in main args = parse_args(args=args, namespace=namespace) File /usr/lib/python2.6/site-packages/ceph_deploy/cli.py, line 54, in parse_args for ep in pkg_resources.iter_entry_points('ceph_deploy.cli') File /usr/lib/python2.6/site-packages/pkg_resources.py, line 1947, in load if require: self.require(env, installer) File /usr/lib/python2.6/site-packages/pkg_resources.py, line 1960, in require working_set.resolve(self.dist.requires(self.extras),env,installer)) File /usr/lib/python2.6/site-packages/pkg_resources.py, line 550, in resolve raise VersionConflict(dist,req) # XXX put more info here pkg_resources.VersionConflict: (pushy 0.5.1 (/usr/lib/python2.6/site-packages), Requirement.parse('pushy>=0.5.2')) Thanks, Sijo Mathew From: Alfredo Deza [mailto:alfredo.d...@inktank.com] Sent: Tuesday, August 13, 2013 3:33 PM To: Mathew, Sijo (KFRM 1) Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] basic single node set up issue on rhel6 Hi Sijo On Mon, Aug 12, 2013 at 12:26 PM, Mathew, Sijo (KFRM 1) sijo.mat...@credit-suisse.com wrote: Hi, I have been trying to get ceph installed on a single node. But I’m stuck with the following error. [host]$ ceph-deploy -v mon create ceph-server-299 Deploying mon, cluster ceph hosts ceph-server-299 Deploying mon to ceph-server-299 Distro RedHatEnterpriseServer codename Santiago, will use sysvinit Traceback (most recent call last): File /usr/bin/ceph-deploy, line 21, in <module> main() File /usr/lib/python2.6/site-packages/ceph_deploy/cli.py, line 112, in main return args.func(args) File /usr/lib/python2.6/site-packages/ceph_deploy/mon.py, line 234, in mon mon_create(args) File /usr/lib/python2.6/site-packages/ceph_deploy/mon.py, line 138, in mon_create init=init, File /usr/lib/python2.6/site-packages/pushy/protocol/proxy.py, line 255, in lambda (conn.operator(type_, self, args, kwargs)) File /usr/lib/python2.6/site-packages/pushy/protocol/connection.py, line 66, in operator return self.send_request(type_, (object, args, kwargs)) File /usr/lib/python2.6/site-packages/pushy/protocol/baseconnection.py, line 323, in send_request return self.__handle(m) File /usr/lib/python2.6/site-packages/pushy/protocol/baseconnection.py, line 639, in __handle raise e pushy.protocol.proxy.ExceptionProxy: [Errno 2] No such file or directory This looks like a very old version of ceph-deploy. Can you attempt to install a newer version? We released version 1.2 to the Python Package Index and to our repos here: http://ceph.com/packages/ceph-extras/rpm/ If you are familiar with Python install tools you could simply do: `sudo pip install ceph-deploy`, otherwise could you try with the RPM packages? But, you mention the lack of internet connection, so that would mean that for `pip` it would be quite the headache to meet all of ceph-deploy's dependencies.
Can you try with the RPMs for version 1.2 and run again? 1.2 had a massive amount of bug fixes and it includes much better logging output. Once you do, paste back the output here so I can take a look. I saw a similar thread in the archives, but the solution given there doesn’t seem to be that clear. http://lists.ceph.com/pipermail/ceph-users-ceph.com/2013-June/002344.html I had to install all the rpms separately as the machines that I work with don’t have internet access and “ceph-deploy install“ needs internet access. Could someone suggest what might be wrong here? Environment: RHEL 6.4, ceph 0.61 Thanks, Sijo Mathew == Please access the attached hyperlink for an important electronic communications disclaimer: http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html
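A quick way to confirm that pushy is the culprit and, where internet access (or a local mirror) is available, to bump it past 0.5.2; this is only a sketch, and the exact package layout on RHEL may differ:
===
# what does the python on the box actually have installed?
python -c "import pkg_resources; print pkg_resources.get_distribution('pushy').version"

# pull in a new enough pushy for ceph-deploy 1.2.1
sudo pip install --upgrade 'pushy>=0.5.2'
===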
[ceph-users] v0.67 Dumpling released
Another three months have gone by, and the next stable release of Ceph is ready: Dumpling! Thank you to everyone who has contributed to this release!

This release focuses on a few major themes since v0.61 (Cuttlefish):

* rgw: multi-site, multi-datacenter support for S3/Swift object storage
* new RESTful API endpoint for administering the cluster, based on a new and improved management API and updated CLI
* mon: stability and performance
* osd: stability & performance
* cephfs: open-by-ino support (for improved NFS reexport)
* improved support for Red Hat platforms
* use of the Intel CRC32c instruction when available

As with previous stable releases, you can upgrade from previous versions of Ceph without taking the entire cluster offline, as long as a few simple guidelines are followed.

* For Dumpling, we have tested upgrades from both Bobtail and Cuttlefish. If you are running Argonaut, please upgrade to Bobtail and then to Dumpling.
* Please upgrade daemons/hosts in the following order:
 1. Upgrade ceph-common on all nodes that will use the command line ceph utility.
 2. Upgrade all monitors (upgrade ceph package, restart ceph-mon daemons). This can happen one daemon or host at a time. Note that because cuttlefish and dumpling monitors can't talk to each other, all monitors should be upgraded in relatively short succession to minimize the risk that an untimely failure will reduce availability.
 3. Upgrade all osds (upgrade ceph package, restart ceph-osd daemons). This can happen one daemon or host at a time.
 4. Upgrade radosgw (upgrade radosgw package, restart radosgw daemons).

There are several small compatibility changes between Cuttlefish and Dumpling, particularly with the CLI interface. Please see the complete release notes for a summary of the changes since v0.66 and v0.61 Cuttlefish, and other possible issues that should be considered before upgrading: http://ceph.com/docs/master/release-notes/#v0-67-dumpling

Dumpling is the second Ceph release on our new three-month stable release cycle. We are very pleased to have pulled everything together on schedule. The next stable release, which will be code-named Emperor, is slated for three months from now (beginning of November).

You can download v0.67 Dumpling from the usual locations:

* Git at git://github.com/ceph/ceph.git
* Tarball at http://ceph.com/download/ceph-0.67.tar.gz
* For Debian/Ubuntu packages, see http://ceph.com/docs/master/install/debian
* For RPMs, see http://ceph.com/docs/master/install/rpm

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
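As an illustration of the upgrade order above, a rolling upgrade on a single Debian/Ubuntu node might look roughly like this. It is only a sketch: it assumes the dumpling apt repository is already configured, sysvinit-style service names, and example daemon ids (mon.a, osd.0) that will differ on your cluster:
===
apt-get update && apt-get install -y ceph-common   # 1. CLI/library bits on every node

apt-get install -y ceph                             # pulls in the dumpling daemons
service ceph restart mon.a                          # 2. monitors, one at a time, in quick succession
service ceph restart osd.0                          # 3. OSDs, one daemon or host at a time

apt-get install -y radosgw                          # 4. finally the gateway
service radosgw restart
===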