Re: [ceph-users] Rsync to object store
I used rclone [0] to sync from a filesystem to SWIFT. Although mine is a plain
SWIFT cluster, I'm sure it works with RGW/S3 as well.

[0] http://rclone.org/

2016-12-28 22:12 GMT+01:00 Robin H. Johnson:
> On Wed, Dec 28, 2016 at 09:31:57PM +0100, Marc Roos wrote:
>> Is it possible to rsync to the ceph object store with something like
>> this tool of amazon?
>> https://aws.amazon.com/customerapps/1771
> That's a service built on top of AWS EC2 that just happens to back
> storage into AWS S3.
>
> There's no fundamental reason it couldn't support Ceph RGW S3, but you'd
> need to contact the service provider and work out the details with them
> (like running their service close to your RGW instances).
>
> --
> Robin Hugh Johnson
> Gentoo Linux: Dev, Infra Lead, Foundation Trustee & Treasurer
> E-Mail : robb...@gentoo.org
> GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
> GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136

--
antonio.s.mess...@gmail.com
antonio.mess...@uzh.ch
+41 (0)44 635 42 22
S3IT: Service and Support for Science IT
http://www.s3it.uzh.ch/
University of Zurich
Winterthurerstrasse 190
CH-8057 Zurich Switzerland

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
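For reference, a minimal rclone remote for syncing into RGW over S3 might look like the fragment below. This is only a sketch: the remote name, endpoint, and credentials are placeholders, not values from this thread.

```ini
# ~/.config/rclone/rclone.conf -- hypothetical remote pointing at an RGW
# S3 endpoint; host and keys are placeholders.
[rgw]
type = s3
provider = Ceph
endpoint = http://rgw.example.com:7480
access_key_id = EXAMPLE_ACCESS_KEY
secret_access_key = EXAMPLE_SECRET_KEY
```

With that in place, something like `rclone sync /srv/data rgw:backup-bucket --checksum` does a one-way, rsync-style mirror of a local tree into a bucket.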
Re: [ceph-users] Check networking first?
On Mon, Aug 3, 2015 at 5:10 PM, Quentin Hartman qhart...@direwolfdigital.com wrote:
> The problem with this kind of monitoring is that there are so many
> possible metrics to watch and so many possible ways to watch them. For
> myself, I'm working on implementing a couple of things:
> - Watching error counters on servers
> - Watching error counters on switches
> - Watching performance

I would also check:
- link speed (on both servers and switches)
- link usage (warn above 80%)

.a.
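As a sketch of the server-side checks above, the error counters and link speed can be read straight out of sysfs on Linux (paths are the standard sysfs ones; the loopback and virtual devices have no speed, hence the fallback):

```shell
#!/bin/sh
# Sketch: report rx/tx error counters and link speed for every interface.
report_net_health() {
  for dev in /sys/class/net/*; do
    name=$(basename "$dev")
    rx=$(cat "$dev/statistics/rx_errors")
    tx=$(cat "$dev/statistics/tx_errors")
    # "speed" is unreadable on lo and many virtual devices.
    speed=$(cat "$dev/speed" 2>/dev/null || echo "n/a")
    echo "$name rx_errors=$rx tx_errors=$tx speed=$speed"
  done
}
report_net_health
```

A usage warning at 80% would additionally need the rx/tx byte counters sampled over an interval, which monitoring systems normally do for you.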
[ceph-users] OSD move after reboot
On Thu, Apr 23, 2015 at 11:18 AM, Jake Grimmett j...@mrc-lmb.cam.ac.uk wrote:
> Dear All,
> I have multiple disk types (15k & 7k) on each ceph node, which I assign
> to different pools, but have a problem: whenever I reboot a node, the
> OSDs move in the CRUSH map.

I just found out that you can customize the way OSDs are automatically
added to the crushmap using a hook script. I have in ceph.conf:

    osd crush location hook = /usr/local/sbin/sc-ceph-crush-location

This returns the correct bucket and root for the specific OSD. I also have

    osd crush update on start = true

which should be the default. This way, whenever an OSD starts, it is
automatically added to the correct bucket.

ref: http://ceph.com/docs/master/rados/operations/crush-map/#crush-location

.a.

P.S. I apologize if you received this message twice, I sent it from the
wrong email address the first time.
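The hook interface is small: ceph invokes the script with `--cluster <name> --id <osd-id> --type osd` and reads a CRUSH location such as `root=... host=...` from stdout. A minimal sketch (the id-to-tier mapping below is invented for illustration; a real hook would inspect the device behind the OSD):

```shell
#!/bin/sh
# Sketch of an "osd crush location hook" script, here as a function.
crush_location() {
  id=""
  while [ $# -gt 0 ]; do
    case "$1" in
      --id) shift; id=$1 ;;
    esac
    shift
  done
  case "$id" in
    19[6-9]|2[0-9][0-9]) tier=ssd ;;   # assumption: OSDs 196+ are SSDs
    *)                   tier=sas ;;
  esac
  # Ceph parses this line to place the OSD in the CRUSH hierarchy.
  echo "root=$tier host=$(hostname -s)-$tier"
}
crush_location --cluster ceph --id 3 --type osd
```

With per-tier roots like this, each pool's CRUSH rule can then be pointed at the matching root, and rebooted OSDs land back in the right bucket.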
Re: [ceph-users] New deployment: errors starting OSDs: invalid (someone else's?) journal
On Wed, Mar 25, 2015 at 6:37 PM, Robert LeBlanc rob...@leblancnet.us wrote:
> As far as the foreign journal, I would run dd over the journal partition
> and try it again. It sounds like something didn't get cleaned up from a
> previous run.

I wrote zeros on the journal device and re-created the journal with
ceph-osd --mkjournal, and it seems that the latter command didn't write
anything on the partition:

starting osd.196 at :/0 osd_data /var/lib/ceph/osd/ceph-196 /var/lib/ceph/osd/ceph-196/journal
SG_IO: bad/missing sense data, sb[]: 70 00 05 00 00 00 00 28 00 00 00 00 20 00 00 00 00 00 00 85 55 ac 01 00 00 00 00 00 00 00 20 00
SG_IO: bad/missing sense data, sb[]: 70 00 05 00 00 00 00 28 00 00 00 00 20 00 00 00 00 00 00 85 55 ac 01 00 00 00 00 00 00 00 20 00
HDIO_DRIVE_CMD(identify) failed: Input/output error
2015-04-07 14:07:53.703247 7f0e32433900 -1 journal FileJournal::open: ondisk fsid ---- doesn't match expected 35cd523e-4a74-41ae-908a-f25267a94dac, invalid (someone else's?) journal
2015-04-07 14:07:53.703569 7f0e32433900 -1 filestore(/var/lib/ceph/osd/ceph-196) mount failed to open journal /var/lib/ceph/osd/ceph-196/journal: (22) Invalid argument
2015-04-07 14:07:53.703956 7f0e32433900 -1 ** ERROR: error converting store /var/lib/ceph/osd/ceph-196: (22) Invalid argument

Note the fsid 0... line. I've also tried to zap and re-create with
ceph-deploy, with the same results.

Maybe it's not writing to the journal file because of the bad/missing
sense data error? It's strange though: that error seems to come from
hdparm -W not working on those disks, but I have the same error on *all*
of my disks...

Any other idea?

.a.
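For completeness, the wipe-and-recreate sequence discussed above would look roughly like this. This is a sketch only, not a tested recipe: the OSD id is the one from this thread, the dd count is arbitrary, the OSD must be stopped first, and the dd step destroys the journal.

```shell
# Sketch: zero the start of the journal partition, then rebuild the
# journal header for a stopped OSD. Destructive -- do not run casually.
OSD=196
JOURNAL=$(readlink -f /var/lib/ceph/osd/ceph-$OSD/journal)
dd if=/dev/zero of="$JOURNAL" bs=1M count=100 oflag=direct
ceph-osd -i "$OSD" --mkjournal
```

If the dd itself fails or silently writes nothing, as suspected here, the problem is below ceph, in the disk or controller layer.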
Re: [ceph-users] New deployment: errors starting OSDs: invalid (someone else's?) journal
On Wed, Mar 25, 2015 at 6:06 PM, Robert LeBlanc rob...@leblancnet.us wrote:
> I don't know much about ceph-deploy, but I know that ceph-disk has
> problems automatically adding an SSD OSD when there are journals of
> other disks already on it. I've had to partition the disk ahead of time
> and pass in the partitions to make ceph-disk work.

This is not my case: the journal is created automatically by ceph-deploy
on the same disk, so that for each disk /dev/sdX1 is the data partition
and /dev/sdX2 is the journal partition. This is also what I want: I know
there is a performance drop, but I expect it to be mitigated by the
cache tier (and I plan to test both configurations anyway).

> Also, unless you are sure that the dev devices will be deterministically
> named the same each time, I'd recommend you not use /dev/sd* for
> pointing to your journals. Instead use something that will always be the
> same: since Ceph will partition the disks with GPT, you can use the
> partuuid to point to the journal partition and it will always be right.
> A while back I used this to fix my journal links when I did it wrong.
> You will want to double check that it will work right for you. No
> warranty and all that jazz...

Thank you for pointing this out, it's an important point. However, the
links are actually created using the partuuid.
The command I posted in my previous email included the output of a pair
of nested readlink calls in order to get the /dev/sd* names, because that
way it's easier to see if there are duplicates, and where :)

The output of ls -l /var/lib/ceph/osd/ceph-*/journal is actually:

lrwxrwxrwx 1 root root 58 Mar 25 11:38 /var/lib/ceph/osd/ceph-0/journal -> /dev/disk/by-partuuid/18305316-96b0-4654-aaad-7aeb891429f6
lrwxrwxrwx 1 root root 58 Mar 25 11:49 /var/lib/ceph/osd/ceph-7/journal -> /dev/disk/by-partuuid/a263b19a-cb0d-4b4c-bd81-314619d5755d
lrwxrwxrwx 1 root root 58 Mar 25 12:21 /var/lib/ceph/osd/ceph-14/journal -> /dev/disk/by-partuuid/79734e0e-87dd-40c7-ba83-0d49695a75fb
lrwxrwxrwx 1 root root 58 Mar 25 12:31 /var/lib/ceph/osd/ceph-21/journal -> /dev/disk/by-partuuid/73a504bc-3179-43fd-942c-13c6bd8633c5
lrwxrwxrwx 1 root root 58 Mar 25 12:42 /var/lib/ceph/osd/ceph-28/journal -> /dev/disk/by-partuuid/ecff10df-d757-4b1f-bef4-88dd84d84ef1
lrwxrwxrwx 1 root root 58 Mar 25 12:52 /var/lib/ceph/osd/ceph-35/journal -> /dev/disk/by-partuuid/5be30238-3f07-4950-b39f-f5e4c7305e4c
lrwxrwxrwx 1 root root 58 Mar 25 13:02 /var/lib/ceph/osd/ceph-42/journal -> /dev/disk/by-partuuid/3cdb65f2-474c-47fb-8d07-83e7518418ff
lrwxrwxrwx 1 root root 58 Mar 25 13:12 /var/lib/ceph/osd/ceph-49/journal -> /dev/disk/by-partuuid/a47fe2b7-e375-4eea-b7a9-0354a24548dc
lrwxrwxrwx 1 root root 58 Mar 25 13:22 /var/lib/ceph/osd/ceph-56/journal -> /dev/disk/by-partuuid/fb42b7d6-bc6c-4063-8b73-29beb1f65107
lrwxrwxrwx 1 root root 58 Mar 25 13:33 /var/lib/ceph/osd/ceph-63/journal -> /dev/disk/by-partuuid/72aff32b-ca56-4c25-b8ea-ff3aba8db507
lrwxrwxrwx 1 root root 58 Mar 25 13:43 /var/lib/ceph/osd/ceph-70/journal -> /dev/disk/by-partuuid/b7c17a75-47cd-401e-b963-afe910612bd6
lrwxrwxrwx 1 root root 58 Mar 25 13:53 /var/lib/ceph/osd/ceph-77/journal -> /dev/disk/by-partuuid/2c1c2501-fa82-4fc9-a586-03cc4d68faef
lrwxrwxrwx 1 root root 58 Mar 25 14:03 /var/lib/ceph/osd/ceph-84/journal -> /dev/disk/by-partuuid/46f619a5-3edf-44e9-99a6-24d98bcd174a
lrwxrwxrwx 1 root root 58 Mar 25 14:13 /var/lib/ceph/osd/ceph-91/journal -> /dev/disk/by-partuuid/5feef832-dd82-4aa0-9264-dc9496a3f93a
lrwxrwxrwx 1 root root 58 Mar 25 14:24 /var/lib/ceph/osd/ceph-98/journal -> /dev/disk/by-partuuid/055793a0-99d4-49c4-9698-bd8880c21d9c
lrwxrwxrwx 1 root root 58 Mar 25 14:34 /var/lib/ceph/osd/ceph-105/journal -> /dev/disk/by-partuuid/20547f26-6ef3-422b-9732-ad8b0b5b5379
lrwxrwxrwx 1 root root 58 Mar 25 14:44 /var/lib/ceph/osd/ceph-112/journal -> /dev/disk/by-partuuid/2abea809-59c4-41da-bb52-28ef1911ec43
lrwxrwxrwx 1 root root 58 Mar 25 14:54 /var/lib/ceph/osd/ceph-119/journal -> /dev/disk/by-partuuid/d8d15bb8-4b3d-4375-b6e1-62794971df7e
lrwxrwxrwx 1 root root 58 Mar 25 15:05 /var/lib/ceph/osd/ceph-126/journal -> /dev/disk/by-partuuid/ff6ee2b2-9c33-4902-a5e3-f6e9db5714e9
lrwxrwxrwx 1 root root 58 Mar 25 15:15 /var/lib/ceph/osd/ceph-133/journal -> /dev/disk/by-partuuid/9faccb6e-ada9-4742-aa31-eb1308769205
lrwxrwxrwx 1 root root 58 Mar 25 15:25 /var/lib/ceph/osd/ceph-140/journal -> /dev/disk/by-partuuid/2df13c88-ee58-4881-a373-a36a09fb6366
lrwxrwxrwx 1 root root 58 Mar 25 15:36 /var/lib/ceph/osd/ceph-147/journal -> /dev/disk/by-partuuid/13cda9d1-0fec-40cc-a6fc-7cc56f7ffb78
lrwxrwxrwx 1 root root 58 Mar 25 15:46 /var/lib/ceph/osd/ceph-154/journal -> /dev/disk/by-partuuid/5d37bfe9-c0f9-49e0-a951-b0ed04c5de51
lrwxrwxrwx 1 root root 58 Mar 25 15:57 /var/lib/ceph/osd/ceph-161/journal -> /dev/disk/by-partuuid/d34f3abb-3fb7-4875-90d3-d2d3836f6e4d
lrwxrwxrwx 1 root root 58 Mar 25 16:07 /var/lib/ceph/osd/ceph-168/journal -> /dev/disk/by-partuuid/02c3db3e-159c-47d9-8a63-0389ea89fad1
lrwxrwxrwx 1 root root 58 Mar 25 16:16
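The duplicate check described above can be scripted directly: resolve every journal symlink to its final target and print any target shared by more than one OSD (paths are illustrative):

```shell
#!/bin/sh
# Sketch: given journal symlinks as arguments (e.g.
# /var/lib/ceph/osd/ceph-*/journal), print targets used more than once.
find_shared_journals() {
  for j in "$@"; do
    readlink -f "$j"
  done | sort | uniq -d
}
```

Empty output means every journal points at a distinct partition, which is the expected state.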
[ceph-users] New deployment: errors starting OSDs: invalid (someone else's?) journal
Hi all,

I'm trying to install ceph on a 7-node preproduction cluster. Each node
has 24x 4TB SAS disks (2x dell md1400 enclosures) and 6x 800GB SSDs (for
cache tiering, not journals). I'm using Ubuntu 14.04 and ceph-deploy to
install the cluster. I've been trying both Firefly and Giant and getting
the same error; the logs I'm reporting are from the Firefly installation.

The installation seems to go fine until I try to install the last 2 OSDs
(they are SSD disks) of each host. All the OSDs from 0 to 195 are UP and
IN, but when I try to deploy the next OSD (no matter on what host) the
ceph-osd daemon won't start. The error I get is:

2015-03-25 17:00:17.130937 7fe231312800 0 ceph version 0.80.9 (b5a67f0e1d15385bc0d60a6da6e7fc810bde6047), process ceph-osd, pid 20280
2015-03-25 17:00:17.133601 7fe231312800 10 filestore(/var/lib/ceph/osd/ceph-196) dump_stop
2015-03-25 17:00:17.133694 7fe231312800 5 filestore(/var/lib/ceph/osd/ceph-196) basedir /var/lib/ceph/osd/ceph-196 journal /var/lib/ceph/osd/ceph-196/journal
2015-03-25 17:00:17.133725 7fe231312800 10 filestore(/var/lib/ceph/osd/ceph-196) mount fsid is 8c2fa707-750a-4773-8918-a368367d9cf5
2015-03-25 17:00:17.133789 7fe231312800 0 filestore(/var/lib/ceph/osd/ceph-196) mount detected xfs (libxfs)
2015-03-25 17:00:17.133810 7fe231312800 1 filestore(/var/lib/ceph/osd/ceph-196) disabling 'filestore replica fadvise' due to known issues with fadvise(DONTNEED) on xfs
2015-03-25 17:00:17.135882 7fe231312800 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-196) detect_features: FIEMAP ioctl is supported and appears to work
2015-03-25 17:00:17.135892 7fe231312800 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-196) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option
2015-03-25 17:00:17.136318 7fe231312800 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-196) detect_features: syncfs(2) syscall fully supported (by glibc and kernel)
2015-03-25 17:00:17.136373 7fe231312800 0 xfsfilestorebackend(/var/lib/ceph/osd/ceph-196) detect_feature: extsize is disabled by conf
2015-03-25 17:00:17.136640 7fe231312800 5 filestore(/var/lib/ceph/osd/ceph-196) mount op_seq is 1
2015-03-25 17:00:17.137547 7fe231312800 20 filestore (init)dbobjectmap: seq is 1
2015-03-25 17:00:17.137560 7fe231312800 10 filestore(/var/lib/ceph/osd/ceph-196) open_journal at /var/lib/ceph/osd/ceph-196/journal
2015-03-25 17:00:17.137575 7fe231312800 0 filestore(/var/lib/ceph/osd/ceph-196) mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled
2015-03-25 17:00:17.137580 7fe231312800 10 filestore(/var/lib/ceph/osd/ceph-196) list_collections
2015-03-25 17:00:17.137661 7fe231312800 10 journal journal_replay fs op_seq 1
2015-03-25 17:00:17.137668 7fe231312800 2 journal open /var/lib/ceph/osd/ceph-196/journal fsid 8c2fa707-750a-4773-8918-a368367d9cf5 fs_op_seq 1
2015-03-25 17:00:17.137670 7fe22b8b1700 20 filestore(/var/lib/ceph/osd/ceph-196) sync_entry waiting for max_interval 5.00
2015-03-25 17:00:17.137690 7fe231312800 10 journal _open_block_device: ignoring osd journal size. We'll use the entire block device (size: 5367661056)
2015-03-25 17:00:17.162489 7fe231312800 1 journal _open /var/lib/ceph/osd/ceph-196/journal fd 20: 5367660544 bytes, block size 4096 bytes, directio = 1, aio = 1
2015-03-25 17:00:17.162502 7fe231312800 10 journal read_header
2015-03-25 17:00:17.172249 7fe231312800 10 journal header: block_size 4096 alignment 4096 max_size 5367660544
2015-03-25 17:00:17.172256 7fe231312800 10 journal header: start 50987008
2015-03-25 17:00:17.172257 7fe231312800 10 journal write_pos 4096
2015-03-25 17:00:17.172259 7fe231312800 10 journal open header.fsid = 942f2d62-dd99-42a8-878a-feea443aaa61
2015-03-25 17:00:17.172264 7fe231312800 -1 journal FileJournal::open: ondisk fsid 942f2d62-dd99-42a8-878a-feea443aaa61 doesn't match expected 8c2fa707-750a-4773-8918-a368367d9cf5, invalid (someone else's?) journal
2015-03-25 17:00:17.172268 7fe231312800 3 journal journal_replay open failed with (22) Invalid argument
2015-03-25 17:00:17.172284 7fe231312800 -1 filestore(/var/lib/ceph/osd/ceph-196) mount failed to open journal /var/lib/ceph/osd/ceph-196/journal: (22) Invalid argument
2015-03-25 17:00:17.172304 7fe22b8b1700 20 filestore(/var/lib/ceph/osd/ceph-196) sync_entry woke after 0.034632
2015-03-25 17:00:17.172330 7fe22b8b1700 10 journal commit_start max_applied_seq 1, open_ops 0
2015-03-25 17:00:17.172333 7fe22b8b1700 10 journal commit_start blocked, all open_ops have completed
2015-03-25 17:00:17.172334 7fe22b8b1700 10 journal commit_start nothing to do
2015-03-25 17:00:17.172465 7fe231312800 -1 ** ERROR: error converting store /var/lib/ceph/osd/ceph-196: (22) Invalid argument

I'm attaching the full log of "ceph-deploy osd create osd-l2-05:sde" and
the /var/log/ceph/ceph-osd.196.log after trying to re-start the osd with
increased verbosity, as well as the ceph.conf I'm using. I've also
checked if the journal symlinks were correct, and they all
Re: [ceph-users] Giant or Firefly for production
On Sun, Dec 7, 2014 at 1:51 PM, René Gallati c...@gallati.net wrote:
> Hello Antonio,
> I use aptly to manage my repositories and mix and match (and snapshot /
> pin) specific versions and non-standard packages

I didn't know aptly, thank you for mentioning it.

> but as far as I know, the kernel from utopic unicorn is already in the
> main repositories for trusty and is a 3.16 line.
> apt-cache policy linux-image-generic-lts-utopic
> should give you the information about availability in your repository.
> My information is:
>
> linux-image-generic-lts-utopic:
>   Installed: (none)
>   Candidate: 3.16.0.25.19
>   Version table:
>     3.16.0.25.19 0
>       500 http://security.ubuntu.com/ubuntu/ trusty-security/main amd64 Packages
>
> Note the "lts" in the name: these are officially supported, although
> they don't specifically announce those kernels when they become
> available (or I'm not on the correct mailing lists for that). Generally,
> about one to two months after a new non-LTS release, they will be there
> for the LTS version.

I didn't know this either! That's useful, since it's a release kernel
_and_ LTS.

Thank you René,
cheers
Antonio
Re: [ceph-users] Giant or Firefly for production
On Fri, Dec 5, 2014 at 2:24 AM, Anthony Alba ascanio.al...@gmail.com wrote:
> Hi Cephers,
> Have any of you decided to put Giant into production instead of
> Firefly?

This is very interesting to me too: we are going to deploy a large ceph
cluster on Ubuntu 14.04 LTS, and so far what I have found is that the
rbd module in Ubuntu Trusty doesn't seem compatible with Giant:

    feature set mismatch, my 4a042a42 < server's 2104a042a42, missing 210

I tried with different ceph osd tunables, but nothing seems to fix the
issue. However, this cluster will be mainly used for OpenStack, and qemu
is able to access the rbd volumes, so this might not be a big problem
for me.

.a.
Re: [ceph-users] Giant or Firefly for production
On Fri, Dec 5, 2014 at 4:25 PM, David Moreau Simard dmsim...@iweb.com wrote:
> What are the kernel versions involved? We have Ubuntu precise clients
> talking to an Ubuntu trusty cluster without issues - with tunables
> optimal. 0.88 (Giant) and 0.89 have been working well for us as far as
> the client and Openstack are concerned.

Both servers and clients are Ubuntu Trusty. Kernel versions are a bit
different:

  client: 3.13.0-39-generic #66
  server: 3.13.0-32-generic #57

ceph version on both: 0.87

> This link provides some insight as to the possible problems:
> http://cephnotes.ksperis.com/blog/2014/01/21/feature-set-mismatch-error-on-ceph-kernel-client
> Things to look for:
> - Kernel versions
> - Cache tiering
> - Tunables
> - hashpspool

I have already read the blog post, but I don't have much experience with
tunables. From what I understood I am missing:

* CEPH_FEATURE_CRUSH_TUNABLES3
* CEPH_FEATURE_CRUSH_V2

but I don't know how to disable them, and I can't see them set in the
crushmap I get from ceph osd getcrushmap.

.a.
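The two feature names can in fact be read off the mismatch line quoted earlier in the thread: masking the client's bits out of the server's mask leaves exactly the bits the client lacks. A sketch with shell arithmetic (the bit positions are the CEPH_FEATURE_* flag positions for these two features):

```shell
#!/bin/sh
# Decode "feature set mismatch, my 4a042a42 < server's 2104a042a42".
client=$(( 0x4a042a42 ))
server=$(( 0x2104a042a42 ))
missing=$(( server & ~client ))
printf 'missing bits: %#x\n' "$missing"
# Bits 36 and 41 correspond to the two features named above.
[ $(( (missing >> 36) & 1 )) -eq 1 ] && echo CEPH_FEATURE_CRUSH_V2
[ $(( (missing >> 41) & 1 )) -eq 1 ] && echo CEPH_FEATURE_CRUSH_TUNABLES3
```

So the kernel is refusing the map because of the CRUSH v2 rules and the vary_r (tunables3) setting, matching the list above.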
Re: [ceph-users] Giant or Firefly for production
On Fri, Dec 5, 2014 at 4:25 PM, Nick Fisk n...@fisk.me.uk wrote:
> This is probably due to the Kernel RBD client not being recent enough.
> Have you tried upgrading your kernel to a newer version? 3.16 should
> contain all the relevant features required by Giant.

I would rather tune the tunables, as upgrading the kernel would require
a reboot of the client. Besides, Ubuntu Trusty does not provide a 3.16
kernel, so I would need to recompile...

.a.
Re: [ceph-users] Giant or Firefly for production
Thank you James and Nick,

On Fri, Dec 5, 2014 at 4:46 PM, Nick Fisk n...@fisk.me.uk wrote:
> Ok sorry, I thought you had a need for some of the features in Giant,
> using tunables is probably easier in that case.

I'm not sure :) I never played with the tunables before (still running a
testbed only). I will test it again with 14.04.2 and the default kernel
at the beginning of next year. I prefer to use the official kernel for
the production cluster, but since it's going to be deployed Q1-Q2 next
year I should be safe.

.a.

> However if you do want to upgrade there are debs available:
> http://kernel.ubuntu.com/~kernel-ppa/mainline/
> and I believe 3.16 should be available in the 14.04.2 release, which
> should be released early next year.
>
> Nick
Re: [ceph-users] Giant or Firefly for production
On Fri, Dec 5, 2014 at 4:59 PM, Sage Weil s...@newdream.net wrote:
> On Fri, 5 Dec 2014, Antonio Messina wrote:
>> On Fri, Dec 5, 2014 at 2:24 AM, Anthony Alba ascanio.al...@gmail.com wrote:
>>> Hi Cephers,
>>> Have any of you decided to put Giant into production instead of
>>> Firefly?
>> This is very interesting to me too: we are going to deploy a large ceph
>> cluster on Ubuntu 14.04 LTS, and so far what I have found is that the
>> rbd module in Ubuntu Trusty doesn't seem compatible with Giant:
>> feature set mismatch, my 4a042a42 < server's 2104a042a42, missing 210
> Can you attach the output of

I modified the crushmap and set:

    tunable chooseleaf_vary_r 0

(it was 1 before). Now the cluster is rebalancing, and since it's on
crappy hardware it is taking some time. I'm pasting the output of the two
commands, but please keep in mind that this is the output *after* I've
updated the chooseleaf_vary_r tunable.

ceph osd crush show-tunables -f json-pretty

{ "choose_local_tries": 0,
  "choose_local_fallback_tries": 0,
  "choose_total_tries": 50,
  "chooseleaf_descend_once": 1,
  "profile": "bobtail",
  "optimal_tunables": 0,
  "legacy_tunables": 0,
  "require_feature_tunables": 1,
  "require_feature_tunables2": 1,
  "require_feature_tunables3": 0,
  "has_v2_rules": 1,
  "has_v3_rules": 0}

ceph osd crush dump -f json-pretty

I'm attaching it as a text file, as it is quite big and unreadable.
However, from the output I see the following tunables:

"tunables": { "choose_local_tries": 0,
  "choose_local_fallback_tries": 0,
  "choose_total_tries": 50,
  "chooseleaf_descend_once": 1,
  "profile": "bobtail",
  "optimal_tunables": 0,
  "legacy_tunables": 0,
  "require_feature_tunables": 1,
  "require_feature_tunables2": 1,
  "require_feature_tunables3": 0,
  "has_v2_rules": 1,
  "has_v3_rules": 0}

.a.

[Attachment: crushmap.json, application/json]
Re: [ceph-users] Giant or Firefly for production
Hi all, just an update.

After setting chooseleaf_vary_r to 0 _and_ removing a pool with erasure
coding, I was able to run rbd map.

Thank you all for the help

.a.

On Fri, Dec 5, 2014 at 5:07 PM, Antonio Messina antonio.mess...@s3it.uzh.ch wrote:
> I modified the crushmap and set:
>
>     tunable chooseleaf_vary_r 0
>
> [...]
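For readers following along, changing a tunable by hand goes through the usual decompile/edit/recompile cycle. The ceph and crushtool steps below need a live cluster, so they are shown as comments; the edit itself is just a text substitution on the decompiled map (the two-line crushmap here is a stand-in for a real decompiled map):

```shell
#!/bin/sh
# Sketch of the decompile/edit/recompile cycle for a crushmap tunable.
#
#   ceph osd getcrushmap -o crushmap.bin
#   crushtool -d crushmap.bin -o crushmap.txt
cat > crushmap.txt <<'EOF'
tunable choose_local_tries 0
tunable chooseleaf_vary_r 1
EOF
# Flip the tunable in the decompiled text:
sed -i 's/^tunable chooseleaf_vary_r .*/tunable chooseleaf_vary_r 0/' crushmap.txt
grep chooseleaf_vary_r crushmap.txt
#   crushtool -c crushmap.txt -o crushmap.new.bin
#   ceph osd setcrushmap -i crushmap.new.bin
```

Setting the map triggers the rebalancing mentioned earlier in the thread, so it is worth doing during a quiet period.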
Re: [ceph-users] Giant or Firefly for production
On Fri, Dec 5, 2014 at 5:24 PM, Sage Weil s...@newdream.net wrote:
> The v2 rule means you have a crush rule for erasure coding. Do you have
> an EC pool in your cluster?

Yes indeed. I didn't know EC pools were incompatible with the current
kernel, I only tested them with rados bench and VMs, I guess.

> The tunables3 feature bit is set because you set the vary_r parameter.

This I don't really know where it comes from. I think at a certain point
I ran "ceph osd crush tunables optimal", and it probably added vary_r,
but then I ran "ceph osd crush tunables firefly" and it didn't remove
it... is that normal?

> If you want older kernels to talk to the cluster, you need to avoid the
> new tunables and features!

Well, as I said, I'm not a ceph expert; I didn't even know I had enabled
features the kernel of the distribution did not support. I guess the
problem is that I am using packages from the ceph.com repo, while the
kernel comes from Ubuntu.

However, it's at least curious that when I was running firefly from the
Ubuntu repositories I could create an EC pool, even though the kernel
was not compatible with EC pools...

.a.
Re: [ceph-users] All OSDs don't restart after shutdown
On Thu, Nov 6, 2014 at 12:00 PM, Luca Mazzaferro luca.mazzafe...@rzg.mpg.de wrote:
> Dear Users,
> On the admin-node side the ceph health command or the ceph -w hangs
> forever.

Hi Luca,

I'm not a ceph expert either, but this is usually an indication that the
monitors are not running. How many MONs are you running? Are they all
alive? What's in the mon logs?

Also check the time on the mon nodes.

cheers,
Antonio

--
antonio.s.mess...@gmail.com
antonio.mess...@uzh.ch
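Since a hanging `ceph -w` usually means no monitor quorum, the majority rule is worth keeping in mind when counting how many MONs may be down. A tiny sketch:

```shell
#!/bin/sh
# Monitor quorum needs a strict majority of the configured mons, so a
# cluster of n mons keeps answering while at most (n - 1) / 2 are down.
tolerable_mon_failures() {
  echo $(( ($1 - 1) / 2 ))
}
for n in 1 3 5; do
  echo "$n mon(s): survives $(tolerable_mon_failures "$n") failure(s)"
done
```

So with the common 3-MON setup, losing two monitors (or having their clocks drift far enough apart) is enough to make every `ceph` command hang.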