Re: [ceph-users] Rsync to object store

2016-12-28 Thread Antonio Messina
I used rclone [0] to sync from a filesystem to SWIFT. Although it's a
plain SWIFT cluster, I'm sure it works with RGW/S3 as well.

[0] http://rclone.org/
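
For reference, a minimal invocation for this kind of sync might look like
the following (remote name and paths are just examples; the remote has to
be defined first with "rclone config", choosing the swift or s3 backend):

  # one-time, interactive: define a remote called "objstore"
  rclone config
  # mirror a local directory into a container/bucket
  rclone sync /srv/data objstore:backup --transfers 8 --checksum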

2016-12-28 22:12 GMT+01:00 Robin H. Johnson :
> On Wed, Dec 28, 2016 at 09:31:57PM +0100, Marc Roos wrote:
>> Is it possible to rsync to the ceph object store with something like
>> this tool of amazon?
>> https://aws.amazon.com/customerapps/1771
> That's a service built on top of AWS EC2 that just happens to back
> storage into AWS S3.
>
> There's no fundamental reason it couldn't support Ceph RGW S3, but you'd
> need to contact the service provider and work out the details with them
> (like running their service close to your RGW instances).
>
> --
> Robin Hugh Johnson
> Gentoo Linux: Dev, Infra Lead, Foundation Trustee & Treasurer
> E-Mail   : robb...@gentoo.org
> GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
> GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
antonio.s.mess...@gmail.com
antonio.mess...@uzh.ch +41 (0)44 635 42 22
S3IT: Service and Support for Science IT   http://www.s3it.uzh.ch/
University of Zurich
Winterthurerstrasse 190
CH-8057 Zurich Switzerland
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Check networking first?

2015-08-03 Thread Antonio Messina
On Mon, Aug 3, 2015 at 5:10 PM, Quentin Hartman
qhart...@direwolfdigital.com wrote:
 The problem with this kind of monitoring is that there are so many possible
 metrics to watch and so many possible ways to watch them. For myself, I'm
 working on implementing a couple of things:
 - Watching error counters on servers
 - Watching error counters on switches
 - Watching performance

I would also check (a couple of example commands are below):

- link speed (on both servers and switches)
- link usage (issue a warning above 80%)
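
On the server side, something like this gives the raw numbers (the
interface name is an example; the thresholds belong in whatever
monitoring tool you use):

  # negotiated link speed and duplex
  ethtool eth0 | grep -E 'Speed|Duplex'
  # per-interface error and drop counters
  ip -s link show eth0
  # raw byte counters, to be sampled over an interval for utilisation
  cat /sys/class/net/eth0/statistics/rx_bytes \
      /sys/class/net/eth0/statistics/tx_bytes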

.a.

-- 
antonio.mess...@uzh.ch
S3IT: Services and Support for Science IT   http://www.s3it.uzh.ch/
University of Zurich Y12 F 84
Winterthurerstrasse 190
CH-8057 Zurich Switzerland
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] OSD move after reboot

2015-04-23 Thread Antonio Messina
On Thu, Apr 23, 2015 at 11:18 AM, Jake Grimmett j...@mrc-lmb.cam.ac.uk wrote:
 Dear All,

 I have multiple disk types (15k & 7k) on each ceph node, which I assign to
 different pools, but have a problem: whenever I reboot a node, the OSDs
 move in the CRUSH map.

I just found out that you can customize the way OSDs are automatically
added to the crushmap using a hook script.

I have in ceph.conf:

osd crush location hook = /usr/local/sbin/sc-ceph-crush-location

This script returns the correct bucket and root for the specific OSD.

I also have

osd crush update on start = true

which should be the default.

This way, whenever an OSD starts, it's automatically added to the correct bucket.

ref: http://ceph.com/docs/master/rados/operations/crush-map/#crush-location
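
In case it helps anyone, a minimal sketch of what such a hook script can
look like (the SSD/HDD detection below is purely illustrative, not what
our sc-ceph-crush-location actually does; adapt the test to however you
tell your disks apart):

  #!/bin/sh
  # ceph calls the hook as: <hook> --cluster <name> --id <osd-id> --type osd
  # and expects the CRUSH location printed on stdout.
  ID=
  while [ $# -gt 0 ]; do
      case "$1" in
          --id) ID=$2; shift ;;
      esac
      shift
  done
  HOST=$(hostname -s)
  # illustrative only: pretend OSD ids >= 196 are the SSDs
  if [ "$ID" -ge 196 ]; then
      echo "root=ssd host=${HOST}-ssd"
  else
      echo "root=default host=${HOST}"
  fi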

.a.

P.S. I apologize if you received this message twice, I've sent it from
the wrong email address the first time.

-- 
antonio.s.mess...@gmail.com
antonio.mess...@uzh.ch +41 (0)44 635 42 22
S3IT: Service and Support for Science IT   http://www.s3it.uzh.ch/
University of Zurich
Winterthurerstrasse 190
CH-8057 Zurich Switzerland
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OSD move after reboot

2015-04-23 Thread Antonio Messina
On Thu, Apr 23, 2015 at 11:18 AM, Jake Grimmett j...@mrc-lmb.cam.ac.uk wrote:
 Dear All,

 I have multiple disk types (15k & 7k) on each ceph node, which I assign to
 different pools, but have a problem: whenever I reboot a node, the OSDs
 move in the CRUSH map.

I just found out that you can customize the way OSDs are automatically
added to the crushmap using a hook script.

I have in ceph.conf:

osd crush location hook = /usr/local/sbin/sc-ceph-crush-location

This script returns the correct bucket and root for the specific OSD.

I also have

osd crush update on start = true

which should be the default.

This way, whenever an OSD starts, it's automatically added to the correct bucket.

ref: http://ceph.com/docs/master/rados/operations/crush-map/#crush-location

.a.

-- 
antonio.s.mess...@gmail.com
antonio.mess...@uzh.ch +41 (0)44 635 42 22
S3IT: Service and Support for Science IT   http://www.s3it.uzh.ch/
University of Zurich
Winterthurerstrasse 190
CH-8057 Zurich Switzerland
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] New deployment: errors starting OSDs: invalid (someone else's?) journal

2015-04-07 Thread Antonio Messina
On Wed, Mar 25, 2015 at 6:37 PM, Robert LeBlanc rob...@leblancnet.us wrote:
 As far as the foreign journal, I would run dd over the journal
 partition and try it again. It sounds like something didn't get
 cleaned up from a previous run.

I wrote zeros on the journal device and re-created the journal with
ceph-osd --mkjournal, but it seems that the latter command didn't
write anything to the partition:

starting osd.196 at :/0 osd_data /var/lib/ceph/osd/ceph-196
/var/lib/ceph/osd/ceph-196/journal
SG_IO: bad/missing sense data, sb[]:  70 00 05 00 00 00 00 28 00 00 00
00 20 00 00 00 00 00 00 85 55 ac 01 00 00 00 00 00 00 00 20 00
SG_IO: bad/missing sense data, sb[]:  70 00 05 00 00 00 00 28 00 00 00
00 20 00 00 00 00 00 00 85 55 ac 01 00 00 00 00 00 00 00 20 00
 HDIO_DRIVE_CMD(identify) failed: Input/output error
2015-04-07 14:07:53.703247 7f0e32433900 -1 journal FileJournal::open:
ondisk fsid 00000000-0000-0000-0000-000000000000 doesn't match
expected 35cd523e-4a74-41ae-908a-f25267a94dac, invalid (someone
else's?) journal
2015-04-07 14:07:53.703569 7f0e32433900 -1
filestore(/var/lib/ceph/osd/ceph-196) mount failed to open journal
/var/lib/ceph/osd/ceph-196/journal: (22) Invalid argument
2015-04-07 14:07:53.703956 7f0e32433900 -1  ** ERROR: error converting
store /var/lib/ceph/osd/ceph-196: (22) Invalid argument

Note the all-zero ondisk fsid line. I've also tried to zap and
re-create the OSD with ceph-deploy, with the same results.
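
For the record, the zero-and-recreate sequence was roughly the following
(the device path and OSD id are examples from this cluster; double-check
yours before running anything like this):

  # wipe the beginning of the journal partition
  dd if=/dev/zero of=/dev/sde2 bs=1M count=100 oflag=direct
  # rebuild the journal for the (stopped) OSD
  ceph-osd -i 196 --mkjournal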

Maybe it's not writing to the journal file because of the bad/missing
sense data error? It's strange though: that error seems to come from
hdparm -W not working on those disks, but I get the same error on
*all* of my disks...

Any other ideas?

.a.

-- 
antonio.s.mess...@gmail.com
antonio.mess...@uzh.ch +41 (0)44 635 42 22
S3IT: Service and Support for Science IT   http://www.s3it.uzh.ch/
University of Zurich
Winterthurerstrasse 190
CH-8057 Zurich Switzerland
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] New deployment: errors starting OSDs: invalid (someone else's?) journal

2015-03-25 Thread Antonio Messina
On Wed, Mar 25, 2015 at 6:06 PM, Robert LeBlanc rob...@leblancnet.us wrote:
 I don't know much about ceph-deploy,  but I know that ceph-disk has
 problems automatically adding an SSD OSD when there are journals of
 other disks already on it. I've had to partition the disk ahead of
 time and pass in the partitions to make ceph-disk work.

This is not my case: the journal is created automatically by
ceph-deploy on the same disk, so that for each disk /dev/sdX1 is the
data partition and /dev/sdX2 is the journal partition. This is also
what I want: I know there is a performance drop, but I expect it to be
mitigated by the cache tier (and I plan to test both configurations
anyway).

 Also, unless you are sure that the dev devices will be deterministically
 named the same each time, I'd recommend you not use /dev/sd* for
 pointing to your journals. Instead use something that will always be
 the same; since Ceph will partition the disks with GPT, you can use
 the partuuid to point to the journal partition and it will always be
 right. A while back I used this to fix my journal links when I did
 it wrong. You will want to double check that it will work right for
 you. no warranty and all that jazz...

Thank you for pointing this out, it's an important point. However, the
links are actually created using the partuuid. The command I posted in
my previous email ran a pair of nested readlink calls in order to get
the /dev/sd* names, because that way it's easier to see if there are
duplicates, and where :)
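
Something along these lines (a sketch of the idea, not the exact command
from the earlier mail):

  # resolve every journal symlink down to its /dev/sdXN device
  for j in /var/lib/ceph/osd/ceph-*/journal; do
      printf '%s -> %s\n' "$j" "$(readlink -f "$j")"
  done
  # print only devices that are used by more than one OSD
  for j in /var/lib/ceph/osd/ceph-*/journal; do readlink -f "$j"; done | sort | uniq -d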

The output of ls -l /var/lib/ceph/osd/ceph-*/journal is actually:

lrwxrwxrwx 1 root root 58 Mar 25 11:38 /var/lib/ceph/osd/ceph-0/journal -> /dev/disk/by-partuuid/18305316-96b0-4654-aaad-7aeb891429f6
lrwxrwxrwx 1 root root 58 Mar 25 11:49 /var/lib/ceph/osd/ceph-7/journal -> /dev/disk/by-partuuid/a263b19a-cb0d-4b4c-bd81-314619d5755d
lrwxrwxrwx 1 root root 58 Mar 25 12:21 /var/lib/ceph/osd/ceph-14/journal -> /dev/disk/by-partuuid/79734e0e-87dd-40c7-ba83-0d49695a75fb
lrwxrwxrwx 1 root root 58 Mar 25 12:31 /var/lib/ceph/osd/ceph-21/journal -> /dev/disk/by-partuuid/73a504bc-3179-43fd-942c-13c6bd8633c5
lrwxrwxrwx 1 root root 58 Mar 25 12:42 /var/lib/ceph/osd/ceph-28/journal -> /dev/disk/by-partuuid/ecff10df-d757-4b1f-bef4-88dd84d84ef1
lrwxrwxrwx 1 root root 58 Mar 25 12:52 /var/lib/ceph/osd/ceph-35/journal -> /dev/disk/by-partuuid/5be30238-3f07-4950-b39f-f5e4c7305e4c
lrwxrwxrwx 1 root root 58 Mar 25 13:02 /var/lib/ceph/osd/ceph-42/journal -> /dev/disk/by-partuuid/3cdb65f2-474c-47fb-8d07-83e7518418ff
lrwxrwxrwx 1 root root 58 Mar 25 13:12 /var/lib/ceph/osd/ceph-49/journal -> /dev/disk/by-partuuid/a47fe2b7-e375-4eea-b7a9-0354a24548dc
lrwxrwxrwx 1 root root 58 Mar 25 13:22 /var/lib/ceph/osd/ceph-56/journal -> /dev/disk/by-partuuid/fb42b7d6-bc6c-4063-8b73-29beb1f65107
lrwxrwxrwx 1 root root 58 Mar 25 13:33 /var/lib/ceph/osd/ceph-63/journal -> /dev/disk/by-partuuid/72aff32b-ca56-4c25-b8ea-ff3aba8db507
lrwxrwxrwx 1 root root 58 Mar 25 13:43 /var/lib/ceph/osd/ceph-70/journal -> /dev/disk/by-partuuid/b7c17a75-47cd-401e-b963-afe910612bd6
lrwxrwxrwx 1 root root 58 Mar 25 13:53 /var/lib/ceph/osd/ceph-77/journal -> /dev/disk/by-partuuid/2c1c2501-fa82-4fc9-a586-03cc4d68faef
lrwxrwxrwx 1 root root 58 Mar 25 14:03 /var/lib/ceph/osd/ceph-84/journal -> /dev/disk/by-partuuid/46f619a5-3edf-44e9-99a6-24d98bcd174a
lrwxrwxrwx 1 root root 58 Mar 25 14:13 /var/lib/ceph/osd/ceph-91/journal -> /dev/disk/by-partuuid/5feef832-dd82-4aa0-9264-dc9496a3f93a
lrwxrwxrwx 1 root root 58 Mar 25 14:24 /var/lib/ceph/osd/ceph-98/journal -> /dev/disk/by-partuuid/055793a0-99d4-49c4-9698-bd8880c21d9c
lrwxrwxrwx 1 root root 58 Mar 25 14:34 /var/lib/ceph/osd/ceph-105/journal -> /dev/disk/by-partuuid/20547f26-6ef3-422b-9732-ad8b0b5b5379
lrwxrwxrwx 1 root root 58 Mar 25 14:44 /var/lib/ceph/osd/ceph-112/journal -> /dev/disk/by-partuuid/2abea809-59c4-41da-bb52-28ef1911ec43
lrwxrwxrwx 1 root root 58 Mar 25 14:54 /var/lib/ceph/osd/ceph-119/journal -> /dev/disk/by-partuuid/d8d15bb8-4b3d-4375-b6e1-62794971df7e
lrwxrwxrwx 1 root root 58 Mar 25 15:05 /var/lib/ceph/osd/ceph-126/journal -> /dev/disk/by-partuuid/ff6ee2b2-9c33-4902-a5e3-f6e9db5714e9
lrwxrwxrwx 1 root root 58 Mar 25 15:15 /var/lib/ceph/osd/ceph-133/journal -> /dev/disk/by-partuuid/9faccb6e-ada9-4742-aa31-eb1308769205
lrwxrwxrwx 1 root root 58 Mar 25 15:25 /var/lib/ceph/osd/ceph-140/journal -> /dev/disk/by-partuuid/2df13c88-ee58-4881-a373-a36a09fb6366
lrwxrwxrwx 1 root root 58 Mar 25 15:36 /var/lib/ceph/osd/ceph-147/journal -> /dev/disk/by-partuuid/13cda9d1-0fec-40cc-a6fc-7cc56f7ffb78
lrwxrwxrwx 1 root root 58 Mar 25 15:46 /var/lib/ceph/osd/ceph-154/journal -> /dev/disk/by-partuuid/5d37bfe9-c0f9-49e0-a951-b0ed04c5de51
lrwxrwxrwx 1 root root 58 Mar 25 15:57 /var/lib/ceph/osd/ceph-161/journal -> /dev/disk/by-partuuid/d34f3abb-3fb7-4875-90d3-d2d3836f6e4d
lrwxrwxrwx 1 root root 58 Mar 25 16:07 /var/lib/ceph/osd/ceph-168/journal -> /dev/disk/by-partuuid/02c3db3e-159c-47d9-8a63-0389ea89fad1
lrwxrwxrwx 1 root root 58 Mar 25 16:16

[ceph-users] New deployment: errors starting OSDs: invalid (someone else's?) journal

2015-03-25 Thread Antonio Messina
Hi all,

I'm trying to install ceph on a 7-node preproduction cluster. Each
node has 24x 4TB SAS disks (2x Dell MD1400 enclosures) and 6x 800GB
SSDs (for cache tiering, not journals). I'm using Ubuntu 14.04 and
ceph-deploy to install the cluster; I've been trying both Firefly and
Giant and getting the same error. However, the logs I'm reporting
refer to the Firefly installation.

The installation seems to go fine until I try to install the last 2
OSDs (they are SSD disks) of each host. All the OSDs from 0 to 195 are
UP and IN, but when I try to deploy the next OSD (no matter on which
host), the ceph-osd daemon won't start. The error I get is:

2015-03-25 17:00:17.130937 7fe231312800  0 ceph version 0.80.9
(b5a67f0e1d15385bc0d60a6da6e7fc810bde6047), process ceph-osd, pid
20280
2015-03-25 17:00:17.133601 7fe231312800 10
filestore(/var/lib/ceph/osd/ceph-196) dump_stop
2015-03-25 17:00:17.133694 7fe231312800  5
filestore(/var/lib/ceph/osd/ceph-196) basedir
/var/lib/ceph/osd/ceph-196 journal /var/lib/ceph/osd/ceph-196/journal
2015-03-25 17:00:17.133725 7fe231312800 10
filestore(/var/lib/ceph/osd/ceph-196) mount fsid is
8c2fa707-750a-4773-8918-a368367d9cf5
2015-03-25 17:00:17.133789 7fe231312800  0
filestore(/var/lib/ceph/osd/ceph-196) mount detected xfs (libxfs)
2015-03-25 17:00:17.133810 7fe231312800  1
filestore(/var/lib/ceph/osd/ceph-196)  disabling 'filestore replica
fadvise' due to known issues with fadvise(DONTNEED) on xfs
2015-03-25 17:00:17.135882 7fe231312800  0
genericfilestorebackend(/var/lib/ceph/osd/ceph-196) detect_features:
FIEMAP ioctl is supported and appears to work
2015-03-25 17:00:17.135892 7fe231312800  0
genericfilestorebackend(/var/lib/ceph/osd/ceph-196) detect_features:
FIEMAP ioctl is disabled via 'filestore fiemap' config option
2015-03-25 17:00:17.136318 7fe231312800  0
genericfilestorebackend(/var/lib/ceph/osd/ceph-196) detect_features:
syncfs(2) syscall fully supported (by glibc and kernel)
2015-03-25 17:00:17.136373 7fe231312800  0
xfsfilestorebackend(/var/lib/ceph/osd/ceph-196) detect_feature:
extsize is disabled by conf
2015-03-25 17:00:17.136640 7fe231312800  5
filestore(/var/lib/ceph/osd/ceph-196) mount op_seq is 1
2015-03-25 17:00:17.137547 7fe231312800 20 filestore (init)dbobjectmap: seq is 1
2015-03-25 17:00:17.137560 7fe231312800 10
filestore(/var/lib/ceph/osd/ceph-196) open_journal at
/var/lib/ceph/osd/ceph-196/journal
2015-03-25 17:00:17.137575 7fe231312800  0
filestore(/var/lib/ceph/osd/ceph-196) mount: enabling WRITEAHEAD
journal mode: checkpoint is not enabled
2015-03-25 17:00:17.137580 7fe231312800 10
filestore(/var/lib/ceph/osd/ceph-196) list_collections
2015-03-25 17:00:17.137661 7fe231312800 10 journal journal_replay fs op_seq 1
2015-03-25 17:00:17.137668 7fe231312800  2 journal open
/var/lib/ceph/osd/ceph-196/journal fsid
8c2fa707-750a-4773-8918-a368367d9cf5 fs_op_seq 1
2015-03-25 17:00:17.137670 7fe22b8b1700 20
filestore(/var/lib/ceph/osd/ceph-196) sync_entry waiting for
max_interval 5.00
2015-03-25 17:00:17.137690 7fe231312800 10 journal _open_block_device:
ignoring osd journal size. We'll use the entire block device (size:
5367661056)
2015-03-25 17:00:17.162489 7fe231312800  1 journal _open
/var/lib/ceph/osd/ceph-196/journal fd 20: 5367660544 bytes, block size
4096 bytes, directio = 1, aio = 1
2015-03-25 17:00:17.162502 7fe231312800 10 journal read_header
2015-03-25 17:00:17.172249 7fe231312800 10 journal header: block_size
4096 alignment 4096 max_size 5367660544
2015-03-25 17:00:17.172256 7fe231312800 10 journal header: start 50987008
2015-03-25 17:00:17.172257 7fe231312800 10 journal  write_pos 4096
2015-03-25 17:00:17.172259 7fe231312800 10 journal open header.fsid =
942f2d62-dd99-42a8-878a-feea443aaa61
2015-03-25 17:00:17.172264 7fe231312800 -1 journal FileJournal::open:
ondisk fsid 942f2d62-dd99-42a8-878a-feea443aaa61 doesn't match
expected 8c2fa707-750a-4773-8918-a368367d9cf5, invalid (someone
else's?) journal
2015-03-25 17:00:17.172268 7fe231312800  3 journal journal_replay open
failed with (22) Invalid argument
2015-03-25 17:00:17.172284 7fe231312800 -1
filestore(/var/lib/ceph/osd/ceph-196) mount failed to open journal
/var/lib/ceph/osd/ceph-196/journal: (22) Invalid argument
2015-03-25 17:00:17.172304 7fe22b8b1700 20
filestore(/var/lib/ceph/osd/ceph-196) sync_entry woke after 0.034632
2015-03-25 17:00:17.172330 7fe22b8b1700 10 journal commit_start
max_applied_seq 1, open_ops 0
2015-03-25 17:00:17.172333 7fe22b8b1700 10 journal commit_start
blocked, all open_ops have completed
2015-03-25 17:00:17.172334 7fe22b8b1700 10 journal commit_start nothing to do
2015-03-25 17:00:17.172465 7fe231312800 -1  ** ERROR: error converting
store /var/lib/ceph/osd/ceph-196: (22) Invalid argument

I'm attaching the full log of ceph-deploy osd create osd-l2-05:sde
and /var/log/ceph/ceph-osd.196.log (after trying to re-start the OSD
with increased verbosity), as well as the ceph.conf I'm using.

I've also checked if the journal symlinks were correct, and they all

Re: [ceph-users] Giant or Firefly for production

2014-12-08 Thread Antonio Messina
On Sun, Dec 7, 2014 at 1:51 PM, René Gallati c...@gallati.net wrote:
 Hello Antonio,
 I use aptly to manage my repositories and mix and match (and snapshot / pin)

I didn't know about aptly, thank you for mentioning it.

 specific versions and non-standard packages, but as far as I know, the
 kernel from utopic unicorn is already in the main repositories for trusty
 and is a 3.16 line.

 apt-cache policy linux-image-generic-lts-utopic

 should give you the information about availability in your repository. My
 information is:

 linux-image-generic-lts-utopic:
   Installed: (none)
   Candidate: 3.16.0.25.19
   Version table:
  3.16.0.25.19 0
 500 http://security.ubuntu.com/ubuntu/ trusty-security/main amd64
 Packages

 Note the lts in the name, these are officially supported although they
 don't specifically announce those kernels when they become available or I'm
 not on the correct mailing lists for that. Generally about one to two months
 after a new non-lts release, they will be there for the LTS version.

I didn't know this either! That's useful, since it's a release kernel _and_ lts.
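
(For the archives: installing it should just be a matter of the standard
HWE metapackages, something along the lines of the commands below; I
haven't verified the exact package names on every point release.)

  sudo apt-get update
  sudo apt-get install linux-image-generic-lts-utopic linux-headers-generic-lts-utopic
  sudo reboot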

Thank you René,

cheers
Antonio


-- 
antonio.s.mess...@gmail.com
antonio.mess...@uzh.ch +41 (0)44 635 42 22
S3IT: Service and Support for Science IT   http://www.s3it.uzh.ch/
University of Zurich
Winterthurerstrasse 190
CH-8057 Zurich Switzerland
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Giant or Firefly for production

2014-12-07 Thread Antonio Messina
On Sun, Dec 7, 2014 at 1:51 PM, René Gallati c...@gallati.net wrote:
 Hello Antonio,
 I use aptly to manage my repositories and mix and match (and snapshot / pin)

I didn't know about aptly, thank you for mentioning it.

 specific versions and non-standard packages, but as far as I know, the
 kernel from utopic unicorn is already in the main repositories for trusty
 and is a 3.16 line.

 apt-cache policy linux-image-generic-lts-utopic

 should give you the information about availability in your repository. My
 information is:

 linux-image-generic-lts-utopic:
   Installed: (none)
   Candidate: 3.16.0.25.19
   Version table:
  3.16.0.25.19 0
 500 http://security.ubuntu.com/ubuntu/ trusty-security/main amd64
 Packages

 Note the lts in the name, these are officially supported although they
 don't specifically announce those kernels when they become available or I'm
 not on the correct mailing lists for that. Generally about one to two months
 after a new non-lts release, they will be there for the LTS version.

I didn't know this either! That's useful, since it's a release kernel _and_ lts.

Thank you René,

cheers

Antonio

-- 
antonio.s.mess...@gmail.com
antonio.mess...@uzh.ch +41 (0)44 635 42 22
S3IT: Service and Support for Science IT   http://www.s3it.uzh.ch/
University of Zurich
Winterthurerstrasse 190
CH-8057 Zurich Switzerland
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Giant or Firefly for production

2014-12-05 Thread Antonio Messina
 On Fri, Dec 5, 2014 at 2:24 AM, Anthony Alba ascanio.al...@gmail.com wrote:
 Hi Cephers,

 Have anyone of you decided to put Giant into production instead of Firefly?

This is very interesting to me too: we are going to deploy a large
ceph cluster on Ubuntu 14.04 LTS, and so far what I have found is that
the rbd module in Ubuntu Trusty doesn't seem compatible with giant:

feature set mismatch, my 4a042a42 < server's 2104a042a42, missing
210

I tried different ceph osd crush tunables profiles, but nothing seems to fix the issue.
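
(What I tried was along the lines of the standard profile switches, e.g.:

  ceph osd crush tunables legacy
  ceph osd crush tunables bobtail
  ceph osd crush tunables firefly

none of which made the kernel client happy.)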

However, this cluster will be mainly used for OpenStack, and qemu is
able to access the rbd volume, so this might not be a big problem for
me.

.a.

-- 
antonio.s.mess...@gmail.com
antonio.mess...@uzh.ch +41 (0)44 635 42 22
S3IT: Service and Support for Science IT   http://www.s3it.uzh.ch/
University of Zurich
Winterthurerstrasse 190
CH-8057 Zurich Switzerland
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Giant or Firefly for production

2014-12-05 Thread Antonio Messina
On Fri, Dec 5, 2014 at 2:24 AM, Anthony Alba ascanio.al...@gmail.com wrote:
 Hi Cephers,

 Have anyone of you decided to put Giant into production instead of Firefly?

This is very interesting to me too: we are going to deploy a large
ceph cluster on Ubuntu 14.04 LTS, and so far what I have found is that
the rbd module in Ubuntu Trusty doesn't seem compatible with giant:

feature set mismatch, my 4a042a42 < server's 2104a042a42, missing
210

I tried different ceph osd crush tunables profiles, but nothing seems to fix the issue.

However, this cluster will be mainly used for OpenStack, and qemu is
able to access the rbd volume, so this might not be a big problem for
me.

.a.

-- 
antonio.mess...@s3it.uzh.ch +41 (0)44 635 42 22
antonio.s.mess...@gmail.com
S3IT: Service and Support for Science IT   http://www.s3it.uzh.ch/
University of Zurich
Winterthurerstrasse 190
CH-8057 Zurich Switzerland
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Giant or Firefly for production

2014-12-05 Thread Antonio Messina
On Fri, Dec 5, 2014 at 4:25 PM, David Moreau Simard dmsim...@iweb.com wrote:
 What are the kernel versions involved ?

 We have Ubuntu precise clients talking to a Ubuntu trusty cluster without 
 issues - with tunables optimal.
 0.88 (Giant) and 0.89 have been working well for us as far as the client and
 Openstack are concerned.

 This link provides some insight as to the possible problems:

Both servers and clients are Ubuntu Trusty. Kernel versions are a bit different:

client: 3.13.0-39-generic #66
server: 3.13.0-32-generic #57
ceph version on both: 0.87

 http://cephnotes.ksperis.com/blog/2014/01/21/feature-set-mismatch-error-on-ceph-kernel-client

 Things to look for:
 - Kernel versions
 - Cache tiering
 - Tunables
 - hashpspool

I have already read the blog post, but I don't have much experience
with tunables. From what I understood, I am missing:

* CEPH_FEATURE_CRUSH_TUNABLES3
* CEPH_FEATURE_CRUSH_V2

but I don't know how to disable them, and I can't see them set in the
crushmap I get from ceph osd getcrushmap.
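
For anyone hitting the same message: the map returned by ceph osd
getcrushmap is binary, so it has to be decompiled before the tunable and
rule lines become visible; a sketch of the check (paths are arbitrary):

  ceph osd getcrushmap -o /tmp/crushmap.bin
  crushtool -d /tmp/crushmap.bin -o /tmp/crushmap.txt
  grep -E '^tunable|^rule|type (replicated|erasure)' /tmp/crushmap.txt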

.a.


-- 
antonio.mess...@s3it.uzh.ch +41 (0)44 635 42 22
antonio.s.mess...@gmail.com
S3IT: Service and Support for Science IT   http://www.s3it.uzh.ch/
University of Zurich
Winterthurerstrasse 190
CH-8057 Zurich Switzerland
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Giant or Firefly for production

2014-12-05 Thread Antonio Messina
On Fri, Dec 5, 2014 at 4:25 PM, Nick Fisk n...@fisk.me.uk wrote:
 This is probably due to the Kernel RBD client not being recent enough. Have
 you tried upgrading your kernel to a newer version? 3.16 should contain all
 the relevant features required by Giant.

I would rather tune the tunables, as upgrading the kernel would
require a reboot of the client.
Besides, Ubuntu Trusty does not provide a 3.16 kernel, so I would need
to recompile...

.a.

-- 
antonio.mess...@s3it.uzh.ch +41 (0)44 635 42 22
antonio.s.mess...@gmail.com
S3IT: Service and Support for Science IT   http://www.s3it.uzh.ch/
University of Zurich
Winterthurerstrasse 190
CH-8057 Zurich Switzerland
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Giant or Firefly for production

2014-12-05 Thread Antonio Messina
Thank you James and Nick,

On Fri, Dec 5, 2014 at 4:46 PM, Nick Fisk n...@fisk.me.uk wrote:
 Ok sorry, I thought you had a need for some of the features in Giant, using
 tunables is probably easier in that case.

I'm not sure :) I've never played with the tunables before (I'm still
running a testbed only).

I will test it again with 14.04.2 and the default kernel at the
beginning of next year; I prefer to use the official kernel for the
production cluster, but since it's going to be deployed in Q1-Q2 next
year I should be safe.

.a.

 However if you do want to upgrade there are debs available:-

 http://kernel.ubuntu.com/~kernel-ppa/mainline/

 and I believe 3.16 should be available in the 14.04.2 release, which should
 be released early next year.

 Nick

 -Original Message-
 From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
 Antonio Messina
 Sent: 05 December 2014 15:38
 To: Nick Fisk
 Cc: ceph-users@lists.ceph.com; Antonio Messina
 Subject: Re: [ceph-users] Giant or Firefly for production

 On Fri, Dec 5, 2014 at 4:25 PM, Nick Fisk n...@fisk.me.uk wrote:
 This is probably due to the Kernel RBD client not being recent enough.
 Have you tried upgrading your kernel to a newer version? 3.16 should
 contain all the relevant features required by Giant.

 I would rather tune the tunables, as upgrading the kernel would require a
 reboot of the client.
 Besides, Ubuntu Trusty does not provide a 3.16 kernel, so I would need to
 recompile...

 .a.

 --
 antonio.mess...@s3it.uzh.ch +41 (0)44 635 42 22
 antonio.s.mess...@gmail.com
 S3IT: Service and Support for Science IT   http://www.s3it.uzh.ch/
 University of Zurich
 Winterthurerstrasse 190
 CH-8057 Zurich Switzerland
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com







-- 
antonio.mess...@s3it.uzh.ch +41 (0)44 635 42 22
antonio.s.mess...@gmail.com
S3IT: Service and Support for Science IT   http://www.s3it.uzh.ch/
University of Zurich
Winterthurerstrasse 190
CH-8057 Zurich Switzerland
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Giant or Firefly for production

2014-12-05 Thread Antonio Messina
On Fri, Dec 5, 2014 at 4:59 PM, Sage Weil s...@newdream.net wrote:
 On Fri, 5 Dec 2014, Antonio Messina wrote:
 On Fri, Dec 5, 2014 at 2:24 AM, Anthony Alba ascanio.al...@gmail.com wrote:
  Hi Cephers,
 
  Have anyone of you decided to put Giant into production instead of Firefly?

 This is very interesting to me too: we are going to deploy a large
 ceph cluster on Ubuntu 14.04 LTS, and so far what I have found is that
 the rbd module in Ubuntu Trusty doesn't seem compatible with giant:

 feature set mismatch, my 4a042a42 < server's 2104a042a42, missing
 210

 Can you attach the output of


I modified the crushmap and set:

tunable chooseleaf_vary_r 0

(it was 1 before)
Now the cluster is rebalancing, and since it's on crappy hardware it's
taking some time.
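
(The change was the usual decompile/edit/recompile round-trip, roughly:

  ceph osd getcrushmap -o cm.bin
  crushtool -d cm.bin -o cm.txt
  # edit cm.txt and set: tunable chooseleaf_vary_r 0
  crushtool -c cm.txt -o cm.new.bin
  ceph osd setcrushmap -i cm.new.bin

with the file names being arbitrary.)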

I'm pasting the output of the two commands, but please keep in mind
that this is the output *after* I've updated the chooseleaf_vary_r
tunable.

  ceph osd crush show-tunables -f json-pretty

{ "choose_local_tries": 0,
  "choose_local_fallback_tries": 0,
  "choose_total_tries": 50,
  "chooseleaf_descend_once": 1,
  "profile": "bobtail",
  "optimal_tunables": 0,
  "legacy_tunables": 0,
  "require_feature_tunables": 1,
  "require_feature_tunables2": 1,
  "require_feature_tunables3": 0,
  "has_v2_rules": 1,
  "has_v3_rules": 0}

  ceph osd crush dump -f json-pretty

I'm attaching it as a text file, as it is quite big and unreadable.
However, from the output I see the following tunables:

  "tunables": { "choose_local_tries": 0,
  "choose_local_fallback_tries": 0,
  "choose_total_tries": 50,
  "chooseleaf_descend_once": 1,
  "profile": "bobtail",
  "optimal_tunables": 0,
  "legacy_tunables": 0,
  "require_feature_tunables": 1,
  "require_feature_tunables2": 1,
  "require_feature_tunables3": 0,
  "has_v2_rules": 1,
  "has_v3_rules": 0}}

.a.

-- 
antonio.mess...@s3it.uzh.ch +41 (0)44 635 42 22
antonio.s.mess...@gmail.com
S3IT: Service and Support for Science IT   http://www.s3it.uzh.ch/
University of Zurich
Winterthurerstrasse 190
CH-8057 Zurich Switzerland


crushmap.json
Description: application/json
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Giant or Firefly for production

2014-12-05 Thread Antonio Messina
Hi all, just an update

After setting chooseleaf_vary_r to 0 _and_ removing a pool with
erasure coding, I was able to run rbd map.

Thank you all for the help

.a.

On Fri, Dec 5, 2014 at 5:07 PM, Antonio Messina
antonio.mess...@s3it.uzh.ch wrote:
 On Fri, Dec 5, 2014 at 4:59 PM, Sage Weil s...@newdream.net wrote:
 On Fri, 5 Dec 2014, Antonio Messina wrote:
 On Fri, Dec 5, 2014 at 2:24 AM, Anthony Alba ascanio.al...@gmail.com 
 wrote:
  Hi Cephers,
 
  Have anyone of you decided to put Giant into production instead of 
  Firefly?

 This is very interesting to me too: we are going to deploy a large
 ceph cluster on Ubuntu 14.04 LTS, and so far what I have found is that
 the rbd module in Ubuntu Trusty doesn't seem compatible with giant:

 feature set mismatch, my 4a042a42 < server's 2104a042a42, missing
 210

 Can you attach the output of


 I modified the crushmap and set:

 tunable chooseleaf_vary_r 0

 (it was 1 before)
 Now the cluster is rebalancing, and since it's on crappy hardware is
 taking some time.

 I'm pasting the output of the two commands, but please keep in mind
 that this is the output *after* I've updated the chooseleaf_vary_r
 tunable.

  ceph osd crush show-tunables -f json-pretty

 { "choose_local_tries": 0,
   "choose_local_fallback_tries": 0,
   "choose_total_tries": 50,
   "chooseleaf_descend_once": 1,
   "profile": "bobtail",
   "optimal_tunables": 0,
   "legacy_tunables": 0,
   "require_feature_tunables": 1,
   "require_feature_tunables2": 1,
   "require_feature_tunables3": 0,
   "has_v2_rules": 1,
   "has_v3_rules": 0}

  ceph osd crush dump -f json-pretty

 I'm attaching it as a text file, as it is quite big and unreadable.
 However, from the output I see the following tunables:

   "tunables": { "choose_local_tries": 0,
   "choose_local_fallback_tries": 0,
   "choose_total_tries": 50,
   "chooseleaf_descend_once": 1,
   "profile": "bobtail",
   "optimal_tunables": 0,
   "legacy_tunables": 0,
   "require_feature_tunables": 1,
   "require_feature_tunables2": 1,
   "require_feature_tunables3": 0,
   "has_v2_rules": 1,
   "has_v3_rules": 0}}

 .a.

 --
 antonio.mess...@s3it.uzh.ch +41 (0)44 635 42 22
 antonio.s.mess...@gmail.com
 S3IT: Service and Support for Science IT   http://www.s3it.uzh.ch/
 University of Zurich
 Winterthurerstrasse 190
 CH-8057 Zurich Switzerland



-- 
antonio.mess...@s3it.uzh.ch +41 (0)44 635 42 22
antonio.s.mess...@gmail.com
S3IT: Service and Support for Science IT   http://www.s3it.uzh.ch/
University of Zurich
Winterthurerstrasse 190
CH-8057 Zurich Switzerland
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Giant or Firefly for production

2014-12-05 Thread Antonio Messina
On Fri, Dec 5, 2014 at 5:24 PM, Sage Weil s...@newdream.net wrote:
 The v2 rule means you have a crush rule for erasure coding.  Do you have
 an EC pool in your cluster?

Yes indeed. I didn't know an EC pool was incompatible with the current
kernel; I had only tested it with rados bench and from VMs, I guess.

 The tunables3 feature bit is set because you set the vary_r parameter.

I don't really know where this one comes from. I think at a certain
point I ran ceph osd crush tunables optimal, and it probably added
vary_r, but then I ran ceph osd crush tunables firefly and it
didn't remove it... is that normal?

 If you want older kernels to talk to the cluster, you need to avoid the
 new tunables and features!

Well, as I said, I'm not a ceph expert; I didn't even know I had
enabled features that the distribution's kernel did not support.

I guess the problem is that I am using packages from the ceph.com
repo, while the kernel comes from Ubuntu.

However, it's at least curious that when I was running Firefly from
the Ubuntu repositories I could create an EC pool, even though the
kernel was not compatible with EC pools...

.a.

-- 
antonio.mess...@s3it.uzh.ch +41 (0)44 635 42 22
antonio.s.mess...@gmail.com
S3IT: Service and Support for Science IT   http://www.s3it.uzh.ch/
University of Zurich
Winterthurerstrasse 190
CH-8057 Zurich Switzerland
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] All OSDs don't restart after shutdown

2014-11-06 Thread Antonio Messina
On Thu, Nov 6, 2014 at 12:00 PM, Luca Mazzaferro
luca.mazzafe...@rzg.mpg.de wrote:
 Dear Users,

Hi Luca,

 On the admin-node side the ceph health command or ceph -w hangs forever.

I'm not a ceph expert either, but this is usually an indication that
the monitors are not running (or have no quorum).

How many MONs are you running? Are they all alive? What's in the mon
logs? Also check that the clocks on the mon nodes are in sync.
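
If the ceph CLI itself hangs, you can still query a monitor locally
through its admin socket, along these lines (the mon id and socket path
depend on your setup; the default id is usually the short hostname):

  ceph --admin-daemon /var/run/ceph/ceph-mon.$(hostname -s).asok mon_status
  ceph --admin-daemon /var/run/ceph/ceph-mon.$(hostname -s).asok quorum_status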

cheers,
Antonio

-- 
antonio.s.mess...@gmail.com
antonio.mess...@uzh.ch +41 (0)44 635 42 22
S3IT: Service and Support for Science IT   http://www.s3it.uzh.ch/
University of Zurich
Winterthurerstrasse 190
CH-8057 Zurich Switzerland
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com