I don't know much about ceph-deploy, but I do know that ceph-disk has
problems "automatically" adding an OSD on an SSD that already holds
journals for other disks. I've had to partition the disk ahead of time
and pass the partitions in explicitly to make ceph-disk work.
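
For what it's worth, the manual workflow I mean looks roughly like the
sketch below. Everything in it is made up for illustration (device name
/dev/sde, partition numbers, sizes), so double-check it against the
behaviour of your ceph-disk version rather than pasting it in as-is.

# Rough sketch only: /dev/sde is an SSD that already holds journal
# partitions for other OSDs, so ceph-disk can't take the whole device.
# Carve the new OSD's partitions out of the free space by hand, then
# pass them to ceph-disk explicitly.
sgdisk --new=5:0:+5G --change-name=5:"ceph journal" /dev/sde  # journal for the new OSD
sgdisk --new=6:0:0   --change-name=6:"ceph data"    /dev/sde  # data: rest of the free space
partprobe /dev/sde                                            # let the kernel re-read the table

ceph-disk prepare /dev/sde6 /dev/sde5   # data partition first, journal second
ceph-disk activate /dev/sde6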

Also, unless you are sure that the /dev/sd* names will come up the
same on every boot, I'd recommend not pointing your journals at
/dev/sd* paths. Use something that will always be the same instead:
since Ceph partitions the disks with GPT, you can point to the journal
partition by its partuuid and it will always resolve to the right
device. A while back I used the snippet below to "fix" my journal
links after I had set them up wrong. Double-check that it does the
right thing for you; no warranty and all that jazz...

# Convert the /dev/sd* journal symlinks into /dev/disk/by-partuuid links.
for OSD in /var/lib/ceph/osd/*; do
    # device the journal currently points at, e.g. "sde2"
    DEV=$(basename "$(readlink "$OSD/journal")")
    echo "$OSD: journal -> /dev/$DEV"
    # find the by-partuuid symlink whose target is that device
    PUUID=$(ls -l /dev/disk/by-partuuid/ | awk -v dev="$DEV" '$NF ~ dev"$" {print $(NF-2)}')
    ln -sf "/dev/disk/by-partuuid/$PUUID" "$OSD/journal"
done
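
A quick sanity check afterwards (nothing beyond standard tools assumed):

# every journal link should now resolve through by-partuuid
ls -l /var/lib/ceph/osd/*/journal
readlink -e /var/lib/ceph/osd/*/journal | sort | uniq -c

Note that a running ceph-osd keeps its already-open journal, so the new
links only matter the next time the daemons start.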

On Wed, Mar 25, 2015 at 10:46 AM, Antonio Messina
<antonio.s.mess...@gmail.com> wrote:
> Hi all,
>
> I'm trying to install ceph on a 7-node preproduction cluster. Each
> node has 24x 4TB SAS disks (2x Dell MD1400 enclosures) and 6x 800GB
> SSDs (for cache tiering, not journals). I'm using Ubuntu 14.04 and
> ceph-deploy to install the cluster. I've tried both Firefly and
> Giant and get the same error; the logs reported below are from the
> Firefly installation.
>
> The installation seems to go fine until I try to install the last 2
> OSDs (both on SSD disks) of each host. All the OSDs from 0 to 195 are
> UP and IN, but when I try to deploy the next OSD (no matter on which
> host), the ceph-osd daemon won't start. The error I get is:
>
> 2015-03-25 17:00:17.130937 7fe231312800  0 ceph version 0.80.9
> (b5a67f0e1d15385bc0d60a6da6e7fc810bde6047), process ceph-osd, pid
> 20280
> 2015-03-25 17:00:17.133601 7fe231312800 10
> filestore(/var/lib/ceph/osd/ceph-196) dump_stop
> 2015-03-25 17:00:17.133694 7fe231312800  5
> filestore(/var/lib/ceph/osd/ceph-196) basedir
> /var/lib/ceph/osd/ceph-196 journal /var/lib/ceph/osd/ceph-196/journal
> 2015-03-25 17:00:17.133725 7fe231312800 10
> filestore(/var/lib/ceph/osd/ceph-196) mount fsid is
> 8c2fa707-750a-4773-8918-a368367d9cf5
> 2015-03-25 17:00:17.133789 7fe231312800  0
> filestore(/var/lib/ceph/osd/ceph-196) mount detected xfs (libxfs)
> 2015-03-25 17:00:17.133810 7fe231312800  1
> filestore(/var/lib/ceph/osd/ceph-196)  disabling 'filestore replica
> fadvise' due to known issues with fadvise(DONTNEED) on xfs
> 2015-03-25 17:00:17.135882 7fe231312800  0
> genericfilestorebackend(/var/lib/ceph/osd/ceph-196) detect_features:
> FIEMAP ioctl is supported and appears to work
> 2015-03-25 17:00:17.135892 7fe231312800  0
> genericfilestorebackend(/var/lib/ceph/osd/ceph-196) detect_features:
> FIEMAP ioctl is disabled via 'filestore fiemap' config option
> 2015-03-25 17:00:17.136318 7fe231312800  0
> genericfilestorebackend(/var/lib/ceph/osd/ceph-196) detect_features:
> syncfs(2) syscall fully supported (by glibc and kernel)
> 2015-03-25 17:00:17.136373 7fe231312800  0
> xfsfilestorebackend(/var/lib/ceph/osd/ceph-196) detect_feature:
> extsize is disabled by conf
> 2015-03-25 17:00:17.136640 7fe231312800  5
> filestore(/var/lib/ceph/osd/ceph-196) mount op_seq is 1
> 2015-03-25 17:00:17.137547 7fe231312800 20 filestore (init)dbobjectmap: seq 
> is 1
> 2015-03-25 17:00:17.137560 7fe231312800 10
> filestore(/var/lib/ceph/osd/ceph-196) open_journal at
> /var/lib/ceph/osd/ceph-196/journal
> 2015-03-25 17:00:17.137575 7fe231312800  0
> filestore(/var/lib/ceph/osd/ceph-196) mount: enabling WRITEAHEAD
> journal mode: checkpoint is not enabled
> 2015-03-25 17:00:17.137580 7fe231312800 10
> filestore(/var/lib/ceph/osd/ceph-196) list_collections
> 2015-03-25 17:00:17.137661 7fe231312800 10 journal journal_replay fs op_seq 1
> 2015-03-25 17:00:17.137668 7fe231312800  2 journal open
> /var/lib/ceph/osd/ceph-196/journal fsid
> 8c2fa707-750a-4773-8918-a368367d9cf5 fs_op_seq 1
> 2015-03-25 17:00:17.137670 7fe22b8b1700 20
> filestore(/var/lib/ceph/osd/ceph-196) sync_entry waiting for
> max_interval 5.000000
> 2015-03-25 17:00:17.137690 7fe231312800 10 journal _open_block_device:
> ignoring osd journal size. We'll use the entire block device (size:
> 5367661056)
> 2015-03-25 17:00:17.162489 7fe231312800  1 journal _open
> /var/lib/ceph/osd/ceph-196/journal fd 20: 5367660544 bytes, block size
> 4096 bytes, directio = 1, aio = 1
> 2015-03-25 17:00:17.162502 7fe231312800 10 journal read_header
> 2015-03-25 17:00:17.172249 7fe231312800 10 journal header: block_size
> 4096 alignment 4096 max_size 5367660544
> 2015-03-25 17:00:17.172256 7fe231312800 10 journal header: start 50987008
> 2015-03-25 17:00:17.172257 7fe231312800 10 journal  write_pos 4096
> 2015-03-25 17:00:17.172259 7fe231312800 10 journal open header.fsid =
> 942f2d62-dd99-42a8-878a-feea443aaa61
> 2015-03-25 17:00:17.172264 7fe231312800 -1 journal FileJournal::open:
> ondisk fsid 942f2d62-dd99-42a8-878a-feea443aaa61 doesn't match
> expected 8c2fa707-750a-4773-8918-a368367d9cf5, invalid (someone
> else's?) journal
> 2015-03-25 17:00:17.172268 7fe231312800  3 journal journal_replay open
> failed with (22) Invalid argument
> 2015-03-25 17:00:17.172284 7fe231312800 -1
> filestore(/var/lib/ceph/osd/ceph-196) mount failed to open journal
> /var/lib/ceph/osd/ceph-196/journal: (22) Invalid argument
> 2015-03-25 17:00:17.172304 7fe22b8b1700 20
> filestore(/var/lib/ceph/osd/ceph-196) sync_entry woke after 0.034632
> 2015-03-25 17:00:17.172330 7fe22b8b1700 10 journal commit_start
> max_applied_seq 1, open_ops 0
> 2015-03-25 17:00:17.172333 7fe22b8b1700 10 journal commit_start
> blocked, all open_ops have completed
> 2015-03-25 17:00:17.172334 7fe22b8b1700 10 journal commit_start nothing to do
> 2015-03-25 17:00:17.172465 7fe231312800 -1  ** ERROR: error converting
> store /var/lib/ceph/osd/ceph-196: (22) Invalid argument
>
> I'm attaching the "full" log of "ceph-deploy osd create osd-l2-05:sde"
> and /var/log/ceph/ceph-osd.196.log (captured after trying to restart
> the osd with increased verbosity), as well as the ceph.conf I'm using.
>
> I've also checked if the "journal" symlinks were correct, and they all
> point to different devices:
>
> root@osd-l2-05:~# ls -1 $(readlink -e  $(readlink -e
> /var/lib/ceph/osd/ceph-*/journal))|sort | uniq -c
>       1 /dev/sda2
>       1 /dev/sdaa2
>       1 /dev/sdab2
>       1 /dev/sdac2
>       1 /dev/sdad2
>       1 /dev/sdae2
>       1 /dev/sdb2
>       1 /dev/sdc2
>       1 /dev/sdd2
>       1 /dev/sde2
>       1 /dev/sdf2
>       1 /dev/sdh2
>       1 /dev/sdi2
>       1 /dev/sdj2
>       1 /dev/sdk2
>       1 /dev/sdl2
>       1 /dev/sdm2
>       1 /dev/sdn2
>       1 /dev/sdo2
>       1 /dev/sdp2
>       1 /dev/sdq2
>       1 /dev/sdr2
>       1 /dev/sds2
>       1 /dev/sdt2
>       1 /dev/sdu2
>       1 /dev/sdv2
>       1 /dev/sdw2
>       1 /dev/sdx2
>       1 /dev/sdy2
>       1 /dev/sdz2
>
>
> On a side note: if I reboot the node, it can happen that osd.196
> actually starts, but then some other OSD that was UP before the
> reboot will not start anymore. In that case I get a completely
> different error on the OSD that was up before the reboot and down
> after it:
>
> 2015-03-25 17:11:38.112462 7fe6524f5700 -1 os/FileJournal.cc: In
> function 'int FileJournal::write_aio_bl(off64_t&, ceph::bufferlist&,
> uint64_t)' thread 7fe6524f5700 time 2015-03-25 17:11:38.110877
> os/FileJournal.cc: 1337: FAILED assert(0 == "io_submit got unexpected error")
>
>  ceph version 0.80.9 (b5a67f0e1d15385bc0d60a6da6e7fc810bde6047)
>  1: (FileJournal::write_aio_bl(long&, ceph::buffer::list&, unsigned
> long)+0x799) [0x95ae59]
>  2: (FileJournal::do_aio_write(ceph::buffer::list&)+0x1ef) [0x95b4cf]
>  3: (FileJournal::write_thread_entry()+0x823) [0x961873]
>  4: (FileJournal::Writer::entry()+0xd) [0x8913cd]
>  5: (()+0x8182) [0x7fe657c7b182]
>  6: (clone()+0x6d) [0x7fe6563ee47d]
>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is
> needed to interpret this.
>
> (plus a very long trace that you can find in attach, as 'ceph-osd.35.log')
>
> I think there is some kind of "conflict" between the journals of two
> OSDs, but I honestly can't figure out why or how this is happening,
> since the journal links are all different. The only thing I noticed is
> that the problem started only when I tried to create the OSDs on the
> SSD devices, but this may be only partially relevant, because I added
> the SSDs last.
>
> Hardware setup:
> 7x Dell R630 w/ PERC H730P (for SSDs) and PERC H830, both in "jbod" mode
> 2x MD1400 w/ 12x SAS Seagate 4TB ST4000NM0023
> 6x SSD Toshiba 800GB PX03SNF080
>
> Thank you in advance,
> Antonio
>
>
> --
> antonio.s.mess...@gmail.com
> antonio.mess...@uzh.ch                     +41 (0)44 635 42 22
> S3IT: Service and Support for Science IT   http://www.s3it.uzh.ch/
> University of Zurich
> Winterthurerstrasse 190
> CH-8057 Zurich Switzerland
>
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
