Re: [ceph-users] Ceph assimilated configuration - unable to remove item

2019-12-13 Thread David Herselman
Hi,

I've logged a bug report (https://tracker.ceph.com/issues/43296) and Alwin from
Proxmox was kind enough to provide a work-around:
ceph config rm global rbd_default_features;
ceph config-key rm config/global/rbd_default_features;
ceph config set global rbd_default_features 31;

ceph config dump | grep -e WHO -e rbd_default_features;
WHO     MASK   LEVEL      OPTION                  VALUE   RO
global         advanced   rbd_default_features    31
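The raw key behind the stale entry can also be confirmed before removing it
(a quick check; config/global/... is the path used in the work-around above):

ceph config-key dump | grep rbd_default_features;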


Regards
David Herselman

-Original Message-
From: Stefan Kooman  
Sent: Wednesday, 11 December 2019 3:05 PM
To: David Herselman 
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Ceph assimilated configuration - unable to remove item

Quoting David Herselman (d...@syrex.co):
> Hi,
> 
> We assimilated our Ceph configuration to store attributes within Ceph 
> itself and subsequently have a minimal configuration file. Whilst this 
> works perfectly we are unable to remove configuration entries 
> populated by the assimilate-conf command.

I forgot about this issue, but I encountered it when we upgraded to Mimic. I can
confirm this bug. It's possible to have the same key present with different
values. For our production cluster we decided to stick to ceph.conf for the
time being. That's also the work-around for now if you want to override the
config store: just put the setting in your config file and restart the daemon(s).
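A minimal sketch of that work-around, using the option from this thread as an
example (value illustrative):

# /etc/ceph/ceph.conf
[global]
    rbd_default_features = 31

# ...then restart the affected daemons/clients so they re-read ceph.conf.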

Gr. Stefan


-- 
| BIT BV  https://www.bit.nl/    Kamer van Koophandel 09090351
| GPG: 0xD14839C6   +31 318 648 688 / i...@bit.nl


[ceph-users] Ceph assimilated configuration - unable to remove item

2019-12-11 Thread David Herselman
Hi,

We assimilated our Ceph configuration to store attributes within Ceph itself 
and subsequently have a minimal configuration file. Whilst this works perfectly 
we are unable to remove configuration entries populated by the assimilate-conf 
command.

Ceph Nautilus 14.2.4.1 upgrade notes:
cd /etc/pve;
ceph config assimilate-conf -i ceph.conf -o ceph.conf.new;
mv ceph.conf.new ceph.conf;
pico /etc/ceph/ceph.conf
  # add back: cluster_network
  #   public_network
ceph config rm global cluster_network;
ceph config rm global public_network;
ceph config set global mon_osd_down_out_subtree_limit host;

Resulting minimal Ceph configuration file:
[admin@kvm1c ~]# cat /etc/ceph/ceph.conf
[global]
 cluster_network = 10.248.1.0/24
 filestore_xattr_use_omap = true
 fsid = 31f6ea46-12cb-47e8-a6f3-60fb6bbd1782
 mon_host = 10.248.1.60 10.248.1.61 10.248.1.62
 public_network = 10.248.1.0/24

[client]
 keyring = /etc/pve/priv/$cluster.$name.keyring

Ceph configuration entries:
[admin@kvm1c ~]# ceph config dump
WHO     MASK   LEVEL      OPTION                               VALUE          RO
global         advanced   auth_client_required                 cephx          *
global         advanced   auth_cluster_required                cephx          *
global         advanced   auth_service_required                cephx          *
global         advanced   cluster_network                      10.248.1.0/24  *
global         advanced   debug_filestore                      0/0
global         advanced   debug_journal                        0/0
global         advanced   debug_ms                             0/0
global         advanced   debug_osd                            0/0
global         basic      device_failure_prediction_mode       cloud
global         advanced   mon_allow_pool_delete                true
global         advanced   mon_osd_down_out_subtree_limit       host
global         advanced   osd_deep_scrub_interval              1209600.00
global         advanced   osd_pool_default_min_size            2
global         advanced   osd_pool_default_size                3
global         advanced   osd_scrub_begin_hour                 19
global         advanced   osd_scrub_end_hour                   6
global         advanced   osd_scrub_sleep                      0.10
global         advanced   public_network                       10.248.1.0/24  *
global         advanced   rbd_default_features                 7
global         advanced   rbd_default_features                 31
  mgr          advanced   mgr/balancer/active                  true
  mgr          advanced   mgr/balancer/mode                    upmap
  mgr          advanced   mgr/devicehealth/enable_monitoring   true

Note the duplicate 'rbd_default_features' entry. We've switched to kernel 5.3,
which supports object-map and fast-diff, and subsequently wanted to change the
default features for new RBD images to reflect this.
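For reference, the feature value is a bit mask (layering=1, striping=2,
exclusive-lock=4, object-map=8, fast-diff=16), so 7 is
layering+striping+exclusive-lock and 31 additionally enables object-map and
fast-diff. Existing images can be updated separately, e.g.:

rbd feature enable <pool>/<image> object-map fast-diff;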

Commands we entered to get here:
[admin@kvm1b ~]# ceph config dump | grep -e WHO -e rbd_default_features
WHO     MASK   LEVEL      OPTION                  VALUE   RO
global         advanced   rbd_default_features    7

[admin@kvm1b ~]# ceph config rm global rbd_default_features
[admin@kvm1b ~]# ceph config rm global rbd_default_features
[admin@kvm1b ~]# ceph config rm global rbd_default_features

[admin@kvm1b ~]# ceph config dump | grep -e WHO -e rbd_default_features
WHO     MASK   LEVEL      OPTION                  VALUE   RO
global         advanced   rbd_default_features    7

[admin@kvm1b ~]# ceph config set global rbd_default_features 31
[admin@kvm1b ~]# ceph config dump | grep -e WHO -e rbd_default_features
WHO     MASK   LEVEL      OPTION                  VALUE   RO
global         advanced   rbd_default_features    7
global         advanced   rbd_default_features    31



Regards
David Herselman


[ceph-users] Pool Max Avail and Ceph Dashboard Pool Useage on Nautilus giving different percentages

2019-12-10 Thread David Majchrzak, ODERLAND Webbhotell AB
Hi!

While browsing /#/pool in the Nautilus Ceph dashboard I noticed it said 93%
used on the single pool we have (3x replica).

ceph df detail, however, shows 81% used on the pool and 67% raw usage.

# ceph df detail
RAW STORAGE:
    CLASS    SIZE       AVAIL      USED       RAW USED    %RAW USED
    ssd      478 TiB    153 TiB    324 TiB    325 TiB     67.96
    TOTAL    478 TiB    153 TiB    324 TiB    325 TiB     67.96

POOLS:
    POOL    ID    STORED     OBJECTS    USED       %USED    MAX AVAIL    QUOTA OBJECTS    QUOTA BYTES    DIRTY     USED COMPR    UNDER COMPR
    echo     3    108 TiB    29.49M     324 TiB    81.61    24 TiB       N/A              N/A            29.49M    0 B           0 B


I know MAX AVAIL is calculated from the most full OSD (210 PGs, 79% used,
1.17 VAR). But where does the 93% full in the dashboard come from?

My guess is that it comes from calculating:

1 - MAX AVAIL / (USED + MAX AVAIL) = 0.93
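A quick check with the numbers from ceph df detail above (USED 324 TiB,
MAX AVAIL 24 TiB):

echo 'scale=4; 1 - 24/(324+24)' | bc
# -> .9311, i.e. ~93%, matching the dashboard figure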


Kind Regards,

David Majchrzak



[ceph-users] RGW bucket stats - strange behavior & slow performance requiring RGW restarts

2019-12-03 Thread David Monschein
Hi all,

I've been observing some strange behavior with my object storage cluster
running Nautilus 14.2.4. We currently have around 1800 buckets (A small
percentage of those buckets are actively used), with a total of 13.86M
objects. We have 20 RGWs right now, 10 for regular S3 access, and 10 for
static sites.

When calling $(radosgw-admin bucket stats), it normally comes back within a
few seconds, usually less than five. This returns stats for all buckets in
the cluster, which we use for accounting.

The strange behavior: Lately we've been observing a gradual increase in
runtime for bucket stats, which in extreme cases can take almost 10 minutes
to return. Things start out fine, and over the course of the week, the
runtime increases. From a few seconds to almost 10 minutes. Restarting all
of the S3 RGWs seems to fix this problem immediately. If we restart all the
radosgw processes, the runtime for bucket stats drops to 3 seconds.

This is odd behavior, and I've found nothing so far that would indicate why
this is happening. There is nothing suspicious in the RGW logs, although a
message about aborted multi-part uploads is in there:

2019-12-02 13:12:52.882 7faa7018f700 0 abort_bucket_multiparts WARNING :
aborted 8553000 incomplete multipart uploads

Otherwise, things look normal. Memory usage is low, CPU load is relatively
low and flat, and the cluster itself is not under heavy load.

Has anyone run into this before?


Re: [ceph-users] Tuning Nautilus for flash only

2019-11-28 Thread David Majchrzak, ODERLAND Webbhotell AB
Paul,

Absolutely, I said I was looking at those settings and most didn't make
any sense to me in a production environment (we've been running ceph
since Dumpling).

However, we only have one cluster on BlueStore and I wanted to get some
opinions on whether anything other than the defaults (in ceph.conf, in sysctl,
or things like the C-state tuning Wido suggested) would make any difference.
(Thank you Wido!)

Yes, running benchmarks is great, and we're already doing that
ourselves.

Cheers and have a nice evening!

-- 
David Majchrzak


On tor, 2019-11-28 at 17:46 +0100, Paul Emmerich wrote:
> Please don't run this config in production.
> Disabling checksumming is a bad idea, disabling authentication is
> also
> pretty bad.
> 
> There are also a few options in there that no longer exist (osd op
> threads) or are no longer relevant (max open files), in general, you
> should not blindly copy config files you find on the Internet. Only
> set an option to its non-default value after carefully checking what
> it does and whether it applies to your use case.
> 
> Also, run benchmarks yourself. Use benchmarks that are relevant to
> your use case.
> 
> Paul
> 



[ceph-users] Tuning Nautilus for flash only

2019-11-28 Thread David Majchrzak, ODERLAND Webbhotell AB
Hi!

We've deployed a new flash only ceph cluster running Nautilus and I'm
currently looking at any tunables we should set to get the most out of
our NVMe SSDs.

I've been looking a bit at the options from the blog post here:

https://ceph.io/community/bluestore-default-vs-tuned-performance-comparison/

with the conf here:
https://gist.github.com/likid0/1b52631ff5d0d649a22a3f30106ccea7

However some of them, like disabling checksumming, are for testing speed only
and not really applicable in a real-life scenario with critical data.

Should we stick with defaults or is there anything that could help?

We have 256 GB of RAM on each OSD host, 8 OSD hosts with 10 SSDs each and
2 OSD daemons per SSD. Should we raise the SSD BlueStore cache to 8 GB?
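As a sanity check on that: 10 SSDs with 2 OSD daemons each is 20 OSDs per
host, so 8 GB per OSD would budget roughly 160 GB of the 256 GB for Ceph. A
sketch of the relevant knobs (option names exist in Nautilus; the sizes are
illustrative, not recommendations):

[osd]
    osd_memory_target = 8589934592          # ~8 GiB per OSD; BlueStore autotunes its caches against this
    # or, to pin the BlueStore cache size directly instead:
    # bluestore_cache_size_ssd = 8589934592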

Workload is about 50/50 r/w ops running qemu VMs through librbd. So
mixed block size.

3 replicas.

Appreciate any advice!

Kind Regards,
-- 
David Majchrzak




[ceph-users] Dynamic bucket index resharding bug? - rgw.none showing unreal number of objects

2019-11-22 Thread David Monschein
Hi all. Running an Object Storage cluster with Ceph Nautilus 14.2.4.

We are running into what appears to be a serious bug that is affecting our
fairly new object storage cluster. While investigating some performance
issues -- seeing abnormally high IOPS, extremely slow bucket stat listings
(over 3 minutes) -- we noticed some dynamic bucket resharding jobs running.
Strangely enough they were resharding buckets that had very few objects.
Even more worrying was the number of new shards Ceph was planning: 65521

[root@os1 ~]# radosgw-admin reshard list
[
{
"time": "2019-11-22 00:12:40.192886Z",
"tenant": "",
"bucket_name": "redacteed",
"bucket_id": "c0d0b8a5-c63c-4c24-9dab-8deee88dbf0b.7000639.20",
"new_instance_id":
"redacted:c0d0b8a5-c63c-4c24-9dab-8deee88dbf0b.7552496.28",
"old_num_shards": 1,
"new_num_shards": 65521
}
]

Upon further inspection we noticed a seemingly impossible number of objects
(18446744073709551603) in rgw.none for the same bucket:
[root@os1 ~]# radosgw-admin bucket stats --bucket=redacted
{
"bucket": "redacted",
"tenant": "",
"zonegroup": "dbb69c5b-b33f-4af2-950c-173d695a4d2c",
"placement_rule": "default-placement",
"explicit_placement": {
"data_pool": "",
"data_extra_pool": "",
"index_pool": ""
},
"id": "c0d0b8a5-c63c-4c24-9dab-8deee88dbf0b.7000639.20",
"marker": "c0d0b8a5-c63c-4c24-9dab-8deee88dbf0b.7000639.20",
"index_type": "Normal",
"owner": "d52cb8cc-1f92-47f5-86bf-fb28bc6b592c",
"ver": "0#12623",
"master_ver": "0#0",
"mtime": "2019-11-22 00:18:41.753188Z",
"max_marker": "0#",
"usage": {
"rgw.none": {
"size": 0,
"size_actual": 0,
"size_utilized": 0,
"size_kb": 0,
"size_kb_actual": 0,
"size_kb_utilized": 0,
"num_objects": 18446744073709551603
},
"rgw.main": {
"size": 63410030,
"size_actual": 63516672,
"size_utilized": 63410030,
"size_kb": 61924,
"size_kb_actual": 62028,
"size_kb_utilized": 61924,
"num_objects": 27
},
"rgw.multimeta": {
"size": 0,
"size_actual": 0,
"size_utilized": 0,
"size_kb": 0,
"size_kb_actual": 0,
"size_kb_utilized": 0,
"num_objects": 0
}
},
"bucket_quota": {
"enabled": false,
"check_on_raw": false,
"max_size": -1,
"max_size_kb": 0,
"max_objects": -1
}
}

It would seem that the unreal number of objects in rgw.none is driving the
resharding process, making Ceph reshard the bucket into 65521 shards. I am
assuming 65521 is the limit.
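As a sanity check on that number (assuming num_objects is an unsigned 64-bit
counter that was decremented below zero):

echo '2^64 - 18446744073709551603' | bc
# -> 13, i.e. the counter appears to have underflowed to -13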

I have seen only a couple of references to this issue, none of which had a
resolution or much of a conversation around them:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-October/030791.html
https://tracker.ceph.com/issues/37942

For now we are cancelling these resharding jobs since they seem to be
causing performance issues with the cluster, but this is an untenable
solution. Does anyone know what is causing this? Or how to prevent it/fix
it?
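For anyone else in the same spot, cancelling a pending job looks roughly like
this (bucket name illustrative):

radosgw-admin reshard cancel --bucket=redacted
radosgw-admin reshard list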

Thanks,
Dave Monschein


Re: [ceph-users] Zombie OSD filesystems rise from the grave during bluestore conversion

2019-11-05 Thread J David
On Tue, Nov 5, 2019 at 2:21 PM Janne Johansson  wrote:
> I seem to recall some ticket where zap would "only" clear 100M of the drive, 
> but lvm and all partition info needed more to be cleared, so using dd  
> bs=1M count=1024 (or more!) would be needed to make sure no part of the OS 
> picks up anything from the previous contents.

Based on the output, it seems like it is systematically destroying the
LVM stuff and partitions, not just doing the dd.  So far, we've
converted ~80 OSDs with no real issues other than the occasional
surprise udev remount.

Thanks!


Re: [ceph-users] Zombie OSD filesystems rise from the grave during bluestore conversion

2019-11-05 Thread J David
On Tue, Nov 5, 2019 at 3:18 AM Paul Emmerich  wrote:
> could be a new feature, I've only realized this exists/works since Nautilus.
> You seem to be on a relatively old version since you still have ceph-disk
> installed

None of this is using ceph-disk?  It's all done with ceph-volume.

The ceph clusters are all running Luminous 12.2.12, which shouldn't be
*that* old!  (Looking forward to Nautilus but it hasn't been qualified
for production use by our team yet.)

But a couple of our ceph clusters, including the ones at issue here,
originally date back to Firefly, so who knows what artifacts of the
past are still lurking around?

The next approach may be to just try to stop udev while ceph-volume
lvm zap is running.
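A rough sketch of that idea on a systemd host (unit names as on stock
Debian/systemd; untested):

systemctl stop systemd-udevd.service systemd-udevd-control.socket systemd-udevd-kernel.socket
ceph-volume lvm zap /dev/sda --destroy
systemctl start systemd-udevd-control.socket systemd-udevd-kernel.socket systemd-udevd.service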

It seems like we have a couple of months to figure this out since
we've moved on to HDD OSDs and it takes a day or so to drain a single
one. :-/

Thanks!


Re: [ceph-users] Zombie OSD filesystems rise from the grave during bluestore conversion

2019-11-04 Thread J David
On Mon, Nov 4, 2019 at 1:32 PM Paul Emmerich  wrote:
> BTW: you can run destroy before stopping the OSD, you won't need the
> --yes-i-really-mean-it if it's drained in this case

This actually does not seem to work:

$ sudo ceph osd safe-to-destroy 42
OSD(s) 42 are safe to destroy without reducing data durability.
$ sudo ceph osd destroy 42
Error EPERM: Are you SURE? This will mean real, permanent data loss,
as well as cephx and lockbox keys. Pass --yes-i-really-mean-it if you
really do.

Is that a bug?  Or did we miss a step?

Thanks!


Re: [ceph-users] Zombie OSD filesystems rise from the grave during bluestore conversion

2019-11-04 Thread J David
On Mon, Nov 4, 2019 at 1:32 PM Paul Emmerich  wrote:
> That's probably the ceph-disk udev script being triggered from
> something somewhere (and a lot of things can trigger that script...)

That makes total sense.

> Work-around: convert everything to ceph-volume simple first by running
> "ceph-volume simple scan" and "ceph-volume simple activate", that will
> disable udev in the intended way.

OK. Is there possibly a more surgical approach?  It's going to take a
really long time to convert the cluster, so we don't want to do
anything global that might cause weirdness if any of the OSD servers
with unconverted OSDs need to be rebooted during the process.
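For reference, my reading of that work-around, per OSD (paths illustrative),
is roughly:

ceph-volume simple scan /var/lib/ceph/osd/ceph-68
ceph-volume simple activate --all

i.e. scan records the running OSD as JSON under /etc/ceph/osd/ and activate
re-wires it via systemd instead of the ceph-disk udev rules.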

> BTW: you can run destroy before stopping the OSD, you won't need the
> --yes-i-really-mean-it if it's drained in this case

Great, we'll try that!

Thanks!


[ceph-users] Zombie OSD filesystems rise from the grave during bluestore conversion

2019-11-04 Thread J David
While converting a luminous cluster from filestore to bluestore, we
are running into a weird race condition on a fairly regular basis.

We have a master script that writes upgrade scripts for each OSD
server.  The script for an OSD looks like this:

ceph osd out 68
while ! ceph osd safe-to-destroy 68 ; do sleep 10 ; done
systemctl stop ceph-osd@68
sleep 10
systemctl kill ceph-osd@68
sleep 10
umount /var/lib/ceph/osd/ceph-68
ceph osd destroy 68 --yes-i-really-mean-it
ceph-volume lvm zap /dev/sda --destroy
ceph-volume lvm create --bluestore --data /dev/sda --osd-id 68
sleep 10
while [ "`ceph health`" != "HEALTH_OK" ] ; do ceph health; sleep 10 ; done

(It's run with sh -e so any error will cause an abort.)

The problem we run into is that, in about 1 out of 10 runs, this gets to the
"lvm zap" stage and fails:

--> Zapping: /dev/sda
Running command: wipefs --all /dev/sda2
Running command: dd if=/dev/zero of=/dev/sda2 bs=1M count=10
 stderr: 10+0 records in
10+0 records out
10485760 bytes (10 MB, 10 MiB) copied, 0.00667608 s, 1.6 GB/s
--> Destroying partition since --destroy was used: /dev/sda2
Running command: parted /dev/sda --script -- rm 2
--> Unmounting /dev/sda1
Running command: umount -v /dev/sda1
 stderr: umount: /var/lib/ceph/tmp/mnt.9k0GDx (/dev/sda1) unmounted
Running command: wipefs --all /dev/sda1
 stderr: wipefs: error: /dev/sda1: probing initialization failed:
 stderr: Device or resource busy
-->  RuntimeError: command returned non-zero exit status: 1

And, lo and behold, it's right: /dev/sda1 has been remounted as
/var/lib/ceph/osd/ceph-68.

That's after the OSD has been stopped, killed, and destroyed; there
*is no* osd.68.  It happens after the filesystem has been unmounted
twice (once by an explicit umount and once by "lvm zap").  The "lvm zap"
umount shown here with the path /var/lib/ceph/tmp/mnt.9k0GDx
suggests that the remount is happening in the background somewhere
while the lvm zap is running.

If we do the zap before the osd destroy, the same thing happens but
the (still-existing) OSD does not actually restart.  So it's just the
filesystem that won't stay unmounted long enough to destroy it, not
the whole OSD.

What's causing this?  How do we keep the filesystem from lurching out
of the grave in mid-conversion like this?

This is on Debian Stretch with systemd, if that matters.

Thanks!


[ceph-users] Using multisite to migrate data between bucket data pools.

2019-10-30 Thread David Turner
This is a tangent on Paul Emmerich's response to "[ceph-users] Correct
Migration Workflow Replicated -> Erasure Code". I've tried Paul's method
before to migrate between 2 data pools. However I ran into some issues.

The first issue seems like a bug in RGW where the RGW for the new zone was
able to pull data directly from the data pool of the original zone after
the metadata had been sync'd. The metadata seemed to realize the file
actually exists and so it went ahead and grabbed it from the pool backing
the other zone. I worked around that slightly by using cephx to specify
which pools each RGW user could access, but it gives a permission denied
error instead of a file not found error. This happens on buckets that are
set not to replicate as well as buckets that failed to sync properly. Seems
like a bit of a security threat, but not a super common situation at all.

The second issue I think has to do with corrupt index files in my index
pool. Some of the buckets I don't need any more so I went to delete them
for simplicity, but the command failed to delete them. I just set them
aside for now and can just set the ones that I don't need any more to not
replicate on the bucket level. That works for most things, but then I have
a few buckets that I need to migrate, but when I set them to start
replicating the data sync between zones gets stuck. Does anyone have any
ideas on how to clean up the bucket indexes to make these operations
possible?
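For the record, the only index-repair command I'm aware of is the following
(use with care; --fix rewrites the bucket index):

radosgw-admin bucket check --bucket=<bucket> --check-objects --fix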

At this point I've disabled multisite and cleared up the new zone so I can
run operations on these buckets without dealing with multisite and
replication. I've tried a few things and can get some additional
information on my specific errors tomorrow at work.


-- Forwarded message -
From: Paul Emmerich 
Date: Wed, Oct 30, 2019 at 4:32 AM
Subject: [ceph-users] Re: Correct Migration Workflow Replicated -> Erasure
Code
To: Konstantin Shalygin 
Cc: Mac Wynkoop , ceph-users 


We've solved this off-list (because I already got access to the cluster)

For the list:

Copying on rados level is possible, but requires to shut down radosgw
to get a consistent copy. This wasn't feasible here due to the size
and performance.
We've instead added a second zone where the placement maps to an EC
pool to the zonegroup and it's currently copying over data. We'll then
make the second zone master and default and ultimately delete the
first one.
This allows for a migration without downtime.

Another possibility would be using a Transition lifecycle rule, but
that's not ideal because it doesn't actually change the bucket.

I don't think it would be too complicated to add a native bucket
migration mechanism that works similar to "bucket rewrite" (which is
intended for something similar but different).

Paul

-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90


Re: [ceph-users] ceph balancer do not start

2019-10-22 Thread David Turner
Off the top of my head, I'd say your cluster might have the wrong tunables
for crush-compat. I know I ran into that when I first set up the balancer
and nothing obviously said that was the problem; only researching found it
for me.

My real question, though, is why aren't you using upmap? It is
significantly better than crush-compat. Unless you have clients on really
old kernels that can't update or that are on pre-luminous Ceph versions
that can't update, there's really no reason not to use upmap.
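Switching is only a couple of commands (assuming every client really is
Luminous or newer):

ceph osd set-require-min-compat-client luminous
ceph balancer mode upmap
ceph balancer on
ceph balancer status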

On Mon, Oct 21, 2019, 8:08 AM Jan Peters  wrote:

> Hello,
>
> I use ceph 12.2.12 and would like to activate the ceph balancer.
>
> unfortunately no redistribution of the PGs is started:
>
> ceph balancer status
> {
> "active": true,
> "plans": [],
> "mode": "crush-compat"
> }
>
> ceph balancer eval
> current cluster score 0.023776 (lower is better)
>
>
> ceph config-key dump
> {
> "initial_mon_keyring":
> "AQBLchlbABAA+5CuVU+8MB69xfc3xAXkjQ==",
> "mgr/balancer/active": "1",
> "mgr/balancer/max_misplaced:": "0.01",
> "mgr/balancer/mode": "crush-compat"
> }
>
>
> What am I not doing correctly?
>
> best regards


Re: [ceph-users] Decreasing the impact of reweighting osds

2019-10-22 Thread David Turner
Most times you are better served with simpler settings like
osd_recovery_sleep, which has three variants if you have multiple types of OSDs
in your cluster (osd_recovery_sleep_hdd, osd_recovery_sleep_ssd,
osd_recovery_sleep_hybrid).
Using those you can tweak a specific type of OSD that might be having
problems during recovery/backfill while allowing the others to continue to
backfill at regular speeds.
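For example (values illustrative, not recommendations):

ceph tell osd.* injectargs '--osd_recovery_sleep_hdd 0.2'
ceph tell osd.* injectargs '--osd_recovery_sleep_ssd 0.0'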

Additionally you mentioned reweighting OSDs, but it sounded like you do
this manually. The balancer module, especially in upmap mode, can be
configured quite well to minimize client IO impact while balancing. You can
specify times of day that it can move data (only in UTC, it ignores local
timezones), a threshold of misplaced data that it will stop moving PGs at,
the increment size it will change weights with per operation, how many
weights it will adjust with each pass, etc.
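A sketch of a few of those knobs (key names from the balancer module;
availability and exact names depend on your release, and on newer releases
they are set with "ceph config set mgr ..." instead):

ceph config-key set mgr/balancer/begin_time 2300
ceph config-key set mgr/balancer/end_time 0600
ceph config-key set mgr/balancer/max_misplaced 0.01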

On Tue, Oct 22, 2019, 6:07 PM Mark Kirkwood 
wrote:

> Thanks - that's a good suggestion!
>
> However I'd still like to know the answers to my 2 questions.
>
> regards
>
> Mark
>
> On 22/10/19 11:22 pm, Paul Emmerich wrote:
> > getting rid of filestore solves most latency spike issues during
> > recovery because they are often caused by random XFS hangs (splitting
> > dirs or just xfs having a bad day)
> >
> >
> > Paul
> >


Re: [ceph-users] [Nfs-ganesha-devel] 2.7.3 with CEPH_FSAL Crashing

2019-10-10 Thread David C
Thanks, Patrick. Looks like the fix is awaiting review; I guess my options
are to hold tight for 14.2.5 or patch it myself if I get desperate. I've seen
this crash about 4 times over the past 96 hours; is there anything I can do
to mitigate the issue in the meantime?

On Wed, Oct 9, 2019 at 9:23 PM Patrick Donnelly  wrote:

> Looks like this bug: https://tracker.ceph.com/issues/41148
>
> On Wed, Oct 9, 2019 at 1:15 PM David C  wrote:
> >
> > Hi Daniel
> >
> > Thanks for looking into this. I hadn't installed ceph-debuginfo, here's
> the bt with line numbers:
> >
> > #0  operator uint64_t (this=0x10) at
> /usr/src/debug/ceph-14.2.2/src/include/object.h:123
> > #1  Client::fill_statx (this=this@entry=0x274b980, in=0x0,
> mask=mask@entry=341, stx=stx@entry=0x7fccdbefa210) at
> /usr/src/debug/ceph-14.2.2/src/client/Client.cc:7336
> > #2  0x7fce4ea1d4ca in fill_statx (stx=0x7fccdbefa210, mask=341,
> in=..., this=0x274b980) at
> /usr/src/debug/ceph-14.2.2/src/client/Client.h:898
> > #3  Client::_readdir_cache_cb (this=this@entry=0x274b980,
> dirp=dirp@entry=0x7fcb7d0e7860,
> > cb=cb@entry=0x7fce4e9d0950 <_readdir_single_dirent_cb(void*,
> dirent*, ceph_statx*, off_t, Inode*)>, p=p@entry=0x7fccdbefa6a0,
> caps=caps@entry=341,
> > getref=getref@entry=true) at
> /usr/src/debug/ceph-14.2.2/src/client/Client.cc:7999
> > #4  0x7fce4ea1e865 in Client::readdir_r_cb (this=0x274b980,
> d=0x7fcb7d0e7860,
> > cb=cb@entry=0x7fce4e9d0950 <_readdir_single_dirent_cb(void*,
> dirent*, ceph_statx*, off_t, Inode*)>, p=p@entry=0x7fccdbefa6a0,
> want=want@entry=1775,
> > flags=flags@entry=0, getref=true) at
> /usr/src/debug/ceph-14.2.2/src/client/Client.cc:8138
> > #5  0x7fce4ea1f3dd in Client::readdirplus_r (this=,
> d=, de=de@entry=0x7fccdbefa8c0, stx=stx@entry=0x7fccdbefa730,
> want=want@entry=1775,
> > flags=flags@entry=0, out=0x7fccdbefa720) at
> /usr/src/debug/ceph-14.2.2/src/client/Client.cc:8307
> > #6  0x7fce4e9c92d8 in ceph_readdirplus_r (cmount=,
> dirp=, de=de@entry=0x7fccdbefa8c0, stx=stx@entry
> =0x7fccdbefa730,
> > want=want@entry=1775, flags=flags@entry=0, out=out@entry=0x7fccdbefa720)
> at /usr/src/debug/ceph-14.2.2/src/libcephfs.cc:629
> > #7  0x7fce4ece7b0e in fsal_ceph_readdirplus (dir=,
> cred=, out=0x7fccdbefa720, flags=0, want=1775,
> stx=0x7fccdbefa730, de=0x7fccdbefa8c0,
> > dirp=, cmount=) at
> /usr/src/debug/nfs-ganesha-2.7.3/FSAL/FSAL_CEPH/statx_compat.h:314
> > #8  ceph_fsal_readdir (dir_pub=, whence=,
> dir_state=0x7fccdbefaa30, cb=0x522640 ,
> attrmask=122830,
> > eof=0x7fccdbefac0b) at
> /usr/src/debug/nfs-ganesha-2.7.3/FSAL/FSAL_CEPH/handle.c:211
> > #9  0x005256e1 in mdcache_readdir_uncached
> (directory=directory@entry=0x7fcaa8bb84a0, whence=,
> dir_state=, cb=,
> > attrmask=, eod_met=) at
> /usr/src/debug/nfs-ganesha-2.7.3/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:1654
> > #10 0x00517a88 in mdcache_readdir (dir_hdl=0x7fcaa8bb84d8,
> whence=0x7fccdbefab18, dir_state=0x7fccdbefab30, cb=0x432db0
> , attrmask=122830,
> > eod_met=0x7fccdbefac0b) at
> /usr/src/debug/nfs-ganesha-2.7.3/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_handle.c:551
> > #11 0x0043434a in fsal_readdir 
> > (directory=directory@entry=0x7fcaa8bb84d8,
> cookie=cookie@entry=0, nbfound=nbfound@entry=0x7fccdbefac0c,
> > eod_met=eod_met@entry=0x7fccdbefac0b, attrmask=122830, 
> > cb=cb@entry=0x46f600
> , opaque=opaque@entry=0x7fccdbefac20)
> > at /usr/src/debug/nfs-ganesha-2.7.3/FSAL/fsal_helper.c:1164
> > #12 0x004705b9 in nfs4_op_readdir (op=0x7fcb7fed1f80,
> data=0x7fccdbefaea0, resp=0x7fcb7d106c40)
> > at
> /usr/src/debug/nfs-ganesha-2.7.3/Protocols/NFS/nfs4_op_readdir.c:664
> > #13 0x0045d120 in nfs4_Compound (arg=,
> req=, res=0x7fcb7e001000)
> > at /usr/src/debug/nfs-ganesha-2.7.3/Protocols/NFS/nfs4_Compound.c:942
> > #14 0x004512cd in nfs_rpc_process_request
> (reqdata=0x7fcb7e1d1950) at
> /usr/src/debug/nfs-ganesha-2.7.3/MainNFSD/nfs_worker_thread.c:1328
> > #15 0x00450766 in nfs_rpc_decode_request (xprt=0x7fcaf17fb0e0,
> xdrs=0x7fcb7e1ddb90) at
> /usr/src/debug/nfs-ganesha-2.7.3/MainNFSD/nfs_rpc_dispatcher_thread.c:1345
> > #16 0x7fce6165707d in svc_rqst_xprt_task (wpe=0x7fcaf17fb2f8) at
> /usr/src/debug/nfs-ganesha-2.7.3/libntirpc/src/svc_rqst.c:769
> > #17 0x7fce6165759a in svc_rqst_epoll_events (n_events= out>, sr_rec=0x56a24c0) at
> /usr/src/debug/nfs-ganesha-2.7.3/libntirpc/src/svc_rqst.c:941
> > #18 svc_rqst_epoll_loop (sr_rec=) at
> /usr/src/debug/nfs-ganesha-2.7.3/libntirpc/src/svc_rqst.c

Re: [ceph-users] [Nfs-ganesha-devel] 2.7.3 with CEPH_FSAL Crashing

2019-10-09 Thread David C
Hi Daniel

Thanks for looking into this. I hadn't installed ceph-debuginfo, here's the
bt with line numbers:

#0  operator uint64_t (this=0x10) at
/usr/src/debug/ceph-14.2.2/src/include/object.h:123
#1  Client::fill_statx (this=this@entry=0x274b980, in=0x0, mask=mask@entry=341,
stx=stx@entry=0x7fccdbefa210) at
/usr/src/debug/ceph-14.2.2/src/client/Client.cc:7336
#2  0x7fce4ea1d4ca in fill_statx (stx=0x7fccdbefa210, mask=341, in=...,
this=0x274b980) at /usr/src/debug/ceph-14.2.2/src/client/Client.h:898
#3  Client::_readdir_cache_cb (this=this@entry=0x274b980, dirp=dirp@entry
=0x7fcb7d0e7860,
cb=cb@entry=0x7fce4e9d0950 <_readdir_single_dirent_cb(void*, dirent*,
ceph_statx*, off_t, Inode*)>, p=p@entry=0x7fccdbefa6a0, caps=caps@entry=341,
getref=getref@entry=true) at
/usr/src/debug/ceph-14.2.2/src/client/Client.cc:7999
#4  0x7fce4ea1e865 in Client::readdir_r_cb (this=0x274b980,
d=0x7fcb7d0e7860,
cb=cb@entry=0x7fce4e9d0950 <_readdir_single_dirent_cb(void*, dirent*,
ceph_statx*, off_t, Inode*)>, p=p@entry=0x7fccdbefa6a0, want=want@entry
=1775,
flags=flags@entry=0, getref=true) at
/usr/src/debug/ceph-14.2.2/src/client/Client.cc:8138
#5  0x7fce4ea1f3dd in Client::readdirplus_r (this=,
d=, de=de@entry=0x7fccdbefa8c0, stx=stx@entry=0x7fccdbefa730,
want=want@entry=1775,
flags=flags@entry=0, out=0x7fccdbefa720) at
/usr/src/debug/ceph-14.2.2/src/client/Client.cc:8307
#6  0x7fce4e9c92d8 in ceph_readdirplus_r (cmount=,
dirp=, de=de@entry=0x7fccdbefa8c0, stx=stx@entry
=0x7fccdbefa730,
want=want@entry=1775, flags=flags@entry=0, out=out@entry=0x7fccdbefa720)
at /usr/src/debug/ceph-14.2.2/src/libcephfs.cc:629
#7  0x7fce4ece7b0e in fsal_ceph_readdirplus (dir=,
cred=, out=0x7fccdbefa720, flags=0, want=1775,
stx=0x7fccdbefa730, de=0x7fccdbefa8c0,
dirp=, cmount=) at
/usr/src/debug/nfs-ganesha-2.7.3/FSAL/FSAL_CEPH/statx_compat.h:314
#8  ceph_fsal_readdir (dir_pub=, whence=,
dir_state=0x7fccdbefaa30, cb=0x522640 ,
attrmask=122830,
eof=0x7fccdbefac0b) at
/usr/src/debug/nfs-ganesha-2.7.3/FSAL/FSAL_CEPH/handle.c:211
#9  0x005256e1 in mdcache_readdir_uncached
(directory=directory@entry=0x7fcaa8bb84a0, whence=,
dir_state=, cb=,
attrmask=, eod_met=) at
/usr/src/debug/nfs-ganesha-2.7.3/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:1654
#10 0x00517a88 in mdcache_readdir (dir_hdl=0x7fcaa8bb84d8,
whence=0x7fccdbefab18, dir_state=0x7fccdbefab30, cb=0x432db0
, attrmask=122830,
eod_met=0x7fccdbefac0b) at
/usr/src/debug/nfs-ganesha-2.7.3/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_handle.c:551
#11 0x0043434a in fsal_readdir
(directory=directory@entry=0x7fcaa8bb84d8,
cookie=cookie@entry=0, nbfound=nbfound@entry=0x7fccdbefac0c,
eod_met=eod_met@entry=0x7fccdbefac0b, attrmask=122830, cb=cb@entry=0x46f600
, opaque=opaque@entry=0x7fccdbefac20)
at /usr/src/debug/nfs-ganesha-2.7.3/FSAL/fsal_helper.c:1164
#12 0x004705b9 in nfs4_op_readdir (op=0x7fcb7fed1f80,
data=0x7fccdbefaea0, resp=0x7fcb7d106c40)
at /usr/src/debug/nfs-ganesha-2.7.3/Protocols/NFS/nfs4_op_readdir.c:664
#13 0x0045d120 in nfs4_Compound (arg=,
req=, res=0x7fcb7e001000)
at /usr/src/debug/nfs-ganesha-2.7.3/Protocols/NFS/nfs4_Compound.c:942
#14 0x004512cd in nfs_rpc_process_request (reqdata=0x7fcb7e1d1950)
at /usr/src/debug/nfs-ganesha-2.7.3/MainNFSD/nfs_worker_thread.c:1328
#15 0x00450766 in nfs_rpc_decode_request (xprt=0x7fcaf17fb0e0,
xdrs=0x7fcb7e1ddb90) at
/usr/src/debug/nfs-ganesha-2.7.3/MainNFSD/nfs_rpc_dispatcher_thread.c:1345
#16 0x7fce6165707d in svc_rqst_xprt_task (wpe=0x7fcaf17fb2f8) at
/usr/src/debug/nfs-ganesha-2.7.3/libntirpc/src/svc_rqst.c:769
#17 0x7fce6165759a in svc_rqst_epoll_events (n_events=,
sr_rec=0x56a24c0) at
/usr/src/debug/nfs-ganesha-2.7.3/libntirpc/src/svc_rqst.c:941
#18 svc_rqst_epoll_loop (sr_rec=) at
/usr/src/debug/nfs-ganesha-2.7.3/libntirpc/src/svc_rqst.c:1014
#19 svc_rqst_run_task (wpe=0x56a24c0) at
/usr/src/debug/nfs-ganesha-2.7.3/libntirpc/src/svc_rqst.c:1050
#20 0x7fce6165f123 in work_pool_thread (arg=0x7fcd381c77b0) at
/usr/src/debug/nfs-ganesha-2.7.3/libntirpc/src/work_pool.c:181
#21 0x7fce5fc17dd5 in start_thread (arg=0x7fccdbefe700) at
pthread_create.c:307
#22 0x7fce5ed8eead in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:111

On Mon, Oct 7, 2019 at 3:40 PM Daniel Gryniewicz  wrote:

> Client::fill_statx() is a fairly large function, so it's hard to know
> what's causing the crash.  Can you get line numbers from your backtrace?
>
> Daniel
>
> On 10/7/19 9:59 AM, David C wrote:
> > Hi All
> >
> > Further to my previous messages, I upgraded
> > to libcephfs2-14.2.2-0.el7.x86_64 as suggested and things certainly seem
> > a lot more stable, I have had some crashes though, could someone assist
> > in debugging this latest crash please?
> >
> > (gdb) bt
> > #0  0x7fce4e9fc1bb in Client::fi

Re: [ceph-users] [Nfs-ganesha-devel] 2.7.3 with CEPH_FSAL Crashing

2019-10-07 Thread David C
Hi All

Further to my previous messages, I upgraded
to libcephfs2-14.2.2-0.el7.x86_64 as suggested and things certainly seem a
lot more stable, I have had some crashes though, could someone assist in
debugging this latest crash please?

(gdb) bt
#0  0x7fce4e9fc1bb in Client::fill_statx(Inode*, unsigned int,
ceph_statx*) () from /lib64/libcephfs.so.2
#1  0x7fce4ea1d4ca in Client::_readdir_cache_cb(dir_result_t*, int
(*)(void*, dirent*, ceph_statx*, long, Inode*), void*, int, bool) () from
/lib64/libcephfs.so.2
#2  0x7fce4ea1e865 in Client::readdir_r_cb(dir_result_t*, int
(*)(void*, dirent*, ceph_statx*, long, Inode*), void*, unsigned int,
unsigned int, bool) () from /lib64/libcephfs.so.2
#3  0x7fce4ea1f3dd in Client::readdirplus_r(dir_result_t*, dirent*,
ceph_statx*, unsigned int, unsigned int, Inode**) () from
/lib64/libcephfs.so.2
#4  0x7fce4ece7b0e in fsal_ceph_readdirplus (dir=,
cred=, out=0x7fccdbefa720, flags=0, want=1775,
stx=0x7fccdbefa730, de=0x7fccdbefa8c0, dirp=,
cmount=)
at /usr/src/debug/nfs-ganesha-2.7.3/FSAL/FSAL_CEPH/statx_compat.h:314
#5  ceph_fsal_readdir (dir_pub=, whence=,
dir_state=0x7fccdbefaa30, cb=0x522640 ,
attrmask=122830, eof=0x7fccdbefac0b) at
/usr/src/debug/nfs-ganesha-2.7.3/FSAL/FSAL_CEPH/handle.c:211
#6  0x005256e1 in mdcache_readdir_uncached
(directory=directory@entry=0x7fcaa8bb84a0, whence=,
dir_state=, cb=, attrmask=,
eod_met=)
at
/usr/src/debug/nfs-ganesha-2.7.3/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:1654
#7  0x00517a88 in mdcache_readdir (dir_hdl=0x7fcaa8bb84d8,
whence=0x7fccdbefab18, dir_state=0x7fccdbefab30, cb=0x432db0
, attrmask=122830, eod_met=0x7fccdbefac0b) at
/usr/src/debug/nfs-ganesha-2.7.3/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_handle.c:551
#8  0x0043434a in fsal_readdir
(directory=directory@entry=0x7fcaa8bb84d8,
cookie=cookie@entry=0, nbfound=nbfound@entry=0x7fccdbefac0c,
eod_met=eod_met@entry=0x7fccdbefac0b, attrmask=122830, cb=cb@entry=0x46f600
, opaque=opaque@entry=0x7fccdbefac20)
at /usr/src/debug/nfs-ganesha-2.7.3/FSAL/fsal_helper.c:1164
#9  0x004705b9 in nfs4_op_readdir (op=0x7fcb7fed1f80,
data=0x7fccdbefaea0, resp=0x7fcb7d106c40) at
/usr/src/debug/nfs-ganesha-2.7.3/Protocols/NFS/nfs4_op_readdir.c:664
#10 0x0045d120 in nfs4_Compound (arg=,
req=, res=0x7fcb7e001000) at
/usr/src/debug/nfs-ganesha-2.7.3/Protocols/NFS/nfs4_Compound.c:942
#11 0x004512cd in nfs_rpc_process_request (reqdata=0x7fcb7e1d1950)
at /usr/src/debug/nfs-ganesha-2.7.3/MainNFSD/nfs_worker_thread.c:1328
#12 0x00450766 in nfs_rpc_decode_request (xprt=0x7fcaf17fb0e0,
xdrs=0x7fcb7e1ddb90) at
/usr/src/debug/nfs-ganesha-2.7.3/MainNFSD/nfs_rpc_dispatcher_thread.c:1345
#13 0x7fce6165707d in svc_rqst_xprt_task (wpe=0x7fcaf17fb2f8) at
/usr/src/debug/nfs-ganesha-2.7.3/libntirpc/src/svc_rqst.c:769
#14 0x7fce6165759a in svc_rqst_epoll_events (n_events=,
sr_rec=0x56a24c0) at
/usr/src/debug/nfs-ganesha-2.7.3/libntirpc/src/svc_rqst.c:941
#15 svc_rqst_epoll_loop (sr_rec=) at
/usr/src/debug/nfs-ganesha-2.7.3/libntirpc/src/svc_rqst.c:1014
#16 svc_rqst_run_task (wpe=0x56a24c0) at
/usr/src/debug/nfs-ganesha-2.7.3/libntirpc/src/svc_rqst.c:1050
#17 0x7fce6165f123 in work_pool_thread (arg=0x7fcd381c77b0) at
/usr/src/debug/nfs-ganesha-2.7.3/libntirpc/src/work_pool.c:181
#18 0x7fce5fc17dd5 in start_thread () from /lib64/libpthread.so.0
#19 0x7fce5ed8eead in clone () from /lib64/libc.so.6

Package versions:

nfs-ganesha-vfs-2.7.3-0.1.el7.x86_64
nfs-ganesha-debuginfo-2.7.3-0.1.el7.x86_64
nfs-ganesha-ceph-2.7.3-0.1.el7.x86_64
nfs-ganesha-2.7.3-0.1.el7.x86_64
libcephfs2-14.2.2-0.el7.x86_64
librados2-14.2.2-0.el7.x86_64

Ganesha export:

EXPORT
{
Export_ID=100;
Protocols = 4;
Transports = TCP;
Path = /;
Pseudo = /ceph/;
Access_Type = RW;
Attr_Expiration_Time = 0;
Disable_ACL = FALSE;
Manage_Gids = TRUE;
Filesystem_Id = 100.1;
FSAL {
Name = CEPH;
}
}

Ceph.conf:

[client]
mon host = --removed--
client_oc_size = 6291456000 #6GB
client_acl_type=posix_acl
client_quota = true
client_quota_df = true

Client mount options:

rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,soft,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=removed,local_lock=none,addr=removed)

On Fri, Jul 19, 2019 at 5:47 PM David C  wrote:

> Thanks, Jeff. I'll give 14.2.2 a go when it's released.
>
> On Wed, 17 Jul 2019, 22:29 Jeff Layton,  wrote:
>
>> Ahh, I just noticed you were running nautilus on the client side. This
>> patch went into v14.2.2, so once you update to that you should be good
>> to go.
>>
>> -- Jeff
>>
>> On Wed, 2019-07-17 at 17:10 -0400, Jeff Layton wrote:
>> > This is almost certainly the same bug that is fixed here:
>> >
>> > https://github.com/ceph/ceph/pull/28324
>> >
>> > It should get backported 

Re: [ceph-users] eu.ceph.com mirror out of sync?

2019-09-23 Thread David Majchrzak, ODERLAND Webbhotell AB
Hi,

I'll have a look at the status of se.ceph.com tomorrow morning, it's
maintained by us.

Kind Regards,

David


On mån, 2019-09-23 at 22:41 +0200, Oliver Freyermuth wrote:
> Hi together,
> 
> the EU mirror still seems to be out-of-sync - does somebody on this
> list happen to know whom to contact about this?
> Or is this mirror unmaintained and we should switch to something
> else?
> 
> Going through the list of appropriate mirrors from 
> https://docs.ceph.com/docs/master/install/mirrors/ (we are in
> Germany) I also find:
>http://de.ceph.com/
> (the mirror in Germany) to be non-resolvable.
> 
> Closest by then for us is possibly France:
>http://fr.ceph.com/rpm-nautilus/el7/x86_64/
> but also here, there's only 14.2.2, so that's also out-of-sync.
> 
> So in the EU, at least geographically, this only leaves Sweden and
> UK.
> Sweden at se.ceph.com does not load for me, but UK indeed seems fine.
> 
> Should people in the EU use that mirror, or should we all just use
> download.ceph.com instead of something geographically close-by?
> 
> Cheers,
>   Oliver
> 
> 
> On 2019-09-17 23:01, Oliver Freyermuth wrote:
> > Dear Cephalopodians,
> > 
> > I realized just now that:
> >https://eu.ceph.com/rpm-nautilus/el7/x86_64/
> > still holds only released up to 14.2.2, and nothing is to be seen
> > of 14.2.3 or 14.2.4,
> > while the main repository at:
> >https://download.ceph.com/rpm-nautilus/el7/x86_64/
> > looks as expected.
> > 
> > Is this issue with the eu.ceph.com mirror already knwon?
> > 
> > Cheers,
> >  Oliver
> > 
> > 


[ceph-users] Problem formatting erasure coded image

2019-09-22 Thread David Herselman
Hi,

I'm seeing errors in Windows VM guests' event logs, for example:
The IO operation at logical block address 0x607bf7 for Disk 1 (PDO name 
\Device\001e) was retried
Log Name: System
Source: Disk
Event ID: 153
Level: Warning

Initialising the disk to use GPT is successful but attempting to create a 
standard NTFS volume eventually times out and fails.


Pretty sure this is in production in numerous environments, so I must be doing
something wrong... Could anyone please validate that a cache-tiered, erasure-coded
RBD image can be used as a Windows VM data disc?


Running Ceph Nautilus 14.2.4 with kernel 5.0.21

Created new erasure coded pool backed by spinners and a new replicated ssd pool 
for metadata:
ceph osd erasure-code-profile set ec32_hdd \
  plugin=jerasure k=3 m=2 technique=reed_sol_van \
  crush-root=default crush-failure-domain=host crush-device-class=hdd \
  directory=/usr/lib/ceph/erasure-code;
ceph osd pool create ec_hdd 64 erasure ec32_hdd;
ceph osd pool set ec_hdd allow_ec_overwrites true;
ceph osd pool application enable ec_hdd rbd;

ceph osd crush rule create-replicated replicated_ssd default host ssd;
ceph osd pool create rbd_ssd 64 64 replicated replicated_ssd;
ceph osd pool application enable rbd_ssd rbd;

rbd create rbd_ssd/surveylance-recordings --size 1T --data-pool ec_hdd;

Added a caching tier:
ceph osd pool create ec_hdd_cache 64 64 replicated replicated_ssd;
ceph osd tier add ec_hdd ec_hdd_cache;
ceph osd tier cache-mode ec_hdd_cache writeback;
ceph osd tier set-overlay ec_hdd ec_hdd_cache;
ceph osd pool set ec_hdd_cache hit_set_type bloom;

ceph osd pool set ec_hdd_cache hit_set_count 12
ceph osd pool set ec_hdd_cache hit_set_period 14400
ceph osd pool set ec_hdd_cache target_max_bytes $[128*1024*1024*1024]
ceph osd pool set ec_hdd_cache min_read_recency_for_promote 2
ceph osd pool set ec_hdd_cache min_write_recency_for_promote 2
ceph osd pool set ec_hdd_cache cache_target_dirty_ratio 0.4
ceph osd pool set ec_hdd_cache cache_target_dirty_high_ratio 0.6
ceph osd pool set ec_hdd_cache cache_target_full_ratio 0.8


Image appears to have been created correctly:
rbd ls rbd_ssd -l
NAME   SIZE  PARENT FMT PROT LOCK
surveylance-recordings 1 TiB  2

rbd info rbd_ssd/surveylance-recordings
rbd image 'surveylance-recordings':
size 1 TiB in 262144 objects
order 22 (4 MiB objects)
snapshot_count: 0
id: 7341cc54df71f
data_pool: ec_hdd
block_name_prefix: rbd_data.2.7341cc54df71f
format: 2
features: layering, data-pool
op_features:
flags:
create_timestamp: Sun Sep 22 17:47:30 2019
access_timestamp: Sun Sep 22 17:47:30 2019
modify_timestamp: Sun Sep 22 17:47:30 2019

Ceph appears healthy:
ceph -s
  cluster:
id: 31f6ea46-12cb-47e8-a6f3-60fb6bbd1782
health: HEALTH_OK

  services:
mon: 3 daemons, quorum kvm1a,kvm1b,kvm1c (age 5d)
mgr: kvm1c(active, since 5d), standbys: kvm1b, kvm1a
mds: cephfs:1 {0=kvm1c=up:active} 2 up:standby
osd: 24 osds: 24 up (since 4d), 24 in (since 4d)

  data:
pools:   9 pools, 417 pgs
objects: 325.04k objects, 1.1 TiB
usage:   3.3 TiB used, 61 TiB / 64 TiB avail
pgs: 417 active+clean

  io:
client:   25 KiB/s rd, 2.7 MiB/s wr, 17 op/s rd, 306 op/s wr
cache:0 op/s promote

ceph df
  RAW STORAGE:
    CLASS    SIZE       AVAIL      USED       RAW USED    %RAW USED
    hdd      62 TiB     59 TiB     2.9 TiB    2.9 TiB      4.78
    ssd      2.4 TiB    2.1 TiB    303 GiB    309 GiB     12.36
    TOTAL    64 TiB     61 TiB     3.2 TiB    3.3 TiB      5.07

  POOLS:
    POOL                     ID    STORED     OBJECTS    USED       %USED    MAX AVAIL
    rbd_hdd                   1    995 GiB    289.54k    2.9 TiB     5.23    18 TiB
    rbd_ssd                   2    17 B       4          48 KiB      0       666 GiB
    rbd_hdd_cache             3    99 GiB     34.91k     302 GiB    13.13    666 GiB
    cephfs_data               4    2.1 GiB    526        6.4 GiB     0.01    18 TiB
    cephfs_metadata           5    767 KiB    22         3.7 MiB     0       18 TiB
    device_health_metrics     6    5.9 MiB    24         5.9 MiB     0       18 TiB
    ec_hdd                   10    4.0 MiB    3          7.5 MiB     0       32 TiB
    ec_hdd_cache             11    67 MiB     30         200 MiB     0       666 GiB



Regards
David Herselman



[ceph-users] FYI: Mailing list domain change

2019-08-07 Thread David Galloway
Hi all,

I am in the process of migrating the upstream Ceph mailing lists from
Dreamhost to a self-hosted instance of Mailman 3.

Please update your address book and mail filters to ceph-us...@ceph.io
(notice the Top Level Domain change).

You may receive a "Welcome" e-mail as I subscribe you to the new list.
No other action should be required on your part.
-- 
David Galloway
Systems Administrator, RDU
Ceph Engineering
IRC: dgalloway


Re: [ceph-users] Ceph Nautilus - can't balance due to degraded state

2019-08-03 Thread David Herselman
pg_upmap_items 8.33c [304,305]
pg_upmap_items 8.344 [404,403]
pg_upmap_items 8.346 [201,204]
pg_upmap_items 8.349 [504,503]
pg_upmap_items 8.350 [501,500]
pg_upmap_items 8.356 [101,102]
pg_upmap_items 8.358 [404,405]
pg_upmap_items 8.363 [103,102]
pg_upmap_items 8.364 [404,403]
pg_upmap_items 8.366 [404,403]
pg_upmap_items 8.369 [304,305]
pg_upmap_items 8.36b [103,102]
pg_upmap_items 8.373 [404,403]
pg_upmap_items 8.383 [404,403]
pg_upmap_items 8.39d [203,205]
pg_upmap_items 8.3a3 [103,102]
pg_upmap_items 8.3a6 [304,305]
pg_upmap_items 8.3ab [304,305]
pg_upmap_items 8.3af [304,305]
pg_upmap_items 8.3b3 [404,405]
pg_upmap_items 8.3b4 [303,305]
pg_upmap_items 8.3b7 [404,403]
pg_upmap_items 8.3b9 [404,403]
pg_upmap_items 8.3ba [404,403,201,205]
pg_upmap_items 8.3bd [404,405]
pg_upmap_items 8.3c0 [304,305]
pg_upmap_items 8.3c3 [404,403]
pg_upmap_items 8.3ca [404,403]
pg_upmap_items 8.3cf [404,405]
pg_upmap_items 8.3d0 [404,405]
pg_upmap_items 8.3da [404,403]
pg_upmap_items 8.3e4 [404,405]
pg_upmap_items 8.3ea [404,405]
pg_upmap_items 8.3ec [203,205]
pg_upmap_items 8.3f3 [501,505]
pg_upmap_items 8.3f7 [304,305]
pg_upmap_items 8.3fb [404,405]
pg_upmap_items 8.3fc [304,305]
pg_upmap_items 8.400 [105,102,404,403]
pg_upmap_items 8.409 [404,403]
pg_upmap_items 8.40b [103,102,404,405]
pg_upmap_items 8.40c [404,400]
pg_upmap_items 8.410 [404,403]
pg_upmap_items 8.411 [404,405]
pg_upmap_items 8.417 [404,403]
pg_upmap_items 8.418 [404,403]
pg_upmap_items 9.2 [10401,10400]
pg_upmap_items 9.9 [10200,10201]


Regards
David Herselman


Re: [ceph-users] [Nfs-ganesha-devel] 2.7.3 with CEPH_FSAL Crashing

2019-07-19 Thread David C
Thanks, Jeff. I'll give 14.2.2 a go when it's released.

On Wed, 17 Jul 2019, 22:29 Jeff Layton,  wrote:

> Ahh, I just noticed you were running nautilus on the client side. This
> patch went into v14.2.2, so once you update to that you should be good
> to go.
>
> -- Jeff
>
> On Wed, 2019-07-17 at 17:10 -0400, Jeff Layton wrote:
> > This is almost certainly the same bug that is fixed here:
> >
> > https://github.com/ceph/ceph/pull/28324
> >
> > It should get backported soon-ish but I'm not sure which luminous
> > release it'll show up in.
> >
> > Cheers,
> > Jeff
> >
> > On Wed, 2019-07-17 at 10:36 +0100, David C wrote:
> > > Thanks for taking a look at this, Daniel. Below is the only
> interesting bit from the Ceph MDS log at the time of the crash but I
> suspect the slow requests are a result of the Ganesha crash rather than the
> cause of it. Copying the Ceph list in case anyone has any ideas.
> > >
> > > 2019-07-15 15:06:54.624007 7f5fda5bb700  0 log_channel(cluster) log
> [WRN] : 6 slow requests, 5 included below; oldest blocked for > 34.588509
> secs
> > > 2019-07-15 15:06:54.624017 7f5fda5bb700  0 log_channel(cluster) log
> [WRN] : slow request 33.113514 seconds old, received at 2019-07-15
> 15:06:21.510423: client_request(client.16140784:5571174 setattr
> mtime=2019-07-15 14:59:45.642408 #0x10009079cfb 2019-07
> > > -15 14:59:45.642408 caller_uid=1161, caller_gid=1131{}) currently
> failed to xlock, waiting
> > > 2019-07-15 15:06:54.624020 7f5fda5bb700  0 log_channel(cluster) log
> [WRN] : slow request 34.588509 seconds old, received at 2019-07-15
> 15:06:20.035428: client_request(client.16129440:1067288 create
> #0x1000907442e/filePathEditorRegistryPrefs.melDXAtss 201
> > > 9-07-15 14:59:53.694087 caller_uid=1161,
> caller_gid=1131{1131,4121,2330,2683,4115,2322,2779,2979,1503,3511,2783,2707,2942,2980,2258,2829,1238,1237,2793,1235,1249,2097,1154,2982,2983,3860,4101,1208,3638,3641,3644,3640,3643,3639,3642,3822,3945,4045,3521,35
> > > 22,3520,3523,}) currently failed to wrlock, waiting
> > > 2019-07-15 15:06:54.624025 7f5fda5bb700  0 log_channel(cluster) log
> [WRN] : slow request 34.583918 seconds old, received at 2019-07-15
> 15:06:20.040019: client_request(client.16140784:5570551 getattr pAsLsXsFs
> #0x1000907443b 2019-07-15 14:59:44.171408 cal
> > > ler_uid=1161, caller_gid=1131{}) currently failed to rdlock, waiting
> > > 2019-07-15 15:06:54.624028 7f5fda5bb700  0 log_channel(cluster) log
> [WRN] : slow request 34.580632 seconds old, received at 2019-07-15
> 15:06:20.043305: client_request(client.16129440:1067293 unlink
> #0x1000907442e/filePathEditorRegistryPrefs.melcdzxxc 201
> > > 9-07-15 14:59:53.701964 caller_uid=1161,
> caller_gid=1131{1131,4121,2330,2683,4115,2322,2779,2979,1503,3511,2783,2707,2942,2980,2258,2829,1238,1237,2793,1235,1249,2097,1154,2982,2983,3860,4101,1208,3638,3641,3644,3640,3643,3639,3642,3822,3945,4045,3521,35
> > > 22,3520,3523,}) currently failed to wrlock, waiting
> > > 2019-07-15 15:06:54.624032 7f5fda5bb700  0 log_channel(cluster) log
> [WRN] : slow request 34.538332 seconds old, received at 2019-07-15
> 15:06:20.085605: client_request(client.16129440:1067308 create
> #0x1000907442e/filePathEditorRegistryPrefs.melHHljMk 201
> > > 9-07-15 14:59:53.744266 caller_uid=1161,
> caller_gid=1131{1131,4121,2330,2683,4115,2322,2779,2979,1503,3511,2783,2707,2942,2980,2258,2829,1238,1237,2793,1235,1249,2097,1154,2982,2983,3860,4101,1208,3638,3641,3644,3640,3643,3639,3642,3822,3945,4045,3521,3522,3520,3523,})
> currently failed to wrlock, waiting
> > > 2019-07-15 15:06:55.014073 7f5fdcdc0700  1 mds.mds01 Updating MDS map
> to version 68166 from mon.2
> > > 2019-07-15 15:06:59.624041 7f5fda5bb700  0 log_channel(cluster) log
> [WRN] : 7 slow requests, 2 included below; oldest blocked for > 39.588571
> secs
> > > 2019-07-15 15:06:59.624048 7f5fda5bb700  0 log_channel(cluster) log
> [WRN] : slow request 30.495843 seconds old, received at 2019-07-15
> 15:06:29.128156: client_request(client.16129440:1072227 create
> #0x1000907442e/filePathEditorRegistryPrefs.mel58AQSv 2019-07-15
> 15:00:02.786754 caller_uid=1161,
> caller_gid=1131{1131,4121,2330,2683,4115,2322,2779,2979,1503,3511,2783,2707,2942,2980,2258,2829,1238,1237,2793,1235,1249,2097,1154,2982,2983,3860,4101,1208,3638,3641,3644,3640,3643,3639,3642,3822,3945,4045,3521,3522,3520,3523,})
> currently failed to wrlock, waiting
> > > 2019-07-15 15:06:59.624053 7f5fda5bb700  0 log_channel(cluster) log
> [WRN] : slow request 39.432848 seconds old, received at 2019-07-15
> 15:06:20.191151: client_request(client.16140784:5570649 mknod
> #0x1000907442e/fileP

Re: [ceph-users] [Nfs-ganesha-devel] 2.7.3 with CEPH_FSAL Crashing

2019-07-17 Thread David C
fda5bb700  0 log_channel(cluster) log [WRN] :
slow request 32.689838 seconds old, received at 2019-07-15 15:06:36.934283:
client_request(client.16129440:1072271 getattr pAsLsXsFs #0x1000907443b
2019-07-15 15:00:10.592734 caller_uid=1161,
caller_gid=1131{1131,4121,2330,2683,4115,2322,2779,2979,1503,3511,2783,2707,2942,2980,2258,2829,1238,1237,2793,1235,1249,2097,1154,2982,2983,3860,4101,1208,3638,3641,3644,3640,3643,3639,3642,3822,3945,4045,3521,3522,3520,3523,})
currently failed to rdlock, waiting
2019-07-15 15:07:09.624177 7f5fda5bb700  0 log_channel(cluster) log [WRN] :
slow request 34.962719 seconds old, received at 2019-07-15 15:06:34.661402:
client_request(client.16129440:1072256 getattr pAsLsXsFs #0x1000907443b
2019-07-15 15:00:08.319912 caller_uid=1161,
caller_gid=1131{1131,4121,2330,2683,4115,2322,2779,2979,1503,3511,2783,2707,2942,2980,2258,2829,1238,1237,2793,1235,1249,2097,1154,2982,2983,3860,4101,1208,3638,3641,3644,3640,3643,3639,3642,3822,3945,4045,3521,3522,3520,3523,})
currently failed to rdlock, waiting
2019-07-15 15:07:11.519928 7f5fdcdc0700  1 mds.mds01 Updating MDS map to
version 68169 from mon.2
2019-07-15 15:07:19.624272 7f5fda5bb700  0 log_channel(cluster) log [WRN] :
11 slow requests, 1 included below; oldest blocked for > 59.588812 secs
2019-07-15 15:07:19.624278 7f5fda5bb700  0 log_channel(cluster) log [WRN] :
slow request 32.164260 seconds old, received at 2019-07-15 15:06:47.459980:
client_request(client.16129440:1072326 getattr pAsLsXsFs #0x1000907443b
2019-07-15 15:00:21.118372 caller_uid=1161,
caller_gid=1131{1131,4121,2330,2683,4115,2322,2779,2979,1503,3511,2783,2707,2942,2980,2258,2829,1238,1237,2793,1235,1249,2097,1154,2982,2983,3860,4101,1208,3638,3641,3644,3640,3643,3639,3642,3822,3945,4045,3521,3522,3520,3523,})
currently failed to rdlock, waiting


On Tue, Jul 16, 2019 at 1:18 PM Daniel Gryniewicz  wrote:

> This is not one I've seen before, and a quick look at the code looks
> strange.  The only assert in that bit is asserting the parent is a
> directory, but the parent directory is not something that was passed in
> by Ganesha, but rather something that was looked up internally in
> libcephfs.  This is beyond my expertise, at this point.  Maybe some ceph
> logs would help?
>
> Daniel
>
> On 7/15/19 10:54 AM, David C wrote:
> > This list has been deprecated. Please subscribe to the new devel list at
> lists.nfs-ganesha.org.
> >
> >
> > Hi All
> >
> > I'm running 2.7.3 using the CEPH FSAL to export CephFS (Luminous), it
> > ran well for a few days and crashed. I have a coredump, could someone
> > assist me in debugging this please?
> >
> > (gdb) bt
> > #0  0x7f04dcab6207 in raise () from /lib64/libc.so.6
> > #1  0x7f04dcab78f8 in abort () from /lib64/libc.so.6
> > #2  0x7f04d2a9d6c5 in ceph::__ceph_assert_fail(char const*, char
> > const*, int, char const*) () from /usr/lib64/ceph/libceph-common.so.0
> > #3  0x7f04d2a9d844 in ceph::__ceph_assert_fail(ceph::assert_data
> > const&) () from /usr/lib64/ceph/libceph-common.so.0
> > #4  0x7f04cc807f04 in Client::_lookup_name(Inode*, Inode*, UserPerm
> > const&) () from /lib64/libcephfs.so.2
> > #5  0x7f04cc81c41f in Client::ll_lookup_inode(inodeno_t, UserPerm
> > const&, Inode**) () from /lib64/libcephfs.so.2
> > #6  0x7f04ccadbf0e in create_handle (export_pub=0x1baff10,
> > desc=, pub_handle=0x7f0470fd4718,
> > attrs_out=0x7f0470fd4740) at
> > /usr/src/debug/nfs-ganesha-2.7.3/FSAL/FSAL_CEPH/export.c:256
> > #7  0x00523895 in mdcache_locate_host (fh_desc=0x7f0470fd4920,
> > export=export@entry=0x1bafbf0, entry=entry@entry=0x7f0470fd48b8,
> > attrs_out=attrs_out@entry=0x0)
> >  at
> >
> /usr/src/debug/nfs-ganesha-2.7.3/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:1011
> > #8  0x0051d278 in mdcache_create_handle (exp_hdl=0x1bafbf0,
> > fh_desc=, handle=0x7f0470fd4900, attrs_out=0x0) at
> >
> /usr/src/debug/nfs-ganesha-2.7.3/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_handle.c:1578
> > #9  0x0046d404 in nfs4_mds_putfh
> > (data=data@entry=0x7f0470fd4ea0) at
> > /usr/src/debug/nfs-ganesha-2.7.3/Protocols/NFS/nfs4_op_putfh.c:211
> > #10 0x0046d8e8 in nfs4_op_putfh (op=0x7f03effaf1d0,
> > data=0x7f0470fd4ea0, resp=0x7f03ec1de1f0) at
> > /usr/src/debug/nfs-ganesha-2.7.3/Protocols/NFS/nfs4_op_putfh.c:281
> > #11 0x0045d120 in nfs4_Compound (arg=,
> > req=, res=0x7f03ec1de9d0) at
> > /usr/src/debug/nfs-ganesha-2.7.3/Protocols/NFS/nfs4_Compound.c:942
> > #12 0x004512cd in nfs_rpc_process_request
> > (reqdata=0x7f03ee5ed4b0) at
> > /usr/src/debug/nfs-ganesha-2.7.3/MainNFSD/nfs_worker_thread.c:1328
> > #13 0x00450766 in nfs_rpc_

[ceph-users] What's the best practice for Erasure Coding

2019-07-07 Thread David
Hi Ceph-Users,

 

I'm working with a Ceph cluster (about 50TB, 28 OSDs, all BlueStore on LVM).

Recently I've been trying out erasure-coded pools.

My question is: what's the best practice for using EC pools?

More specifically, which plugin (jerasure, isa, lrc, shec or clay) should I
adopt, and how should I choose the combination of (k,m) (e.g. (k=3,m=2), (k=6,m=3))?

 

Can anyone share their experience?

 

Thanks for any help.

 

Regards,

David

 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Cannot delete bucket

2019-06-27 Thread David Turner
I'm still going at 452M incomplete uploads. There are guides online for
manually deleting buckets kinda at the RADOS level that tend to leave data
stranded. That doesn't work for what I'm trying to do, so I'll keep going
with this and wait for that PR to come through, which should hopefully help
with bucket deletion.
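
In case it helps anyone following along, a rough way to gauge how many
incomplete uploads a bucket is carrying before committing to the delete is
something like this (assuming the aws CLI is configured against your RGW
endpoint; the bucket name and endpoint URL below are placeholders):

aws s3api list-multipart-uploads --bucket my-bucket \
    --endpoint-url http://rgw.example.com | jq '.Uploads | length'

Note it only returns the first page (1000 entries) unless you paginate, but
that's usually enough to tell whether you're in for a long ride.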

On Thu, Jun 27, 2019 at 2:58 PM Sergei Genchev  wrote:

> @David Turner
> Did your bucket delete ever finish? I am up to 35M incomplete uploads,
> and I doubt that I actually had that many upload attempts. I could be
> wrong though.
> Is there a way to force bucket deletion, even at the cost of not
> cleaning up space?
>
> On Tue, Jun 25, 2019 at 12:29 PM J. Eric Ivancich 
> wrote:
> >
> > On 6/24/19 1:49 PM, David Turner wrote:
> > > It's aborting incomplete multipart uploads that were left around. First
> > > it will clean up the cruft like that and then it should start actually
> > > deleting the objects visible in stats. That's my understanding of it
> > > anyway. I'm in the middle of cleaning up some buckets right now doing
> > > this same thing. I'm up to `WARNING : aborted 108393000 incomplete
> > > multipart uploads`. This bucket had a client uploading to it constantly
> > > with a very bad network connection.
> >
> > There's a PR to better deal with this situation:
> >
> > https://github.com/ceph/ceph/pull/28724
> >
> > Eric
> >
> > --
> > J. Eric Ivancich
> > he/him/his
> > Red Hat Storage
> > Ann Arbor, Michigan, USA
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Cannot delete bucket

2019-06-24 Thread David Turner
It's aborting incomplete multipart uploads that were left around. First it
will clean up the cruft like that and then it should start actually
deleting the objects visible in stats. That's my understanding of it
anyway. I'm in the middle of cleaning up some buckets right now doing this
same thing. I'm up to `WARNING : aborted 108393000 incomplete multipart
uploads`. This bucket had a client uploading to it constantly with a very
bad network connection.

On Fri, Jun 21, 2019 at 1:13 PM Sergei Genchev  wrote:

>  Hello,
> Trying to delete bucket using radosgw-admin, and failing. Bucket has
> 50K objects but all of them are large. This is what I get:
> $ radosgw-admin bucket rm --bucket=di-omt-mapupdate --purge-objects
> --bypass-gc
> 2019-06-21 17:09:12.424 7f53f621f700  0 WARNING : aborted 1000
> incomplete multipart uploads
> 2019-06-21 17:09:19.966 7f53f621f700  0 WARNING : aborted 2000
> incomplete multipart uploads
> 2019-06-21 17:09:26.819 7f53f621f700  0 WARNING : aborted 3000
> incomplete multipart uploads
> 2019-06-21 17:09:33.430 7f53f621f700  0 WARNING : aborted 4000
> incomplete multipart uploads
> 2019-06-21 17:09:40.304 7f53f621f700  0 WARNING : aborted 5000
> incomplete multipart uploads
>
> Looks like it is trying to delete objects 1000 at a time, as it
> should, but failing. Bucket stats do not change.
>  radosgw-admin bucket stats --bucket=di-omt-mapupdate |jq .usage
> {
>   "rgw.main": {
> "size": 521929247648,
> "size_actual": 521930674176,
> "size_utilized": 400701129125,
> "size_kb": 509696531,
> "size_kb_actual": 509697924,
> "size_kb_utilized": 391309697,
> "num_objects": 50004
>   },
>   "rgw.multimeta": {
> "size": 0,
> "size_actual": 0,
> "size_utilized": 0,
> "size_kb": 0,
> "size_kb_actual": 0,
> "size_kb_utilized": 0,
> "num_objects": 32099
>   }
> }
> How can I get this bucket deleted?
> Thanks!
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Changing the release cadence

2019-06-17 Thread David Turner
This was a little long to respond with on Twitter, so I thought I'd share
my thoughts here. I love the idea of a 12 month cadence. I like October
because admins aren't upgrading production within the first few months of a
new release. It gives it plenty of time to be stable for the OS distros as
well as giving admins something low-key to work on over the holidays by
testing the new releases in stage/QA.

On Mon, Jun 17, 2019 at 12:22 PM Sage Weil  wrote:

> On Wed, 5 Jun 2019, Sage Weil wrote:
> > That brings us to an important decision: what time of year should we
> > release?  Once we pick the timing, we'll be releasing at that time
> *every
> > year* for each release (barring another schedule shift, which we want to
> > avoid), so let's choose carefully!
>
> I've put up a twitter poll:
>
> https://twitter.com/liewegas/status/1140655233430970369
>
> Thanks!
> sage
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Erasure Coding - FPGA / Hardware Acceleration

2019-06-14 Thread David Byte
I can't speak to the SoftIron solution, but I have done some testing on an 
all-SSD environment comparing latency, CPU, etc between using the Intel ISA 
plugin and using Jerasure.  Very little difference is seen in CPU and 
capability in my tests, so I am not sure of the benefit.
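
For anyone who wants to repeat that kind of comparison, profiles along these
lines should do it - the k/m values and failure domain are just examples,
match them to your own layout:

ceph osd erasure-code-profile set ec-jerasure k=4 m=2 plugin=jerasure crush-failure-domain=host
ceph osd erasure-code-profile set ec-isa k=4 m=2 plugin=isa crush-failure-domain=host
ceph osd pool create ec-jerasure-test 64 64 erasure ec-jerasure
ceph osd pool create ec-isa-test 64 64 erasure ec-isa

Then run the same rados bench / fio workload against each pool and compare
latency and CPU.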

David Byte
Sr. Technology Strategist
SCE Enterprise Linux 
SCE Enterprise Storage
Alliances and SUSE Embedded
db...@suse.com
918.528.4422

On 6/14/19, 2:50 PM, "ceph-users on behalf of Brett Niver" 
 wrote:

Also the picture I saw at Cephalocon - which could have been
inaccurate, looked to me as if it multiplied the data path.

On Fri, Jun 14, 2019 at 8:27 AM Janne Johansson  wrote:
>
> Den fre 14 juni 2019 kl 13:58 skrev Sean Redmond 
:
>>
>> Hi Ceph-Uers,
>> I noticed that Soft Iron now have hardware acceleration for Erasure 
Coding[1], this is interesting as the CPU overhead can be a problem in addition 
to the extra disk I/O required for EC pools.
>> Does anyone know if any other work is ongoing to support generic FPGA 
Hardware Acceleration for EC pools, or if this is just a vendor specific 
feature.
>>
>> [1] 
https://www.theregister.co.uk/2019/05/20/softiron_unleashes_accepherator_an_erasure_coding_accelerator_for_ceph/
>
>
> Are there numbers anywhere to see how "tough" on a CPU it would be to 
calculate an EC code compared to "writing a sector to
> a disk on a remote server and getting an ack back" ? To my very untrained 
eye, it seems like a very small part of the whole picture,
> especially if you are meant to buy a ton of cards to do it.
>
> --
> May the most significant bit of your life be positive.
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] NFS-Ganesha CEPH_FSAL | potential locking issue

2019-05-17 Thread David C
Thanks for your response on that, Jeff. Pretty sure this has nothing to do
with Ceph or Ganesha, sorry for wasting your time. What I'm seeing is
related to writeback on the client. I can mitigate the behaviour a bit by
playing around with the vm.dirty* parameters.
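
For the archives, these are the sort of knobs I mean - the numbers are just a
starting point I'm experimenting with, not a recommendation:

sysctl -w vm.dirty_background_bytes=67108864    # start writeback early (64MB)
sysctl -w vm.dirty_bytes=268435456              # cap dirty pages at 256MB
# or the ratio-based equivalents, vm.dirty_background_ratio / vm.dirty_ratio

Capping the dirty page limits stops the client building up a huge writeback
backlog that then starves everything else when it flushes.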




On Tue, Apr 16, 2019 at 7:07 PM Jeff Layton  wrote:

> On Tue, Apr 16, 2019 at 10:36 AM David C  wrote:
> >
> > Hi All
> >
> > I have a single export of my cephfs using the ceph_fsal [1]. A CentOS 7
> machine mounts a sub-directory of the export [2] and is using it for the
> home directory of a user (e.g everything under ~ is on the server).
> >
> > This works fine until I start a long sequential write into the home
> directory such as:
> >
> > dd if=/dev/zero of=~/deleteme bs=1M count=8096
> >
> > This saturates the 1GbE link on the client which is great but during the
> transfer, apps that are accessing files in home start to lock up. Google
> Chrome for example, which puts its config in ~/.config/google-chrome/,
> locks up during the transfer, e.g I can't move between tabs, as soon as the
> transfer finishes, Chrome goes back to normal. Essentially the desktop
> environment reacts as I'd expect if the server was to go away. I'm using
> the MATE DE.
> >
> > However, if I mount a separate directory from the same export on the
> machine [3] and do the same write into that directory, my desktop
> experience isn't affected.
> >
> > I hope that makes some sense, it's a bit of a weird one to describe.
> This feels like a locking issue to me, although I can't explain why a
> single write into the root of a mount would affect access to other files
> under that same mount.
> >
>
> It's not a single write. You're doing 8G worth of 1M I/Os. The server
> then has to do all of those to the OSD backing store.
>
> > [1] CephFS export:
> >
> > EXPORT
> > {
> > Export_ID=100;
> > Protocols = 4;
> > Transports = TCP;
> > Path = /;
> > Pseudo = /ceph/;
> > Access_Type = RW;
> > Attr_Expiration_Time = 0;
> > Disable_ACL = FALSE;
> > Manage_Gids = TRUE;
> > Filesystem_Id = 100.1;
> > FSAL {
> > Name = CEPH;
> > }
> > }
> >
> > [2] Home directory mount:
> >
> > 10.10.10.226:/ceph/homes/username on /homes/username type nfs4
> (rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,soft,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=10.10.10.135,local_lock=none,addr=10.10.10.226)
> >
> > [3] Test directory mount:
> >
> > 10.10.10.226:/ceph/testing on /tmp/testing type nfs4
> (rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,soft,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=10.10.10.135,local_lock=none,addr=10.10.10.226)
> >
> > Versions:
> >
> > Luminous 12.2.10
> > nfs-ganesha-2.7.1-0.1.el7.x86_64
> > nfs-ganesha-ceph-2.7.1-0.1.el7.x86_64
> >
> > Ceph.conf on nfs-ganesha server:
> >
> > [client]
> > mon host = 10.10.10.210:6789, 10.10.10.211:6789,
> 10.10.10.212:6789
> > client_oc_size = 8388608000
> > client_acl_type=posix_acl
> > client_quota = true
> > client_quota_df = true
> >
>
> No magic bullets here, I'm afraid.
>
> Sounds like ganesha is probably just too swamped with write requests
> to do much else, but you'll probably want to do the legwork starting
> with the hanging application, and figure out what it's doing that
> takes so long. Is it some syscall? Which one?
>
> From there you can start looking at statistics in the NFS client to
> see what's going on there. Are certain RPCs taking longer than they
> should? Which ones?
>
> Once you know what's going on with the client, you can better tell
> what's going on with the server.
> --
> Jeff Layton 
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Samba vfs_ceph or kernel client

2019-05-16 Thread David Disseldorp
Hi Maged,

On Fri, 10 May 2019 18:32:15 +0200, Maged Mokhtar wrote:

> What is the recommended way for Samba gateway integration: using 
> vfs_ceph or mounting CephFS via kernel client ? i tested the kernel 
> solution in a ctdb setup and gave good performance, does it have any 
> limitations relative to vfs_ceph ?

At this stage kernel-backed and vfs_ceph-backed shares are pretty
similar feature wise. ATM kernel backed shares have the performance
advantage of page-cache + async vfs_default dispatch. vfs_ceph will
likely gain more features in future as cross-protocol share-mode locks
and leases can be supported without the requirement for a kernel
interface.
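
For reference, the two variants look roughly like this in smb.conf (share
names and paths here are only examples):

[share-vfs]
    path = /volumes/share1
    vfs objects = ceph
    ceph:config_file = /etc/ceph/ceph.conf
    ceph:user_id = samba

[share-kernel]
    # backed by a kernel CephFS mount at /mnt/cephfs
    path = /mnt/cephfs/share1

Everything else (ctdb, locking settings) can stay the same between the two.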

Cheers, David
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] IMPORTANT : NEED HELP : Low IOPS on hdd : MAX AVAIL Draining fast

2019-04-27 Thread David C
On Sat, 27 Apr 2019, 18:50 Nikhil R,  wrote:

> Guys,
> We now have a total of 105 osd’s on 5 baremetal nodes each hosting 21
> osd’s on HDD which are 7Tb with journals on HDD too. Each journal is about
> 5GB
>

This would imply you've got a separate HDD partition for journals. I don't
think there's any value in that, and it would probably be detrimental to
performance.

>
> We expanded our cluster last week and added 1 more node with 21 HDD and
> journals on same disk.
> Our client i/o is too heavy and we are not able to backfill even 1 thread
> during peak hours - in case we backfill during peak hours osd's are crashing
> causing undersized pg's and if we have another osd crash we wont be able to
> use our cluster due to undersized and recovery pg's. During non-peak we can
> just backfill 8-10 pgs.
> Due to this our MAX AVAIL is draining out very fast.
>

How much RAM have you got in your nodes? In my experience that's a common
reason for crashing OSDs during recovery ops.

What does your recovery and backfill tuning look like?
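
By tuning I mean values along these lines, which is usually where I'd start on
an HDD-only cluster - treat it as a sketch rather than gospel:

ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'
# and, depending on your version, a recovery sleep to throttle things further:
ceph tell osd.* injectargs '--osd-recovery-sleep 0.1'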



> We are thinking of adding 2 more baremetal nodes with 21 *7tb  osd’s on
>  HDD and add 50GB SSD Journals for these.
> We aim to backfill from the 105 osd’s a bit faster and expect writes of
> backfillis coming to these osd’s faster.
>

Ssd journals would certainly help, just be sure it's a model that performs
well with Ceph

>
> Is this a good viable idea?
> Thoughts please?
>

I'd recommend sharing more detail, e.g. full spec of the nodes, Ceph version,
etc.

>
> -Nikhil
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Default Pools

2019-04-23 Thread David Turner
You should be able to see all pools in use in a RGW zone from the
radosgw-admin command. This [1] is probably overkill for most, but I deal
with multi-realm clusters so I generally think like this when dealing with
RGW.  Running this as is will create a file in your current directory for
each zone in your deployment (likely to be just one file).  My rough guess
for what you would find in that file based on your pool names would be this
[2].

If you identify any pools not listed from the zone get command, then you
can rename [3] the pool to see if it is being created and/or used by rgw
currently.  The process here would be to stop all RGW daemons, rename the
pools, start a RGW daemon, stop it again, and see which pools were
recreated.  Clean up the pools that were freshly made and rename the
original pools back into place before starting your RGW daemons again.
Please note that .rgw.root is a required pool in every RGW deployment and
will not be listed in the zones themselves.


[1]
for realm in $(radosgw-admin realm list --format=json | jq '.realms[]' -r);
do
  for zonegroup in $(radosgw-admin --rgw-realm=$realm zonegroup list
--format=json | jq '.zonegroups[]' -r); do
for zone in $(radosgw-admin --rgw-realm=$realm
--rgw-zonegroup=$zonegroup zone list --format=json | jq '.zones[]' -r); do
  echo $realm.$zonegroup.$zone.json
  radosgw-admin --rgw-realm=$realm --rgw-zonegroup=$zonegroup
--rgw-zone=$zone zone get > $realm.$zonegroup.$zone.json
done
  done
done

[2] default.default.default.json
{
"id": "{{ UUID }}",
"name": "default",
"domain_root": "default.rgw.meta",
"control_pool": "default.rgw.control",
"gc_pool": ".rgw.gc",
"log_pool": "default.rgw.log",
"user_email_pool": ".users.email",
"user_uid_pool": ".users.uid",
"system_key": {
},
"placement_pools": [
{
"key": "default-placement",
"val": {
"index_pool": "default.rgw.buckets.index",
"data_pool": "default.rgw.buckets.data",
"data_extra_pool": "default.rgw.buckets.non-ec",
"index_type": 0,
"compression": ""
}
}
],
"metadata_heap": "",
"tier_config": [],
"realm_id": "{{ UUID }}"
}

[3] ceph osd pool rename <current pool name> <new pool name>
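
In other words, the test for each suspect pool looks roughly like this (with
every RGW daemon stopped first; the pool name is just an example):

ceph osd pool rename .users.email .users.email.bak
# start a single radosgw, stop it again, then check whether it recreated the pool
ceph osd lspools | grep users.email
# if a fresh .users.email appeared, the pool is still in use - remove the fresh
# one and rename the .bak copy back before starting your RGW daemons again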

On Thu, Apr 18, 2019 at 10:46 AM Brent Kennedy  wrote:

> Yea, that was a cluster created during firefly...
>
> Wish there was a good article on the naming and use of these, or perhaps a
> way I could make sure they are not used before deleting them.  I know RGW
> will recreate anything it uses, but I don’t want to lose data because I
> wanted a clean system.
>
> -Brent
>
> -Original Message-
> From: Gregory Farnum 
> Sent: Monday, April 15, 2019 5:37 PM
> To: Brent Kennedy 
> Cc: Ceph Users 
> Subject: Re: [ceph-users] Default Pools
>
> On Mon, Apr 15, 2019 at 1:52 PM Brent Kennedy  wrote:
> >
> > I was looking around the web for the reason for some of the default
> pools in Ceph and I cant find anything concrete.  Here is our list, some
> show no use at all.  Can any of these be deleted ( or is there an article
> my googlefu failed to find that covers the default pools?
> >
> > We only use buckets, so I took out .rgw.buckets, .users and
> > .rgw.buckets.index…
> >
> > Name
> > .log
> > .rgw.root
> > .rgw.gc
> > .rgw.control
> > .rgw
> > .users.uid
> > .users.email
> > .rgw.buckets.extra
> > default.rgw.control
> > default.rgw.meta
> > default.rgw.log
> > default.rgw.buckets.non-ec
>
> All of these are created by RGW when you run it, not by the core Ceph
> system. I think they're all used (although they may report sizes of 0, as
> they mostly make use of omap).
>
> > metadata
>
> Except this one used to be created-by-default for CephFS metadata, but
> that hasn't been true in many releases. So I guess you're looking at an old
> cluster? (In which case it's *possible* some of those RGW pools are also
> unused now but were needed in the past; I haven't kept good track of them.)
> -Greg
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Osd update from 12.2.11 to 12.2.12

2019-04-22 Thread David Turner
Do you perhaps have anything in the ceph.conf files on the servers with
those OSDs that would attempt to tell the daemon that they are filestore
osds instead of bluestore?  I'm sure you know that the second part [1] of
the output in both cases only shows up after an OSD has been rebooted.  I'm
sure this too could be cleaned up by adding that line to the ceph.conf file.

[1] rocksdb_separate_wal_dir = 'false' (not observed, change may require
restart)
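
If you want to confirm what the OSDs themselves think they are, something like
this should settle it without any restarts (osd.18 is just one of the IDs from
your output):

ceph osd metadata 18 | grep osd_objectstore
# or, on the host that runs the OSD:
ceph daemon osd.18 config get osd_objectstore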

On Sun, Apr 21, 2019 at 8:32 AM Marc Roos  wrote:

>
>
> Just updated luminous, and setting max_scrubs value back. Why do I get
> osd's reporting differently
>
>
> I get these:
> osd.18: osd_max_scrubs = '1' (not observed, change may require restart)
> osd_objectstore = 'bluestore' (not observed, change may require restart)
> rocksdb_separate_wal_dir = 'false' (not observed, change may require
> restart)
> osd.19: osd_max_scrubs = '1' (not observed, change may require restart)
> osd_objectstore = 'bluestore' (not observed, change may require restart)
> rocksdb_separate_wal_dir = 'false' (not observed, change may require
> restart)
> osd.20: osd_max_scrubs = '1' (not observed, change may require restart)
> osd_objectstore = 'bluestore' (not observed, change may require restart)
> rocksdb_separate_wal_dir = 'false' (not observed, change may require
> restart)
> osd.21: osd_max_scrubs = '1' (not observed, change may require restart)
> osd_objectstore = 'bluestore' (not observed, change may require restart)
> rocksdb_separate_wal_dir = 'false' (not observed, change may require
> restart)
> osd.22: osd_max_scrubs = '1' (not observed, change may require restart)
> osd_objectstore = 'bluestore' (not observed, change may require restart)
> rocksdb_separate_wal_dir = 'false' (not observed, change may require
> restart)
>
>
> And I get osd's reporting like this:
> osd.23: osd_max_scrubs = '1' (not observed, change may require restart)
> rocksdb_separate_wal_dir = 'false' (not observed, change may require
> restart)
> osd.24: osd_max_scrubs = '1' (not observed, change may require restart)
> rocksdb_separate_wal_dir = 'false' (not observed, change may require
> restart)
> osd.25: osd_max_scrubs = '1' (not observed, change may require restart)
> rocksdb_separate_wal_dir = 'false' (not observed, change may require
> restart)
> osd.26: osd_max_scrubs = '1' (not observed, change may require restart)
> rocksdb_separate_wal_dir = 'false' (not observed, change may require
> restart)
> osd.27: osd_max_scrubs = '1' (not observed, change may require restart)
> rocksdb_separate_wal_dir = 'false' (not observed, change may require
> restart)
> osd.28: osd_max_scrubs = '1' (not observed, change may require restart)
> rocksdb_separate_wal_dir = 'false' (not observed, change may require
> restart)
>
>
>
>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] NFS-Ganesha CEPH_FSAL | potential locking issue

2019-04-16 Thread David C
Hi All

I have a single export of my cephfs using the ceph_fsal [1]. A CentOS 7
machine mounts a sub-directory of the export [2] and is using it for the
home directory of a user (e.g everything under ~ is on the server).

This works fine until I start a long sequential write into the home
directory such as:

dd if=/dev/zero of=~/deleteme bs=1M count=8096

This saturates the 1GbE link on the client which is great but during the
transfer, apps that are accessing files in home start to lock up. Google
Chrome for example, which puts its config in ~/.config/google-chrome/,
locks up during the transfer, e.g I can't move between tabs, as soon as the
transfer finishes, Chrome goes back to normal. Essentially the desktop
environment reacts as I'd expect if the server was to go away. I'm using
the MATE DE.

However, if I mount a separate directory from the same export on the
machine [3] and do the same write into that directory, my desktop
experience isn't affected.

I hope that makes some sense, it's a bit of a weird one to describe. This
feels like a locking issue to me, although I can't explain why a single
write into the root of a mount would affect access to other files under
that same mount.

[1] CephFS export:

EXPORT
{
Export_ID=100;
Protocols = 4;
Transports = TCP;
Path = /;
Pseudo = /ceph/;
Access_Type = RW;
Attr_Expiration_Time = 0;
Disable_ACL = FALSE;
Manage_Gids = TRUE;
Filesystem_Id = 100.1;
FSAL {
Name = CEPH;
}
}

[2] Home directory mount:

10.10.10.226:/ceph/homes/username on /homes/username type nfs4
(rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,soft,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=10.10.10.135,local_lock=none,addr=10.10.10.226)

[3] Test directory mount:

10.10.10.226:/ceph/testing on /tmp/testing type nfs4
(rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,soft,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=10.10.10.135,local_lock=none,addr=10.10.10.226)

Versions:

Luminous 12.2.10
nfs-ganesha-2.7.1-0.1.el7.x86_64
nfs-ganesha-ceph-2.7.1-0.1.el7.x86_64

Ceph.conf on nfs-ganesha server:

[client]
mon host = 10.10.10.210:6789, 10.10.10.211:6789, 10.10.10.212:6789
client_oc_size = 8388608000
client_acl_type=posix_acl
client_quota = true
client_quota_df = true

Thanks,
David
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Looking up buckets in multi-site radosgw configuration

2019-03-20 Thread David Coles
On Tue, Mar 19, 2019 at 7:51 AM Casey Bodley  wrote:

> Yeah, correct on both points. The zonegroup redirects would be the only
> way to guide clients between clusters.

Awesome. Thank you for the clarification.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Looking up buckets in multi-site radosgw configuration

2019-03-18 Thread David Coles
I'm looking at setting up a multi-site radosgw configuration where
data is sharded over multiple clusters in a single physical location;
and would like to understand how Ceph handles requests in this
configuration.

Looking through the radosgw source[1], it looks like radosgw will
return a 301 redirect if I request a bucket that is not in the current
zonegroup. This redirect appears to be to the endpoint for the
zonegroup (I assume as configured by `radosgw-admin zonegroup create
--endpoints`). This seems like it would work well for multiple
geographic regions (e.g. us-east and us-west) for ensuring that a
request is redirected to the region (zonegroup) that hosts the bucket.
We could possibly improve this by using virtual hosted buckets and having
DNS point to the correct region for that bucket.
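
By endpoints I mean something along these lines - the zonegroup names and
URLs are placeholders, but I assume this is what drives the redirect Location:

radosgw-admin zonegroup create --rgw-zonegroup=us-east \
    --endpoints=https://rgw-us-east.example.com --master --default
radosgw-admin zonegroup create --rgw-zonegroup=us-west \
    --endpoints=https://rgw-us-west.example.com
radosgw-admin period update --commit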

I notice that it's also possible to configure zones in a zonegroup
that don't peform replication[2] (e.g. us-east-1 and us-east-2). In
this case I assume that if I direct a request to the wrong zone, then
Ceph will just report that the object as not-found because, despite
the bucket metadata being replicated from the zonegroup master, the
objects will never be replicated from one zone to the other. Another
layer (like a consistent hash across the bucket name or database)
would be required for routing to the correct zone.

Is this mostly correct? Are there other ways of controlling which
cluster data is placed (i.e. placement groups)?

Thanks!

1. 
https://github.com/ceph/ceph/blob/affb7d396f76273e885cfdbcd363c1882496726c/src/rgw/rgw_op.cc#L653-L669
2. 
https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/2/html/object_gateway_guide_for_red_hat_enterprise_linux/multi_site#configuring_multiple_zones_without_replication
-- 
David Coles
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph osd pg-upmap-items not working

2019-03-15 Thread David Turner
Why do you think that it can't resolve this by itself?  You just said that
the balancer was able to provide an optimization, but then that the
distribution isn't perfect.  When there are no further optimizations,
running `ceph balancer optimize plan` won't create a plan with any
changes.  Possibly the active mgr needs a kick.  When my cluster isn't
balancing when it's supposed to, I just run `ceph mgr fail {active mgr}`
and within a minute or so the cluster is moving PGs around.
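
Roughly what I do when it looks stuck (the jq bit is just one way of grabbing
the active mgr name - any way of finding it works):

ceph balancer status
ceph balancer eval          # current score; lower is better
ceph mgr fail $(ceph mgr dump | jq -r .active_name)
# wait a minute or two, then re-check
ceph balancer eval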

On Sat, Mar 9, 2019 at 8:05 PM Kári Bertilsson 
wrote:

> Thanks
>
> I did apply https://github.com/ceph/ceph/pull/26179.
>
> Running manual upmap commands works now. I did run "ceph balancer optimize
> new" and it did add a few upmaps.
>
> But now another issue. Distribution is far from perfect but the balancer
> can't find further optimization.
> Specifically OSD 23 is getting way more pg's than the other 3tb OSD's.
>
> See https://pastebin.com/f5g5Deak
>
> On Fri, Mar 1, 2019 at 10:25 AM  wrote:
>
>> > Backports should be available in v12.2.11.
>>
>> s/v12.2.11/ v12.2.12/
>>
>> Sorry for the typo.
>>
>>
>>
>>
>> Original message
>> *From:* Xie Xingguo (谢型果) 10072465
>> *To:* d...@vanderster.com ;
>> *Cc:* ceph-users@lists.ceph.com ;
>> *Date:* 2019-03-01 17:09
>> *Subject:* *Re: [ceph-users] ceph osd pg-upmap-items not working*
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>> See https://github.com/ceph/ceph/pull/26179
>>
>> Backports should be available in v12.2.11.
>>
>> Or you can manually do it by simply adopting
>> https://github.com/ceph/ceph/pull/26127 if you are eager to get out of
>> the trap right now.
>>
>>
>> *From:* DanvanderSter 
>> *To:* Kári Bertilsson ;
>> *Cc:* ceph-users ; Xie Xingguo (谢型果) 10072465;
>> *Date:* 2019-03-01 14:48
>> *Subject:* *Re: [ceph-users] ceph osd pg-upmap-items not working*
>> It looks like that somewhat unusual crush rule is confusing the new
>> upmap cleaning.
>> (debug_mon 10 on the active mon should show those cleanups).
>>
>>
>> I'm copying Xie Xingguo, and probably you should create a tracker for this.
>>
>> -- dan
>>
>>
>>
>>
>> On Fri, Mar 1, 2019 at 3:12 AM Kári Bertilsson > > wrote:
>> >
>> > This is the pool
>>
>> > pool 41 'ec82_pool' erasure size 10 min_size 8 crush_rule 1 object_hash 
>> > rjenkins pg_num 512 pgp_num 512 last_change 63794 lfor 21731/21731 flags 
>> > hashpspool,ec_overwrites stripe_width 32768 application cephfs
>> >removed_snaps [1~5]
>> >
>> > Here is the relevant crush rule:
>>
>> > rule ec_pool { id 1 type erasure min_size 3 max_size 10 step 
>> > set_chooseleaf_tries 5 step set_choose_tries 100 step take default class 
>> > hdd step choose indep 5 type host step choose indep 2 type osd step emit }
>> >
>>
>> > Both OSD 23 and 123 are in the same host. So this change should be 
>> > perfectly acceptable by the rule set.
>>
>> > Something must be blocking the change, but i can't find anything about it 
>> > in any logs.
>> >
>> > - Kári
>> >
>> > On Thu, Feb 28, 2019 at 8:07 AM Dan van der Ster > > wrote:
>> >>
>> >> Hi,
>> >>
>> >> pg-upmap-items became more strict in v12.2.11 when validating upmaps.
>> >> E.g., it now won't let you put two PGs in the same rack if the crush
>> >> rule doesn't allow it.
>> >>
>>
>> >> Where are OSDs 23 and 123 in your cluster? What is the relevant crush 
>> >> rule?
>> >>
>> >> -- dan
>> >>
>> >>
>> >> On Wed, Feb 27, 2019 at 9:17 PM Kári Bertilsson > > wrote:
>> >> >
>> >> > Hello
>> >> >
>>
>> >> > I am trying to diagnose why upmap stopped working where it was 
>> >> > previously working fine.
>> >> >
>> >> > Trying to move pg 41.1 to 123 has no effect and seems to be ignored.
>> >> >
>> >> > # ceph osd pg-upmap-items 41.1 23 123
>> >> > set 41.1 pg_upmap_items mapping to [23->123]
>> >> >
>>
>> >> > No rebalancing happens and if I run it again it shows the same output 
>> >> > every time.
>> >> >
>> >> > I have in config
>> >> > debug mgr = 4/5
>> >> > debug mon = 4/5
>> >> >
>> >> > Paste from mon & mgr logs. Also output from "ceph osd dump"
>> >> > https://pastebin.com/9VrT4YcU
>> >> >
>> >> >
>>
>> >> > I have run "ceph osd set-require-min-compat-client luminous" long time 
>> >> > ago. And all servers running ceph have been rebooted numerous times 
>> >> > since then.
>>
>> >> > But 

Re: [ceph-users] mount cephfs on ceph servers

2019-03-12 Thread David C
Out of curiosity, are you guys re-exporting the fs to clients over
something like nfs or running applications directly on the OSD nodes?

On Tue, 12 Mar 2019, 18:28 Paul Emmerich,  wrote:

> Mounting kernel CephFS on an OSD node works fine with recent kernels
> (4.14+) and enough RAM in the servers.
>
> We did encounter problems with older kernels though
>
>
> Paul
>
> --
> Paul Emmerich
>
> Looking for help with your Ceph cluster? Contact us at https://croit.io
>
> croit GmbH
> Freseniusstr. 31h
> 81247 München
> www.croit.io
> Tel: +49 89 1896585 90
>
> On Tue, Mar 12, 2019 at 10:07 AM Hector Martin 
> wrote:
> >
> > It's worth noting that most containerized deployments can effectively
> > limit RAM for containers (cgroups), and the kernel has limits on how
> > many dirty pages it can keep around.
> >
> > In particular, /proc/sys/vm/dirty_ratio (default: 20) means at most 20%
> > of your total RAM can be dirty FS pages. If you set up your containers
> > such that the cumulative memory usage is capped below, say, 70% of RAM,
> > then this might effectively guarantee that you will never hit this issue.
> >
> > On 08/03/2019 02:17, Tony Lill wrote:
> > > AFAIR the issue is that under memory pressure, the kernel will ask
> > > cephfs to flush pages, but this in turn causes the osd (mds?) to
> > > require more memory to complete the flush (for network buffers, etc).
> As
> > > long as cephfs and the OSDs are feeding from the same kernel mempool,
> > > you are susceptible. Containers don't protect you, but a full VM, like
> > > xen or kvm? would.
> > >
> > > So if you don't hit the low memory situation, you will not see the
> > > deadlock, and you can run like this for years without a problem. I
> have.
> > > But you are most likely to run out of memory during recovery, so this
> > > could compound your problems.
> > >
> > > On 3/7/19 3:56 AM, Marc Roos wrote:
> > >>
> > >>
> > >> Container =  same kernel, problem is with processes using the same
> > >> kernel.
> > >>
> > >>
> > >>
> > >>
> > >>
> > >>
> > >> -Original Message-
> > >> From: Daniele Riccucci [mailto:devs...@posteo.net]
> > >> Sent: 07 March 2019 00:18
> > >> To: ceph-users@lists.ceph.com
> > >> Subject: Re: [ceph-users] mount cephfs on ceph servers
> > >>
> > >> Hello,
> > >> is the deadlock risk still an issue in containerized deployments? For
> > >> example with OSD daemons in containers and mounting the filesystem on
> > >> the host machine?
> > >> Thank you.
> > >>
> > >> Daniele
> > >>
> > >> On 06/03/19 16:40, Jake Grimmett wrote:
> > >>> Just to add "+1" on this datapoint, based on one month usage on Mimic
> > >>> 13.2.4 essentially "it works great for us"
> > >>>
> > >>> Prior to this, we had issues with the kernel driver on 12.2.2. This
> > >>> could have been due to limited RAM on the osd nodes (128GB / 45 OSD),
> > >>> and an older kernel.
> > >>>
> > >>> Upgrading the RAM to 256GB and using a RHEL 7.6 derived kernel has
> > >>> allowed us to reliably use the kernel driver.
> > >>>
> > >>> We keep 30 snapshots ( one per day), have one active metadata server,
> > >>> and change several TB daily - it's much, *much* faster than with
> fuse.
> > >>>
> > >>> Cluster has 10 OSD nodes, currently storing 2PB, using ec 8:2 coding.
> > >>>
> > >>> ta ta
> > >>>
> > >>> Jake
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> On 3/6/19 11:10 AM, Hector Martin wrote:
> >  On 06/03/2019 12:07, Zhenshi Zhou wrote:
> > > Hi,
> > >
> > > I'm gonna mount cephfs from my ceph servers for some reason,
> > > including monitors, metadata servers and osd servers. I know it's
> > > not a best practice. But what is the exact potential danger if I
> > > mount cephfs from its own server?
> > 
> >  As a datapoint, I have been doing this on two machines (single-host
> >  Ceph
> >  clusters) for months with no ill effects. The FUSE client performs a
> >  lot worse than the kernel client, so I switched to the latter, and
> >  it's been working well with no deadlocks.
> > 
> > >>> ___
> > >>> ceph-users mailing list
> > >>> ceph-users@lists.ceph.com
> > >>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > >>>
> > >> ___
> > >> ceph-users mailing list
> > >> ceph-users@lists.ceph.com
> > >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > >>
> > >>
> > >> ___
> > >> ceph-users mailing list
> > >> ceph-users@lists.ceph.com
> > >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > >>
> > >
> > >
> > > ___
> > > ceph-users mailing list
> > > ceph-users@lists.ceph.com
> > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > >
> >
> > --
> > Hector Martin (hec...@marcansoft.com)
> > Public Key: https://mrcn.st/pub
> > ___
> > ceph-users mailing list
> > 

Re: [ceph-users] 3-node cluster with 3 x Intel Optane 900P - very low benchmarked performance (200 IOPS)?

2019-03-11 Thread David Clarke
On 9/03/19 10:07 PM, Victor Hooi wrote:
> Hi,
> 
> I'm setting up a 3-node Proxmox cluster with Ceph as the shared storage,
> based around Intel Optane 900P drives (which are meant to be the bee's
> knees), and I'm seeing pretty low IOPS/bandwidth.

We found that CPU power-state settings played a large part in latency, and
therefore IOPS.  This wasn't too evident with spinning disks, but it makes a
large percentage difference in our NVMe based clusters.

You may want to investigate setting processor.max_cstate=1 or
intel_idle.max_cstate=1, whichever is appropriate for your CPUs and
kernel, on the boot cmdline.
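
A quick way to see what the CPUs are currently allowed to drop into, plus a
shortcut on EL-style distros with tuned that avoids editing the cmdline (a
sketch, not a full tuning guide):

cat /sys/module/intel_idle/parameters/max_cstate
grep . /sys/devices/system/cpu/cpu0/cpuidle/state*/name
# or simply:
tuned-adm profile latency-performance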



-- 
David Clarke
Systems Architect
Catalyst IT



signature.asc
Description: OpenPGP digital signature
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OpenStack with Ceph RDMA

2019-03-11 Thread David Turner
I can't speak to the rdma portion. But to clear up what each of these
does... the cluster network is only traffic between the osds for
replicating writes, reading EC data, as well as backfilling and recovery
io. Mons, mds, rgw, and osds talking with clients all happen on the public
network. The general consensus has been to not split the two networks,
except for maybe by vlans for potential statistics and graphing. Even if
you were running out of bandwidth, just upgrade the dual interface instead
of segregating them physically.

On Sat, Mar 9, 2019, 11:10 AM Lazuardi Nasution 
wrote:

> Hi,
>
> I'm looking for information about where is the RDMA messaging of Ceph
> happen, on cluster network, public network or both (it seem both, CMIIW)?
> I'm talking about configuration of ms_type, ms_cluster_type and
> ms_public_type.
>
> In case of OpenStack integration with RBD, which of above three is
> possible? In this case, should I still separate cluster network and public
> network?
>
> Best regards,
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] priorize degraged objects than misplaced

2019-03-11 Thread David Turner
Ceph has been getting better and better about prioritizing this sort of
recovery, but few of those optimizations are in Jewel, which has been out
of the support cycle for about a year. You should look into upgrading to
Mimic, where you should see a pretty good improvement in this sort of
prioritization.
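
Newer releases also give you explicit knobs for this, roughly:

ceph pg force-recovery <pgid>
ceph pg force-backfill <pgid>

which push the named PGs to the front of the queue.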

On Sat, Mar 9, 2019, 3:10 PM Fabio Abreu  wrote:

> HI Everybody,
>
> I have a doubt about degraded objects in the Jewel 10.2.7 version, can I
> priorize the degraded objects than misplaced?
>
> I asking this because I try simulate a disaster recovery scenario.
>
>
> Thanks and best regards,
> Fabio Abreu Reis
> http://fajlinux.com.br
> *Tel : *+55 21 98244-0161
> *Skype : *fabioabreureis
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] CEPH ISCSI Gateway

2019-03-11 Thread David Turner
The problem with clients on osd nodes is for kernel clients only. That's
true of krbd and the kernel client for cephfs. The only other reason not to
run any other Ceph daemon on the same node as OSDs is resource contention
if you're running at high CPU and memory utilization.

On Sat, Mar 9, 2019, 10:15 PM Mike Christie  wrote:

> On 03/07/2019 09:22 AM, Ashley Merrick wrote:
> > Been reading into the gateway, and noticed it’s been mentioned a few
> > times it can be installed on OSD servers.
> >
> > I am guessing therefore there be no issues like is sometimes mentioned
> > when using kRBD on a OSD node apart from the extra resources required
> > from the hardware.
> >
>
> That is correct. You might have a similar issue if you were to run the
> iscsi gw/target, OSD and then also run the iscsi initiator that logs
> into the iscsi gw/target all on the same node. I don't think any use
> case like that has ever come up though.
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Failed to repair pg

2019-03-07 Thread David Zafman


On 3/7/19 9:32 AM, Herbert Alexander Faleiros wrote:

On Thu, Mar 07, 2019 at 01:37:55PM -0300, Herbert Alexander Faleiros wrote:
Should I do something like this? (below, after stopping osd.36)

# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-36/ --journal-path 
/dev/sdc1 rbd_data.dfd5e2235befd0.0001c299 remove-clone-metadata 326022

I'm no sure about rbd_data.$RBD and $CLONEID (took from rados
list-inconsistent-obj, also below).



See what results you get from this command.

# rados list-inconsistent-snapset 2.2bb --format=json-pretty

You might just see this, which is nothing interesting.  If you don't get JSON,
re-run the scrub.


{
    "epoch": ##,
    "inconsistents": []
}

I don't think you need to do the remove-clone-metadata, because you got
"unexpected clone"; I think you'd just get "Clone 326022 not present".


I think you need to remove the clone object from osd.12 and osd.80.  For 
example:


# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-12/ 
--journal-path /dev/sdXX --op list rbd_data.dfd5e2235befd0.0001c299


["2.2bb",{"oid":"rbd_data.dfd5e2235befd0.0001c299","key":"","snapid":-2,"hash":,"max":0,"pool":2,"namespace":"","max":0}]
["2.2bb",{"oid":"rbd_data.dfd5e2235befd0.0001c299","key":"","snapid":326022,"hash":#,"max":0,"pool":2,"namespace":"","max":0}]

Use the json for snapid 326022 to remove it.

# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-12/ 
--journal-path /dev/sdXX 
'["2.2bb",{"oid":"rbd_data.dfd5e2235befd0.0001c299","key":"","snapid":326022,"hash":#,"max":0,"pool":2,"namespace":"","max":0}]' 
remove



David

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] mount cephfs on ceph servers

2019-03-06 Thread David C
The general advice has been not to use the kernel client on an OSD node, as
you may see a deadlock under certain conditions. Using the fuse client
should be fine, or you can use the kernel client inside a VM.

On Wed, 6 Mar 2019, 03:07 Zhenshi Zhou,  wrote:

> Hi,
>
> I'm gonna mount cephfs from my ceph servers for some reason,
> including monitors, metadata servers and osd servers. I know it's
> not a best practice. But what is the exact potential danger if I mount
> cephfs from its own server?
>
> Thanks
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] [Nfs-ganesha-devel] NFS-Ganesha CEPH_FSAL ceph.quota.max_bytes not enforced

2019-03-04 Thread David C
On Mon, Mar 4, 2019 at 5:53 PM Jeff Layton  wrote:

>
> On Mon, 2019-03-04 at 17:26 +, David C wrote:
> > Looks like you're right, Jeff. Just tried to write into the dir and am
> > now getting the quota warning. So I guess it was the libcephfs cache
> > as you say. That's fine for me, I don't need the quotas to be too
> > strict, just a failsafe really.
> >
>
> Actually, I said it was likely the NFS client cache. The Linux kernel is
> allowed to aggressively cache writes if you're doing buffered I/O. The
> NFS client has no concept of the quota here, so you'd only see
> enforcement once those writes start getting flushed back to the server.
>

Ah sorry, that makes a lot of sense!

>
>
> > Interestingly, if I create a new dir, set the same 100MB quota, I can
> > write multiple files with "dd if=/dev/zero of=1G bs=1M count=1024
> > oflag=direct". Wouldn't that bypass the cache? I have the following in
> > my ganesha.conf which I believe effectively disables Ganesha's
> > caching:
> >
> > CACHEINODE {
> > Dir_Chunk = 0;
> > NParts = 1;
> > Cache_Size = 1;
> > }
> >
>
> Using direct I/O like that should take the NFS client cache out of the
> picture. That said, cephfs quota enforcement is pretty "lazy". According
> to http://docs.ceph.com/docs/mimic/cephfs/quota/ :
>
> "Quotas are imprecise. Processes that are writing to the file system
> will be stopped a short time after the quota limit is reached. They will
> inevitably be allowed to write some amount of data over the configured
> limit. How far over the quota they are able to go depends primarily on
> the amount of time, not the amount of data. Generally speaking writers
> will be stopped within 10s of seconds of crossing the configured limit."
>
> You can write quite a bit of data in 10s of seconds (multiple GBs is not
> unreasonable here).
>
> > On Mon, Mar 4, 2019 at 2:50 PM Jeff Layton  wrote:
>
> > > > > On Mon, 2019-03-04 at 09:11 -0500, Jeff Layton wrote:
> > This list has
> > > been deprecated. Please subscribe to the new devel list at
> > > lists.nfs-ganesha.org.
> > On Fri, 2019-03-01 at 15:49 +, David C
> > > wrote:
> > > This list has been deprecated. Please subscribe to the new
> > > devel list at lists.nfs-ganesha.org.
> > > Hi All
> > >
> > > Exporting
> > > cephfs with the CEPH_FSAL
> > >
> > > I set the following on a dir:
> > >
> >
> > > > setfattr -n ceph.quota.max_bytes -v 1 /dir
> > > setfattr -n
> > > ceph.quota.max_files -v 10 /dir
> > >
> > > From an NFSv4 client, the
> > > quota.max_bytes appears to be completely ignored, I can go GBs over
> > > the quota in the dir. The quota.max_files DOES work however, if I
> > > try and create more than 10 files, I'll get "Error opening file
> > > 'dir/new file': Disk quota exceeded" as expected.
> > >
> > > From a
> > > fuse-mount on the same server that is running nfs-ganesha, I've
> > > confirmed ceph.quota.max_bytes is enforcing the quota, I'm unable to
> > > copy more than 100MB into the dir.
> > >
> > > According to [1] and [2]
> > > this should work.
> > >
> > > Cluster is Luminous 12.2.10
> > >
> > > Package
> > > versions on nfs-ganesha server:
> > >
> > > nfs-ganesha-rados-grace-
> > > 2.7.1-0.1.el7.x86_64
> > > nfs-ganesha-2.7.1-0.1.el7.x86_64
> > > nfs-
> > > ganesha-vfs-2.7.1-0.1.el7.x86_64
> > > nfs-ganesha-ceph-2.7.1-
> > > 0.1.el7.x86_64
> > > libcephfs2-13.2.2-0.el7.x86_64
> > > ceph-fuse-
> > > 12.2.10-0.el7.x86_64
> > >
> > > My Ganesha export:
> > >
> > > EXPORT
> > > {
> >
> > > > Export_ID=100;
> > > Protocols = 4;
> > > Transports = TCP;
> >
> > > >     Path = /;
> > > Pseudo = /ceph/;
> > > Access_Type = RW;
> > >
> > >Attr_Expiration_Time = 0;
> > > #Manage_Gids = TRUE;
> > >
> > >  Filesystem_Id = 100.1;
> > > FSAL {
> > > Name = CEPH;
> > >
> > >  }
> > > }
> > >
> > > My ceph.conf client section:
> > >
> > > [client]
> > >
> > >mon host = 10.10.10.210:6789, 10.10.10.211:6789,
> > > 10.10.10.212:6789
> > > client_oc_size = 8388608000
> > >
> > &

Re: [ceph-users] [Nfs-ganesha-devel] NFS-Ganesha CEPH_FSAL ceph.quota.max_bytes not enforced

2019-03-04 Thread David C
Looks like you're right, Jeff. Just tried to write into the dir and am now
getting the quota warning. So I guess it was the libcephfs cache as you
say. That's fine for me, I don't need the quotas to be too strict, just a
failsafe really.

Interestingly, if I create a new dir and set the same 100MB quota, I can write
multiple files with "dd if=/dev/zero of=1G bs=1M count=1024 oflag=direct".
Wouldn't that bypass the cache? I have the following in my ganesha.conf
which I believe effectively disables Ganesha's caching:

CACHEINODE {
Dir_Chunk = 0;
NParts = 1;
Cache_Size = 1;
}
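
For completeness, the way I'm setting and then verifying the quota on the test
dir from a fuse mount is along these lines (the path and the exact byte value
are just examples, 104857600 being 100MB spelled out):

setfattr -n ceph.quota.max_bytes -v 104857600 /mnt/cephfs/dir
getfattr -n ceph.quota.max_bytes /mnt/cephfs/dir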

Thanks,

On Mon, Mar 4, 2019 at 2:50 PM Jeff Layton  wrote:

> On Mon, 2019-03-04 at 09:11 -0500, Jeff Layton wrote:
> > This list has been deprecated. Please subscribe to the new devel list at
> lists.nfs-ganesha.org.
> > On Fri, 2019-03-01 at 15:49 +, David C wrote:
> > > This list has been deprecated. Please subscribe to the new devel list
> at lists.nfs-ganesha.org.
> > > Hi All
> > >
> > > Exporting cephfs with the CEPH_FSAL
> > >
> > > I set the following on a dir:
> > >
> > > setfattr -n ceph.quota.max_bytes -v 1 /dir
> > > setfattr -n ceph.quota.max_files -v 10 /dir
> > >
> > > From an NFSv4 client, the quota.max_bytes appears to be completely
> ignored, I can go GBs over the quota in the dir. The quota.max_files DOES
> work however, if I try and create more than 10 files, I'll get "Error
> opening file 'dir/new file': Disk quota exceeded" as expected.
> > >
> > > From a fuse-mount on the same server that is running nfs-ganesha, I've
> confirmed ceph.quota.max_bytes is enforcing the quota, I'm unable to copy
> more than 100MB into the dir.
> > >
> > > According to [1] and [2] this should work.
> > >
> > > Cluster is Luminous 12.2.10
> > >
> > > Package versions on nfs-ganesha server:
> > >
> > > nfs-ganesha-rados-grace-2.7.1-0.1.el7.x86_64
> > > nfs-ganesha-2.7.1-0.1.el7.x86_64
> > > nfs-ganesha-vfs-2.7.1-0.1.el7.x86_64
> > > nfs-ganesha-ceph-2.7.1-0.1.el7.x86_64
> > > libcephfs2-13.2.2-0.el7.x86_64
> > > ceph-fuse-12.2.10-0.el7.x86_64
> > >
> > > My Ganesha export:
> > >
> > > EXPORT
> > > {
> > > Export_ID=100;
> > > Protocols = 4;
> > > Transports = TCP;
> > > Path = /;
> > > Pseudo = /ceph/;
> > > Access_Type = RW;
> > > Attr_Expiration_Time = 0;
> > > #Manage_Gids = TRUE;
> > > Filesystem_Id = 100.1;
> > > FSAL {
> > > Name = CEPH;
> > > }
> > > }
> > >
> > > My ceph.conf client section:
> > >
> > > [client]
> > > mon host = 10.10.10.210:6789, 10.10.10.211:6789,
> 10.10.10.212:6789
> > > client_oc_size = 8388608000
> > > #fuse_default_permission=0
> > > client_acl_type=posix_acl
> > > client_quota = true
> > > client_quota_df = true
> > >
> > > Related links:
> > >
> > > [1] http://tracker.ceph.com/issues/16526
> > > [2] https://github.com/nfs-ganesha/nfs-ganesha/issues/100
> > >
> > > Thanks
> > > David
> > >
> >
> > It looks like you're having ganesha do the mount as "client.admin", and
> > I suspect that that may allow you to bypass quotas? You may want to try
> > creating a cephx user with less privileges, have ganesha connect as that
> > user and see if it changes things?
> >
>
> Actually, this may be wrong info.
>
> How are you testing being able to write to the file past quota? Are you
> using O_DIRECT I/O? If not, then it may just be that you're seeing the
> effect of the NFS client caching writes.
> --
> Jeff Layton 
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] rbd unmap fails with error: rbd: sysfs write failed rbd: unmap failed: (16) Device or resource busy

2019-03-01 Thread David Turner
True, but not before you unmap it from the previous server. It's like
physically connecting a hard drive to two servers at the same time. Neither
knows what the other is doing to it and can corrupt your data. You should
always make sure to unmap an rbd before mapping it to another server.
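
i.e. the order matters; roughly:

rbd showmapped                      # on the old host, find the device
umount /mountpoint && rbd unmap /dev/rbd0
# only then, on the new host:
rbd map hdb-backup/ld2110

If the old host is dead and can't unmap cleanly, newer kernels accept a forced
unmap (rbd unmap -o force /dev/rbd0), but only reach for that once you're sure
nothing is still writing.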

On Fri, Mar 1, 2019, 6:28 PM solarflow99  wrote:

> It has to be mounted from somewhere, if that server goes offline, you need
> to mount it from somewhere else right?
>
>
> On Thu, Feb 28, 2019 at 11:15 PM David Turner 
> wrote:
>
>> Why are you mapping the same rbd to multiple servers?
>>
>> On Wed, Feb 27, 2019, 9:50 AM Ilya Dryomov  wrote:
>>
>>> On Wed, Feb 27, 2019 at 12:00 PM Thomas <74cmo...@gmail.com> wrote:
>>> >
>>> > Hi,
>>> > I have noticed an error when writing to a mapped RBD.
>>> > Therefore I unmounted the block device.
>>> > Then I tried to unmap it w/o success:
>>> > ld2110:~ # rbd unmap /dev/rbd0
>>> > rbd: sysfs write failed
>>> > rbd: unmap failed: (16) Device or resource busy
>>> >
>>> > The same block device is mapped on another client and there are no
>>> issues:
>>> > root@ld4257:~# rbd info hdb-backup/ld2110
>>> > rbd image 'ld2110':
>>> > size 7.81TiB in 2048000 objects
>>> > order 22 (4MiB objects)
>>> > block_name_prefix: rbd_data.3cda0d6b8b4567
>>> > format: 2
>>> > features: layering
>>> > flags:
>>> > create_timestamp: Fri Feb 15 10:53:50 2019
>>> > root@ld4257:~# rados -p hdb-backup  listwatchers
>>> rbd_data.3cda0d6b8b4567
>>> > error listing watchers hdb-backup/rbd_data.3cda0d6b8b4567: (2) No such
>>> > file or directory
>>> > root@ld4257:~# rados -p hdb-backup  listwatchers
>>> rbd_header.3cda0d6b8b4567
>>> > watcher=10.76.177.185:0/1144812735 client.21865052 cookie=1
>>> > watcher=10.97.206.97:0/4023931980 client.18484780
>>> > cookie=18446462598732841027
>>> >
>>> >
>>> > Question:
>>> > How can I force to unmap the RBD on client ld2110 (= 10.76.177.185)?
>>>
>>> Hi Thomas,
>>>
>>> It appears that /dev/rbd0 is still open on that node.
>>>
>>> Was the unmount successful?  Which filesystem (ext4, xfs, etc)?
>>>
>>> What is the output of "ps aux | grep rbd" on that node?
>>>
>>> Try lsof, fuser, check for LVM volumes and multipath -- these have been
>>> reported to cause this issue previously:
>>>
>>>   http://tracker.ceph.com/issues/12763
>>>
>>> Thanks,
>>>
>>> Ilya
>>> ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] NFS-Ganesha CEPH_FSAL ceph.quota.max_bytes not enforced

2019-03-01 Thread David C
Hi All

Exporting cephfs with the CEPH_FSAL

I set the following on a dir:

setfattr -n ceph.quota.max_bytes -v 1 /dir
setfattr -n ceph.quota.max_files -v 10 /dir

From an NFSv4 client, the quota.max_bytes appears to be completely ignored,
I can go GBs over the quota in the dir. The *quota.max_files* DOES work
however, if I try and create more than 10 files, I'll get "Error opening
file 'dir/new file': Disk quota exceeded" as expected.

From a fuse-mount on the same server that is running nfs-ganesha, I've
confirmed ceph.quota.max_bytes is enforcing the quota, I'm unable to copy
more than 100MB into the dir.

According to [1] and [2] this should work.

Cluster is Luminous 12.2.10

Package versions on nfs-ganesha server:

nfs-ganesha-rados-grace-2.7.1-0.1.el7.x86_64
nfs-ganesha-2.7.1-0.1.el7.x86_64
nfs-ganesha-vfs-2.7.1-0.1.el7.x86_64
nfs-ganesha-ceph-2.7.1-0.1.el7.x86_64
libcephfs2-13.2.2-0.el7.x86_64
ceph-fuse-12.2.10-0.el7.x86_64

My Ganesha export:

EXPORT
{
Export_ID=100;
Protocols = 4;
Transports = TCP;
Path = /;
Pseudo = /ceph/;
Access_Type = RW;
Attr_Expiration_Time = 0;
#Manage_Gids = TRUE;
Filesystem_Id = 100.1;
FSAL {
Name = CEPH;
}
}

My ceph.conf client section:

[client]
mon host = 10.10.10.210:6789, 10.10.10.211:6789, 10.10.10.212:6789
client_oc_size = 8388608000
#fuse_default_permission=0
client_acl_type=posix_acl
client_quota = true
client_quota_df = true

Related links:

[1] http://tracker.ceph.com/issues/16526
[2] https://github.com/nfs-ganesha/nfs-ganesha/issues/100

Thanks
David
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Mimic 13.2.4 rbd du slowness

2019-02-28 Thread David Turner
Have you used strace on the du command to see what it's spending its time
doing?
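
Something along these lines is what I had in mind (the image spec is just an
example):

strace -c -f rbd du rbd/myimage                            # syscall count/time summary
strace -f -tt -T -o /tmp/rbd-du.trace rbd du rbd/myimage   # full timeline

If the time turns out to be userspace CPU rather than syscalls, running perf
top while the command runs would be the next step.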

On Thu, Feb 28, 2019, 8:45 PM Glen Baars 
wrote:

> Hello Wido,
>
> The cluster layout is as follows:
>
> 3 x Monitor hosts ( 2 x 10Gbit bonded )
> 9 x OSD hosts (
> 2 x 10Gbit bonded,
> LSI cachecade and write cache drives set to single,
> All HDD in this pool,
> no separate DB / WAL. With the write cache and the SSD read cache on the
> LSI card it seems to perform well.
> 168 OSD disks
>
> No major increase in OSD disk usage or CPU usage. The RBD DU process uses
> 100% of a single 2.4Ghz core while running - I think that is the limiting
> factor.
>
> I have just tried removing most of the snapshots for that volume ( from 14
> snapshots down to 1 snapshot ) and the rbd du command now takes around 2-3
> minutes.
>
> Kind regards,
> Glen Baars
>
> -Original Message-
> From: Wido den Hollander 
> Sent: Thursday, 28 February 2019 5:05 PM
> To: Glen Baars ; ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] Mimic 13.2.4 rbd du slowness
>
>
>
> On 2/28/19 9:41 AM, Glen Baars wrote:
> > Hello Wido,
> >
> > I have looked at the libvirt code and there is a check to ensure that
> fast-diff is enabled on the image and only then does it try to get the real
> disk usage. The issue for me is that even with fast-diff enabled it takes
> 25min to get the space usage for a 50TB image.
> >
> > I had considered turning off fast-diff on the large images to get
> > around to issue but I think that will hurt my snapshot removal times (
> > untested )
> >
>
> Can you tell a bit more about the Ceph cluster? HDD? SSD? DB and WAL on
> SSD?
>
> Do you see OSDs spike in CPU or Disk I/O when you do a 'rbd du' on these
> images?
>
> Wido
>
> > I can't see in the code any other way of bypassing the disk usage check
> but I am not that familiar with the code.
> >
> > ---
> > if (volStorageBackendRBDUseFastDiff(features)) {
> > VIR_DEBUG("RBD image %s/%s has fast-diff feature enabled. "
> >   "Querying for actual allocation",
> >   def->source.name, vol->name);
> >
> > if (virStorageBackendRBDSetAllocation(vol, image, &info) < 0)
> > goto cleanup;
> > } else {
> > vol->target.allocation = info.obj_size * info.num_objs; }
> > --
> >
> > Kind regards,
> > Glen Baars
> >
> > -Original Message-
> > From: Wido den Hollander 
> > Sent: Thursday, 28 February 2019 3:49 PM
> > To: Glen Baars ;
> > ceph-users@lists.ceph.com
> > Subject: Re: [ceph-users] Mimic 13.2.4 rbd du slowness
> >
> >
> >
> > On 2/28/19 2:59 AM, Glen Baars wrote:
> >> Hello Ceph Users,
> >>
> >> Has anyone found a way to improve the speed of the rbd du command on
> large rbd images? I have object map and fast diff enabled - no invalid
> flags on the image or it's snapshots.
> >>
> >> We recently upgraded our Ubuntu 16.04 KVM servers for Cloudstack to
> Ubuntu 18.04. The upgrades libvirt to version 4. When libvirt 4 adds an rbd
> pool it discovers all images in the pool and tries to get their disk usage.
> We are seeing a 50TB image take 25min. The pool has over 300TB of images in
> it and takes hours for libvirt to start.
> >>
> >
> > This is actually a pretty bad thing imho. As a lot of images people will
> be using do not have fast-diff enabled (images from the past) and that will
> kill their performance.
> >
> > Isn't there a way to turn this off in libvirt?
> >
> > Wido
> >
> >> We can replicate the issue without libvirt by just running a rbd du on
> the large images. The limiting factor is the cpu on the rbd du command, it
> uses 100% of a single core.
> >>
> >> Our cluster is completely bluestore/mimic 13.2.4. 168 OSDs, 12 Ubuntu
> 16.04 hosts.
> >>
> >> Kind regards,
> >> Glen Baars
> >> This e-mail is intended solely for the benefit of the addressee(s) and
> any other named recipient. It is confidential and may contain legally
> privileged or confidential information. If you are not the recipient, any
> use, distribution, disclosure or copying of this e-mail is prohibited. The
> confidentiality and legal privilege attached to this communication is not
> waived or lost by reason of the mistaken transmission or delivery to you.
> If you have received this e-mail in error, please notify us immediately.
> >> ___
> >> ceph-users mailing list
> >> ceph-users@lists.ceph.com
> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >>
> > This e-mail is intended solely for the benefit of the addressee(s) and
> any other named recipient. It is confidential and may contain legally
> privileged or confidential information. If you are not the recipient, any
> use, distribution, disclosure or copying of this e-mail is prohibited. The
> confidentiality and legal privilege attached to this communication is not
> waived or lost by reason of the mistaken transmission or delivery to you.
> If you have received this e-mail in error, please notify us immediately.

Re: [ceph-users] rbd unmap fails with error: rbd: sysfs write failed rbd: unmap failed: (16) Device or resource busy

2019-02-28 Thread David Turner
Why are you mapping the same RBD to multiple servers?
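In any case, a few things worth trying on ld2110 before forcing anything
(device name as in the quoted output; the force option needs a reasonably
recent kernel):

lsblk /dev/rbd0                  # any partitions/LVM stacked on top?
lsof /dev/rbd0
fuser -vm /dev/rbd0
multipath -ll                    # multipath claiming the device?
rbd unmap -o force /dev/rbd0     # last resort once the holders are found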

On Wed, Feb 27, 2019, 9:50 AM Ilya Dryomov  wrote:

> On Wed, Feb 27, 2019 at 12:00 PM Thomas <74cmo...@gmail.com> wrote:
> >
> > Hi,
> > I have noticed an error when writing to a mapped RBD.
> > Therefore I unmounted the block device.
> > Then I tried to unmap it w/o success:
> > ld2110:~ # rbd unmap /dev/rbd0
> > rbd: sysfs write failed
> > rbd: unmap failed: (16) Device or resource busy
> >
> > The same block device is mapped on another client and there are no
> issues:
> > root@ld4257:~# rbd info hdb-backup/ld2110
> > rbd image 'ld2110':
> > size 7.81TiB in 2048000 objects
> > order 22 (4MiB objects)
> > block_name_prefix: rbd_data.3cda0d6b8b4567
> > format: 2
> > features: layering
> > flags:
> > create_timestamp: Fri Feb 15 10:53:50 2019
> > root@ld4257:~# rados -p hdb-backup  listwatchers rbd_data.3cda0d6b8b4567
> > error listing watchers hdb-backup/rbd_data.3cda0d6b8b4567: (2) No such
> > file or directory
> > root@ld4257:~# rados -p hdb-backup  listwatchers
> rbd_header.3cda0d6b8b4567
> > watcher=10.76.177.185:0/1144812735 client.21865052 cookie=1
> > watcher=10.97.206.97:0/4023931980 client.18484780
> > cookie=18446462598732841027
> >
> >
> > Question:
> > How can I force to unmap the RBD on client ld2110 (= 10.76.177.185)?
>
> Hi Thomas,
>
> It appears that /dev/rbd0 is still open on that node.
>
> Was the unmount successful?  Which filesystem (ext4, xfs, etc)?
>
> What is the output of "ps aux | grep rbd" on that node?
>
> Try lsof, fuser, check for LVM volumes and multipath -- these have been
> reported to cause this issue previously:
>
>   http://tracker.ceph.com/issues/12763
>
> Thanks,
>
> Ilya
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] PG Calculations Issue

2019-02-28 Thread David Turner
Those numbers look right for a pool only containing 10% of your data. Now
continue to calculate the pg counts for the remaining 90% of your data.
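If it helps, here is a rough shell transcription of the first formula
(numbers are from the 64-OSD example below; it simply rounds up to the
next power of two and leaves out the 25% adjustment):

target_per_osd=100; osds=64; pct_data=0.10; size=3
f1=$(echo "$target_per_osd * $osds * ($pct_data / 100) / $size" | bc -l)
f2=$(echo "$osds / $size" | bc -l)
raw=$(printf '%s\n%s\n' "$f1" "$f2" | sort -n | tail -1)   # higher of the two
pg=1; while (( $(echo "$pg < $raw" | bc) )); do pg=$((pg * 2)); done
echo "raw=$raw -> pg_num=$pg"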

On Wed, Feb 27, 2019, 12:17 PM Krishna Venkata 
wrote:

> Greetings,
>
>
> I am having issues in the way PGs are calculated in
> https://ceph.com/pgcalc/ [Ceph PGs per Pool Calculator ] and the formulae
> mentioned in the site.
>
> Below are my findings
>
> The formula to calculate PGs as mentioned in the https://ceph.com/pgcalc/
>  :
>
> 1.  Need to pick the highest value from either of the formulas
>
> *(( Target PGs per OSD ) x ( OSD # ) x ( %Data ))/(size)*
>
> Or
>
> *( OSD# ) / ( Size )*
>
> 2.  The output value is then rounded to the nearest power of 2
>
>1. If the nearest power of 2 is more than 25% below the original
>value, the next higher power of 2 is used.
>
>
>
> Based on the above procedure, we calculated PGs for 25, 32 and 64 OSDs
>
> *Our Dataset:*
>
> *%Data:* 0.10
>
> *Target PGs per OSD:* 100
>
> *OSDs* 25, 32 and 64
>
>
>
> *For 25 OSDs*
>
>
>
> (100*25* (0.10/100))/(3) = 0.833
>
>
>
> ( 25 ) / ( 3 ) = 8.33
>
>
>
> 1. Raw pg num 8.33  ( Since we need to pick the highest of (0.833, 8.33))
>
> 2. max pg 16 ( For, 8.33 the nearest power of 2 is 16)
>
> 3. 16 > 2.08  ( 25 % of 8.33 is 2.08 which is more than 25% the power of 2)
>
>
>
> So 16 PGs
>
> ✓  GUI Calculator gives the same value and matches with Formula.
>
>
>
> *For 32 OSD*
>
>
>
> (100*32*(0.10/100))/3 = 1.066
>
> ( 32 ) / ( 3 ) = 10.66
>
>
>
> 1. Raw pg num 10.66 ( Since we need to pick the highest of (1.066, 10.66))
>
> 2. max pg 16 ( For, 10.66 the nearest power of 2 is 16)
>
> 3.  16 > 2.655 ( 25 % of 10.66 is 2.655 which is more than 25% the power
> of 2)
>
>
>
> So 16 PGs
>
> ✗  GUI Calculator gives different value (32 PGs) which doesn’t match with
> Formula.
>
>
>
> *For 64 OSD*
>
>
>
> (100 * 64 * (0.10/100))/3 = 2.133
>
> ( 64 ) / ( 3 ) = 21.33
>
>
>
> 1. Raw pg num 21.33 ( Since we need to pick the highest of (2.133, 21.33))
>
> 2. max pg 32 ( For, 21.33 the nearest power of 2 is 32)
>
> 3. 32 > 5.3325 ( 25 % of 21.33 is 5.3325 which is more than 25% the power
> of 2)
>
>
>
> So 32 PGs
>
> ✗  GUI Calculator gives different value (64 PGs) which doesn’t match with
> Formula.
>
>
>
> We checked the PG calculator logic from [
> https://ceph.com/pgcalc_assets/pgcalc.js ] which is not matching from
> above formulae.
>
>
>
> Can someone Guide/reference us to correct formulae to calculate PGs.
>
>
>
> Thanks in advance.
>
>
>
> Regards,
>
> Krishna Venkata
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] redirect log to syslog and disable log to stderr

2019-02-28 Thread David Turner
You can always set it in your ceph.conf file and restart the mgr daemon.
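Something like this, for example (restart command assumes systemd and that
the mgr id matches the short hostname):

cat >> /etc/ceph/ceph.conf <<'EOF'
[mgr]
log to stderr = false
log to syslog = true
EOF
systemctl restart ceph-mgr@$(hostname -s)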

On Tue, Feb 26, 2019, 1:30 PM Alex Litvak 
wrote:

> Dear Cephers,
>
> In mimic 13.2.2
> ceph tell mgr.* injectargs --log-to-stderr=false
> Returns an error (no valid command found ...).  What is the correct way to
> inject mgr configuration values?
>
> The same command works on mon
>
> ceph tell mon.* injectargs --log-to-stderr=false
>
>
> Thank you in advance,
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Right way to delete OSD from cluster?

2019-02-28 Thread David Turner
The reason is that an osd still contributes to the host weight in the crush
map even while it is marked out. When you out and then purge, the purging
operation removed the osd from the map and changes the weight of the host
which changes the crush map and data moves. By weighting the osd to 0.0,
the hosts weight is already the same it will be when you purge the osd.
Weighting to 0.0 is definitely the best option for removing storage if you
can trust the data on the osd being removed.
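As a concrete (hypothetical) example for osd.12:

ceph osd crush reweight osd.12 0.0        # the only step that should move data
# wait for all PGs to be active+clean, then:
ceph osd out 12
systemctl stop ceph-osd@12                # on the host that owns it
ceph osd purge 12 --yes-i-really-mean-it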

On Tue, Feb 26, 2019, 3:19 AM Fyodor Ustinov  wrote:

> Hi!
>
> Thank you so much!
>
> I do not understand why, but your variant really causes only one rebalance
> compared to the "osd out".
>
> - Original Message -
> From: "Scottix" 
> To: "Fyodor Ustinov" 
> Cc: "ceph-users" 
> Sent: Wednesday, 30 January, 2019 20:31:32
> Subject: Re: [ceph-users] Right way to delete OSD from cluster?
>
> I generally have gone the crush reweight 0 route
> This way the drive can participate in the rebalance, and the rebalance
> only happens once. Then you can take it out and purge.
>
> If I am not mistaken this is the safest.
>
> ceph osd crush reweight <osd.id> 0
>
> On Wed, Jan 30, 2019 at 7:45 AM Fyodor Ustinov  wrote:
> >
> > Hi!
> >
> > But unless after "ceph osd crush remove" I will not got the undersized
> objects? That is, this is not the same thing as simply turning off the OSD
> and waiting for the cluster to be restored?
> >
> > - Original Message -
> > From: "Wido den Hollander" 
> > To: "Fyodor Ustinov" , "ceph-users" <
> ceph-users@lists.ceph.com>
> > Sent: Wednesday, 30 January, 2019 15:05:35
> > Subject: Re: [ceph-users] Right way to delete OSD from cluster?
> >
> > On 1/30/19 2:00 PM, Fyodor Ustinov wrote:
> > > Hi!
> > >
> > > I thought I should first do "ceph osd out", wait for the end
> relocation of the misplaced objects and after that do "ceph osd purge".
> > > But after "purge" the cluster starts relocation again.
> > >
> > > Maybe I'm doing something wrong? Then what is the correct way to
> delete the OSD from the cluster?
> > >
> >
> > You are not doing anything wrong, this is the expected behavior. There
> > are two CRUSH changes:
> >
> > - Marking it out
> > - Purging it
> >
> > You could do:
> >
> > $ ceph osd crush remove osd.X
> >
> > Wait for all good
> >
> > $ ceph osd purge X
> >
> > The last step should then not initiate any data movement.
> >
> > Wido
> >
> > > WBR,
> > > Fyodor.
> > > ___
> > > ceph-users mailing list
> > > ceph-users@lists.ceph.com
> > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
>
> --
> T: @Thaumion
> IG: Thaumion
> scot...@gmail.com
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Cephfs recursive stats | rctime in the future

2019-02-28 Thread David C
On Wed, Feb 27, 2019 at 11:35 AM Hector Martin 
wrote:

> On 27/02/2019 19:22, David C wrote:
> > Hi All
> >
> > I'm seeing quite a few directories in my filesystem with rctime years in
> > the future. E.g
> >
> > ]# getfattr -d -m ceph.dir.* /path/to/dir
> > getfattr: Removing leading '/' from absolute path names
> > # file:  path/to/dir
> > ceph.dir.entries="357"
> > ceph.dir.files="1"
> > ceph.dir.rbytes="35606883904011"
> > ceph.dir.rctime="1851480065.090"
> > ceph.dir.rentries="12216551"
> > ceph.dir.rfiles="10540827"
> > ceph.dir.rsubdirs="1675724"
> > ceph.dir.subdirs="356"
> >
> > That's showing a last modified time of 2 Sept 2028, the day and month
> > are also wrong.
>
> Obvious question: are you sure the date/time on your cluster nodes and
> your clients is correct? Can you track down which files (if any) have
> the ctime in the future by following the rctime down the filesystem tree?
>

Times are all correct on the nodes and CephFS clients; however, the fs is
being exported over NFS. It's possible some NFS clients have the wrong time,
although I'm reasonably confident they are all correct: the machines are
synced to local time servers and use AD for auth, and things wouldn't work
if the time was that wildly out of sync.

Good idea on checking down the tree. I've found the offending files but
can't find any explanation as to why they have a modified date so far in
the future.

For example one dir is "/.config/caja/" in a user's home dir. The files in
this dir are all wildly different, the modified times are 1984, 1997,
2028...

It certainly feels like an MDS issue to me. I've used the recursive stats
since Jewel and I've never seen this before.

Any ideas?



> --
> Hector Martin (hec...@marcansoft.com)
> Public Key: https://mrcn.st/pub
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Cephfs recursive stats | rctime in the future

2019-02-27 Thread David C
Hi All

I'm seeing quite a few directories in my filesystem with rctime years in
the future. E.g

]# getfattr -d -m ceph.dir.* /path/to/dir
getfattr: Removing leading '/' from absolute path names
# file:  path/to/dir
ceph.dir.entries="357"
ceph.dir.files="1"
ceph.dir.rbytes="35606883904011"
ceph.dir.rctime="1851480065.090"
ceph.dir.rentries="12216551"
ceph.dir.rfiles="10540827"
ceph.dir.rsubdirs="1675724"
ceph.dir.subdirs="356"

That's showing a last modified time of 2 Sept 2028, the day and month are
also wrong.

Most dirs are still showing the correct rctime.

I've used the recursive stats for a few years now and they've always been
reliable. The last major changes I made to this cluster was an update to
Luminous 12.2.10, moving the metadata pool to an SSD backed pool and the
addition of a second Cephfs data pool.

I have just received a scrub error this morning with 1 inconsistent pg but
I've been noticing the incorrect rctimes for a while a now so not sure if
that's related.

Any help much appreciated

Thanks
David
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Usenix Vault 2019

2019-02-24 Thread David Turner
There is a scheduled birds of a feather for Ceph tomorrow night, but I also
noticed that there are only trainings tomorrow. Unless you are paying more
for those, you likely don't have much to do on Monday. That's the boat I'm
in. Is anyone interested in getting together tomorrow in Boston during the
training day?
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Configuration about using nvme SSD

2019-02-24 Thread David Turner
One thing that's worked for me to get more out of nvmes with Ceph is to
create multiple partitions on the nvme with an osd on each partition. That
way you get more osd processes and CPU per nvme device. I've heard of
people using up to 4 partitions like this.
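With a recent enough ceph-volume this can be done in one go (device name is
an example; older releases need manual partitioning):

ceph-volume lvm batch --osds-per-device 4 /dev/nvme0n1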

On Sun, Feb 24, 2019, 10:25 AM Vitaliy Filippov  wrote:

> > We can get 513558 IOPS in 4K read per nvme by fio but only 45146 IOPS
> > per OSD.by rados.
>
> Don't expect Ceph to fully utilize NVMe's, it's software and it's slow :)
> some colleagues tell that SPDK works out of the box, but almost doesn't
> increase performance, because the userland-kernel interaction isn't the
> bottleneck currently, it's Ceph code itself. I also tried once, but I
> couldn't make it work. When I have some spare NVMe's I'll make another
> attempt.
>
> So... try it and share your results here :) we're all interested.
>
> --
> With best regards,
>Vitaliy Filippov
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Doubts about backfilling performance

2019-02-23 Thread David Turner
Jewel is really limited on the settings you can tweak for backfilling [1].
Luminous and Mimic have a few more knobs. An option you can do, though, is
to use osd_crush_initial_weight found [2] here. With this setting you set
your initial crush weight for new osds to 0.0 and gradually increase them
to what you want them to be. This doesn't help with already added osds, but
can help in the future.


[1]
http://docs.ceph.com/docs/jewel/rados/configuration/osd-config-ref/#backfilling
[2] http://docs.ceph.com/docs/jewel/rados/configuration/pool-pg-config-ref/
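A sketch of what that looks like in practice (ids and weights are examples).
In ceph.conf on the OSD hosts, before creating the new OSDs:

[osd]
osd crush initial weight = 0

Then, once the new OSD is up, step it towards its real weight in stages:

ceph osd crush reweight osd.42 0.5
ceph osd crush reweight osd.42 1.2
ceph osd crush reweight osd.42 1.81898   # final weight for the drive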

On Sat, Feb 23, 2019, 6:08 AM Fabio Abreu  wrote:

> Hello everybody,
>
> I try to improve the backfilling proccess without impact my client I/O,
> that is a painfull thing  when i putted a new osd in my environment.
>
> I look some options like osd backfill scan max , Can I improve the
> performance if I reduce this ?
>
> Someome recommend parameter to study in my scenario.
>
> My environment is jewel 10.2.7 .
>
> Best Regards,
> Fabio Abreu
> --
> Atenciosamente,
> Fabio Abreu Reis
> http://fajlinux.com.br
> *Tel : *+55 21 98244-0161
> *Skype : *fabioabreureis
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph cluster stability

2019-02-22 Thread David Turner
Mon disks don't have journals, they're just a folder on a filesystem on a
disk.

On Fri, Feb 22, 2019, 6:40 AM M Ranga Swami Reddy 
wrote:

> ceph mons looks fine during the recovery.  Using  HDD with SSD
> journals. with recommeded CPU and RAM numbers.
>
> On Fri, Feb 22, 2019 at 4:40 PM David Turner 
> wrote:
> >
> > What about the system stats on your mons during recovery? If they are
> having a hard time keeping up with requests during a recovery, I could see
> that impacting client io. What disks are they running on? CPU? Etc.
> >
> > On Fri, Feb 22, 2019, 6:01 AM M Ranga Swami Reddy 
> wrote:
> >>
> >> Debug setting defaults are using..like 1/5 and 0/5 for almost..
> >> Shall I try with 0 for all debug settings?
> >>
> >> On Wed, Feb 20, 2019 at 9:17 PM Darius Kasparavičius 
> wrote:
> >> >
> >> > Hello,
> >> >
> >> >
> >> > Check your CPU usage when you are doing those kind of operations. We
> >> > had a similar issue where our CPU monitoring was reporting fine < 40%
> >> > usage, but our load on the nodes was high mid 60-80. If it's possible
> >> > try disabling ht and see the actual cpu usage.
> >> > If you are hitting CPU limits you can try disabling crc on messages.
> >> > ms_nocrc
> >> > ms_crc_data
> >> > ms_crc_header
> >> >
> >> > And setting all your debug messages to 0.
> >> > If you haven't done you can also lower your recovery settings a
> little.
> >> > osd recovery max active
> >> > osd max backfills
> >> >
> >> > You can also lower your file store threads.
> >> > filestore op threads
> >> >
> >> >
> >> > If you can also switch to bluestore from filestore. This will also
> >> > lower your CPU usage. I'm not sure that this is bluestore that does
> >> > it, but I'm seeing lower cpu usage when moving to bluestore + rocksdb
> >> > compared to filestore + leveldb .
> >> >
> >> >
> >> > On Wed, Feb 20, 2019 at 4:27 PM M Ranga Swami Reddy
> >> >  wrote:
> >> > >
> >> > > Thats expected from Ceph by design. But in our case, we are using
> all
> >> > > recommendation like rack failure domain, replication n/w,etc, still
> >> > > face client IO performance issues during one OSD down..
> >> > >
> >> > > On Tue, Feb 19, 2019 at 10:56 PM David Turner <
> drakonst...@gmail.com> wrote:
> >> > > >
> >> > > > With a RACK failure domain, you should be able to have an entire
> rack powered down without noticing any major impact on the clients.  I
> regularly take down OSDs and nodes for maintenance and upgrades without
> seeing any problems with client IO.
> >> > > >
> >> > > > On Tue, Feb 12, 2019 at 5:01 AM M Ranga Swami Reddy <
> swamire...@gmail.com> wrote:
> >> > > >>
> >> > > >> Hello - I have a couple of questions on ceph cluster stability,
> even
> >> > > >> we follow all recommendations as below:
> >> > > >> - Having separate replication n/w and data n/w
> >> > > >> - RACK is the failure domain
> >> > > >> - Using SSDs for journals (1:4ratio)
> >> > > >>
> >> > > >> Q1 - If one OSD down, cluster IO down drastically and customer
> Apps impacted.
> >> > > >> Q2 - what is stability ratio, like with above, is ceph cluster
> >> > > >> workable condition, if one osd down or one node down,etc.
> >> > > >>
> >> > > >> Thanks
> >> > > >> Swami
> >> > > >> ___
> >> > > >> ceph-users mailing list
> >> > > >> ceph-users@lists.ceph.com
> >> > > >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >> > > ___
> >> > > ceph-users mailing list
> >> > > ceph-users@lists.ceph.com
> >> > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] REQUEST_SLOW across many OSDs at the same time

2019-02-22 Thread David Turner
Can you correlate the times to scheduled tasks inside of any VMs? For
instance if you have several Linux VMs with the updatedb command installed
that by default they will all be scanning their disks at the same time each
day to see where files are. Other common culprits could be scheduled
backups, db cleanup, etc. Do you track cluster io at all? When I first
configured a graphing tool on my home cluster I found the updatedb/locate
command happening with a drastic io spike at the same time every day. I
also found a spike when a couple Windows VMs were checking for updates
automatically.
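A quick way to check for the usual suspect inside a Linux VM (paths assume a
stock mlocate install):

grep -r updatedb /etc/cron.daily /etc/cron.d 2>/dev/null
cat /etc/cron.daily/mlocate 2>/dev/null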

On Fri, Feb 22, 2019, 4:28 AM mart.v  wrote:

> Hello everyone,
>
> I'm experiencing a strange behaviour. My cluster is relatively small (43
> OSDs, 11 nodes), running Ceph 12.2.10 (and Proxmox 5). Nodes are connected
> via 10 Gbit network (Nexus 6000). Cluster is mixed (SSD and HDD), but with
> different pools. Descibed error is only on the SSD part of the cluster.
>
> I noticed that few times a day the cluster slows down a bit and I have
> discovered this in logs:
>
> 2019-02-22 08:21:20.064396 mon.node1 mon.0 172.16.254.101:6789/0 1794159
> : cluster [WRN] Health check failed: 27 slow requests are blocked > 32 sec.
> Implicated osds 10,22,33 (REQUEST_SLOW)
> 2019-02-22 08:21:26.589202 mon.node1 mon.0 172.16.254.101:6789/0 1794169
> : cluster [WRN] Health check update: 199 slow requests are blocked > 32
> sec. Implicated osds 0,4,5,6,7,8,9,10,12,16,17,19,20,21,22,25,26,33,41
> (REQUEST_SLOW)
> 2019-02-22 08:21:32.655671 mon.node1 mon.0 172.16.254.101:6789/0 1794183
> : cluster [WRN] Health check update: 448 slow requests are blocked > 32
> sec. Implicated osds
> 0,3,4,5,6,7,8,9,10,12,15,16,17,19,20,21,22,24,25,26,33,41 (REQUEST_SLOW)
> 2019-02-22 08:21:38.744210 mon.node1 mon.0 172.16.254.101:6789/0 1794210
> : cluster [WRN] Health check update: 388 slow requests are blocked > 32
> sec. Implicated osds 4,8,10,16,24,33 (REQUEST_SLOW)
> 2019-02-22 08:21:42.790346 mon.node1 mon.0 172.16.254.101:6789/0 1794214
> : cluster [INF] Health check cleared: REQUEST_SLOW (was: 18 slow requests
> are blocked > 32 sec. Implicated osds 8,16)
>
> "ceph health detail" shows nothing more
>
> It is happening through the whole day and the times can't be linked to any
> read or write intensive task (e.g. backup). I also tried to disable
> scrubbing, but it kept on going. These errors were not there since
> beginning, but unfortunately I cannot track the day they started (it is
> beyond my logs).
>
> Any ideas?
>
> Thank you!
> Martin
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph cluster stability

2019-02-22 Thread David Turner
What about the system stats on your mons during recovery? If they are
having a hard time keeping up with requests during a recovery, I could see
that impacting client io. What disks are they running on? CPU? Etc.
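Some quick things to capture on the mon hosts while a recovery is running
(assumes sysstat is installed and the mon id matches the short hostname):

iostat -x 5 3                              # per-disk latency and utilisation
uptime                                     # load average
ceph daemon mon.$(hostname -s) perf dump   # mon-side counters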

On Fri, Feb 22, 2019, 6:01 AM M Ranga Swami Reddy 
wrote:

> Debug setting defaults are using..like 1/5 and 0/5 for almost..
> Shall I try with 0 for all debug settings?
>
> On Wed, Feb 20, 2019 at 9:17 PM Darius Kasparavičius 
> wrote:
> >
> > Hello,
> >
> >
> > Check your CPU usage when you are doing those kind of operations. We
> > had a similar issue where our CPU monitoring was reporting fine < 40%
> > usage, but our load on the nodes was high mid 60-80. If it's possible
> > try disabling ht and see the actual cpu usage.
> > If you are hitting CPU limits you can try disabling crc on messages.
> > ms_nocrc
> > ms_crc_data
> > ms_crc_header
> >
> > And setting all your debug messages to 0.
> > If you haven't done you can also lower your recovery settings a little.
> > osd recovery max active
> > osd max backfills
> >
> > You can also lower your file store threads.
> > filestore op threads
> >
> >
> > If you can also switch to bluestore from filestore. This will also
> > lower your CPU usage. I'm not sure that this is bluestore that does
> > it, but I'm seeing lower cpu usage when moving to bluestore + rocksdb
> > compared to filestore + leveldb .
> >
> >
> > On Wed, Feb 20, 2019 at 4:27 PM M Ranga Swami Reddy
> >  wrote:
> > >
> > > Thats expected from Ceph by design. But in our case, we are using all
> > > recommendation like rack failure domain, replication n/w,etc, still
> > > face client IO performance issues during one OSD down..
> > >
> > > On Tue, Feb 19, 2019 at 10:56 PM David Turner 
> wrote:
> > > >
> > > > With a RACK failure domain, you should be able to have an entire
> rack powered down without noticing any major impact on the clients.  I
> regularly take down OSDs and nodes for maintenance and upgrades without
> seeing any problems with client IO.
> > > >
> > > > On Tue, Feb 12, 2019 at 5:01 AM M Ranga Swami Reddy <
> swamire...@gmail.com> wrote:
> > > >>
> > > >> Hello - I have a couple of questions on ceph cluster stability, even
> > > >> we follow all recommendations as below:
> > > >> - Having separate replication n/w and data n/w
> > > >> - RACK is the failure domain
> > > >> - Using SSDs for journals (1:4ratio)
> > > >>
> > > >> Q1 - If one OSD down, cluster IO down drastically and customer Apps
> impacted.
> > > >> Q2 - what is stability ratio, like with above, is ceph cluster
> > > >> workable condition, if one osd down or one node down,etc.
> > > >>
> > > >> Thanks
> > > >> Swami
> > > >> ___
> > > >> ceph-users mailing list
> > > >> ceph-users@lists.ceph.com
> > > >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > > ___
> > > ceph-users mailing list
> > > ceph-users@lists.ceph.com
> > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] faster switch to another mds

2019-02-20 Thread David Turner
If I'm not mistaken, if you stop them at the same time during a reboot on a
node with both mds and mon, the mons might receive it, but wait to finish
their own election vote before doing anything about it.  If you're trying
to keep optimal uptime for your mds, then stopping it first and on its own
makes sense.

On Wed, Feb 20, 2019 at 3:46 PM Patrick Donnelly 
wrote:

> On Tue, Feb 19, 2019 at 11:39 AM Fyodor Ustinov  wrote:
> >
> > Hi!
> >
> > From documentation:
> >
> > mds beacon grace
> > Description:The interval without beacons before Ceph declares an MDS
> laggy (and possibly replace it).
> > Type:   Float
> > Default:15
> >
> > I do not understand, 15 - are is seconds or beacons?
>
> seconds
>
> > And an additional misunderstanding - if we gently turn off the MDS (or
> MON), why it does not inform everyone interested before death - "I am
> turned off, no need to wait, appoint a new active server"
>
> The MDS does inform the monitors if it has been shutdown. If you pull
> the plug or SIGKILL, it does not. :)
>
>
> --
> Patrick Donnelly
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] faster switch to another mds

2019-02-19 Thread David Turner
It's also been mentioned a few times that when MDS and MON are on the same
host that the downtime for MDS is longer when both daemons stop at about
the same time.  It's been suggested to stop the MDS daemon, wait for `ceph
mds stat` to reflect the change, and then restart the rest of the server.
HTH.
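In practice that looks something like this (daemon id is an example):

systemctl stop ceph-mds@$(hostname -s)
ceph mds stat          # wait until a standby has taken over the rank
reboot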

On Mon, Feb 11, 2019 at 3:55 PM Gregory Farnum  wrote:

> You can't tell from the client log here, but probably the MDS itself was
> failing over to a new instance during that interval. There's not much
> experience with it, but you could experiment with faster failover by
> reducing the mds beacon and grace times. This may or may not work
> reliably...
>
> On Sat, Feb 9, 2019 at 10:52 AM Fyodor Ustinov  wrote:
>
>> Hi!
>>
>> I have ceph cluster with 3 nodes with mon/mgr/mds servers.
>> I reboot one node and see this in client log:
>>
>> Feb 09 20:29:14 ceph-nfs1 kernel: libceph: mon2 10.5.105.40:6789 socket
>> closed (con state OPEN)
>> Feb 09 20:29:14 ceph-nfs1 kernel: libceph: mon2 10.5.105.40:6789 session
>> lost, hunting for new mon
>> Feb 09 20:29:14 ceph-nfs1 kernel: libceph: mon0 10.5.105.34:6789 session
>> established
>> Feb 09 20:29:22 ceph-nfs1 kernel: libceph: mds0 10.5.105.40:6800 socket
>> closed (con state OPEN)
>> Feb 09 20:29:23 ceph-nfs1 kernel: libceph: mds0 10.5.105.40:6800 socket
>> closed (con state CONNECTING)
>> Feb 09 20:29:24 ceph-nfs1 kernel: libceph: mds0 10.5.105.40:6800 socket
>> closed (con state CONNECTING)
>> Feb 09 20:29:24 ceph-nfs1 kernel: libceph: mds0 10.5.105.40:6800 socket
>> closed (con state CONNECTING)
>> Feb 09 20:29:53 ceph-nfs1 kernel: ceph: mds0 reconnect start
>> Feb 09 20:29:53 ceph-nfs1 kernel: ceph: mds0 reconnect success
>> Feb 09 20:30:05 ceph-nfs1 kernel: ceph: mds0 recovery completed
>>
>> As I understand it, the following has happened:
>> 1. Client detects - link with mon server broken and fast switches to
>> another mon (less that 1 seconds).
>> 2. Client detects - link with mds server broken, 3 times trying reconnect
>> (unsuccessful), waiting and reconnects to the same mds after 30 seconds
>> downtime.
>>
>> I have 2 questions:
>> 1. Why?
>> 2. How to reduce switching time to another mds?
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] CephFS overwrite/truncate performance hit

2019-02-19 Thread David Turner
If your client needs to be able to handle the writes like that on its own,
RBDs might be the more appropriate use case.  You lose the ability to have
multiple clients accessing the data as easily as with CephFS, but you would
gain the features you're looking for.

On Tue, Feb 12, 2019 at 1:43 PM Gregory Farnum  wrote:

>
>
> On Tue, Feb 12, 2019 at 5:10 AM Hector Martin 
> wrote:
>
>> On 12/02/2019 06:01, Gregory Farnum wrote:
>> > Right. Truncates and renames require sending messages to the MDS, and
>> > the MDS committing to RADOS (aka its disk) the change in status, before
>> > they can be completed. Creating new files will generally use a
>> > preallocated inode so it's just a network round-trip to the MDS.
>>
>> I see. Is there a fundamental reason why these kinds of metadata
>> operations cannot be buffered in the client, or is this just the current
>> way they're implemented?
>>
>
> It's pretty fundamental, at least to the consistency guarantees we hold
> ourselves to. What happens if the client has buffered an update like that,
> performs writes to the data with those updates in mind, and then fails
> before they're flushed to the MDS? A local FS doesn't need to worry about a
> different node having a different lifetime, and can control the write order
> of its metadata and data updates on belated flush a lot more precisely than
> we can. :(
> -Greg
>
>
>>
>> e.g. on a local FS these kinds of writes can just stick around in the
>> block cache unflushed. And of course for CephFS I assume file extension
>> also requires updating the file size in the MDS, yet that doesn't block
>> while truncation does.
>>
>> > Going back to your first email, if you do an overwrite that is confined
>> > to a single stripe unit in RADOS (by default, a stripe unit is the size
>> > of your objects which is 4MB and it's aligned from 0), it is guaranteed
>> > to be atomic. CephFS can only tear writes across objects, and only if
>> > your client fails before the data has been flushed.
>>
>> Great! I've implemented this in a backwards-compatible way, so that gets
>> rid of this bottleneck. It's just a 128-byte flag file (formerly
>> variable length, now I just pad it to the full 128 bytes and rewrite it
>> in-place). This is good information to know for optimizing things :-)
>>
>> --
>> Hector Martin (hec...@marcansoft.com)
>> Public Key: https://mrcn.st/pub
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] CephFS: client hangs

2019-02-19 Thread David Turner
You're attempting to use mismatching client name and keyring.  You want to
use matching name and keyring.  For your example, you would want to either
use `--keyring /etc/ceph/ceph.client.admin.keyring --name client.admin` or
`--keyring /etc/ceph/ceph.client.cephfs.keyring --name client.cephfs`.
Mixing and matching does not work.  Treat them like username and password.
You wouldn't try to log into your computer under your account with the
admin password.
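For example, to mount with the cephfs identity (key path and monitor address
as in your own commands):

ceph auth get client.cephfs        # confirm the key and caps you are mounting with
ceph-fuse --id cephfs --keyring /etc/ceph/ceph.client.cephfs.keyring \
    -m 192.168.1.17:6789 /mnt/cephfs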

On Tue, Feb 19, 2019 at 12:58 PM Hennen, Christian <
christian.hen...@uni-trier.de> wrote:

> > sounds like network issue. are there firewall/NAT between nodes?
> No, there is currently no firewall in place. Nodes and clients are on the
> same network. MTUs match, ports are opened according to nmap.
>
> > try running ceph-fuse on the node that run mds, check if it works
> properly.
> When I try to run ceph-fuse on either a client or cephfiler1
> (MON,MGR,MDS,OSDs) I get
> - "operation not permitted" when using the client keyring
> - "invalid argument" when using the admin keyring
> - "ms_handle_refused" when using the admin keyring and connecting to
> 127.0.0.1:6789
>
> ceph-fuse --keyring /etc/ceph/ceph.client.admin.keyring --name
> client.cephfs -m 192.168.1.17:6789 /mnt/cephfs
>
> -----Original Message-----
> From: Yan, Zheng 
> Sent: Tuesday, 19 February 2019 11:31
> To: Hennen, Christian 
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] CephFS: client hangs
>
> On Tue, Feb 19, 2019 at 5:10 PM Hennen, Christian <
> christian.hen...@uni-trier.de> wrote:
> >
> > Hi!
> >
> > >mon_max_pg_per_osd = 400
> > >
> > >In the ceph.conf and then restart all the services / or inject the
> > >config into the running admin
> >
> > I restarted each server (MONs and OSDs weren’t enough) and now the
> health warning is gone. Still no luck accessing CephFS though.
> >
> >
> > > MDS show a client got evicted. Nothing else looks abnormal.  Do new
> > > cephfs clients also get evicted quickly?
> >
> > Aside from the fact that evicted clients don’t show up in ceph –s, we
> observe other strange things:
> >
> > ·   Setting max_mds has no effect
> >
> > ·   Ceph osd blacklist ls sometimes lists cluster nodes
> >
>
> sounds like network issue. are there firewall/NAT between nodes?
>
> > The only client that is currently running is ‚master1‘. It also hosts a
> MON and a MGR. Its syslog (https://gitlab.uni-trier.de/snippets/78) shows
> messages like:
> >
> > Feb 13 06:40:33 master1 kernel: [56165.943008] libceph: wrong peer,
> > want 192.168.1.17:6800/-2045158358, got 192.168.1.17:6800/1699349984
> >
> > Feb 13 06:40:33 master1 kernel: [56165.943014] libceph: mds1
> > 192.168.1.17:6800 wrong peer at address
> >
> > The other day I did the update from 12.2.8 to 12.2.11, which can also be
> seen in the logs. Again, there appeared these messages. I assume that’s
> normal operations since ports can change and daemons have to find each
> other again? But what about Feb 13 in the morning? I didn’t do any restarts
> then.
> >
> > Also, clients are printing messages like the following on the console:
> >
> > [1026589.751040] ceph: handle_cap_import: mismatched seq/mseq: ino
> > (1994988.fffe) mds0 seq1 mseq 15 importer mds1 has
> > peer seq 2 mseq 15
> >
> > [1352658.876507] ceph: build_path did not end path lookup where
> > expected, namelen is 23, pos is 0
> >
> > Oh, and btw, the ceph nodes are running on Ubuntu 16.04, clients are on
> 14.04 with kernel 4.4.0-133.
> >
>
> try running ceph-fuse on the node that run mds, check if it works properly.
>
>
> > For reference:
> >
> > > Cluster details: https://gitlab.uni-trier.de/snippets/77
> >
> > > MDS log:
> > > https://gitlab.uni-trier.de/snippets/79?expanded=true=simple)
> >
> >
> > Kind regards
> > Christian Hennen
> >
> > Project Manager Infrastructural Services ZIMK University of Trier
> > Germany
> >
> > From: Ashley Merrick 
> > Sent: Monday, 18 February 2019 16:53
> > To: Hennen, Christian 
> > Cc: ceph-users@lists.ceph.com
> > Subject: Re: [ceph-users] CephFS: client hangs
> >
> > Correct yes from my expirence OSD’s aswel.
> >
> > On Mon, 18 Feb 2019 at 11:51 PM, Hennen, Christian <
> christian.hen...@uni-trier.de> wrote:
> >
> > Hi!
> >
> > >mon_max_pg_per_osd = 400
> > >
> > >In the ceph.conf and then restart all the services / or inject the
> > >config into the running admin
> >
> > I restarted all MONs, but I assume the OSDs need to be restarted as well?
> >
> > > MDS show a client got evicted. Nothing else looks abnormal.  Do new
> > > cephfs clients also get evicted quickly?
> >
> > Yeah, it seems so. But strangely there is no indication of it in 'ceph
> > -s' or 'ceph health detail'. And they don't seem to be evicted
> > permanently? Right now, only 1 client is connected. The others are shut
> down since last week.
> > 'ceph osd blacklist ls' shows 0 entries.
> >
> >
> > Kind regards
> > Christian Hennen
> >
> > Project Manager Infrastructural Services ZIMK 

Re: [ceph-users] crush map has straw_calc_version=0 and legacy tunables on luminous

2019-02-19 Thread David Turner
[1] Here is a really cool set of slides from Ceph Day Berlin where Dan van
der Ster uses the mgr balancer module with upmap to gradually change the
tunables of a cluster without causing major client impact.  The down side
for you is that upmap requires all luminous or newer clients, but if you
upgrade your kernel clients to 4.13+, then you can enable upmap in the
cluster and utilize the balancer module to upgrade your cluster tunables.
As stated [2] here, those kernel versions still report as Jewel clients,
but only because they are missing some non-essential luminous client
features; they are fully compatible with the upmap feature and the other
required features.

As a side note to the balancer manager in upmap mode, it balances your
cluster in such a way that it attempts to evenly distribute all PGs for a
pool evenly across all OSDs.  So if you have 3 different pools, the PGs for
those pools should each be within 1 or 2 PG totals on every OSD in your
cluster... it's really cool.  The slides discuss how to get your cluster to
that point as well, in case you have modified your weights or reweights at
all.


[1]
https://www.slideshare.net/Inktank_Ceph/ceph-day-berlin-mastering-ceph-operations-upmap-and-the-mgr-balancer
[2]
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-November/031206.html
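Once all of the clients really are luminous+ (see [2] about the kernel
versions), turning this on is roughly:

ceph osd set-require-min-compat-client luminous
ceph balancer mode upmap
ceph balancer on
ceph balancer status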

On Mon, Feb 4, 2019 at 6:31 PM Shain Miley  wrote:

> For future reference I found these 2 links which answer most of the
> questions:
>
> http://docs.ceph.com/docs/master/rados/operations/crush-map/
>
>
> https://www.openstack.org/assets/presentation-media/Advanced-Tuning-and-Operation-guide-for-Block-Storage-using-Ceph-Boston-2017-final.pdf
>
>
>
> We have about 250TB (x3) in our cluster so I am leaning toward not
> changing things at this point because it sounds like there will be a
> significant amount of data movement involved for not a lot in return.
>
>
>
> If anyone knows of a strong reason I should change the tunables profile
> away from what I have…then please let me know so I don’t end up running the
> cluster in a sub-optimal state for no reason.
>
>
>
> Thanks,
>
> Shain
>
>
>
> --
>
> Shain Miley | Manager of Systems and Infrastructure, Digital Media |
> smi...@npr.org | 202.513.3649
>
>
>
> *From: *ceph-users  on behalf of Shain
> Miley 
> *Date: *Monday, February 4, 2019 at 3:03 PM
> *To: *"ceph-users@lists.ceph.com" 
> *Subject: *[ceph-users] crush map has straw_calc_version=0 and legacy
> tunables on luminous
>
>
>
> Hello,
>
> I just upgraded our cluster to 12.2.11 and I have a few questions around
> straw_calc_version and tunables.
>
> Currently ceph status shows the following:
>
> crush map has straw_calc_version=0
>
> crush map has legacy tunables (require argonaut, min is firefly)
>
>
>
>1. Will setting tunables to optimal also change the straw_calc_version
>or do I need to set that separately?
>
>
>2. Right now I have a set of rbd kernel clients connecting using
>kernel version 4.4.  The ‘ceph daemon mon.id sessions’ command shows
>that this client is still connecting using the hammer feature set (and a
>few others on jewel as well):
>
>"MonSession(client.113933130 10.35.100.121:0/3425045489 is open allow
>*, features 0x7fddff8ee8cbffb (jewel))",  “MonSession(client.112250505
>10.35.100.99:0/4174610322 is open allow *, features 0x106b84a842a42
>(hammer))",
>
>My question is what is the minimum kernel version I would need to
>upgrade the 4.4 kernel server to in order to get to jewel or luminous?
>
>
>
>3. Will setting the tunables to optimal on luminous prevent jewel and
>hammer clients from connecting?  I want to make sure I don’t do anything
>will prevent my existing clients from connecting to the cluster.
>
>
>
>
> Thanks in advance,
>
> Shain
>
>
>
> --
>
> Shain Miley | Manager of Systems and Infrastructure, Digital Media |
> smi...@npr.org | 202.513.3649
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph cluster stability

2019-02-19 Thread David Turner
With a RACK failure domain, you should be able to have an entire rack
powered down without noticing any major impact on the clients.  I regularly
take down OSDs and nodes for maintenance and upgrades without seeing any
problems with client IO.

On Tue, Feb 12, 2019 at 5:01 AM M Ranga Swami Reddy 
wrote:

> Hello - I have a couple of questions on ceph cluster stability, even
> we follow all recommendations as below:
> - Having separate replication n/w and data n/w
> - RACK is the failure domain
> - Using SSDs for journals (1:4ratio)
>
> Q1 - If one OSD down, cluster IO down drastically and customer Apps
> impacted.
> Q2 - what is stability ratio, like with above, is ceph cluster
> workable condition, if one osd down or one node down,etc.
>
> Thanks
> Swami
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Migrating a baremetal Ceph cluster into K8s + Rook

2019-02-19 Thread David Turner
Have you ever seen an example of a Ceph cluster being run and managed by
Rook?  It's a really cool idea and takes care of containerizing mons, rgw,
mds, etc that I've been thinking about doing anyway.  Having those
containerized means that you can upgrade all of the mon services before
any of your other daemons are even aware of a new Ceph version, even if
they're running on the same server.  There are some recent upgrade bugs for
small clusters with mons and osds on the same node that would have been
mitigated with containerized Ceph versions.  For putting OSDs in
containers, have you ever needed to run a custom compiled version of Ceph
for a few OSDs to get past a bug that was causing you some troubles?  With
OSDs in containers, you could do that without worrying about that version
of Ceph being used by any other OSDs.

On top of all of that, I keep feeling like a dinosaur for not understanding
Kubernetes better and have been really excited since seeing Rook
orchestrating a Ceph cluster in K8s.  I spun up a few VMs to start testing
configuring a Kubernetes cluster.  The Rook Slack channel recommended using
kubeadm to set up K8s to manage Ceph.

On Mon, Feb 18, 2019 at 11:50 AM Marc Roos  wrote:

>
> Why not just keep it bare metal? Especially with future ceph
> upgrading/testing. I am having centos7 with luminous and am running
> libvirt on the nodes aswell. If you configure them with a tls/ssl
> connection, you can even nicely migrate a vm, from one host/ceph node to
> the other.
> Next thing I am testing with is mesos, to use the ceph nodes to run
> containers. I am still testing this on some vm's, but looks like you
> have to install only a few rpms (maybe around 300MB) and 2 extra
> services on the nodes to get this up and running aswell. (But keep in
> mind that the help on their mailing list is not so good as here ;))
>
>
>
> -Original Message-
> From: David Turner [mailto:drakonst...@gmail.com]
> Sent: 18 February 2019 17:31
> To: ceph-users
> Subject: [ceph-users] Migrating a baremetal Ceph cluster into K8s + Rook
>
> I'm getting some "new" (to me) hardware that I'm going to upgrade my
> home Ceph cluster with.  Currently it's running a Proxmox cluster
> (Debian) which precludes me from upgrading to Mimic.  I am thinking
> about taking the opportunity to convert most of my VMs into containers
> and migrate my cluster into a K8s + Rook configuration now that Ceph is
> [1] stable on Rook.
>
> I haven't ever configured a K8s cluster and am planning to test this out
> on VMs before moving to it with my live data.  Has anyone done a
> migration from a baremetal Ceph cluster into K8s + Rook?  Additionally
> what is a good way for a K8s beginner to get into managing a K8s
> cluster.  I see various places recommend either CoreOS or kubeadm for
> starting up a new K8s cluster but I don't know the pros/cons for either.
>
> As far as migrating the Ceph services into Rook, I would assume that the
> process would be pretty simple to add/create new mons, mds, etc into
> Rook with the baremetal cluster details.  Once those are active and
> working just start decommissioning the services on baremetal.  For me,
> the OSD migration should be similar since I don't have any multi-device
> OSDs so I only need to worry about migrating individual disks between
> nodes.
>
>
> [1]
> https://blog.rook.io/rook-v0-9-new-storage-backends-in-town-ab952523ec53
>
>
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] [Bluestore] Some of my osd's uses BlueFS slow storage for db - why?

2019-02-19 Thread David Turner
I don't know that there's anything that can be done to resolve this yet
without rebuilding the OSD.  Based on a Nautilus tool being able to resize
the DB device, I'm assuming that Nautilus is also capable of migrating the
DB/WAL between devices.  That functionality would allow anyone to migrate
their DB back off of their spinner which is what's happening to you.  I
don't believe that sort of tooling exists yet, though, without compiling
the Nautilus Beta tooling for yourself.
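In the meantime you can at least watch the split between fast and slow DB
space per OSD (osd id is an example; jq is just for readability):

ceph daemon osd.73 perf dump | jq .bluefs
# compare db_used_bytes / db_total_bytes against slow_used_bytes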

On Tue, Feb 19, 2019 at 12:03 AM Konstantin Shalygin  wrote:

> On 2/18/19 9:43 PM, David Turner wrote:
> > Do you have historical data from these OSDs to see when/if the DB used
> > on osd.73 ever filled up?  To account for this OSD using the slow
> > storage for DB, all we need to do is show that it filled up the fast
> > DB at least once.  If that happened, then something spilled over to
> > the slow storage and has been there ever since.
>
> Yes, I have. Also I checked my JIRA records what I was do at this times
> and marked this on timeline: [1]
>
> Another graph compared osd.(33|73) for a last year: [2]
>
>
> [1] https://ibb.co/F7smCxW
>
> [1] https://ibb.co/dKWWDzW
>
> k
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Upgrade Luminous to mimic on Ubuntu 18.04

2019-02-18 Thread David Turner
Everybody is just confused that you don't have a newer version of Ceph
available. Are you running `apt-get dist-upgrade` to upgrade ceph? Do you
have any packages being held back? There is no reason that Ubuntu 18.04
shouldn't be able to upgrade to 12.2.11.
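A few things worth checking on one of the 18.04 nodes (package names are the
usual ones):

grep -r ceph /etc/apt/sources.list /etc/apt/sources.list.d/
apt-cache policy ceph-osd ceph-mon
apt-mark showhold
sudo apt-get update && sudo apt-get dist-upgrade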

On Mon, Feb 18, 2019, 4:38 PM  wrote:

> Hello people,
>
> On 11 February 2019 12:47:36 CET, c...@elchaka.de wrote:
> >Hello Ashley,
> >
>On 9 February 2019 17:30:31 CET, Ashley Merrick
>wrote:
> >>What does the output of apt-get update look like on one of the nodes?
> >>
> >>You can just list the lines that mention CEPH
> >>
> >
> >... .. .
> >Get:6 Https://Download.ceph.com/debian-luminous bionic InRelease [8393
> >B]
> >... .. .
> >
> >The Last available is 12.2.8.
>
> Any advice or recommends on how to proceed to be able to Update to
> mimic/(nautilus)?
>
> - Mehmet
> >
> >- Mehmet
> >
> >>Thanks
> >>
> >>On Sun, 10 Feb 2019 at 12:28 AM,  wrote:
> >>
> >>> Hello Ashley,
> >>>
> >>> Thank you for this fast response.
> >>>
> >>> I cannt prove this jet but i am using already cephs own repo for
> >>Ubuntu
> >>> 18.04 and this 12.2.7/8 is the latest available there...
> >>>
> >>> - Mehmet
> >>>
> >>> On 9 February 2019 17:21:32 CET, Ashley Merrick <
> >>> singap...@amerrick.co.uk> wrote:
> >>> >Around available versions, are you using the Ubuntu repo’s or the
> >>CEPH
> >>> >18.04 repo.
> >>> >
> >>> >The updates will always be slower to reach you if your waiting for
> >>it
> >>> >to
> >>> >hit the Ubuntu repo vs adding CEPH’s own.
> >>> >
> >>> >
> >>> >On Sun, 10 Feb 2019 at 12:19 AM,  wrote:
> >>> >
> >>> >> Hello m8s,
> >>> >>
> >>> >> Im curious how we should do an Upgrade of our ceph Cluster on
> >>Ubuntu
> >>> >> 16/18.04. As (At least on our 18.04 nodes) we only have 12.2.7
> >(or
> >>> >.8?)
> >>> >>
> >>> >> For an Upgrade to mimic we should First Update to Last version,
> >>> >actualy
> >>> >> 12.2.11 (iirc).
> >>> >> Which is not possible on 18.04.
> >>> >>
> >>> >> Is there a Update path from 12.2.7/8 to actual mimic release or
> >>> >better the
> >>> >> upcoming nautilus?
> >>> >>
> >>> >> Any advice?
> >>> >>
> >>> >> - Mehmet___
> >>> >> ceph-users mailing list
> >>> >> ceph-users@lists.ceph.com
> >>> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >>> >>
> >>> ___
> >>> ceph-users mailing list
> >>> ceph-users@lists.ceph.com
> >>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >>>
> >___
> >ceph-users mailing list
> >ceph-users@lists.ceph.com
> >http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] IRC channels now require registered and identified users

2019-02-18 Thread David Turner
Is this still broken in the 1-way direction where Slack users' comments do
not show up in IRC?  That would explain why nothing I ever type (either
helping someone or asking a question) ever gets a response.

On Tue, Dec 18, 2018 at 6:50 AM Joao Eduardo Luis  wrote:

> On 12/18/2018 11:22 AM, Joao Eduardo Luis wrote:
> > On 12/18/2018 11:18 AM, Dan van der Ster wrote:
> >> Hi Joao,
> >>
> >> Has that broken the Slack connection? I can't tell if its broken or
> >> just quiet... last message on #ceph-devel was today at 1:13am.
> >
> > Just quiet, it seems. Just tested it and the bridge is still working.
>
> Okay, turns out the ceph-ircslackbot user is not identified, and that
> makes it unable to send messages to the channel. This means the bridge
> is working in one direction only (irc to slack), and will likely break
> when/if the user leaves the channel (as it won't be able to get back in).
>
> I will figure out just how this works today. In the mean time, I've
> relaxed the requirement for registered/identified users so that the bot
> works again. It will be reactivated once this is addressed.
>
>   -Joao
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Migrating a baremetal Ceph cluster into K8s + Rook

2019-02-18 Thread David Turner
I'm getting some "new" (to me) hardware that I'm going to upgrade my home
Ceph cluster with.  Currently it's running a Proxmox cluster (Debian) which
precludes me from upgrading to Mimic.  I am thinking about taking the
opportunity to convert most of my VMs into containers and migrate my
cluster into a K8s + Rook configuration now that Ceph is [1] stable on Rook.

I haven't ever configured a K8s cluster and am planning to test this out on
VMs before moving to it with my live data.  Has anyone done a migration
from a baremetal Ceph cluster into K8s + Rook?  Additionally what is a good
way for a K8s beginner to get into managing a K8s cluster.  I see various
places recommend either CoreOS or kubeadm for starting up a new K8s cluster
but I don't know the pros/cons for either.

As far as migrating the Ceph services into Rook, I would assume that the
process would be pretty simple to add/create new mons, mds, etc into Rook
with the baremetal cluster details.  Once those are active and working just
start decommissioning the services on baremetal.  For me, the OSD migration
should be similar since I don't have any multi-device OSDs so I only need
to worry about migrating individual disks between nodes.


[1] https://blog.rook.io/rook-v0-9-new-storage-backends-in-town-ab952523ec53
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Intel P4600 3.2TB U.2 form factor NVMe firmware problems causing dead disks

2019-02-18 Thread David Turner
We have 2 clusters of [1] these disks that have 2 Bluestore OSDs per disk
(partitioned), 3 disks per node, 5 nodes per cluster.  The clusters are
12.2.4 running CephFS and RBDs.  So in total we have 15 NVMe's per cluster
and 30 NVMe's in total.  They were all built at the same time and were
running firmware version QDV10130.  On this firmware version we early on
had 2 disk failures, a few months later we had 1 more, and then a month
after that (just a few weeks ago) we had 7 disk failures in 1 week.

The failures are such that the disk is no longer visible to the OS.  This
holds true beyond server reboots as well as placing the failed disks into a
new server.  With a firmware upgrade tool we got an error that pretty much
said there's no way to get data back and to RMA the disk.  We upgraded all
of our remaining disks' firmware to QDV101D1 and haven't had any problems
since then.  Most of our failures happened while rebalancing the cluster
after replacing dead disks and we tested rigorously around that use case
after upgrading the firmware.  This firmware version seems to have resolved
whatever the problem was.
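
In case it helps anyone reading along, checking the running firmware on these
drives is roughly this (a sketch; assumes the nvme-cli and smartmontools
packages are installed, and the device name is just an example):

# list NVMe devices; the "FW Rev" column shows the running firmware
nvme list
# or per device
smartctl -a /dev/nvme0 | grep -i firmware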

We have about 100 more of these scattered among database servers and other
servers that have never had this problem while running the
QDV10130 firmware as well as firmwares between this one and the one we
upgraded to.  Bluestore on Ceph is the only use case we've had so far with
this sort of failure.

Has anyone else come across this issue before?  Our current theory is that
Bluestore is accessing the disk in a way that is triggering a bug in the
older firmware version that isn't triggered by more traditional
filesystems.  We have a scheduled call with Intel to discuss this, but
their preliminary searches into the bugfixes and known problems between
firmware versions didn't indicate the bug that we triggered.  It would be
good to have some more information about what those differences for disk
accessing might be to hopefully get a better answer from them as to what
the problem is.


[1]
https://www.intel.com/content/www/us/en/products/memory-storage/solid-state-drives/data-center-ssds/dc-p4600-series/dc-p4600-3-2tb-2-5inch-3d1.html
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Placing replaced disks to correct buckets.

2019-02-18 Thread David Turner
Also what commands did you run to remove the failed HDDs and the commands
you have so far run to add their replacements back in?
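
For reference, the usual form of those commands is something like this (the
OSD id, weight and host below are made up, not taken from your cluster):

# put the replacement OSD back under its host bucket in the CRUSH map
ceph osd crush add osd.12 7.27699 host=node3
# or, if the item already exists, adjust its position/weight
ceph osd crush set osd.12 7.27699 host=node3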

On Sat, Feb 16, 2019 at 9:55 PM Konstantin Shalygin  wrote:

> I recently replaced failed HDDs and removed them from their respective
> buckets as per procedure.
>
> But I’m now facing an issue when trying to place new ones back into the
> buckets. I’m getting an error of ‘osd nr not found’ OR ‘file or
> directory not found’ OR a command syntax error.
>
> I have been using the commands below:
>
> ceph osd crush set   
> ceph osd crush  set   
>
> I do however find the OSD number when i run command:
>
> ceph osd find 
>
> Your assistance/response to this will be highly appreciated.
>
> Regards
> John.
>
>
> Please, paste your `ceph osd tree`, your version and what exactly error
> you get include osd number.
>
> Less obfuscation is better in this, perhaps, simple case.
>
>
> k
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] [Bluestore] Some of my osd's uses BlueFS slow storage for db - why?

2019-02-18 Thread David Turner
Do you have historical data from these OSDs to see when/if the DB used on
osd.73 ever filled up?  To account for this OSD using the slow storage for
DB, all we need to do is show that it filled up the fast DB at least once.
If that happened, then something spilled over to the slow storage and has
been there ever since.
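
If you don't have historical metrics, the current BlueFS counters at least
show whether anything is sitting on the slow device right now; a quick sketch
(the OSD id is just an example, run it on the host carrying that OSD):

# non-zero slow_used_bytes means DB data has spilled onto the slow device
ceph daemon osd.73 perf dump bluefs | grep -E 'db_total_bytes|db_used_bytes|slow_used_bytes'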

On Sat, Feb 16, 2019 at 1:50 AM Konstantin Shalygin  wrote:

> On 2/16/19 12:33 AM, David Turner wrote:
> > The answer is probably going to be in how big your DB partition is vs
> > how big your HDD disk is.  From your output it looks like you have a
> > 6TB HDD with a 28GB Blocks.DB partition.  Even though the DB used size
> > isn't currently full, I would guess that at some point since this OSD
> > was created that it did fill up and what you're seeing is the part of
> > the DB that spilled over to the data disk. This is why the official
> > recommendation (that is quite cautious, but cautious because some use
> > cases will use this up) for a blocks.db partition is 4% of the data
> > drive.  For your 6TB disks that's a recommendation of 240GB per DB
> > partition.  Of course the actual size of the DB needed is dependent on
> > your use case.  But pretty much every use case for a 6TB disk needs a
> > bigger partition than 28GB.
>
>
> My current db size of osd.33 is 7910457344 bytes, and osd.73 is
> 2013265920+4685037568 bytes. 7544Mbyte (24.56% of db_total_bytes) vs
> 6388Mbyte (6.69% of db_total_bytes).
>
> Why is osd.33 not using the slow storage in this case?
>
>
>
> k
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] [Bluestore] Some of my osd's uses BlueFS slow storage for db - why?

2019-02-15 Thread David Turner
The answer is probably going to be in how big your DB partition is vs how
big your HDD disk is.  From your output it looks like you have a 6TB HDD
with a 28GB Blocks.DB partition.  Even though the DB used size isn't
currently full, I would guess that at some point since this OSD was created
that it did fill up and what you're seeing is the part of the DB that
spilled over to the data disk.  This is why the official recommendation
(that is quite cautious, but cautious because some use cases will use this
up) for a blocks.db partition is 4% of the data drive.  For your 6TB disks
that's a recommendation of 240GB per DB partition.  Of course the actual
size of the DB needed is dependent on your use case.  But pretty much every
use case for a 6TB disk needs a bigger partition than 28GB.

On Thu, Feb 14, 2019 at 11:58 PM Konstantin Shalygin  wrote:

> Wrong metadata paste of osd.73 in previous message.
>
>
> {
>
>  "id": 73,
>  "arch": "x86_64",
>  "back_addr": "10.10.10.6:6804/175338",
>  "back_iface": "vlan3",
>  "bluefs": "1",
>  "bluefs_db_access_mode": "blk",
>  "bluefs_db_block_size": "4096",
>  "bluefs_db_dev": "259:22",
>  "bluefs_db_dev_node": "nvme2n1",
>  "bluefs_db_driver": "KernelDevice",
>  "bluefs_db_model": "INTEL SSDPEDMD400G4 ",
>  "bluefs_db_partition_path": "/dev/nvme2n1p11",
>  "bluefs_db_rotational": "0",
>  "bluefs_db_serial": "CVFT4324002Q400BGN  ",
>  "bluefs_db_size": "30064771072",
>  "bluefs_db_type": "nvme",
>  "bluefs_single_shared_device": "0",
>  "bluefs_slow_access_mode": "blk",
>  "bluefs_slow_block_size": "4096",
>  "bluefs_slow_dev": "8:176",
>  "bluefs_slow_dev_node": "sdl",
>  "bluefs_slow_driver": "KernelDevice",
>  "bluefs_slow_model": "TOSHIBA HDWE160 ",
>  "bluefs_slow_partition_path": "/dev/sdl2",
>  "bluefs_slow_rotational": "1",
>  "bluefs_slow_size": "6001069199360",
>  "bluefs_slow_type": "hdd",
>  "bluefs_wal_access_mode": "blk",
>  "bluefs_wal_block_size": "4096",
>  "bluefs_wal_dev": "259:22",
>  "bluefs_wal_dev_node": "nvme2n1",
>  "bluefs_wal_driver": "KernelDevice",
>  "bluefs_wal_model": "INTEL SSDPEDMD400G4 ",
>  "bluefs_wal_partition_path": "/dev/nvme2n1p12",
>  "bluefs_wal_rotational": "0",
>  "bluefs_wal_serial": "CVFT4324002Q400BGN  ",
>  "bluefs_wal_size": "1073741824",
>  "bluefs_wal_type": "nvme",
>  "bluestore_bdev_access_mode": "blk",
>  "bluestore_bdev_block_size": "4096",
>  "bluestore_bdev_dev": "8:176",
>  "bluestore_bdev_dev_node": "sdl",
>  "bluestore_bdev_driver": "KernelDevice",
>  "bluestore_bdev_model": "TOSHIBA HDWE160 ",
>  "bluestore_bdev_partition_path": "/dev/sdl2",
>  "bluestore_bdev_rotational": "1",
>  "bluestore_bdev_size": "6001069199360",
>  "bluestore_bdev_type": "hdd",
>  "ceph_version": "ceph version 12.2.10
> (177915764b752804194937482a39e95e0ca3de94) luminous (stable)",
>  "cpu": "Intel(R) Xeon(R) CPU E5-2609 v4 @ 1.70GHz",
>  "default_device_class": "hdd",
>  "distro": "centos",
>  "distro_description": "CentOS Linux 7 (Core)",
>  "distro_version": "7",
>  "front_addr": "172.16.16.16:6803/175338",
>  "front_iface": "vlan4",
>  "hb_back_addr": "10.10.10.6:6805/175338",
>  "hb_front_addr": "172.16.16.16:6805/175338",
>  "hostname": "ceph-osd5",
>  "journal_rotational": "0",
>  "kernel_description": "#1 SMP Tue Aug 14 21:49:04 UTC 2018",
>  "kernel_version": "3.10.0-862.11.6.el7.x86_64",
>  "mem_swap_kb": "0",
>  "mem_total_kb": "65724256",
>  "os": "Linux",
>  "osd_data": "/var/lib/ceph/osd/ceph-73",
>  "osd_objectstore": "bluestore",
>  "rotational": "1"
> }
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] jewel10.2.11 EC pool out a osd, its PGs remap to the osds in the same host

2019-02-15 Thread David Turner
I'm leaving the response on the CRUSH rule for Gregory, but you have
another problem you're running into that is causing more of this data to
stay on this node than you intend.  While you `out` the OSD it is still
contributing to the Host's weight.  So the host is still set to receive
that amount of data and distribute it among the disks inside of it.  This
is the default behavior (even if you `destroy` the OSD) to minimize the
data movement for losing the disk and again for adding it back into the
cluster after you replace the device.  If you are really strapped for
space, though, then you might consider fully purging the OSD which will
reduce the Host weight to what the other OSDs are.  However if you do have
a problem in your CRUSH rule, then doing this won't change anything for you.
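
A rough sketch of the two options (the OSD id is an example; only do the
purge if you're sure you want it gone from the CRUSH map):

# drop the failed OSD's contribution to the host weight while keeping it in the map
ceph osd crush reweight osd.2 0
# or remove it from the cluster entirely (Luminous and later)
ceph osd purge 2 --yes-i-really-mean-it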

On Thu, Feb 14, 2019 at 11:15 PM hnuzhoulin2  wrote:

> Thanks. I read your reply in
> https://www.mail-archive.com/ceph-users@lists.ceph.com/msg48717.html
> so using indep will cause less data remapping when an OSD fails.
> using firstn: 1, 2, 3, 4, 5 -> 1, 2, 4, 5, 6 , 60% data remap
> using indep :1, 2, 3, 4, 5 -> 1, 2, 6, 4, 5, 25% data remap
>
> Am I right?
> If so, what is recommended when a disk fails and the total available
> size of the remaining disks in the machine is not enough (the failed disk
> cannot be replaced immediately)? Or should I reserve more spare capacity in an EC setup?
>
> On 02/14/2019 02:49, Gregory Farnum wrote:
>
> Your CRUSH rule for EC spools is forcing that behavior with the line
>
> step chooseleaf indep 1 type ctnr
>
> If you want different behavior, you’ll need a different crush rule.
>
> On Tue, Feb 12, 2019 at 5:18 PM hnuzhoulin2  wrote:
>
>> Hi, cephers
>>
>>
>> I am building a Ceph EC cluster. When a disk fails, I mark it out, but all of its
>> PGs remap to the OSDs in the same host, whereas I think they should remap to
>> other hosts in the same rack.
>> test process is:
>>
>> ceph osd pool create .rgw.buckets.data 8192 8192 erasure ISA-4-2
>> site1_sata_erasure_ruleset 4
>> ceph osd df tree|awk '{print $1" "$2" "$3" "$9" "$10}'> /tmp/1
>> /etc/init.d/ceph stop osd.2
>> ceph osd out 2
>> ceph osd df tree|awk '{print $1" "$2" "$3" "$9" "$10}'> /tmp/2
>> diff /tmp/1 /tmp/2 -y --suppress-common-lines
>>
>> 0 1.0 1.0 118 osd.0   | 0 1.0 1.0 126 osd.0
>> 1 1.0 1.0 123 osd.1   | 1 1.0 1.0 139 osd.1
>> 2 1.0 1.0 122 osd.2   | 2 1.0 0 0 osd.2
>> 3 1.0 1.0 113 osd.3   | 3 1.0 1.0 131 osd.3
>> 4 1.0 1.0 122 osd.4   | 4 1.0 1.0 136 osd.4
>> 5 1.0 1.0 112 osd.5   | 5 1.0 1.0 127 osd.5
>> 6 1.0 1.0 114 osd.6   | 6 1.0 1.0 128 osd.6
>> 7 1.0 1.0 124 osd.7   | 7 1.0 1.0 136 osd.7
>> 8 1.0 1.0 95 osd.8   | 8 1.0 1.0 113 osd.8
>> 9 1.0 1.0 112 osd.9   | 9 1.0 1.0 119 osd.9
>> TOTAL 3073T 197G | TOTAL 3065T 197G
>> MIN/MAX VAR: 0.84/26.56 | MIN/MAX VAR: 0.84/26.52
>>
>>
>> some config info: (detail configs see:
>> https://gist.github.com/hnuzhoulin/575883dbbcb04dff448eea3b9384c125)
>> jewel 10.2.11  filestore+rocksdb
>>
>> ceph osd erasure-code-profile get ISA-4-2
>> k=4
>> m=2
>> plugin=isa
>> ruleset-failure-domain=ctnr
>> ruleset-root=site1-sata
>> technique=reed_sol_van
>>
>> part of ceph.conf is:
>>
>> [global]
>> fsid = 1CAB340D-E551-474F-B21A-399AC0F10900
>> auth cluster required = cephx
>> auth service required = cephx
>> auth client required = cephx
>> pid file = /home/ceph/var/run/$name.pid
>> log file = /home/ceph/log/$cluster-$name.log
>> mon osd nearfull ratio = 0.85
>> mon osd full ratio = 0.95
>> admin socket = /home/ceph/var/run/$cluster-$name.asok
>> osd pool default size = 3
>> osd pool default min size = 1
>> osd objectstore = filestore
>> filestore merge threshold = -10
>>
>> [mon]
>> keyring = /home/ceph/var/lib/$type/$cluster-$id/keyring
>> mon data = /home/ceph/var/lib/$type/$cluster-$id
>> mon cluster log file = /home/ceph/log/$cluster.log
>> [osd]
>> keyring = /home/ceph/var/lib/$type/$cluster-$id/keyring
>> osd data = /home/ceph/var/lib/$type/$cluster-$id
>> osd journal = /home/ceph/var/lib/$type/$cluster-$id/journal
>> osd journal size = 1
>> osd mkfs type = xfs
>> osd mount options xfs = rw,noatime,nodiratime,inode64,logbsize=256k
>> osd backfill full ratio = 0.92
>> osd failsafe full ratio = 0.95
>> osd failsafe nearfull ratio = 0.85
>> osd max backfills = 1
>> osd crush update on start = false
>> osd op thread timeout = 60
>> filestore split multiple = 8
>> filestore max sync interval = 15
>> filestore min sync interval = 5
>> [osd.0]
>> host = cld-osd1-56
>> addr = X
>> user = ceph
>> devs = /disk/link/osd-0/data
>> osd journal = /disk/link/osd-0/journal
>> …….
>> [osd.503]
>> host = cld-osd42-56
>> addr = 10.108.87.52
>> user = ceph
>> devs = /disk/link/osd-503/data
>> osd journal = /disk/link/osd-503/journal
>>
>>
>> crushmap is below:
>>
>> # begin crush map
>> 

Re: [ceph-users] Problems with osd creation in Ubuntu 18.04, ceph 13.2.4-1bionic

2019-02-15 Thread David Turner
I have found that running a zap before all prepare/create commands with
ceph-volume helps things run smoother.  Zap is specifically there to clear
everything on a disk away to make the disk ready to be used as an OSD.
Your wipefs command is still fine, but then I would lvm zap the disk before
continuing.  I would run the commands like [1] this.  I also prefer the
single command lvm create as opposed to lvm prepare and lvm activate.  Try
that out and see if you still run into the problems creating the BlueStore
filesystem.

[1] ceph-volume lvm zap /dev/sdg
ceph-volume lvm prepare --bluestore --data /dev/sdg

On Thu, Feb 14, 2019 at 10:25 AM Rainer Krienke 
wrote:

> Hi,
>
> I am quite new to ceph and just try to set up a ceph cluster. Initially
> I used ceph-deploy for this but when I tried to create a BlueStore osd
> ceph-deploy fails. Next I tried the direct way on one of the OSD-nodes
> using ceph-volume to create the osd, but this also fails. Below you can
> see what  ceph-volume says.
>
> I ensured that there was no left over lvm VG and LV on the disk sdg
> before I started the osd creation for this disk. The very same error
> happens also on other disks not just for /dev/sdg. All the disk have 4TB
> in size and the linux system is Ubuntu 18.04 and finally ceph is
> installed in version 13.2.4-1bionic from this repo:
> https://download.ceph.com/debian-mimic.
>
> There is a VG and two LV's  on the system for the ubuntu system itself
> that is installed on two separate disks configured as software raid1 and
> lvm on top of the raid. But I cannot imagine that this might do any harm
> to cephs osd creation.
>
> Does anyone have an idea what might be wrong?
>
> Thanks for hints
> Rainer
>
> root@ceph1:~# wipefs -fa /dev/sdg
> root@ceph1:~# ceph-volume lvm prepare --bluestore --data /dev/sdg
> Running command: /usr/bin/ceph-authtool --gen-print-key
> Running command: /usr/bin/ceph --cluster ceph --name
> client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring
> -i - osd new 14d041d6-0beb-4056-8df2-3920e2febce0
> Running command: /sbin/vgcreate --force --yes
> ceph-1433ffd0-0a80-481a-91f5-d7a47b78e17b /dev/sdg
>  stdout: Physical volume "/dev/sdg" successfully created.
>  stdout: Volume group "ceph-1433ffd0-0a80-481a-91f5-d7a47b78e17b"
> successfully created
> Running command: /sbin/lvcreate --yes -l 100%FREE -n
> osd-block-14d041d6-0beb-4056-8df2-3920e2febce0
> ceph-1433ffd0-0a80-481a-91f5-d7a47b78e17b
>  stdout: Logical volume "osd-block-14d041d6-0beb-4056-8df2-3920e2febce0"
> created.
> Running command: /usr/bin/ceph-authtool --gen-print-key
> Running command: /bin/mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-0
> --> Absolute path not found for executable: restorecon
> --> Ensure $PATH environment variable contains common executable locations
> Running command: /bin/chown -h ceph:ceph
>
> /dev/ceph-1433ffd0-0a80-481a-91f5-d7a47b78e17b/osd-block-14d041d6-0beb-4056-8df2-3920e2febce0
> Running command: /bin/chown -R ceph:ceph /dev/dm-8
> Running command: /bin/ln -s
>
> /dev/ceph-1433ffd0-0a80-481a-91f5-d7a47b78e17b/osd-block-14d041d6-0beb-4056-8df2-3920e2febce0
> /var/lib/ceph/osd/ceph-0/block
> Running command: /usr/bin/ceph --cluster ceph --name
> client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring
> mon getmap -o /var/lib/ceph/osd/ceph-0/activate.monmap
>  stderr: got monmap epoch 1
> Running command: /usr/bin/ceph-authtool /var/lib/ceph/osd/ceph-0/keyring
> --create-keyring --name osd.0 --add-key
> AQAAY2VcU968HxAAvYWMaJZmriUc4H9bCCp8XQ==
>  stdout: creating /var/lib/ceph/osd/ceph-0/keyring
> added entity osd.0 auth auth(auid = 18446744073709551615
> key=AQAAY2VcU968HxAAvYWMaJZmriUc4H9bCCp8XQ== with 0 caps)
> Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-0/keyring
> Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-0/
> Running command: /usr/bin/ceph-osd --cluster ceph --osd-objectstore
> bluestore --mkfs -i 0 --monmap /var/lib/ceph/osd/ceph-0/activate.monmap
> --keyfile - --osd-data /var/lib/ceph/osd/ceph-0/ --osd-uuid
> 14d041d6-0beb-4056-8df2-3920e2febce0 --setuser ceph --setgroup ceph
>  stderr: 2019-02-14 13:45:54.788 7f3fcecb3240 -1
> bluestore(/var/lib/ceph/osd/ceph-0/) _read_fsid unparsable uuid
>  stderr: /build/ceph-13.2.4/src/os/bluestore/KernelDevice.cc: In
> function 'virtual int KernelDevice::read(uint64_t, uint64_t,
> ceph::bufferlist*, IOContext*, bool)' thread 7f3fcecb3240 time
> 2019-02-14 13:45:54.841130
>  stderr: /build/ceph-13.2.4/src/os/bluestore/KernelDevice.cc: 821:
> FAILED assert((uint64_t)r == len)
>  stderr: ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e)
> mimic (stable)
>  stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int,
> char const*)+0x102) [0x7f3fc60d33e2]
>  stderr: 2: (()+0x26d5a7) [0x7f3fc60d35a7]
>  stderr: 3: (KernelDevice::read(unsigned long, unsigned long,
> ceph::buffer::list*, IOContext*, bool)+0x4a7) [0x561371346817]
>  stderr: 4: 

Re: [ceph-users] [Ceph-community] Deploy and destroy monitors

2019-02-13 Thread David Turner
Ceph-users is the proper ML to post questions like this.

On Thu, Dec 20, 2018 at 2:30 PM Joao Eduardo Luis  wrote:

> On 12/20/2018 04:55 PM, João Aguiar wrote:
> > I am having an issue with "ceph-ceploy mon”
> >
> > I started by creating a cluster with one monitor with "create-deploy
> new"… "create-initial”...
> > And ended up with ceph,conf like:
> > ...
> > mon_initial_members = node0
> > mon_host = 10.2.2.2
> > ….
> >
> > Later I try to deploy a new monitor (ceph-deploy mon create node1),
> wait for it to get in quorum and then destroy the node0 (ceph-deploy mon
> destroy node0).
>
> Is the new monitor forming a quorum with the existing monitor? If not,
> then you won't have monitors running when you remove node0.
>
> Does ceph-deploy remove the mon being destroyed from the monmap? If not,
> you'll have two monitors in the monmap, and you'll need a majority to
> form quorum; for a 2 monitor deployment that means you'll need 2
> monitors up and running.
>
> > Result: Ceph gets unresponsive.
>
> This is the typical symptom of absence of a quorum.
>
>   -Joao
> ___
> Ceph-community mailing list
> ceph-commun...@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-community-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] [Ceph-community] Ceph SSE-KMS integration to use Safenet as Key Manager service

2019-02-13 Thread David Turner
Ceph-users is the correct ML to post questions like this.

On Wed, Jan 2, 2019 at 5:40 PM Rishabh S  wrote:

> Dear Members,
>
> Please let me know if you have any link with examples/detailed steps of
> Ceph-Safenet(KMS) integration.
>
> Thanks & Regards,
> Rishabh
>
> ___
> Ceph-community mailing list
> ceph-commun...@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-community-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] [Ceph-community] Error during playbook deployment: TASK [ceph-mon : test if rbd exists]

2019-02-13 Thread David Turner
Ceph-users ML is the proper mailing list for questions like this.

On Sat, Jan 26, 2019 at 12:31 PM Meysam Kamali  wrote:

> Hi Ceph Community,
>
> I am using ansible 2.2 and ceph branch stable-2.2, on centos7, to deploy
> the playbook. But the deployment get hangs in this step "TASK [ceph-mon :
> test if rbd exists]". it gets hangs there and doesnot move.
> I have all the three ceph nodes ceph-admin, ceph-mon, ceph-osd
> I appreciate any help! Here I am providing log:
>
> ---Log --
> TASK [ceph-mon : test if rbd exists]
> ***
> task path: /root/ceph-ansible/roles/ceph-mon/tasks/ceph_keys.yml:60
> Using module file
> /usr/lib/python2.7/site-packages/ansible/modules/core/commands/command.py
>  ESTABLISH SSH CONNECTION FOR USER: None
>  SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o
> KbdInteractiveAuthentication=no -o
> PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey
> -o PasswordAuthentication=no -o ConnectTimeout=10 -o
> ControlPath=/root/.ansible/cp/%h-%r ceph2mon '/bin/sh -c '"'"'echo ~ &&
> sleep 0'"'"''
>  ESTABLISH SSH CONNECTION FOR USER: None
>  SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o
> KbdInteractiveAuthentication=no -o
> PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey
> -o PasswordAuthentication=no -o ConnectTimeout=10 -o
> ControlPath=/root/.ansible/cp/%h-%r ceph2mon '/bin/sh -c '"'"'( umask 77 &&
> mkdir -p "` echo
> /root/.ansible/tmp/ansible-tmp-1547740115.56-213823795856896 `" && echo
> ansible-tmp-1547740115.56-213823795856896="` echo
> /root/.ansible/tmp/ansible-tmp-1547740115.56-213823795856896 `" ) && sleep
> 0'"'"''
>  PUT /tmp/tmpG7u1eN TO
> /root/.ansible/tmp/ansible-tmp-1547740115.56-213823795856896/command.py
>  SSH: EXEC sftp -b - -C -o ControlMaster=auto -o
> ControlPersist=60s -o KbdInteractiveAuthentication=no -o
> PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey
> -o PasswordAuthentication=no -o ConnectTimeout=10 -o
> ControlPath=/root/.ansible/cp/%h-%r '[ceph2mon]'
>  ESTABLISH SSH CONNECTION FOR USER: None
>  SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o
> KbdInteractiveAuthentication=no -o
> PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey
> -o PasswordAuthentication=no -o ConnectTimeout=10 -o
> ControlPath=/root/.ansible/cp/%h-%r ceph2mon '/bin/sh -c '"'"'chmod u+x
> /root/.ansible/tmp/ansible-tmp-1547740115.56-213823795856896/
> /root/.ansible/tmp/ansible-tmp-1547740115.56-213823795856896/command.py &&
> sleep 0'"'"''
>  ESTABLISH SSH CONNECTION FOR USER: None
>  SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o
> KbdInteractiveAuthentication=no -o
> PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey
> -o PasswordAuthentication=no -o ConnectTimeout=10 -o
> ControlPath=/root/.ansible/cp/%h-%r -tt ceph2mon '/bin/sh -c '"'"'sudo -H
> -S -n -u root /bin/sh -c '"'"'"'"'"'"'"'"'echo
> BECOME-SUCCESS-iefqzergptqzfhqmxouabfjfvdvbadku; /usr/bin/python
> /root/.ansible/tmp/ansible-tmp-1547740115.56-213823795856896/command.py; rm
> -rf "/root/.ansible/tmp/ansible-tmp-1547740115.56-213823795856896/" >
> /dev/null 2>&1'"'"'"'"'"'"'"'"' && sleep 0'"'"''
>
> -
>
>
> Thanks,
> Meysam Kamali
> ___
> Ceph-community mailing list
> ceph-commun...@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-community-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] [Ceph-community] Need help related to ceph client authentication

2019-02-13 Thread David Turner
The Ceph-users ML is the correct list to ask questions like this.  Did you
figure out the problems/questions you had?

On Tue, Dec 4, 2018 at 11:39 PM Rishabh S  wrote:

> Hi Gaurav,
>
> Thank You.
>
> Yes, I am using boto, though I was looking for suggestions on how my ceph
> client should get access and secret keys.
>
> Another thing where I need help is regarding encryption
> http://docs.ceph.com/docs/mimic/radosgw/encryption/#
>
> I am little confused what does these statement means.
>
> The Ceph Object Gateway supports server-side encryption of uploaded
> objects, with 3 options for the management of encryption keys. Server-side
> encryption means that the data is sent over HTTP in its unencrypted form,
> and the Ceph Object Gateway stores that data in the Ceph Storage Cluster in
> encrypted form.
>
> Note
>
>
> Requests for server-side encryption must be sent over a secure HTTPS
> connection to avoid sending secrets in plaintext.
>
> CUSTOMER-PROVIDED KEYS
> 
>
> In this mode, the client passes an encryption key along with each request
> to read or write encrypted data. It is the client’s responsibility to
> manage those keys and remember which key was used to encrypt each object.
>
> My understanding is when ceph client is trying to upload a file/object to
> Ceph cluster then client request should be https and will include
>  “customer-provided-key”.
> Then Ceph will use customer-provided-key to encrypt file/object before
> storing data into Ceph cluster.
>
> Please correct and suggest best approach to store files/object in Ceph
> cluster.
>
> Any code example of initial handshake to upload a file/object with
> encryption-key will be of great help.
>
> Regards,
> Rishabh
>
>
> On 05-Dec-2018, at 2:15 AM, Gaurav Sitlani 
> wrote:
>
> Hi Rishabh,
> You can refer the ceph RGW doc and search for boto :
> http://docs.ceph.com/docs/master/install/install-ceph-gateway/?highlight=boto
> You can get a basic python boto script where you can mention your access
> and secret key and connect to your S3 cluster.
> I hope you know how to get your keys right.
>
> Regards,
> Gaurav Sitlani
>
>
> ___
> Ceph-community mailing list
> ceph-commun...@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-community-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] all vms can not start up when boot all the ceph hosts.

2019-02-13 Thread David Turner
This might not be a Ceph issue at all depending on if you're using any sort
of caching.  If you have caching on your disk controllers at all, then the
write might have happened to the cache but never made it to the OSD disks
which would show up as problems on the VM RBDs.  Make sure you have proper
BBU's on your disk controllers and/or disable caching that might be enabled
on your controllers or disks that could be benefiting you with write speed
while the cluster is healthy, but potentially causing you to run into this
state during a catastrophe.
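
As a starting point, something like this shows whether the on-disk volatile
write cache is enabled (a sketch only; the device name is an example, and
controller/BBU settings depend on your vendor tooling):

# query the drive's write-cache flag
hdparm -W /dev/sda
# disable it if there is no battery/flash-backed cache protecting it
hdparm -W 0 /dev/sda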

On Tue, Dec 4, 2018 at 10:49 PM linghucongsong 
wrote:

>
> Thanks to all! I might have found the reason.
>
> It looks like it is related to the bug below.
>
> https://bugs.launchpad.net/nova/+bug/1773449
>
>
>
>
> At 2018-12-04 23:42:15, "Ouyang Xu"  wrote:
>
> Hi linghucongsong:
>
> I have got this issue before, you can try to fix it as below:
>
> 1. use *rbd lock ls* to get the lock for the vm
> 2. use *rbd lock rm* to remove that lock for the vm
> 3. start vm again
>
> hope that can help you.
>
> regards,
>
> Ouyang
>
> On 2018/12/4 4:48 PM, linghucongsong wrote:
>
> Hi all!
>
> I have a Ceph test environment used with OpenStack; there are some VMs
> running on the OpenStack side. It is just a test environment.
>
> My Ceph version is 12.2.4. The other day I rebooted all the Ceph hosts,
> and before doing this I did not shut down the VMs on OpenStack.
>
> When all the hosts booted up and Ceph became healthy again, I found that the
> VMs could not start up. All the VMs show the xfs error below, and even
> xfs_repair cannot repair the problem.
>
> It is just a test environment so the data is not important to me. I know
> Ceph version 12.2.4 is not stable enough, but how can it have such serious
> problems? Other people may want to be aware of this. Thanks to all. :)
>
>
>
>
>
> ___
> ceph-users mailing 
> listceph-us...@lists.ceph.comhttp://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] how to mount one of the cephfs namespace using ceph-fuse?

2019-02-13 Thread David Turner
Note that this format in fstab does require a certain version of util-linux
because of the funky format of the line.  Pretty much it maps all command
line options at the beginning of the line separated with commas.

On Wed, Feb 13, 2019 at 2:10 PM David Turner  wrote:

> I believe the fstab line for ceph-fuse in this case would look something
> like [1] this.  We use a line very similar to that to mount cephfs at a
> specific client_mountpoint that the specific cephx user only has access to.
>
> [1] id=acapp3,client_mds_namespace=fs1   /tmp/ceph   fuse.ceph
>  defaults,noatime,_netdev 0 2
>
> On Tue, Dec 4, 2018 at 3:22 AM Zhenshi Zhou  wrote:
>
>> Hi
>>
>> I can use this mount cephfs manually. But how to edit fstab so that the
>> system will auto-mount cephfs by ceph-fuse?
>>
>> Thanks
>>
>> Yan, Zheng wrote on Tue, Nov 20, 2018 at 8:08 PM:
>>
>>> ceph-fuse --client_mds_namespace=xxx
>>> On Tue, Nov 20, 2018 at 7:33 PM ST Wong (ITSC)  wrote:
>>> 
>>>  Hi all,
>>> 
>>> 
>>> 
>>>  We’re using mimic and enabled multiple fs flag. We can do
>>> kernel mount of particular fs (e.g. fs1) with mount option
>>> mds_namespace=fs1.However, this is not working for ceph-fuse:
>>> 
>>> 
>>> 
>>>  #ceph-fuse -n client.acapp3 -o mds_namespace=fs1 /tmp/ceph
>>> 
>>>  2018-11-20 19:30:35.246 7ff5653edcc0 -1 init, newargv =
>>> 0x5564a21633b0 newargc=9
>>> 
>>>  fuse: unknown option `mds_namespace=fs1'
>>> 
>>>  ceph-fuse[3931]: fuse failed to start
>>> 
>>>  2018-11-20 19:30:35.264 7ff5653edcc0 -1 fuse_lowlevel_new failed
>>> 
>>> 
>>> 
>>>  Sorry that I can’t find the correct option in ceph-fuse man page or
>>> doc.
>>> 
>>>  Please help.   Thanks a lot.
>>> 
>>> 
>>> 
>>>  Best Rgds
>>> 
>>>  /stwong
>>> 
>>>  ___
>>>  ceph-users mailing list
>>>  ceph-users@lists.ceph.com
>>>  http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>> ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] how to mount one of the cephfs namespace using ceph-fuse?

2019-02-13 Thread David Turner
I believe the fstab line for ceph-fuse in this case would look something
like [1] this.  We use a line very similar to that to mount cephfs at a
specific client_mountpoint that the specific cephx user only has access to.

[1] id=acapp3,client_mds_namespace=fs1   /tmp/ceph   fuse.ceph
 defaults,noatime,_netdev 0 2

On Tue, Dec 4, 2018 at 3:22 AM Zhenshi Zhou  wrote:

> Hi
>
> I can use this mount cephfs manually. But how to edit fstab so that the
> system will auto-mount cephfs by ceph-fuse?
>
> Thanks
>
> Yan, Zheng wrote on Tue, Nov 20, 2018 at 8:08 PM:
>
>> ceph-fuse --client_mds_namespace=xxx
>> On Tue, Nov 20, 2018 at 7:33 PM ST Wong (ITSC)  wrote:
>> 
>>  Hi all,
>> 
>> 
>> 
>>  We’re using mimic and enabled multiple fs flag. We can do
>> kernel mount of particular fs (e.g. fs1) with mount option
>> mds_namespace=fs1.However, this is not working for ceph-fuse:
>> 
>> 
>> 
>>  #ceph-fuse -n client.acapp3 -o mds_namespace=fs1 /tmp/ceph
>> 
>>  2018-11-20 19:30:35.246 7ff5653edcc0 -1 init, newargv =
>> 0x5564a21633b0 newargc=9
>> 
>>  fuse: unknown option `mds_namespace=fs1'
>> 
>>  ceph-fuse[3931]: fuse failed to start
>> 
>>  2018-11-20 19:30:35.264 7ff5653edcc0 -1 fuse_lowlevel_new failed
>> 
>> 
>> 
>>  Sorry that I can’t find the correct option in ceph-fuse man page or
>> doc.
>> 
>>  Please help.   Thanks a lot.
>> 
>> 
>> 
>>  Best Rgds
>> 
>>  /stwong
>> 
>>  ___
>>  ceph-users mailing list
>>  ceph-users@lists.ceph.com
>>  http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] compacting omap doubles its size

2019-02-13 Thread David Turner
Sorry for the late response on this, but life has been really busy over the
holidays.

We compact our omaps offline with the ceph-kvstore-tool.  Here [1] is a
copy of the script that we use for our clusters.  You might need to modify
things a bit for your environment.  I don't remember which version this
functionality was added to ceph-kvstore-tool, but it exists in 12.2.4.  We
need to do this because our OSDs get marked out when they try to compact
their own omaps online.  We run this script monthly and then ad-hoc as we
find OSDs compacting their own omaps live.


[1] https://gist.github.com/drakonstein/4391c0b268a35b64d4f26a12e5058ba9
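
For anyone who just wants the basic idea without the full script, an offline
compaction of a single filestore OSD's omap looks roughly like this (the OSD
id, omap backend and paths are examples; adjust for your setup):

# stop the OSD, compact its omap offline, then start it again
systemctl stop ceph-osd@0
ceph-kvstore-tool rocksdb /var/lib/ceph/osd/ceph-0/current/omap compact
systemctl start ceph-osd@0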

On Thu, Nov 29, 2018 at 6:15 PM Tomasz Płaza 
wrote:

> Hi,
>
> I have a ceph 12.2.8 cluster on filestore with rather large omap dirs
> (avg size is about 150G). Recently slow requests became a problem, so
> after some digging I decided to convert omap from leveldb to rocksdb.
> Conversion went fine and the slow request rate went down to an acceptable
> level. Unfortunately, the conversion did not shrink most of the omap dirs, so I
> tried online compaction:
>
> Before compaction: 50G/var/lib/ceph/osd/ceph-0/current/omap/
>
> After compaction: 100G/var/lib/ceph/osd/ceph-0/current/omap/
>
> Purge and recreate: 1.5G /var/lib/ceph/osd/ceph-0/current/omap/
>
>
> Before compaction: 135G/var/lib/ceph/osd/ceph-5/current/omap/
>
> After compaction: 260G/var/lib/ceph/osd/ceph-5/current/omap/
>
> Purge and recreate: 2.5G /var/lib/ceph/osd/ceph-5/current/omap/
>
>
> Compaction that makes the omap bigger seems quite weird and
> frustrating to me. Please help.
>
>
> P.S. My cluster suffered from ongoing index reshards (it is disabled
> now) and on many buckets with 4m+ objects I have a lot of old indexes:
>
> 634   bucket1
> 651   bucket2
>
> ...
> 1231 bucket17
> 1363 bucket18
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] backfill_toofull while OSDs are not full

2019-01-30 Thread David Zafman


Strange, I can't reproduce this with v13.2.4.  I tried the following 
scenarios:


pg acting 1, 0, 2 -> up 1, 0, 4 (osd.2 marked out).  The df on osd.2 
shows 0 space, but only osd.4 (backfill target) checks full space.


pg acting 1, 0, 2 -> up 4,3,5 (osd.1,0,2 all marked out).  The df for 
osd.1,0,2 shows 0 space but osd.4,3,5 (backfill targets) check full space.


FYI, in a later release, even when a backfill target is below 
backfillfull_ratio, backfill_toofull occurs if there isn't enough room for 
the PG to fit.



The question in your case is whether any of OSDs 999, 1900, or 145 was above 
90% (backfillfull_ratio) usage.
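
Something along these lines would show it (a sketch; run while the PG is stuck):

# utilisation of the three backfill targets
ceph osd df | egrep '^ *(145|999|1900) '
# and the configured ratios
ceph osd dump | grep -i ratio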


David

On 1/27/19 11:34 PM, Wido den Hollander wrote:


On 1/25/19 8:33 AM, Gregory Farnum wrote:

This doesn’t look familiar to me. Is the cluster still doing recovery so
we can at least expect them to make progress when the “out” OSDs get
removed from the set?

The recovery has already finished. It resolved itself, but in the
meantime I saw many PGs in the backfill_toofull state for a long time.

This is new since Mimic.

Wido


On Tue, Jan 22, 2019 at 2:44 PM Wido den Hollander wrote:

 Hi,

 I've got a couple of PGs which are stuck in backfill_toofull, but none
 of them are actually full.

   "up": [
     999,
     1900,
     145
   ],
   "acting": [
     701,
     1146,
     1880
   ],
   "backfill_targets": [
     "145",
     "999",
     "1900"
   ],
   "acting_recovery_backfill": [
     "145",
     "701",
     "999",
     "1146",
     "1880",
     "1900"
   ],

 I checked all these OSDs, but they are all <75% utilization.

 full_ratio 0.95
 backfillfull_ratio 0.9
 nearfull_ratio 0.9

 So I started checking all the PGs and I've noticed that each of these
 PGs has one OSD in the 'acting_recovery_backfill' which is marked as
 out.

 In this case osd.1880 is marked as out and thus it's capacity is shown
 as zero.

 [ceph@ceph-mgr ~]$ ceph osd df|grep 1880
 1880   hdd 4.54599        0     0 B      0 B      0 B     0    0  27
 [ceph@ceph-mgr ~]$

 This is on a Mimic 13.2.4 cluster. Is this expected or is this a unknown
 side-effect of one of the OSDs being marked as out?

 Thanks,

 Wido
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] CEPH_FSAL Nfs-ganesha

2019-01-30 Thread David C
Hi Patrick

Thanks for the info. If I did multiple exports, how does that work in terms
of the cache settings defined in ceph.conf? Are those settings per CephFS
client or a shared cache? I.e. if I've defined client_oc_size, would that
be per export?
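
For context, the sort of layout I'm considering looks roughly like this in
ganesha.conf (the paths, export IDs and cephx user are placeholders, not my
real config):

EXPORT {
    Export_Id = 1;
    Path = /dir1;
    Pseudo = /dir1;
    Access_Type = RW;
    FSAL {
        Name = CEPH;
        User_Id = "nfs";
    }
}
# ...and a second EXPORT block with Export_Id = 2, Path = /dir2, and so on.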

Cheers,

On Tue, Jan 15, 2019 at 6:47 PM Patrick Donnelly 
wrote:

> On Mon, Jan 14, 2019 at 7:11 AM Daniel Gryniewicz  wrote:
> >
> > Hi.  Welcome to the community.
> >
> > On 01/14/2019 07:56 AM, David C wrote:
> > > Hi All
> > >
> > > I've been playing around with the nfs-ganesha 2.7 exporting a cephfs
> > > filesystem, it seems to be working pretty well so far. A few questions:
> > >
> > > 1) The docs say " For each NFS-Ganesha export, FSAL_CEPH uses a
> > > libcephfs client,..." [1]. For arguments sake, if I have ten top level
> > > dirs in my Cephfs namespace, is there any value in creating a separate
> > > export for each directory? Will that potentially give me better
> > > performance than a single export of the entire namespace?
> >
> > I don't believe there are any advantages from the Ceph side.  From the
> > Ganesha side, you configure permissions, client ACLs, squashing, and so
> > on on a per-export basis, so you'll need different exports if you need
> > different settings for each top level directory.  If they can all use
> > the same settings, one export is probably better.
>
> There may be performance impact (good or bad) with having separate
> exports for CephFS. Each export instantiates a separate instance of
> the CephFS client which has its own bookkeeping and set of
> capabilities issued by the MDS. Also, each client instance has a
> separate big lock (potentially a big deal for performance). If the
> data for each export is disjoint (no hard links or shared inodes) and
> the NFS server is expected to have a lot of load, breaking out the
> exports can have a positive impact on performance. If there are hard
> links, then the clients associated with the exports will potentially
> fight over capabilities which will add to request latency.)
>
> --
> Patrick Donnelly
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] How To Properly Failover a HA Setup

2019-01-21 Thread David C
It could also be the kernel client versions, what are you running? I
remember older kernel clients didn't always deal with recovery scenarios
very well.

On Mon, Jan 21, 2019 at 9:18 AM Marc Roos  wrote:

>
>
> I think his downtime is coming from the MDS failover; that takes a while
> in my case too. But I am not using CephFS that much yet.
>
>
>
> -Original Message-
> From: Robert Sander [mailto:r.san...@heinlein-support.de]
> Sent: 21 January 2019 10:05
> To: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] How To Properly Failover a HA Setup
>
> On 21.01.19 09:22, Charles Tassell wrote:
> > Hello Everyone,
> >
> >I've got a 3 node Jewel cluster setup, and I think I'm missing
> > something.  When I want to take one of my nodes down for maintenance
> > (kernel upgrades or the like) all of my clients (running the kernel
> > module for the cephfs filesystem) hang for a couple of minutes before
> > the redundant servers kick in.
>
> Have you set the noout flag before doing cluster maintenance?
>
> ceph osd set noout
>
> and afterwards
>
> ceph osd unset noout
>
> Regards
> --
> Robert Sander
> Heinlein Support GmbH
> Schwedter Str. 8/9b, 10119 Berlin
>
> https://www.heinlein-support.de
>
> Tel: 030 / 405051-43
> Fax: 030 / 405051-19
>
> Amtsgericht Berlin-Charlottenburg - HRB 93818 B
> Geschäftsführer: Peer Heinlein - Sitz: Berlin
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] CephFS - Small file - single thread - read performance.

2019-01-18 Thread David C
On Fri, 18 Jan 2019, 14:46 Marc Roos wrote:
>
> [@test]# time cat 50b.img > /dev/null
>
> real0m0.004s
> user0m0.000s
> sys 0m0.002s
> [@test]# time cat 50b.img > /dev/null
>
> real0m0.002s
> user0m0.000s
> sys 0m0.002s
> [@test]# time cat 50b.img > /dev/null
>
> real0m0.002s
> user0m0.000s
> sys 0m0.001s
> [@test]# time cat 50b.img > /dev/null
>
> real0m0.002s
> user0m0.001s
> sys 0m0.001s
> [@test]#
>
> Luminous, centos7.6 kernel cephfs mount, 10Gbit, ssd meta, hdd data, mds
> 2,2Ghz
>

Did you drop the caches on your client before reading the file?
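
E.g. something like this on the client before timing the read (needs root):

sync; echo 3 > /proc/sys/vm/drop_caches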

>
>
>
> -Original Message-
> From: Alexandre DERUMIER [mailto:aderum...@odiso.com]
> Sent: 18 January 2019 15:37
> To: Burkhard Linke
> Cc: ceph-users
> Subject: Re: [ceph-users] CephFS - Small file - single thread - read
> performance.
>
> Hi,
> I don't have so big latencies:
>
> # time cat 50bytesfile > /dev/null
>
> real0m0,002s
> user0m0,001s
> sys 0m0,000s
>
>
> (It's on an ceph ssd cluster (mimic), kernel cephfs client (4.18), 10GB
> network with small latency too, client/server have 3ghz cpus)
>
>
>
> - Mail original -
> De: "Burkhard Linke" 
> À: "ceph-users" 
> Envoyé: Vendredi 18 Janvier 2019 15:29:45
> Objet: Re: [ceph-users] CephFS - Small file - single thread - read
> performance.
>
> Hi,
>
> On 1/18/19 3:11 PM, jes...@krogh.cc wrote:
> > Hi.
> >
> We have the intention of using CephFS for some of our shares, which
> we'd like to spool to tape as part of our normal backup schedule. CephFS
> works nicely for large files, but for "small" files .. < 0.1MB .. there seems to

> be an "overhead" of 20-40ms per file. I tested like this:
> >
> > root@abe:/nfs/home/jk# time cat /ceph/cluster/rsyncbackups/13kbfile >
> > /dev/null
> >
> > real 0m0.034s
> > user 0m0.001s
> > sys 0m0.000s
> >
> > And from local page-cache right after.
> > root@abe:/nfs/home/jk# time cat /ceph/cluster/rsyncbackups/13kbfile >
> > /dev/null
> >
> > real 0m0.002s
> > user 0m0.002s
> > sys 0m0.000s
> >
> > Giving a ~20ms overhead in a single file.
> >
> > This is about x3 higher than on our local filesystems (xfs) based on
> > same spindles.
> >
> > CephFS metadata is on SSD - everything else on big-slow HDD's (in both
>
> > cases).
> >
> > Is this what everyone else see?
>
>
> Each file access on client side requires the acquisition of a
> corresponding locking entity ('file capability') from the MDS. This adds
> an extra network round trip to the MDS. In the worst case the MDS needs
> to request a capability release from another client which still holds
> the cap (e.g. file is still in page cache), adding another extra network
> round trip.
>
>
> CephFS is not NFS, and has a strong consistency model. This comes at a
> price.
>
>
> Regards,
>
> Burkhard
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] CephFS - Small file - single thread - read performance.

2019-01-18 Thread David C
On Fri, Jan 18, 2019 at 2:12 PM  wrote:

> Hi.
>
> We have the intention of using CephFS for some of our shares, which we'd
> like to spool to tape as part of our normal backup schedule. CephFS works nicely
> for large files, but for "small" files .. < 0.1MB .. there seems to be an
> "overhead" of 20-40ms per file. I tested like this:
>
> root@abe:/nfs/home/jk# time cat /ceph/cluster/rsyncbackups/13kbfile >
> /dev/null
>
> real0m0.034s
> user0m0.001s
> sys 0m0.000s
>
> And from local page-cache right after.
> root@abe:/nfs/home/jk# time cat /ceph/cluster/rsyncbackups/13kbfile >
> /dev/null
>
> real0m0.002s
> user0m0.002s
> sys 0m0.000s
>
> Giving a ~20ms overhead in a single file.
>
> This is about x3 higher than on our local filesystems (xfs) based on
> same spindles.
>
> CephFS metadata is on SSD - everything else on big-slow HDD's (in both
> cases).
>
> Is this what everyone else see?
>

Pretty much. Reading a file from a pool of Filestore spinners:

# time cat 13kb > /dev/null

real0m0.013s
user0m0.000s
sys 0m0.003s

That's after dropping the caches on the client; however, the file would have
still been in the page cache on the OSD nodes as I had just created it. If the
file was coming straight off the spinners I'd expect to see something
closer to your time.

I guess if you wanted to improve the latency you would be looking at the
usual stuff e.g (off the top of my head):

- Faster network links/tuning your network
- Turning down Ceph debugging
- Trying a different striping layout on the dirs with the small files
(unlikely to have much affect)
- If you're using fuse mount try Kernel mount (or maybe vice versa)
- Play with mount options
- Tune CPU on MDS node

Still, even with all of that it's unlikely you'll get to local file-system
performance; as Burkhard says, you have the locking overhead. You'll
probably need to look at getting more parallelism going in your rsyncs.
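
A rough sketch of one way to do that (the source path is from your example,
the destination and the parallelism level are made up):

# run one rsync per top-level directory, 8 at a time
ls /ceph/cluster/rsyncbackups | xargs -P8 -I{} \
    rsync -a /ceph/cluster/rsyncbackups/{}/ /backup/{}/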



>
> Thanks
>
> --
> Jesper
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Fw: Re: Why does "df" on a cephfs not report same free space as "rados df" ?

2019-01-16 Thread David Young
Forgot to reply to the list!

‐‐‐ Original Message ‐‐‐
On Thursday, January 17, 2019 8:32 AM, David Young 
 wrote:

> Thanks David,
>
> "ceph osd df" looks like this:
>
> -
> root@node1:~# ceph osd df
> ID CLASS WEIGHT  REWEIGHT SIZEUSE AVAIL%USE  VAR  PGS
> 9   hdd 7.27698  1.0 7.3 TiB 6.3 TiB 1008 GiB 86.47 1.22 122
> 10   hdd 7.27698  1.0 7.3 TiB 4.9 TiB  2.4 TiB 66.90 0.94  94
> 11   hdd 7.27739  0.90002 7.3 TiB 5.4 TiB  1.9 TiB 74.29 1.05 104
> 12   hdd 7.27698  0.95001 7.3 TiB 5.8 TiB  1.5 TiB 79.64 1.12 115
> 13   hdd   00 0 B 0 B  0 B 00  18
> 40   hdd 7.27739  1.0 7.3 TiB 6.1 TiB  1.2 TiB 83.32 1.17 120
> 41   hdd 7.27739  0.90002 7.3 TiB 5.6 TiB  1.7 TiB 76.88 1.08 113
> 42   hdd 7.27739  0.80005 7.3 TiB 6.3 TiB  1.0 TiB 85.98 1.21 123
> 43   hdd   00 0 B 0 B  0 B 00  32
> 44   hdd 7.277390 0 B 0 B  0 B 00  27
> 45   hdd 7.27739  1.0 7.3 TiB 5.1 TiB  2.2 TiB 69.44 0.98  98
> 46   hdd   00 0 B 0 B  0 B 00  38
> 47   hdd 7.27739  1.0 7.3 TiB 4.4 TiB  2.9 TiB 60.24 0.85  84
> 48   hdd 7.27739  1.0 7.3 TiB 4.5 TiB  2.8 TiB 61.66 0.87  85
> 49   hdd 7.27739  1.0 7.3 TiB 4.7 TiB  2.5 TiB 65.07 0.92  90
> 50   hdd 7.27739  1.0 7.3 TiB 4.7 TiB  2.6 TiB 64.39 0.91  87
> 51   hdd 7.27739  1.0 7.3 TiB 5.1 TiB  2.2 TiB 70.22 0.99  95
> 52   hdd 7.27739  1.0 7.3 TiB 4.9 TiB  2.4 TiB 66.69 0.94  98
> 53   hdd 7.27739  1.0 7.3 TiB 4.8 TiB  2.5 TiB 66.33 0.93  97
> 54   hdd 7.27739  1.0 7.3 TiB 4.3 TiB  3.0 TiB 59.20 0.83  82
> 0   hdd 7.27699  1.0 7.3 TiB 3.8 TiB  3.5 TiB 52.34 0.74  71
> 1   hdd 7.27699  1.0 7.3 TiB 4.9 TiB  2.4 TiB 67.62 0.95  89
> 2   hdd 7.27699  0.90002 7.3 TiB 4.9 TiB  2.4 TiB 66.69 0.94  81
> 3   hdd 7.27699  1.0 7.3 TiB 4.7 TiB  2.5 TiB 65.21 0.92  88
> 4   hdd 7.27699  0.90002 7.3 TiB 4.9 TiB  2.4 TiB 67.25 0.95  93
> 5   hdd 7.27739  0.95001 7.3 TiB 4.2 TiB  3.0 TiB 58.39 0.82  78
> 6   hdd 7.27739  1.0 7.3 TiB 5.7 TiB  1.6 TiB 78.35 1.10 105
> 7   hdd 7.27739  0.95001 7.3 TiB 5.2 TiB  2.1 TiB 71.65 1.01  98
> 8   hdd 7.27739  1.0 7.3 TiB 5.1 TiB  2.2 TiB 69.92 0.98  94
> 14   hdd 7.27739  0.95001 7.3 TiB 5.3 TiB  2.0 TiB 72.46 1.02 100
> 15   hdd 7.27739  0.85004 7.3 TiB 6.0 TiB  1.2 TiB 82.93 1.17 119
> 16   hdd 7.27739  1.0 7.3 TiB 6.3 TiB  1.0 TiB 86.11 1.21 117
> 17   hdd 7.27739  0.85004 7.3 TiB 5.2 TiB  2.1 TiB 71.48 1.01 103
> 18   hdd 7.27739  1.0 7.3 TiB 5.2 TiB  2.1 TiB 71.43 1.00 100
> 19   hdd 7.27739  1.0 7.3 TiB 5.2 TiB  2.0 TiB 72.14 1.01 103
> 20   hdd 7.27739  1.0 7.3 TiB 5.7 TiB  1.6 TiB 78.13 1.10 110
> 21   hdd 7.27739  1.0 7.3 TiB 6.2 TiB  1.0 TiB 85.58 1.20 125
> 22   hdd 7.27739  1.0 7.3 TiB 5.2 TiB  2.1 TiB 71.71 1.01 103
> 23   hdd 7.27739  0.95001 7.3 TiB 6.0 TiB  1.2 TiB 83.04 1.17 110
> 24   hdd   0  1.0 7.3 TiB 831 GiB  6.5 TiB 11.15 0.16  13
> 25   hdd 7.27739  1.0 7.3 TiB 6.3 TiB  978 GiB 86.87 1.22 121
> 26   hdd 7.27739  1.0 7.3 TiB 5.2 TiB  2.1 TiB 70.86 1.00 100
> 27   hdd 7.27739  1.0 7.3 TiB 5.9 TiB  1.4 TiB 80.92 1.14 115
> 28   hdd 7.27739  1.0 7.3 TiB 6.5 TiB  826 GiB 88.91 1.25 121
> 29   hdd 7.27739  1.0 7.3 TiB 5.2 TiB  2.1 TiB 70.99 1.00  95
> 30   hdd   0  1.0 7.3 TiB 2.0 TiB  5.3 TiB 26.99 0.38  33
> 31   hdd 7.27739  1.0 7.3 TiB 4.6 TiB  2.7 TiB 62.61 0.88  90
> 32   hdd 7.27739  0.90002 7.3 TiB 5.5 TiB  1.8 TiB 75.65 1.06 107
> 33   hdd 7.27739  1.0 7.3 TiB 5.7 TiB  1.6 TiB 77.99 1.10 111
> 34   hdd 7.277390 0 B 0 B  0 B 00  10
> 35   hdd 7.27739  1.0 7.3 TiB 5.3 TiB  2.0 TiB 73.16 1.03 106
> 36   hdd 7.27739  0.95001 7.3 TiB 6.6 TiB  694 GiB 90.68 1.28 126
> 37   hdd 7.27739  1.0 7.3 TiB 5.5 TiB  1.8 TiB 75.83 1.07 106
> 38   hdd 7.27739  0.95001 7.3 TiB 6.2 TiB  1.1 TiB 85.02 1.20 115
> 39   hdd 7.27739  1.0 7.3 TiB 4.9 TiB  2.4 TiB 67.16 0.94  94
> TOTAL 400 TiB 266 TiB  134 TiB 71.08
> MIN/MAX VAR: 0.16/1.28  STDDEV: 13.96
> root@node1:~#
> 
>
> The drives that are weighted zero are "out" pending the completion of the 
> remaining degraded objects after an OSD failure:
>
> ---
>   data:
> pools:   2 pools, 1028 pgs
> objects: 52.15 M objects, 197 TiB
> usage:   266 TiB used, 134 TiB / 400 TiB avail
> pgs: 477114/260622045 objects degraded (0.183%)
>  10027396/260622045 objects misplaced (3.847%)
> --
>
> ‐‐‐ Original Message ‐‐‐
> On Thursday, January 17, 2019 7:23 AM, David C  
> wrote:
>
>> On Wed, 16 Jan 2019, 02:20 David Young wrote:

Re: [ceph-users] Why does "df" on a cephfs not report same free space as "rados df" ?

2019-01-16 Thread David C
On Wed, 16 Jan 2019, 02:20 David Young wrote:

> Hi folks,
>
> My ceph cluster is used exclusively for cephfs, as follows:
>
> ---
> root@node1:~# grep ceph /etc/fstab
> node2:6789:/ /ceph ceph
> auto,_netdev,name=admin,secretfile=/root/ceph.admin.secret
> root@node1:~#
> ---
>
> "rados df" shows me the following:
>
> ---
> root@node1:~# rados df
> POOL_NAME  USED  OBJECTS CLONESCOPIES MISSING_ON_PRIMARY
> UNFOUND DEGRADEDRD_OPS  RDWR_OPS  WR
> cephfs_metadata 197 MiB49066  0 98132  0
> 00   9934744  55 GiB  57244243 232 GiB
> media   196 TiB 51768595  0 258842975  0
> 1   203534 477915206 509 TiB 165167618 292 TiB
>
> total_objects51817661
> total_used   266 TiB
> total_avail  135 TiB
> total_space  400 TiB
> root@node1:~#
> ---
>
> But "df" on the mounted cephfs volume shows me:
>
> ---
> root@node1:~# df -h /ceph
> Filesystem  Size  Used Avail Use% Mounted on
> 10.20.30.22:6789:/  207T  196T   11T  95% /ceph
> root@node1:~#
> ---
>
> And ceph -s shows me:
>
> ---
>   data:
> pools:   2 pools, 1028 pgs
> objects: 51.82 M objects, 196 TiB
> usage:   266 TiB used, 135 TiB / 400 TiB avail
> ---
>
> "media" is an EC pool with size of 5 (4+1), so I can expect 1TB of data to
> consume 1.25TB raw space.
>
> My question is, why does "df" show me I have 11TB free, when "rados df"
> shows me I have 135TB (raw) available?
>

Probably because your OSDs are quite unbalanced.  What does your 'ceph osd
df' look like?



>
> Thanks!
> D
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Why does "df" on a cephfs not report same free space as "rados df" ?

2019-01-15 Thread David Young
Hi folks,

My ceph cluster is used exclusively for cephfs, as follows:

---
root@node1:~# grep ceph /etc/fstab
node2:6789:/ /ceph ceph 
auto,_netdev,name=admin,secretfile=/root/ceph.admin.secret
root@node1:~#
---

"rados df" shows me the following:

---
root@node1:~# rados df
POOL_NAME  USED  OBJECTS CLONESCOPIES MISSING_ON_PRIMARY UNFOUND 
DEGRADEDRD_OPS  RDWR_OPS  WR
cephfs_metadata 197 MiB49066  0 98132  0   0
0   9934744  55 GiB  57244243 232 GiB
media   196 TiB 51768595  0 258842975  0   1   
203534 477915206 509 TiB 165167618 292 TiB

total_objects51817661
total_used   266 TiB
total_avail  135 TiB
total_space  400 TiB
root@node1:~#
---

But "df" on the mounted cephfs volume shows me:

---
root@node1:~# df -h /ceph
Filesystem  Size  Used Avail Use% Mounted on
10.20.30.22:6789:/  207T  196T   11T  95% /ceph
root@node1:~#
---

And ceph -s shows me:

---
  data:
pools:   2 pools, 1028 pgs
objects: 51.82 M objects, 196 TiB
usage:   266 TiB used, 135 TiB / 400 TiB avail
---

"media" is an EC pool with size of 5 (4+1), so I can expect 1TB of data to 
consume 1.25TB raw space.

My question is, why does "df" show me I have 11TB free, when "rados df" shows 
me I have 135TB (raw) available?

Thanks!
D___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] CEPH_FSAL Nfs-ganesha

2019-01-14 Thread David C
Hi All

I've been playing around with the nfs-ganesha 2.7 exporting a cephfs
filesystem, it seems to be working pretty well so far. A few questions:

1) The docs say " For each NFS-Ganesha export, FSAL_CEPH uses a libcephfs
client,..." [1]. For arguments sake, if I have ten top level dirs in my
Cephfs namespace, is there any value in creating a separate export for each
directory? Will that potentially give me better performance than a single
export of the entire namespace?

2) Tuning: are there any recommended parameters to tune? So far I've found
I had to increase client_oc_size which seemed quite conservative.

Thanks
David

[1] http://docs.ceph.com/docs/mimic/cephfs/nfs/
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] cephfs free space issue

2019-01-10 Thread David C
On Thu, Jan 10, 2019 at 4:07 PM Scottix  wrote:

> I just had this question as well.
>
> I am interested in what you mean by fullest: is it percentage-wise or raw
> space? If I have an uneven distribution and adjusted it, would it potentially
> make more space available?
>

Yes - I'd recommend using pg-upmap if all your clients are Luminous+. I
"reclaimed" about 5TB of usable space recently by balancing my PGs.

@Yoann, you've got a fair bit of variance so you would likely benefit from
pg-upmap (or other rebalancing).
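
For reference, applying the (RAW / 3) * 0.85 rule of thumb Wido gives below to Yoann's numbers: (65.5 TiB / 3) * 0.85 ≈ 18.6 TiB, which lines up with the ~19T size that df reports. Switching to the upmap balancer is roughly the following (assuming raising the min-compat client to luminous is acceptable for your setup):

ceph osd set-require-min-compat-client luminous
ceph balancer mode upmap
ceph balancer on
ceph balancer status    # confirm it's active and executing plans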


> Thanks
> Scott
> On Thu, Jan 10, 2019 at 12:05 AM Wido den Hollander  wrote:
>
>>
>>
>> On 1/9/19 2:33 PM, Yoann Moulin wrote:
>> > Hello,
>> >
>> > I have a CEPH cluster in luminous 12.2.10 dedicated to cephfs.
>> >
>> > The raw size is 65.5 TB, with a replica 3, I should have ~21.8 TB
>> usable.
>> >
>> > But the size of the cephfs view by df is *only* 19 TB, is that normal ?
>> >
>>
>> Yes. Ceph will calculate this based on the fullest OSD. As data
>> distribution is never 100% perfect you will get such numbers.
>>
>> To go from raw to usable I use this calculation:
>>
>> (RAW / 3) * 0.85
>>
>> So yes, I take a 20%, sometimes even 30% buffer.
>>
>> Wido
>>
>> > Best regards,
>> >
>> > here some hopefully useful information :
>> >
>> >> apollo@icadmin004:~$ ceph -s
>> >>   cluster:
>> >> id: fc76846a-d0f0-4866-ae6d-d442fc885469
>> >> health: HEALTH_OK
>> >>
>> >>   services:
>> >> mon: 3 daemons, quorum icadmin006,icadmin007,icadmin008
>> >> mgr: icadmin006(active), standbys: icadmin007, icadmin008
>> >> mds: cephfs-3/3/3 up
>> {0=icadmin008=up:active,1=icadmin007=up:active,2=icadmin006=up:active}
>> >> osd: 40 osds: 40 up, 40 in
>> >>
>> >>   data:
>> >> pools:   2 pools, 2560 pgs
>> >> objects: 26.12M objects, 15.6TiB
>> >> usage:   49.7TiB used, 15.8TiB / 65.5TiB avail
>> >> pgs: 2560 active+clean
>> >>
>> >>   io:
>> >> client:   510B/s rd, 24.1MiB/s wr, 0op/s rd, 35op/s wr
>> >
>> >> apollo@icadmin004:~$ ceph df
>> >> GLOBAL:
>> >> SIZEAVAIL   RAW USED %RAW USED
>> >> 65.5TiB 15.8TiB  49.7TiB 75.94
>> >> POOLS:
>> >> NAME            ID USED    %USED MAX AVAIL  OBJECTS
>> >> cephfs_data     1  15.6TiB 85.62   2.63TiB 25874848
>> >> cephfs_metadata 2   571MiB  0.02   2.63TiB   245778
>> >
>> >> apollo@icadmin004:~$ rados df
>> >> POOL_NAME        USED    OBJECTS  CLONES   COPIES MISSING_ON_PRIMARY UNFOUND DEGRADED     RD_OPS      RD   WR_OPS      WR
>> >> cephfs_data     15.6TiB 25874848       0 77624544                  0       0        0  324156851 25.9TiB 20114360 9.64TiB
>> >> cephfs_metadata  571MiB   245778       0   737334                  0       0        0 1802713236 87.7TiB 75729412 16.0TiB
>> >>
>> >> total_objects26120626
>> >> total_used   49.7TiB
>> >> total_avail  15.8TiB
>> >> total_space  65.5TiB
>> >
>> >> apollo@icadmin004:~$ ceph osd pool ls detail
>> >> pool 1 'cephfs_data' replicated size 3 min_size 2 crush_rule 0
>> object_hash rjenkins pg_num 2048 pgp_num 2048 last_change 6197 lfor 0/3885
>> flags hashpspool stripe_width 0 application cephfs
>> >> pool 2 'cephfs_metadata' replicated size 3 min_size 2 crush_rule 0
>> object_hash rjenkins pg_num 512 pgp_num 512 last_change 6197 lfor 0/703
>> flags hashpspool stripe_width 0 application cephfs
>> >
>> >> apollo@icadmin004:~$ df -h /apollo/
>> >> Filesystem Size  Used Avail Use% Mounted on
>> >> 10.90.36.16,10.90.36.17,10.90.36.18:/   19T   16T  2.7T  86% /apollo
>> >
>> >> apollo@icadmin004:~$ ceph fs get cephfs
>> >> Filesystem 'cephfs' (1)
>> >> fs_name  cephfs
>> >> epoch49277
>> >> flagsc
>> >> created  2018-01-23 14:06:43.460773
>> >> modified 2019-01-09 14:17:08.520888
>> >> tableserver  0
>> >> root 0
>> >> session_timeout  60
>> >> session_autoclose300
>> >> max_file_size1099511627776
>> >> last_failure 0
>> >> last_failure_osd_epoch   6216
>> >> compat   compat={},rocompat={},incompat={1=base v0.20,2=client
>> writeable ranges,3=default file layouts on dirs,4=dir inode in separate
>> object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no
>> anchor table,9=file layout v2}
>> >> max_mds  3
>> >> in   0,1,2
>> >> up   {0=424203,1=424158,2=424146}
>> >> failed
>> >> damaged
>> >> stopped
>> >> data_pools   [1]
>> >> metadata_pool2
>> >> inline_data  disabled
>> >> balancer
>> >> standby_count_wanted 0
>> >> 424203:  10.90.36.18:6800/3885954695 'icadmin008' mds.0.49202
>> up:active seq 6 export_targets=1,2
>> >> 424158:  10.90.36.17:6800/152758094 'icadmin007' mds.1.49198
>> up:active seq 16 export_targets=0,2
>> >> 424146:  10.90.36.16:6801/1771587593 'icadmin006' mds.2.49195
>> up:active seq 19 export_targets=0
>> >
>> >> apollo@icadmin004:~$ ceph osd tree
>> >> ID  CLASS WEIGHT   TYPE NAME STATUS REWEIGHT 

Re: [ceph-users] Mimic 13.2.3?

2019-01-08 Thread David Galloway


On 1/8/19 9:05 AM, Matthew Vernon wrote:
> Dear Greg,
> 
> On 04/01/2019 19:22, Gregory Farnum wrote:
> 
>> Regarding Ceph releases more generally:
> 
> [snip]
> 
>> I imagine we will discuss all this in more detail after the release,
>> but everybody's patience is appreciated as we work through these
>> challenges.
> 
> Thanks for this. Could you confirm that which distros (of Debian/Ubuntu)
> binary packages for the various Ceph releases are built is something
> you're going to try and sort out, please?
> 
> [e.g. my earlier post
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-December/031966.html
> ]
> 
> ...if not, should I open a tracker issue? I could build binaries myself,
> obviously, but this seems a bit wasteful...
> 
> Regards,
> 
> Matthew
> 

Hey Matthew,

The current distro matrix is:

Luminous: xenial centos7 trusty jessie stretch
Mimic: bionic xenial centos7

This may have been different in previous point releases because, as Greg
mentioned in an earlier post in this thread, the release process has
changed hands and I'm still working on getting a solid/bulletproof
process documented, in place, and (more) automated.

I wouldn't be the final decision maker but if you think we should be
building Mimic packages for Debian (for example), we could consider it.
The build process should support it, I believe.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] OSDs crashing in EC pool (whack-a-mole)

2019-01-08 Thread David Young
Hi all,

One of my OSD hosts recently ran into RAM contention (was swapping heavily), 
and after rebooting, I'm seeing this error on random OSDs in the cluster:

---
Jan 08 03:34:36 prod1 ceph-osd[3357939]:  ceph version 13.2.4 
(b10be4d44915a4d78a8e06aa31919e74927b142e) mimic (stable)
Jan 08 03:34:36 prod1 ceph-osd[3357939]:  1: /usr/bin/ceph-osd() [0xcac700]
Jan 08 03:34:36 prod1 ceph-osd[3357939]:  2: (()+0x11390) [0x7f8fa5d0e390]
Jan 08 03:34:36 prod1 ceph-osd[3357939]:  3: (gsignal()+0x38) [0x7f8fa5241428]
Jan 08 03:34:36 prod1 ceph-osd[3357939]:  4: (abort()+0x16a) [0x7f8fa524302a]
Jan 08 03:34:36 prod1 ceph-osd[3357939]:  5: (ceph::__ceph_assert_fail(char 
const*, char const*, int, char const*)+0x250) [0x7f8fa767c510]
Jan 08 03:34:36 prod1 ceph-osd[3357939]:  6: (()+0x2e5587) [0x7f8fa767c587]
Jan 08 03:34:36 prod1 ceph-osd[3357939]:  7: 
(BlueStore::_txc_add_transaction(BlueStore::TransContext*, 
ObjectStore::Transaction*)+0x923) [0xbab5e3]
Jan 08 03:34:36 prod1 ceph-osd[3357939]:  8: 
(BlueStore::queue_transactions(boost::intrusive_ptr&,
 std::vector 
>&, boost::intrusive_ptr, ThreadPool::TPHandle*)+0x5c3) [0xbade03]
Jan 08 03:34:36 prod1 ceph-osd[3357939]:  9: 
(ObjectStore::queue_transaction(boost::intrusive_ptr&,
 ObjectStore::Transaction&&, boost::intrusive_ptr, 
ThreadPool::TPHandle*)+0x82) [0x79c812]
Jan 08 03:34:36 prod1 ceph-osd[3357939]:  10: 
(OSD::dispatch_context_transaction(PG::RecoveryCtx&, PG*, 
ThreadPool::TPHandle*)+0x58) [0x730ff8]
Jan 08 03:34:36 prod1 ceph-osd[3357939]:  11: 
(OSD::dequeue_peering_evt(OSDShard*, PG*, std::shared_ptr, 
ThreadPool::TPHandle&)+0xfe) [0x759aae]
Jan 08 03:34:36 prod1 ceph-osd[3357939]:  12: (PGPeeringItem::run(OSD*, 
OSDShard*, boost::intrusive_ptr&, ThreadPool::TPHandle&)+0x50) [0x9c5720]
Jan 08 03:34:36 prod1 ceph-osd[3357939]:  13: 
(OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x590) 
[0x769760]
Jan 08 03:34:36 prod1 ceph-osd[3357939]:  14: 
(ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x476) 
[0x7f8fa76824f6]
Jan 08 03:34:36 prod1 ceph-osd[3357939]:  15: 
(ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x7f8fa76836b0]
Jan 08 03:34:36 prod1 ceph-osd[3357939]:  16: (()+0x76ba) [0x7f8fa5d046ba]
Jan 08 03:34:36 prod1 ceph-osd[3357939]:  17: (clone()+0x6d) [0x7f8fa531341d]
Jan 08 03:34:36 prod1 ceph-osd[3357939]:  NOTE: a copy of the executable, or 
`objdump -rdS ` is needed to interpret this.
Jan 08 03:34:36 prod1 systemd[1]: ceph-osd@43.service: Main process exited, 
code=killed, status=6/ABRT
---
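
Side note: the disassembly that the NOTE at the end of the trace asks for can be produced with something like this (binary path assumed for a package install) and attached to a tracker issue:

objdump -rdS /usr/bin/ceph-osd > /tmp/ceph-osd-13.2.4.objdump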

I've restarted all the OSDs and the mons, but am still encountering the above.

Any ideas / suggestions?

Thanks!
D
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Balancer=on with crush-compat mode

2019-01-05 Thread David C
On Sat, 5 Jan 2019, 13:38 Marc Roos 
> I have straw2, balancer=on, crush-compat and it gives worst spread over
> my ssd drives (4 only) being used by only 2 pools. One of these pools
> has pg 8. Should I increase this to 16 to create a better result, or
> will it never be any better.
>
> For now I like to stick to crush-compat, so I can use a default centos7
> kernel.
>

Pg upmap is supported in the CentOS 7.5+ kernels
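
If in doubt, something like:

ceph features    # lists the feature release (jewel/luminous/...) reported by each group of connected clients

is a quick way to confirm what's actually connected before changing the balancer mode.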

>
> Luminous 12.2.8, 3.10.0-862.14.4.el7.x86_64, CentOS Linux release
> 7.5.1804 (Core)
>
>
>
> [@c01 ~]# cat balancer-1-before.txt | egrep '^19|^20|^21|^30'
> 19   ssd 0.48000  1.0  447GiB  164GiB  283GiB 36.79 0.93  31
> 20   ssd 0.48000  1.0  447GiB  136GiB  311GiB 30.49 0.77  32
> 21   ssd 0.48000  1.0  447GiB  215GiB  232GiB 48.02 1.22  30
> 30   ssd 0.48000  1.0  447GiB  151GiB  296GiB 33.72 0.86  27
>
> [@c01 ~]# ceph osd df | egrep '^19|^20|^21|^30'
> 19   ssd 0.48000  1.0  447GiB  157GiB  290GiB 35.18 0.87  30
> 20   ssd 0.48000  1.0  447GiB  125GiB  322GiB 28.00 0.69  30
> 21   ssd 0.48000  1.0  447GiB  245GiB  202GiB 54.71 1.35  30
> 30   ssd 0.48000  1.0  447GiB  217GiB  230GiB 48.46 1.20  30
>
> [@c01 ~]# ceph osd pool ls detail | egrep 'fs_meta|rbd.ssd'
> pool 19 'fs_meta' replicated size 3 min_size 2 crush_rule 5 object_hash
> rjenkins pg_num 16 pgp_num 16 last_change 22425 lfor 0/9035 flags
> hashpspool stripe_width 0 application cephfs
> pool 54 'rbd.ssd' replicated size 3 min_size 2 crush_rule 5 object_hash
> rjenkins pg_num 8 pgp_num 8 last_change 24666 flags hashpspool
> stripe_width 0 application rbd
>
> [@c01 ~]# ceph df |egrep 'ssd|fs_meta'
> fs_meta           19   170MiB  0.07  240GiB  2451382
> fs_data.ssd       33       0B     0  240GiB        0
> rbd.ssd           54   266GiB 52.57  240GiB    75902
> fs_data.ec21.ssd  55       0B     0  480GiB        0
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

