Re: [ceph-users] How to make nfs v3 work? nfs-ganesha for cephfs

2018-06-27 Thread Youzhong Yang
Thank you Paul. mount_path_pseudo does the trick. Now NFS v3 works on Linux, but on macOS it mounts successfully yet shows an empty folder:

bat8485maci:~ root# mount -t nfs -o vers=3 ceph-admin:/ceph /mnt/ceph
bat8485maci:~ root# ls -l /mnt/ceph/
bat8485maci:~ root#

This is how it looks on

Re: [ceph-users] pre-sharding s3 buckets

2018-06-27 Thread Thomas Bennett
Hi Matthew, Thanks for your reply, much appreciated. Sorry, I meant to say that we're running on Luminous, so I'm aware of dynamic resharding - however, I'm worried that this does not suit our particular use case. What I also forgot to mention is that we could be resharding a bucket 30 times in

Re: [ceph-users] Ceph snapshots

2018-06-27 Thread Brian :
Hi John, Have you looked at the Ceph documentation? RBD: http://docs.ceph.com/docs/luminous/rbd/rbd-snapshot/ The Ceph project documentation is really good for most areas. Have a look at what you can find, then come back with more specific questions! Thanks Brian On Wed, Jun 27, 2018 at 2:24 PM,
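
For reference, a minimal sketch of the basic RBD snapshot workflow the linked page covers (pool, image and snapshot names are placeholders):

  # create and list snapshots of an image
  rbd snap create rbd/vm-disk-1@before-upgrade
  rbd snap ls rbd/vm-disk-1

  # roll back to a snapshot, or remove it when no longer needed
  rbd snap rollback rbd/vm-disk-1@before-upgrade
  rbd snap rm rbd/vm-disk-1@before-upgrade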

Re: [ceph-users] CephFS MDS server stuck in "resolve" state

2018-06-27 Thread Yan, Zheng
On Wed, Jun 27, 2018 at 6:16 PM Dennis Kramer (DT) wrote: > > Hi, > > Currently i'm running Ceph Luminous 12.2.5. > > This morning I tried running Multi MDS with: > ceph fs set max_mds 2 > > I have 5 MDS servers. After running above command, > I had 2 active MDSs, 2 standby-active and 1 standby.

Re: [ceph-users] Ceph FS Random Write 4KB block size only 2MB/s?!

2018-06-27 Thread Yan, Zheng
On Wed, Jun 27, 2018 at 8:04 PM Yu Haiyang wrote: > > Hi All, > > Using fio with job number ranging from 1 to 128, the random write speed for > 4KB block size has been consistently around 1MB/s to 2MB/s. > Random read of the same block size can reach 60MB/s with 32 jobs. run fio on ceph-fuse?

Re: [ceph-users] FAILED assert(p != recovery_info.ss.clone_snaps.end())

2018-06-27 Thread Steve Anthony
One addendum for the sake of completeness. A few PGs still refused to repair even after the clone object was gone. To resolve this I needed to remove the clone metadata from the HEAD using ceph-objectstore-tool. First, I found the problematic clone ID in the log on the primary replica: ceph2:~#
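
A hedged sketch of the kind of ceph-objectstore-tool invocation described above, assuming the OSD has been stopped first and the clone ID already pulled from the primary's log (paths, PG ID, object name and clone ID are all placeholders):

  # stop the OSD holding the HEAD object with the stale clone metadata
  systemctl stop ceph-osd@2

  # remove the clone entry from the HEAD's snapset, then restart and repair
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-2 --pgid 18.2 \
      some_object remove-clone-metadata 4
  systemctl start ceph-osd@2
  ceph pg repair 18.2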

Re: [ceph-users] Ceph snapshots

2018-06-27 Thread Marc Schöchlin
Hello list,

I currently hold 3 snapshots per RBD image for my virtual systems. What I miss in the current documentation:

* details about the implementation of snapshots
  o implementation details
  o which scenarios create high overhead per snapshot
  o what causes the really

Re: [ceph-users] pulled a disk out, ceph still thinks its in

2018-06-27 Thread pixelfairy
Even pulling a few more out didn't show up in osd tree; I had to actually try to use them. ceph tell osd.N bench works. On Sun, Jun 24, 2018 at 2:23 PM pixelfairy wrote: > 15, 5 in each node. 14 currently in. > > is there another way to know if theres a problem with one? or to make the > threshold
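
A small sketch of one way to exercise every OSD so that a dead-but-still-"in" disk shows itself (assumes a shell with the admin keyring; the bench numbers matter less than whether the command hangs or errors):

  # run a short write benchmark against each OSD in turn
  for id in $(ceph osd ls); do
      echo "osd.$id:"
      ceph tell osd.$id bench
  done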

Re: [ceph-users] fixing unrepairable inconsistent PG

2018-06-27 Thread Brad Hubbard
Try the following. You can do this with all osds up and running. # rados -p [name_of_pool_18] setomapval .dir.default.80018061.2 temporary-key anything # ceph pg deep-scrub 18.2 Once you are sure the scrub has completed and the pg is no longer inconsistent you can remove the temporary key. #
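
A condensed sketch of that sequence (the pool name stays a placeholder, as in the original message):

  # add a throwaway omap key to the object, then deep-scrub the PG
  rados -p <name_of_pool_18> setomapval .dir.default.80018061.2 temporary-key anything
  ceph pg deep-scrub 18.2

  # once the PG reports clean again, drop the temporary key and scrub once more
  rados -p <name_of_pool_18> rmomapkey .dir.default.80018061.2 temporary-key
  ceph pg deep-scrub 18.2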

[ceph-users] unable to remove phantom snapshot for object, snapset_inconsistency

2018-06-27 Thread Steve Anthony
In the process of trying to repair snapshot inconsistencies associated with the issues in this thread, http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-June/027125.html ("FAILED assert(p != recovery_info.ss.clone_snaps.end())"), I have one PG I still can't get to repair. Two of the three

Re: [ceph-users] Ceph Luminous RocksDB vs WalDB?

2018-06-27 Thread Igor Fedotov
Hi Pardhiv, there is no WalDB in Ceph. It's the WAL (write-ahead log), which ensures write safety in RocksDB. In other words, it's just a RocksDB subsystem, though it can use a separate volume. In general, for BlueStore/BlueFS one can either allocate separate volumes for WAL and DB
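
A hedged sketch of the two layouts described above, using ceph-volume (device names are placeholders): either give RocksDB a single separate DB volume, in which case the WAL lives there too, or split WAL and DB onto different devices.

  # DB (and implicitly the WAL) on one separate fast device
  ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/nvme0n1p1

  # or explicitly separate WAL and DB volumes
  ceph-volume lvm create --bluestore --data /dev/sdb \
      --block.db /dev/nvme0n1p1 --block.wal /dev/nvme0n1p2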

Re: [ceph-users] Recreating a purged OSD fails

2018-06-27 Thread Igor Fedotov
Looks like a known issue tracked by http://tracker.ceph.com/issues/24423 http://tracker.ceph.com/issues/24599 Regards, Igor On 6/27/2018 9:40 AM, Steffen Winther Sørensen wrote: List, Had a failed disk behind an OSD in a Mimic Cluster 13.2.0, so I tried following the doc on removal of

[ceph-users] CephFS MDS server stuck in "resolve" state

2018-06-27 Thread Dennis Kramer (DT)
Hi, Currently I'm running Ceph Luminous 12.2.5. This morning I tried running multi-MDS with: ceph fs set max_mds 2 I have 5 MDS servers. After running the above command, I had 2 active MDSs, 2 standby-active and 1 standby. And after trying a failover on one of the active MDSs, a standby-active
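
For context, a hedged sketch of the commands involved on Luminous (the filesystem name is a placeholder; on 12.2.x, shrinking back down also required deactivating the extra rank by hand):

  # allow two active MDS ranks and check the result
  ceph fs set cephfs max_mds 2
  ceph fs status

  # to return to a single active MDS on Luminous
  ceph fs set cephfs max_mds 1
  ceph mds deactivate cephfs:1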

[ceph-users] pre-sharding s3 buckets

2018-06-27 Thread Thomas Bennett
Hi, We have a particular use case where we know we're going to be writing lots of objects (up to 3 million) into a bucket. To take advantage of sharding, I want to pre-shard buckets without the performance hit of resharding. So far I've created an empty bucket and then used the
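
A hedged sketch of two ways to get sharded bucket indexes up front on Luminous (bucket name and shard count are illustrative):

  # ceph.conf, RGW section: default shard count for newly created buckets
  rgw_override_bucket_index_max_shards = 32

  # or reshard one specific, still-empty bucket before loading data
  radosgw-admin bucket reshard --bucket=my-bucket --num-shards=32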

Re: [ceph-users] In a High Avaiability setup, MON, OSD daemon take up the floating IP

2018-06-27 Thread Paul Emmerich
Mons have exactly one fixed IP address. A mon cannot use a floating IP, otherwise it couldn't find its peers. Also, the concept of a floating IP makes no sense for mons - you simply give your clients a list of mon IPs to connect to. Paul 2018-06-26 10:17 GMT+02:00 Rahul S : > Hi! In my
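
A minimal sketch of what that looks like in practice, assuming a three-mon cluster (names and addresses are placeholders): clients get the full list of mon addresses in ceph.conf instead of a single floating IP.

  [global]
  mon_initial_members = mon1, mon2, mon3
  mon_host = 192.168.1.11, 192.168.1.12, 192.168.1.13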

Re: [ceph-users] How to make nfs v3 work? nfs-ganesha for cephfs

2018-06-27 Thread Paul Emmerich
NFS3 does not usually use pseudo paths. You can enable the Mount_Path_Pseudo option in NFS_CORE_PARAM to enable use of the pseudo FSAL for NFS3 clients. (Note that NFS3 clients cannot mount the pseudo root itself, only subdirectories, due to limitations in the inode size.) Paul 2018-06-26
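
A hedged sketch of the nfs-ganesha configuration being discussed (paths and export details are placeholders, and option placement may vary between ganesha versions):

  NFS_CORE_PARAM {
      # let NFSv3 clients use the pseudo path layout
      Mount_Path_Pseudo = true;
  }

  EXPORT {
      Export_Id = 1;
      Path = /;
      Pseudo = /ceph;
      Protocols = 3, 4;
      Access_Type = RW;
      FSAL {
          Name = CEPH;
      }
  }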

[ceph-users] Recreating a purged OSD fails

2018-06-27 Thread Steffen Winther Sørensen
List,

Had a failed disk behind an OSD in a Mimic cluster (13.2.0), so I tried following the doc on removal of an OSD. I did:

# ceph osd crush reweight osd.19 0

waited for rebalancing to finish, and continued:

# ceph osd out 19
# systemctl stop ceph-osd@19
# ceph osd purge 19 --yes-i-really-mean-it
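
For completeness, a hedged sketch of what the recreate step normally looks like after the purge, once the disk has been replaced (device name is a placeholder; this is the part that apparently trips over the tracker issues linked elsewhere in this thread):

  # wipe the replacement disk and create a fresh BlueStore OSD on it
  ceph-volume lvm zap /dev/sdX --destroy
  ceph-volume lvm create --bluestore --data /dev/sdX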

[ceph-users] Ceph behavior on (lots of) small objects (RGW, RADOS + erasure coding)?

2018-06-27 Thread Nicolas Dandrimont
Hi, I would like to use ceph to store a lot of small objects. Our current usage pattern is 4.5 billion unique objects, ranging from 0 to 100MB, with a median size of 3-4kB. Overall, that's around 350 TB of raw data to store, which isn't much, but that's across a *lot* of tiny files. We expect a
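
As a concrete point of reference, a hedged sketch of creating an erasure-coded data pool for RGW (profile parameters and PG counts are illustrative). Note that for 3-4 kB objects both the BlueStore allocation unit and the EC chunking matter, since each of the k data chunks occupies at least one allocation unit on its OSD.

  # define an EC profile and an EC-backed RGW data pool
  ceph osd erasure-code-profile set ec42 k=4 m=2 crush-failure-domain=host
  ceph osd pool create default.rgw.buckets.data 1024 1024 erasure ec42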

Re: [ceph-users] Monitoring bluestore compression ratio

2018-06-27 Thread Igor Fedotov
Hi David, First of all, 'bluestore_extent_compress' is unrelated to data compression - it's the number of merged extent-map entries for an onode. When BlueStore detects neighboring extents in an onode it might merge them into a single map entry, e.g. write 0x4000~1000 ... ... write

Re: [ceph-users] Monitoring bluestore compression ratio

2018-06-27 Thread Igor Fedotov
And yes - first 3 parameters from this list are the right and the only way to inspect compression effectiveness so far. Corresponding updates to show that with "ceph df" are on the way and are targeted for Nautilus. Thanks, Igor On 6/26/2018 4:53 PM, David Turner wrote: ceph daemon
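
A hedged example of pulling those counters from a running OSD (the OSD id is a placeholder; counter names are from memory and worth verifying against the full perf dump):

  # compressed           - bytes stored after compression
  # compressed_original  - bytes as submitted by clients
  # compressed_allocated - bytes actually allocated on disk
  ceph daemon osd.0 perf dump | grep -E 'compressed'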

Re: [ceph-users] Recreating a purged OSD fails

2018-06-27 Thread Paul Emmerich
> Am 27.06.2018 um 09:02 schrieb Steffen Winther Sørensen : > > > >> On 27 Jun 2018, at 08.57, Paul Emmerich wrote: >> >> You are running into https://tracker.ceph.com/issues/24423 > In above You said: > > Updated by Paul Emmerich 18 days ago > > I can't reproduce this on any new Mimic

Re: [ceph-users] In a High Avaiability setup, MON, OSD daemon take up the floating IP

2018-06-27 Thread Rahul S
I have an OpenNebula HA setup which requires a floating IP to function. For testing purposes, I also have my Ceph cluster co-located on the OpenNebula cluster. While I have configured my mon daemons to use the actual IPs, on installation one of the mon daemons takes its IP from outside the

Re: [ceph-users] fixing unrepairable inconsistent PG

2018-06-27 Thread Andrei Mikhailovsky
Hi Brad, Thanks, that helped to get the query info on the inconsistent PG 18.2:

{
    "state": "active+clean+inconsistent",
    "snap_trimq": "[]",
    "snap_trimq_len": 0,
    "epoch": 121293,
    "up": [
        21,
        28
    ],
    "acting": [
        21,
        28
    ],

Re: [ceph-users] In a High Avaiability setup, MON, OSD daemon take up the floating IP

2018-06-27 Thread Дробышевский, Владимир
Hello, Rahul! Do you have this problem during initial cluster creation, or on any reboot/leadership transfer? If the former, then try removing the floating IP while creating the mons and temporarily transfer the leadership away from the server you're going to create the OSD on. We are using the same
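
One hedged way to pin each mon to its real address so it never picks up the floating IP (mon names and addresses are placeholders):

  [mon.node1]
  host = node1
  public addr = 192.168.1.11

  [mon.node2]
  host = node2
  public addr = 192.168.1.12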

[ceph-users] Centralised Logging Strategy

2018-06-27 Thread Tom W
Morning all, Does anybody have any advice regarding moving their Ceph clusters to centralised logging? We are presently investigating routes to undertake this (long awaited and needed) change, and any pointers or gotchas that may lie ahead would be greatly appreciated. We are
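
Not an answer to the gotchas question, but a hedged sketch of the ceph.conf knobs usually involved when shipping daemon and cluster logs to a central syslog collector (values are illustrative):

  [global]
  # send daemon logs to syslog instead of, or alongside, local files
  log_to_syslog = true
  err_to_syslog = true
  # forward the cluster log as well
  clog_to_syslog = true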

Re: [ceph-users] fixing unrepairable inconsistent PG

2018-06-27 Thread Andrei Mikhailovsky
Here is one more thing:

rados list-inconsistent-obj 18.2
{
    "inconsistents": [
        {
            "object": {
                "locator": "",
                "version": 632942,
                "nspace": "",
                "name": ".dir.default.80018061.2",
                "snap": "head"
            },

Re: [ceph-users] Ceph 12.2.5 - FAILED assert(0 == "put on missing extent (nothing before)")

2018-06-27 Thread Dyweni - Ceph-Users
Good Morning, I have rebuilt the OSD and the cluster is healthy now. I have one pool with a 3-replica setup. I am a bit concerned that removing a snapshot can cause an OSD to crash. I've asked myself what would have happened if 2 OSDs had crashed. God forbid, what if 3 or more OSDs had

[ceph-users] Ceph FS Random Write 4KB block size only 2MB/s?!

2018-06-27 Thread Yu Haiyang
Hi All, Using fio with job number ranging from 1 to 128, the random write speed for 4KB block size has been consistently around 1MB/s to 2MB/s. Random read of the same block size can reach 60MB/s with 32 jobs. Our ceph cluster consists of 4 OSDs all running on SSD connected through a switch
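
For anyone wanting to reproduce the numbers, a hedged sketch of an fio job along the lines described (mount point, file size and job count are placeholders):

  [global]
  directory=/mnt/cephfs
  ioengine=libaio
  direct=1
  bs=4k
  size=1g
  runtime=60
  time_based
  group_reporting

  [randwrite]
  rw=randwrite
  numjobs=32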

[ceph-users] Luminous Bluestore performance, bcache

2018-06-27 Thread Richard Bade
Hi Everyone, There have been a few threads go past about this, but I haven't seen any that pointed me in the right direction. We've recently set up a new Luminous (12.2.5) cluster with 5 hosts, each with 12 4TB Seagate Constellation ES spinning disks for OSDs. We also have 2x 400GB Intel DC P3700s

Re: [ceph-users] Ceph FS Random Write 4KB block size only 2MB/s?!

2018-06-27 Thread Yu Haiyang
Hi Yan, Thanks for your suggestion. No, I didn’t run fio on ceph-fuse. I mounted my Ceph FS in kernel mode. Regards, Haiyang > On Jun 27, 2018, at 9:45 PM, Yan, Zheng wrote: > > On Wed, Jun 27, 2018 at 8:04 PM Yu Haiyang wrote: >> >> Hi All, >> >> Using fio with job number ranging from 1

[ceph-users] Luminous BlueStore OSD - Still a way to pinpoint an object?

2018-06-27 Thread Yu Haiyang
Hi All, Previously I read this article about how to locate an object on the OSD disk. Apparently it was on a FileStore-backed disk partition. Now that I have upgraded my Ceph to Luminous and host my OSDs on BlueStore, the OSD directory structure has completely changed. The data is mapped
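
With BlueStore there is no filesystem path to browse, but a hedged sketch of the usual approach (pool, object and OSD data path are placeholders): map the object to its PG and OSDs while the cluster is running, then, with that OSD stopped, inspect it through ceph-objectstore-tool.

  # which PG and OSDs hold the object?
  ceph osd map mypool myobject

  # with the OSD daemon stopped, look at the object store directly
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-3 --op list | grep myobject
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-3 myobject get-bytes > /tmp/myobject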