Re: [ceph-users] the state of cephfs in giant

2014-10-13 Thread Eric Eastman
I would be interested in testing the Samba VFS and Ganesha NFS integration with CephFS. Are there any notes on how to configure these two interfaces with CephFS? Eric We've been doing a lot of work on CephFS over the past few months. This is an update on the current state of things as of
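
For the Samba side, a minimal smb.conf sketch of the kind of share being discussed, assuming Samba was built with the vfs_ceph module and that a cephx user for it already exists (all names below are placeholders, not a tested configuration):

    # smb.conf share stanza
    [cephfs]
        path = /
        # the CephFS VFS module talks to the cluster through libcephfs
        vfs objects = ceph
        ceph:config_file = /etc/ceph/ceph.conf
        # cephx user whose keyring smbd can read
        ceph:user_id = samba
        read only = no

A matching Ganesha export sketch appears further down under the nfs-ganesha FSAL thread.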

[ceph-users] HEALTH_WARN pool has too few pgs

2014-06-11 Thread Eric Eastman
Hi, I am seeing the following warning on one of my test clusters: # ceph health detail HEALTH_WARN pool Ray has too few pgs pool Ray objects per pg (24) is more than 12 times cluster average (2) This is a reported issue and is set to Won't Fix at: http://tracker.ceph.com/issues/8103 My test

Re: [ceph-users] HEALTH_WARN pool has too few pgs

2014-06-12 Thread Eric Eastman
Hi JC, The cluster already has 1024 PGs on only 15 OSDs, which is above the formula of (100 x #OSDs)/size. How large should I make it? # ceph osd dump | grep Ray pool 17 'Ray' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 1024 pgp_num 1024 last_change 7785 owner 0
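
As a rough worked example of that formula with 15 OSDs and a replication size of 3, the guideline comes out to (100 x 15) / 3 = 500 PGs across all pools, so 1024 in one pool is already above it; the warning here is driven by the objects-per-PG ratio relative to the cluster average rather than by the raw PG count. If a pool genuinely needs more PGs, the count can only be increased, for example:

    ceph osd pool set Ray pg_num 2048    # raise the placement group count
    ceph osd pool set Ray pgp_num 2048   # then raise pgp_num so data actually rebalances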

Re: [ceph-users] GPF kernel panics

2014-07-31 Thread Eric Eastman
Do not go to a 3.15 or later Ubuntu kernel at this time if you are using krbd. See bug 8818. The Ubuntu 3.14.x kernels seem to work fine with krbd on Trusty. The mainline packages from Ubuntu should be helpful in testing. Info: https://wiki.ubuntu.com/Kernel/MainlineBuilds

[ceph-users] Multiple kernel RBD clients failures

2013-09-30 Thread Eric Eastman
I have 5 RBD kernel-based clients, all using kernel 3.11.1, running Ubuntu 13.04, that all failed with a write error at the same time, and I need help figuring out what caused the failure. The 5 clients were all using the same pool, and each had its own image, with an 18TB XFS file system on

Re: [ceph-users] Multiple kernel RBD clients failures

2013-09-30 Thread Eric Eastman
Thank you for the reply. -28 == -ENOSPC (No space left on device). I think it's due to the fact that some osds are near full. Yan, Zheng I thought that might be the case, but I would expect that ceph health would tell me I had a full OSD, but it is only saying they are near full: #

Re: [ceph-users] Multiple kernel RBD clients failures

2013-10-01 Thread Eric Eastman
Hi Travis, Both you and Yan saw the same thing, in that the drives in my test system go from 300GB to 4TB. I used ceph-deploy to create all the OSDs, which I assume picked the weights of 0.26 for my 300GB drives, and 3.63 for my 4TB drives. All the OSDs that are reporting nearly full are

Re: [ceph-users] Balance data on near full osd warning or error

2013-10-22 Thread Eric Eastman
Hello, What I have used to rebalance my cluster is: ceph osd reweight-by-utilization we're using a small Ceph cluster with 8 nodes, each with 4 OSDs. People are using it through instances and volumes in an OpenStack platform. We're facing a HEALTH_ERR with full or near-full OSDs : cluster
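
A small sketch of that step; the threshold argument is optional and 120 (percent of average utilization) is the usual default, so the value here is only illustrative:

    ceph osd reweight-by-utilization 120   # lower the reweight of OSDs above 120% of average utilization
    ceph health detail                     # re-check the full / near-full warnings afterwards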

[ceph-users] Questions/comments on using ZFS for OSDs

2013-11-12 Thread Eric Eastman
I built Ceph version 0.72 with --with-libzfs on Ubuntu 13.04 after installing ZFS from the ppa:zfs-native/stable repository. The ZFS version is v0.6.2-1. I do have a few questions and comments on Ceph using ZFS-backed OSDs. As ceph-deploy does not show support for ZFS, I used the instructions at:
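
For reference, a rough outline of how a ZFS-backed OSD can be brought up by hand when ceph-deploy cannot do it; this is only a sketch under the assumption of a spare disk /dev/sdb and an OSD id of 0, and the exact steps vary by release:

    ceph osd create                                       # allocates the new OSD id (assumed to be 0 below)
    zpool create osd0pool /dev/sdb                        # dedicated pool for this OSD
    zfs set xattr=sa osd0pool                             # store xattrs efficiently
    zfs set mountpoint=/var/lib/ceph/osd/ceph-0 osd0pool  # mount where the OSD expects its data
    ceph-osd -i 0 --mkfs --mkkey                          # initialize the OSD data directory and key
    ceph auth add osd.0 osd 'allow *' mon 'allow rwx' -i /var/lib/ceph/osd/ceph-0/keyring
    ceph osd crush add osd.0 1.0 host=$(hostname -s)      # give it a weight and a place in the CRUSH map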

Re: [ceph-users] Ceph incomplete pg

2013-12-16 Thread Eric Eastman
I am currently trying to figure out how to debug PG issues myself, and the debugging documentation I have found has not been that helpful. In my case the underlying problem is probably ZFS, which I am using for my OSDs, but it would be nice to be able to recover what I can. My health output
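
For anyone starting from the same place, the commands generally used to narrow a PG problem down (the pg id below is a placeholder):

    ceph health detail            # lists the PGs that are incomplete or stuck
    ceph pg dump_stuck inactive   # stuck PGs and the OSDs they map to
    ceph pg 2.5 query             # detailed state and recovery history for a single PG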

Re: [ceph-users] clock skew

2014-01-30 Thread Eric Eastman
I have this problem on some of my Ceph clusters, and I think it is because the older hardware I am using does not have the best clocks. To fix the problem, I set up one server in my lab to be my local NTP time server, and then on each of my Ceph monitors, in the /etc/ntp.conf file, I put in
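
A minimal sketch of that ntp.conf change, assuming the local time server is reachable at 192.168.1.10 (a placeholder address):

    # /etc/ntp.conf on each Ceph monitor
    # point at the local lab time server instead of the distant default pool servers
    server 192.168.1.10 iburst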

Re: [ceph-users] ceph-deploy

2013-07-23 Thread Eric Eastman
I am seeing issues with ceph-deploy and ceph-disk, which it calls, if the storage devices are not generic sdx devices. On my older HP systems, ceph-deploy fails on the cciss devices, and I tried to use it with multipath dm devices, and that did not work at all. Logging is not verbose enough

Re: [ceph-users] ceph-deploy

2013-07-23 Thread Eric Eastman
1 root disk 104, 2 Jul 23 20:25 c0d0p2 brw-rw 1 root disk 104, 5 Jul 23 20:25 c0d0p5 On Tue, 23 Jul 2013, Eric Eastman wrote: I tried running: ceph-deploy install --dev=wip-cuttlefish-ceph-disk HOST To a clean system, and just after I ran: ceph-deploy install ceph11 (Which worked

Re: [ceph-users] ceph-deploy

2013-07-24 Thread Eric Eastman
Later today I will try both the HP testing using multiple cciss devices for my OSDs and separately testing manually specifying the dm devices on my external FC and iSCSI storage and will let you know how both tests turn out. Thanks again, Eric Tomorrow I will bring up a HP system that has

Re: [ceph-users] ceph-deploy

2013-07-24 Thread Eric Eastman
Hi Sage, I tested the HP cciss devices as OSD disks on the --dev=wip-cuttlefish-ceph-disk build tonight and it worked, but not exactly as expected. I first tried: # ceph-deploy -v osd create testsrv16:c0d1 which failed with: ceph-disk: Error: data path does not exist: /dev/c0d1 so I went

Re: [ceph-users] Defective ceph startup script

2013-07-31 Thread Eric Eastman
Hi Greg, I saw about the same thing on Ubuntu 13.04 as you did. I used apt-get -y update apt-get -y upgrade On all my cluster nodes to upgrade from 0.61.5 to 0.61.7 and then noticed that some of my systems did not restart all the daemons. I tried: stop ceph-all start ceph-all On those

Re: [ceph-users] Cache tiers flushing logic

2014-12-30 Thread Eric Eastman
On Tue, Dec 30, 2014 at 7:56 AM, Erik Logtenberg e...@logtenberg.eu wrote: Hi, I use a cache tier on SSD's in front of the data pool on HDD's. I don't understand the logic behind the flushing of the cache however. If I start writing data to the pool, it all ends up in the cache pool at

Re: [ceph-users] Cache tiers flushing logic

2014-12-30 Thread Eric Eastman
On Tue, Dec 30, 2014 at 12:38 PM, Erik Logtenberg e...@logtenberg.eu wrote: Hi Erik, I have tiering working on a couple test clusters. It seems to be working with Ceph v0.90 when I set: ceph osd pool set POOL hit_set_type bloom ceph osd pool set POOL hit_set_count 1 ceph osd pool set
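
For context, a few more of the pool settings that usually go along with the hit_set values above; the numbers are placeholders, not recommendations:

    ceph osd pool set POOL hit_set_period 3600            # seconds covered by each hit set
    ceph osd pool set POOL target_max_bytes 100000000000  # size target the cache pool flushes/evicts against
    ceph osd pool set POOL cache_target_dirty_ratio 0.4   # start flushing dirty objects at 40% of the target
    ceph osd pool set POOL cache_target_full_ratio 0.8    # start evicting clean objects at 80% of the target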

[ceph-users] Problem mapping RBD images with v0.92

2015-02-07 Thread Eric Eastman
Has anything changed in v0.92 that would keep a 3.18 kernel from mapping an RBD image? I have been using a test script to create RBD images and map them since Firefly, and the script has worked fine through Ceph v0.91. It is not working with v0.92, so I minimized it to the following 3 commands

Re: [ceph-users] Problem mapping RBD images with v0.92

2015-02-07 Thread Eric Eastman
* Without the --image-shared option, the rbd CLI creates the image with RBD_FEATURE_EXCLUSIVE_LOCK, which is not supported by the Linux kernel RBD. Thanks, Raju *From:* ceph-users [mailto:ceph-users-boun...@lists.ceph.com] *On Behalf Of *Eric Eastman *Sent:* Sunday, February 08, 2015 8:46 AM
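
A short sketch of the two usual workarounds that follow from that explanation; the image name is a placeholder, and the feature disable command only exists on newer releases:

    rbd create test1 --size 1024 --image-shared   # create the image without the exclusive-lock feature
    rbd feature disable test1 exclusive-lock      # or strip the feature from an existing image (newer releases)
    rbd map test1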

[ceph-users] chattr +i not working with cephfs

2015-01-27 Thread Eric Eastman
Should chattr +i work with cephfs? Using ceph v0.91 and a 3.18 kernel on the CephFS client, I tried this: # mount | grep ceph 172.16.30.10:/ on /cephfs/test01 type ceph (name=cephfs,key=client.cephfs) # echo 1 > /cephfs/test01/test.1 # ls -l /cephfs/test01/test.1 -rw-r--r-- 1 root root 2 Jan 27

Re: [ceph-users] chattr +i not working with cephfs

2015-01-28 Thread Eric Eastman
On Wed, Jan 28, 2015 at 11:43 AM, Gregory Farnum g...@gregs42.com wrote: On Wed, Jan 28, 2015 at 10:06 AM, Sage Weil s...@newdream.net wrote: On Wed, 28 Jan 2015, John Spray wrote: On Wed, Jan 28, 2015 at 5:23 PM, Gregory Farnum g...@gregs42.com wrote: My concern is whether we as the FS

Re: [ceph-users] chattr +i not working with cephfs

2015-01-28 Thread Eric Eastman
background. John On Wed, Jan 28, 2015 at 1:24 AM, Eric Eastman eric.east...@keepertech.com wrote: Should chattr +i work with cephfs? Using ceph v0.91 and a 3.18 kernel on the CephFS client, I tried this: # mount | grep ceph 172.16.30.10:/ on /cephfs/test01 type ceph (name=cephfs,key

Re: [ceph-users] Understanding High Availability - iSCSI/CIFS/NFS

2015-04-04 Thread Eric Eastman
You may want to look at the Clustered SCSI Target Using RBD Status Blueprint, Etherpad and video at: https://wiki.ceph.com/Planning/Blueprints/Hammer/Clustered_SCSI_target_using_RBD http://pad.ceph.com/p/I-scsi

Re: [ceph-users] umount stuck on NFS gateways switch over by using Pacemaker

2015-05-28 Thread Eric Eastman
On Thu, May 28, 2015 at 1:33 AM, wd_hw...@wistron.com wrote: Hello, I am testing NFS over RBD recently. I am trying to build the NFS HA environment under Ubuntu 14.04 for testing, and the packages version information as follows: - Ubuntu 14.04 : 3.13.0-32-generic(Ubuntu 14.04.2 LTS) -

[ceph-users] Ceph File System ACL Support

2015-08-16 Thread Eric Eastman
Hi, I need to verify whether, in Ceph v9.0.2, the kernel version of the Ceph file system supports ACLs while the libcephfs file system interface does not. I am trying to have SAMBA, version 4.3.0rc1, support Windows ACLs using vfs objects = acl_xattr with the SAMBA VFS Ceph file system interface vfs objects =
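
For reference, a sketch of the kind of share definition being tested, stacking the ACL mapping module in front of the CephFS VFS module (names are placeholders, not the exact configuration used here):

    [share]
        path = /
        # acl_xattr stores NT ACLs in xattrs; ceph passes file I/O through libcephfs
        vfs objects = acl_xattr ceph
        ceph:config_file = /etc/ceph/ceph.conf
        # commonly paired with acl_xattr so ACL inheritance information is preserved
        map acl inherit = yes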

Re: [ceph-users] Ceph File System ACL Support

2015-08-16 Thread Eric Eastman
On Sun, Aug 16, 2015 at 9:12 PM, Yan, Zheng uker...@gmail.com wrote: On Mon, Aug 17, 2015 at 9:38 AM, Eric Eastman eric.east...@keepertech.com wrote: Hi, I need to verify in Ceph v9.0.2 if the kernel version of Ceph file system supports ACLs and the libcephfs file system interface does

Re: [ceph-users] rbd-fuse Transport endpoint is not connected

2015-07-30 Thread Eric Eastman
It is great having access to features that are not fully production ready, but it would be nice to know which Ceph features are ready and which are not. Just as the Ceph file system is clearly marked as not yet fully ready for production, it would be nice if rbd-fuse could be marked as not

Re: [ceph-users] Weird behaviour of cephfs with samba

2015-07-27 Thread Eric Eastman
I don't have any answers but I am also seeing some strange results exporting a Ceph file system using the Samba VFS interface on Ceph version 9.0.2. If I mount a Linux client with vers=1, I see the file system the same as I see it on a ceph file system mount. If I use vers=2.0 or vers=3.0 on the

Re: [ceph-users] State of nfs-ganesha CEPH fsal

2015-07-27 Thread Eric Eastman
We are looking at using Ganesha NFS with the Ceph file system. Currently I am testing the FSAL interface on Ganesha NFS Release = V2.2.0-2 running on Ceph 9.0.2. This is all early work, as Ceph FS is still not considered production ready, and Ceph 9.0.2 is a development release. Currently I am
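
A minimal sketch of the kind of ganesha.conf export block used with the CEPH FSAL; the Export_ID and Pseudo path are arbitrary:

    EXPORT
    {
        Export_ID = 1;
        # path inside the Ceph file system to export
        Path = "/";
        # NFSv4 pseudo-root path seen by clients
        Pseudo = "/cephfs";
        Access_Type = RW;
        FSAL {
            Name = CEPH;
        }
    }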

Re: [ceph-users] Enclosure power failure pausing client IO till all connected hosts up

2015-07-23 Thread Eric Eastman
You may want to check your min_size value for your pools. If it is set to the pool size value, then the cluster will not do I/O if you lose a chassis. On Sun, Jul 5, 2015 at 11:04 PM, Mallikarjun Biradar mallikarjuna.bira...@gmail.com wrote: Hi all, Setup details: Two storage enclosures
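
A quick sketch of checking and, where appropriate, relaxing that setting; 'rbd' is just an example pool name, and lowering min_size trades durability guarantees for availability:

    ceph osd pool get rbd size       # number of replicas kept
    ceph osd pool get rbd min_size   # replicas that must be up before I/O is allowed
    ceph osd pool set rbd min_size 1 # allow I/O with a single surviving replica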

Re: [ceph-users] mds0: Client failing to respond to cache pressure

2015-07-14 Thread Eric Eastman
written, I will retest using the SAMBA VFS interface, followed by the kernel test. Please let me know if there is more info you need and if you want me to open a ticket. Best regards Eric On Mon, Jul 13, 2015 at 9:40 AM, Eric Eastman eric.east...@keepertech.com wrote: Thanks John. I will back

Re: [ceph-users] Ceph experiences

2015-07-18 Thread Eric Eastman
Congratulations on getting your cluster up and running. Many of us have seen the distribution issue on smaller clusters. More PGs and more OSDs help. A 100 OSD configuration balances better than a 12 OSD system. Ceph tries to protect your data, so a single full OSD shuts off writes. Ceph CRUSH

Re: [ceph-users] mds0: Client failing to respond to cache pressure

2015-07-13 Thread Eric Eastman
, Eric Eastman wrote: Hi John, I am seeing this problem with Ceph v9.0.1 with the v4.1 kernel on all nodes. This system is using 4 Ceph FS client systems. They all have the kernel driver version of CephFS loaded, but none are mounting the file system. All 4 clients are using the libcephfs VFS

Re: [ceph-users] mds0: Client failing to respond to cache pressure

2015-07-12 Thread Eric Eastman
Hi John, I am seeing this problem with Ceph v9.0.1 with the v4.1 kernel on all nodes. This system is using 4 Ceph FS client systems. They all have the kernel driver version of CephFS loaded, but none are mounting the file system. All 4 clients are using the libcephfs VFS interface to Ganesha NFS

Re: [ceph-users] mds0: Client failing to respond to cache pressure

2015-07-12 Thread Eric Eastman
In the last email, I stated the clients were not mounted using the ceph file system kernel driver. Re-checking the client systems, the file systems are mounted, but all the IO is going through Ganesha NFS using the ceph file system library interface. On Sun, Jul 12, 2015 at 9:02 PM, Eric Eastman

Re: [ceph-users] Read-out much slower than write-in on my ceph cluster

2015-10-28 Thread Eric Eastman
On the RBD performance issue, you may want to look at: http://tracker.ceph.com/issues/9192 Eric On Tue, Oct 27, 2015 at 8:59 PM, FaHui Lin wrote: > Dear Ceph experts, > > I found something strange about the performance of my Ceph cluster: Read-out > much slower than

[ceph-users] Ceph file system is not freeing space

2015-11-11 Thread Eric Eastman
I am trying to figure out why my Ceph file system is not freeing space. Using Ceph 9.1.0 I created a file system with snapshots enabled, filled up the file system over days while taking snapshots hourly. I then deleted all files and all snapshots, but Ceph is not returning the space. I left the

Re: [ceph-users] Ceph file system is not freeing space

2015-11-11 Thread Eric Eastman
On Wed, Nov 11, 2015 at 4:19 PM, John Spray wrote: > > Eric: for the ticket, can you also gather an MDS log (with debug mds = > 20) from the point where the MDS starts up until the point where it > has been active for a few seconds? The strays are evaluated for > purging
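
For anyone gathering the same log, a sketch of the usual ways to raise MDS debugging, assuming an MDS named mds.a (a placeholder):

    # in ceph.conf on the MDS node, takes effect after restarting the MDS
    [mds]
        debug mds = 20

    # or change it at runtime through the admin socket
    ceph daemon mds.a config set debug_mds 20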

Re: [ceph-users] Ceph file system is not freeing space

2015-11-11 Thread Eric Eastman
On Wed, Nov 11, 2015 at 11:09 AM, John Spray <jsp...@redhat.com> wrote: > On Wed, Nov 11, 2015 at 5:39 PM, Eric Eastman > <eric.east...@keepertech.com> wrote: >> I am trying to figure out why my Ceph file system is not freeing >> space. Using Ceph 9.1.0 I create

[ceph-users] Cannot delete ceph file system snapshots

2015-07-08 Thread Eric Eastman
Hi, I have created a ceph file system on a cluster running ceph v9.0.1 and have enabled snapshots with the command: ceph mds set allow_new_snaps true --yes-i-really-mean-it At the top level of the ceph file system, I can cd into the hidden .snap directory and I can create new directories with

Re: [ceph-users] Cannot delete ceph file system snapshots

2015-07-08 Thread Eric Eastman
Thank you! That was the solution. Eric On Wed, Jul 8, 2015 at 12:02 PM, Jan Schermer j...@schermer.cz wrote: Have you tried rmdir instead of rm -rf? Jan On 08 Jul 2015, at 19:17, Eric Eastman eric.east...@keepertech.com wrote: Hi, I have created a ceph file system on a cluster
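
For the archive, the working sequence looks roughly like this (mount point and snapshot name are placeholders):

    mkdir /cephfs/.snap/snap-2015-07-08   # create a snapshot of the tree
    ls /cephfs/.snap                      # list existing snapshots
    rmdir /cephfs/.snap/snap-2015-07-08   # delete it; rm -rf does not work on snapshot directories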

[ceph-users] Seeing huge number of open pipes per OSD process

2015-10-05 Thread Eric Eastman
I am testing a Ceph cluster running Ceph v9.0.3 on Trusty using the 4.3rc4 kernel and I am seeing a huge number of open pipes on my OSD processes as I run a sequential load on the system using a single Ceph file system client. A "lsof -n > file.txt" on one of the OSD servers produced a 9GB file

Re: [ceph-users] v10.0.4 released

2016-03-19 Thread Eric Eastman
Thank you for doing this. It will make testing 10.0.x easier for all of us in the field, and will make it easier to report bugs, as we will know that the problems we find were not caused by our build process. Eric On Wed, Mar 16, 2016 at 7:14 AM, Loic Dachary wrote: > Hi, >

Re: [ceph-users] Multiple MDSes

2016-04-22 Thread Eric Eastman
On Fri, Apr 22, 2016 at 9:59 PM, Andrus, Brian Contractor wrote: > All, > > Ok, I understand Jewel is considered stable for CephFS with a single active > MDS. > > But, how do I add a standby MDS? What documentation I find is a bit > confusing. > > I ran > > ceph-deploy create
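
On the question itself, a standby normally just needs a second MDS daemon to be running; a sketch assuming ceph-deploy and a placeholder node named mds2:

    ceph-deploy mds create mds2   # deploy an additional MDS daemon
    ceph mds stat                 # should now report one active MDS and one up:standby

While max_mds is left at 1, any additional MDS daemon registers itself as a standby automatically.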

Re: [ceph-users] CephFS + CTDB/Samba - MDS session timeout on lockfile

2016-05-11 Thread Eric Eastman
On Wed, May 11, 2016 at 2:04 AM, Nick Fisk <n...@fisk.me.uk> wrote: >> -Original Message----- >> From: Eric Eastman [mailto:eric.east...@keepertech.com] >> Sent: 10 May 2016 18:29 >> To: Nick Fisk <n...@fisk.me.uk> >> Cc: Ceph Users <ceph-user

[ceph-users] ACL support in Jewel using fuse and SAMBA

2016-05-06 Thread Eric Eastman
I was doing some SAMBA testing and noticed that a kernel-mounted share acted differently than a fuse-mounted share with Windows security on my Windows client. I cut my test down to be as simple as possible, and I am seeing the kernel-mounted Ceph file system working as expected with SAMBA and the
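
For anyone reproducing this, the two mount methods being compared look roughly like this; the monitor address and secret file are placeholders:

    # kernel client
    mount -t ceph 172.16.30.10:6789:/ /mnt/cephfs -o name=admin,secretfile=/etc/ceph/admin.secret
    # FUSE client
    ceph-fuse -m 172.16.30.10:6789 /mnt/cephfs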

Re: [ceph-users] ACL support in Jewel using fuse and SAMBA

2016-05-06 Thread Eric Eastman
t 9:53 AM, Eric Eastman > <eric.east...@keepertech.com> wrote: >> I was doing some SAMBA testing and noticed that a kernel mounted share >> acted differently then a fuse mounted share with Windows security on >> my windows client. I cut my test down to as simple as pos

Re: [ceph-users] ACL support in Jewel using fuse and SAMBA

2016-05-09 Thread Eric Eastman
On Mon, May 9, 2016 at 8:08 PM, Yan, Zheng <uker...@gmail.com> wrote: > On Tue, May 10, 2016 at 2:10 AM, Eric Eastman > <eric.east...@keepertech.com> wrote: >> On Mon, May 9, 2016 at 10:36 AM, Gregory Farnum <gfar...@redhat.com> wrote: >>> On Sa

Re: [ceph-users] CephFS + CTDB/Samba - MDS session timeout on lockfile

2016-05-10 Thread Eric Eastman
On Tue, May 10, 2016 at 6:48 AM, Nick Fisk <n...@fisk.me.uk> wrote: > > >> -Original Message- >> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of >> Nick Fisk >> Sent: 10 May 2016 13:30 >> To: 'Eric Eastman' <eric.east

Re: [ceph-users] ACL support in Jewel using fuse and SAMBA

2016-05-07 Thread Eric Eastman
On Fri, May 6, 2016 at 2:14 PM, Eric Eastman <eric.east...@keepertech.com> wrote: > As it should be working, I will increase the logging level in my > smb.conf file and see what info I can get out of the logs, and report back. Setting the log level = 20 in my smb.conf file, and t

Re: [ceph-users] ACL support in Jewel using fuse and SAMBA

2016-05-09 Thread Eric Eastman
On Mon, May 9, 2016 at 10:36 AM, Gregory Farnum <gfar...@redhat.com> wrote: > On Sat, May 7, 2016 at 9:53 PM, Eric Eastman > <eric.east...@keepertech.com> wrote: >> On Fri, May 6, 2016 at 2:14 PM, Eric Eastman >> <eric.east...@keepertech.com> wrote: >>

Re: [ceph-users] CephFS + CTDB/Samba - MDS session timeout on lockfile

2016-05-09 Thread Eric Eastman
I am trying to do some similar testing with SAMBA and CTDB with the Ceph file system. Are you using the vfs_ceph SAMBA module or are you kernel mounting the Ceph file system? Thanks Eric On Mon, May 9, 2016 at 9:31 AM, Nick Fisk wrote: > Hi All, > > I've been testing an

Re: [ceph-users] Recovering full OSD

2016-08-08 Thread Eric Eastman
Under Jewel 10.2.2 I have also had to delete PG directories to get very full OSDs to restart. I first use "du -sh *" under the "current" directory to find which PG directories are the fullest on the full OSD disk, and pick one of the fullest. I then look at the PG map and verify the PG is
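
A rough outline of those checks; the pg id and OSD number are placeholders, and removing PG data by hand is a last resort that should only follow confirmation that healthy copies exist on other OSDs:

    du -sh /var/lib/ceph/osd/ceph-12/current/* | sort -h | tail   # largest PG directories on the full OSD
    ceph pg map 3.1f    # confirm which OSDs the PG maps to besides this one
    ceph pg 3.1f query  # verify the remaining copies are clean before touching anything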

Re: [ceph-users] cephfs (rbd) read performance low - where is the bottleneck?

2016-11-20 Thread Eric Eastman
Have you looked at your file layout? On a test cluster running 10.2.3 I created a 5GB file and then looked at the layout: # ls -l test.dat -rw-r--r-- 1 root root 524288 Nov 20 23:09 test.dat # getfattr -n ceph.file.layout test.dat # file: test.dat ceph.file.layout="stripe_unit=4194304
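
Layouts are read and set through virtual xattrs, and a layout can only be changed on an empty file, so it has to be set before any data is written. A small sketch; file names and values are only examples:

    getfattr -n ceph.file.layout test.dat                        # show the current layout
    touch bigread.dat                                            # new, empty file
    setfattr -n ceph.file.layout.stripe_count -v 8 bigread.dat   # stripe across more objects
    setfattr -n ceph.file.layout.stripe_unit -v 1048576 bigread.dat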

Re: [ceph-users] Ceph file system hang

2017-06-15 Thread Eric Eastman
d file system, the file system should not hang. Thanks, Eric > > On Thu, Jun 15, 2017 at 12:39 PM Eric Eastman <eric.east...@keepertech.com> > wrote: >> >> We are running Ceph 10.2.7 and after adding a new multi-threaded >> writer application we are seeing hangs acces

Re: [ceph-users] Ceph file system hang

2017-06-16 Thread Eric Eastman
I have created a ticket on this issue: http://tracker.ceph.com/issues/20329 On Thu, Jun 15, 2017 at 12:14 PM, Eric Eastman <eric.east...@keepertech.com> wrote: > On Thu, Jun 15, 2017 at 11:45 AM, David Turner <drakonst...@gmail.com> wrote: >> Have you compared performance to

[ceph-users] Ceph file system hang

2017-06-15 Thread Eric Eastman
We are running Ceph 10.2.7 and after adding a new multi-threaded writer application we are seeing hangs accessing metadata from ceph file system kernel-mounted clients. I have a "du -ah /cephfs" process that has been stuck for over 12 hours on one cephfs client system. We started seeing hung "du
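
When chasing this kind of hang, a few commands that generally give a useful first look, assuming an MDS named mds.a (a placeholder):

    ceph daemon mds.a dump_ops_in_flight   # operations the MDS is currently blocked on
    ceph daemon mds.a session ls           # client sessions, including any failing to release caps
    ceph -s                                # overall cluster and MDS health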

[ceph-users] Is the StupidAllocator supported in Luminous?

2017-09-09 Thread Eric Eastman
I am seeing OOM issues with some of my OSD nodes that I am testing with Bluestore on 12.2.0, so I decided to try the StupidAllocator to see if it has a smaller memory footprint, by setting the following in my ceph.conf: bluefs_allocator = stupid bluestore_cache_size_hdd = 1073741824
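
For completeness, the ceph.conf fragment being tested looks roughly like this; the values are the ones quoted above, and whether they actually reduce memory use is exactly what is being measured:

    [osd]
        # try the stupid allocator for BlueFS
        bluefs_allocator = stupid
        # 1 GiB BlueStore cache per HDD-backed OSD
        bluestore_cache_size_hdd = 1073741824

After restarting an OSD, "ceph daemon osd.0 config get bluefs_allocator" should confirm the value took effect.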

Re: [ceph-users] Is the StupidAllocator supported in Luminous?

2017-09-09 Thread Eric Eastman
Opened: http://tracker.ceph.com/issues/21332 On Sat, Sep 9, 2017 at 10:03 PM, Gregory Farnum wrote: > Yes. Please open a ticket! > > >> ___ >> ceph-users mailing list >> ceph-users@lists.ceph.com >>

Re: [ceph-users] Ceph release cadence

2017-09-06 Thread Eric Eastman
I have been working with Ceph for the last several years and I help support multiple Ceph clusters. I would like to have the team drop the Even/Odd release schedule and go to an all-production release schedule. I would like releases on no more than a 9-month schedule, with smaller incremental

[ceph-users] Looking for help with debugging cephfs snapshots

2017-10-22 Thread Eric Eastman
With help from the list we recently recovered one of our Jewel based clusters that started failing when we got to about 4800 cephfs snapshots. We understand that cephfs snapshots are still marked experimental. We are running a single active MDS with 2 standby MDS. We only have a single file

Re: [ceph-users] Looking for help with debugging cephfs snapshots

2017-10-22 Thread Eric Eastman
On Sun, Oct 22, 2017 at 8:05 PM, Yan, Zheng <uker...@gmail.com> wrote: > On Mon, Oct 23, 2017 at 9:35 AM, Eric Eastman > <eric.east...@keepertech.com> wrote: > > With help from the list we recently recovered one of our Jewel based > > clusters that started failing wh