[ceph-users] Packages for Luminous RC 12.1.0?

2017-06-14 Thread Linh Vu
Hi all, I saw that Luminous RC 12.1.0 has been mentioned in the latest release notes here: http://docs.ceph.com/docs/master/release-notes/ However, I can't see any 12.1.0 package yet on http://download.ceph.com Does anyone have any idea when the packages will be available? Thanks 

Re: [ceph-users] HA Filesystem mode (MON, OSD, MDS) with Ceph and HA of MDS daemon.

2017-06-14 Thread David Turner
I've used the kernel client and the ceph-fuse driver for mapping the cephfs volume. I didn't notice any network hiccups while failing over, but I was reading large files during my tests (and live), and some caching may have hidden network hiccups for my use case. Going back to the memory
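
For context, the two clients mentioned here are mounted differently; a minimal sketch of both styles (monitor address, user name and paths are placeholders, not taken from this thread):

  # kernel client
  mount -t ceph 192.168.0.1:6789:/ /mnt/cephfs -o name=admin,secretfile=/etc/ceph/admin.secret
  # FUSE client
  ceph-fuse -m 192.168.0.1:6789 /mnt/cephfs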

Re: [ceph-users] HA Filesystem mode (MON, OSD, MDS) with Ceph and HA of MDS daemon.

2017-06-14 Thread Daniel Carrasco
It's strange, because on my test cluster (three nodes) with two nodes with OSDs, and all with MON and MDS, I've configured the size to 2 and min_size to 1. I've restarted all nodes one by one and the client loses the connection for about 5 seconds until it connects to another MDS. Are you using ceph client

Re: [ceph-users] HA Filesystem mode (MON, OSD, MDS) with Ceph and HA of MDS daemon.

2017-06-14 Thread David Turner
I have 3 ceph nodes, size 3, min_size 2, and I can restart them all 1 at a time to do ceph and kernel upgrades. The VMs running out of Ceph, the clients accessing MDS, etc. all keep working fine without any problem during these restarts. What is your full ceph configuration? There must be
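
For reference, the size/min_size values being compared in these replies can be listed per pool with the standard CLI; a minimal sketch (pool name "rbd" is only an example):

  ceph osd pool ls detail        # shows size and min_size for every pool
  ceph osd pool get rbd size     # query a single value on one pool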

[ceph-users] Directory size doesn't match contents

2017-06-14 Thread Bryan Stillwell
I have a cluster running 10.2.7 that is seeing some extremely large directory sizes in CephFS according to the recursive stats: $ ls -lhd Originals/ drwxrwxr-x 1 bryan bryan 16E Jun 13 13:27 Originals/ du reports a much smaller (and accurate) number: $ du -sh Originals/ 300G    Originals/
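
The 16E figure reported by ls comes from the CephFS recursive statistics; the underlying rstats can also be read directly as virtual extended attributes on the directory. A minimal sketch, assuming the ceph.dir.* xattr names exposed by the CephFS clients:

  getfattr -n ceph.dir.rbytes Originals/     # recursive byte count
  getfattr -n ceph.dir.rentries Originals/   # recursive file + directory count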

Re: [ceph-users] HA Filesystem mode (MON, OSD, MDS) with Ceph and HA of MDS daemon.

2017-06-14 Thread Daniel Carrasco
On 14 Jun 2017 at 10:08 p.m., "David Turner" wrote: Not just the min_size of your cephfs data pool, but also your cephfs_metadata pool. Both were at 1. I don't know why, because I don't remember having changed the min_size, and the cluster has had 3 OSDs from the beginning (I

Re: [ceph-users] HA Filesystem mode (MON, OSD, MDS) with Ceph and HA of MDS daemon.

2017-06-14 Thread David Turner
Not just the min_size of your cephfs data pool, but also your cephfs_metadata pool. On Wed, Jun 14, 2017 at 4:07 PM David Turner wrote: > Ceph recommends 1GB of RAM for every 1TB of OSD space. Your 2GB nodes are > definitely on the low end. 50GB OSDs... I don't know what
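
A minimal sketch of the check being suggested here (cephfs_data and cephfs_metadata are the usual default pool names and may differ):

  ceph osd pool get cephfs_data min_size
  ceph osd pool get cephfs_metadata min_size
  ceph osd pool set cephfs_metadata min_size 2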

Re: [ceph-users] HA Filesystem mode (MON, OSD, MDS) with Ceph and HA of MDS daemon.

2017-06-14 Thread David Turner
Ceph recommends 1GB of RAM for every 1TB of OSD space. Your 2GB nodes are definitely on the low end. 50GB OSDs... I don't know what that will require, but where you're running the mon and mds on the same node, I'd still say that 2GB is low. The Ceph OSD daemon using 1GB of RAM is not surprising,

Re: [ceph-users] HA Filesystem mode (MON, OSD, MDS) with Ceph and HA of MDS daemon.

2017-06-14 Thread Daniel Carrasco
Finally I've created three nodes, I've increased the size of the pools to 3, and I've created 3 MDS (active, standby, standby). Today the server decided to fail and I've noticed that failover is not working... The ceph -s command showed that everything was OK, but the clients weren't able to connect

Re: [ceph-users] Effect of tunables on client system load

2017-06-14 Thread Nathanial Byrnes
Thanks for the input David. I'm not sold on xenserver per se, but it is what we've been using for the past 7 years... Proxmox has been coming up a lot recently, I guess it is time to give it a look. I like the sound of directly using librbd. Regards, Nate On Wed, Jun 14, 2017 at 10:30 AM,

Re: [ceph-users] ceph pg repair : Error EACCES: access denied

2017-06-14 Thread Gregory Farnum
On Wed, Jun 14, 2017 at 4:08 AM Jake Grimmett wrote: > Hi Greg, > > Many thanks for your reply. > > I couldn't see an obvious cephx permission error, as I can issue other > admin commands from this node. > > However, I agree that this is probably the issue; disabling
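
For anyone hitting the same EACCES, the cephx capabilities of the client being used can be inspected before falling back to disabling auth; a minimal sketch, assuming the admin keyring is the one in use:

  ceph auth get client.admin   # shows the caps attached to that key
  ceph auth list               # review all entities and their caps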

Re: [ceph-users] Help build a drive reliability service!

2017-06-14 Thread David Turner
I understand concern over annoying drive manufacturers, but if you have data to back it up you aren't slandering a drive manufacturer. If they don't like the numbers that are found, then they should up their game or at least request that you put in how your tests negatively affected their drive

Re: [ceph-users] v11.2.0 Disk activation issue while booting

2017-06-14 Thread David Turner
Note, I am not certain that this follows the same for bluestore. I haven't set up bluestore osds yet. On Wed, Jun 14, 2017 at 11:48 AM David Turner wrote: > tl;dr to get a ceph journal to work with udev rules run this command > substituting your device name (/dev/sdb)

Re: [ceph-users] v11.2.0 Disk activation issue while booting

2017-06-14 Thread David Turner
tl;dr to get a ceph journal to work with udev rules run this command substituting your device name (/dev/sdb) and partition number used twice (=4). sgdisk /dev/sdb -t=4:45B0969E-9B03-4F30-B4C6-5EC00CEFF106 -c=4:'ceph journal' And this for your osds replacing the device name (/dev/sdc) and
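
The matching command for the OSD data partitions (truncated above) follows the same pattern; a sketch assuming the standard Ceph OSD data partition type GUID and partition 1 on /dev/sdc:

  sgdisk /dev/sdc -t=1:4FBD7E29-9D25-41B8-AFD0-062C0CEFF05D -c=1:'ceph data'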

Re: [ceph-users] Help build a drive reliability service!

2017-06-14 Thread Dan van der Ster
Hi Patrick, We've just discussed this internally and I wanted to share some notes. First, there are at least three separate efforts in our IT dept to collect and analyse SMART data -- it's clearly a popular idea and simple to implement, but this leads to repetition and begs for a common, good

Re: [ceph-users] purpose of ceph-mgr daemon

2017-06-14 Thread David Turner
And that is why I thought to respond. I was hoping for a few more specifics about what that daemon was going to do. Thanks, Gregory, I'm really excited for it! On Wed, Jun 14, 2017 at 11:05 AM Gregory Farnum wrote: > On Wed, Jun 14, 2017 at 7:52 AM David Turner

Re: [ceph-users] too few PGs per OSD (16 < min 30) but I set pool_default_pg_num: 300 in Ansible

2017-06-14 Thread David Turner
"143 active+clean 17 activating" Wait until all of the PG's finish activating and you should be good. Let's revisit your 160 PG's, though. If you had 128 PGs and 8TB of data in your pool, then you each PG would have about 62.5GB in size. Because you set it to 160 instead of a base 2 number,

Re: [ceph-users] purpose of ceph-mgr daemon

2017-06-14 Thread Gregory Farnum
On Wed, Jun 14, 2017 at 7:52 AM David Turner wrote: > In the Firefly release of Ceph, all OSDs asked the mons every question > about maps that they had. In the Jewel release, that burden was taken off > of the Mons and given to the OSDs as often as possible to ask each

Re: [ceph-users] too few PGs per OSD (16 < min 30) but I set pool_default_pg_num: 300 in Ansible

2017-06-14 Thread Stéphane Klein
And now: ceph status cluster 800221d2-4b8c-11e7-9bb9-cffc42889917 health HEALTH_OK monmap e1: 2 mons at {ceph-storage-rbx-1= 172.29.20.30:6789/0,ceph-storage-rbx-2=172.29.20.31:6789/0} election epoch 4, quorum 0,1 ceph-storage-rbx-1,ceph-storage-rbx-2 osdmap e21: 6

Re: [ceph-users] too few PGs per OSD (16 < min 30) but I set pool_default_pg_num: 300 in Ansible

2017-06-14 Thread David Turner
You increased your pg_num and it finished creating them "160 active+clean". Now you need to increase your pgp_num to match the 160 and you should be good to go. On Wed, Jun 14, 2017 at 10:57 AM Stéphane Klein wrote: > 2017-06-14 16:40 GMT+02:00 David Turner

Re: [ceph-users] too few PGs per OSD (16 < min 30) but I set pool_default_pg_num: 300 in Ansible

2017-06-14 Thread Stéphane Klein
And now: 2017-06-14 17:00 GMT+02:00 Stéphane Klein : > Ok, I missed: > > ceph osd pool set rbd pgp_num 160 > > Now I have: > > ceph status > cluster 800221d2-4b8c-11e7-9bb9-cffc42889917 > health HEALTH_ERR > 9 pgs are stuck inactive for more

Re: [ceph-users] too few PGs per OSD (16 < min 30) but I set pool_default_pg_num: 300 in Ansible

2017-06-14 Thread Stéphane Klein
Ok, I missed: ceph osd pool set rbd pgp_num 160 Now I have: ceph status cluster 800221d2-4b8c-11e7-9bb9-cffc42889917 health HEALTH_ERR 9 pgs are stuck inactive for more than 300 seconds 9 pgs stuck inactive 9 pgs stuck unclean monmap e1: 2

Re: [ceph-users] too few PGs per OSD (16 < min 30) but I set pool_default_pg_num: 300 in Ansible

2017-06-14 Thread Stéphane Klein
2017-06-14 16:40 GMT+02:00 David Turner : > Once those PG's have finished creating and the cluster is back to normal > How can I see the cluster migration progress? Now I have: # ceph status cluster 800221d2-4b8c-11e7-9bb9-cffc42889917 health HEALTH_WARN

Re: [ceph-users] purpose of ceph-mgr daemon

2017-06-14 Thread David Turner
In the Firefly release of Ceph, all OSDs asked the mons every question about maps that they had. In the Jewel release, that burden was taken off of the Mons and moved to the OSDs, which ask each other whenever possible. This was done because the Mons were found to be a bottleneck for growing a

Re: [ceph-users] too few PGs per OSD (16 < min 30) but I set pool_default_pg_num: 300 in Ansible

2017-06-14 Thread David Turner
A few things to note: it is recommended to have your PG count, per pool, be a base 2 value. Also, the number of PGs per OSD is an aggregate number across all of your pools. If you're planning to add 3 more pools for cephfs and other things, then you really want to be mindful of how many
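
A sketch of how that aggregate works out, assuming the replica size of 2 implied by the Ansible comment later in this thread and purely hypothetical extra pools:

  # PGs per OSD ≈ sum over pools of (pg_num * size) / number of OSDs
  # one pool   : (160 * 2) / 6           ≈ 53 PGs per OSD
  # four pools : (160*2 + 3 * 64*2) / 6  ≈ 117 PGs per OSD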

Re: [ceph-users] too few PGs per OSD (16 < min 30) but I set pool_default_pg_num: 300 in Ansible

2017-06-14 Thread Jean-Charles LOPEZ
Hi, see comments below. JC > On Jun 14, 2017, at 07:23, Stéphane Klein wrote: > > Hi, > > I have this parameter in my Ansible configuration: > > pool_default_pg_num: 300 # (100 * 6) / 2 = 300 > > But I have this error: > > # ceph status > cluster

Re: [ceph-users] Effect of tunables on client system load

2017-06-14 Thread David Turner
I don't know if you're sold on Xen and only Xen, but I've been running a 3 node cluster hyper-converged on a 4 node Proxmox cluster for my home projects. 3 of the nodes are running on Proxmox with Ceph OSDs, Mon, and MDS daemons. The fourth node is a much beefier system handling the majority of

Re: [ceph-users] purpose of ceph-mgr daemon

2017-06-14 Thread John Spray
On Wed, Jun 14, 2017 at 6:47 AM, Manuel Lausch wrote: > Hi, > > we decided to test a bit the upcoming ceph release (luminous). It seems > that I need to install this ceph-mgr daemon as well. But I don't > understand exactly why I need this service and what I can do with

[ceph-users] too few PGs per OSD (16 < min 30) but I set pool_default_pg_num: 300 in Ansible

2017-06-14 Thread Stéphane Klein
Hi, I have this parameter in my Ansible configuration: pool_default_pg_num: 300 # (100 * 6) / 2 = 300 But I have this error: # ceph status cluster 800221d2-4b8c-11e7-9bb9-cffc42889917 health HEALTH_ERR 73 pgs are stuck inactive for more than 300 seconds 22 pgs
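
If the pool was created before the Ansible default took effect, the live values can be checked and raised on the existing pool; a minimal sketch using the rbd pool from this thread:

  ceph osd pool get rbd pg_num
  ceph osd pool get rbd pgp_num
  ceph osd pool set rbd pg_num 160    # pgp_num has to be raised afterwards as well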

Re: [ceph-users] Sparse file info in filestore not propagated to other OSDs

2017-06-14 Thread Sage Weil
On Wed, 14 Jun 2017, Paweł Sadowski wrote: > On 04/13/2017 04:23 PM, Piotr Dałek wrote: > > On 04/06/2017 03:25 PM, Sage Weil wrote: > >> On Thu, 6 Apr 2017, Piotr Dałek wrote: > >>> Hello, > >>> > >>> We recently had an interesting issue with RBD images and filestore > >>> on Jewel > >>> 10.2.5:

Re: [ceph-users] ceph pg repair : Error EACCES: access denied

2017-06-14 Thread Jake Grimmett
Hi Greg, Many thanks for your reply. I couldn't see an obvious cephx permission error, as I can issue other admin commands from this node. However, I agree that this is probably the issue; disabling cephx (auth to none) enabled me to repair the pgs, and gain a clean HEALTH report. thanks
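
For reference, "auth to none" refers to the three auth settings in ceph.conf; a sketch of the relevant section (disabling cephx has obvious security implications and was only a workaround here):

  [global]
  auth_cluster_required = none
  auth_service_required = none
  auth_client_required = none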

[ceph-users] purpose of ceph-mgr daemon

2017-06-14 Thread Manuel Lausch
Hi, we decided to test the upcoming ceph release (luminous) a bit. It seems that I need to install this ceph-mgr daemon as well. But I don't understand exactly why I need this service and what I can do with it. The ceph Cluster is working well without installing any manager daemon. However in

[ceph-users] radosgw: scrub causing slow requests in the md log

2017-06-14 Thread Dan van der Ster
Dear ceph users, Today we had O(100) slow requests which were caused by deep-scrubbing of the metadata log: 2017-06-14 11:07:55.373184 osd.155 [2001:1458:301:24::100:d]:6837/3817268 7387 : cluster [INF] 24.1d deep-scrub starts ... 2017-06-14 11:22:04.143903 osd.155

Re: [ceph-users] v11.2.0 Disk activation issue while booting

2017-06-14 Thread nokia ceph
Hello David, Thanks for the update. http://tracker.ceph.com/issues/13833#note-7 - According to this tracker, the GUID may differ, which causes udev to fail to chown the device to the ceph user. We are following the procedure below to create OSDs: #sgdisk -Z /dev/sdb #ceph-disk prepare --bluestore
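
The partition type GUID that the udev rules match on can be checked directly; a minimal sketch, assuming partition 1 on /dev/sdb:

  sgdisk -i 1 /dev/sdb    # "Partition GUID code" should be one of the Ceph OSD/journal type GUIDs the udev rules expect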

[ceph-users] Integrating ceph with openstack with cephx disabled

2017-06-14 Thread Tzachi Strul
Hi, We have a ceph cluster that we want to integrate with openstack. We disabled cephx. We noticed that when we integrate ceph with libvirt it doesn't work unless we use the client.cinder.key when we import secret.xml. Are we doing something wrong, or is it impossible to implement without cephx enabled?
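
For reference, the usual libvirt integration defines a secret and attaches the Ceph key to it, roughly as below (UUID and key path are placeholders); with cephx fully disabled, the secret/key step is normally what should become unnecessary:

  virsh secret-define --file secret.xml
  virsh secret-set-value --secret <uuid-from-secret.xml> --base64 $(cat client.cinder.key)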

Re: [ceph-users] Sparse file info in filestore not propagated to other OSDs

2017-06-14 Thread Paweł Sadowski
On 04/13/2017 04:23 PM, Piotr Dałek wrote: > On 04/06/2017 03:25 PM, Sage Weil wrote: >> On Thu, 6 Apr 2017, Piotr Dałek wrote: >>> Hello, >>> >>> We recently had an interesting issue with RBD images and filestore >>> on Jewel >>> 10.2.5: >>> We have a pool with RBD images, all of them mostly