Re: [ceph-users] New user on Ubuntu 16.04

2016-09-08 Thread Alex Evonosky
Disregard -- found the issue: the remote hostname was not matching the local hostname. Thank you. On Thu, Sep 8, 2016 at 10:26 PM, Alex Evonosky wrote: > Hey group- > > I am a new CEPH user on Ubuntu and noticed this when creating a brand new > monitor following
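A quick sanity check for this class of problem (a minimal sketch, assuming the monitor host is alex-desktop as in this thread and that ceph.conf lives in the ceph-deploy working directory, ~/ceph here):

  $ hostname -s
  alex-desktop
  $ grep -E '^mon_initial_members|^mon_host' ~/ceph/ceph.conf
  mon_initial_members = alex-desktop
  mon_host = <monitor ip>

ceph-deploy expects the short hostname reported by the node to match the monitor name it was given; if they differ, fix /etc/hostname (or redeploy with the right name) before re-running "ceph-deploy mon create".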

Re: [ceph-users] Ceph-deploy not creating osd's

2016-09-08 Thread Shain Miley
I ended up starting from scratch and doing a purge and purgedata on that host using ceph-deploy; after that, things seemed to go better. The OSD is up and in at this point; however, when the OSD was added to the cluster...no data was being moved to the new OSD. Here is a copy of my current crush
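(A common reason for "no data moved" is the new OSD coming up with a CRUSH weight of 0; a quick check, as a sketch, assuming the new OSD is osd.107 and a roughly 4 TB disk:)

  $ ceph osd tree | grep -w osd.107        # look at the WEIGHT column
  $ ceph osd crush reweight osd.107 3.64   # weight is conventionally the size in TB

With a weight of 0 no PGs map to the OSD, so it sits "up and in" but stays empty.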

[ceph-users] New user on Ubuntu 16.04

2016-09-08 Thread Alex Evonosky
Hey group- I am a new CEPH user on Ubuntu and noticed this when creating a brand new monitor following the documentation: storage@alex-desktop:~/ceph$ ceph-deploy --overwrite-conf mon create alex-desktop [ceph_deploy.conf][DEBUG ] found configuration file at: /home/storage/.cephdeploy.conf

[ceph-users] Memory leak with latest ceph code

2016-09-08 Thread Zhiyuan Wang
Hi guys, I am testing the performance of the BlueStore backend on a PCIe SSD with Ceph Jewel 10.2.2, and I found that the 4K random write performance is bad because of large write amplification. After I saw some recent changelog entries related to this issue, I upgraded to the latest code, and I was

Re: [ceph-users] non-effective new deep scrub interval

2016-09-08 Thread Christian Balzer
Hello, On Thu, 8 Sep 2016 17:09:27 +0200 (CEST) David DELON wrote: > > First, thanks for your answer Christian. > It's nothing. > - On 8 Sep 16, at 13:30, Christian Balzer ch...@gol.com wrote: > > > Hello, > > > > On Thu, 8 Sep 2016 09:48:46 +0200 (CEST) David DELON wrote: > > > >> >

Re: [ceph-users] FW: Multiple public networks and ceph-mon daemons listening

2016-09-08 Thread Jim Kilborn
Thanks for the clarification Greg. The private network was a NAT network, but I got rid of the NAT and set the head node to straight routing. I went ahead and set all the daemons to the private network, and it's working fine now. I was hoping to avoid routing the outside traffic, but no big

Re: [ceph-users] Client XXX failing to respond to cache pressure

2016-09-08 Thread Gregory Farnum
On Thu, Sep 8, 2016 at 5:59 AM, Georgi Chorbadzhiyski wrote: > Today I was surprised to find our cluster in HEALTH_WARN condition and > searching in documentation was no help at all. > > Does anybody have an idea how to cure the dreaded "failing to respond > to

Re: [ceph-users] PGs lost from cephfs data pool, how to determine which files to restore from backup?

2016-09-08 Thread John Spray
On Thu, Sep 8, 2016 at 3:42 PM, John Spray wrote: > On Thu, Sep 8, 2016 at 2:06 AM, Gregory Farnum wrote: >> On Wed, Sep 7, 2016 at 7:44 AM, Michael Sudnick >> wrote: >>> I've had to force recreate some PGs on my cephfs data pool

Re: [ceph-users] FW: Multiple public networks and ceph-mon daemons listening

2016-09-08 Thread Gregory Farnum
On Thu, Sep 8, 2016 at 7:13 AM, Jim Kilborn wrote: > Thanks for the reply. > > > > When I said the compute nodes mounted the cephfs volume, I am referring to a > real linux cluster of physical machines. OpenStack VM/compute nodes are not > involved in my setup. We are

Re: [ceph-users] ceph OSD with 95% full

2016-09-08 Thread Ronny Aasen
ceph-dash is VERY easy to set up and get working: https://github.com/Crapworks/ceph-dash It gives you a nice webpage to observe manually. The page is also easily read by any alerting software you might have, and you should configure it to alert on anything besides HEALTH_OK. Kind regards
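If all you want is a dumb alert on anything besides HEALTH_OK, a cron job is enough (a sketch, assuming local mail delivery works; swap in your own alerting hook):

  #!/bin/sh
  # mail root whenever the cluster is not HEALTH_OK
  status=$(ceph health 2>&1)
  echo "$status" | grep -q '^HEALTH_OK' || \
      echo "$status" | mail -s "ceph cluster not healthy" root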

Re: [ceph-users] CephFS and calculation of directory size

2016-09-08 Thread John Spray
On Thu, Sep 8, 2016 at 6:59 PM, Ilya Moldovan wrote: > Hello! > > How does CephFS calculate the directory size? As far as I know there are two > implementations: > > 1. Recursive directory traversal like in EXT4 and NTFS > 2. Calculation of the directory size by the file system driver

[ceph-users] Ceph-deploy not creating osd's

2016-09-08 Thread Shain Miley
Hello, I am trying to use ceph-deploy to add some new OSDs to our cluster. I have used this method over the last few years to add all of our 107 OSDs and things have seemed to work quite well. One difference this time is that we are going to use a PCIe NVMe card to journal the 16 disks in
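(For reference, the ceph-deploy form for a shared journal device is HOST:DATA:JOURNAL; a sketch with made-up device names, one pre-created NVMe partition per OSD:)

  $ ceph-deploy osd create osdhost:/dev/sdb:/dev/nvme0n1p1
  $ ceph-deploy osd create osdhost:/dev/sdc:/dev/nvme0n1p2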

[ceph-users] CephFS and calculation of directory size

2016-09-08 Thread Ilya Moldovan
Hello! How does CephFS calculate the directory size? As far as I know there are two implementations: 1. Recursive directory traversal, like in EXT4 and NTFS 2. Calculation of the directory size by the file system driver, which saves it as an attribute. In this case, the driver catches adding, deleting and
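(CephFS is closer to the second approach: the MDS keeps recursive statistics per directory, visible as virtual xattrs on a mounted filesystem. A quick look, assuming a mount at /mnt/cephfs:)

  $ getfattr -n ceph.dir.rbytes   /mnt/cephfs/some/dir   # recursive size in bytes
  $ getfattr -n ceph.dir.rfiles   /mnt/cephfs/some/dir   # recursive file count
  $ getfattr -n ceph.dir.rsubdirs /mnt/cephfs/some/dir   # recursive subdir count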

Re: [ceph-users] Cannot start the Ceph daemons using upstart after upgrading to Jewel 10.2.2

2016-09-08 Thread David
AFAIK, the daemons are managed by systemd now on most distros, e.g.: systemctl start ceph-osd@0.service On Thu, Sep 8, 2016 at 3:36 PM, Simion Marius Rad wrote: > Hello, > > Today I upgraded an Infernalis 9.2.1 cluster to Jewel 10.2.2. > All went well until I wanted to
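(The usual invocations on a systemd distro, as a sketch; substitute your own daemon IDs:)

  # systemctl start ceph-osd@0
  # systemctl start ceph-mon@$(hostname -s)
  # systemctl start ceph.target            # everything configured on this host
  # systemctl list-units 'ceph*'           # see what systemd knows about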

Re: [ceph-users] Excluding buckets in RGW Multi-Site Sync

2016-09-08 Thread Casey Bodley
On 09/08/2016 08:35 AM, Wido den Hollander wrote: Hi, I've been setting up a RGW Multi-Site [0] configuration in 6 VMs. 3 VMs per cluster and one RGW per cluster. Works just fine, I can create a user in the master zone, create buckets and upload data using s3cmd (S3). What I see is that

Re: [ceph-users] non-effective new deep scrub interval

2016-09-08 Thread David DELON
First, thanks for your answer, Christian. - On 8 Sep 16, at 13:30, Christian Balzer ch...@gol.com wrote: > Hello, > > On Thu, 8 Sep 2016 09:48:46 +0200 (CEST) David DELON wrote: > >> >> Hello, >> >> I'm using Ceph Jewel. >> I would like to schedule the deep scrub operations on my own.

Re: [ceph-users] PGs lost from cephfs data pool, how to determine which files to restore from backup?

2016-09-08 Thread John Spray
On Thu, Sep 8, 2016 at 2:06 AM, Gregory Farnum wrote: > On Wed, Sep 7, 2016 at 7:44 AM, Michael Sudnick > wrote: >> I've had to force recreate some PGs on my cephfs data pool due to some >> cascading disk failures in my homelab cluster. Is there a

Re: [ceph-users] Bluestore crashes

2016-09-08 Thread thomas.swindells
You are right, we are running ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374) at the moment. If master has had substantial work done on it, then it sounds like it is worth retesting on that. I fully appreciate that it is still a work in progress and that future performance may

[ceph-users] Cannot start the Ceph daemons using upstart after upgrading to Jewel 10.2.2

2016-09-08 Thread Simion Marius Rad
Hello, Today I upgraded an Infernalis 9.2.1 cluster to Jewel 10.2.2. All went well until I wanted to restart the daemons using upstart (initctl). Any upstart invocation fails to start the daemons. In order to keep the cluster up, I started the daemons by myself using the commands invoked usually

Re: [ceph-users] Jewel 10.2.2 - Error when flushing journal

2016-09-08 Thread Alexey Sheplyakov
Hi, > root@:~# ceph-osd -i 12 --flush-journal > SG_IO: questionable sense data, results may be incorrect > SG_IO: questionable sense data, results may be incorrect As far as I understand, these lines are an hdparm warning (the OSD uses the hdparm command to query the journal device's write cache state). The
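(You can reproduce what the OSD is asking for by querying the journal device's write cache state yourself; a sketch, assuming the journal sits on /dev/sdb:)

  # hdparm -W /dev/sdb
  /dev/sdb:
   write-caching =  1 (on)

Devices that don't pass ATA commands through cleanly (RAID controllers, some SANs or USB bridges) tend to produce the same "questionable sense data" warning.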

Re: [ceph-users] Bluestore crashes

2016-09-08 Thread Mark Nelson
It's important to keep in mind that bluestore is still rapidly being developed. At any given commit it might crash, eat data, be horribly slow, destroy your computer, etc. It's very much wild west territory. Jewel's version of bluestore is quite different than what is in master right now.

Re: [ceph-users] Bluestore crashes

2016-09-08 Thread Wido den Hollander
> On 8 September 2016 at 14:58, thomas.swinde...@yahoo.com wrote: > > > We've been doing some performance testing on Bluestore to see whether it > could be viable to use in the future. > The good news is we are seeing significant performance improvements when using it, > so thank you for all the

Re: [ceph-users] FW: Multiple public networks and ceph-mon daemons listening

2016-09-08 Thread Wido den Hollander
> On 8 September 2016 at 15:02, Jim Kilborn wrote: > > > Hello all… > > I am setting up a ceph cluster (jewel) on a private network. The compute > nodes are all running CentOS 7 and mounting the cephfs volume using the > kernel driver. The ceph storage nodes are dual

[ceph-users] FW: Multiple public networks and ceph-mon daemons listening

2016-09-08 Thread Jim Kilborn
Hello all… I am setting up a ceph cluster (jewel) on a private network. The compute nodes are all running CentOS 7 and mounting the cephfs volume using the kernel driver. The ceph storage nodes are dual-connected to the private network as well as our corporate network, as some users need to
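(For context, the two networks a Ceph cluster knows about are configured roughly like this; a sketch, subnets made up:)

  [global]
  public network  = 10.10.0.0/24    # clients, MONs, and OSD front-side traffic
  cluster network = 10.20.0.0/24    # OSD replication and heartbeat traffic

A monitor itself only binds to the single address recorded in the monmap, which is what the rest of this thread hinges on.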

[ceph-users] Client XXX failing to respond to cache pressure

2016-09-08 Thread Georgi Chorbadzhiyski
Today I was surprised to find our cluster in HEALTH_WARN condition, and searching the documentation was no help at all. Does anybody have an idea how to cure the dreaded "failing to respond to cache pressure" message? As I understand it, it tells me that a client is not responding to an MDS request to
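(The MDS-side knob involved, as a sketch; the Jewel default is shown, and it is counted in inodes, so raising it costs MDS memory:)

  [mds]
  mds cache size = 100000

The warning fires when a client holds more cached entries than the MDS wants it to and doesn't release them when asked, so the fix can also be on the client side (remount, or a newer client).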

[ceph-users] Bluestore crashes

2016-09-08 Thread thomas.swindells
We've been doing some performance testing on Bluestore to see whether it could be viable to use in the future. The good news is we are seeing significant performance improvements when using it, so thank you for all the work that has gone into it. The bad news is we keep encountering crashes and

[ceph-users] Excluding buckets in RGW Multi-Site Sync

2016-09-08 Thread Wido den Hollander
Hi, I've been setting up a RGW Multi-Site [0] configuration in 6 VMs. 3 VMs per cluster and one RGW per cluster. Works just fine, I can create a user in the master zone, create buckets and upload data using s3cmd (S3). What I see is that ALL data is synced between the two zones. While I
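(Handy while testing a setup like this; a sketch, run with the usual admin keyring on either cluster:)

  # radosgw-admin sync status
  # radosgw-admin data sync status --source-zone=<other zone>

That reports whether metadata and data are caught up between the zones; whether individual buckets can be excluded from the sync is exactly the open question here.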

[ceph-users] rgw meta pool

2016-09-08 Thread Pavan Rallabhandi
Trying it one more time on the users list. In our clusters running Jewel 10.2.2, I see the default.rgw.meta pool running into a large number of objects, potentially in the same range as the number of objects contained in the data pool. I understand that the immutable metadata entries are now stored in this heap
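(A quick way to watch the per-pool object counts over time, as a sketch with default Jewel multisite pool names:)

  # rados df | grep -E 'default.rgw.meta|default.rgw.buckets.data'
  # ceph df detail

Both list object counts per pool, which makes it easy to confirm whether default.rgw.meta keeps pace with the data pool.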

Re: [ceph-users] non-effective new deep scrub interval

2016-09-08 Thread Christian Balzer
Hello, On Thu, 8 Sep 2016 09:48:46 +0200 (CEST) David DELON wrote: > > Hello, > > I'm using Ceph Jewel. > I would like to schedule the deep scrub operations on my own. Welcome to the club; alas, the ride isn't for the faint of heart. You will want to (re-)search the ML archive (google)

Re: [ceph-users] experiences in upgrading Infernalis to Jewel

2016-09-08 Thread Arvydas Opulskis
Hi, if you are using RGW, you may experience problems similar to ours when creating a bucket. You'll find what went wrong and how we solved it in my older email; the subject of the topic is "Can't create bucket (ERROR: endpoints not configured for upstream zone)". Cheers, Arvydas On Thu, Sep 8, 2016 at

Re: [ceph-users] experiences in upgrading Infernalis to Jewel

2016-09-08 Thread felderm
Thanks Alexandre!! We plan to proceed as follows for upgrading Infernalis to Jewel:

Enable the repo on all nodes:
  # cat /etc/apt/sources.list.d/ceph_com_jewel.list
  deb http://download.ceph.com/debian-jewel trusty main

For all monitors (one after the other):
1) sudo apt-get update && sudo apt-get
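(For what it's worth, the generic per-host sequence from the upgrade notes looks roughly like the sketch below; this is not necessarily the exact plan being discussed, and on trusty the daemons are still driven by upstart:)

  sudo apt-get update && sudo apt-get install ceph ceph-mds radosgw
  sudo restart ceph-mon id=$(hostname -s)   # monitors first, one at a time
  ceph -s                                   # wait for quorum/HEALTH_OK before the next one
  # then, OSD host by OSD host:
  sudo restart ceph-osd-all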

[ceph-users] non-effective new deep scrub interval

2016-09-08 Thread David DELON
Hello, I'm using Ceph Jewel. I would like to schedule the deep scrub operations on my own. First of all, I have tried to change the interval value to 30 days. In each /etc/ceph/ceph.conf, I have added:

  [osd]
  # 30*24*3600
  osd deep scrub interval = 2592000

I have restarted all the OSD
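(It is worth confirming that the running OSDs actually picked the value up, since a typo or a section mismatch silently keeps the default; a sketch:)

  # ceph daemon osd.0 config get osd_deep_scrub_interval             # on the host running osd.0, via its admin socket
  # ceph tell osd.* injectargs '--osd_deep_scrub_interval 2592000'   # from an admin node, no restart needed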

Re: [ceph-users] rados bench output question

2016-09-08 Thread mj
Hi Christian, Thanks a lot for all your information! (Especially the bit that ceph never reads from the journal but writes to the OSDs from memory; that was new to me.) MJ On 09/07/2016 03:20 AM, Christian Balzer wrote: hello, On Tue, 6 Sep 2016 13:38:45 +0200 lists wrote: Hi Christian, Thanks

Re: [ceph-users] Scrub and deep-scrub repeating over and over

2016-09-08 Thread Arvydas Opulskis
Hi Goncalo, there it is: # ceph pg 11.34a query { "state": "active+clean+scrubbing", "snap_trimq": "[]", "epoch": 6547, "up": [ 24, 3 ], "acting": [ 24, 3 ], "actingbackfill": [ "3", "24" ], "info": {

Re: [ceph-users] Scrub and deep-scrub repeating over and over

2016-09-08 Thread Goncalo Borges
Can you please share the result of "ceph pg 11.34a query"? On 09/08/2016 05:03 PM, Arvydas Opulskis wrote: 2016-09-08 08:45:01.441945 osd.24 [INF] 11.34a scrub starts 2016-09-08 08:45:03.585039 osd.24 [INF] 11.34a scrub ok -- Goncalo Borges Research Computing ARC Centre of Excellence for

[ceph-users] Scrub and deep-scrub repeating over and over

2016-09-08 Thread Arvydas Opulskis
Hi all, we have several PGs with repeating scrub tasks. As soon as a scrub is complete, it starts again. You can get an idea from the log below: $ ceph -w | grep -i "11.34a" 2016-09-08 08:28:33.346798 osd.24 [INF] 11.34a scrub ok 2016-09-08 08:28:37.319018 osd.24 [INF] 11.34a scrub starts
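(Alongside the pg query Goncalo asked for, the scrub stamps show whether each completed scrub is actually being recorded; a sketch:)

  $ ceph pg dump pgs | grep '^11\.34a'
  # compare the last_scrub / last_deep_scrub stamp columns between runs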

Re: [ceph-users] PGs lost from cephfs data pool, how to determine which files to restore from backup?

2016-09-08 Thread Goncalo Borges
Hi Greg... I've had to force recreate some PGs on my cephfs data pool due to some cascading disk failures in my homelab cluster. Is there a way to easily determine which files I need to restore from backup? My metadata pool is completely intact. Assuming you're on Jewel, run a recursive
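(Background that makes any such scan possible: a file's objects in the data pool are named <inode in hex>.<object index>, so a given file can be checked against the recreated PGs; a rough sketch, assuming the data pool is called cephfs_data:)

  ino=$(printf '%x' $(stat -c %i /mnt/cephfs/path/to/file))
  ceph osd map cephfs_data ${ino}.00000000   # PG the file's first object maps to
  # larger files have further objects (…00000001, …) that may land in other PGs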