[ceph-users] iSCSI Lun issue after MON Out Of Memory

2016-11-14 Thread Daleep Singh Bais
Hello friends, I had RBD images mapped to a Windows client through iSCSI; however, the MON went out of memory for some unknown reason. After rebooting the MON, I am able to mount one of the images/iSCSI LUNs back on the client, but the second image, when mapped, shows up as unallocated on the Windows client. I have data on that
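As a first triage step (a sketch, not from the thread; pool and image names are placeholders), it is worth confirming the second RBD image itself is still intact before suspecting the iSCSI layer:

    # verify the image still exists and its size/features look sane
    rbd info rbd/win-lun1
    # check whether the image is currently mapped on the iSCSI gateway host
    rbd showmapped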

[ceph-users] radosgw sync_user() failed

2016-11-14 Thread William Josefsson
Hi all, I got these error messages daily on radosgw for multiple users: 2016-11-12 13:49:08.905114 7fbba7fff700 20 RGWUserStatsCache: sync user=myuserid1 2016-11-12 13:49:08.905956 7fbba7fff700 0 ERROR: can't read user header: ret=-2 2016-11-12 13:49:08.905978 7fbba7fff700 0 ERROR: sync_user()
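A hedged way to check whether the user's stats header simply has not been written yet (not from the thread; the uid is the one from the log above):

    # show the user's metadata and current state
    radosgw-admin user info --uid=myuserid1
    # force a stats sync, which (re)creates the user stats header
    radosgw-admin user stats --uid=myuserid1 --sync-stats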

Re: [ceph-users] ceph-mon not starting on system startup (Ubuntu 16.04 / systemd)

2016-11-14 Thread Craig Chi
Hi, What's your Ceph version? I am using Jewel 10.2.3 and systemd seems to work normally. I deployed Ceph by ansible, too. You can check whether you have /lib/systemd/system/ceph-mon.target file. I believe it was a bug existing in 10.2.1 before cfa2d0a08a0bcd0fac153041b9eff17cb6f7c9af has been
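For reference, a quick way to check for the target unit Craig mentions and make sure both the target and the mon instance are enabled (hostname is a placeholder):

    ls /lib/systemd/system/ceph-mon.target
    systemctl is-enabled ceph-mon.target
    systemctl enable ceph-mon.target
    systemctl enable ceph-mon@$(hostname -s)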

Re: [ceph-users] ceph cluster having blocked requests very frequently

2016-11-14 Thread Chris Taylor
Maybe a long shot, but have you checked OSD memory usage? Are the OSD hosts low on RAM and swapping to disk? I am not familiar with your issue, but thought that might cause it. Chris On 2016-11-14 3:29 pm, Brad Hubbard wrote: > Have you looked for clues in the output of dump_historic_ops
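A minimal check for the memory/swap theory on an OSD host (a sketch):

    free -h                          # low free memory / swap in use?
    vmstat 1 5                       # si/so columns show active swapping
    ps -C ceph-osd -o pid,rss,cmd    # resident memory per OSD daemon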

Re: [ceph-users] Standby-replay mds: 10.2.2

2016-11-14 Thread Goncalo Borges
Hi John... Thanks for replying. Some of the requested input is inline. Cheers Goncalo We are currently undergoing an infrastructure migration. One of the first machines to go through this migration process is our standby-replay mds. We are running 10.2.2. My plan is to: Is the 10.2.2

Re: [ceph-users] ceph cluster having blocked requests very frequently

2016-11-14 Thread Brad Hubbard
Have you looked for clues in the output of dump_historic_ops ? On Tue, Nov 15, 2016 at 1:45 AM, Thomas Danan wrote: > Thanks Luis, > > > > Here are some answers …. > > > > Journals are not on SSD and collocated with OSD daemons host. > > We look at the disk
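For anyone following along, dump_historic_ops is read from the OSD admin socket; a sketch (osd id and socket path are examples using the default layout):

    ceph daemon osd.12 dump_historic_ops
    # or directly via the admin socket
    ceph --admin-daemon /var/run/ceph/ceph-osd.12.asok dump_historic_ops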

Re: [ceph-users] 4.8 kernel cephfs issue reading old filesystems

2016-11-14 Thread Ilya Dryomov
On Mon, Nov 14, 2016 at 10:05 PM, John Spray wrote: > Hi folks, > > For those with cephfs filesystems created using older versions of > Ceph, you may be affected by this issue if you try to access your > filesystem using the 4.8 or 4.9-rc kernels: >

[ceph-users] 4.8 kernel cephfs issue reading old filesystems

2016-11-14 Thread John Spray
Hi folks, For those with cephfs filesystems created using older versions of Ceph, you may be affected by this issue if you try to access your filesystem using the 4.8 or 4.9-rc kernels: http://tracker.ceph.com/issues/17825 If your data pool does not have ID 0 then you don't need to worry. If
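A quick way to check whether your cephfs data pool has ID 0 (a sketch; output format varies slightly by version):

    ceph fs ls          # shows the data pool name(s) for each filesystem
    ceph osd lspools    # lists pool IDs and names; check whether your data pool is ID 0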

Re: [ceph-users] effect of changing ceph osd primary affinity

2016-11-14 Thread Ilya Dryomov
On Mon, Nov 14, 2016 at 9:38 AM, Ridwan Rashid Noel wrote: > Hi Ilya, > > I tried to test the primary-affinity change so I have setup a small cluster > to test. I am trying to understand how the different components of Ceph > interacts in the event of change of
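For context, primary affinity is set per OSD; a minimal example (osd id and weight are placeholders, and "mon osd allow primary affinity = true" must be set on the mons):

    ceph osd primary-affinity osd.3 0.5     # value between 0.0 and 1.0
    ceph osd dump | grep primary_affinity   # the new value should appear on that osd's line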

Re: [ceph-users] rgw print continue and civetweb

2016-11-14 Thread Yehuda Sadeh-Weinraub
On Mon, Nov 14, 2016 at 9:20 AM, Brian Andrus wrote: > Hi William, > > "rgw print continue = true" is an apache specific setting, as mentioned > here: > > http://docs.ceph.com/docs/master/install/install-ceph-gateway/#migrating-from-apache-to-civetweb > > I do not

Re: [ceph-users] ceph-mon not starting on system startup (Ubuntu 16.04 / systemd)

2016-11-14 Thread David Turner
I had to set my mons to sysvinit while my osds are systemd. That allows everything to start up when my system boots. I don't know why the osds don't work with sysvinit and the mon doesn't work with systemd... but that worked to get me running.

Re: [ceph-users] rgw print continue and civetweb

2016-11-14 Thread Brian Andrus
Hi William, "rgw print continue = true" is an apache specific setting, as mentioned here: http://docs.ceph.com/docs/master/install/install-ceph- gateway/#migrating-from-apache-to-civetweb I do not believe it is needed for civetweb. For documentation, you can see or change the version branch in

[ceph-users] ceph-mon not starting on system startup (Ubuntu 16.04 / systemd)

2016-11-14 Thread Matthew Vernon
Hi, I have a problem that my ceph-mon isn't getting started when my machine boots; the OSDs start up just fine. Checking logs, there's no sign of systemd making any attempt to start it, although it is seemingly enabled: root@sto-1-1:~# systemctl status ceph-mon@sto-1-1 ● ceph-mon@sto-1-1.service
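A hedged first check for this kind of "no attempt to start" symptom is whether the instance unit is actually wanted by a target that runs at boot (using the poster's hostname):

    systemctl is-enabled ceph-mon@sto-1-1
    systemctl is-enabled ceph-mon.target ceph.target
    systemctl list-dependencies ceph-mon.target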

Re: [ceph-users] cephfs page cache

2016-11-14 Thread Sean Redmond
Hi, Thanks for looking into this, this seems to mitigate the problem. Do you think this is just related to httpd or is going to impact other services such as nginx and be a wider point to know about / document? Thanks On Mon, Oct 24, 2016 at 2:28 PM, Yan, Zheng wrote: > I

Re: [ceph-users] ceph cluster having blocked requests very frequently

2016-11-14 Thread Thomas Danan
Thanks Luis, Here are some answers …. Journals are not on SSD and are collocated on the OSD daemon hosts. We looked at the disk performance and did not notice anything wrong, with acceptable rw latency < 20ms. No issue on the network either, from what we have seen. There is only one pool in the

Re: [ceph-users] A VM with 6 volumes - hangs

2016-11-14 Thread Luis Periquito
Without knowing the cluster architecture it's hard to know exactly what may be happening. And you sent no information on your cluster... How is the cluster hardware? Where are the journals? How busy are the disks (% time busy)? What is the pool size? Are these replicated or EC pools? On Mon,
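A minimal way to answer the "% time busy" question on an OSD host (a sketch):

    iostat -x 1 10    # watch the %util and await columns for the OSD data disks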

Re: [ceph-users] ceph cluster having blocked requests very frequently

2016-11-14 Thread Luis Periquito
Without knowing the cluster architecture it's hard to know exactly what may be happening. How is the cluster hardware? Where are the journals? How busy are the disks (% time busy)? What is the pool size? Are these replicated or EC pools? Have you tried tuning the deep-scrub processes? Have you
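If deep-scrub turns out to be the trigger, a commonly used hedge is to throttle or temporarily disable it while investigating (the values below are examples, not recommendations):

    # temporarily stop new deep-scrubs
    ceph osd set nodeep-scrub
    ceph osd unset nodeep-scrub

    # or throttle scrubbing in ceph.conf ([osd] section)
    osd scrub sleep = 0.1
    osd scrub load threshold = 0.5
    osd deep scrub interval = 1209600   # 14 days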

Re: [ceph-users] A VM with 6 volumes - hangs

2016-11-14 Thread German Anders
try to see the specific logs for those particular OSDs and see if something is there; also take a close look at the PGs that hold those OSDs Best, *German* 2016-11-14 12:04 GMT-03:00 M Ranga Swami Reddy : > When this issue seen, ceph logs shows "slow requests to

Re: [ceph-users] A VM with 6 volumes - hangs

2016-11-14 Thread M Ranga Swami Reddy
When this issue is seen, the ceph logs show "slow requests to OSD", but the Ceph status is in the OK state. Thanks Swami On Mon, Nov 14, 2016 at 8:27 PM, German Anders wrote: > Could you share some info about the ceph cluster? logs? did you see > anything different from normal op on
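To see which OSDs the slow requests are stuck on while the cluster still reports OK overall, something like (osd id is a placeholder):

    ceph health detail | grep -i slow
    ceph daemon osd.7 dump_ops_in_flight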

Re: [ceph-users] Ceph Blog Articles

2016-11-14 Thread Nick Fisk
> -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > William Josefsson > Sent: 14 November 2016 14:46 > To: Nick Fisk > Cc: Ceph Users > Subject: Re: [ceph-users] Ceph Blog Articles > > Hi Nick,

Re: [ceph-users] A VM with 6 volumes - hangs

2016-11-14 Thread German Anders
Could you share some info about the ceph cluster? logs? did you see anything different from normal op on the logs? Best, *German* 2016-11-14 11:46 GMT-03:00 M Ranga Swami Reddy : > +ceph-devel > > On Fri, Nov 11, 2016 at 5:09 PM, M Ranga Swami Reddy

Re: [ceph-users] Ceph Blog Articles

2016-11-14 Thread William Josefsson
Hi Nick, I found the graph very useful for explaining the concept, thanks for sharing. I'm currently planning to set up a new cluster and want to get low latency by using 2U servers, 6x Intel P3700 400GB for journals and 18x 1.8TB Hitachi 10k SAS spinning disks. My OSD:journal ratio would be 3:1. All over

Re: [ceph-users] A VM with 6 volumes - hangs

2016-11-14 Thread M Ranga Swami Reddy
+ceph-devel On Fri, Nov 11, 2016 at 5:09 PM, M Ranga Swami Reddy wrote: > Hello, > I am using the ceph volumes with a VM. Details are below: > > VM: > OS: Ubuntu 14.0.4 >CPU: 12 Cores >RAM: 40 GB > > Volumes: >Size: 1 TB > No: 6 Volumes > > > With

[ceph-users] crashing mon with crush_ruleset change

2016-11-14 Thread Luis Periquito
I have a pool where, every time I try to change its crush_ruleset, 2 out of my 3 mons crash, and it's always the same ones. I've tried leaving the first one down and it crashes the second. It's a replicated pool, and I have other pools that look exactly the same. I've deep-scrubbed all the PGs to
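For reference, the operation that triggers the crash and a way to compare the rules involved (pool name and rule id are placeholders):

    ceph osd crush rule dump                     # compare the current and target rules
    ceph osd pool get mypool crush_ruleset
    ceph osd pool set mypool crush_ruleset 2     # the command that crashes the mons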

Re: [ceph-users] Can we drop ubuntu 14.04 (trusty) for kraken and luminous?

2016-11-14 Thread Sage Weil
On Fri, 11 Nov 2016, Sage Weil wrote: > Currently the distros we use for upstream testing are > > centos 7.x > ubuntu 16.04 (xenial) > ubuntu 14.04 (trusty) > > We also do some basic testing for Debian 8 and Fedora (some old version). > > Jewel was the first release that had native systemd

Re: [ceph-users] Standby-replay mds: 10.2.2

2016-11-14 Thread John Spray
On Mon, Nov 14, 2016 at 12:46 AM, Goncalo Borges wrote: > Hi Greg, Jonh, Zheng, CephFSers > > Maybe a simple question but I think it is better to ask first than to > complain after. > > We are currently undergoing an infrastructure migration. One of the first >
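For anyone unfamiliar with the setup being discussed, standby-replay is configured on the standby daemon roughly like this (a jewel-era ceph.conf sketch; the mds name is a placeholder):

    [mds.standby-a]
        mds standby replay = true
        mds standby for rank = 0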

Re: [ceph-users] Intermittent permission denied using kernel client with mds path cap

2016-11-14 Thread John Spray
On Thu, Nov 10, 2016 at 3:41 PM, Dan van der Ster wrote: > Hi all, Hi Zheng, > > We're seeing a strange issue with the kernel cephfs clients, combined > with a path restricted mds cap. It seems that files/dirs are > intermittently not created due to permission denied. > > For
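For context, a path-restricted client cap of the kind being discussed looks roughly like this (client name, path and pool are placeholders):

    ceph auth get-or-create client.app \
        mon 'allow r' \
        mds 'allow rw path=/app' \
        osd 'allow rw pool=cephfs_data'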

Re: [ceph-users] Ceph Blog Articles

2016-11-14 Thread Maged Mokhtar
Hi Nick, Actually I was referring to an all-SSD cluster. I expect latency to increase as you go from a low load / queue depth to a cluster under heavy load, at or near its maximum IOPS throughput, when the CPU cores are near peak utilization. Cheers /Maged

Re: [ceph-users] Can we drop ubuntu 14.04 (trusty) for kraken and luminous?

2016-11-14 Thread Tomasz Kuzemko
I vote for option 1 for as long as Ubuntu 14.04 is supported. On 11.11.2016 19:43, Sage Weil wrote: > Currently the distros we use for upstream testing are > > centos 7.x > ubuntu 16.04 (xenial) > ubuntu 14.04 (trusty) > > We also do some basic testing for Debian 8 and Fedora (some old version). > > Jewel

Re: [ceph-users] Can we drop ubuntu 14.04 (trusty) for kraken and luminous?

2016-11-14 Thread Özhan Rüzgar Karaman
Hi; There are still lots of people using 14.04, and it's also supported until 2019, so +1 for option 1. Thanks Özhan On Fri, Nov 11, 2016 at 11:22 PM, Blair Bethwaite wrote: > Worth considering OpenStack and Ubuntu cloudarchive release cycles > here. Mitaka is

Re: [ceph-users] Ceph Blog Articles

2016-11-14 Thread Nick Fisk
Hi Maged, I would imagine as soon as you start saturating the disks, the latency impact would make the savings from the fast CPUs pointless. Really you would only try to optimise the latency if you are using an SSD-based cluster. This was only done with spinning disks in our case with a low

Re: [ceph-users] Ceph Blog Articles

2016-11-14 Thread Nick Fisk
Hi, Yes, I used fio, here is the fio file I used for the latency test [global] ioengine=rbd randrepeat=0 clientname=admin rbdname=test2 invalidate=0# mandatory rw=write bs=4k direct=1 time_based=1 runtime=360 numjobs=1 [rbd_iodepth1] iodepth=1 > -Original Message- > From: Fulvio
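For readability, the quoted fio job file reconstructed as it would appear on disk (unchanged apart from layout; rbdname and clientname are Nick's values):

    [global]
    ioengine=rbd
    randrepeat=0
    clientname=admin
    rbdname=test2
    invalidate=0    # mandatory
    rw=write
    bs=4k
    direct=1
    time_based=1
    runtime=360
    numjobs=1

    [rbd_iodepth1]
    iodepth=1

Run it with "fio <jobfile>" and read the completion latency (clat) percentiles from the output.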