[ceph-users] Luminous RadosGW with Apache

2017-11-15 Thread Monis Monther
Good Day,


I am trying to install radosgw with Apache instead of civetweb; I am on
Luminous 12.2.0.

I followed the documentation at docs.ceph.com/docs/jewel/man/8/radosgw

I keep getting a permission denied error (Apache can't access the socket
file).

Changing the ownership of /var/run/ceph does not solve the problem either;
it just gives me a connection reset by client.

I also noticed that radosgw does not run as the apache user.

Is the Apache config still supported in Luminous, or do I only have civetweb?
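
For reference, the kind of ceph.conf client section I am working from looks
roughly like this (section name and paths are illustrative, not my exact setup):

[client.rgw.gateway]
# FastCGI socket that Apache talks to
rgw socket path = /var/run/ceph/ceph.rgw.gateway.fastcgi.sock
rgw print continue = false
log file = /var/log/ceph/client.rgw.gateway.log
# the civetweb alternative would instead be:
# rgw frontends = civetweb port=7480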

Thanks
Monis





-- 
Best Regards
Monis
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Where are the ceph-iscsi-* RPMS officially located?

2017-11-15 Thread Richard Chan
For the iSCSI RPMS referenced in

http://docs.ceph.com/docs/master/rbd/iscsi-target-cli/

Where is the "official" RPM distribution repo?

ceph-iscsi-config
ceph-iscsi-cli



-- 
Richard Chan
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph Luminous Directory

2017-11-15 Thread Linh Vu
Luminous supports this now: http://docs.ceph.com/docs/master/cephfs/dirfrags/

In my testing it has handled 2M files per directory with no problem.
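
If the filesystem was created before Luminous, you may still need to switch 
fragmentation on explicitly; something along these lines should do it (the 
filesystem name is just an example):

ceph fs set cephfs allow_dirfrags true

The split/merge thresholds can then be tuned via mds_bal_split_size and 
mds_bal_merge_size if the defaults don't suit your workload.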




From: ceph-users  on behalf of Hauke Homburg 

Sent: Thursday, 16 November 2017 5:06:59 PM
To: ceph-users
Subject: [ceph-users] Ceph Luminous Directory

Hello List,

In our factory the question of CephFS has come up again. We noticed the new
Luminous release.

We had problems with Jewel, CephFS and big directories:

http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-October/013628.html

Does anyone know of a limitation on big directories in CephFS under Luminous?

Thanks for your help.

Regards
Hauke

--
http://www.w3-creative.de

http://www.westchat.de

https://friendica.westchat.de/profile/hauke

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Ceph Luminous Directory

2017-11-15 Thread Hauke Homburg
Hello List,

In our factory the question of CephFS has come up again. We noticed the new
Luminous release.

We had problems with Jewel, CephFS and big directories:

http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-October/013628.html

Does anyone know of a limitation on big directories in CephFS under Luminous?

Thanks for your help.

Regards
Hauke

-- 
www.w3-creative.de

www.westchat.de

https://friendica.westchat.de/profile/hauke

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] [Luminous, bluestore]How to reduce memory usage of OSDs?

2017-11-15 Thread lin yunfan
Hi all,
  Is there a way to reduce the memory usage of an OSD to below 800 MB per
OSD? My server has only about 1 GB of memory for each OSD, and the OSDs
sometimes get killed by the OOM killer.
I am using a newer Luminous build from GitHub (12.2.1-249-g42172a4
(42172a443183ffe6b36e85770e53fe678db293bf), which fixes some memory problems
in the BlueStore cache) and this config for the BlueStore cache:
bluestore_cache_size = 104857600
bluestore_cache_kv_max = 67108864

 I have tested with both replicated and EC pools. EC seems to cost more memory,
but even in replicated mode OSDs still get killed by the OOM killer.

 I also find that if I restart all the OSDs when the memory usage is high,
the memory usage goes down, and it takes a few hours to climb back to the
level it was at before the restart.

 I can accept some performance loss to get the memory usage down, but
I am running out of ideas for how to do that :<
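
For reference, in ceph.conf form that is simply (comments mine, values as above):

[osd]
# total BlueStore cache per OSD, in bytes (100 MB here)
bluestore_cache_size = 104857600
# upper bound on the RocksDB/key-value part of that cache (64 MB here)
bluestore_cache_kv_max = 67108864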
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] luminous vs jewel rbd performance

2017-11-15 Thread Rafael Lopez
Hey Linh... we have not, but if it makes any difference we are still using
filestore.

On 16 Nov. 2017 12:31, "Linh Vu"  wrote:

> Noticed that you're on 12.2.0 Raf. 12.2.1 fixed a lot of performance
> issues from 12.2.0 for us on Luminous/Bluestore. Have you tried upgrading
> to it?
> --
> *From:* ceph-users  on behalf of
> Rafael Lopez 
> *Sent:* Thursday, 16 November 2017 11:59:14 AM
> *To:* Mark Nelson
> *Cc:* ceph-users
> *Subject:* Re: [ceph-users] luminous vs jewel rbd performance
>
> Hi Mark,
>
> Sorry for the late reply... I have been away on vacation/openstack summit
> etc for over a month now and looking at this again.
>
> Yeah the snippet was a bit misleading. The fio file contains small block
> jobs as well as big block jobs:
>
> [write-rbd1-4m-depth1]
> rbdname=rbd-tester-fio
> bs=4m
> iodepth=1
> rw=write
> stonewall
> [write-rbd2-4m-depth16]
> rbdname=rbd-tester-fio-2
> bs=4m
> iodepth=16
> rw=write
> stonewall
>
> [read-rbd1-4m-depth1]
> rbdname=rbd-tester-fio
> bs=4m
> iodepth=1
> rw=read
> stonewall
> [read-rbd2-4m-depth16]
> rbdname=rbd-tester-fio-2
> bs=4m
> iodepth=16
> rw=read
> stonewall
>
> The performance hit is more noticeable on bigblock, I think up to 10x
> slower on some runs but as a percentage it seems to affect a small block
> workload too. I understand that runs will vary... I wish I had more runs
> from before upgrading to luminous but I only have that single set of
> results. Regardless, I cannot come close to that single set of results
> since upgrading to luminous.
> I understand the caching stuff you mentioned, however we have not changed
> any of that config since the upgrade and the fio job is exactly the same.
> So if I do many runs on luminous throughout the course of a day, including
> when we think the cluster is least busy, we should be able to come pretty
> close to the jewel result on at least one of the runs or is my thinking
> flawed?
>
> Sage mentioned at openstack that there was a perf regression with librbd
> which will be fixed in 12.2.2 are you aware of this? If so can you send
> me the link to the bug?
>
> Cheers,
> Raf
>
>
> On 22 September 2017 at 00:31, Mark Nelson  wrote:
>
> Hi Rafael,
>
> In the original email you mentioned 4M block size, seq read, but here it
> looks like you are doing 4k writes?  Can you clarify?  If you are doing 4k
> direct sequential writes with iodepth=1 and are also using librbd cache,
> please make sure that librbd is set to writeback mode in both cases.  RBD
> by default will not kick into WB mode until it sees a flush request, and
> the librbd engine in fio doesn't issue one before a test is started.  It
> can be pretty easy to end up in a situation where writeback cache is active
> on some tests but not others if you aren't careful.  IE If one of your
> tests was done after a flush and the other was not, you'd likely see a
> dramatic difference in performance during this test.
>
> You can avoid this by telling librbd to always use WB mode (at least when
> benchmarking):
>
> rbd cache writethrough until flush = false
>
> Mark
>
>
> On 09/20/2017 01:51 AM, Rafael Lopez wrote:
>
> Hi Alexandre,
>
> Yeah we are using filestore for the moment with luminous. With regards
> to client, I tried both jewel and luminous librbd versions against the
> luminous cluster - similar results.
>
> I am running fio on a physical machine with fio rbd engine. This is a
> snippet of the fio config for the runs (the complete jobfile adds
> variations of read/write/block size/iodepth).
>
> [global]
> ioengine=rbd
> clientname=cinder-volume
> pool=rbd-bronze
> invalidate=1
> ramp_time=5
> runtime=30
> time_based
> direct=1
>
> [write-rbd1-4k-depth1]
> rbdname=rbd-tester-fio
> bs=4k
> iodepth=1
> rw=write
> stonewall
>
> [write-rbd2-4k-depth16]
> rbdname=rbd-tester-fio-2
> bs=4k
> iodepth=16
> rw=write
> stonewall
>
> Raf
>
> On 20 September 2017 at 16:43, Alexandre DERUMIER  > wrote:
>
> Hi
>
> so, you use also filestore on luminous ?
>
> do you have also upgraded librbd on client ? (are you benching
> inside a qemu machine ? or directly with fio-rbd ?)
>
>
>
> (I'm going to do a lot of benchmarks in coming week, I'll post
> results on mailing soon.)
>
>
>
> - Mail original -
> De: "Rafael Lopez"  >
> À: "ceph-users"  >
>
> Envoyé: Mercredi 20 Septembre 2017 08:17:23
> Objet: [ceph-users] luminous vs jewel rbd performance
>
> hey guys.
> wondering if anyone else has done some solid benchmarking of jewel
> vs luminous, in particular on the same cluster that has been
> upgraded (same cluster, client and config).
>
> we have recently upgraded a cluster from 10.2.9 to 12.2.0, 

Re: [ceph-users] 10.2.10: "default" zonegroup in custom root pool not found

2017-11-15 Thread Richard Chan
Yes, that was it. Thank you.

On Thu, Nov 16, 2017 at 1:48 AM, Casey Bodley  wrote:

>
>
> On 11/15/2017 12:11 AM, Richard Chan wrote:
>
> After creating a non-default root pool
> rgw_realm_root_pool = gold.rgw.root
> rgw_zonegroup_root_pool = gold.rgw.root
> rgw_period_root_pool = gold.rgw.root
> rgw_zone_root_pool = gold.rgw.root
> rgw_region = gold.rgw.root
>
>
> You probably meant to set rgw_region_root_pool for that last line. As it
> is, this is triggering some compatibility code that sets 'rgw_zonegroup =
> rgw_region' when a region is given but zonegroup is not.
>
>
> radosgw-admin realm create --rgw-realm gold --default
> radosgw-admin zonegroup create --rgw-zonegroup=us  --default --master
> --endpoints http://rgw:7480
>
> The "default" is not respected anymore:
>
>
> radosgw-admin period update --commit
>
>
>
> 2017-11-15 04:50:42.400404 7f694dd4e9c0  0 failed reading zonegroup info:
> ret -2 (2) No such file or directory
> couldn't init storage provider
>
>
> I require --rgw-zonegroup=us on command line or /etc/ceph/ceph.conf
>
> This seems to be regression.
>
>
>
>
> --
> Richard Chan
>
>
>
> ___
> ceph-users mailing 
> listceph-us...@lists.ceph.comhttp://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>


-- 
Richard Chan
Chief Architect

TreeBox Solutions Pte Ltd
1 Commonwealth Lane #03-01
Singapore 149544
Tel: 6570 3725
http://www.treeboxsolutions.com

Co.Reg.No. 201100585R
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] luminous vs jewel rbd performance

2017-11-15 Thread Linh Vu
Noticed that you're on 12.2.0 Raf. 12.2.1 fixed a lot of performance issues 
from 12.2.0 for us on Luminous/Bluestore. Have you tried upgrading to it?


From: ceph-users  on behalf of Rafael Lopez 

Sent: Thursday, 16 November 2017 11:59:14 AM
To: Mark Nelson
Cc: ceph-users
Subject: Re: [ceph-users] luminous vs jewel rbd performance

Hi Mark,

Sorry for the late reply... I have been away on vacation/openstack summit etc 
for over a month now and looking at this again.

Yeah the snippet was a bit misleading. The fio file contains small block jobs 
as well as big block jobs:

[write-rbd1-4m-depth1]
rbdname=rbd-tester-fio
bs=4m
iodepth=1
rw=write
stonewall
[write-rbd2-4m-depth16]
rbdname=rbd-tester-fio-2
bs=4m
iodepth=16
rw=write
stonewall

[read-rbd1-4m-depth1]
rbdname=rbd-tester-fio
bs=4m
iodepth=1
rw=read
stonewall
[read-rbd2-4m-depth16]
rbdname=rbd-tester-fio-2
bs=4m
iodepth=16
rw=read
stonewall

The performance hit is more noticeable on bigblock, I think up to 10x slower on 
some runs but as a percentage it seems to affect a small block workload too. I 
understand that runs will vary... I wish I had more runs from before upgrading 
to luminous but I only have that single set of results. Regardless, I cannot 
come close to that single set of results since upgrading to luminous.
I understand the caching stuff you mentioned, however we have not changed any 
of that config since the upgrade and the fio job is exactly the same. So if I 
do many runs on luminous throughout the course of a day, including when we 
think the cluster is least busy, we should be able to come pretty close to the 
jewel result on at least one of the runs or is my thinking flawed?

Sage mentioned at openstack that there was a perf regression with librbd which 
will be fixed in 12.2.2 are you aware of this? If so can you send me the 
link to the bug?

Cheers,
Raf


On 22 September 2017 at 00:31, Mark Nelson 
> wrote:
Hi Rafael,

In the original email you mentioned 4M block size, seq read, but here it looks 
like you are doing 4k writes?  Can you clarify?  If you are doing 4k direct 
sequential writes with iodepth=1 and are also using librbd cache, please make 
sure that librbd is set to writeback mode in both cases.  RBD by default will 
not kick into WB mode until it sees a flush request, and the librbd engine in 
fio doesn't issue one before a test is started.  It can be pretty easy to end 
up in a situation where writeback cache is active on some tests but not others 
if you aren't careful.  IE If one of your tests was done after a flush and the 
other was not, you'd likely see a dramatic difference in performance during 
this test.

You can avoid this by telling librbd to always use WB mode (at least when 
benchmarking):

rbd cache writethrough until flush = false

Mark


On 09/20/2017 01:51 AM, Rafael Lopez wrote:
Hi Alexandre,

Yeah we are using filestore for the moment with luminous. With regards
to client, I tried both jewel and luminous librbd versions against the
luminous cluster - similar results.

I am running fio on a physical machine with fio rbd engine. This is a
snippet of the fio config for the runs (the complete jobfile adds
variations of read/write/block size/iodepth).

[global]
ioengine=rbd
clientname=cinder-volume
pool=rbd-bronze
invalidate=1
ramp_time=5
runtime=30
time_based
direct=1

[write-rbd1-4k-depth1]
rbdname=rbd-tester-fio
bs=4k
iodepth=1
rw=write
stonewall

[write-rbd2-4k-depth16]
rbdname=rbd-tester-fio-2
bs=4k
iodepth=16
rw=write
stonewall

Raf

On 20 September 2017 at 16:43, Alexandre DERUMIER 

>> wrote:

Hi

so, you use also filestore on luminous ?

do you have also upgraded librbd on client ? (are you benching
inside a qemu machine ? or directly with fio-rbd ?)



(I'm going to do a lot of benchmarks in coming week, I'll post
results on mailing soon.)



- Mail original -
De: "Rafael Lopez" 
>>
À: "ceph-users" 
>>

Envoyé: Mercredi 20 Septembre 2017 08:17:23
Objet: [ceph-users] luminous vs jewel rbd performance

hey guys.
wondering if anyone else has done some solid benchmarking of jewel
vs luminous, in particular on the same cluster that has been
upgraded (same cluster, client and config).

we have recently upgraded a cluster from 10.2.9 to 12.2.0, and
unfortunately i only captured results from a single fio (librbd) run
with a few jobs in it before upgrading. i have run the same fio
jobfile many times at different times of the 

Re: [ceph-users] luminous vs jewel rbd performance

2017-11-15 Thread Rafael Lopez
Hi Mark,

Sorry for the late reply... I have been away on vacation/openstack summit
etc for over a month now and looking at this again.

Yeah the snippet was a bit misleading. The fio file contains small block
jobs as well as big block jobs:

[write-rbd1-4m-depth1]
rbdname=rbd-tester-fio
bs=4m
iodepth=1
rw=write
stonewall
[write-rbd2-4m-depth16]
rbdname=rbd-tester-fio-2
bs=4m
iodepth=16
rw=write
stonewall

[read-rbd1-4m-depth1]
rbdname=rbd-tester-fio
bs=4m
iodepth=1
rw=read
stonewall
[read-rbd2-4m-depth16]
rbdname=rbd-tester-fio-2
bs=4m
iodepth=16
rw=read
stonewall
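
For what it's worth, each comparison run is just that jobfile driven through
the rbd engine, along the lines of (output filename illustrative):

fio --output-format=json --output=luminous-run1.json rbd-jobs.fio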

The performance hit is more noticeable on big blocks, I think up to 10x
slower on some runs, but as a percentage it seems to affect a small block
workload too. I understand that runs will vary... I wish I had more runs
from before upgrading to luminous, but I only have that single set of
results. Regardless, I cannot come close to that single set of results
since upgrading to luminous.
I understand the caching stuff you mentioned; however, we have not changed
any of that config since the upgrade and the fio job is exactly the same.
So if I do many runs on luminous throughout the course of a day, including
when we think the cluster is least busy, we should be able to come pretty
close to the jewel result on at least one of the runs... or is my thinking
flawed?

Sage mentioned at OpenStack that there was a perf regression with librbd
which will be fixed in 12.2.2; are you aware of this? If so, can you send
me the link to the bug?

Cheers,
Raf


On 22 September 2017 at 00:31, Mark Nelson  wrote:

> Hi Rafael,
>
> In the original email you mentioned 4M block size, seq read, but here it
> looks like you are doing 4k writes?  Can you clarify?  If you are doing 4k
> direct sequential writes with iodepth=1 and are also using librbd cache,
> please make sure that librbd is set to writeback mode in both cases.  RBD
> by default will not kick into WB mode until it sees a flush request, and
> the librbd engine in fio doesn't issue one before a test is started.  It
> can be pretty easy to end up in a situation where writeback cache is active
> on some tests but not others if you aren't careful.  IE If one of your
> tests was done after a flush and the other was not, you'd likely see a
> dramatic difference in performance during this test.
>
> You can avoid this by telling librbd to always use WB mode (at least when
> benchmarking):
>
> rbd cache writethrough until flush = false
>
> Mark
>
>
> On 09/20/2017 01:51 AM, Rafael Lopez wrote:
>
>> Hi Alexandre,
>>
>> Yeah we are using filestore for the moment with luminous. With regards
>> to client, I tried both jewel and luminous librbd versions against the
>> luminous cluster - similar results.
>>
>> I am running fio on a physical machine with fio rbd engine. This is a
>> snippet of the fio config for the runs (the complete jobfile adds
>> variations of read/write/block size/iodepth).
>>
>> [global]
>> ioengine=rbd
>> clientname=cinder-volume
>> pool=rbd-bronze
>> invalidate=1
>> ramp_time=5
>> runtime=30
>> time_based
>> direct=1
>>
>> [write-rbd1-4k-depth1]
>> rbdname=rbd-tester-fio
>> bs=4k
>> iodepth=1
>> rw=write
>> stonewall
>>
>> [write-rbd2-4k-depth16]
>> rbdname=rbd-tester-fio-2
>> bs=4k
>> iodepth=16
>> rw=write
>> stonewall
>>
>> Raf
>>
>> On 20 September 2017 at 16:43, Alexandre DERUMIER > > wrote:
>>
>> Hi
>>
>> so, you use also filestore on luminous ?
>>
>> do you have also upgraded librbd on client ? (are you benching
>> inside a qemu machine ? or directly with fio-rbd ?)
>>
>>
>>
>> (I'm going to do a lot of benchmarks in coming week, I'll post
>> results on mailing soon.)
>>
>>
>>
>> - Mail original -
>> De: "Rafael Lopez" > >
>> À: "ceph-users" > >
>>
>> Envoyé: Mercredi 20 Septembre 2017 08:17:23
>> Objet: [ceph-users] luminous vs jewel rbd performance
>>
>> hey guys.
>> wondering if anyone else has done some solid benchmarking of jewel
>> vs luminous, in particular on the same cluster that has been
>> upgraded (same cluster, client and config).
>>
>> we have recently upgraded a cluster from 10.2.9 to 12.2.0, and
>> unfortunately i only captured results from a single fio (librbd) run
>> with a few jobs in it before upgrading. i have run the same fio
>> jobfile many times at different times of the day since upgrading,
>> and been unable to produce a close match to the pre-upgrade (jewel)
>> run from the same client. one particular job is significantly slower
>> (4M block size, iodepth=1, seq read), up to 10x in one run.
>>
>> i realise i havent supplied much detail and it could be dozens of
>> things, but i just wanted to see if anyone else had done more
>> quantitative benchmarking or had similar 

Re: [ceph-users] OSD Random Failures - Latest Luminous

2017-11-15 Thread Eric Nelson
I've been seeing these as well on our SSD cache tier, which has been ravaged by
disk failures as of late. Same tp_peering assert as above, even running the
luminous branch from git.

Let me know if you have filed a bug I can +1, or if you have found a workaround.

E

On Wed, Nov 15, 2017 at 10:25 AM, Ashley Merrick 
wrote:

> Hello,
>
>
>
> After replacing a single OSD disk due to a failed disk I am now seeing 2-3
> OSD’s randomly stop and fail to start, do a boot loop get to load_pgs and
> then fail with the following (I tried setting OSD log’s to 5/5 but didn’t
> get any extra lines around the error just more information pre boot.
>
>
>
> Could this be a certain PG causing these OSD’s to crash (6.2f2s10 for
> example)?
>
>
>
> -9> 2017-11-15 17:37:14.696229 7fa4ec50f700  1 osd.37 pg_epoch: 161571
> pg[6.2f9s1( v 161563'158209 lc 161175'158153 (150659'148187,161563'158209]
> local-lis/les=161519/161521 n=47572 ec=31534/31534 lis/c 161519/152474
> les/c/f 161521/152523/159786 161517/161519/161519)
> [34,37,13,12,66,69,118,120,28,20,88,0,2]/[34,37,13,12,66,69,
> 118,120,28,20,53,54,2147483647] r=1 lpr=161563 pi=[152474,161519)/1
> crt=161562'158208 lcod 0'0 unknown NOTIFY m=21] state: transitioning
> to Stray
>
> -8> 2017-11-15 17:37:14.696239 7fa4ec50f700  5 osd.37 pg_epoch: 161571
> pg[6.2f9s1( v 161563'158209 lc 161175'158153 (150659'148187,161563'158209]
> local-lis/les=161519/161521 n=47572 ec=31534/31534 lis/c 161519/152474
> les/c/f 161521/152523/159786 161517/161519/161519)
> [34,37,13,12,66,69,118,120,28,20,88,0,2]/[34,37,13,12,66,69,
> 118,120,28,20,53,54,2147483647] r=1 lpr=161563 pi=[152474,161519)/1
> crt=161562'158208 lcod 0'0 unknown NOTIFY m=21] exit Start 0.19 0
> 0.00
>
> -7> 2017-11-15 17:37:14.696250 7fa4ec50f700  5 osd.37 pg_epoch: 161571
> pg[6.2f9s1( v 161563'158209 lc 161175'158153 (150659'148187,161563'158209]
> local-lis/les=161519/161521 n=47572 ec=31534/31534 lis/c 161519/152474
> les/c/f 161521/152523/159786 161517/161519/161519)
> [34,37,13,12,66,69,118,120,28,20,88,0,2]/[34,37,13,12,66,69,
> 118,120,28,20,53,54,2147483647] r=1 lpr=161563 pi=[152474,161519)/1
> crt=161562'158208 lcod 0'0 unknown NOTIFY m=21] enter Started/Stray
>
> -6> 2017-11-15 17:37:14.696324 7fa4ec50f700  5 osd.37 pg_epoch: 161571
> pg[6.2f2s10( v 161570'157712 lc 161175'157648 (160455'154564,161570'157712]
> local-lis/les=161517/161519 n=47328 ec=31534/31534 lis/c 161517/160962
> les/c/f 161519/160963/159786 161517/161517/108939)
> [96,100,79,4,69,65,57,59,135,134,37,35,18] r=10 lpr=161570
> pi=[160962,161517)/2 crt=161560'157711 lcod 0'0 unknown NOTIFY m=5] exit
> Reset 3.363755 2 0.76
>
> -5> 2017-11-15 17:37:14.696337 7fa4ec50f700  5 osd.37 pg_epoch: 161571
> pg[6.2f2s10( v 161570'157712 lc 161175'157648 (160455'154564,161570'157712]
> local-lis/les=161517/161519 n=47328 ec=31534/31534 lis/c 161517/160962
> les/c/f 161519/160963/159786 161517/161517/108939)
> [96,100,79,4,69,65,57,59,135,134,37,35,18] r=10 lpr=161570
> pi=[160962,161517)/2 crt=161560'157711 lcod 0'0 unknown NOTIFY m=5] enter
> Started
>
> -4> 2017-11-15 17:37:14.696346 7fa4ec50f700  5 osd.37 pg_epoch: 161571
> pg[6.2f2s10( v 161570'157712 lc 161175'157648 (160455'154564,161570'157712]
> local-lis/les=161517/161519 n=47328 ec=31534/31534 lis/c 161517/160962
> les/c/f 161519/160963/159786 161517/161517/108939)
> [96,100,79,4,69,65,57,59,135,134,37,35,18] r=10 lpr=161570
> pi=[160962,161517)/2 crt=161560'157711 lcod 0'0 unknown NOTIFY m=5] enter
> Start
>
> -3> 2017-11-15 17:37:14.696353 7fa4ec50f700  1 osd.37 pg_epoch: 161571
> pg[6.2f2s10( v 161570'157712 lc 161175'157648 (160455'154564,161570'157712]
> local-lis/les=161517/161519 n=47328 ec=31534/31534 lis/c 161517/160962
> les/c/f 161519/160963/159786 161517/161517/108939)
> [96,100,79,4,69,65,57,59,135,134,37,35,18] r=10 lpr=161570
> pi=[160962,161517)/2 crt=161560'157711 lcod 0'0 unknown NOTIFY m=5]
> state: transitioning to Stray
>
> -2> 2017-11-15 17:37:14.696364 7fa4ec50f700  5 osd.37 pg_epoch: 161571
> pg[6.2f2s10( v 161570'157712 lc 161175'157648 (160455'154564,161570'157712]
> local-lis/les=161517/161519 n=47328 ec=31534/31534 lis/c 161517/160962
> les/c/f 161519/160963/159786 161517/161517/108939)
> [96,100,79,4,69,65,57,59,135,134,37,35,18] r=10 lpr=161570
> pi=[160962,161517)/2 crt=161560'157711 lcod 0'0 unknown NOTIFY m=5] exit
> Start 0.18 0 0.00
>
> -1> 2017-11-15 17:37:14.696372 7fa4ec50f700  5 osd.37 pg_epoch: 161571
> pg[6.2f2s10( v 161570'157712 lc 161175'157648 (160455'154564,161570'157712]
> local-lis/les=161517/161519 n=47328 ec=31534/31534 lis/c 161517/160962
> les/c/f 161519/160963/159786 161517/161517/108939)
> [96,100,79,4,69,65,57,59,135,134,37,35,18] r=10 lpr=161570
> pi=[160962,161517)/2 crt=161560'157711 lcod 0'0 unknown NOTIFY m=5] enter
> Started/Stray
>
>  0> 2017-11-15 17:37:14.697245 7fa4ebd0e700 -1 *** Caught signal
> (Aborted) **
>
> in thread 7fa4ebd0e700 thread_name:tp_peering
>
>

Re: [ceph-users] Moving bluestore WAL and DB after bluestore creation

2017-11-15 Thread Shawn Edwards
On Wed, Nov 15, 2017, 11:07 David Turner  wrote:

> I'm not going to lie.  This makes me dislike Bluestore quite a bit.  Using
> multiple OSDs to an SSD journal allowed for you to monitor the write
> durability of the SSD and replace it without having to out and re-add all
> of the OSDs on the device.  Having to now out and backfill back onto the
> HDDs is awful and would have made a time when I realized that 20 journal
> SSDs all ran low on writes at the same time nearly impossible to recover
> from.
>
> Flushing journals, replacing SSDs, and bringing it all back online was a
> slick process.  Formatting the HDDs and backfilling back onto the same
> disks sounds like a big regression.  A process to migrate the WAL and DB
> onto the HDD and then back off to a new device would be very helpful.
>
> On Wed, Nov 15, 2017 at 10:51 AM Mario Giammarco 
> wrote:
>
>> It seems it is not possible. I recreated the OSD
>>
>> 2017-11-12 17:44 GMT+01:00 Shawn Edwards :
>>
>>> I've created some Bluestore OSD with all data (wal, db, and data) all on
>>> the same rotating disk.  I would like to now move the wal and db onto an
>>> nvme disk.  Is that possible without re-creating the OSD?
>>>
>>> ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
This.  Exactly this.  Not being able to move the .db and .wal data on and
off the main storage disk on Bluestore is a regression.


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] OSD Random Failures - Latest Luminous

2017-11-15 Thread Ashley Merrick
Hello,

After replacing a single OSD disk due to a failed disk, I am now seeing 2-3 OSDs 
randomly stop and fail to start: they boot-loop, get to load_pgs, and then fail 
with the following. (I tried setting OSD logs to 5/5 but didn't get any extra 
lines around the error, just more information pre-boot.)

Could this be a certain PG causing these OSDs to crash (6.2f2s10 for example)?

-9> 2017-11-15 17:37:14.696229 7fa4ec50f700  1 osd.37 pg_epoch: 161571 
pg[6.2f9s1( v 161563'158209 lc 161175'158153 (150659'148187,161563'158209] 
local-lis/les=161519/161521 n=47572 ec=31534/31534 lis/c 161519/152474 les/c/f 
161521/152523/159786 161517/161519/161519) 
[34,37,13,12,66,69,118,120,28,20,88,0,2]/[34,37,13,12,66,69,118,120,28,20,53,54,2147483647]
 r=1 lpr=161563 pi=[152474,161519)/1 crt=161562'158208 lcod 0'0 unknown NOTIFY 
m=21] state: transitioning to Stray
-8> 2017-11-15 17:37:14.696239 7fa4ec50f700  5 osd.37 pg_epoch: 161571 
pg[6.2f9s1( v 161563'158209 lc 161175'158153 (150659'148187,161563'158209] 
local-lis/les=161519/161521 n=47572 ec=31534/31534 lis/c 161519/152474 les/c/f 
161521/152523/159786 161517/161519/161519) 
[34,37,13,12,66,69,118,120,28,20,88,0,2]/[34,37,13,12,66,69,118,120,28,20,53,54,2147483647]
 r=1 lpr=161563 pi=[152474,161519)/1 crt=161562'158208 lcod 0'0 unknown NOTIFY 
m=21] exit Start 0.19 0 0.00
-7> 2017-11-15 17:37:14.696250 7fa4ec50f700  5 osd.37 pg_epoch: 161571 
pg[6.2f9s1( v 161563'158209 lc 161175'158153 (150659'148187,161563'158209] 
local-lis/les=161519/161521 n=47572 ec=31534/31534 lis/c 161519/152474 les/c/f 
161521/152523/159786 161517/161519/161519) 
[34,37,13,12,66,69,118,120,28,20,88,0,2]/[34,37,13,12,66,69,118,120,28,20,53,54,2147483647]
 r=1 lpr=161563 pi=[152474,161519)/1 crt=161562'158208 lcod 0'0 unknown NOTIFY 
m=21] enter Started/Stray
-6> 2017-11-15 17:37:14.696324 7fa4ec50f700  5 osd.37 pg_epoch: 161571 
pg[6.2f2s10( v 161570'157712 lc 161175'157648 (160455'154564,161570'157712] 
local-lis/les=161517/161519 n=47328 ec=31534/31534 lis/c 161517/160962 les/c/f 
161519/160963/159786 161517/161517/108939) 
[96,100,79,4,69,65,57,59,135,134,37,35,18] r=10 lpr=161570 pi=[160962,161517)/2 
crt=161560'157711 lcod 0'0 unknown NOTIFY m=5] exit Reset 3.363755 2 0.76
-5> 2017-11-15 17:37:14.696337 7fa4ec50f700  5 osd.37 pg_epoch: 161571 
pg[6.2f2s10( v 161570'157712 lc 161175'157648 (160455'154564,161570'157712] 
local-lis/les=161517/161519 n=47328 ec=31534/31534 lis/c 161517/160962 les/c/f 
161519/160963/159786 161517/161517/108939) 
[96,100,79,4,69,65,57,59,135,134,37,35,18] r=10 lpr=161570 pi=[160962,161517)/2 
crt=161560'157711 lcod 0'0 unknown NOTIFY m=5] enter Started
-4> 2017-11-15 17:37:14.696346 7fa4ec50f700  5 osd.37 pg_epoch: 161571 
pg[6.2f2s10( v 161570'157712 lc 161175'157648 (160455'154564,161570'157712] 
local-lis/les=161517/161519 n=47328 ec=31534/31534 lis/c 161517/160962 les/c/f 
161519/160963/159786 161517/161517/108939) 
[96,100,79,4,69,65,57,59,135,134,37,35,18] r=10 lpr=161570 pi=[160962,161517)/2 
crt=161560'157711 lcod 0'0 unknown NOTIFY m=5] enter Start
-3> 2017-11-15 17:37:14.696353 7fa4ec50f700  1 osd.37 pg_epoch: 161571 
pg[6.2f2s10( v 161570'157712 lc 161175'157648 (160455'154564,161570'157712] 
local-lis/les=161517/161519 n=47328 ec=31534/31534 lis/c 161517/160962 les/c/f 
161519/160963/159786 161517/161517/108939) 
[96,100,79,4,69,65,57,59,135,134,37,35,18] r=10 lpr=161570 pi=[160962,161517)/2 
crt=161560'157711 lcod 0'0 unknown NOTIFY m=5] state: transitioning to 
Stray
-2> 2017-11-15 17:37:14.696364 7fa4ec50f700  5 osd.37 pg_epoch: 161571 
pg[6.2f2s10( v 161570'157712 lc 161175'157648 (160455'154564,161570'157712] 
local-lis/les=161517/161519 n=47328 ec=31534/31534 lis/c 161517/160962 les/c/f 
161519/160963/159786 161517/161517/108939) 
[96,100,79,4,69,65,57,59,135,134,37,35,18] r=10 lpr=161570 pi=[160962,161517)/2 
crt=161560'157711 lcod 0'0 unknown NOTIFY m=5] exit Start 0.18 0 0.00
-1> 2017-11-15 17:37:14.696372 7fa4ec50f700  5 osd.37 pg_epoch: 161571 
pg[6.2f2s10( v 161570'157712 lc 161175'157648 (160455'154564,161570'157712] 
local-lis/les=161517/161519 n=47328 ec=31534/31534 lis/c 161517/160962 les/c/f 
161519/160963/159786 161517/161517/108939) 
[96,100,79,4,69,65,57,59,135,134,37,35,18] r=10 lpr=161570 pi=[160962,161517)/2 
crt=161560'157711 lcod 0'0 unknown NOTIFY m=5] enter Started/Stray
 0> 2017-11-15 17:37:14.697245 7fa4ebd0e700 -1 *** Caught signal (Aborted) 
**
in thread 7fa4ebd0e700 thread_name:tp_peering

ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)
1: (()+0xa3acdc) [0x55dfb6ba3cdc]
2: (()+0xf890) [0x7fa510e2c890]
3: (gsignal()+0x37) [0x7fa50fe66067]
4: (abort()+0x148) [0x7fa50fe67448]
5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x27f) 
[0x55dfb6be6f5f]
6: (PG::start_peering_interval(std::shared_ptr, std::vector const&, int, std::vector 
const&, int, 

Re: [ceph-users] Cluster network slower than public network

2017-11-15 Thread Ronny Aasen

On 15.11.2017 13:50, Gandalf Corvotempesta wrote:
As 10gb switches are expansive, what would happen by using a gigabit 
cluster network and a 10gb public network?


Replication and rebalance should be slow, but what about public I/O ?
When a client wants to write to a file, it does over the public 
network and the ceph automatically replicate it over the cluster 
network or the whole IO is made over the public?





Public I/O would be slow.
Each write goes from the client to the primary OSD over the public network, is 
then replicated 2 times to the secondary OSDs over the cluster network, and only 
then is the client told the block is written.
Since the cluster network sees 2x the write traffic of the public network when 
things are OK, and many times the traffic of the public network when things are 
recovering or backfilling, I would give the cluster network the highest speed if 
one could not have 10Gbps on everything.


kind regards
Ronny Aasen
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] 10.2.10: "default" zonegroup in custom root pool not found

2017-11-15 Thread Casey Bodley



On 11/15/2017 12:11 AM, Richard Chan wrote:

After creating a non-default root pool
rgw_realm_root_pool = gold.rgw.root
rgw_zonegroup_root_pool = gold.rgw.root
rgw_period_root_pool = gold.rgw.root
rgw_zone_root_pool = gold.rgw.root
rgw_region = gold.rgw.root


You probably meant to set rgw_region_root_pool for that last line. As it 
is, this is triggering some compatibility code that sets 'rgw_zonegroup 
= rgw_region' when a region is given but zonegroup is not.
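
In other words, that last line should read:

rgw_region_root_pool = gold.rgw.root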




radosgw-admin realm create --rgw-realm gold --default
radosgw-admin zonegroup create --rgw-zonegroup=us  --default --master 
--endpoints http://rgw:7480


The "default" is not respected anymore:


radosgw-admin period update --commit
2017-11-15 04:50:42.400404 7f694dd4e9c0  0 failed reading zonegroup 
info: ret -2 (2) No such file or directory

couldn't init storage provider


I require --rgw-zonegroup=us on command line or /etc/ceph/ceph.conf

This seems to be regression.




--
Richard Chan



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] radosgw multi site different period

2017-11-15 Thread Casey Bodley
Your period configuration is indeed consistent between zones. This 
"master is on a different period" error is specific to the metadata sync 
status. It's saying that zone b is unable to finish syncing the metadata 
changes from zone a that occurred during the previous period. Even 
though zone b was the master during that period, it needs to re-sync 
from zone a to make sure everyone ends up with a consistent view (even 
if this results in the loss of metadata changes).


It sounds like zone a was re-promoted to master before it had a chance 
to catch up completely. The docs offer some guidance [1] to avoid this 
situation, but you can recover on zone b by running `radosgw-admin 
metadata sync init` and restarting its gateways to restart a full sync.
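
Roughly, on zone b (the gateway unit name is just an example):

radosgw-admin metadata sync init
systemctl restart ceph-radosgw@rgw.ceph-b-1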


[1] 
http://docs.ceph.com/docs/luminous/radosgw/multisite/#changing-the-metadata-master-zone


On 11/15/2017 02:56 AM, Kim-Norman Sahm wrote:

both cluster are in the same epoch and period:
  
root@ceph-a-1:~# radosgw-admin period get-current

{
 "current_period": "b7392c41-9cbe-4d92-ad03-db607dd7d569"
}

root@ceph-b-1:~# radosgw-admin period get-current
{
 "current_period": "b7392c41-9cbe-4d92-ad03-db607dd7d569"
}

but the sync state is still "master is on a different period":

root@ceph-b-1:~# radosgw-admin sync status
   realm 833e65be-268f-42c2-8f3c-9bab83ebbff2 (myrealm)
   zonegroup 15550dc6-a761-473f-81e8-0dc6cc5106bd (ceph)
zone 082cd970-bd25-4cbc-a5fd-20f3b3f9dbd2 (b)
   metadata sync syncing
 full sync: 0/64 shards
 master is on a different period:
master_period=b7392c41-9cbe-4d92-ad03-db607dd7d569
local_period=d306a847-77a6-4306-87c9-0bb4fa16cdc4
 incremental sync: 64/64 shards
 metadata is caught up with master
   data sync source: 51019cee-86fb-4b39-b6ba-282171c459c6 (a)
 syncing
 full sync: 0/128 shards
 incremental sync: 128/128 shards
 data is caught up with source


Am Dienstag, den 14.11.2017, 18:21 +0100 schrieb Kim-Norman Sahm:

both cluster are in the same epoch and period:

root@ceph-a-1:~# radosgw-admin period get-current
{
 "current_period": "b7392c41-9cbe-4d92-ad03-db607dd7d569"
}

root@ceph-b-1:~# radosgw-admin period get-current
{
 "current_period": "b7392c41-9cbe-4d92-ad03-db607dd7d569"
}

Am Dienstag, den 14.11.2017, 17:05 + schrieb David Turner:

I'm assuming you've looked at the period in both places `radosgw-
admin period get` and confirmed that the second site is behind the
master site (based on epochs).  I'm also assuming (since you linked
the instructions) that you've done `radosgw-admin period pull` on
the
second site to get any period updates that have been done to the
master site.

If my assumptions are wrong.  Then you should do those things.  If
my
assumptions are correct, then running `radosgw-admin period update
--
commit` on the the master site and `radosgw-admin period pull` on
the
second site might fix this.  If you've already done that as well
(as
they're steps in the article you linked), then you need someone
smarter than I am to chime in.

On Tue, Nov 14, 2017 at 11:35 AM Kim-Norman Sahm 
wrote:

hi,

i've installed a ceph multi site setup with two ceph clusters and
each
one radosgw.
the multi site setup was in sync, so i tried a failover.
cluster A is going down and i've changed the zone (b) on cluster
b
to
the new master zone.
it's working fine.

now i start the cluster A and try to switch back the master zone
to
A.
cluster A believes that he is the master, cluster b is secondary.
but on the secondary is a different period and the bucket delta
is
not
synced to the new master zone:

root@ceph-a-1:~# radosgw-admin sync status
   realm 833e65be-268f-42c2-8f3c-9bab83ebbff2 (myrealm)
   zonegroup 15550dc6-a761-473f-81e8-0dc6cc5106bd (ceph)
zone 51019cee-86fb-4b39-b6ba-282171c459c6 (a)
   metadata sync no sync (zone is master)
   data sync source: 082cd970-bd25-4cbc-a5fd-20f3b3f9dbd2 (b)
 syncing
 full sync: 0/128 shards
 incremental sync: 128/128 shards
 data is caught up with source

root@ceph-b-1:~# radosgw-admin sync status
   realm 833e65be-268f-42c2-8f3c-9bab83ebbff2 (myrealm)
   zonegroup 15550dc6-a761-473f-81e8-0dc6cc5106bd (ceph)
zone 082cd970-bd25-4cbc-a5fd-20f3b3f9dbd2 (b)
   metadata sync syncing
 full sync: 0/64 shards
 master is on a different period:
master_period=b7392c41-9cbe-4d92-ad03-db607dd7d569
local_period=d306a847-77a6-4306-87c9-0bb4fa16cdc4
 incremental sync: 64/64 shards
 metadata is caught up with master
   data sync source: 51019cee-86fb-4b39-b6ba-282171c459c6 (a)
 syncing
 full sync: 0/128 shards
  

Re: [ceph-users] Reuse pool id

2017-11-15 Thread David Turner
It's probably against the inner workings of Ceph to change the ID of a
pool.  There are a couple of other things in Ceph that keep old data around,
most likely to prevent potential collisions.  One in particular is keeping
deleted_snaps in the OSD map indefinitely.

One thing I can think of in particular with the pool ID is that I deleted a
large pool 3 weeks ago and there are still copies of its PGs being deleted
from the OSDs now.  If a new pool were created with the same ID, the PGs
could collide.

On Wed, Nov 15, 2017 at 11:49 AM Karun Josy  wrote:

> Any suggestions ?
>
> Karun Josy
>
> On Mon, Nov 13, 2017 at 10:06 PM, Karun Josy  wrote:
>
>> Hi,
>>
>> Is there anyway we can change or reuse pool id ?
>> I had created and deleted lot of test pools. So the IDs kind of look like
>> this now:
>>
>> ---
>> $ ceph osd lspools
>> 34 imagepool,37 cvmpool,40 testecpool,41 ecpool1,
>> --
>>
>> Can I change it to 0,1,2,3 etc ?
>>
>> Karun
>>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Moving bluestore WAL and DB after bluestore creation

2017-11-15 Thread David Turner
I'm not going to lie: this makes me dislike Bluestore quite a bit.  Journaling
multiple OSDs to one SSD allowed you to monitor the write durability of the
SSD and replace it without having to out and re-add all of the OSDs on the
device.  Having to now out and backfill back onto the HDDs is awful; it would
have made the time I realized that 20 journal SSDs had all run low on writes
at once nearly impossible to recover from.

Flushing journals, replacing SSDs, and bringing it all back online was a
slick process.  Formatting the HDDs and backfilling back onto the same
disks sounds like a big regression.  A process to migrate the WAL and DB
onto the HDD and then back off to a new device would be very helpful.
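
For anyone who hasn't done it, that filestore-era swap was roughly this per OSD
(IDs, unit names and partition handling illustrative):

ceph osd set noout
systemctl stop ceph-osd@1
ceph-osd -i 1 --flush-journal
# swap the SSD and recreate the journal partition, then:
ceph-osd -i 1 --mkjournal
systemctl start ceph-osd@1
ceph osd unset noout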

On Wed, Nov 15, 2017 at 10:51 AM Mario Giammarco 
wrote:

> It seems it is not possible. I recreated the OSD
>
> 2017-11-12 17:44 GMT+01:00 Shawn Edwards :
>
>> I've created some Bluestore OSD with all data (wal, db, and data) all on
>> the same rotating disk.  I would like to now move the wal and db onto an
>> nvme disk.  Is that possible without re-creating the OSD?
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Reuse pool id

2017-11-15 Thread Karun Josy
Any suggestions ?

Karun Josy

On Mon, Nov 13, 2017 at 10:06 PM, Karun Josy  wrote:

> Hi,
>
> Is there anyway we can change or reuse pool id ?
> I had created and deleted lot of test pools. So the IDs kind of look like
> this now:
>
> ---
> $ ceph osd lspools
> 34 imagepool,37 cvmpool,40 testecpool,41 ecpool1,
> --
>
> Can I change it to 0,1,2,3 etc ?
>
> Karun
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Fwd: Luminous RadosGW issue

2017-11-15 Thread Sam Huracan
Thanks Hans, I've fixed it.
Ceph Luminous automatically creates a client.rgw user; I didn't know that and
had made a new user, client.radosgw.
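
For anyone hitting the same thing, the fix was just making the ceph.conf section
name match the client name the gateway actually runs as (as shown by the .asok
file name), roughly:

[client.rgw.radosgw]
host = radosgw
rgw dns name = radosgw.demo.com
rgw print continue = false
log file = /var/log/radosgw/client.rgw.radosgw.log

(settings copied from my earlier mail, with the log file name adjusted to match
the new section name)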


On Nov 9, 2017 17:03, "Hans van den Bogert"  wrote:

> On Nov 9, 2017, at 5:25 AM, Sam Huracan  wrote:
>
> root@radosgw system]# ceph --admin-daemon 
> /var/run/ceph/ceph-client.rgw.radosgw.asok
> config show | grep log_file
> "log_file": "/var/log/ceph/ceph-client.rgw.radosgw.log”,
>
>
> The .asok filename resembles what should be used in your config. If Im
> right you should use ‘client.rgw.radosgw’ in your ceph.conf.
>
>
>
> On Nov 9, 2017, at 5:25 AM, Sam Huracan  wrote:
>
> @Hans: Yes, I tried to redeploy RGW, and ensure client.radosgw.gateway is
> the same in ceph.conf.
> Everything go well, service radosgw running, port 7480 is opened, but all
> my config of radosgw in ceph.conf can't be set, rgw_dns_name is still
> empty, and log file keeps default value.
>
> [root@radosgw system]# ceph --admin-daemon 
> /var/run/ceph/ceph-client.rgw.radosgw.asok
> config show | grep log_file
> "log_file": "/var/log/ceph/ceph-client.rgw.radosgw.log",
>
>
> [root@radosgw system]# cat /etc/ceph/ceph.client.radosgw.keyring
> [client.radosgw.gateway]
> key = AQCsywNaqQdDHxAAC24O8CJ0A9Gn6qeiPalEYg==
> caps mon = "allow rwx"
> caps osd = "allow rwx"
>
>
> 2017-11-09 6:11 GMT+07:00 Hans van den Bogert :
>
>> Are you sure you deployed it with the client.radosgw.gateway name as
>> well? Try to redeploy the RGW and make sure the name you give it
>> corresponds to the name you give in the ceph.conf. Also, do not forget
>> to push the ceph.conf to the RGW machine.
>>
>> On Wed, Nov 8, 2017 at 11:44 PM, Sam Huracan 
>> wrote:
>> >
>> >
>> > Hi Cephers,
>> >
>> > I'm testing RadosGW in Luminous version.  I've already installed done
>> in separate host, service is running but RadosGW did not accept any my
>> configuration in ceph.conf.
>> >
>> > My Config:
>> > [client.radosgw.gateway]
>> > host = radosgw
>> > keyring = /etc/ceph/ceph.client.radosgw.keyring
>> > rgw socket path = /var/run/ceph/ceph.radosgw.gateway.fastcgi.sock
>> > log file = /var/log/radosgw/client.radosgw.gateway.log
>> > rgw dns name = radosgw.demo.com
>> > rgw print continue = false
>> >
>> >
>> > When I show config of radosgw socket:
>> > [root@radosgw ~]# ceph --admin-daemon 
>> > /var/run/ceph/ceph-client.rgw.radosgw.asok
>> config show | grep dns
>> > "mon_dns_srv_name": "",
>> > "rgw_dns_name": "",
>> > "rgw_dns_s3website_name": "",
>> >
>> > rgw_dns_name is empty, hence S3 API is unable to access Ceph Object
>> Storage.
>> >
>> >
>> > Do anyone meet this issue?
>> >
>> > My ceph version I'm  using is ceph-radosgw-12.2.1-0.el7.x86_64
>> >
>> > Thanks in advance
>> >
>> >
>> > ___
>> > ceph-users mailing list
>> > ceph-users@lists.ceph.com
>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >
>>
>
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Moving bluestore WAL and DB after bluestore creation

2017-11-15 Thread Mario Giammarco
It seems it is not possible. I recreated the OSD

2017-11-12 17:44 GMT+01:00 Shawn Edwards :

> I've created some Bluestore OSD with all data (wal, db, and data) all on
> the same rotating disk.  I would like to now move the wal and db onto an
> nvme disk.  Is that possible without re-creating the OSD?
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Separation of public/cluster networks

2017-11-15 Thread Wido den Hollander

> On 15 November 2017 at 15:03, Richard Hesketh wrote:
> 
> 
> On 15/11/17 12:58, Micha Krause wrote:
> > Hi,
> > 
> > I've build a few clusters with separated public/cluster network, but I'm 
> > wondering if this is really
> > the way to go.
> > 
> > http://docs.ceph.com/docs/jewel/rados/configuration/network-config-ref
> > 
> > states 2 reasons:
> > 
> > 1. There is more traffic in the backend, which could cause latencies in the 
> > public network.
> > 
> >  Is a low latency public network really an advantage, if my cluster network 
> > has high latency?
> > 
> > 2. Security: evil users could cause damage in the cluster net.
> > 
> >  Couldn't you cause the same kind, or even more damage in the public 
> > network?
> > 
> > 
> > On the other hand, if one host looses it's cluster network, it will report 
> > random OSDs down over the
> > remaining public net. (yes I know about the "mon osd min down reporters" 
> > workaround)
> > 
> > 
> > Advantages of a single, shared network:
> > 
> > 1. Hosts with network problems, that can't reach other OSDs, all so can't 
> > reach the mon. So our mon server doesn't get conflicting informations.
> > 
> > 2. Given the same network bandwidth overall, OSDs can use a bigger part of 
> > the bandwidth for backend traffic.
> > 
> > 3. KISS principle.
> > 
> > So if my server has 4 x 10GB/s network should I really split them in 2 x 
> > 20GB/s (cluster/public) or am I
> > better off using 1 x 40GB/s (shared)?
> > 
> > Micha Krause
> 
> I have two clusters, one running all-public-network and one with separated 
> public/cluster networks. The latter is a bit of a pain because it's much more 
> fiddly if I have to change anything, and also there is basically no point to 
> it being set up this way (it all goes into the same switch so there's no real 
> redundancy).
> 
> To quote Wido 
> (http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-April/017527.html):
> > I rarely use public/cluster networks as they don't add anything for most
> > systems. 20Gbit of bandwidth per node is more then enough in most cases and
> > my opinion is that multiple IPs per machine only add complexity.
> 
> Unless you actually have to make your cluster available on a public network 
> which you don't control/trust I really don't think there's much point in 
> splitting things up; just bond your links together. Even if you still want to 
> logically split cluster/public network so they're in different subnets, you 
> can just assign multiple IPs to the link or potentially set up VLAN tagging 
> on the switch/interfaces if you want your traffic a bit more securely 
> segregated.
> 

Thanks for the quote!

I still think about it that way. Public and cluster networks might make sense 
in some cases, but imho they shouldn't be the default.

One IP per machine is just KISS. It's up or down, not half-up, half-down.

In the 7 years I've now been running and building Ceph systems, I have never run 
into a case where I thought: I would really like/need a cluster network here.

A 20Gbit LACP bond is sufficient for a bunch of disks in a system.

And when you go full NVMe you might need more bandwidth; add a 40Gbit NIC to 
such a system or so.

To finish with, bandwidth usually isn't really the issue; latency is. You'll run 
into latency problems (Ceph code, slow CPUs, disk latency) before network 
latency or bandwidth becomes the limiting factor.

Wido

> Rich
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Separation of public/cluster networks

2017-11-15 Thread Richard Hesketh
On 15/11/17 12:58, Micha Krause wrote:
> Hi,
> 
> I've build a few clusters with separated public/cluster network, but I'm 
> wondering if this is really
> the way to go.
> 
> http://docs.ceph.com/docs/jewel/rados/configuration/network-config-ref
> 
> states 2 reasons:
> 
> 1. There is more traffic in the backend, which could cause latencies in the 
> public network.
> 
>  Is a low latency public network really an advantage, if my cluster network 
> has high latency?
> 
> 2. Security: evil users could cause damage in the cluster net.
> 
>  Couldn't you cause the same kind, or even more damage in the public network?
> 
> 
> On the other hand, if one host looses it's cluster network, it will report 
> random OSDs down over the
> remaining public net. (yes I know about the "mon osd min down reporters" 
> workaround)
> 
> 
> Advantages of a single, shared network:
> 
> 1. Hosts with network problems, that can't reach other OSDs, all so can't 
> reach the mon. So our mon server doesn't get conflicting informations.
> 
> 2. Given the same network bandwidth overall, OSDs can use a bigger part of 
> the bandwidth for backend traffic.
> 
> 3. KISS principle.
> 
> So if my server has 4 x 10GB/s network should I really split them in 2 x 
> 20GB/s (cluster/public) or am I
> better off using 1 x 40GB/s (shared)?
> 
> Micha Krause

I have two clusters, one running all-public-network and one with separated 
public/cluster networks. The latter is a bit of a pain because it's much more 
fiddly if I have to change anything, and also there is basically no point to it 
being set up this way (it all goes into the same switch so there's no real 
redundancy).

To quote Wido 
(http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-April/017527.html):
> I rarely use public/cluster networks as they don't add anything for most
> systems. 20Gbit of bandwidth per node is more then enough in most cases and
> my opinion is that multiple IPs per machine only add complexity.

Unless you actually have to make your cluster available on a public network 
which you don't control/trust I really don't think there's much point in 
splitting things up; just bond your links together. Even if you still want to 
logically split cluster/public network so they're in different subnets, you can 
just assign multiple IPs to the link or potentially set up VLAN tagging on the 
switch/interfaces if you want your traffic a bit more securely segregated.
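
For example, even a purely logical split only needs something like this in 
ceph.conf (subnets illustrative), with both subnets living on the same bond or 
VLAN-tagged interface:

[global]
public network = 192.168.1.0/24
cluster network = 192.168.2.0/24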

Rich



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-deploy won't install luminous

2017-11-15 Thread jorpilo
I think you can always use the --release luminous option, which will ensure you 
install Luminous:

ceph-deploy install --release luminous node1 node2 node3
 Original message 
From: "Ragan, Tj (Dr.)"
Date: 15/11/17 11:11 a.m. (GMT+01:00)
To: Hans van den Bogert
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] ceph-deploy won't install luminous

Yes, I’ve done that.  I’ve also tried changing the priority field from 1 to 2, 
with no effect.














On 15 Nov 2017, at 09:58, Hans van den Bogert  wrote:



Never mind, you already said you are on the latest ceph-deploy, so that can’t 
be it.

I’m not familiar with deploying on Centos, but I can imagine that the last part 
of the checklist is important:



http://docs.ceph.com/docs/luminous/start/quick-start-preflight/#priorities-preferences



Can you verify that you did that part?



On Nov 15, 2017, at 10:41 AM, Hans van den Bogert  wrote:



Hi,



Can you show the contents of the file, /etc/yum.repos.d/ceph.repo ?



Regards,



Hans

On Nov 15, 2017, at 10:27 AM, Ragan, Tj (Dr.)  wrote:



Hi All,



I feel like I’m doing something silly.  I’m spinning up a new cluster, and 
followed the instructions on the pre-flight and quick start here:



http://docs.ceph.com/docs/luminous/start/quick-start-preflight/

http://docs.ceph.com/docs/luminous/start/quick-ceph-deploy/



but ceph-deploy always installs Jewel.   



ceph-deploy is version 1.5.39 and I’m running CentOS 7 (7-4.1708)



Any help would be appreciated.



-TJ Ragan

___

ceph-users mailing list

ceph-users@lists.ceph.com

http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com













___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-deploy failed to deploy osd randomly

2017-11-15 Thread Wei Jin
I tried doing purge/purgedata and then redoing the deploy command a few
times, and it still fails to start the OSD.
There is no error log either; does anyone know what the problem is?
BTW, my OS is Debian with a 4.4 kernel.
Thanks.


On Wed, Nov 15, 2017 at 8:24 PM, Wei Jin  wrote:
> Hi, List,
>
> My machine has 12 SSDs disk, and I use ceph-deploy to deploy them. But for
> some machine/disks,it failed to start osd.
> I tried many times, some success but others failed. But there is no error
> info.
> Following is ceph-deploy log for one disk:
>
>
> root@n10-075-012:~# ceph-deploy osd create --zap-disk n10-075-094:sdb:sdb
> [ceph_deploy.conf][DEBUG ] found configuration file at:
> /root/.cephdeploy.conf
> [ceph_deploy.cli][INFO  ] Invoked (1.5.39): /usr/bin/ceph-deploy osd create
> --zap-disk n10-075-094:sdb:sdb
> [ceph_deploy.cli][INFO  ] ceph-deploy options:
> [ceph_deploy.cli][INFO  ]  username  : None
> [ceph_deploy.cli][INFO  ]  block_db  : None
> [ceph_deploy.cli][INFO  ]  disk  : [('n10-075-094',
> '/dev/sdb', '/dev/sdb')]
> [ceph_deploy.cli][INFO  ]  dmcrypt   : False
> [ceph_deploy.cli][INFO  ]  verbose   : False
> [ceph_deploy.cli][INFO  ]  bluestore : None
> [ceph_deploy.cli][INFO  ]  block_wal : None
> [ceph_deploy.cli][INFO  ]  overwrite_conf: False
> [ceph_deploy.cli][INFO  ]  subcommand: create
> [ceph_deploy.cli][INFO  ]  dmcrypt_key_dir   :
> /etc/ceph/dmcrypt-keys
> [ceph_deploy.cli][INFO  ]  quiet : False
> [ceph_deploy.cli][INFO  ]  cd_conf   :
> 
> [ceph_deploy.cli][INFO  ]  cluster   : ceph
> [ceph_deploy.cli][INFO  ]  fs_type   : xfs
> [ceph_deploy.cli][INFO  ]  filestore : None
> [ceph_deploy.cli][INFO  ]  func  :  0x7f566ae9a938>
> [ceph_deploy.cli][INFO  ]  ceph_conf : None
> [ceph_deploy.cli][INFO  ]  default_release   : False
> [ceph_deploy.cli][INFO  ]  zap_disk  : True
> [ceph_deploy.osd][DEBUG ] Preparing cluster ceph disks
> n10-075-094:/dev/sdb:/dev/sdb
> [n10-075-094][DEBUG ] connected to host: n10-075-094
> [n10-075-094][DEBUG ] detect platform information from remote host
> [n10-075-094][DEBUG ] detect machine type
> [n10-075-094][DEBUG ] find the location of an executable
> [ceph_deploy.osd][INFO  ] Distro info: debian 8.9 jessie
> [ceph_deploy.osd][DEBUG ] Deploying osd to n10-075-094
> [n10-075-094][DEBUG ] write cluster configuration to
> /etc/ceph/{cluster}.conf
> [ceph_deploy.osd][DEBUG ] Preparing host n10-075-094 disk /dev/sdb journal
> /dev/sdb activate True
> [n10-075-094][DEBUG ] find the location of an executable
> [n10-075-094][INFO  ] Running command: /usr/sbin/ceph-disk -v prepare
> --zap-disk --cluster ceph --fs-type xfs -- /dev/sdb /dev/sdb
> [n10-075-094][WARNIN] command: Running command: /usr/bin/ceph-osd
> --cluster=ceph --show-config-value=fsid
> [n10-075-094][WARNIN] command: Running command: /usr/bin/ceph-osd
> --check-allows-journal -i 0 --log-file $run_dir/$cluster-osd-check.log
> --cluster ceph --setuser ceph --setgroup ceph
> [n10-075-094][WARNIN] command: Running command: /usr/bin/ceph-osd
> --check-wants-journal -i 0 --log-file $run_dir/$cluster-osd-check.log
> --cluster ceph --setuser ceph --setgroup ceph
> [n10-075-094][WARNIN] command: Running command: /usr/bin/ceph-osd
> --check-needs-journal -i 0 --log-file $run_dir/$cluster-osd-check.log
> --cluster ceph --setuser ceph --setgroup ceph
> [n10-075-094][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdb uuid path is
> /sys/dev/block/8:16/dm/uuid
> [n10-075-094][WARNIN] command: Running command: /usr/bin/ceph-osd
> --cluster=ceph --show-config-value=osd_journal_size
> [n10-075-094][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdb uuid path is
> /sys/dev/block/8:16/dm/uuid
> [n10-075-094][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdb uuid path is
> /sys/dev/block/8:16/dm/uuid
> [n10-075-094][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdb uuid path is
> /sys/dev/block/8:16/dm/uuid
> [n10-075-094][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdb1 uuid path is
> /sys/dev/block/8:17/dm/uuid
> [n10-075-094][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdb2 uuid path is
> /sys/dev/block/8:18/dm/uuid
> [n10-075-094][WARNIN] command: Running command: /usr/bin/ceph-conf
> --cluster=ceph --name=osd. --lookup osd_mkfs_options_xfs
> [n10-075-094][WARNIN] command: Running command: /usr/bin/ceph-conf
> --cluster=ceph --name=osd. --lookup osd_fs_mkfs_options_xfs
> [n10-075-094][WARNIN] command: Running command: /usr/bin/ceph-conf
> --cluster=ceph --name=osd. --lookup osd_mount_options_xfs
> [n10-075-094][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdb uuid path is
> /sys/dev/block/8:16/dm/uuid
> [n10-075-094][WARNIN] zap: Zapping partition table on /dev/sdb
> [n10-075-094][WARNIN] 

Re: [ceph-users] Cluster network slower than public network

2017-11-15 Thread Gandalf Corvotempesta
Any idea?
I have one 16-port 10Gb switch, two or more 24-port gigabit switches, and 5
OSD nodes (MONs running on them) plus 5 hypervisor servers to connect to the
storage.

At least 10 ports are needed for each network, thus 20 ports for both the
cluster and public networks, right?

I don't have 20 10Gb ports.

On 15 Nov 2017 2:10 PM, "Micha Krause"  wrote:

> Hi,
>
> Replication and rebalance should be slow, but what about public I/O ?
>>
>
> Client writes are only acked after replication, so you can't take full
> advantage of your fast public network.
>
> http://docs.ceph.com/docs/giant/_images/ditaa-54719cc959473e68a317f6578f9a2f0f3a8345ee.png
>
> Micha Krause
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Separation of public/cluster networks

2017-11-15 Thread Micha Krause

Hi,

I've built a few clusters with separated public/cluster networks, but I'm
wondering if this is really the way to go.

http://docs.ceph.com/docs/jewel/rados/configuration/network-config-ref

states 2 reasons:

1. There is more traffic in the backend, which could cause latencies in the
public network.

 Is a low-latency public network really an advantage if my cluster network has
high latency?

2. Security: evil users could cause damage in the cluster net.

 Couldn't you cause the same kind of damage, or even more, in the public network?


On the other hand, if one host loses its cluster network, it will report
random OSDs down over the remaining public net. (Yes, I know about the
"mon osd min down reporters" workaround.)


Advantages of a single, shared network:

1. Hosts with network problems that can't reach other OSDs also can't reach
the mon, so our mon servers don't get conflicting information.

2. Given the same overall network bandwidth, OSDs can use a bigger share of the
bandwidth for backend traffic.

3. KISS principle.

So if my servers have 4 x 10Gb/s ports, should I really split them into
2 x 20Gb/s bonds (cluster/public), or am I better off using 1 x 40Gb/s (shared)?
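
For reference, the split itself is only two lines of ceph.conf; a minimal
sketch, where the subnets are assumptions:

# split setup: add to the [global] section of /etc/ceph/ceph.conf
cat >> /etc/ceph/ceph.conf <<'EOF'
public network  = 192.168.10.0/24
cluster network = 192.168.20.0/24
EOF
# shared setup: simply omit "cluster network" (or set both to the same
# subnet) and Ceph uses the public network for replication as well.
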


Micha Krause
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Cluster network slower than public network

2017-11-15 Thread Gandalf Corvotempesta
Small note: all of our nodes (3 to 5, plus 6 hypervisors) have 4 x 10Gb ports,
but we only have 2 x 10Gb switches (small port count, only 16, so we can't
place both networks on the same switch).

We use 2 switches for HA in active-backup mode.

I was thinking of using both 10Gb switches for the public network (the one
connected to the hypervisors) and two 1Gb switches for the cluster network.
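
For reference, the active-backup bond on each node is just a few lines of
ifupdown config; a minimal sketch where the interface names and address are
assumptions (needs the ifenslave package on Debian):

# /etc/network/interfaces.d/bond0 -- active-backup over the two 10Gb ports
cat > /etc/network/interfaces.d/bond0 <<'EOF'
auto bond0
iface bond0 inet static
    address 192.168.10.11
    netmask 255.255.255.0
    bond-slaves ens1f0 ens1f1
    bond-mode active-backup
    bond-miimon 100
EOF
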

On 15 Nov 2017 1:50 PM, "Gandalf Corvotempesta" <
gandalf.corvotempe...@gmail.com> wrote:

> As 10Gb switches are expensive, what would happen by using a gigabit
> cluster network and a 10Gb public network?
>
> Replication and rebalance would be slow, but what about public I/O?
> When a client writes to a file, does it go over the public network with Ceph
> automatically replicating it over the cluster network, or is the whole I/O
> done over the public network?
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Cluster network slower than public network

2017-11-15 Thread Gandalf Corvotempesta
As 10Gb switches are expensive, what would happen by using a gigabit
cluster network and a 10Gb public network?

Replication and rebalance would be slow, but what about public I/O?
When a client writes to a file, does it go over the public network with Ceph
automatically replicating it over the cluster network, or is the whole I/O
done over the public network?
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] ceph-deploy failed to deploy osd randomly

2017-11-15 Thread Wei Jin
Hi, List,

My machine has 12 SSDs.
There are some errors from ceph-deploy; it fails randomly.

root@n10-075-012:~# *ceph-deploy osd create --zap-disk n10-075-094:sdb:sdb*
[ceph_deploy.conf][DEBUG ] found configuration file at:
/root/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (1.5.39): /usr/bin/ceph-deploy osd create
--zap-disk n10-075-094:sdb:sdb
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username  : None
[ceph_deploy.cli][INFO  ]  block_db  : None
[ceph_deploy.cli][INFO  ]  disk  : [('n10-075-094',
'/dev/sdb', '/dev/sdb')]
[ceph_deploy.cli][INFO  ]  dmcrypt   : False
[ceph_deploy.cli][INFO  ]  verbose   : False
[ceph_deploy.cli][INFO  ]  bluestore : None
[ceph_deploy.cli][INFO  ]  block_wal : None
[ceph_deploy.cli][INFO  ]  overwrite_conf: False
[ceph_deploy.cli][INFO  ]  subcommand: create
[ceph_deploy.cli][INFO  ]  dmcrypt_key_dir   :
/etc/ceph/dmcrypt-keys
[ceph_deploy.cli][INFO  ]  quiet : False
[ceph_deploy.cli][INFO  ]  cd_conf   :

[ceph_deploy.cli][INFO  ]  cluster   : ceph
[ceph_deploy.cli][INFO  ]  fs_type   : xfs
[ceph_deploy.cli][INFO  ]  filestore : None
[ceph_deploy.cli][INFO  ]  func  : 
[ceph_deploy.cli][INFO  ]  ceph_conf : None
[ceph_deploy.cli][INFO  ]  default_release   : False
[ceph_deploy.cli][INFO  ]  zap_disk  : True
[ceph_deploy.osd][DEBUG ] Preparing cluster ceph disks
n10-075-094:/dev/sdb:/dev/sdb
[n10-075-094][DEBUG ] connected to host: n10-075-094
[n10-075-094][DEBUG ] detect platform information from remote host
[n10-075-094][DEBUG ] detect machine type
[n10-075-094][DEBUG ] find the location of an executable
[ceph_deploy.osd][INFO  ] Distro info: debian 8.9 jessie
[ceph_deploy.osd][DEBUG ] Deploying osd to n10-075-094
[n10-075-094][DEBUG ] write cluster configuration to
/etc/ceph/{cluster}.conf
[ceph_deploy.osd][DEBUG ] Preparing host n10-075-094 disk /dev/sdb journal
/dev/sdb activate True
[n10-075-094][DEBUG ] find the location of an executable
[n10-075-094][INFO  ] Running command: /usr/sbin/ceph-disk -v prepare
--zap-disk --cluster ceph --fs-type xfs -- /dev/sdb /dev/sdb
[n10-075-094][WARNIN] command: Running command: /usr/bin/ceph-osd
--cluster=ceph --show-config-value=fsid
[n10-075-094][WARNIN] command: Running command: /usr/bin/ceph-osd
--check-allows-journal -i 0 --log-file $run_dir/$cluster-osd-check.log
--cluster ceph --setuser ceph --setgroup ceph
[n10-075-094][WARNIN] command: Running command: /usr/bin/ceph-osd
--check-wants-journal -i 0 --log-file $run_dir/$cluster-osd-check.log
--cluster ceph --setuser ceph --setgroup ceph
[n10-075-094][WARNIN] command: Running command: /usr/bin/ceph-osd
--check-needs-journal -i 0 --log-file $run_dir/$cluster-osd-check.log
--cluster ceph --setuser ceph --setgroup ceph
[n10-075-094][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdb uuid path is
/sys/dev/block/8:16/dm/uuid
[n10-075-094][WARNIN] command: Running command: /usr/bin/ceph-osd
--cluster=ceph --show-config-value=osd_journal_size
[n10-075-094][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdb uuid path is
/sys/dev/block/8:16/dm/uuid
[n10-075-094][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdb uuid path is
/sys/dev/block/8:16/dm/uuid
[n10-075-094][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdb uuid path is
/sys/dev/block/8:16/dm/uuid
[n10-075-094][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdb1 uuid path is
/sys/dev/block/8:17/dm/uuid
[n10-075-094][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdb2 uuid path is
/sys/dev/block/8:18/dm/uuid
[n10-075-094][WARNIN] command: Running command: /usr/bin/ceph-conf
--cluster=ceph --name=osd. --lookup osd_mkfs_options_xfs
[n10-075-094][WARNIN] command: Running command: /usr/bin/ceph-conf
--cluster=ceph --name=osd. --lookup osd_fs_mkfs_options_xfs
[n10-075-094][WARNIN] command: Running command: /usr/bin/ceph-conf
--cluster=ceph --name=osd. --lookup osd_mount_options_xfs
[n10-075-094][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdb uuid path is
/sys/dev/block/8:16/dm/uuid
[n10-075-094][WARNIN] zap: Zapping partition table on /dev/sdb
[n10-075-094][WARNIN] command_check_call: Running command: /sbin/sgdisk
--zap-all -- /dev/sdb
[n10-075-094][WARNIN] Caution: invalid backup GPT header, but valid main
header; regenerating
[n10-075-094][WARNIN] backup header from main header.
[n10-075-094][WARNIN]
[n10-075-094][WARNIN] Warning! Main and backup partition tables differ! Use
the 'c' and 'e' options
[n10-075-094][WARNIN] on the recovery & transformation menu to examine the
two tables.
[n10-075-094][WARNIN]
[n10-075-094][WARNIN] Warning! One or more CRCs don't match. You should
repair the disk!
[n10-075-094][WARNIN]
[n10-075-094][DEBUG ] **
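
For what it's worth, when sgdisk warns about mismatched GPT headers like
above, I'd wipe the device completely before re-running ceph-deploy; a
minimal sketch (device name assumed -- this destroys all data on it):

ceph-disk zap /dev/sdb       # or, more thoroughly:
sgdisk --zap-all /dev/sdb
wipefs -a /dev/sdb
dd if=/dev/zero of=/dev/sdb bs=1M count=100 oflag=direct
partprobe /dev/sdb
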

[ceph-users] ceph-deploy failed to deploy osd randomly

2017-11-15 Thread Wei Jin
Hi, List,

My machine has 12 SSD disks, and I use ceph-deploy to deploy them. But for some
machines/disks, it fails to start the OSD.
I tried many times; some succeed but others fail, and there is no error info.
Following is the ceph-deploy log for one disk:


root@n10-075-012:~# ceph-deploy osd create --zap-disk n10-075-094:sdb:sdb
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (1.5.39): /usr/bin/ceph-deploy osd create 
--zap-disk n10-075-094:sdb:sdb
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username  : None
[ceph_deploy.cli][INFO  ]  block_db  : None
[ceph_deploy.cli][INFO  ]  disk  : [('n10-075-094', 
'/dev/sdb', '/dev/sdb')]
[ceph_deploy.cli][INFO  ]  dmcrypt   : False
[ceph_deploy.cli][INFO  ]  verbose   : False
[ceph_deploy.cli][INFO  ]  bluestore : None
[ceph_deploy.cli][INFO  ]  block_wal : None
[ceph_deploy.cli][INFO  ]  overwrite_conf: False
[ceph_deploy.cli][INFO  ]  subcommand: create
[ceph_deploy.cli][INFO  ]  dmcrypt_key_dir   : 
/etc/ceph/dmcrypt-keys
[ceph_deploy.cli][INFO  ]  quiet : False
[ceph_deploy.cli][INFO  ]  cd_conf   : 

[ceph_deploy.cli][INFO  ]  cluster   : ceph
[ceph_deploy.cli][INFO  ]  fs_type   : xfs
[ceph_deploy.cli][INFO  ]  filestore : None
[ceph_deploy.cli][INFO  ]  func  : 
[ceph_deploy.cli][INFO  ]  ceph_conf : None
[ceph_deploy.cli][INFO  ]  default_release   : False
[ceph_deploy.cli][INFO  ]  zap_disk  : True
[ceph_deploy.osd][DEBUG ] Preparing cluster ceph disks 
n10-075-094:/dev/sdb:/dev/sdb
[n10-075-094][DEBUG ] connected to host: n10-075-094
[n10-075-094][DEBUG ] detect platform information from remote host
[n10-075-094][DEBUG ] detect machine type
[n10-075-094][DEBUG ] find the location of an executable
[ceph_deploy.osd][INFO  ] Distro info: debian 8.9 jessie
[ceph_deploy.osd][DEBUG ] Deploying osd to n10-075-094
[n10-075-094][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.osd][DEBUG ] Preparing host n10-075-094 disk /dev/sdb journal 
/dev/sdb activate True
[n10-075-094][DEBUG ] find the location of an executable
[n10-075-094][INFO  ] Running command: /usr/sbin/ceph-disk -v prepare 
--zap-disk --cluster ceph --fs-type xfs -- /dev/sdb /dev/sdb
[n10-075-094][WARNIN] command: Running command: /usr/bin/ceph-osd 
--cluster=ceph --show-config-value=fsid
[n10-075-094][WARNIN] command: Running command: /usr/bin/ceph-osd 
--check-allows-journal -i 0 --log-file $run_dir/$cluster-osd-check.log 
--cluster ceph --setuser ceph --setgroup ceph
[n10-075-094][WARNIN] command: Running command: /usr/bin/ceph-osd 
--check-wants-journal -i 0 --log-file $run_dir/$cluster-osd-check.log --cluster 
ceph --setuser ceph --setgroup ceph
[n10-075-094][WARNIN] command: Running command: /usr/bin/ceph-osd 
--check-needs-journal -i 0 --log-file $run_dir/$cluster-osd-check.log --cluster 
ceph --setuser ceph --setgroup ceph
[n10-075-094][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdb uuid path is 
/sys/dev/block/8:16/dm/uuid
[n10-075-094][WARNIN] command: Running command: /usr/bin/ceph-osd 
--cluster=ceph --show-config-value=osd_journal_size
[n10-075-094][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdb uuid path is 
/sys/dev/block/8:16/dm/uuid
[n10-075-094][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdb uuid path is 
/sys/dev/block/8:16/dm/uuid
[n10-075-094][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdb uuid path is 
/sys/dev/block/8:16/dm/uuid
[n10-075-094][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdb1 uuid path is 
/sys/dev/block/8:17/dm/uuid
[n10-075-094][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdb2 uuid path is 
/sys/dev/block/8:18/dm/uuid
[n10-075-094][WARNIN] command: Running command: /usr/bin/ceph-conf 
--cluster=ceph --name=osd. --lookup osd_mkfs_options_xfs
[n10-075-094][WARNIN] command: Running command: /usr/bin/ceph-conf 
--cluster=ceph --name=osd. --lookup osd_fs_mkfs_options_xfs
[n10-075-094][WARNIN] command: Running command: /usr/bin/ceph-conf 
--cluster=ceph --name=osd. --lookup osd_mount_options_xfs
[n10-075-094][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdb uuid path is 
/sys/dev/block/8:16/dm/uuid
[n10-075-094][WARNIN] zap: Zapping partition table on /dev/sdb
[n10-075-094][WARNIN] command_check_call: Running command: /sbin/sgdisk 
--zap-all -- /dev/sdb
[n10-075-094][WARNIN] Caution: invalid backup GPT header, but valid main 
header; regenerating
[n10-075-094][WARNIN] backup header from main header.
[n10-075-094][WARNIN]
[n10-075-094][WARNIN] Warning! Main and backup partition tables differ! Use the 
'c' and 'e' options
[n10-075-094][WARNIN] on the recovery & transformation menu to examine the two 
tables.

Re: [ceph-users] who is using nfs-ganesha and cephfs?

2017-11-15 Thread Jens-U. Mozdzen

Hi all,

By Sage Weil :

Who is running nfs-ganesha's FSAL to export CephFS?  What has your
experience been?

(We are working on building proper testing and support for this into
Mimic, but the ganesha FSAL has been around for years.)


After we had moved most of our file-based data to a CephFS environment and
were suffering from what later turned out to be a (mis-)configuration issue
with our existing nfsd server, I decided to give Ganesha a try.


We run a Ceph cluster on three servers, openSUSE Leap 42.3, Ceph  
Luminous (latest stable). 2x10G interfaces for intra-cluster  
communication, 2x1G towards the NFS clients. CephFS meta-data is on an  
SSD pool, the actual data is on SAS HDDs, 12 OSDs. Ganesha version is  
2.5.2.0+git.1504275777.a9d23b98f-3.6. All Ganesha/nfsd server services  
are on one of the servers that are also Ceph nodes.


We run an automated, distributed build environment (tons of gcc  
compiles on multiple clients, some Java compiles, RPM builds etc.),  
with (among others) nightly test build runs. These usually take about  
8 hours, when using kernel nfsd and local storage on the same servers  
that also provide the Ceph service.


After switching to Ganesha (with the CephFS FSAL, Ganesha running on the same
server where we originally had run nfsd) and starting test runs of the same
workload, we aborted the runs after about 12 hours - by then, only an
estimated 60 percent of the jobs were done.


For comparison, when now using kernel nfsd to serve the CephFS shares  
(mounting the single CephFS via kernel FS module on the server that's  
running nfsd, and exporting multiple sub-directories via nfsd), we see  
an increase of between none and eight percent of the original run time.


So to us, comparing "Ganesha+CephFS FSAL" to "kernel nfsd with kernel  
CephFS module", the latter wins. Or to put it the other way around,  
Ganesha seems unusable to us in its current state, judging by the  
slowness observed.


Other issues I noticed:

- the directory size, as shown by "ls -l" on the client, was very  
different from that shown when mounting via nfsd ;)


- "showmount" did not return any entries, with would have (later on,  
had we continued to use Ganesha) caused problems with our dynamic  
automouter maps


Please note that I did not have time to do intensive testing against
different Ganesha parameters. The only runs I made were with or without
"MaxRead = 1048576; MaxWrite = 1048576;" per share, following some comment
about buffer sizes. These changes didn't seem to make much difference, though.
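
For reference, a minimal CephFS FSAL export block looks roughly like this
(path, pseudo path and export id are assumptions, not our exact config; the
commented lines are the buffer-size tweak mentioned above):

cat > /etc/ganesha/ganesha.conf <<'EOF'
EXPORT
{
    Export_ID = 1;
    Path = "/";
    Pseudo = "/cephfs";
    Access_Type = RW;
    Squash = No_Root_Squash;
    Protocols = 3, 4;
    Transports = "TCP";
    # MaxRead = 1048576;
    # MaxWrite = 1048576;
    FSAL {
        Name = CEPH;
    }
}
EOF
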


We closely monitor our network and server performance, I could clearly  
see a huge drop of network traffic (NFS server to clients) when  
switching from nfsd to Ganesha, and an according increase when  
switching back to nfsd (sharing the CephFS mount). None of the servers  
seemed to be under excessive load during these tests but it was  
obvious that Ganesha took its share of CPU - maybe the bottleneck was some
single-threaded operation, so Ganesha could not make use of the other, idling
cores. But I'm just guessing here.


Regards,
J


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] HW Raid vs. Multiple OSD

2017-11-15 Thread Phil Schwarz
Hi,
thanks for the explanation, but...
Twisting the Ceph storage model as you plan it is not a good idea:
- You will decrease the support level (I'm not sure many people will
build such an architecture).
- You are certainly going to face strange issues with Ceph OSDs on top of
HW RAID.
- You shouldn't want to go to size=2. I know the counterparts of size=3
(IOPS, usable space), but it does not seem really safe to downgrade to size=2.
- Your servers seem to have enough horsepower regarding CPU, RAM and
disks. But you haven't told us about the Ceph replication network. At least
10GbE, I hope.
- Your public network should be more than 1GbE too, far more.
- How will you export the VMs? A single KVM/Samba server? Ceph (cephx) clients?
- Roughly, with size=3 and 4 servers, you have 4*8*2/3, i.e. ~21TB usable
space. With 100 VDIs, ~210GB per VM. Is that enough to expand those VM sizes?
(See the quick estimate below.)
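
A quick way to redo that estimate with the actual 1.9TB drives (all numbers
below are assumptions taken from the proposal above):

python -c 'servers=4; disks=8; tb=1.9; size=3; vdi=100
raw = servers*disks*tb
print("raw %.1f TB, usable ~%.1f TB at size=%d, ~%.0f GB per VDI"
      % (raw, raw/size, size, raw/size*1000/vdi))'
# -> raw 60.8 TB, usable ~20.3 TB at size=3, ~203 GB per VDI
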


In conclusion, I fully understand the difficulty of doing a complete test lab
before buying a complete cluster. But you should do a few tests before
tweaking the solution to your needs.

Good luck
Best regards


On 14/11/2017 at 11:36, Oscar Segarra wrote:
> Hi Anthony,
> 
> 
> o I think you might have some misunderstandings about how Ceph works. 
> Ceph is best deployed as a single cluster spanning multiple servers,
> generally at least 3.  Is that your plan?   
> 
> I want to deply servers for 100VDI Windows 10 each (at least 3 servers).
> I plan to sell servers dependingo of the number of VDI required by my
> customer. For 100 VDI --> 3 servers, for 400 VDI --> 4 servers
> 
> This is my proposal of configuration:
> 
> *Server1:*
> CPU: 2x16 Core
> RAM: 512
> Disk: 2x400 for OS and 8x1.9TB for VM (SSD)
> 
> *Server2:*
> CPU: 2x16 Core
> RAM: 512
> Disk: 2x400 for OS and 8x1.9TB for VM (SSD)
> 
> *Server3:*
> CPU: 2x16 Core
> RAM: 512
> Disk: 2x400 for OS and 8x1.9TB for VM (SSD)
> 
> *Server4:*
> CPU: 2x16 Core
> RAM: 512
> Disk: 2x400 for OS and 8x1.9TB for VM (SSD)
> ...
> *ServerN:*
> CPU: 2x16 Core
> RAM: 512
> Disk: 2x400 for OS and 8x1.9TB for VM (SSD)
> 
> If I create an OSD for each disk and I pin a core for each osd in a
> server I wil need 8 cores just for managing osd. If I create 4 RAID0 of
> 2 disks each, I will need just 4 osd, and so on:
> 
> 1 osd x 1 disk of 4TB
> 1 osd x 2 disks of 2TB
> 1 osd x 4 disks of 1 TB
> 
> If the CPU cycles used by Ceph are a problem, your architecture has IMHO
> bigger problems.  You need to design for a safety margin of RAM and CPU
> to accommodate spikes in usage, both by Ceph and by your desktops. 
> There is no way each of the systems you describe is going to have enough
> cycles for 100 desktops concurrently active.  You'd be allocating each
> of them only ~3GB of RAM -- I've not had to run MS Windows 10 but even
> with page sharing that seems awfully tight on RAM.
> 
> Sorry, I think my design has not been correctly explained. I hope my
> previous explanation clarifies it. The problem is i'm in the design
> phase and I don't know if ceph CPU cycles can be a problem and that is
> the principal object of this post.
> 
> With the numbers you mention throughout the thread, it would seem as
> though you would end up with potentially as little as 80GB of usable
> space per virtual desktop - will that meet your needs?
> 
> Sorry, I think 80GB is enough, nevertheless, I plan to use RBD clones
> and therefore even with size=2, I think I will have more than 80GB
> available for each vdi.
> 
> In this design phase where I am, every advice is really welcome!
> 
> Thanks a lot
> 
> 2017-11-13 23:40 GMT+01:00 Anthony D'Atri  >:
> 
> Oscar, a few thoughts:
> 
> o I think you might have some misunderstandings about how Ceph
> works.  Ceph is best deployed as a single cluster spanning multiple
> servers, generally at least 3.  Is that your plan?  It sort of
> sounds as though you're thinking of Ceph managing only the drives
> local to each of your converged VDI hosts, like local RAID would. 
> Ceph doesn't work that way.  Well, technically it could but wouldn't
> be a great architecture.  You would want to have at least 3 servers,
> with all of the Ceph OSDs in a single cluster.
> 
> o Re RAID0:
> 
> > Then, may I understand that your advice is a RAID0 for each 4TB? For a
> > balanced configuration...
> >
> > 1 osd x 1 disk of 4TB
> > 1 osd x 2 disks of 2TB
> > 1 osd x 4 disks of 1 TB
> 
> 
> For performance a greater number of smaller drives is generally
> going to be best.  VDI desktops are going to be fairly
> latency-sensitive and you'd really do best with SSDs.  All those
> desktops thrashing a small number of HDDs is not going to deliver
> tolerable performance.
> 
> Don't use RAID at all for the OSDs.  Even if you get hardware RAID
> HBAs, configure JBOD/passthrough mode so that OSDs are deployed
> directly on the drives.  This will minimize latency as well as
>  

Re: [ceph-users] ceps-deploy won't install luminous

2017-11-15 Thread Ragan, Tj (Dr.)
Yes, I’ve done that.  I’ve also tried changing the priority field from 1 to 2, 
with no effect.


On 15 Nov 2017, at 09:58, Hans van den Bogert 
> wrote:

Never mind, you already said you are on the latest ceph-deploy, so that can’t 
be it.
I’m not familiar with deploying on Centos, but I can imagine that the last part 
of the checklist is important:

http://docs.ceph.com/docs/luminous/start/quick-start-preflight/#priorities-preferences

Can you verify that you did that part?

On Nov 15, 2017, at 10:41 AM, Hans van den Bogert  wrote:

Hi,

Can you show the contents of the file, /etc/yum.repos.d/ceph.repo ?

Regards,

Hans
On Nov 15, 2017, at 10:27 AM, Ragan, Tj (Dr.)  wrote:

Hi All,

I feel like I’m doing something silly.  I’m spinning up a new cluster, and 
followed the instructions on the pre-flight and quick start here:

http://docs.ceph.com/docs/luminous/start/quick-start-preflight/
http://docs.ceph.com/docs/luminous/start/quick-ceph-deploy/

but ceph-deploy always installs Jewel.

ceph-deploy is version 1.5.39 and I’m running CentOS 7 (7-4.1708)

Any help would be appreciated.

-TJ Ragan
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceps-deploy won't install luminous

2017-11-15 Thread Ragan, Tj (Dr.)
$ cat /etc/yum.repos.d/ceph.repo
[Ceph]
name=Ceph packages for $basearch
baseurl=http://download.ceph.com/rpm-jewel/el7/$basearch
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc
priority=1

[Ceph-noarch]
name=Ceph noarch packages
baseurl=http://download.ceph.com/rpm-jewel/el7/noarch
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc
priority=1

[ceph-source]
name=Ceph source packages
baseurl=http://download.ceph.com/rpm-jewel/el7/SRPMS
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc
priority=1



It’s worth noting that if I change these to rpm-luminous then it breaks too:

$ sudo sed -i 's/jewel/luminous/' /etc/yum.repos.d/ceph.repo
$ cat !$
cat /etc/yum.repos.d/ceph.repo
[Ceph]
name=Ceph packages for $basearch
baseurl=http://download.ceph.com/rpm-luminous/el7/$basearch
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc
priority=1

[Ceph-noarch]
name=Ceph noarch packages
baseurl=http://download.ceph.com/rpm-luminous/el7/noarch
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc
priority=1

[ceph-source]
name=Ceph source packages
baseurl=http://download.ceph.com/rpm-luminous/el7/SRPMS
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc
priority=1

$ ceph-deploy install --release luminous admin1
[ceph_deploy.conf][DEBUG ] found configuration file at: 
/home/ceph-deploy/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (1.5.37): /bin/ceph-deploy install --release 
luminous admin1
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  verbose   : False
[ceph_deploy.cli][INFO  ]  testing   : None
[ceph_deploy.cli][INFO  ]  cd_conf   : 

[ceph_deploy.cli][INFO  ]  cluster   : ceph
[ceph_deploy.cli][INFO  ]  dev_commit: None
[ceph_deploy.cli][INFO  ]  install_mds   : False
[ceph_deploy.cli][INFO  ]  stable: None
[ceph_deploy.cli][INFO  ]  default_release   : False
[ceph_deploy.cli][INFO  ]  username  : None
[ceph_deploy.cli][INFO  ]  adjust_repos  : True
[ceph_deploy.cli][INFO  ]  func  : 
[ceph_deploy.cli][INFO  ]  install_all   : False
[ceph_deploy.cli][INFO  ]  repo  : False
[ceph_deploy.cli][INFO  ]  host  : ['admin1']
[ceph_deploy.cli][INFO  ]  install_rgw   : False
[ceph_deploy.cli][INFO  ]  install_tests : False
[ceph_deploy.cli][INFO  ]  repo_url  : None
[ceph_deploy.cli][INFO  ]  ceph_conf : None
[ceph_deploy.cli][INFO  ]  install_osd   : False
[ceph_deploy.cli][INFO  ]  version_kind  : stable
[ceph_deploy.cli][INFO  ]  install_common: False
[ceph_deploy.cli][INFO  ]  overwrite_conf: False
[ceph_deploy.cli][INFO  ]  quiet : False
[ceph_deploy.cli][INFO  ]  dev   : master
[ceph_deploy.cli][INFO  ]  nogpgcheck: False
[ceph_deploy.cli][INFO  ]  local_mirror  : None
[ceph_deploy.cli][INFO  ]  release   : luminous
[ceph_deploy.cli][INFO  ]  install_mon   : False
[ceph_deploy.cli][INFO  ]  gpg_url   : None
[ceph_deploy.install][DEBUG ] Installing stable version luminous on cluster 
ceph hosts admin1
[ceph_deploy.install][DEBUG ] Detecting platform for host admin1 ...
[admin1][DEBUG ] connection detected need for sudo
[admin1][DEBUG ] connected to host: admin1
[admin1][DEBUG ] detect platform information from remote host
[admin1][DEBUG ] detect machine type
[ceph_deploy.install][INFO  ] Distro info: CentOS Linux 7.4.1708 Core
[admin1][INFO  ] installing Ceph on admin1
[admin1][INFO  ] Running command: sudo yum clean all
[admin1][DEBUG ] Loaded plugins: fastestmirror, langpacks, priorities
[admin1][DEBUG ] Cleaning repos: Ceph Ceph-noarch base ceph-noarch ceph-source 
epel extras
[admin1][DEBUG ]   : updates
[admin1][DEBUG ] Cleaning up everything
[admin1][DEBUG ] Maybe you want: rm -rf /var/cache/yum, to also free up space 
taken by orphaned data from disabled or removed repos
[admin1][DEBUG ] Cleaning up list of fastest mirrors
[admin1][INFO  ] Running command: sudo yum -y install epel-release
[admin1][DEBUG ] Loaded plugins: fastestmirror, langpacks, priorities
[admin1][DEBUG ] Determining fastest mirrors
[admin1][DEBUG ]  * base: mirror.netw.io
[admin1][DEBUG ]  * epel: anorien.csc.warwick.ac.uk
[admin1][DEBUG ]  * extras: anorien.csc.warwick.ac.uk
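
If ceph-deploy keeps rewriting the repo back to jewel, one workaround I'd try
is to pass the repository explicitly so it doesn't fall back to its default
(URLs assumed to be the standard download.ceph.com ones):

ceph-deploy install --release luminous \
    --repo-url https://download.ceph.com/rpm-luminous/el7 \
    --gpg-url https://download.ceph.com/keys/release.asc \
    admin1
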

Re: [ceph-users] ceps-deploy won't install luminous

2017-11-15 Thread Hans van den Bogert
Never mind, you already said you are on the latest ceph-deploy, so that can’t 
be it.
I’m not familiar with deploying on Centos, but I can imagine that the last part 
of the checklist is important:

http://docs.ceph.com/docs/luminous/start/quick-start-preflight/#priorities-preferences

Can you verify that you did that part?
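
For reference, a quick way to check that part on each node (package name and
path as given in the preflight doc):

sudo yum install -y yum-plugin-priorities
cat /etc/yum/pluginconf.d/priorities.conf
# expected output:
# [main]
# enabled = 1
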

> On Nov 15, 2017, at 10:41 AM, Hans van den Bogert  
> wrote:
> 
> Hi,
> 
> Can you show the contents of the file, /etc/yum.repos.d/ceph.repo ?
> 
> Regards,
> 
> Hans
>> On Nov 15, 2017, at 10:27 AM, Ragan, Tj (Dr.)  
>> wrote:
>> 
>> Hi All,
>> 
>> I feel like I’m doing something silly.  I’m spinning up a new cluster, and 
>> followed the instructions on the pre-flight and quick start here:
>> 
>> http://docs.ceph.com/docs/luminous/start/quick-start-preflight/
>> http://docs.ceph.com/docs/luminous/start/quick-ceph-deploy/
>> 
>> but ceph-deploy always installs Jewel.   
>> 
>> ceph-deploy is version 1.5.39 and I’m running CentOS 7 (7-4.1708)
>> 
>> Any help would be appreciated.
>> 
>> -TJ Ragan
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceps-deploy won't install luminous

2017-11-15 Thread Hans van den Bogert
Hi,

Can you show the contents of the file, /etc/yum.repos.d/ceph.repo ?

Regards,

Hans
> On Nov 15, 2017, at 10:27 AM, Ragan, Tj (Dr.)  
> wrote:
> 
> Hi All,
> 
> I feel like I’m doing something silly.  I’m spinning up a new cluster, and 
> followed the instructions on the pre-flight and quick start here:
> 
> http://docs.ceph.com/docs/luminous/start/quick-start-preflight/
> http://docs.ceph.com/docs/luminous/start/quick-ceph-deploy/
> 
> but ceph-deploy always installs Jewel.   
> 
> ceph-deploy is version 1.5.39 and I’m running CentOS 7 (7-4.1708)
> 
> Any help would be appreciated.
> 
> -TJ Ragan
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] ceps-deploy won't install luminous

2017-11-15 Thread Ragan, Tj (Dr.)
Hi All,

I feel like I’m doing something silly.  I’m spinning up a new cluster, and 
followed the instructions on the pre-flight and quick start here:

http://docs.ceph.com/docs/luminous/start/quick-start-preflight/
http://docs.ceph.com/docs/luminous/start/quick-ceph-deploy/

but ceph-deploy always installs Jewel.

ceph-deploy is version 1.5.39 and I’m running CentOS 7 (7-4.1708)

Any help would be appreciated.

-TJ Ragan
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Bluestore performance 50% of filestore

2017-11-15 Thread Maged Mokhtar
On 2017-11-14 21:54, Milanov, Radoslav Nikiforov wrote:

> Hi 
> 
> We have 3 node, 27 OSDs cluster running Luminous 12.2.1 
> 
> In filestore configuration there are 3 SSDs used for journals of 9 OSDs on 
> each hosts (1 SSD has 3 journal paritions for 3 OSDs). 
> 
> I've converted filestore to bluestore by wiping 1 host a time and waiting for 
> recovery. SSDs now contain block-db - again one SSD serving 3 OSDs. 
> 
> Cluster is used as storage for Openstack. 
> 
> Running fio on a VM in that Openstack reveals bluestore performance almost 
> twice slower than filestore. 
> 
> fio --name fio_test_file --direct=1 --rw=randwrite --bs=4k --size=1G 
> --numjobs=2 --time_based --runtime=180 --group_reporting 
> 
> fio --name fio_test_file --direct=1 --rw=randread --bs=4k --size=1G 
> --numjobs=2 --time_based --runtime=180 --group_reporting 
> 
> Filestore 
> 
> write: io=3511.9MB, bw=19978KB/s, iops=4994, runt=180001msec 
> 
> write: io=3525.6MB, bw=20057KB/s, iops=5014, runt=180001msec 
> 
> write: io=3554.1MB, bw=20222KB/s, iops=5055, runt=180016msec 
> 
> read : io=1995.7MB, bw=11353KB/s, iops=2838, runt=180001msec 
> 
> read : io=1824.5MB, bw=10379KB/s, iops=2594, runt=180001msec 
> 
> read : io=1966.5MB, bw=11187KB/s, iops=2796, runt=180001msec 
> 
> Bluestore 
> 
> write: io=1621.2MB, bw=9222.3KB/s, iops=2305, runt=180002msec 
> 
> write: io=1576.3MB, bw=8965.6KB/s, iops=2241, runt=180029msec 
> 
> write: io=1531.9MB, bw=8714.3KB/s, iops=2178, runt=180001msec 
> 
> read : io=1279.4MB, bw=7276.5KB/s, iops=1819, runt=180006msec 
> 
> read : io=773824KB, bw=4298.9KB/s, iops=1074, runt=180010msec 
> 
> read : io=1018.5MB, bw=5793.7KB/s, iops=1448, runt=180001msec 
> 
> - Rado 
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

It would be useful to see how this filestore edge holds up when you
increase your queue depth (threads/jobs), for example to 32 or 64. That
would represent a more practical load.
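
Something along these lines, for example (a sketch of the same write test,
just async I/O with a deeper queue):

fio --name fio_test_file --direct=1 --rw=randwrite --bs=4k --size=1G \
    --ioengine=libaio --iodepth=32 --numjobs=2 --time_based --runtime=180 \
    --group_reporting
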

I can see one extreme case where filestore may be faster: a cluster with a
large number of OSDs and only 1 client thread. In that case, when the client
I/O hits an OSD, that OSD is not as busy syncing its journal to the HDD as it
would be under normal load -- but again, this is not a practical setup.

/Maged

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com