Re: [ceph-users] Librbd performance VS KRBD performance

2018-11-16 Thread
Thank you very much, Jason. Our cluster's target workload is something like a
monitoring-system data center: we need to save a lot of video streams into the
cluster. I have to reconsider the test case. Besides, there are a lot of tests
to do on the config parameters you mentioned. This helps me a lot, thanks.

> On Nov 16, 2018, at 12:30 PM, Jason Dillaman <jdill...@redhat.com> wrote:
> 
> On Thu, Nov 15, 2018 at 2:30 PM 赵赵贺东 <zhaohed...@gmail.com> wrote:
>> I test in 12 osds cluster, change objecter_inflight_op_bytes from 100MB to
>> 300MB, performance seems not change obviously.
>> But at the beginning, librbd works in better performance in 12 osds cluster,
>> so it seems meaningless for me.
>> In a small cluster (12 osds), 4m seq write performance for Librbd VS KRBD is
>> about 0.89 : 1 (177MB/s : 198MB/s).
>> In a big cluster (72 osds), 4m seq write performance for Librbd VS KRBD is
>> about 0.38 : 1 (420MB/s : 1080MB/s).
>> Our problem is librbd bad performance in big cluster (72 osds).
>> But I can not test in 72 osds right now, some other tests are running.
>> I will test in 72 osds when our cluster is ready.
>> It is a little hard to understand that objecter_inflight_op_bytes=100MB works
>> well in 12 osds cluster, but works poor in 72 osd clusters.
>> Does objecter_inflight_op_bytes not have an effect on krbd, only affect
>> librbd?
> 
> Correct -- the "ceph.conf" config settings are for user-space tooling
> only. Given the fact that you are writing full 4MiB objects in your
> test, any user-space performance degradation is probably going to be
> in the librados layer and below. That 100 MiB limit setting will block
> the IO path while it waits for in-flight IO to complete. You also
> might be just hitting the default throughput of the lower-level
> messenger code, so perhaps you need to throw more threads at it
> (ms_async_op_threads / ms_async_max_op_threads) or change its
> throttles (ms_dispatch_throttle_bytes). Also, depending on your
> cluster and krbd versions, perhaps the OSDs are telling your clients
> to back-off but only librados is responding to it. You should also
> take into account the validity of your test case -- does it really
> match your expected workload that you are trying to optimize against?
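As a concrete starting point for the re-test Jason suggests, a librbd client
could carry overrides like the following in its ceph.conf. This is only a
sketch: the values are illustrative rather than tuned recommendations, and they
affect user-space clients (fio's rbd engine, tcmu-runner) only, never krbd.

[client]
# lift the in-flight write throttle well above 256 x 4MiB of queued data
# (defaults as of luminous: 104857600 bytes and 1024 ops)
objecter_inflight_op_bytes = 1073741824
objecter_inflight_ops = 2048
# async messenger tuning mentioned above (defaults: 3 / 5 / 104857600)
ms_async_op_threads = 5
ms_async_max_op_threads = 10
ms_dispatch_throttle_bytes = 1073741824

After changing these, the librbd fio run can simply be repeated; krbd results
should be unaffected, which is itself a useful sanity check.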

Re: [ceph-users] Librbd performance VS KRBD performance

2018-11-15 Thread
I tested in the 12-OSD cluster, changing objecter_inflight_op_bytes from 100MB
to 300MB; performance did not change noticeably.
But librbd already performed relatively well in the 12-OSD cluster to begin
with, so that result does not tell me much.
>>>> In a small cluster(12 osds), 4m seq write performance for Librbd VS KRBD 
>>>> is about 0.89 : 1 (177MB/s : 198MB/s ).
>>>> In a big cluster (72 osds), 4m seq write performance for Librbd VS KRBD is 
>>>> about  0.38: 1 (420MB/s : 1080MB/s).


Our problem is librbd's poor performance in the big cluster (72 OSDs).
But I can not test on 72 OSDs right now, because some other tests are running.
I will test on 72 OSDs when our cluster is ready.

It is a little hard to understand why objecter_inflight_op_bytes=100MB works
well in the 12-OSD cluster but poorly in the 72-OSD cluster.
Does objecter_inflight_op_bytes have no effect on krbd, and only affect
librbd?

Thanks.
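One way to check whether the librbd client really is stalling on that throttle
is to give the fio client an admin socket and watch the objecter throttle
counters during the run. A hedged sketch (the socket path is illustrative and
depends on the [client] admin_socket setting; a growing "wait" value under
throttle-objecter_bytes means the client is blocking on the 100MB limit):

# in the fio client's ceph.conf:
# [client]
# admin socket = /var/run/ceph/$cluster-$name.$pid.asok

# while fio is running (12345 stands for the fio process's pid):
ceph --admin-daemon /var/run/ceph/ceph-client.admin.12345.asok perf dump throttle-objecter_bytes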



> On Nov 15, 2018, at 3:50 PM, 赵赵贺东 wrote:
> 
> Thanks you for your suggestion.
> It really give me a lot of inspirations.
> 
> 
> I will test as your suggestion, and browse through src/common/config_opts.h 
> to see if I can find some configs performance related.
> 
> But, our osd nodes hardware itself is very poor, that is the truth…we have to 
> face it.
> Two osds in an arm board, two gb memory and 2*10T hdd disk on board, so one 
> osd has 1gb memory to support 10TB hdd disk, we must try to make cluster 
> works better as we can.
> 
> 
> Thanks.
> 
>> On Nov 15, 2018, at 2:08 PM, Jason Dillaman wrote:
>> 
>> Attempting to send 256 concurrent 4MiB writes via librbd will pretty
>> quickly hit the default "objecter_inflight_op_bytes = 100 MiB" limit,
>> which will drastically slow (stall) librados. I would recommend
>> re-testing librbd w/ a much higher throttle override.
>> On Thu, Nov 15, 2018 at 11:34 AM 赵赵贺东  wrote:
>>> 
>>> Thank you for your attention.
>>> 
>>> Our test are in run in physical machine environments.
>>> 
>>> Fio for KRBD:
>>> [seq-write]
>>> description="seq-write"
>>> direct=1
>>> ioengine=libaio
>>> filename=/dev/rbd0
>>> numjobs=1
>>> iodepth=256
>>> group_reporting
>>> rw=write
>>> bs=4M
>>> size=10T
>>> runtime=180
>>> 
>>> */dev/rbd0 mapped by rbd_pool/image2, so KRBD & librbd fio test use the 
>>> same image.
>>> 
>>> Fio for librbd:
>>> [global]
>>> direct=1
>>> numjobs=1
>>> ioengine=rbd
>>> clientname=admin
>>> pool=rbd_pool
>>> rbdname=image2
>>> invalidate=0    # mandatory
>>> rw=write
>>> bs=4M
>>> size=10T
>>> runtime=180
>>> 
>>> [rbd_iodepth32]
>>> iodepth=256
>>> 
>>> 
>>> Image info:
>>> rbd image 'image2':
>>> size 50TiB in 13107200 objects
>>> order 22 (4MiB objects)
>>> data_pool: ec_rbd_pool
>>> block_name_prefix: rbd_data.8.148bb6b8b4567
>>> format: 2
>>> features: layering, data-pool
>>> flags:
>>> create_timestamp: Wed Nov 14 09:21:18 2018
>>> 
>>> * data_pool is a EC pool
>>> 
>>> Pool info:
>>> pool 8 'rbd_pool' replicated size 2 min_size 1 crush_rule 0 object_hash 
>>> rjenkins pg_num 256 pgp_num 256 last_change 82627 flags hashpspool 
>>> stripe_width 0 application rbd
>>> pool 9 'ec_rbd_pool' erasure size 6 min_size 5 crush_rule 4 object_hash 
>>> rjenkins pg_num 256 pgp_num 256 last_change 82649 flags 
>>> hashpspool,ec_overwrites stripe_width 16384 application rbd
>>> 
>>> 
>>> Rbd cache: Off (Because I think in tcmu , rbd cache will mandatory off, and 
>>> our cluster will export disk by iscsi in furture.)
>>> 
>>> 
>>> Thanks!
>>> 
>>> 
>>> On Nov 15, 2018, at 1:22 PM, Gregory Farnum wrote:
>>> 
>>> You'll need to provide more data about how your test is configured and run 
>>> for us to have a good idea. IIRC librbd is often faster than krbd because 
>>> it can support newer features and things, but krbd may have less overhead 
>>> and is not dependent on the VM's driver configuration in QEMU...
>>> 
>>> On Thu, Nov 15, 2018 at 8:22 AM 赵赵贺东  wrote:
>>>> 
>>>> Hi cephers,
>>>> 
>>>> 
>>>> All our cluster osds are deployed in armhf.
>>>> Could someone say something about what is the rational performance rates 
>>>> for librbd VS KRBD ?
>>>> Or ration

Re: [ceph-users] Librbd performance VS KRBD performance

2018-11-14 Thread
Thank you for your suggestion.
It gives me a lot of inspiration.


I will test as you suggest, and browse through src/common/config_opts.h to see
if I can find some performance-related config options.

But our OSD node hardware itself is very limited; that is the truth, and we
have to face it.
Each ARM board runs two OSDs with 2GB of memory and 2*10TB HDDs, so each OSD
has about 1GB of memory to support a 10TB HDD. We must try to make the cluster
work as well as we can.


Thanks.

> On Nov 15, 2018, at 2:08 PM, Jason Dillaman wrote:
> 
> Attempting to send 256 concurrent 4MiB writes via librbd will pretty
> quickly hit the default "objecter_inflight_op_bytes = 100 MiB" limit,
> which will drastically slow (stall) librados. I would recommend
> re-testing librbd w/ a much higher throttle override.
> On Thu, Nov 15, 2018 at 11:34 AM 赵赵贺东  wrote:
>> 
>> Thank you for your attention.
>> 
>> Our test are in run in physical machine environments.
>> 
>> Fio for KRBD:
>> [seq-write]
>> description="seq-write"
>> direct=1
>> ioengine=libaio
>> filename=/dev/rbd0
>> numjobs=1
>> iodepth=256
>> group_reporting
>> rw=write
>> bs=4M
>> size=10T
>> runtime=180
>> 
>> */dev/rbd0 mapped by rbd_pool/image2, so KRBD & librbd fio test use the same 
>> image.
>> 
>> Fio for librbd:
>> [global]
>> direct=1
>> numjobs=1
>> ioengine=rbd
>> clientname=admin
>> pool=rbd_pool
>> rbdname=image2
>> invalidate=0    # mandatory
>> rw=write
>> bs=4M
>> size=10T
>> runtime=180
>> 
>> [rbd_iodepth32]
>> iodepth=256
>> 
>> 
>> Image info:
>> rbd image 'image2':
>> size 50TiB in 13107200 objects
>> order 22 (4MiB objects)
>> data_pool: ec_rbd_pool
>> block_name_prefix: rbd_data.8.148bb6b8b4567
>> format: 2
>> features: layering, data-pool
>> flags:
>> create_timestamp: Wed Nov 14 09:21:18 2018
>> 
>> * data_pool is a EC pool
>> 
>> Pool info:
>> pool 8 'rbd_pool' replicated size 2 min_size 1 crush_rule 0 object_hash 
>> rjenkins pg_num 256 pgp_num 256 last_change 82627 flags hashpspool 
>> stripe_width 0 application rbd
>> pool 9 'ec_rbd_pool' erasure size 6 min_size 5 crush_rule 4 object_hash 
>> rjenkins pg_num 256 pgp_num 256 last_change 82649 flags 
>> hashpspool,ec_overwrites stripe_width 16384 application rbd
>> 
>> 
>> Rbd cache: Off (Because I think in tcmu , rbd cache will mandatory off, and 
>> our cluster will export disk by iscsi in furture.)
>> 
>> 
>> Thanks!
>> 
>> 
>> On Nov 15, 2018, at 1:22 PM, Gregory Farnum wrote:
>> 
>> You'll need to provide more data about how your test is configured and run 
>> for us to have a good idea. IIRC librbd is often faster than krbd because it 
>> can support newer features and things, but krbd may have less overhead and 
>> is not dependent on the VM's driver configuration in QEMU...
>> 
>> On Thu, Nov 15, 2018 at 8:22 AM 赵赵贺东  wrote:
>>> 
>>> Hi cephers,
>>> 
>>> 
>>> All our cluster osds are deployed in armhf.
>>> Could someone say something about what is the rational performance rates 
>>> for librbd VS KRBD ?
>>> Or rational performance loss range when we use librbd compare to KRBD.
>>> I googled a lot, but I could not find a solid criterion.
>>> In fact , it confused me for a long time.
>>> 
>>> About our tests:
>>> In a small cluster(12 osds), 4m seq write performance for Librbd VS KRBD is 
>>> about 0.89 : 1 (177MB/s : 198MB/s ).
>>> In a big cluster (72 osds), 4m seq write performance for Librbd VS KRBD is 
>>> about  0.38: 1 (420MB/s : 1080MB/s).
>>> 
>>> We expect even increase  osd numbers, Librbd performance can keep being 
>>> close to KRBD.
>>> 
>>> PS: Librbd performance are tested both in  fio rbd engine & iscsi 
>>> (tcmu+librbd).
>>> 
>>> Thanks.
>>> 
>>> 
>>> 
>>> 
>>> ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> 
>> 
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
> 
> 
> -- 
> Jason

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Librbd performance VS KRBD performance

2018-11-14 Thread
Thank you for your attention.

Our tests are run on physical machines.

Fio for KRBD:
[seq-write]
description="seq-write"
direct=1
ioengine=libaio
filename=/dev/rbd0
numjobs=1
iodepth=256
group_reporting
rw=write
bs=4M
size=10T
runtime=180

*/dev/rbd0 mapped by rbd_pool/image2, so KRBD & librbd fio test use the same 
image.
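For reference, the krbd device used in the job above would have been set up
with something like the following (a sketch; kernel support for this image's
data-pool/EC layout depends on the kernel version):

rbd map rbd_pool/image2    # exposes the image as /dev/rbd0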

Fio for librbd:
[global]
direct=1
numjobs=1
ioengine=rbd
clientname=admin
pool=rbd_pool
rbdname=image2
invalidate=0    # mandatory
rw=write
bs=4M
size=10T
runtime=180

[rbd_iodepth32]
iodepth=256


Image info:
rbd image 'image2':
size 50TiB in 13107200 objects
order 22 (4MiB objects)
data_pool: ec_rbd_pool
block_name_prefix: rbd_data.8.148bb6b8b4567
format: 2
features: layering, data-pool
flags: 
create_timestamp: Wed Nov 14 09:21:18 2018

* data_pool is a EC pool

Pool info:
pool 8 'rbd_pool' replicated size 2 min_size 1 crush_rule 0 object_hash 
rjenkins pg_num 256 pgp_num 256 last_change 82627 flags hashpspool stripe_width 
0 application rbd
pool 9 'ec_rbd_pool' erasure size 6 min_size 5 crush_rule 4 object_hash 
rjenkins pg_num 256 pgp_num 256 last_change 82649 flags 
hashpspool,ec_overwrites stripe_width 16384 application rbd


RBD cache: off (because I think that with tcmu the rbd cache is forced off
anyway, and our cluster will export disks over iSCSI in the future).
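To be sure the librbd fio run really has the cache disabled (so it stays
comparable to krbd and to the eventual tcmu setup), it can be stated explicitly
in the client's ceph.conf; a minimal sketch:

[client]
rbd cache = false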


Thanks!


> On Nov 15, 2018, at 1:22 PM, Gregory Farnum wrote:
> 
> You'll need to provide more data about how your test is configured and run 
> for us to have a good idea. IIRC librbd is often faster than krbd because it 
> can support newer features and things, but krbd may have less overhead and is 
> not dependent on the VM's driver configuration in QEMU...
> 
> On Thu, Nov 15, 2018 at 8:22 AM 赵赵贺东  <mailto:zhaohed...@gmail.com>> wrote:
> Hi cephers,
> 
> 
> All our cluster osds are deployed in armhf.
> Could someone say something about what is the rational performance rates for 
> librbd VS KRBD ?
> Or rational performance loss range when we use librbd compare to KRBD.
> I googled a lot, but I could not find a solid criterion.  
> In fact , it confused me for a long time.
> 
> About our tests:
> In a small cluster(12 osds), 4m seq write performance for Librbd VS KRBD is 
> about 0.89 : 1 (177MB/s : 198MB/s ). 
> In a big cluster (72 osds), 4m seq write performance for Librbd VS KRBD is 
> about  0.38: 1 (420MB/s : 1080MB/s).
> 
> We expect even increase  osd numbers, Librbd performance can keep being close 
> to KRBD.
> 
> PS: Librbd performance are tested both in  fio rbd engine & iscsi 
> (tcmu+librbd).
> 
> Thanks.
> 
> 
> 
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com <mailto:ceph-users@lists.ceph.com>
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com 
> <http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com>

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Librbd performance VS KRBD performance

2018-11-14 Thread
Hi cephers,


All our cluster OSDs are deployed on armhf.
Could someone say something about what a reasonable performance ratio for
librbd vs. KRBD is, or a reasonable range of performance loss when using
librbd compared to KRBD?
I googled a lot, but I could not find a solid criterion.
In fact, it has confused me for a long time.

About our tests:
In a small cluster(12 osds), 4m seq write performance for Librbd VS KRBD is 
about 0.89 : 1 (177MB/s : 198MB/s ). 
In a big cluster (72 osds), 4m seq write performance for Librbd VS KRBD is 
about  0.38: 1 (420MB/s : 1080MB/s).

We expect that even as the OSD count increases, librbd performance can stay
close to KRBD.

PS: librbd performance was tested both with the fio rbd engine and over iSCSI
(tcmu+librbd).

Thanks.




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] understanding PG count for a file

2018-08-02 Thread
What is the size of your file? What about a big file?
If the file is big enough, it cannot be stored on only two OSDs.
If the file is very small, then since the object size is 4MB it can be stored
as a single object, which lives on one primary OSD and one replica OSD.
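To make the file -> objects -> PGs -> OSDs mapping concrete, here is a rough
sketch for a CephFS file (the pool name cephfs_data and the mount path are only
examples). Each 4MB object of the file maps to its own PG and therefore, in
general, to a different pair of OSDs:

# CephFS names a file's RADOS objects "<inode-in-hex>.<object-index-in-hex>"
INO=$(printf '%x' "$(stat -c %i /mnt/cephfs/bigfile)")

# list this file's objects in the data pool and ask where each one lives
rados -p cephfs_data ls | grep "^${INO}\." | while read -r OBJ; do
    ceph osd map cephfs_data "$OBJ"    # prints the PG and its acting OSDs
done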


> On Aug 2, 2018, at 6:56 PM, Surya Bala wrote:
> 
> I understood your explaination.
> The result of 'ceph osd map   ' command always gives only 
> 2 OSDs(1 primary, 1 secondary). But it is not mandatory the objects are 
> stored only in 2 OSDs it should be spreaded many OSDs.
> 
> So my doubt is why the command gives this result 
> 
> Regards
> Surya Balan
> 
> 
> On Thu, Aug 2, 2018 at 1:30 PM, 赵贺东  > wrote:
> Hello,
> 
> file -> many objects-> many PG(each pg has two copies, because your 
> replication count is two)-> many OSD 
> pgs can be distributed in OSDs, no limitation for only 2, replication count 
> 2only determine pg copies is 2.
> 
> Hope this will help. 
> 
> > On Aug 2, 2018, at 3:43 PM, Surya Bala wrote:
> > 
> > Hi folks,
> > 
> > From the ceph documents i understood about PG and why should PG number 
> > should be optimal. But i dont find any info about the below point
> > 
> > I am using cephfs client in my ceph cluster. When we store a file(consider 
> > replication count is 2) , it will be splitted into objects and each object 
> > will be stored in different PG and each PG will be mapped to a OSD. It 
> > means there can be many OSD for a single file . But why are we getting only 
> > 2 OSDs by the command 'ceph OSD map'
> > 
> > file -> many objects-> many PG-> many OSD 
> > 
> > Is that all objects of a file will be stored in only 2 OSD(in case of 
> > replication count is 2)?
> > 
> > Regards
> > Surya Balan
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com 
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com 
> > 
> 
> 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] ceph-osd start failed because of PG::peek_map_epoch() assertion

2018-06-26 Thread
Hi Cephers,


One of our cluster's OSDs cannot start because a PG on that OSD cannot load its
infover_key from RocksDB; the log is below.
Could someone say something about this? Thank you!
 
Log:

2018-06-26 15:09:16.036832 b66c6000  0 osd.41 3712 load_pgs
2056114 2018-06-26 15:09:16.036921 b66c6000 10 osd.41 3712 load_pgs ignoring 
unrecognized meta
2056115 2018-06-26 15:09:16.037002 b66c6000 15 
bluestore(/var/lib/ceph/osd/ceph-41) omap_get_values 4.b_head oid 
#4:d000head#
2056116 2018-06-26 15:09:16.037023 b66c6000 30 bluestore.OnodeSpace(0xa0a4aec 
in 0x5eccbd0) lookup
2056117 2018-06-26 15:09:16.037030 b66c6000 30 bluestore.OnodeSpace(0xa0a4aec 
in 0x5eccbd0) lookup #4:d000head# miss.  // not 
found in cache
2056118 2018-06-26 15:09:16.037045 b66c6000 20 
bluestore(/var/lib/ceph/osd/ceph-41).collection(4.b_head 0xa0a4a00) get_onode 
oid #4:d000head# key 0x7f
8004d00021213dfffe’o’.   // 
found in db
2056119 2018-06-26 15:09:16.038876 aa44c8e0 10 trim shard target 5734 k 
meta/data ratios 0.16875 + 0.05 (967 k + 286 k),  current 59662  (30990  + 
28672 )
2056120 2018-06-26 15:09:16.038933 aa44c8e0 10 trim shard target 5734 k 
meta/data ratios 0.16875 + 0.05 (967 k + 286 k),  current 0  (0  + 0 )
2056121 2018-06-26 15:09:16.038948 aa44c8e0 10 trim shard target 5734 k 
meta/data ratios 0.16875 + 0.05 (967 k + 286 k),  current 0  (0  + 0 )
2056122 2018-06-26 15:09:16.038959 aa44c8e0 10 trim shard target 5734 k 
meta/data ratios 0.16875 + 0.05 (967 k + 286 k),  current 0  (0  + 0 )
2056123 2018-06-26 15:09:16.038969 aa44c8e0 10 trim shard target 5734 k 
meta/data ratios 0.16875 + 0.05 (967 k + 286 k),  current 0  (0  + 0 )
2056124 2018-06-26 15:09:16.046036 b66c6000 20 
bluestore(/var/lib/ceph/osd/ceph-41).collection(4.b_head 0xa0a4a00)  r 0 v.len 
29
2056125 2018-06-26 15:09:16.046095 b66c6000 30 bluestore.OnodeSpace(0xa0a4aec 
in 0x5eccbd0) add #4:d000head# 0x5eecf00
2056126 2018-06-26 15:09:16.046118 b66c6000 20 bluestore.onode(0x5eecf00).flush 
flush done. 
// flush into cache
2056127 2018-06-26 15:09:16.046176 b66c6000 10 
bluestore(/var/lib/ceph/osd/ceph-41) omap_get_values 4.b_head oid 
#4:d000head# = 0
2056128 2018-06-26 15:09:16.046199 b66c6000 10 osd.41 3712 pgid 4.b coll 
4.b_head
2056129 2018-06-26 15:09:16.046217 b66c6000 15 
bluestore(/var/lib/ceph/osd/ceph-41) omap_get_values 4.b_head oid 
#4:d000head#
2056130 2018-06-26 15:09:16.046225 b66c6000 30 bluestore.OnodeSpace(0xa0a4aec 
in 0x5eccbd0) lookup
2056131 2018-06-26 15:09:16.046231 b66c6000 30 bluestore.OnodeSpace(0xa0a4aec 
in 0x5eccbd0) lookup #4:d000head# hit 0x5eecf00// cache hit
2056132 2018-06-26 15:09:16.046238 b66c6000 20 bluestore.onode(0x5eecf00).flush 
flush done
2056133 2018-06-26 15:09:16.046629 b66c6000 30 
bluestore(/var/lib/ceph/osd/ceph-41) omap_get_values  got 
0x06ea'._epoch' -> _epoch    // Only got '_epoch', but not
'_infover', so the assertion was triggered!
2056134 2018-06-26 15:09:16.046683 b66c6000 10 
bluestore(/var/lib/ceph/osd/ceph-41) omap_get_values 4.b_head oid 
#4:d000head# = 0
2056135 2018-06-26 15:09:16.049543 b66c6000 -1 
/home/ceph01/projects/master/ceph/src/osd/PG.cc : In function 
'static int PG::peek_map_epoch(ObjectStore*, spg_t, epoch_t*, 
ceph::bufferlist*)' thread b66c6000 time 2018-06-26 15:09:16.046701
2056136 /home/ceph01/projects/master/ceph/src/osd/PG.cc : 3136: 
FAILED assert(values.size() == 2)
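
For anyone hitting the same assertion, one way to confirm which omap keys the
pgmeta object actually holds is to inspect it offline with ceph-objectstore-tool
while the OSD is stopped. A hedged sketch (the data path and pgid come from the
log above; the pgmeta object JSON must be copied from the --op list output, not
typed by hand):

# list the objects of the affected PG, including its pgmeta object
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-41 --pgid 4.b --op list

# with the pgmeta object JSON copied from that output, dump its omap keys;
# a healthy PG should show both '_infover' and '_epoch'
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-41 '<pgmeta-object-json>' list-omap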


Source code v12.2.4

int PG::peek_map_epoch(ObjectStore *store,
   spg_t pgid,
   epoch_t *pepoch,
   bufferlist *bl)
{
  …
  set<string> keys;
  keys.insert(infover_key);
  keys.insert(epoch_key);
  map<string, bufferlist> values;
  int r = store->omap_get_values(coll, pgmeta_oid, keys, &values);
  if (r == 0) {
    assert(values.size() == 2);
  …
}
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] XFS Metadata corruption while activating OSD

2018-03-20 Thread
I’m sorry for my late reply.
Thank you for your reply.
Yes, this error only occurs when the backend is XFS.
Ext4 does not trigger the error.



> On Mar 12, 2018, at 6:31 PM, Peter Woodman <pe...@shortbus.org> wrote:
> 
> from what i've heard, xfs has problems on arm. use btrfs, or (i
> believe?) ext4+bluestore will work.
> 
> On Sun, Mar 11, 2018 at 9:49 PM, Christian Wuerdig
> <christian.wuer...@gmail.com> wrote:
>> Hm, so you're running OSD nodes with 2GB of RAM and 2x10TB = 20TB of
>> storage? Literally everything posted on this list in relation to HW
>> requirements and related problems will tell you that this simply isn't going
>> to work. The slightest hint of a problem will simply kill the OSD nodes with
>> OOM. Have you tried with smaller disks - like 1TB models (or even smaller
>> like 256GB SSDs) and see if the same problem persists?
>> 
>> 
>> On Tue, 6 Mar 2018 at 10:51, 赵赵贺东 <zhaohed...@gmail.com> wrote:
>>> 
>>> Hello ceph-users,
>>> 
>>> It is a really really Really tough problem for our team.
>>> We investigated in the problem for a long time, try a lot of efforts, but
>>> can’t solve the problem, even the concentrate cause of the problem is still
>>> unclear for us!
>>> So, Anyone give any solution/suggestion/opinion whatever  will be highly
>>> highly appreciated!!!
>>> 
>>> Problem Summary:
>>> When we activate osd, there will be  metadata corrupttion in the
>>> activating disk, probability is 100% !
>>> 
>>> Admin Nodes node:
>>> Platform: X86
>>> OS: Ubuntu 16.04
>>> Kernel: 4.12.0
>>> Ceph: Luminous 12.2.2
>>> 
>>> OSD nodes:
>>> Platform: armv7
>>> OS:   Ubuntu 14.04
>>> Kernel:   4.4.39
>>> Ceph: Lominous 12.2.2
>>> Disk: 10T+10T
>>> Memory: 2GB
>>> 
>>> Deploy log:
>>> 
>>> 
>>> dmesg log:(Sorry arms001-01 dmesg log has log has been lost, but error
>>> message about metadata corruption on arms003-10 are the same with
>>> arms001-01)
>>> Mar  5 11:08:49 arms003-10 kernel: [  252.534232] XFS (sda1): Unmount and
>>> run xfs_repair
>>> Mar  5 11:08:49 arms003-10 kernel: [  252.539100] XFS (sda1): First 64
>>> bytes of corrupted metadata buffer:
>>> Mar  5 11:08:49 arms003-10 kernel: [  252.545504] eb82f000: 58 46 53 42 00
>>> 00 10 00 00 00 00 00 91 73 fe fb  XFSB.s..
>>> Mar  5 11:08:49 arms003-10 kernel: [  252.553569] eb82f010: 00 00 00 00 00
>>> 00 00 00 00 00 00 00 00 00 00 00  
>>> Mar  5 11:08:49 arms003-10 kernel: [  252.561624] eb82f020: fc 4e e3 89 50
>>> 8f 42 aa be bc 07 0c 6e fa 83 2f  .N..P.B.n../
>>> Mar  5 11:08:49 arms003-10 kernel: [  252.569706] eb82f030: 00 00 00 00 80
>>> 00 00 07 ff ff ff ff ff ff ff ff  
>>> Mar  5 11:08:49 arms003-10 kernel: [  252.58] XFS (sda1): metadata I/O
>>> error: block 0x48b9ff80 ("xfs_trans_read_buf_map") error 117 numblks 8
>>> Mar  5 11:08:49 arms003-10 kernel: [  252.602944] XFS (sda1): Metadata
>>> corruption detected at xfs_dir3_data_read_verify+0x58/0xd0, xfs_dir3_data
>>> block 0x48b9ff80
>>> Mar  5 11:08:49 arms003-10 kernel: [  252.614170] XFS (sda1): Unmount and
>>> run xfs_repair
>>> Mar  5 11:08:49 arms003-10 kernel: [  252.619030] XFS (sda1): First 64
>>> bytes of corrupted metadata buffer:
>>> Mar  5 11:08:49 arms003-10 kernel: [  252.625403] eb901000: 58 46 53 42 00
>>> 00 10 00 00 00 00 00 91 73 fe fb  XFSB.s..
>>> Mar  5 11:08:49 arms003-10 kernel: [  252.633441] eb901010: 00 00 00 00 00
>>> 00 00 00 00 00 00 00 00 00 00 00  
>>> Mar  5 11:08:49 arms003-10 kernel: [  252.641474] eb901020: fc 4e e3 89 50
>>> 8f 42 aa be bc 07 0c 6e fa 83 2f  .N..P.B.n../
>>> Mar  5 11:08:49 arms003-10 kernel: [  252.649519] eb901030: 00 00 00 00 80
>>> 00 00 07 ff ff ff ff ff ff ff ff  
>>> Mar  5 11:08:49 arms003-10 kernel: [  252.657554] XFS (sda1): metadata I/O
>>> error: block 0x48b9ff80 ("xfs_trans_read_buf_map") error 117 numblks 8
>>> Mar  5 11:08:49 arms003-10 kernel: [  252.675056] XFS (sda1): Metadata
>>> corruption detected at xfs_dir3_data_read_verify+0x58/0xd0, xfs_dir3_data
>>> block 0x48b9ff80
>>> Mar  5 11:08:49 arms003-10 kernel: [  252.686228] XFS (sda1): Unmount and
>>> run xfs_repair
>>> Mar  5 11:08:49 arms003-10 kernel: [  252.691054] XFS (sda1): First 64
>>> bytes of corrupted metadata 

Re: [ceph-users] XFS Metadata corruption while activating OSD

2018-03-20 Thread


> On Mar 12, 2018, at 9:49 AM, Christian Wuerdig <christian.wuer...@gmail.com> wrote:
> 
> Hm, so you're running OSD nodes with 2GB of RAM and 2x10TB = 20TB of storage? 
> Literally everything posted on this list in relation to HW requirements and 
> related problems will tell you that this simply isn't going to work. The 
> slightest hint of a problem will simply kill the OSD nodes with OOM. Have you 
> tried with smaller disks - like 1TB models (or even smaller like 256GB SSDs) 
> and see if the same problem persists?

Thank you for your reply, and I am sorry for my late reply.
You are right: when the backend is BlueStore, there is OOM from time to time.
We will now upgrade our hardware to see whether we can avoid the OOM.
Besides, after we upgraded the kernel from 4.4.39 to 4.4.120, the XFS error
while activating the OSD seems to be fixed.

> 
> 
> On Tue, 6 Mar 2018 at 10:51, 赵赵贺东 <zhaohed...@gmail.com 
> <mailto:zhaohed...@gmail.com>> wrote:
> Hello ceph-users,
> 
> It is a really really Really tough problem for our team.
> We investigated in the problem for a long time, try a lot of efforts, but 
> can’t solve the problem, even the concentrate cause of the problem is still 
> unclear for us!
> So, Anyone give any solution/suggestion/opinion whatever  will be highly 
> highly appreciated!!!
> 
> Problem Summary:
> When we activate osd, there will be  metadata corrupttion in the activating 
> disk, probability is 100% !
> 
> Admin Nodes node:
> Platform: X86
> OS:   Ubuntu 16.04
> Kernel:   4.12.0
> Ceph: Luminous 12.2.2
> 
> OSD nodes:
> Platform: armv7
> OS:   Ubuntu 14.04
> Kernel:   4.4.39
> Ceph: Lominous 12.2.2
> Disk: 10T+10T
> Memory:   2GB
> 
> Deploy log:
> 
> 
> dmesg log:(Sorry arms001-01 dmesg log has log has been lost, but error 
> message about metadata corruption on arms003-10 are the same with arms001-01)
> Mar  5 11:08:49 arms003-10 kernel: [  252.534232] XFS (sda1): Unmount and run 
> xfs_repair
> Mar  5 11:08:49 arms003-10 kernel: [  252.539100] XFS (sda1): First 64 bytes 
> of corrupted metadata buffer:
> Mar  5 11:08:49 arms003-10 kernel: [  252.545504] eb82f000: 58 46 53 42 00 00 
> 10 00 00 00 00 00 91 73 fe fb  XFSB.s..
> Mar  5 11:08:49 arms003-10 kernel: [  252.553569] eb82f010: 00 00 00 00 00 00 
> 00 00 00 00 00 00 00 00 00 00  
> Mar  5 11:08:49 arms003-10 kernel: [  252.561624] eb82f020: fc 4e e3 89 50 8f 
> 42 aa be bc 07 0c 6e fa 83 2f  .N..P.B.n../
> Mar  5 11:08:49 arms003-10 kernel: [  252.569706] eb82f030: 00 00 00 00 80 00 
> 00 07 ff ff ff ff ff ff ff ff  
> Mar  5 11:08:49 arms003-10 kernel: [  252.58] XFS (sda1): metadata I/O 
> error: block 0x48b9ff80 ("xfs_trans_read_buf_map") error 117 numblks 8
> Mar  5 11:08:49 arms003-10 kernel: [  252.602944] XFS (sda1): Metadata 
> corruption detected at xfs_dir3_data_read_verify+0x58/0xd0, xfs_dir3_data 
> block 0x48b9ff80
> Mar  5 11:08:49 arms003-10 kernel: [  252.614170] XFS (sda1): Unmount and run 
> xfs_repair
> Mar  5 11:08:49 arms003-10 kernel: [  252.619030] XFS (sda1): First 64 bytes 
> of corrupted metadata buffer:
> Mar  5 11:08:49 arms003-10 kernel: [  252.625403] eb901000: 58 46 53 42 00 00 
> 10 00 00 00 00 00 91 73 fe fb  XFSB.s..
> Mar  5 11:08:49 arms003-10 kernel: [  252.633441] eb901010: 00 00 00 00 00 00 
> 00 00 00 00 00 00 00 00 00 00  
> Mar  5 11:08:49 arms003-10 kernel: [  252.641474] eb901020: fc 4e e3 89 50 8f 
> 42 aa be bc 07 0c 6e fa 83 2f  .N..P.B.n../
> Mar  5 11:08:49 arms003-10 kernel: [  252.649519] eb901030: 00 00 00 00 80 00 
> 00 07 ff ff ff ff ff ff ff ff  
> Mar  5 11:08:49 arms003-10 kernel: [  252.657554] XFS (sda1): metadata I/O 
> error: block 0x48b9ff80 ("xfs_trans_read_buf_map") error 117 numblks 8
> Mar  5 11:08:49 arms003-10 kernel: [  252.675056] XFS (sda1): Metadata 
> corruption detected at xfs_dir3_data_read_verify+0x58/0xd0, xfs_dir3_data 
> block 0x48b9ff80
> Mar  5 11:08:49 arms003-10 kernel: [  252.686228] XFS (sda1): Unmount and run 
> xfs_repair
> Mar  5 11:08:49 arms003-10 kernel: [  252.691054] XFS (sda1): First 64 bytes 
> of corrupted metadata buffer:
> Mar  5 11:08:49 arms003-10 kernel: [  252.697425] eb901000: 58 46 53 42 00 00 
> 10 00 00 00 00 00 91 73 fe fb  XFSB.s..
> Mar  5 11:08:49 arms003-10 kernel: [  252.705459] eb901010: 00 00 00 00 00 00 
> 00 00 00 00 00 00 00 00 00 00  
> Mar  5 11:08:49 arms003-10 kernel: [  252.713489] eb901020: fc 4e e3 89 50 8f 
> 42 aa be bc 07 0c 6e fa 83 2f  .N..P.B.n../
> Mar  5 11:08:49 arms003-10 kernel: [  252.721520] eb901030: 00 00 00 00 80 00 
> 00 07 ff ff ff ff ff ff ff ff  
&

Re: [ceph-users] /var/lib/ceph/osd/ceph-xxx/current/meta shows "Structure needs cleaning"

2018-03-08 Thread
Hi Brad,

Thank you for your attention.

> On Mar 8, 2018, at 4:47 PM, Brad Hubbard wrote:
> 
> On Thu, Mar 8, 2018 at 5:01 PM, 赵贺东  wrote:
>> Hi All,
>> 
>> Every time after we activate osd, we got “Structure needs cleaning” in 
>> /var/lib/ceph/osd/ceph-xxx/current/meta.
>> 
>> 
>> /var/lib/ceph/osd/ceph-xxx/current/meta
>> # ls -l
>> ls: reading directory .: Structure needs cleaning
>> total 0
>> 
>> Could Anyone say something about this error?
> 
> It's an indication of possible corruption on the filesystem containing "meta".
> 
> Can you unmount it and run a filesystem check on it?
I did some xfs_repair operations, but with no effect; "Structure needs
cleaning" still exists.
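For completeness, the unmount-and-check sequence suggested above (and already
tried here) looks roughly like this; a sketch, with sda1 and the mount point
taken from this thread:

umount /var/lib/ceph/osd/ceph-xxx    # wherever sda1 is mounted
xfs_repair -n /dev/sda1              # dry run: only report problems
xfs_repair /dev/sda1                 # attempt the actual repair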



> 
> At the time the filesystem first detected the corruption it would have
> logged it to dmesg and possibly syslog which may give you a clue. Did
> you lose power or have a kernel panic or something?
We did not lose power.
You are right: we get a metadata corruption message in dmesg every time,
immediately after the OSD activation operation.

[  399.513525] XFS (sda1): Metadata corruption detected at 
xfs_dir3_data_read_verify+0x58/0xd0, xfs_dir3_data block 0x48b9ff80
[  399.524709] XFS (sda1): Unmount and run xfs_repair
[  399.529511] XFS (sda1): First 64 bytes of corrupted metadata buffer:
[  399.535917] dd8f2000: 58 46 53 42 00 00 10 00 00 00 00 00 91 73 fe fb  
XFSB.s..
[  399.543959] dd8f2010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  

[  399.551983] dd8f2020: e5 30 40 22 51 8f 4f 1c 80 73 56 9b 71 aa 92 24  
.0@"Q.O..sV.q..$
[  399.560037] dd8f2030: 00 00 00 00 80 00 00 07 ff ff ff ff ff ff ff ff  

[  399.568118] XFS (sda1): metadata I/O error: block 0x48b9ff80 
("xfs_trans_read_buf_map") error 117 numblks 8
[  399.583179] XFS (sda1): Metadata corruption detected at 
xfs_dir3_data_read_verify+0x58/0xd0, xfs_dir3_data block 0x48b9ff80
[  399.594378] XFS (sda1): Unmount and run xfs_repair
[  399.599182] XFS (sda1): First 64 bytes of corrupted metadata buffer:
[  399.605575] e47db000: 58 46 53 42 00 00 10 00 00 00 00 00 91 73 fe fb  
XFSB.s..
[  399.613613] e47db010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  

[  399.621637] e47db020: e5 30 40 22 51 8f 4f 1c 80 73 56 9b 71 aa 92 24  
.0@"Q.O..sV.q..$
[  399.629679] e47db030: 00 00 00 00 80 00 00 07 ff ff ff ff ff ff ff ff  

[  399.637856] XFS (sda1): metadata I/O error: block 0x48b9ff80 
("xfs_trans_read_buf_map") error 117 numblks 8
[  399.648165] XFS (sda1): Metadata corruption detected at 
xfs_dir3_data_read_verify+0x58/0xd0, xfs_dir3_data block 0x48b9ff80
[  399.659378] XFS (sda1): Unmount and run xfs_repair
[  399.664196] XFS (sda1): First 64 bytes of corrupted metadata buffer:
[  399.670570] e47db000: 58 46 53 42 00 00 10 00 00 00 00 00 91 73 fe fb  
XFSB.s..
[  399.678610] e47db010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  

[  399.686643] e47db020: e5 30 40 22 51 8f 4f 1c 80 73 56 9b 71 aa 92 24  
.0@"Q.O..sV.q..$
[  399.694681] e47db030: 00 00 00 00 80 00 00 07 ff ff ff ff ff ff ff ff  

[  399.702794] XFS (sda1): metadata I/O error: block 0x48b9ff80 
("xfs_trans_read_buf_map") error 117 numblks 8


Thank you !


> 
>> 
>> Thank you!
>> 
>> 
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
> 
> 
> -- 
> Cheers,
> Brad

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] XFS Metadata corruption while activating OSD

2018-03-05 Thread
Hello ceph-users,

It is a really, really, Really tough problem for our team.
We investigated the problem for a long time and tried a lot of things, but we
can't solve it; even the concrete cause of the problem is still unclear to us!
So, any solution/suggestion/opinion whatever will be highly, highly
appreciated!!!

Problem Summary:
When we activate an OSD, there will be metadata corruption on the activating
disk; the probability is 100%!

Admin node:
Platform: X86
OS: Ubuntu 16.04
Kernel: 4.12.0
Ceph: Luminous 12.2.2

OSD nodes:
Platform: armv7
OS: Ubuntu 14.04
Kernel: 4.4.39
Ceph: Luminous 12.2.2
Disk: 10T+10T
Memory: 2GB

Deploy log:
root@mnc000:/home/mnvadmin/ceph# ceph-deploy disk zap arms001-01:sda
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (1.5.39): /usr/bin/ceph-deploy disk zap 
arms001-01:sda
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] subcommand : zap
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf : 
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] func : 
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.cli][INFO ] disk : [('arms001-01', '/dev/sda', None)]
[ceph_deploy.osd][DEBUG ] zapping /dev/sda on arms001-01
[arms001-01][DEBUG ] connection detected need for sudo
[arms001-01][DEBUG ] connected to host: arms001-01
[arms001-01][DEBUG ] detect platform information from remote host
[arms001-01][DEBUG ] detect machine type
[arms001-01][DEBUG ] find the location of an executable
[arms001-01][INFO ] Running command: sudo /sbin/initctl version
[arms001-01][DEBUG ] find the location of an executable
[ceph_deploy.osd][INFO ] Distro info: Ubuntu 14.04 trusty
[arms001-01][DEBUG ] zeroing last few blocks of device
[arms001-01][DEBUG ] find the location of an executable
[arms001-01][INFO ] Running command: sudo /usr/local/bin/ceph-disk zap /dev/sda
[arms001-01][WARNIN] 
/usr/local/lib/python2.7/dist-packages/ceph_disk-1.0.0-py2.7.egg/ceph_disk/main.py:5653:
 UserWarning:
[arms001-01][WARNIN] 
***
[arms001-01][WARNIN] This tool is now deprecated in favor of ceph-volume.
[arms001-01][WARNIN] It is recommended to use ceph-volume for OSD deployments. 
For details see:
[arms001-01][WARNIN]
[arms001-01][WARNIN] http://docs.ceph.com/docs/master/ceph-volume/#migrating
[arms001-01][WARNIN]
[arms001-01][WARNIN] 
***
[arms001-01][WARNIN]
[arms001-01][DEBUG ] 4 bytes were erased at offset 0x0 (xfs)
[arms001-01][DEBUG ] they were: 58 46 53 42
[arms001-01][WARNIN] 10+0 records in
[arms001-01][WARNIN] 10+0 records out
[arms001-01][WARNIN] 10485760 bytes (10 MB) copied, 0.0610462 s, 172 MB/s
[arms001-01][WARNIN] 10+0 records in
[arms001-01][WARNIN] 10+0 records out
[arms001-01][WARNIN] 10485760 bytes (10 MB) copied, 0.129642 s, 80.9 MB/s
[arms001-01][WARNIN] Caution: invalid backup GPT header, but valid main header; 
regenerating
[arms001-01][WARNIN] backup header from main header.
[arms001-01][WARNIN]
[arms001-01][WARNIN] Warning! Main and backup partition tables differ! Use the 
'c' and 'e' options
[arms001-01][WARNIN] on the recovery & transformation menu to examine the two 
tables.
[arms001-01][WARNIN]
[arms001-01][WARNIN] Warning! One or more CRCs don't match. You should repair 
the disk!
[arms001-01][WARNIN]
[arms001-01][DEBUG ] 

[arms001-01][DEBUG ] Caution: Found protective or hybrid MBR and corrupt GPT. 
Using GPT, but disk
[arms001-01][DEBUG ] verification and recovery are STRONGLY recommended.
[arms001-01][DEBUG ] 

[arms001-01][DEBUG ] GPT data structures destroyed! You may now partition the 
disk using fdisk or
[arms001-01][DEBUG ] other utilities.
[arms001-01][DEBUG ] Creating new GPT entries.
[arms001-01][DEBUG ] The operation has completed successfully.
[arms001-01][WARNIN] 
/usr/local/lib/python2.7/dist-packages/ceph_disk-1.0.0-py2.7.egg/ceph_disk/main.py:5685:
 UserWarning:
[arms001-01][WARNIN] 
***
[arms001-01][WARNIN] This tool is now deprecated in favor of ceph-volume.
[arms001-01][WARNIN] It is recommended to use ceph-volume for OSD deployments. 
For details see:
[arms001-01][WARNIN]
[arms001-01][WARNIN] http://docs.ceph.com/docs/master/ceph-volume/#migrating
[arms001-01][WARNIN]
[arms001-01][WARNIN] 
***
[arms001-01][WARNIN]


root@mnc000:/home/mnvadmin/ceph# ceph-deploy osd prepare --filestore 
arms001-01:sda

Re: [ceph-users] ceph-volume does not support upstart

2018-01-10 Thread
Hello,
I am sorry for the delay.
Thank you for your suggestion.

Indeed, it is better either to upgrade the system or to keep using ceph-disk.
Thank you Alfredo Deza & Cary.
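
For anyone who stays on upstart and therefore on ceph-disk, a minimal sketch of
the luminous-era ceph-disk flow (the device name is only an example):

# prepare a bluestore OSD on the whole device
ceph-disk prepare --bluestore /dev/sda
# activate the data partition that prepare created
ceph-disk activate /dev/sda1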


> On Jan 8, 2018, at 11:41 PM, Alfredo Deza <ad...@redhat.com> wrote:
> 
> ceph-volume relies on systemd, it will not work with upstart. Going
> the fstab way might work, but most of the lvm implementation will want
> to do systemd-related calls like enabling units and placing files.
> 
> For upstart you might want to keep using ceph-disk, unless upgrading
> to a newer OS is an option in which case ceph-volume would work (as
> long as systemd is available)
> 
> On Sat, Dec 30, 2017 at 9:11 PM, 赵赵贺东 <zhaohed...@gmail.com> wrote:
>> Hello Cary,
>> 
>> Thank you for your detailed description, it’s really helpful for me!
>> I will have a try when I get back to my office!
>> 
>> Thank you for your attention to this matter.
>> 
>> 
>> On Dec 30, 2017, at 3:51 AM, Cary <dynamic.c...@gmail.com> wrote:
>> 
>> Hello,
>> 
>> I mount my Bluestore OSDs in /etc/fstab:
>> 
>> vi /etc/fstab
>> 
>> tmpfs   /var/lib/ceph/osd/ceph-12  tmpfs   rw,relatime 0 0
>> =
>> Then mount everyting in fstab with:
>> mount -a
>> ==
>> I activate my OSDs this way on startup: You can find the fsid with
>> 
>> cat /var/lib/ceph/osd/ceph-12/fsid
>> 
>> Then add file named ceph.start so ceph-volume will be run at startup.
>> 
>> vi /etc/local.d/ceph.start
>> ceph-volume lvm activate 12 827f4a2c-8c1b-427b-bd6c-66d31a0468ac
>> ==
>> Make it executable:
>> chmod 700 /etc/local.d/ceph.start
>> ==
>> cd /etc/local.d/
>> ./ceph.start
>> ==
>> I am a Gentoo user and use OpenRC, so this may not apply to you.
>> ==
>> cd /etc/init.d/
>> ln -s ceph ceph-osd.12
>> /etc/init.d/ceph-osd.12 start
>> rc-update add ceph-osd.12 default
>> 
>> Cary
>> 
>> On Fri, Dec 29, 2017 at 8:47 AM, 赵赵贺东 <zhaohed...@gmail.com> wrote:
>> 
>> Hello Cary!
>> It’s really big surprise for me to receive your reply!
>> Sincere thanks to you!
>> I know it’s a fake execute file, but it works!
>> 
>> >
>> $ cat /usr/sbin/systemctl
>> #!/bin/bash
>> exit 0
>> <
>> 
>> I can start my osd by following command
>> /usr/bin/ceph-osd --cluster=ceph -i 12 -f --setuser ceph --setgroup ceph
>> 
>> But, threre are still problems.
>> 1.Though ceph-osd can start successfully, prepare log and activate log looks
>> like errors occurred.
>> 
>> Prepare log:
>> ===>
>> # ceph-volume lvm prepare --bluestore --data vggroup/lv
>> Running command: sudo mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-12
>> Running command: chown -R ceph:ceph /dev/dm-0
>> Running command: sudo ln -s /dev/vggroup/lv /var/lib/ceph/osd/ceph-12/block
>> Running command: sudo ceph --cluster ceph --name client.bootstrap-osd
>> --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o
>> /var/lib/ceph/osd/ceph-12/activate.monmap
>> stderr: got monmap epoch 1
>> Running command: ceph-authtool /var/lib/ceph/osd/ceph-12/keyring
>> --create-keyring --name osd.12 --add-key
>> AQAQ+UVa4z2ANRAAmmuAExQauFinuJuL6A56ww==
>> stdout: creating /var/lib/ceph/osd/ceph-12/keyring
>> stdout: added entity osd.12 auth auth(auid = 18446744073709551615
>> key=AQAQ+UVa4z2ANRAAmmuAExQauFinuJuL6A56ww== with 0 caps)
>> Running command: chown -R ceph:ceph /var/lib/ceph/osd/ceph-12/keyring
>> Running command: chown -R ceph:ceph /var/lib/ceph/osd/ceph-12/
>> Running command: sudo ceph-osd --cluster ceph --osd-objectstore bluestore
>> --mkfs -i 12 --monmap /var/lib/ceph/osd/ceph-12/activate.monmap --key
>>  --osd-data
>> /var/lib/ceph/osd/ceph-12/ --osd-uuid 827f4a2c-8c1b-427b-bd6c-66d31a0468ac
>> --setuser ceph --setgroup ceph
>> stderr: warning: unable to create /var/run/ceph: (13) Permission denied
>> stderr: 2017-12-29 08:13:08.609127 b66f3000 -1 asok(0x850c62a0)
>> AdminSocketConfigObs::init: failed: AdminSocket::bind_and_listen: failed to
>> bind the UNIX domain socket to '/var/run/ceph/ceph-osd.12.asok': (2) No such
>> file or directory
>> stderr:
>> stderr: 201

Re: [ceph-users] ceph-volume does not support upstart

2017-12-30 Thread
Hello Cary,

Thank you for your detailed description, it’s really helpful for me!
I will have a try when I get back to my office!

Thank you for your attention to this matter.

> On Dec 30, 2017, at 3:51 AM, Cary <dynamic.c...@gmail.com> wrote:
> 
> Hello,
> 
> I mount my Bluestore OSDs in /etc/fstab:
> 
> vi /etc/fstab
> 
> tmpfs   /var/lib/ceph/osd/ceph-12  tmpfs   rw,relatime 0 0
> =
> Then mount everyting in fstab with:
> mount -a
> ==
> I activate my OSDs this way on startup: You can find the fsid with
> 
> cat /var/lib/ceph/osd/ceph-12/fsid
> 
> Then add file named ceph.start so ceph-volume will be run at startup.
> 
> vi /etc/local.d/ceph.start
> ceph-volume lvm activate 12 827f4a2c-8c1b-427b-bd6c-66d31a0468ac
> ==
> Make it executable:
> chmod 700 /etc/local.d/ceph.start
> ==
> cd /etc/local.d/
> ./ceph.start
> ==
> I am a Gentoo user and use OpenRC, so this may not apply to you.
> ==
> cd /etc/init.d/
> ln -s ceph ceph-osd.12
> /etc/init.d/ceph-osd.12 start
> rc-update add ceph-osd.12 default
> 
> Cary
> 
> On Fri, Dec 29, 2017 at 8:47 AM, 赵赵贺东 <zhaohed...@gmail.com> wrote:
>> Hello Cary!
>> It’s really big surprise for me to receive your reply!
>> Sincere thanks to you!
>> I know it’s a fake execute file, but it works!
>> 
>> >
>> $ cat /usr/sbin/systemctl
>> #!/bin/bash
>> exit 0
>> <
>> 
>> I can start my osd by following command
>> /usr/bin/ceph-osd --cluster=ceph -i 12 -f --setuser ceph --setgroup ceph
>> 
>> But, threre are still problems.
>> 1.Though ceph-osd can start successfully, prepare log and activate log looks
>> like errors occurred.
>> 
>> Prepare log:
>> ===>
>> # ceph-volume lvm prepare --bluestore --data vggroup/lv
>> Running command: sudo mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-12
>> Running command: chown -R ceph:ceph /dev/dm-0
>> Running command: sudo ln -s /dev/vggroup/lv /var/lib/ceph/osd/ceph-12/block
>> Running command: sudo ceph --cluster ceph --name client.bootstrap-osd
>> --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o
>> /var/lib/ceph/osd/ceph-12/activate.monmap
>> stderr: got monmap epoch 1
>> Running command: ceph-authtool /var/lib/ceph/osd/ceph-12/keyring
>> --create-keyring --name osd.12 --add-key
>> AQAQ+UVa4z2ANRAAmmuAExQauFinuJuL6A56ww==
>> stdout: creating /var/lib/ceph/osd/ceph-12/keyring
>> stdout: added entity osd.12 auth auth(auid = 18446744073709551615
>> key=AQAQ+UVa4z2ANRAAmmuAExQauFinuJuL6A56ww== with 0 caps)
>> Running command: chown -R ceph:ceph /var/lib/ceph/osd/ceph-12/keyring
>> Running command: chown -R ceph:ceph /var/lib/ceph/osd/ceph-12/
>> Running command: sudo ceph-osd --cluster ceph --osd-objectstore bluestore
>> --mkfs -i 12 --monmap /var/lib/ceph/osd/ceph-12/activate.monmap --key
>>  --osd-data
>> /var/lib/ceph/osd/ceph-12/ --osd-uuid 827f4a2c-8c1b-427b-bd6c-66d31a0468ac
>> --setuser ceph --setgroup ceph
>> stderr: warning: unable to create /var/run/ceph: (13) Permission denied
>> stderr: 2017-12-29 08:13:08.609127 b66f3000 -1 asok(0x850c62a0)
>> AdminSocketConfigObs::init: failed: AdminSocket::bind_and_listen: failed to
>> bind the UNIX domain socket to '/var/run/ceph/ceph-osd.12.asok': (2) No such
>> file or directory
>> stderr:
>> stderr: 2017-12-29 08:13:08.643410 b66f3000 -1
>> bluestore(/var/lib/ceph/osd/ceph-12//block) _read_bdev_label unable to
>> decode label at offset 66: buffer::malformed_input: void
>> bluestore_bdev_label_t::decode(ceph::buffer::list::iterator&) decode past
>> end of struct encoding
>> stderr: 2017-12-29 08:13:08.644055 b66f3000 -1
>> bluestore(/var/lib/ceph/osd/ceph-12//block) _read_bdev_label unable to
>> decode label at offset 66: buffer::malformed_input: void
>> bluestore_bdev_label_t::decode(ceph::buffer::list::iterator&) decode past
>> end of struct encoding
>> stderr: 2017-12-29 08:13:08.644722 b66f3000 -1
>> bluestore(/var/lib/ceph/osd/ceph-12//block) _read_bdev_label unable to
>> decode label at offset 66: buffer::malformed_input: void
>> bluestore_bdev_label_t::decode(ceph::buffer::list::iterator&) decode past
>> end of struct encoding
>> stderr: 2017-12

Re: [ceph-users] ceph-volume does not support upstart

2017-12-29 Thread
Any suggestions or hints about the problems will be appreciated!




> On Dec 29, 2017, at 2:06 PM, Cary <dynamic.c...@gmail.com> wrote:
> 
> 
> You could add a file named  /usr/sbin/systemctl and add:
> exit 0
> to it.
>  
> Cary
> 
> On Dec 28, 2017, at 18:45, 赵赵贺东 <zhaohed...@gmail.com 
> <mailto:zhaohed...@gmail.com>> wrote:
> 
> 
> Hello ceph-users!
> 
> I am a ceph user from china.
> Our company deploy ceph on arm ubuntu 14.04. 
> Ceph Version is luminous 12.2.2.
> When I try to activate osd by ceph-volume, I got the following error.(osd 
> prepare stage seems work normally)
> It seems that ceph-volume only work under systemd, but ubuntu 14.04 does not 
> support systemd.
> How can I deploy osd in ubuntu 14.04 by ceph-volume?
> Will ceph-volume support upstart in the future?
> 
> ===>
> # ceph-volume lvm activate --bluestore 12 03fa2757-412d-4892-af8a-f2260294a2dc
> Running command: sudo ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev 
> /dev/vggroup/lvdata --path /var/lib/ceph/osd/ceph-12
> Running command: sudo ln -snf /dev/vggroup/lvdata 
> /var/lib/ceph/osd/ceph-12/block
> Running command: chown -R ceph:ceph /dev/dm-2
> Running command: chown -R ceph:ceph /var/lib/ceph/osd/ceph-12
> Running command: sudo systemctl enable 
> ceph-volume@lvm-12-03fa2757-412d-4892-af8a-f2260294a2dc
>  stderr: sudo: systemctl: command not found
> -->  RuntimeError: command returned non-zero exit status: 1
> <
> 
> 
> Your reply will be appreciated!
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com <mailto:ceph-users@lists.ceph.com>
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com 
> <http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com>

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] ceph-volume does not support upstart

2017-12-28 Thread

Hello ceph-users!

I am a Ceph user from China.
Our company deploys Ceph on ARM Ubuntu 14.04.
The Ceph version is Luminous 12.2.2.
When I try to activate an OSD with ceph-volume, I get the following error (the
osd prepare stage seems to work normally).
It seems that ceph-volume only works under systemd, but Ubuntu 14.04 does not
support systemd.
How can I deploy an OSD on Ubuntu 14.04 with ceph-volume?
Will ceph-volume support upstart in the future?

===>
# ceph-volume lvm activate --bluestore 12 03fa2757-412d-4892-af8a-f2260294a2dc
Running command: sudo ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev 
/dev/vggroup/lvdata --path /var/lib/ceph/osd/ceph-12
Running command: sudo ln -snf /dev/vggroup/lvdata 
/var/lib/ceph/osd/ceph-12/block
Running command: chown -R ceph:ceph /dev/dm-2
Running command: chown -R ceph:ceph /var/lib/ceph/osd/ceph-12
Running command: sudo systemctl enable 
ceph-volume@lvm-12-03fa2757-412d-4892-af8a-f2260294a2dc
 stderr: sudo: systemctl: command not found
-->  RuntimeError: command returned non-zero exit status: 1
<


Your reply will be appreciated!

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com