Re: [ceph-users] Understanding EC properties for CephFS / small files.
> I'm trying to understand the nuts and bolts of EC / CephFS.
> We're running an EC 4+2 pool on top of 72 x 7.2K rpm 10TB drives. Pretty
> slow bulk / archive storage.

Ok, did some more searching and found this:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-October/021642.html

which to some degree confirms my understanding. I'd still like to get even
more insight, though. Gregory Farnum makes this comment:

"Unfortunately any logic like this would need to be handled in your
application layer. Raw RADOS does not do object sharding or aggregation on
its own. CERN did contribute the libradosstriper, which will break down
your multi-gigabyte objects into more typical sizes, but a generic system
for packing many small objects into larger ones is tough; the choices
depend so much on likely access patterns and such. I would definitely
recommend working out something like that, though!"

An idea about how to advance this: I can see that this would be "very hard"
to do at the object level given Ceph's design, but a suggestion would be to
do it at the CephFS/MDS level. A basic approach that would "often" work
would be to support, at the directory level, a special type of "packed"
object, where multiple files go into the same CephFS object. For common
access patterns people are reading through entire directories in the first
place, which would also limit IO on the overall system for tree traversals
(think "tar czvf linux.kernel.tar.gz git-checkout").

I have no idea how CephFS deals with concurrent updates around entities,
but in this scheme concurrency would be handled at the packed-object level.
It would be harder to "pack files across directories", since that is not
the native way for the MDS to keep track of things.

A third way would be to more "aggressively" inline data on the MDS. How
mature / well-tested / efficient is that feature?
http://docs.ceph.com/docs/master/cephfs/experimental-features/

The unfortunate consequence of bumping the 2KB inline size upwards, to meet
the point where EC pools become efficient, would be that we end up hitting
the MDS much harder than we do today. 2KB seems like a safe limit.
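For reference, a minimal sketch of how that inline-data feature could be
probed, assuming a filesystem named "cephfs" (placeholder); I haven't
verified whether the confirmation flag is required on current releases:

    # Hedged sketch: enable the experimental inline-data feature, then
    # check the flag in the FSMap dump.
    ceph fs set cephfs inline_data true --yes-i-really-mean-it
    ceph fs dump | grep inline_data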
[ceph-users] Understanding EC properties for CephFS / small files.
Hi List.

I'm trying to understand the nuts and bolts of EC / CephFS. We're running
an EC 4+2 pool on top of 72 x 7.2K rpm 10TB drives. Pretty slow bulk /
archive storage.

# getfattr -n ceph.dir.layout /mnt/home/cluster/mysqlbackup
getfattr: Removing leading '/' from absolute path names
# file: mnt/home/cluster/mysqlbackup
ceph.dir.layout="stripe_unit=4194304 stripe_count=1 object_size=4194304 pool=cephfs_data_ec42"

This configuration is taken directly out of the online documentation (which
may be where it all went wrong from our perspective):
http://docs.ceph.com/docs/master/cephfs/file-layouts/

Ok, this means that a 16MB file will be split into 4 data chunks of 4MB
each, plus 2 erasure-coding chunks? I don't really understand the
stripe_count element. And since erasure coding works at the object level,
striping individual objects across the shards (here 4 data + 2 coding),
it'll end up filling 16MB? Or is there an internal optimization causing
this not to be the case?

Additionally, when reading the file, all 4 data chunks need to be read to
assemble the object, causing (at a minimum) 4 IOPS per file.

Now, my common file size is < 8MB, and 512KB files are common on this pool.
Will that cause a 512KB file to be padded to 4MB with 3 empty chunks to
fill the erasure-coding profile, and then 2 coding chunks on top? In total
24MB for storing 512KB? And when reading it, will I hit 4 random IOs to
read 512KB, or can it optimize around not reading "empty" chunks?

If this is true, then I would be way better off, both performance- and
space/cost-wise, with 3x replication. Or is it less bad than what I arrive
at here?

If the math holds, then we can begin to calculate chunk sizes and EC
profiles for when EC begins to deliver benefits. In terms of IO it seems
like I'll always suffer a 1:4 ratio on IOPS in a reading scenario on a
4+2 EC pool, compared to 3x replication.

Side note: I'm trying to get bacula (tape backup) to read off my archive to
tape at a reasonable time/speed.

Thanks in advance.

--
Jesper
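For the record, here is the worst-case arithmetic from my question as a
small sketch. It assumes whole-chunk padding, which is exactly the
behaviour I'm asking about, so treat the result as an upper bound:

    # Hedged back-of-envelope: a 512KB file in a 4+2 pool with 4MB chunks,
    # assuming each of the k+m shards is padded to a full chunk.
    file_kb=512
    chunk_kb=4096                        # stripe_unit = 4194304 bytes
    k=4; m=2
    raw_kb=$(( chunk_kb * (k + m) ))     # 24576 KB = 24MB on disk
    echo "EC 4+2 raw usage: ${raw_kb} KB for ${file_kb} KB of data"
    echo "3x replication:   $(( file_kb * 3 )) KB for the same file"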
Re: [ceph-users] Second radosgw install
Hi all,

I know that it seems like a stupid question, but I have some concerns about
this; maybe someone can clear things up for me. I read in the official docs
that when I create a rgw server with 'ceph-deploy rgw create', the rgw
scripts will automatically create the rgw system pools. I'm not sure what
happens to the existing system pools if I already have a working rgw
server...

Thanks.

On 2/15/2019 6:35 PM, Adrian Nicolae wrote:
> Hi,
>
> I want to install a second radosgw to my existing ceph cluster (mimic) on
> another server. Should I create it like the first one, with 'ceph-deploy
> rgw create'? I don't want to mess with the existing rgw system pools.
>
> Thanks.
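One way to reassure yourself, sketched under the assumption that the second
gateway host is called "gw2" (placeholder):

    # Snapshot the pool list before deploying the second gateway, then
    # diff afterwards to confirm no existing system pool was recreated.
    ceph osd lspools > pools.before
    ceph-deploy rgw create gw2
    ceph osd lspools | diff pools.before -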
Re: [ceph-users] PG_AVAILABILITY with one osd down?
Clients' experience depends on whether, at the very moment, they need to
read/write to those particular PGs involved in peering. If their objects
are placed in other PGs, then I/O operations shouldn't be impacted. If
clients were performing I/O ops to those PGs that went into peering, then
they will notice increased latency. That's the case for Object and RBD; in
the case of CephFS I have no experience.

Peering of several PGs does not mean the whole cluster is unavailable
during that time; only a tiny part of it is.

Also, those 6 seconds are the duration of the PG_AVAILABILITY health check
warning, not the length of each PG's unavailability. It's the cluster
noticing that during that time some groups performed peering. In a proper
setup and healthy conditions, one group peers in fractions of a second.

Restarting an OSD causes the same thing, though more "smoothly" than an
unexpected death (going into the details would require quite a long
elaboration). If your setup is correct, you should be able to perform a
cluster-wide restart of everything, and the only effect visible outside
would be a slightly increased latency.

Kind regards,
Maks

On Sat, 16 Feb 2019 at 21:39 ... wrote:

> > Hello,
> >
> > your log extract shows that:
> >
> > 2019-02-15 21:40:08 OSD.29 DOWN
> > 2019-02-15 21:40:09 PG_AVAILABILITY warning start
> > 2019-02-15 21:40:15 PG_AVAILABILITY warning cleared
> >
> > 2019-02-15 21:44:06 OSD.29 UP
> > 2019-02-15 21:44:08 PG_AVAILABILITY warning start
> > 2019-02-15 21:44:15 PG_AVAILABILITY warning cleared
> >
> > What you saw is the natural consequence of OSD state change. Those two
> > periods of limited PG availability (6s each) are related to peering
> > that happens shortly after an OSD goes down or up.
> > Basically, the placement groups stored on that OSD need peering, so
> > the incoming connections are directed to other (alive) OSDs. And, yes,
> > during those few seconds the data are not accessible.
>
> Thanks, bear with my questions; I'm pretty new to Ceph.
> What will clients (CephFS, Object) experience?
> .. will they just block until time has passed and they get through, or?
>
> Which means that I'll get 72 x 6 seconds unavailability when doing
> a rolling restart of my OSDs during upgrades and such? Or is a
> controlled restart different than a crash?
>
> --
> Jesper
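For what it's worth, a controlled rolling restart usually looks roughly
like the sketch below (standard practice rather than anything specific to
this thread; osd.29 stands in for each OSD in turn):

    # noout keeps the cluster from marking restarting OSDs out (and thus
    # from rebalancing); the short per-OSD peering blips still occur.
    ceph osd set noout
    systemctl restart ceph-osd@29   # repeat per OSD, waiting for PGs to
                                    # return to active+clean between steps
    ceph osd unset noout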
[ceph-users] Some ceph config parameters default values
Dear Cephalopodians,

in some recent threads on this list, I have read about these "knobs":

- pglog_hardlimit (false by default, available at least with 12.2.11 and 13.2.5)
- bdev_enable_discard (false by default, advanced option, no description)
- bdev_async_discard (false by default, advanced option, no description)

I am wondering about the defaults for these settings, and why they seem
mostly undocumented.

It seems to me that on SSD / NVMe devices, you would always want to enable
discard for significantly increased lifetime, or run fstrim regularly
(which you can't with BlueStore, since it's a filesystem of its own). From
personal experience, I have already lost two eMMC devices in Android phones
early due to trimming not working. Of course, on first-generation SSD
devices, "discard" may lead to data loss (which for most devices has been
fixed with firmware updates, though).

I would presume that async discard is also advantageous, since it seems to
queue the discards and work on them in bulk later, instead of issuing them
immediately (that's what I grasp from the code). Additionally, it's unclear
to me whether the bdev-discard settings also affect WAL/DB devices, which
are very commonly SSD/NVMe devices in the BlueStore age.

Concerning pglog_hardlimit, I read on this list that it's safe and limits
maximum memory consumption, especially for backfills / during recovery. So
it "sounds" like this is also something that could be on by default. But
maybe that is not the case yet, to allow downgrades after failed upgrades?

So in the end, my question is: Is there a reason why these values are not
on by default, and are also not really mentioned in the documentation? Are
they just "not ready yet" / unsafe to be on by default, or are the defaults
just like that because they have always been at this value, and will they
change with the next major release (Nautilus)?

Cheers,
Oliver
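For concreteness, this is how I would expect these knobs to be enabled (a
sketch only; whether your release honours them, and whether pglog_hardlimit
really requires the confirmation flag, is part of what I'm asking):

    # Mimic+ central config store; older releases would set these in
    # the [osd] section of ceph.conf instead.
    ceph config set osd bdev_enable_discard true
    ceph config set osd bdev_async_discard true
    # pglog_hardlimit is a cluster-wide flag and cannot be unset later.
    ceph osd set pglog_hardlimit --yes-i-really-mean-it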
Re: [ceph-users] PG_AVAILABILITY with one osd down?
> Hello,
>
> your log extract shows that:
>
> 2019-02-15 21:40:08 OSD.29 DOWN
> 2019-02-15 21:40:09 PG_AVAILABILITY warning start
> 2019-02-15 21:40:15 PG_AVAILABILITY warning cleared
>
> 2019-02-15 21:44:06 OSD.29 UP
> 2019-02-15 21:44:08 PG_AVAILABILITY warning start
> 2019-02-15 21:44:15 PG_AVAILABILITY warning cleared
>
> What you saw is the natural consequence of OSD state change. Those two
> periods of limited PG availability (6s each) are related to peering
> that happens shortly after an OSD goes down or up.
> Basically, the placement groups stored on that OSD need peering, so
> the incoming connections are directed to other (alive) OSDs. And, yes,
> during those few seconds the data are not accessible.

Thanks, bear with my questions; I'm pretty new to Ceph.
What will clients (CephFS, Object) experience?
.. will they just block until time has passed and they get through, or?

Which means that I'll get 72 x 6 seconds unavailability when doing
a rolling restart of my OSDs during upgrades and such? Or is a
controlled restart different than a crash?

--
Jesper
Re: [ceph-users] PG_AVAILABILITY with one osd down?
Hello,

your log extract shows that:

2019-02-15 21:40:08 OSD.29 DOWN
2019-02-15 21:40:09 PG_AVAILABILITY warning start
2019-02-15 21:40:15 PG_AVAILABILITY warning cleared

2019-02-15 21:44:06 OSD.29 UP
2019-02-15 21:44:08 PG_AVAILABILITY warning start
2019-02-15 21:44:15 PG_AVAILABILITY warning cleared

What you saw is the natural consequence of OSD state change. Those two
periods of limited PG availability (6s each) are related to peering that
happens shortly after an OSD goes down or up. Basically, the placement
groups stored on that OSD need peering, so the incoming connections are
directed to other (alive) OSDs. And, yes, during those few seconds the data
are not accessible.

Kind regards,
Maks

On Sat, 16 Feb 2019 at 07:25 ... wrote:

> Yesterday I saw this one.. it puzzles me:
> 2019-02-15 21:00:00.000126 mon.torsk1 mon.0 10.194.132.88:6789/0 604164 : cluster [INF] overall HEALTH_OK
> 2019-02-15 21:39:55.793934 mon.torsk1 mon.0 10.194.132.88:6789/0 604304 : cluster [WRN] Health check failed: 2 slow requests are blocked > 32 sec. Implicated osds 58 (REQUEST_SLOW)
> 2019-02-15 21:40:00.887766 mon.torsk1 mon.0 10.194.132.88:6789/0 604305 : cluster [WRN] Health check update: 6 slow requests are blocked > 32 sec. Implicated osds 9,19,52,58,68 (REQUEST_SLOW)
> 2019-02-15 21:40:06.973901 mon.torsk1 mon.0 10.194.132.88:6789/0 604306 : cluster [WRN] Health check update: 14 slow requests are blocked > 32 sec. Implicated osds 3,9,19,29,32,52,55,58,68,69 (REQUEST_SLOW)
> 2019-02-15 21:40:08.466266 mon.torsk1 mon.0 10.194.132.88:6789/0 604307 : cluster [INF] osd.29 failed (root=default,host=bison) (6 reporters from different host after 33.862482 >= grace 29.247323)
> 2019-02-15 21:40:08.473703 mon.torsk1 mon.0 10.194.132.88:6789/0 604308 : cluster [WRN] Health check failed: 1 osds down (OSD_DOWN)
> 2019-02-15 21:40:09.489494 mon.torsk1 mon.0 10.194.132.88:6789/0 604310 : cluster [WRN] Health check failed: Reduced data availability: 6 pgs peering (PG_AVAILABILITY)
> 2019-02-15 21:40:11.008906 mon.torsk1 mon.0 10.194.132.88:6789/0 604312 : cluster [WRN] Health check failed: Degraded data redundancy: 3828291/700353996 objects degraded (0.547%), 77 pgs degraded (PG_DEGRADED)
> 2019-02-15 21:40:13.474777 mon.torsk1 mon.0 10.194.132.88:6789/0 604313 : cluster [WRN] Health check update: 9 slow requests are blocked > 32 sec. Implicated osds 3,9,32,55,58,69 (REQUEST_SLOW)
> 2019-02-15 21:40:15.060165 mon.torsk1 mon.0 10.194.132.88:6789/0 604314 : cluster [INF] Health check cleared: PG_AVAILABILITY (was: Reduced data availability: 17 pgs peering)
> 2019-02-15 21:40:17.128185 mon.torsk1 mon.0 10.194.132.88:6789/0 604315 : cluster [WRN] Health check update: Degraded data redundancy: 9897139/700354131 objects degraded (1.413%), 200 pgs degraded (PG_DEGRADED)
> 2019-02-15 21:40:17.128219 mon.torsk1 mon.0 10.194.132.88:6789/0 604316 : cluster [INF] Health check cleared: REQUEST_SLOW (was: 2 slow requests are blocked > 32 sec. Implicated osds 32,55)
> 2019-02-15 21:40:22.137090 mon.torsk1 mon.0 10.194.132.88:6789/0 604317 : cluster [WRN] Health check update: Degraded data redundancy: 9897140/700354194 objects degraded (1.413%), 200 pgs degraded (PG_DEGRADED)
> 2019-02-15 21:40:27.249354 mon.torsk1 mon.0 10.194.132.88:6789/0 604318 : cluster [WRN] Health check update: Degraded data redundancy: 9897142/700354287 objects degraded (1.413%), 200 pgs degraded (PG_DEGRADED)
> 2019-02-15 21:40:33.335147 mon.torsk1 mon.0 10.194.132.88:6789/0 604322 : cluster [WRN] Health check update: Degraded data redundancy: 9897143/700354356 objects degraded (1.413%), 200 pgs degraded (PG_DEGRADED)
> ... shortened ...
> 2019-02-15 21:43:48.496536 mon.torsk1 mon.0 10.194.132.88:6789/0 604366 : cluster [WRN] Health check update: Degraded data redundancy: 9897168/700356693 objects degraded (1.413%), 200 pgs degraded, 201 pgs undersized (PG_DEGRADED)
> 2019-02-15 21:43:53.496924 mon.torsk1 mon.0 10.194.132.88:6789/0 604367 : cluster [WRN] Health check update: Degraded data redundancy: 9897170/700356804 objects degraded (1.413%), 200 pgs degraded, 201 pgs undersized (PG_DEGRADED)
> 2019-02-15 21:43:58.497313 mon.torsk1 mon.0 10.194.132.88:6789/0 604368 : cluster [WRN] Health check update: Degraded data redundancy: 9897172/700356879 objects degraded (1.413%), 200 pgs degraded, 201 pgs undersized (PG_DEGRADED)
> 2019-02-15 21:44:03.497696 mon.torsk1 mon.0 10.194.132.88:6789/0 604369 : cluster [WRN] Health check update: Degraded data redundancy: 9897174/700356996 objects degraded (1.413%), 200 pgs degraded, 201 pgs undersized (PG_DEGRADED)
> 2019-02-15 21:44:06.939331 mon.torsk1 mon.0 10.194.132.88:6789/0 604372 : cluster [INF] Health check cleared: OSD_DOWN (was: 1 osds down)
> 2019-02-15 21:44:06.965401 mon.torsk1 mon.0 10.194.132.88:6789/0 604373 : cluster [INF] osd.29 10.194.133.58:6844/305358 boot
> 2019-02-15 21:44:08.498060 mon.torsk1 mon.0
Re: [ceph-users] Placing replaced disks to correct buckets.
> I recently replaced failed HDDs and removed them from their respective
> buckets as per procedure. But I'm now facing an issue when trying to
> place new ones back into the buckets. I'm getting an error of 'osd nr not
> found' OR 'file or directory not found' OR a command syntax error.
>
> I have been using the commands below:
>
> ceph osd crush set <osd-id> <weight> <bucket>
> ceph osd crush set <osd-id> <weight> <bucket>
>
> I do however find the OSD number when i run the command:
>
> ceph osd find <osd-id>
>
> Your assistance/response to this will be highly appreciated.
>
> Regards
> John.

Please paste your `ceph osd tree`, your version, and the exact error you
get, including the osd number. Less obfuscation is better in this, perhaps
simple, case.

k
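For reference, a sketch of the syntax those commands are presumably aiming
for (the osd id, weight, and bucket location below are made-up
placeholders, not values from John's cluster):

    # Confirm the OSD exists, then set its weight and location in the
    # CRUSH map; "root=default host=node01" must match your actual tree.
    ceph osd find 12
    ceph osd crush set osd.12 9.09569 root=default host=node01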
[ceph-users] Placing replaced disks to correct buckets.
Hi Everyone,

I recently replaced failed HDDs and removed them from their respective
buckets as per procedure. But I'm now facing an issue when trying to place
new ones back into the buckets. I'm getting an error of 'osd nr not found'
OR 'file or directory not found' OR a command syntax error.

I have been using the commands below:

ceph osd crush set <osd-id> <weight> <bucket>
ceph osd crush set <osd-id> <weight> <bucket>

I do however find the OSD number when i run the command:

ceph osd find <osd-id>

Your assistance/response to this will be highly appreciated.

Regards
John.
[ceph-users] Ceph auth caps 'create rbd image' permission
Currently I am using 'profile rbd' on mon and osd. Is it possible with the
caps to allow a user to:

- list rbd images
- get the state of images
- write/read to images

etc., but not allow it to create new images?
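For context, caps are adjusted with 'ceph auth caps'; the sketch below
shows the mechanism only. Whether an OSD cap string exists that permits
read/write to existing images while denying creation is exactly the open
question here, and the profile shown still allows creating images:

    # Inspect the current caps, then (re)apply the standard rbd profile.
    # "client.guest" and the pool name are placeholders.
    ceph auth get client.guest
    ceph auth caps client.guest mon 'profile rbd' osd 'profile rbd pool=rbd'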
Re: [ceph-users] ceph osd commit latency increase over time, until restart
>> There are 10 OSDs in these systems with 96GB of memory in total. We are
>> running with memory target on 6G right now to make sure there is no
>> leakage. If this runs fine for a longer period we will go to 8GB per OSD
>> so it will max out on 80GB, leaving 16GB as spare.

Thanks Wido. I'll send results on Monday with my increased memory.

@Igor: I have also noticed that sometimes, when I have bad latency on an
osd on node1 (restarted 12h ago, for example) (op_w_process_latency),
restarting osds on other nodes (last restarted some days ago, so with
bigger latency) reduces the latency on the osds of node1 too.

Does the "op_w_process_latency" counter include replication time?

----- Original Message -----
From: "Wido den Hollander"
To: "aderumier"
Cc: "Igor Fedotov", "ceph-users", "ceph-devel"
Sent: Friday, 15 February 2019 14:59:30
Subject: Re: [ceph-users] ceph osd commit latency increase over time, until restart

On 2/15/19 2:54 PM, Alexandre DERUMIER wrote:
>>> Just wanted to chime in, I've seen this with Luminous+BlueStore+NVMe
>>> OSDs as well. Over time their latency increased until we started to
>>> notice I/O-wait inside VMs.
>
> I also notice it in the vms. BTW, what is your nvme disk size?

Samsung PM983 3.84TB SSDs in both clusters.

>>> A restart fixed it. We also increased memory target from 4G to 6G on
>>> these OSDs as the memory would allow it.
>
> I have set memory to 6GB this morning, with 2 osds of 3TB for 6TB nvme.
> (my last test was 8GB with 1 osd of 6TB, but that didn't help)

There are 10 OSDs in these systems with 96GB of memory in total. We are
running with memory target on 6G right now to make sure there is no
leakage. If this runs fine for a longer period we will go to 8GB per OSD so
it will max out on 80GB, leaving 16GB as spare.

As these OSDs were all restarted earlier this week I can't tell how it will
hold up over a longer period. Monitoring (Zabbix) shows the latency is fine
at the moment.

Wido

> ----- Original Message -----
> From: "Wido den Hollander"
> To: "Alexandre Derumier", "Igor Fedotov"
> Cc: "ceph-users", "ceph-devel"
> Sent: Friday, 15 February 2019 14:50:34
> Subject: Re: [ceph-users] ceph osd commit latency increase over time, until restart
>
> On 2/15/19 2:31 PM, Alexandre DERUMIER wrote:
>> Thanks Igor.
>>
>> I'll try to create multiple osds per nvme disk (6TB) to see if the
>> behaviour is different.
>>
>> I have other clusters (same ceph.conf), but with 1.6TB drives, and I
>> don't see this latency problem.
>
> Just wanted to chime in, I've seen this with Luminous+BlueStore+NVMe
> OSDs as well. Over time their latency increased until we started to
> notice I/O-wait inside VMs.
>
> A restart fixed it. We also increased memory target from 4G to 6G on
> these OSDs as the memory would allow it.
>
> But we noticed this on two different 12.2.10/11 clusters.
>
> A restart made the latency drop. Not only the numbers, but the
> real-world latency as experienced by a VM as well.
>
> Wido
>
>> ----- Original Message -----
>> From: "Igor Fedotov"
>> Cc: "ceph-users", "ceph-devel"
>> Sent: Friday, 15 February 2019 13:47:57
>> Subject: Re: [ceph-users] ceph osd commit latency increase over time, until restart
>>
>> Hi Alexander,
>>
>> I've read through your reports; nothing obvious so far.
>>
>> I can only see a several-times average latency increase for OSD write
>> ops (in seconds):
>> 0.002040060 (first hour) vs.
>> 0.002483516 (last 24 hours) vs.
>> 0.008382087 (last hour)
>>
>> subop_w_latency:
>> 0.000478934 (first hour) vs.
>> 0.000537956 (last 24 hours) vs.
>> 0.003073475 (last hour)
>>
>> and OSD read ops, osd_r_latency:
>> 0.000408595 (first hour)
>> 0.000709031 (24 hours)
>> 0.004979540 (last hour)
>>
>> What's interesting is that such latency differences aren't observed at
>> either the BlueStore level (any _lat params under the "bluestore"
>> section) or the rocksdb one.
>>
>> Which probably means that the issue is somewhere above BlueStore.
>>
>> I suggest proceeding with perf dump collection to see if the picture
>> stays the same.
>>
>> W.r.t. the memory usage you observed, I see nothing suspicious so far;
>> no decrease in the RSS report is a known artifact that seems to be safe.
>>
>> Thanks,
>> Igor
>>
>> On 2/13/2019 11:42 AM, Alexandre DERUMIER wrote:
>>> Hi Igor,
>>>
>>> Thanks again for helping!
>>>
>>> I have upgraded to the latest mimic this weekend, and with the new
>>> autotune memory, I have set osd_memory_target to 8G. (my nvmes are 6TB)
>>>
>>> I have done a lot of perf dumps and mempool dumps and ps of the process
>>> to see rss memory at different hours;
>>> here are the reports for osd.0:
>>>
>>> http://odisoweb1.odiso.net/perfanalysis/
>>>
>>> The osd was started on 12-02-2019 at 08:00.
>>>
>>> first report after 1h running
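For anyone following along, the counters compared above come from the OSD
admin socket; a sketch of how to sample them (the jq paths assume the usual
perf-dump layout and may differ per release):

    # Dump the write-latency counters for osd.0; each is reported as
    # {avgcount, sum}, so average latency = sum / avgcount.
    ceph daemon osd.0 perf dump | jq '.osd.op_w_process_latency, .osd.subop_w_latency'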
Re: [ceph-users] Openstack RBD EC pool
### ceph.conf
[global]
fsid = b5e30221-a214-353c-b66b-8c37b4349123
mon host = ceph-mon.service.i.ewcs.ch
auth cluster required = cephx
auth service required = cephx
auth client required = cephx

### ceph.ec.conf
[global]
fsid = b5e30221-a214-353c-b66b-8c37b4349123
mon host = ceph-mon.service.i..
auth cluster required = cephx
auth service required = cephx
auth client required = cephx

[client.cinder-ec]
rbd default data pool = ewos1-prod_cinder_ec

It is not necessary to split these settings into two files. Use one
ceph.conf instead:

[client.cinder-ec]
rbd default data pool = ewos1-prod_cinder_ec

But your pool is:

ceph osd pool create cinder_ec 512 512 erasure ec32

k
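A consolidated sketch of what that suggestion amounts to: one ceph.conf
whose data-pool name matches the pool actually created. I'm assuming here
that "cinder_ec" (from the create command) is the intended name and the
config path is the default; adjust whichever side is actually wrong:

    # Create the EC pool, then point the cinder-ec client at it in the
    # single shared ceph.conf.
    ceph osd pool create cinder_ec 512 512 erasure ec32
    printf '[client.cinder-ec]\nrbd default data pool = cinder_ec\n' >> /etc/ceph/ceph.conf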