Re: [ceph-users] unable to do regionmap update
On Wed, Jan 11, 2017 at 2:53 PM, Marko Stojanovic wrote:
>
> Hello all,
>
> I have an issue with radosgw-admin regionmap update. It doesn't update the map.
>
> With the zone configured like this:
>
> radosgw-admin zone get
> {
>     "id": "fc12ac44-e27e-44e3-9b13-347162d3c1d2",
>     "name": "oak-1",
>     "domain_root": "oak-1.rgw.data.root",
>     "control_pool": "oak-1.rgw.control",
>     "gc_pool": "oak-1.rgw.gc",
>     "log_pool": "oak-1.rgw.log",
>     "intent_log_pool": "oak-1.rgw.intent-log",
>     "usage_log_pool": "oak-1.rgw.usage",
>     "user_keys_pool": "oak-1.rgw.users.keys",
>     "user_email_pool": "oak-1.rgw.users.email",
>     "user_swift_pool": "oak-1.rgw.users.swift",
>     "user_uid_pool": "oak-1.rgw.users.uid",
>     "system_key": {
>         "access_key": "XX",
>         "secret_key": "XX"
>     },
>     "placement_pools": [
>         {
>             "key": "default-placement",
>             "val": {
>                 "index_pool": "oak-1.rgw.buckets.index",
>                 "data_pool": "oak-1.rgw.buckets.data",
>                 "data_extra_pool": "oak-1.rgw.buckets.non-ec",
>                 "index_type": 0
>             }
>         },
>         {
>             "key": "ssd-placement",
>             "val": {
>                 "index_pool": "oak-1.rgw.buckets.index-ssd",
>                 "data_pool": "oak-1.rgw.buckets.data-ssd",
>                 "data_extra_pool": "oak-1.rgw.buckets.non-ec-ssd",
>                 "index_type": 0
>             }
>         }
>     ],
>     "metadata_heap": "oak-1.rgw.meta",
>     "realm_id": "67e26f6b-4774-4b14-9668-a5cf76b9e9ce"
> }
>
> And the region:
>
> radosgw-admin region get
> {
>     "id": "dbec3557-87bb-4460-8546-b59b4fde7e10",
>     "name": "oak",
>     "api_name": "oak",
>     "is_master": "true",
>     "endpoints": [],
>     "hostnames": [],
>     "hostnames_s3website": [],
>     "master_zone": "fc12ac44-e27e-44e3-9b13-347162d3c1d2",
>     "zones": [
>         {
>             "id": "fc12ac44-e27e-44e3-9b13-347162d3c1d2",
>             "name": "oak-1",
>             "endpoints": [
>                 "http:\/\/ceph1.oak.vast.com:7480"
>             ],
>             "log_meta": "true",
>             "log_data": "false",
>             "bucket_index_max_shards": 0,
>             "read_only": "false"
>         }
>     ],
>     "placement_targets": [
>         {
>             "name": "default-placement",
>             "tags": [
>                 "default-placement"
>             ]
>         },
>         {
>             "name": "ssd-placement",
>             "tags": [
>                 "ssd-placement"
>             ]
>         }
>     ],
>     "default_placement": "default-placement",
>     "realm_id": "67e26f6b-4774-4b14-9668-a5cf76b9e9ce"
> }
>
> When I run radosgw-admin regionmap update I don't get ssd-placement as a
> placement_target:
>
> {
>     "zonegroups": [
>         {
>             "key": "dbec3557-87bb-4460-8546-b59b4fde7e10",
>             "val": {
>                 "id": "dbec3557-87bb-4460-8546-b59b4fde7e10",
>                 "name": "oak",
>                 "api_name": "oak",
>                 "is_master": "true",
>                 "endpoints": [],
>                 "hostnames": [],
>                 "hostnames_s3website": [],
>                 "master_zone": "fc12ac44-e27e-44e3-9b13-347162d3c1d2",
>                 "zones": [
>                     {
>                         "id": "fc12ac44-e27e-44e3-9b13-347162d3c1d2",
>                         "name": "oak-1",
>                         "endpoints": [
>                             "http:\/\/ceph1.oak.vast.com:7480"
>                         ],
>                         "log_meta": "true",
>                         "log_data": "false",
>                         "bucket_index_max_shards": 0,
>                         "read_only": "false"
>                     }
>                 ],
>                 "placement_targets": [
>                     {
>                         "name": "default-placement",
>                         "tags": []
>                     }
>                 ],
>                 "default_placement": "default-placement",
>                 "realm_id": "67e26f6b-4774-4b14-9668-a5cf76b9e9ce"
>             }
>         }
>     ],
>     "master_zonegroup": "dbec3557-87bb-4460-8546-b59b4fde7e10",
>     "bucket_quota": {
>         "enabled": false,
>         "max_size_kb": -1,
>         "max_objects": -1
>     },
>     "user_quota": {
>         "enabled": false,
>         "max_size_kb": -1,
>         "max_objects": -1
>     }
> }
>
> Ceph version is:
> ceph --version
> ceph version 10.2.5 (c461ee19ecbc0c5c330aca20f7392c9a00730367)
>
> Any advice?

First, I recommend using the zonegroup commands on Jewel, as "region" was
renamed to "zonegroup". How did you create/update the zones and the
zonegroup? Did you execute a period update?

Orit

> Thanks in advance
>
> Marko Stojanovic

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
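Translating Orit's suggestion into commands: on Jewel the region commands were renamed to zonegroup commands, and configuration changes only take effect once they are committed to the period. A rough sketch, using the zonegroup name from the output above (the exact flags depend on how the realm was set up):

```shell
# Jewel renamed "region" to "zonegroup"; prefer the zonegroup commands.
radosgw-admin zonegroup get --rgw-zonegroup=oak

# After changing zone/zonegroup configuration (e.g. adding ssd-placement),
# commit it to the current period so the running gateways pick it up.
radosgw-admin period update --commit
```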
[ceph-users] Re: Re: Re: Pipe "deadlock" in Hammer, 0.94.5
Thanks for your help :-)

I checked the source code again, and in read_message it does hold the
Connection::lock:

  while (left > 0) {
    // wait for data
    if (tcp_read_wait() < 0)
      goto out_dethrottle;

    // get a buffer
    connection_state->lock.Lock();
    map<ceph_tid_t,pair<bufferlist,int> >::iterator p =
      connection_state->rx_buffers.find(header.tid);
    if (p != connection_state->rx_buffers.end()) {
      if (rxbuf.length() == 0 || p->second.second != rxbuf_version) {
        ldout(msgr->cct,10) << "reader seleting rx buffer v "
                            << p->second.second << " at offset " << offset
                            << " len " << p->second.first.length() << dendl;
        rxbuf = p->second.first;
        rxbuf_version = p->second.second;
        // make sure it's big enough
        if (rxbuf.length() < data_len)
          rxbuf.push_back(buffer::create(data_len - rxbuf.length()));
        blp = p->second.first.begin();
        blp.advance(offset);
      }
    } else {
      if (!newbuf.length()) {
        ldout(msgr->cct,20) << "reader allocating new rx buffer at offset "
                            << offset << dendl;
        alloc_aligned_buffer(newbuf, data_len, data_off);
        blp = newbuf.begin();
        blp.advance(offset);
      }
    }
    bufferptr bp = blp.get_current_ptr();
    int read = MIN(bp.length(), left);
    ldout(msgr->cct,20) << "reader reading nonblocking into "
                        << (void*) bp.c_str() << " len " << bp.length() << dendl;
    int got = tcp_read_nonblocking(bp.c_str(), read);
    ldout(msgr->cct,30) << "reader read " << got << " of " << read << dendl;
    connection_state->lock.Unlock();
    if (got < 0)
      goto out_dethrottle;
    if (got > 0) {
      blp.advance(got);
      data.append(bp, 0, got);
      offset += got;
      left -= got;
    }
    // else we got a signal or something; just loop.
  }

As shown in the code above, in the reading loop it first locks
connection_state->lock and then does the tcp_read_nonblocking.
connection_state is of type PipeConnectionRef, so connection_state->lock
is Connection::lock.

On the other hand, I'll check whether there are a lot of messages to send,
as you suggested.
Thanks :-)

From: Gregory Farnum [gfar...@redhat.com]
Sent: 14 January 2017 9:39
To: 许雪寒
Cc: jiajia zhong; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Re: Re: Pipe "deadlock" in Hammer, 0.94.5

On Thu, Jan 12, 2017 at 7:58 PM, 许雪寒 wrote:

> Thank you for your continuous help :-). We are using the hammer 0.94.5
> version, and what I read is the source code of that version. However, on
> the other hand, if Pipe::do_recv does act as blocked, is it reasonable
> for the Pipe::reader_thread to block threads calling
> SimpleMessenger::submit_message by holding Connection::lock? I think
> maybe a different mutex should be used in Pipe::read_message rather than
> Connection::lock.

I don't think it does use that lock. Pipe::read_message() is generally
called while the pipe_lock is held, but not Connection::lock. (They are
separate.)

I haven't dug into the relevant OSD code in a while, but I think it's a
lot more likely your OSD is just overloaded and is taking a while to send
a lot of different messages, and that the loop
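Whichever lock is actually involved (Connection::lock vs. pipe_lock), the failure mode being debated is easy to demonstrate: if the reader thread holds a mutex across a blocking read, any sender that needs the same mutex stalls for the whole read. A minimal sketch of that pattern in Python — not Ceph code, just the shape of the problem:

```python
import threading
import time

conn_lock = threading.Lock()   # stands in for Connection::lock / pipe_lock
t0 = time.monotonic()

def reader():
    # Reader holds the lock across a simulated blocking socket read.
    with conn_lock:
        time.sleep(0.2)        # a read that takes a while to return

t = threading.Thread(target=reader)
t.start()
time.sleep(0.05)               # let the reader grab the lock first

with conn_lock:                # a submit_message-style sender must wait here
    blocked_for = time.monotonic() - t0
t.join()

print(f"sender was blocked for ~{blocked_for:.2f}s")
```

Note that in the code quoted above, the blocking wait (tcp_read_wait) happens before the lock is taken, and the lock is only held around the non-blocking read — which is Greg's point that the lock is not held across the blocking part.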
Re: [ceph-users] Mixing disks
> -----Original Message-----
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Marc Roos
> Sent: 14 January 2017 12:56
> To: ceph-users
> Subject: [ceph-users] Mixing disks
>
> For a test cluster we would like to use some 5400rpm and 7200rpm drives.
> Is it advisable to customize the configuration as described on this page?
> Or is the speed difference so small that this should only be done when
> adding SSDs to the same OSD node?

I wouldn't add two different disk types to the same pool; it will likely
just bring the speed of the pool down to that of the slowest disk. You
could either use them as two different pools to provide different tiers of
storage, or use cache tiering to allow automatic data migration between
them.

Although I would have to question the logic of what you are trying to
achieve: the speed/cost difference between the two types isn't that great,
and I'm not sure it's worth the hassle.

> https://www.sebastien-han.fr/blog/2014/08/25/ceph-mix-sata-and-ssd-within-the-same-box/

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
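The blog post linked above describes splitting the CRUSH map by device type; the two-pool approach suggested here boils down to giving each disk class its own CRUSH root and rule. A rough Jewel-era sketch (bucket, rule, and pool names are invented for illustration, and the PG counts are placeholders):

```shell
# Separate CRUSH roots for the two spindle speeds (names are examples).
ceph osd crush add-bucket root-5400 root
ceph osd crush add-bucket root-7200 root

# One placement rule per root (host is the failure domain here).
ceph osd crush rule create-simple rule-5400 root-5400 host
ceph osd crush rule create-simple rule-7200 root-7200 host

# One pool per rule; size the PG counts for your own cluster.
ceph osd pool create pool-5400 128 128 replicated rule-5400
ceph osd pool create pool-7200 128 128 replicated rule-7200
```

Each OSD then has to be placed under the matching root (e.g. with `ceph osd crush set` or a CRUSH location hook); the blog post walks through doing this by editing the CRUSH map directly.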
[ceph-users] Mixing disks
For a test cluster we would like to use some 5400rpm and 7200rpm drives.
Is it advisable to customize the configuration as described on this page?
Or is the speed difference so small that this should only be done when
adding SSDs to the same OSD node?

https://www.sebastien-han.fr/blog/2014/08/25/ceph-mix-sata-and-ssd-within-the-same-box/

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Change Partition Schema on OSD Possible?
> On 14 January 2017 at 11:05, Hauke Homburg wrote:
>
> Hello,
>
> In our Ceph cluster the HDDs in the OSDs are configured with GPT
> partitions that use 50% of the disk for data. Can we change this schema
> to have more data storage?

How do you mean?

> Our HDDs are 5TB, so I hope to get more space when I grow the GPT
> partitions from 2TB to 3 or 4TB.

On a 5TB disk only 50% is used for data? What is the other 50% being used
for?

> Can we modify the partitions without reinstalling the server?

Sure! Just like changing any other GPT partition. Don't forget to resize
XFS afterwards with xfs_growfs. However, test this on one OSD/disk first
before doing it on all of them.

Wido

> What's the best way to do this? Boot the node with a rescue CD, change
> the partition with gparted, and boot the server again?
>
> Thanks for help
>
> Regards
>
> Hauke
>
> --
> www.w3-creative.de
> www.westchat.de

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
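As a sketch of the rescue-CD-free route (device name, partition number, target size, and OSD id are all placeholders; stop the OSD first and, as noted above, try one disk before rolling it out everywhere):

```shell
systemctl stop ceph-osd@0            # stop the OSD living on this disk

# Grow partition 1 of /dev/sdX within the existing GPT.
parted /dev/sdX resizepart 1 4TB
partprobe /dev/sdX

# Grow XFS to fill the enlarged partition; xfs_growfs works on the
# mounted filesystem, so point it at the OSD's mount point.
xfs_growfs /var/lib/ceph/osd/ceph-0

systemctl start ceph-osd@0
```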
[ceph-users] Change Partition Schema on OSD Possible?
Hello,

In our Ceph cluster the HDDs in the OSDs are configured with GPT
partitions that use 50% of the disk for data. Can we change this schema to
have more data storage?

Our HDDs are 5TB, so I hope to get more space when I grow the GPT
partitions from 2TB to 3 or 4TB.

Can we modify the partitions without reinstalling the server? What's the
best way to do this? Boot the node with a rescue CD, change the partition
with gparted, and boot the server again?

Thanks for help

Regards

Hauke

--
www.w3-creative.de
www.westchat.de

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] All SSD cluster performance
> On 14 January 2017 at 6:41, Christian Balzer wrote:
>
> Hello,
>
> On Fri, 13 Jan 2017 13:18:35 -0500 Mohammed Naser wrote:
>
> > These Intel SSDs are more than capable of handling the workload; in
> > addition, this cluster is used as an RBD backend for an OpenStack
> > cluster.
>
> I haven't tested the S3520s yet. Being the first 3D NAND offering from
> Intel, they are slightly slower than their predecessors in terms of
> bandwidth and IOPS, but supposedly have a slightly lower latency, if the
> specs are to be believed.
>
> Given the history of Intel DC S SSDs, I think it is safe to assume that
> they use the same or a similar controller setup as their predecessors,
> meaning a large powercap-backed cache which enables them to deal
> correctly and quickly with SYNC and DIRECT writes.
>
> What would worry me slightly more (even at their 960GB size) is the
> endurance of 1 DWPD, which with journals inline comes down to 0.5, and
> with FS overhead and write amplification (which depends a lot on the
> type of operations) you're looking at something around 0.3 DWPD to base
> your expectations on. Mind, that still leaves you with about 9.6TB per
> day, which is a decent enough number, but it only equates to about
> 112MB/s.
>
> Finally, most people start by looking at bandwidth/throughput, when
> ultimately they discover it was IOPS they needed first and foremost.

Yes! Bandwidth isn't what people usually need; they need IOPS. Low latency.

I see a lot of clusters doing 10k ~ 20k IOPS with somewhere around 1Gbit/s
of traffic.

Wido

> Christian
>
> > Sent from my iPhone
> >
> > > On Jan 13, 2017, at 1:08 PM, Somnath Roy wrote:
> > >
> > > Also, there has been a lot of discussion in the community about SSDs
> > > not being suitable for the Ceph write workload (with filestore), as
> > > some are not good at O_DIRECT/O_DSYNC kind of writes. Hope your SSDs
> > > are tolerant of that.
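Christian's back-of-the-envelope endurance numbers check out; the assumption here is decimal units (1TB = 10^12 bytes), and the 9.6TB/day budget is taken from his mail as-is:

```python
# Per-drive write budget after the estimated derating in the mail:
# 1 DWPD -> 0.5 with inline journals -> ~0.3 after FS overhead and
# write amplification.
drive_size_gb = 960
effective_dwpd = 0.3
per_drive_gb_per_day = drive_size_gb * effective_dwpd   # GB/day per drive

# Converting the quoted 9.6 TB/day budget into a sustained write rate.
tb_per_day = 9.6
mb_per_s = tb_per_day * 1e12 / 86400 / 1e6

print(f"{per_drive_gb_per_day:.0f} GB/day per drive, "
      f"{mb_per_s:.0f} MB/s sustained")
```

This comes out at 288 GB/day per drive and roughly 111 MB/s sustained, matching the "about 112MB/s" figure in the mail up to rounding.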
> > > -----Original Message-----
> > > From: Somnath Roy
> > > Sent: Friday, January 13, 2017 10:06 AM
> > > To: 'Mohammed Naser'; Wido den Hollander
> > > Cc: ceph-users@lists.ceph.com
> > > Subject: RE: [ceph-users] All SSD cluster performance
> > >
> > > << Both OSDs are pinned to two cores on the system
> > > Is there any reason you are pinning OSDs like that? I would say for
> > > an object workload there is no need to pin OSDs.
> > > With the configuration you mentioned, Ceph with 4M object PUTs
> > > should be saturating your network first.
> > >
> > > Have you run, say, a 4M object GET to see what bandwidth you are
> > > getting?
> > >
> > > Thanks & Regards
> > > Somnath
> > >
> > > -----Original Message-----
> > > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On
> > > Behalf Of Mohammed Naser
> > > Sent: Friday, January 13, 2017 9:51 AM
> > > To: Wido den Hollander
> > > Cc: ceph-users@lists.ceph.com
> > > Subject: Re: [ceph-users] All SSD cluster performance
> > >
> > >> On Jan 13, 2017, at 12:41 PM, Wido den Hollander wrote:
> > >>
> > >>> On 13 January 2017 at 18:39, Mohammed Naser wrote:
> > >>>
> > >>>> On Jan 13, 2017, at 12:37 PM, Wido den Hollander wrote:
> > >>>>
> > >>>>> On 13 January 2017 at 18:18, Mohammed Naser wrote:
> > >>>>>
> > >>>>> Hi everyone,
> > >>>>>
> > >>>>> We have a deployment with 90 OSDs at the moment, which is all
> > >>>>> SSD, that's not hitting quite the performance that it should be
> > >>>>> in my opinion. A `rados bench` run gives something along these
> > >>>>> numbers:
> > >>>>>
> > >>>>> Maintaining 16 concurrent writes of 4194304 bytes to objects of
> > >>>>> size 4194304 for up to 10 seconds or 0 objects
> > >>>>> Object prefix: benchmark_data_bench.vexxhost._30340
> > >>>>>  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat(s)  avg lat(s)
> > >>>>>    0       0         0         0         0         0            -           0
> > >>>>>    1      16       158       142   568.513       568    0.0965336   0.0939971
> > >>>>>    2      16       287       271   542.191       516    0.0291494    0.107503
> > >>>>>    3      16       375       359    478.75       352    0.0892724    0.118463
> > >>>>>    4      16       477       461   461.042       408    0.0243493    0.126649
> > >>>>>    5      16       540       524   419.216       252     0.239123    0.132195
> > >>>>>    6      16       644       628    418.67       416     0.347606    0.146832
> > >>>>>    7      16       734       718   410.281       360    0.0534447    0.147413
> > >>>>>    8      16       811       795   397.487       308    0.0311927     0.15004
> > >>>>>    9      16       879       863   383.537       272    0.0894534    0.158513
> > >>>>>   10      16       980       964   385.578       404    0.0969865    0.162121
> > >>>>>   11
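With 4M objects and 16 concurrent writers, throughput and latency are tied together; a quick sanity check on the last full row of the bench output above (numbers copied from that row):

```python
# Sanity-check the rados bench output: with 4 MiB objects, MB/s and
# ops/s are directly linked, and Little's law ties ops/s and average
# latency back to the 16 concurrent writers.
object_mb = 4                # 4194304-byte objects
concurrency = 16

avg_mb_s = 385.578           # "avg MB/s" at sec 10
avg_lat_s = 0.162121         # "avg lat(s)" at sec 10

ops_per_s = avg_mb_s / object_mb     # cluster-wide writes per second
in_flight = ops_per_s * avg_lat_s    # ~concurrency if latency-bound

print(f"{ops_per_s:.1f} ops/s, implied in-flight ops: {in_flight:.1f}")
```

This gives about 96 writes/s with the implied in-flight count close to 16, i.e. at this queue depth the result is limited by per-operation latency (~160 ms average here), not by raw bandwidth, which is the point made earlier in the thread about IOPS mattering more than throughput.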