Re: [ceph-users] Is Ceph appropriate for small installations?
On 08/28/2015 05:37 PM, John Spray wrote:

On Fri, Aug 28, 2015 at 3:53 PM, Tony Nelson tnel...@starpoint.com wrote:

I recently built a 3-node Proxmox cluster for my office. I'd like to get HA set up, and the Proxmox book recommends Ceph. I've been reading the documentation and watching videos, and I think I have a grasp of the basics, but I don't need anywhere near a petabyte of storage. I'm considering servers with 12 drive bays: 2 SSDs mirrored for the OS, 2 SSDs for journals, and the other 8 for OSDs. I was going to purchase 3 identical servers and use my 3 Proxmox servers as the monitors, with of course gigabit networking in between. Obviously this is very vague, but I'm just getting started on the research. My concern is that I won't have enough physical disks, and therefore I'll end up with performance issues.

That's impossible to know without knowing what kind of performance you need.

True, true. But I personally think that Ceph doesn't perform well on small (fewer than about 10 node) clusters.

I've seen many petabyte+ builds discussed, but not much on the smaller side. Does anyone have any guides or reference material I may have missed?

The practicalities of fault tolerance are very different in a minimum-size system (e.g. 3 servers configured for 3 replicas):

* When one disk fails, the default rules mean the only place Ceph can re-replicate the PGs that were on that disk is other disks on the same server where the failure occurred. One full disk's worth of data has to flow into that server, preferably quite fast (to avoid the risk of a double failure). Recovering from a 2 TB disk failure will take as long as it takes to stream that much data over your 1 Gbps link, so your recovery time will be similar to conventional RAID unless you install a faster network.

* When one server fails, you lose a full third of your bandwidth. That means your client workloads would have to be sized to use only about 2/3 of the theoretical bandwidth, or you would have to shut down some workloads when a server failed. In larger systems this isn't such a worry, as losing 1 of 32 servers is only a ~3% throughput loss.

Yes, the failure domain should be as small as possible. I prefer that losing one machine amounts to at most 10% of the cluster; with three nodes, each node is a 33.3% failure domain.

Wido

You should compare the price of getting the same amount of disk+RAM but spread across twice as many servers. The other option is of course a traditional dual-ported RAID controller.

John

--
Wido den Hollander
42on B.V.
Ceph trainer and consultant
Phone: +31 (0)20 700 9902
Skype: contact42on
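A quick sanity check on the recovery-time point above, as a sketch (the 2 TB disk and 1 Gbps link are the figures from the thread; real recovery adds seek and protocol overhead on top of this lower bound):

    # Minimum time to move one disk's worth of data over the network:
    # bits to move / link rate, converted to hours.
    echo "scale=1; (2 * 10^12 * 8) / 10^9 / 3600" | bc    # ~4.4 hours for 2 TB at 1 Gbps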
Re: [ceph-users] 1 hour until Ceph Tech Talk
Hi Patrick,

On Thu, Aug 27, 2015 at 12:00 PM, Patrick McGarry pmcga...@redhat.com wrote:

Just a reminder that our Performance Ceph Tech Talk with Mark Nelson will be starting in 1 hour. If you are unable to attend, there will be a recording posted on the Ceph YouTube channel and linked from the page at: http://ceph.com/ceph-tech-talks/

That is an excellent talk, and I am wondering if there's any more info on compiling Ceph with jemalloc. Is that something you would discourage at the moment, or is it OK to try on test systems?

Regards,
Alex
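For anyone wanting to experiment on a test system, a rough sketch of such a build, assuming the hammer-era autotools tree where a --with-jemalloc configure switch selects jemalloc instead of the default tcmalloc (this is an assumption about the build options of that era, not an endorsed procedure; check ./configure --help on your own checkout first):

    git clone --recursive https://github.com/ceph/ceph.git
    cd ceph
    ./autogen.sh
    ./configure --with-jemalloc    # link against jemalloc rather than tcmalloc
    make -j4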
[ceph-users] when one OSD is out of the cluster network, how can the mon make sure this OSD is down?
Hi,

I have 3 OSDs (B, C, D) on the cluster network, 3 replicas, and one monitor (A). [inline diagram omitted]

If OSD B drops off the cluster network (but can still communicate with mon A), then OSDs C and D will report to A that B is down, and mon A will mark B down. At this point, does the ceph-osd process on B kill itself? If the ceph-osd process on B is still alive, it can still send messages to A; would mon A then think B is alive and mark it up again? Can somebody help me?
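For context, the thresholds the monitor applies to those failure reports are tunable; a sketch of inspecting them on a running mon (mon_osd_min_down_reporters and mon_osd_min_down_reports are hammer-era option names, and the admin socket path assumes a monitor with id 'a'):

    # How many distinct OSDs must report a peer down, and how many reports
    # in total, before the mon marks that OSD down.
    ceph --admin-daemon /var/run/ceph/ceph-mon.a.asok config get mon_osd_min_down_reporters
    ceph --admin-daemon /var/run/ceph/ceph-mon.a.asok config get mon_osd_min_down_reports
    # An OSD that sees itself marked down in a newer osdmap normally
    # re-asserts itself to the mon (it does not kill itself).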
Re: [ceph-users] How to back up RGW buckets or RBD snapshots
Thanks Abhishek, will try this. BTW, can anybody give some insight regarding backing up RGW data?

Regards,
Somnath

-----Original Message-----
From: Abhishek L [mailto:abhishek.lekshma...@gmail.com]
Sent: Friday, August 28, 2015 9:55 PM
To: Somnath Roy
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] How to back up RGW buckets or RBD snapshots

Somnath Roy writes:

Hi, I wanted to know how RGW users are backing up bucket contents, so that in a disaster scenario the user can recreate the setup. I know there is geo-replication support, but it could be an expensive proposition. I wanted to know if there is any simple solution, like plugging a traditional backup application into RGW. The same problem applies to RBD as well: how are people backing up RBD snapshots? I am sure production Ceph users have something already in place, and I'd appreciate any suggestions.

Thanks & Regards,
Somnath

As far as RBD is concerned, you could back up the volumes to a different pool (or even a different cluster), as explained here: https://ceph.com/dev-notes/incremental-snapshots-with-rbd/ -- i.e. doing a deep copy first and then copying incremental snapshots on top of that. OpenStack Cinder, for example, supports this functionality in the form of *backups*, allowing backups to a different pool or cluster.

--
Abhishek
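A minimal sketch of the incremental export/import flow that article describes (the pool, image, and snapshot names here are placeholders, and 'backup' is assumed to be the name of a second, reachable cluster):

    # One-time: create a same-sized destination image on the backup cluster.
    rbd --cluster backup create rbd/vm1 --size 10240
    # Initial copy: snapshot the source and ship everything up to that snapshot.
    rbd snap create rbd/vm1@snap1
    rbd export-diff rbd/vm1@snap1 - | rbd --cluster backup import-diff - rbd/vm1
    # Subsequent backups only move blocks changed since the previous snapshot.
    rbd snap create rbd/vm1@snap2
    rbd export-diff --from-snap snap1 rbd/vm1@snap2 - | rbd --cluster backup import-diff - rbd/vm1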
Re: [ceph-users] Opensource plugin for pulling out cluster recovery and client IO metric
Date: Fri, 28 Aug 2015 12:07:39 +0100
From: gfar...@redhat.com
To: vickey.singh22...@gmail.com
CC: ceph-users@lists.ceph.com; ceph-us...@ceph.com; ceph-de...@vger.kernel.org
Subject: Re: [ceph-users] Opensource plugin for pulling out cluster recovery and client IO metric

On Mon, Aug 24, 2015 at 4:03 PM, Vickey Singh vickey.singh22...@gmail.com wrote:

Hello Ceph Geeks,

I am planning to develop a Python plugin that pulls out cluster recovery IO and client IO metrics, which can then be used with collectd. For example, I need to extract values like these:

    recovery io 814 MB/s, 101 objects/s
    client io 85475 kB/s rd, 1430 kB/s wr, 32 op/s

The calculation *window* for those stats is very small; IIRC they are two PG versions, which most likely map to two seconds (an average of the last two seconds). You may increase mon_stat_smooth_intervals to enlarge the window, but I didn't try it myself. I found that 'ceph status -f json' has better-formatted output and more information.

Could you please help me understand how the ceph -s and ceph -w outputs print cluster recovery IO and client IO information? Where is this information coming from -- is it coming from perf dump? If yes, which section of the perf dump output should I focus on? If not, how can I get these values? I tried ceph --admin-daemon /var/run/ceph/ceph-osd.48.asok perf dump, but it generates a huge amount of information and I am confused about which section of the output I should use.

Perf counters carry a ton of information and need time to understand in detail, but if the purpose is just to dump them as they are and do better aggregation/reporting, you can check 'perf schema' first to get the type of each field, then cross-check the perf counter's definition for each type to decide how to collect and aggregate the data.

This information is generated only on the monitors, based on PG stats from the OSDs; it is slightly laggy and can be most easily accessed by calling ceph -s on a regular basis. You can get JSON output that is easier to parse, and you can optionally set up an API server for more programmatic access. I'm not sure on the details of doing that last, though.
-Greg
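A sketch of pulling those rates out of the JSON status for a collectd plugin (the pgmap field names shown match hammer-era output but have changed across releases, so verify them against your own cluster first; jq is assumed to be installed):

    # Client and recovery IO rates live under .pgmap in 'ceph status -f json'.
    ceph status -f json | jq '.pgmap | {read_bytes_sec, write_bytes_sec, op_per_sec,
        recovering_bytes_per_sec, recovering_objects_per_sec}'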
[ceph-users] PGs stuck stale during data migration and OSD restart
Dear all,

During a cluster reconfiguration (changing crush tunables from legacy to TUNABLES2) with a large amount of data movement, several OSDs got overloaded and had to be restarted. When the OSDs stabilized, I was left with a number of PGs marked stale, even though all the OSDs where this data used to be located are up again. When I look in the current/ directory of the OSDs for the last placement, there is still some data, but the PGs never come back. Is there any way to force these OSDs to resume serving this data?

Regards.
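Not from the thread, but the usual first diagnostic steps for stale PGs, as a sketch (the PG id used here is a placeholder):

    # List the PGs stuck stale, then see which OSDs one of them maps to.
    ceph pg dump_stuck stale
    ceph pg map 2.1f       # 2.1f is a placeholder PG id from the dump above
    # Query the PG for its peering/recovery state and any blocking OSDs.
    ceph pg 2.1f query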
[ceph-users] Ceph-deploy error
Hi, I'm trying to install ceph for the first time following the quick installation guide. I'm getting the below error, can someone please help?

    ceph-deploy install --release=firefly ceph-vm-mon1
    [ceph_deploy.conf][DEBUG ] found configuration file at: /home/cloud-user/.cephdeploy.conf
    [ceph_deploy.cli][INFO  ] Invoked (1.5.28): /usr/bin/ceph-deploy install --release=firefly ceph-vm-mon1
    [ceph_deploy.cli][INFO  ] ceph-deploy options:
    [ceph_deploy.cli][INFO  ]  verbose          : False
    [ceph_deploy.cli][INFO  ]  testing          : None
    [ceph_deploy.cli][INFO  ]  cd_conf          : <ceph_deploy.conf.cephdeploy.Conf instance at 0xaab248>
    [ceph_deploy.cli][INFO  ]  cluster          : ceph
    [ceph_deploy.cli][INFO  ]  install_mds      : False
    [ceph_deploy.cli][INFO  ]  stable           : None
    [ceph_deploy.cli][INFO  ]  default_release  : False
    [ceph_deploy.cli][INFO  ]  username         : None
    [ceph_deploy.cli][INFO  ]  adjust_repos     : True
    [ceph_deploy.cli][INFO  ]  func             : <function install at 0x7f34b410e938>
    [ceph_deploy.cli][INFO  ]  install_all      : False
    [ceph_deploy.cli][INFO  ]  repo             : False
    [ceph_deploy.cli][INFO  ]  host             : ['ceph-vm-mon1']
    [ceph_deploy.cli][INFO  ]  install_rgw      : False
    [ceph_deploy.cli][INFO  ]  repo_url         : None
    [ceph_deploy.cli][INFO  ]  ceph_conf        : None
    [ceph_deploy.cli][INFO  ]  install_osd      : False
    [ceph_deploy.cli][INFO  ]  version_kind     : stable
    [ceph_deploy.cli][INFO  ]  install_common   : False
    [ceph_deploy.cli][INFO  ]  overwrite_conf   : False
    [ceph_deploy.cli][INFO  ]  quiet            : False
    [ceph_deploy.cli][INFO  ]  dev              : master
    [ceph_deploy.cli][INFO  ]  local_mirror     : None
    [ceph_deploy.cli][INFO  ]  release          : firefly
    [ceph_deploy.cli][INFO  ]  install_mon      : False
    [ceph_deploy.cli][INFO  ]  gpg_url          : None
    [ceph_deploy.install][DEBUG ] Installing stable version firefly on cluster ceph hosts ceph-vm-mon1
    [ceph_deploy.install][DEBUG ] Detecting platform for host ceph-vm-mon1 ...
    [ceph-vm-mon1][DEBUG ] connection detected need for sudo
    [ceph-vm-mon1][DEBUG ] connected to host: ceph-vm-mon1
    [ceph-vm-mon1][DEBUG ] detect platform information from remote host
    [ceph-vm-mon1][DEBUG ] detect machine type
    [ceph_deploy.install][INFO  ] Distro info: Red Hat Enterprise Linux Server 7.1 Maipo
    [ceph-vm-mon1][INFO  ] installing Ceph on ceph-vm-mon1
    [ceph-vm-mon1][INFO  ] Running command: sudo yum clean all
    [ceph-vm-mon1][DEBUG ] Loaded plugins: fastestmirror, priorities
    [ceph-vm-mon1][DEBUG ] Cleaning repos: epel rhel-7-ha-rpms rhel-7-optional-rpms rhel-7-server-rpms
    [ceph-vm-mon1][DEBUG ]               : rhel-7-supplemental-rpms
    [ceph-vm-mon1][DEBUG ] Cleaning up everything
    [ceph-vm-mon1][DEBUG ] Cleaning up list of fastest mirrors
    [ceph-vm-mon1][INFO  ] Running command: sudo yum -y install epel-release
    [ceph-vm-mon1][DEBUG ] Loaded plugins: fastestmirror, priorities
    [ceph-vm-mon1][DEBUG ] Determining fastest mirrors
    [ceph-vm-mon1][DEBUG ]  * epel: kdeforge2.unl.edu
    [ceph-vm-mon1][DEBUG ]  * rhel-7-ha-rpms: rhel-repo.eu-biere-1.t-systems.cloud.cisco.com
    [ceph-vm-mon1][DEBUG ]  * rhel-7-optional-rpms: rhel-repo.eu-biere-1.t-systems.cloud.cisco.com
    [ceph-vm-mon1][DEBUG ]  * rhel-7-server-rpms: rhel-repo.eu-biere-1.t-systems.cloud.cisco.com
    [ceph-vm-mon1][DEBUG ]  * rhel-7-supplemental-rpms: rhel-repo.eu-biere-1.t-systems.cloud.cisco.com
    [ceph-vm-mon1][DEBUG ] Package epel-release-7-5.noarch already installed and latest version
    [ceph-vm-mon1][DEBUG ] Nothing to do
    [ceph-vm-mon1][INFO  ] Running command: sudo yum -y install yum-plugin-priorities
    [ceph-vm-mon1][DEBUG ] Loaded plugins: fastestmirror, priorities
    [ceph-vm-mon1][DEBUG ] Loading mirror speeds from cached hostfile
    [ceph-vm-mon1][DEBUG ]  * epel: kdeforge2.unl.edu
    [ceph-vm-mon1][DEBUG ]  * rhel-7-ha-rpms: rhel-repo.eu-biere-1.t-systems.cloud.cisco.com
    [ceph-vm-mon1][DEBUG ]  * rhel-7-optional-rpms: rhel-repo.eu-biere-1.t-systems.cloud.cisco.com
    [ceph-vm-mon1][DEBUG ]  * rhel-7-server-rpms: rhel-repo.eu-biere-1.t-systems.cloud.cisco.com
    [ceph-vm-mon1][DEBUG ]  * rhel-7-supplemental-rpms: rhel-repo.eu-biere-1.t-systems.cloud.cisco.com
    [ceph-vm-mon1][DEBUG ] Package yum-plugin-priorities-1.1.31-29.el7.noarch already installed and latest version
    [ceph-vm-mon1][DEBUG ] Nothing to do
[ceph-users] OSD won't go up after node reboot
I'm running a 3-node cluster with Ceph (it's a Deis cluster, so the Ceph daemons are containerized). There are 3 OSDs and 3 mons. After rebooting all nodes one by one, all monitors are up, but only two of the three OSDs are up. The 'down' OSD process is actually running but is never marked up/in. All three mons are reachable from inside the OSD container.

I've run `log dump` for this OSD and found this line:

    Aug 29 06:19:39 staging-coreos-1 sh[7393]: -99 2015-08-29 06:18:51.855432 7f5902009700 3 osd.0 0 handle_osd_map epochs [1,90], i have 0, src has [1,90]

Is this the reason the OSD cannot join the cluster? If yes, why could it have happened? I haven't removed any data from /var/lib/ceph/osd. Is it possible to bring this OSD back into the cluster without completely recreating it?

The Ceph version is:

    root@staging-coreos-1:/# ceph -v
    ceph version 0.94.2 (5fb85614ca8f354284c713a2f9c610860720bbf3)
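Not from the thread, but a sketch of how one might narrow this down ("i have 0" in that handle_osd_map line means the OSD believes it holds no osdmap epochs at all, i.e. it is starting from an empty map store; the admin socket path below is an assumption for osd.0 and may differ in a containerized setup):

    # What the cluster believes about the OSD.
    ceph osd tree
    # What the OSD itself reports: boot state plus oldest/newest map epochs.
    ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok status
    # The map store lives under the OSD data directory, so check that the
    # container actually mounted the old /var/lib/ceph/osd/ceph-0 rather
    # than starting with a fresh, empty volume.
    ls /var/lib/ceph/osd/ceph-0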
Re: [ceph-users] Fwd: [Ceph-community]Improve Read Performance
Yes, reads are served from the primary OSDs. Adding OSD nodes should definitely increase your performance, but you first need to see whether you are already getting the desired performance (or have maxed out) with the existing cluster. Please give some more information about your cluster, and let us know what IOPS you are getting and at what block size.

Thanks & Regards,
Somnath

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Le Quang Long
Sent: Saturday, August 29, 2015 9:50 AM
To: ceph-users@lists.ceph.com
Subject: [ceph-users] Fwd: [Ceph-community]Improve Read Performance

Hello,

I have a question about Ceph read IOPS. I'm not sure how Ceph handles read requests: do only primary OSDs, or all replica OSDs, serve reads? And can I increase read IOPS by adding OSD nodes to the cluster? If not, is there another way to do it?

Thanks and regards.
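One way to establish that baseline, as a sketch ('testpool' is a placeholder pool name; rados bench needs a write run kept with --no-cleanup before the read benchmarks can run):

    # Write for 60 s with the default 16 concurrent 4 MB ops, keeping the objects.
    rados bench -p testpool 60 write --no-cleanup
    # Sequential and random read benchmarks against the objects just written.
    rados bench -p testpool 60 seq
    rados bench -p testpool 60 rand
    # Remove the benchmark objects afterwards.
    rados -p testpool cleanup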
[ceph-users] Fwd: [Ceph-community]Improve Read Performance
Hello,

I have a question about Ceph read IOPS. I'm not sure how Ceph handles read requests: do only primary OSDs, or all replica OSDs, serve reads? And can I increase read IOPS by adding OSD nodes to the cluster? If not, is there another way to do it?

Thanks and regards.
Re: [ceph-users] a couple of radosgw questions
I'm not the OP, but in my particular case gc is proceeding normally (since 0.94.2, I think) -- I just have millions of older objects (months old) which will not go away. (See my other post: http://lists.ceph.com/pipermail/ceph-users-ceph.com/2015-August/003967.html)

-Ben

On Fri, Aug 28, 2015 at 5:14 PM, Brad Hubbard bhubb...@redhat.com wrote:

----- Original Message -----
From: Ben Hines bhi...@gmail.com
To: Brad Hubbard bhubb...@redhat.com
Cc: Tom Deneau tom.den...@amd.com, ceph-users ceph-us...@ceph.com
Sent: Saturday, 29 August, 2015 9:49:00 AM
Subject: Re: [ceph-users] a couple of radosgw questions

    16:22:38 root@sm-cephrgw4 /etc/ceph $ radosgw-admin temp remove
    unrecognized arg remove
    usage: radosgw-admin <cmd> [options...]
    commands:
      temp remove    remove temporary objects that were created up to
                     specified date (and optional time)

Looking into this ambiguity, thanks.

On Fri, Aug 28, 2015 at 4:24 PM, Brad Hubbard bhubb...@redhat.com wrote:

When I remove an object, it is no longer visible from the S3 API, but the objects that comprised it are still there in the .rgw.buckets pool. When do they get removed? Does the following command remove them? http://ceph.com/docs/master/radosgw/purge-temp/

Does radosgw-admin gc list show anything?
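For anyone chasing the same thing, a sketch of inspecting and forcing RGW garbage collection (hammer-era radosgw-admin subcommands; --include-all also lists entries whose expiration time has not yet been reached):

    # List objects queued for garbage collection, including not-yet-due entries.
    radosgw-admin gc list --include-all
    # Run a garbage collection pass now instead of waiting for the gc timer.
    radosgw-admin gc process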