Re: [ceph-users] Poor performance on all SSD cluster

2014-06-24 Thread Mark Kirkwood
On 24/06/14 18:15, Robert van Leeuwen wrote: All of which means that Mysql performance (looking at you, binlog) may still suffer due to lots of small block size sync writes. Which begs the question: Anyone running a reasonably busy Mysql server on Ceph-backed storage? We tried and it did not

Re: [ceph-users] Poor performance on all SSD cluster

2014-06-24 Thread Mark Kirkwood
On 23/06/14 19:16, Mark Kirkwood wrote: For database types (and yes I'm one of those)...you want to know that your writes (particularly your commit writes) are actually making it to persistent storage (that ACID thing you know). Now I see RBD cache very like battery backed RAID cards - your

Re: [ceph-users] Poor performance on all SSD cluster

2014-06-24 Thread Mark Kirkwood
On 24/06/14 23:39, Mark Nelson wrote: On 06/24/2014 03:45 AM, Mark Kirkwood wrote: On 24/06/14 18:15, Robert van Leeuwen wrote: All of which means that Mysql performance (looking at you binlog) may still suffer due to lots of small block size sync writes. Which begs the question: Anyone

Re: [ceph-users] Poor performance on all SSD cluster

2014-06-26 Thread Mark Kirkwood
On 26/06/14 03:15, Josef Johansson wrote: Hi, On 25/06/14 00:27, Mark Kirkwood wrote: Yes - same kind of findings, specifically: - random read and write (e.g index access) faster than local disk - sequential write (e.g batch inserts) similar or faster than local disk - sequential read (e.g

Re: [ceph-users] [URGENT]. Can't connect to CEPH after upgrade from 0.72 to 0.80

2014-07-13 Thread Mark Kirkwood
On 13/07/14 17:07, Andrija Panic wrote: Hi, Sorry to bother, but I have urgent situation: upgraded CEPH from 0.72 to 0.80 (centos 6.5), and now all my CloudStack HOSTS can not connect. I did basic yum update ceph on the first MON leader, and all CEPH services on that HOST, have been restarted

Re: [ceph-users] [URGENT]. Can't connect to CEPH after upgrade from 0.72 to 0.80

2014-07-13 Thread Mark Kirkwood
On 13/07/14 19:15, Mark Kirkwood wrote: On 13/07/14 18:38, Andrija Panic wrote: Any suggestion on need to recompile libvirt ? I got info from Wido, that libvirt does NOT need to be recompiled Thinking about this a bit more - Wido *may* have meant: - *libvirt* does not need

Re: [ceph-users] librbd tuning?

2014-08-05 Thread Mark Kirkwood
On 05/08/14 03:52, Tregaron Bayly wrote: Does anyone have any insight on how we can tune librbd to perform closer to the level of the rbd kernel module? In our lab we have a four node cluster with 1GbE public network and 10GbE cluster network. A client node connects to the public network with

Re: [ceph-users] librbd tuning?

2014-08-05 Thread Mark Kirkwood
On 05/08/14 23:44, Mark Nelson wrote: On 08/05/2014 02:48 AM, Mark Kirkwood wrote: On 05/08/14 03:52, Tregaron Bayly wrote: Does anyone have any insight on how we can tune librbd to perform closer to the level of the rbd kernel module? In our lab we have a four node cluster with 1GbE public

Re: [ceph-users] Using Crucial MX100 for journals or cache pool

2014-08-05 Thread Mark Kirkwood
It claims to have power loss protection, and reviews appear to back this up (http://www.anandtech.com/show/8066/crucial-mx100-256gb-512gb-review). I can't see a capacitor on the board... so I'm not sure of the mechanism Micron are using on these guys. The thing that requires attention would

Re: [ceph-users] Using Crucial MX100 for journals or cache pool

2014-08-05 Thread Mark Kirkwood
A better picture here (http://img1.lesnumeriques.com/test/90/9096/crucial_mx100_512gb_pcb_hq.jpg). A row of small caps clearly visible on right of the left hand image... On 06/08/14 12:40, Mark Kirkwood wrote: It claims to have power loss protection, I can't see a capacitor on the board... so

[ceph-users] Fresh deploy of ceph 0.83 has OSD down

2014-08-06 Thread Mark Kirkwood
Hi, I'm doing a fresh install of ceph 0.83 (src build) to an Ubuntu 14.04 VM using ceph-deploy 1.59. Everything goes well until the osd creation, which fails to start with a journal open error. The steps are shown below (ceph is the deploy target host): (ceph1) $ uname -a Linux ceph1

Re: [ceph-users] ceph-deploy activate actually didn't activate the OSD

2014-08-07 Thread Mark Kirkwood
On 08/08/14 07:07, German Anders wrote: Hi to all, I'm having some issues while trying to deploy an OSD: ceph@cephmon01:~$ *sudo ceph osd tree* # id  weight  type name  up/down  reweight   -1  2.73  root default   -2  2.73  host cephosd01   0  2.73  osd.0

Re: [ceph-users] Fresh deploy of ceph 0.83 has OSD down

2014-08-11 Thread Mark Kirkwood
On 07/08/14 11:06, Mark Kirkwood wrote: Hi, I'm doing a fresh install of ceph 0.83 (src build) to an Ubuntu 14.04 VM using ceph-deploy 1.59. Everything goes well until the osd creation, which fails to start with a journal open error. The steps are shown below (ceph is the deploy target host

Re: [ceph-users] Fresh deploy of ceph 0.83 has OSD down

2014-08-11 Thread Mark Kirkwood
On 11/08/14 20:52, Mark Kirkwood wrote: On 07/08/14 11:06, Mark Kirkwood wrote: Hi, I'm doing a fresh install of ceph 0.83 (src build) to an Ubuntu 14.04 VM using ceph-deploy 1.59. Everything goes well until the osd creation, which fails to start with a journal open error. The steps are shown

Re: [ceph-users] ceph.conf changes and restarting ceph.

2013-09-29 Thread Mark Kirkwood
I think this is not quite right now: Upstart does not require you to define daemon instances in the Ceph configuration file (*although, they are still required for sysvinit should you choose to use it*). I find that simply doing: $ mv upstart sysvinit in the various mon/osd/mds etc dirs

Re: [ceph-users] About the data movement in Ceph

2013-09-29 Thread Mark Kirkwood
You might find it easier to use the python implementation for this (I certainly did). See attached (I was only interested in number of bytes, but the other metrics are available too)! Cheers Mark On 28/09/13 14:48, Zh Chen wrote: And recently i have another questions as follows, 5. I
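
A minimal sketch of the approach described above - polling byte counts through the rados Python bindings rather than parsing CLI output. This is not the script attached to the original mail (the archive does not reproduce it); the conffile path and the 10-second polling interval are assumptions:

    import time
    import rados

    # Connect using the local ceph.conf and the default client.admin keyring.
    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    try:
        while True:
            stats = cluster.get_cluster_stats()        # kb, kb_used, kb_avail, num_objects
            print("cluster kb_used=%s" % stats['kb_used'])
            for pool in cluster.list_pools():
                ioctx = cluster.open_ioctx(pool)
                try:
                    pstats = ioctx.get_stats()         # per-pool: num_bytes, num_objects, ...
                    print("  pool=%s bytes=%s objects=%s"
                          % (pool, pstats['num_bytes'], pstats['num_objects']))
                finally:
                    ioctx.close()
            time.sleep(10)                             # assumed polling interval
    finally:
        cluster.shutdown()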

Re: [ceph-users] ...-all-starter documentation available?

2013-10-09 Thread Mark Kirkwood
Upstart itself could do with better docs :-( I'd recommend starting with 'man initctl', should help clarify things a bit! Cheers Mark On 10/10/13 17:50, John Wilkins wrote: Ceph deployed by ceph-deploy on Ubuntu uses upstart. On Wed, Oct 9, 2013 at 1:48 PM, Snider, Tim tim.sni...@netapp.com

[ceph-users] Very unbalanced osd data placement with differing sized devices

2013-10-16 Thread Mark Kirkwood
I stumbled across this today: 4 osds on 4 hosts (names ceph1 - ceph4). They are KVM guests (this is a play setup). - ceph1 and ceph2 each have a 5G volume for osd data (+ 2G vol for journal) - ceph3 and ceph4 each have a 10G volume for osd data (+ 2G vol for journal) I do a standard

Re: [ceph-users] ceph-deploy zap disk failure

2013-10-18 Thread Mark Kirkwood
I'd guess that your sudo config has a very limited path list. On the target hosts check the 'secure_path' entry in /etc/sudoers. E.g mine is (Ubuntu 13.10): $ sudo grep secure_path /etc/sudoers Defaults secure_path=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin On 19/10/13 03:19,

Re: [ceph-users] SSD question

2013-10-21 Thread Mark Kirkwood
On 22/10/13 15:05, Martin Catudal wrote: Hi, I have purchase my hardware for my Ceph storage cluster but did not open any of my 960GB SSD drive box since I need to answer my question first. Here's my hardware. THREE server Dual 6 core Xeon 2U capable with 8 hotswap tray plus 2 SSD mount

Re: [ceph-users] Radosgw and large files

2013-10-27 Thread Mark Kirkwood
I was looking at the same thing myself, and Boto seems to work ok (tested a 6G file - some sample code attached). Regards Mark On 27/10/13 11:46, Derek Yarnell wrote: Hi Shain, Yes we have tested and have working S3 Multipart support for files 5GB (RHEL64/0.67.4). However, crossftp unless
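
A hedged sketch of the multipart approach discussed in this thread (the sample code attached to the original mail is not shown in the archive; the endpoint, credentials, bucket name, file path and 100 MB part size below are all placeholders):

    import math
    import os
    import boto
    import boto.s3.connection

    # Plain S3-style connection to a radosgw endpoint (placeholder host/keys).
    conn = boto.connect_s3(
        aws_access_key_id='ACCESS',
        aws_secret_access_key='SECRET',
        host='rgw.example.com',
        is_secure=False,
        calling_format=boto.s3.connection.OrdinaryCallingFormat(),
    )
    bucket = conn.create_bucket('bigfiles')

    path = '/tmp/bigfile.bin'                  # placeholder local file
    part_size = 100 * 1024 * 1024              # parts must be >= 5 MB, except the last
    size = os.path.getsize(path)
    nparts = int(math.ceil(size / float(part_size)))

    mp = bucket.initiate_multipart_upload(os.path.basename(path))
    try:
        with open(path, 'rb') as f:
            for i in range(nparts):
                f.seek(i * part_size)
                mp.upload_part_from_file(f, part_num=i + 1,
                                         size=min(part_size, size - i * part_size))
        mp.complete_upload()
    except Exception:
        mp.cancel_upload()                     # avoid leaving orphaned parts behind
        raise

The cancel_upload() on failure matches the later point in this thread about needing something to clean up failed uploads.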

[ceph-users] Radosgw partial gc

2013-10-28 Thread Mark Kirkwood
I have a radosgw instance (ceph 0.71-299-g5cba838 src build), running on Ubuntu 13.10. I've been experimenting with multipart uploads (which are working fine). However while *most* objects (from radosgw perspective) have their storage space gc'd after a while post deletion, I'm seeing what

Re: [ceph-users] radosgw-agent error

2013-10-30 Thread Mark Kirkwood
On 29/10/13 20:53, lixuehui wrote: Hi, list. From the document, a radosgw-agent's correct output should look like this: INFO:radosgw_agent.sync:Starting incremental sync INFO:radosgw_agent.worker:17910 is processing shard number 0 INFO:radosgw_agent.worker:shard 0 has 0 entries

Re: [ceph-users] radosgw-agent error

2013-10-30 Thread Mark Kirkwood
On 31/10/13 06:31, Josh Durgin wrote: Note that the wip in the url means it's a work-in-progress branch, so it's not totally ready yet either. If anything is confusing or missing, let us know. It's great people are interested in trying this early. It's very helpful to find issues sooner (like

Re: [ceph-users] Radosgw and large files

2013-10-30 Thread Mark Kirkwood
to cancel the upload if the program needs to abort - but it is still possible to get failed uploads for other reasons, so it is probably still useful to have something to find any! Cheers Mark On 28/10/13 18:04, Mark Kirkwood wrote: I was looking at the same thing myself, and Boto seems to work ok

Re: [ceph-users] Radosgw partial gc

2013-10-30 Thread Mark Kirkwood
On 29/10/13 18:08, Mark Kirkwood wrote: On 29/10/13 17:46, Yehuda Sadeh wrote: The multipart abort operation is supposed to remove the objects (no gc needed for these). Were there any other issues during the run, e.g., restarted gateways, failed requests, etc.? Note that the objects here

Re: [ceph-users] Radosgw and large files

2013-10-31 Thread Mark Kirkwood
, October 31, 2013 1:27 PM To: Mark Kirkwood; de...@umiacs.umd.edu; ceph-us...@ceph.com Subject: Re: [ceph-users] Radosgw and large files Mark, Thanks for the update. Just an FYI, I ran into an issue using the script when it turned out that the last part of the file was exactly 0 bytes in length

Re: [ceph-users] Radosgw and large files

2013-10-31 Thread Mark Kirkwood
for your enjoyment). Cheers Mark On 01/11/13 09:51, Mark Kirkwood wrote: Blast - I must have some shoddy arithmetic around the bit where I work out the final piece size. I'll experiment... Cheers Mark On 01/11/13 06:35, Shain Miley wrote: PS...I tested the cancel script and it worked like a charm

Re: [ceph-users] Very frustrated with Ceph!

2013-11-04 Thread Mark Kirkwood
On 05/11/13 06:37, Alfredo Deza wrote: On Mon, Nov 4, 2013 at 12:25 PM, Gruher, Joseph R joseph.r.gru...@intel.com wrote: Could these problems be caused by running a purgedata but not a purge? It could be, I am not clear on what the expectation was for just doing purgedata without a purge.

Re: [ceph-users] Very frustrated with Ceph!

2013-11-05 Thread Mark Kirkwood
s...@newdream.net wrote: Purgedata is only meant to be run *after* the package is uninstalled. We should make it do a check to enforce that. Otherwise we run into these problems... Mark Kirkwood mark.kirkw...@catalyst.net.nz wrote

Re: [ceph-users] Very frustrated with Ceph!

2013-11-05 Thread Mark Kirkwood
... forgot to add: maybe 'uninstall' should be a target for ceph-deploy that removes just the actual software daemons... On 06/11/13 14:16, Mark Kirkwood wrote: I think purge of several data-containing packages will ask if you want to destroy that too (Mysql comes to mind - asks if you want

Re: [ceph-users] Very frustrated with Ceph!

2013-11-05 Thread Mark Kirkwood
a bad choice. I'd much rather that users be annoyed with me that they have to go manually clean up old data vs users who can't get their data back without herculean efforts. Mark On 11/05/2013 07:19 PM, Mark Kirkwood wrote: ... forgot to add: maybe 'uninstall' should be target for ceph-deploy

Re: [ceph-users] USB pendrive as boot disk

2013-11-06 Thread Mark Kirkwood
On 07/11/13 13:54, Craig Lewis wrote: On 11/6/13 15:41 , Gandalf Corvotempesta wrote: With the suggested adapter why not using a standard 2.5'' sata disk? Sata for OS should be enough, no need for an ssd At the time, the smallest SSDs were about half the price of the smallest HDDs. My Ceph

Re: [ceph-users] USB pendrive as boot disk

2013-11-06 Thread Mark Kirkwood
On 07/11/13 20:22, ja...@peacon.co.uk wrote: On 2013-11-07 01:02, Mark Kirkwood wrote: The SSD failures I've seen have all been firmware bugs rather than flash wearout. This has the effect that a RAID1 pair are likley to fail at the same time! Very interesting... and good reason to use two

Re: [ceph-users] ceph (deploy?) and drive paths / mounting / best practice.

2013-11-18 Thread Mark Kirkwood
On 19/11/13 18:56, Robert van Leeuwen wrote: Hi, Since the /dev/sdX device location could shuffle things up (and that would mess things up) I'd like to use a more-persistent device path. Since I'd like to be able to replace a disk without adjusting anything (e.g. just formatting the disk)

Re: [ceph-users] How to replace a failed OSD

2013-11-20 Thread Mark Kirkwood
On 20/11/13 22:27, Robert van Leeuwen wrote: Hi, What is the easiest way to replace a failed disk / OSD. It looks like the documentation here is not really compatible with ceph_deploy: http://ceph.com/docs/master/rados/operations/add-or-rm-osds/ It is talking about adding stuff to the

Re: [ceph-users] radosgw-agent AccessDenied 403

2013-11-20 Thread Mark Kirkwood
On 13/11/13 21:16, lixuehui wrote: Hi, list. We've reported before that radosgw-agent data sync failed all the time. We paste the log here to seek any help now. application/json; charset=UTF-8 Wed, 13 Nov 2013 07:24:45 GMT x-amz-copy-source:sss%2Frgwconf /sss/rgwconf

Re: [ceph-users] PG state diagram

2013-11-25 Thread Mark Kirkwood
That's rather cool (very easy to change). However given that the current generated size is kinda a big thumbnail and too small to be actually read meaningfully, would it not make sense to generate a larger resolution version by default and make the current one a link to it? Cheers Mark On

Re: [ceph-users] Failed to execute command: ceph-disk list

2013-12-07 Thread Mark Kirkwood
On 15/11/13 01:40, Alfredo Deza wrote: On Wed, Nov 13, 2013 at 8:30 PM, Xuan Bai baixuan1...@gmail.com wrote: Hi All, I am testing install ceph cluster from ceph-deploy 1.3.2, I get a python error when execute ceph-deploy disk list. Here is my output: [root@ceph-02 my-cluster]# ceph-deploy

Re: [ceph-users] Failed to execute command: ceph-disk list

2013-12-07 Thread Mark Kirkwood
On 08/12/13 12:14, Mark Kirkwood wrote: I wonder if it might be worth adding a check at the start of either ceph-deploy to look for binaries we are gonna need. ...growl: either ceph-deploy *or ceph-disk* was what I was thinking! ___ ceph-users

Re: [ceph-users] Emergency! Production Cluster is down

2013-12-07 Thread Mark Kirkwood
On 08/12/13 19:28, Howie C. wrote: Hello Guys, Tonight when I was trying to remove 2 monitors from the production cluster, everything seems fine but all the sudden I cannot connect to the cluster no more, showing root@mon01:~# ceph mon dump 2013-12-07 22:24:57.693246 7f7ee21cc700 0

Re: [ceph-users] 1MB/s throughput to 33-ssd test cluster

2013-12-08 Thread Mark Kirkwood
On 09/12/13 17:07, Greg Poirier wrote: Hi. So, I have a test cluster made up of ludicrously overpowered machines with nothing but SSDs in them. Bonded 10Gbps NICs (802.3ad layer 2+3 xmit hash policy, confirmed ~19.8 Gbps throughput with 32+ threads). I'm running rados bench, and I am currently

[ceph-users] Noticing lack of saucyness when installing on Ubuntu (particularly with ceph deploy)

2013-12-09 Thread Mark Kirkwood
Just noticed that Ubuntu 13.10 (saucy) is still causing failures when attempting to naively install ceph (in particular when using ceph-deploy). Now I know this is pretty easy to work around (e.g s/saucy/raring/ in ceph.list) but it seems highly undesirable to make installing ceph *harder*

Re: [ceph-users] Failed to execute command: ceph-disk list

2013-12-10 Thread Mark Kirkwood
On 10/12/13 03:34, Alfredo Deza wrote: On Sat, Dec 7, 2013 at 7:17 PM, Mark Kirkwood mark.kirkw...@catalyst.net.nz wrote: On 08/12/13 12:14, Mark Kirkwood wrote: I wonder if it might be worth adding a check at the start of either ceph-deploy to look for binaries we are gonna need. ...growl

Re: [ceph-users] Upcoming Erasure coding

2013-12-24 Thread Mark Kirkwood
On 25/12/13 04:33, Loic Dachary wrote: On 24/12/2013 10:22, Wido den Hollander wrote: IIRC Erasure Encoding doesn't work well with RBD, if it even works at all due to the fact that you can't update a object, but you have to completely rewrite the whole object. So Erasure encoding works

[ceph-users] Inconsistent pgs after update to 0.73 - 0.74

2014-01-09 Thread Mark Kirkwood
I've noticed this on 2 (development) clusters that I have with pools having size 1. I guess my first question would be - is this expected? Here's some detail from one of the clusters: $ ceph -v ceph version 0.74-621-g6fac2ac (6fac2acc5e6f77651ffcd7dc7aa833713517d8a6) $ ceph osd dump epoch 104

Re: [ceph-users] Inconsistent pgs after update to 0.73 - 0.74

2014-01-09 Thread Mark Kirkwood
On 10/01/14 16:18, David Zafman wrote: With pool size of 1 the scrub can still do some consistency checking. These are things like missing attributes, on-disk size doesn’t match attribute size, non-clone without a head, expected clone. You could check the osd logs to see what they were.

Re: [ceph-users] Inconsistent pgs after update to 0.73 - 0.74

2014-01-13 Thread Mark Kirkwood
On 10/01/14 17:16, Mark Kirkwood wrote: On 10/01/14 16:18, David Zafman wrote: With pool size of 1 the scrub can still do some consistency checking. These are things like missing attributes, on-disk size doesn’t match attribute size, non-clone without a head, expected clone. You could check

Re: [ceph-users] Is it possible to have One Ceph-OSD-Daemon managing more than one OSD

2014-02-14 Thread Mark Kirkwood
On 14/02/14 22:07, Vikrant Verma wrote: Hi All, I was trying to define QoS on volumes in the openstack setup. Ceph Cluster is configured as Storage back-end for images and volumes. As part of my experimentation i thought of clubbing few disks (say HDD) with one type of QoS and other few disks

Re: [ceph-users] Cannot create keys for new 0.78 deployment - protocol mismatch

2014-03-24 Thread Mark Kirkwood
Further - checking with 0.77 from 18th Mar shows the same problem, but 0.73 from 12 Dec 2013 does not have this issue. So anyway, looks like it is not a 0.78 problem, but is some sort of problem! Regards Mark On 25/03/14 15:09, Mark Kirkwood wrote: Fresh clone and rebuild results

Re: [ceph-users] Cannot create keys for new 0.78 deployment - protocol mismatch

2014-03-25 Thread Mark Kirkwood
of the new ones, so I wonder if you somehow have an old library floating around somewhere in there which supports everything except for that one feature. -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Mon, Mar 24, 2014 at 9:53 PM, Mark Kirkwood mark.kirkw...@catalyst.net.nz

Re: [ceph-users] Cannot create keys for new 0.78 deployment - protocol mismatch

2014-03-25 Thread Mark Kirkwood
). Regards Mark On 26/03/14 13:26, Mark Kirkwood wrote: I see I have librbd1 and librados2 at 0.72.2 (due to having qemu installed on this machine). That could be the source of the problem, I'll see if I can update them (I have pending updates I think), and report back. Cheers Mark On 26

[ceph-users] Removal of object from erasure coded pool does not free up space

2014-03-31 Thread Mark Kirkwood
I'm taking a look at erasure coded pools in (ceph 0.78-336-gb9e29ca). I'm doing a simple test where I use 'rados put' to load a 1G file into an erasure coded pool, and then 'rados rm' to remove it later. Checking with 'rados df' shows no objects in the pool and no KB, but the object space is

Re: [ceph-users] ceph-deploy progress and CDS session

2013-08-06 Thread Mark Kirkwood
One thing that comes to mind is the ability to create (or activate) osd's with a custom crush specification from (say) a supplied file. Regards Mark On 03/08/13 06:02, Sage Weil wrote: There is a session at CDS scheduled to discuss ceph-deploy (4:40pm PDT on Monday). We'll be going over

Re: [ceph-users] Usage pattern and design of Ceph

2013-08-19 Thread Mark Kirkwood
On 19/08/13 18:17, Guang Yang wrote: 3. Some industry research shows that one issue of file system is the metadata-to-data ratio, in terms of both access and storage, and some technic uses the mechanism to combine small files to large physical files to reduce the ratio (Haystack for

Re: [ceph-users] Usage pattern and design of Ceph

2013-08-19 Thread Mark Kirkwood
On 20/08/13 13:27, Guang Yang wrote: Thanks Mark. What is the design considerations to break large files into 4M chunk rather than storing the large file directly? Quoting Wolfgang from previous reply: = which is a good thing in terms of replication and OSD usage distribution ...which

Re: [ceph-users] Radosgw S3 - can't authenticate user

2013-09-03 Thread Mark Kirkwood
On 03/09/13 15:25, Yehuda Sadeh wrote: Boto prog: #!/usr/bin/python import boto import boto.s3.connection access_key = 'X5E5BXJHCZGGII3HAWBB', secret_key = '' # redacted conn = boto.connect_s3( aws_access_key_id = access_key, aws_secret_access_key =

Re: [ceph-users] New to ceph, auth/permission error

2013-09-05 Thread Mark Kirkwood
On 06/09/13 11:07, Gary Mazzaferro wrote: Hi, I installed the latest ceph and am having an issue with permissions and don't know where to start looking. My Config: (2) osd data nodes (1) monitor node (1) mds node (1) admin node (1) deploy node (1) client node (not configured) All on vmware I

Re: [ceph-users] ceph-deploy state of documentation [was: OSD JOURNAL not associated - ceph-disk list ?]

2014-12-22 Thread Mark Kirkwood
On 22/12/14 07:37, Nico Schottelius wrote: Hello list, I am a bit wondering about ceph-deploy and the development of ceph: I see that many people in the community are pushing towards the use of ceph-deploy, likely to ease use of ceph. However, I have run multiple times into issues using

Re: [ceph-users] xfs/nobarrier

2014-12-27 Thread Mark Kirkwood
On 27/12/14 20:32, Lindsay Mathieson wrote: I see a lot of people mount their xfs osd's with nobarrier for extra performance, certainly it makes a huge difference to my small system. However I don't do it as my understanding is this runs a risk of data corruption in the event of power failure -

Re: [ceph-users] xfs/nobarrier

2014-12-27 Thread Mark Kirkwood
On 28/12/14 15:51, Kyle Bader wrote: do people consider a UPS + Shutdown procedures a suitable substitute? I certainly wouldn't, I've seen utility power fail and the transfer switch fail to transition to UPS strings. Had this happened to me with nobarrier it would have been a very sad day.

Re: [ceph-users] xfs/nobarrier

2014-12-28 Thread Mark Kirkwood
On 29/12/14 02:46, Lindsay Mathieson wrote: On Sat, 27 Dec 2014 09:41:19 PM you wrote: I certainly wouldn't, I've seen utility power fail and the transfer switch fail to transition to UPS strings. Had this happened to me with nobarrier it would have been a very sad day. I'd second that.

Re: [ceph-users] redundancy with 2 nodes

2014-12-31 Thread Mark Kirkwood
The number of monitors recommended and the fact that a voting quorum is the way it works is covered here: http://ceph.com/docs/master/rados/deployment/ceph-deploy-mon/ but I agree that you should probably not get a HEALTH OK status when you have just setup 2 (or in fact any even number of)

Re: [ceph-users] Repetitive builds for Ceph

2015-02-02 Thread Mark Kirkwood
On 03/02/15 01:28, Loic Dachary wrote: On 02/02/2015 13:27, Ritesh Raj Sarraf wrote: By the way, I'm trying to build Ceph from master, on Ubuntu Trusty. I hope that is supported ? Yes, that's also what I have. Same here - in the advent you need to rebuild the whole thing, using

Re: [ceph-users] RGW region metadata sync prevents writes to non-master region

2015-02-02 Thread Mark Kirkwood
On 30/01/15 13:39, Mark Kirkwood wrote: On 30/01/15 12:34, Yehuda Sadeh wrote: On Thu, Jan 29, 2015 at 3:27 PM, Mark Kirkwood mark.kirkw...@catalyst.net.nz wrote: On 30/01/15 11:08, Yehuda Sadeh wrote: How does your regionmap look like? Is it updated correctly on all zones? Regionmap

[ceph-users] RGW Unexpectedly high number of objects in .rgw pool

2015-01-21 Thread Mark Kirkwood
We have a cluster running RGW (Giant release). We've noticed that the .rgw pool has an unexpectedly high number of objects: $ ceph df ... POOLS: NAME ID USED %USED MAX AVAIL OBJECTS ... .rgw.root 5 840 0 29438G

[ceph-users] RGW Enabling non default region on existing cluster - data migration

2015-01-21 Thread Mark Kirkwood
I've been looking at the steps required to enable (say) multi region metadata sync where there is an existing RGW that has been in use (i.e non trivial number of buckets and objects) which been setup without any region parameters. Now given that the existing objects are all in the pools

Re: [ceph-users] mongodb on top of rbd volumes (through krbd) ?

2015-02-12 Thread Mark Kirkwood
On 12/02/15 23:18, Alexandre DERUMIER wrote: What is the behavior of mongo when a shard is unavailable for some reason (crash or network partition) ? If shard3 is on the wrong side of a network partition and uses RBD, it will hang. Is it something that mongo will gracefully handle ? If one

[ceph-users] RGW region metadata sync prevents writes to non-master region

2015-01-28 Thread Mark Kirkwood
Hi, I am following http://docs.ceph.com/docs/master/radosgw/federated-config/ using ceph 0.91 (0.91-665-g6f44f7a): - 2 regions (US and EU). US is the master region - 2 ceph clusters, one per region - 4 zones (us east and west, eu east and west) - 4 hosts (ceph1 + ceph2 being us-west + us-east

Re: [ceph-users] RGW region metadata sync prevents writes to non-master region

2015-01-28 Thread Mark Kirkwood
On 29/01/15 13:58, Mark Kirkwood wrote: However if I try to write to eu-west I get: Sorry - that should have said: However if I try to write to eu-*east* I get: The actual code is (see below) connecting to the endpoint for eu-east (ceph4:80), so seeing it redirected to us-*west* is pretty

Re: [ceph-users] Ceph as backend for Swift

2015-01-09 Thread Mark Kirkwood
It is not too difficult to get going, once you add various patches so it works: - missing __init__.py - Allow to set ceph.conf - Fix write issue: ioctx.write() does not return the written length - Add param to async_update call (for swift in Juno) There are a number of forks/pulls etc for

Re: [ceph-users] Ceph vs Hardware RAID: No battery backed cache

2015-02-10 Thread Mark Kirkwood
On 10/02/15 20:40, Thomas Güttler wrote: Hi, does the lack of a battery backed cache in Ceph introduce any disadvantages? We use PostgreSQL and our servers have UPS. But I want to survive a power outage, although it is unlikely. But hope is not an option ... You can certainly make use of

Re: [ceph-users] Regarding Federated Gateways - Zone Sync Issues

2015-01-07 Thread Mark Kirkwood
On 07/01/15 16:22, Mark Kirkwood wrote: FWIW I can reproduce this too (ceph 0.90-663-ge1384af). The *user* replicates ok (complete with its swift keys and secret). I can authenticate to both zones ok using S3 api (boto version 2.29), but only to the master using swift (swift client versions

Re: [ceph-users] Regarding Federated Gateways - Zone Sync Issues

2015-01-07 Thread Mark Kirkwood
On 06/01/15 06:45, hemant burman wrote: One more thing Yehuda, In radosgw log in Slave Zone: 2015-01-05 17:22:42.188108 7fe4b66d2780 20 enqueued request req=0xbc1f50 2015-01-05 17:22:42.188125 7fe4b66d2780 20 RGWWQ: 2015-01-05 17:22:42.188126 7fe4b66d2780 20 req: 0xbc1f50 2015-01-05

Re: [ceph-users] Regarding Federated Gateways - Zone Sync Issues

2015-01-07 Thread Mark Kirkwood
On 07/01/15 17:43, hemant burman wrote: Hello Yehuda, The issue seems to be with the user data file for the swift subuser not getting synced properly. FWIW, I'm seeing exactly the same thing as well (Hemant - that was well spotted)!

Re: [ceph-users] RGW region metadata sync prevents writes to non-master region

2015-01-29 Thread Mark Kirkwood
On 30/01/15 06:31, Yehuda Sadeh wrote: On Wed, Jan 28, 2015 at 8:04 PM, Mark Kirkwood mark.kirkw...@catalyst.net.nz wrote: On 29/01/15 13:58, Mark Kirkwood wrote: However if I try to write to eu-west I get: Sorry - that should have said: However if I try to write to eu-*east* I get

Re: [ceph-users] RGW region metadata sync prevents writes to non-master region

2015-01-29 Thread Mark Kirkwood
On 30/01/15 11:08, Yehuda Sadeh wrote: How does your regionmap look like? Is it updated correctly on all zones? Regionmap listed below - checking it on all 4 zones produces exactly the same output (md5sum is same): { regions: [ { key: eu, val: {

Re: [ceph-users] RGW region metadata sync prevents writes to non-master region

2015-01-29 Thread Mark Kirkwood
On 30/01/15 12:34, Yehuda Sadeh wrote: On Thu, Jan 29, 2015 at 3:27 PM, Mark Kirkwood mark.kirkw...@catalyst.net.nz wrote: On 30/01/15 11:08, Yehuda Sadeh wrote: How does your regionmap look like? Is it updated correctly on all zones? Regionmap listed below - checking it on all 4 zones

Re: [ceph-users] Building Ceph

2015-04-02 Thread Mark Kirkwood
I think you want to do: $ dch $ dpkg-buildpackage You can muck about with what the package is gonna be called (versions, revisions etc) from dch, without changing the src. Cheers Mark On 03/04/15 10:17, Garg, Pankaj wrote: Hi, I am building Ceph Debian Packages off of the 0.80.9 (latest

Re: [ceph-users] ceph-deploy : systemd unit files not deployed to a centos7 nodes

2015-04-27 Thread Mark Kirkwood
I have just run into this after upgrading to Ubuntu 15.04 and trying to deploy ceph 0.94. Initially tried to get things going by changing relevant code for ceph-deploy and ceph-disk to use systemd for this release - however the unit files in ./systemd do not contain a ceph-create-keys step,

Re: [ceph-users] Help with CEPH deployment

2015-05-03 Thread Mark Kirkwood
On 04/05/15 05:42, Venkateswara Rao Jujjuri wrote: Here is the output..I am still stuck at this step. :( (multiple times tried to by purging and restarting from scratch) vjujjuri@rgulistan-wsl10:~/ceph-cluster$ ceph-deploy mon create-initial [ceph_deploy.conf][DEBUG ] found configuration file

Re: [ceph-users] Rados Gateway and keystone

2015-05-07 Thread Mark Kirkwood
On 07/05/15 20:21, ghislain.cheval...@orange.com wrote: HI all, After adding the nss and the keystone admin url parameters in ceph.conf and creating the openSSL certificates, all is working well. If I had followed the doc and processed by copy/paste, I wouldn't have encountered any

Re: [ceph-users] Help with CEPH deployment

2015-05-04 Thread Mark Kirkwood
On 05/05/15 04:16, Venkateswara Rao Jujjuri wrote: Thanks Mark. I switched to a completely different machine and started from scratch, and things were much smoother this time. Cluster was up in 30 mins. I guess purgedata, droplets and purge is not enough to bring the machine back clean? What I

Re: [ceph-users] replace dead SSD journal

2015-04-18 Thread Mark Kirkwood
Yes, it sure is - my experience with 'consumer' SSD is that they die with obscure firmware bugs (wrong capacity, zero capacity, not detected in bios anymore) rather than flash wearout. It seems that the 'enterprise' tagged drives are less inclined to suffer this fate. Regards Mark On

Re: [ceph-users] OSD trashed by simple reboot (Debian Jessie, systemd?)

2015-06-04 Thread Mark Kirkwood
if this was on a test system)! Cheers Mark On 05/06/15 15:28, Christian Balzer wrote: Hello Mark, On Thu, 04 Jun 2015 20:34:55 +1200 Mark Kirkwood wrote: Sorry Christian, I did briefly wonder, then thought, oh yeah, that fix is already merged in...However - on reflection, perhaps

Re: [ceph-users] OSD trashed by simple reboot (Debian Jessie, systemd?)

2015-06-08 Thread Mark Kirkwood
Mark Kirkwood wrote: Trying out some tests on my pet VMs with 0.80.9 does not elicit any journal failures...However ISTR that running on the bare metal was the most reliable way to reproduce...(proceeding - currently cannot get ceph-deploy to install this configuration...I'll investigate

Re: [ceph-users] OSD trashed by simple reboot (Debian Jessie, systemd?)

2015-06-06 Thread Mark Kirkwood
:49, Christian Balzer wrote: Hello, On Fri, 05 Jun 2015 16:33:46 +1200 Mark Kirkwood wrote: Well, whatever it is, I appear to not be the only one after all: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=773361 Looking quickly at the relevant code: FileJournal::stop_writer() in src/os

Re: [ceph-users] OSD trashed by simple reboot (Debian Jessie, systemd?)

2015-06-08 Thread Mark Kirkwood
)! Cheers Mark On 06/06/15 18:04, Mark Kirkwood wrote: Righty - I'll see if I can replicate what you see if I setup an 0.80.9 cluster using the same workstation hardware (WD Raptors and Intel 520s) that showed up the issue previously at 0.83 (I wonder if I never tried a fresh install using the 0.80

Re: [ceph-users] OSD trashed by simple reboot (Debian Jessie, systemd?)

2015-06-04 Thread Mark Kirkwood
eyeball I think I might be seeing this: --- osd: fix journal direct-io shutdown (#9073 Mark Kirkwood, Ma Jianpeng, Somnath Roy) --- The details in the various related bug reports certainly make it look related. Funny that nobody involved in those bug reports noticed the similarity. Now I wouldn't

Re: [ceph-users] A tiny quesion about the object id

2015-10-05 Thread Mark Kirkwood
If you look at the rados api (e.g http://docs.ceph.com/docs/master/rados/api/python/), there is no explicit call for the object id - the closest is the 'key', which is actually the object's name. If you are using the python bindings you can see this by calling dir() on a rados object and
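
A small illustration of the point above, using the rados Python bindings (the pool name is a placeholder): each object yielded by list_objects() carries its name in the 'key' attribute, and dir() confirms there is no separate object-id field.

    import rados

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    ioctx = cluster.open_ioctx('rbd')          # placeholder pool name
    try:
        for obj in ioctx.list_objects():
            print(obj.key)                     # 'key' is the object's name
            print(dir(obj))                    # no explicit object-id attribute
            break
    finally:
        ioctx.close()
        cluster.shutdown()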

Re: [ceph-users] Incomplete MON removal

2015-07-08 Thread Mark Kirkwood
On 09/07/15 00:03, Steve Thompson wrote: Ceph newbie here; ceph 0.94.2, CentOS 6.6 x86_64. Kernel 2.6.32. Initial test cluster of five OSD nodes, 3 MON, 1 MDS. Working well. I was testing the removal of two MONs, just to see how it works. The second MON was stopped and removed: no problems. The

Re: [ceph-users] purpose of different default pools created by radosgw instance

2015-09-09 Thread Mark Kirkwood
On 16/09/14 17:10, pragya jain wrote: > Hi all! > > As document says, ceph has some default pools for radosgw instance. These > pools are: > * .rgw.root > * .rgw.control > * .rgw.gc > * .rgw.buckets > * .rgw.buckets.index > * .log > * .intent-log >

Re: [ceph-users] purpose of different default pools created by radosgw instance

2015-09-09 Thread Mark Kirkwood
On 10/09/15 11:27, Shinobu Kinjo wrote: > That's a good point actually. > Probably saves our life -; > > Shinobu > > - Original Message - > From: "Ben Hines" <bhi...@gmail.com> > To: "Mark Kirkwood" <mark.kirkw...@catalyst.net.nz>

Re: [ceph-users] Basic object storage question

2015-09-24 Thread Mark Kirkwood
Glance (and friends - Cinder etc) work with the RBD layer, so yeah the big 'devices' visible to Openstack are made up of many (usually 4MB) Rados objects. Cheers Mark On 25/09/15 12:13, Cory Hawkless wrote: > > Upon bolting openstack Glance onto Ceph I can see hundreds of smaller objects >
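
For anyone wanting to see that chunking directly, a hedged sketch using the rbd Python bindings follows. The pool and image names are placeholders (Glance typically keeps images in a pool such as 'images'), and obj_size is normally 4 MB unless overridden at image creation:

    import rados
    import rbd

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    ioctx = cluster.open_ioctx('images')                  # placeholder pool
    img = rbd.Image(ioctx, 'some-glance-image-id')        # placeholder image name
    try:
        info = img.stat()
        # obj_size is the Rados object size the image is striped over
        print("size=%d obj_size=%d num_objs=%d"
              % (info['size'], info['obj_size'], info['num_objs']))
    finally:
        img.close()
        ioctx.close()
        cluster.shutdown()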

[ceph-users] RGW: swift stat double counts objects

2016-01-27 Thread Mark Kirkwood
I'm using swift client talking to ceph 0.94.5 on Ubuntu 14.04: $ swift stat Account: v1 Containers: 0 Objects: 0 Bytes: 0 Server: Apache/2.4.7 (Ubuntu) X-Account-Bytes-Used-Actual: 0

Re: [ceph-users] Spreading deep-scrubbing load

2016-08-19 Thread Mark Kirkwood
On 19/08/16 17:33, Christian Balzer wrote: On Fri, 19 Aug 2016 15:39:13 +1200 Mark Kirkwood wrote: It would be cool to have a command or api to alter/set the last deep scrub timestamp - as it seems to me that the only way to change the distribution of deep scrubs is to perform deep scrubs

Re: [ceph-users] Spreading deep-scrubbing load

2016-08-18 Thread Mark Kirkwood
On 15/06/16 13:18, Christian Balzer wrote: "osd_scrub_min_interval": "86400", "osd_scrub_max_interval": "604800", "osd_scrub_interval_randomize_ratio": "0.5", Latest Hammer and afterwards can randomize things (spreading the load out), but if you want things to happen within a

[ceph-users] Changing the distribution of pgs to be deep-scrubbed

2016-08-25 Thread Mark Kirkwood
Deep scrubbing is a pain point for some (many?) Ceph installations. We have recently been hit by deep scrubbing causing noticeable latency increases to the entire cluster, but only on certain (infrequent) days. This led me to become more interested in the distribution of pg deep scrubs.
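
A minimal sketch of how that distribution can be examined - bucketing every PG by the day of its last deep scrub. It assumes the ceph CLI is on PATH and the Hammer-era JSON layout of 'ceph pg dump' (a top-level pg_stats list); newer releases nest the list differently, which the fallback below tries to cover:

    import json
    import subprocess
    from collections import Counter

    out = subprocess.check_output(['ceph', 'pg', 'dump', '--format', 'json'])
    dump = json.loads(out.decode('utf-8'))

    # Older releases put pg_stats at the top level; some nest it under pg_map.
    pg_stats = dump.get('pg_stats') or dump.get('pg_map', {}).get('pg_stats', [])

    # last_deep_scrub_stamp looks like "2016-08-19 03:12:34.123456"; keep the date part.
    per_day = Counter(p['last_deep_scrub_stamp'].split()[0] for p in pg_stats)
    for day, count in sorted(per_day.items()):
        print("%s  %d pgs" % (day, count))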

Re: [ceph-users] Luminous: ceph mgr crate error - mon disconnected

2017-07-23 Thread Mark Kirkwood
On 22/07/17 23:50, Oscar Segarra wrote: Hi, I have upgraded from kraken version with a simple "yum upgrade command". Later the upgrade, I'd like to deploy the mgr daemon on one node of my ceph infrastrucute. But, for any reason, It gets stuck! Let's see the complete set of commands:

Re: [ceph-users] Luminous: ceph mgr crate error - mon disconnected

2017-07-23 Thread Mark Kirkwood
3.service to /lib/systemd/system/ceph-mgr@.service. [nuc3][INFO ] Running command: sudo systemctl start ceph-mgr@nuc3 [nuc3][INFO ] Running command: sudo systemctl enable ceph.target # Status roger@desktop:~/ceph-cluster$ ceph -s ... services: mon: 3 daemons, quorum nuc1,nuc2,nuc3 mgr: n
