Re: [ceph-users] performance tests

2014-07-10 Thread Xabier Elkano
On 09/07/14 16:53, Christian Balzer wrote: On Wed, 09 Jul 2014 07:07:50 -0500 Mark Nelson wrote: On 07/09/2014 06:52 AM, Xabier Elkano wrote: On 09/07/14 13:10, Mark Nelson wrote: On 07/09/2014 05:57 AM, Xabier Elkano wrote: Hi, I was doing some tests in my cluster with the fio tool,

Re: [ceph-users] performance tests

2014-07-10 Thread Christian Balzer
On Thu, 10 Jul 2014 08:57:56 +0200 Xabier Elkano wrote: On 09/07/14 16:53, Christian Balzer wrote: On Wed, 09 Jul 2014 07:07:50 -0500 Mark Nelson wrote: On 07/09/2014 06:52 AM, Xabier Elkano wrote: On 09/07/14 13:10, Mark Nelson wrote: On 07/09/2014 05:57 AM, Xabier Elkano wrote:

Re: [ceph-users] performance tests

2014-07-10 Thread Xabier Elkano
On 10/07/14 09:18, Christian Balzer wrote: On Thu, 10 Jul 2014 08:57:56 +0200 Xabier Elkano wrote: On 09/07/14 16:53, Christian Balzer wrote: On Wed, 09 Jul 2014 07:07:50 -0500 Mark Nelson wrote: On 07/09/2014 06:52 AM, Xabier Elkano wrote: On 09/07/14 13:10, Mark Nelson wrote: On

Re: [ceph-users] Some OSD and MDS crash

2014-07-10 Thread Pierre BLONDEAU
Hi, Great. All my OSDs restarted: osdmap e438044: 36 osds: 36 up, 36 in. All PGs are active and some are in recovery: 1604040/49575206 objects degraded (3.236%) 1780 active+clean 17 active+degraded+remapped+backfilling 61 active+degraded+remapped+wait_backfill 11 active+clean+scrubbing+deep
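
Recovery progress like this can be tracked from any admin node with the standard status commands, for example:

    # repeat (or use ceph -w) until all PGs are back to active+clean
    ceph -s
    ceph health detail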

Re: [ceph-users] performance tests

2014-07-10 Thread Mark Nelson
On 07/09/2014 09:53 AM, Christian Balzer wrote: On Wed, 09 Jul 2014 07:07:50 -0500 Mark Nelson wrote: On 07/09/2014 06:52 AM, Xabier Elkano wrote: On 09/07/14 13:10, Mark Nelson wrote: On 07/09/2014 05:57 AM, Xabier Elkano wrote: Hi, I was doing some tests in my cluster with the fio tool,

Re: [ceph-users] performance tests

2014-07-10 Thread Mark Nelson
On 07/10/2014 03:24 AM, Xabier Elkano wrote: On 10/07/14 09:18, Christian Balzer wrote: On Thu, 10 Jul 2014 08:57:56 +0200 Xabier Elkano wrote: On 09/07/14 16:53, Christian Balzer wrote: On Wed, 09 Jul 2014 07:07:50 -0500 Mark Nelson wrote: On 07/09/2014 06:52 AM, Xabier Elkano

Re: [ceph-users] ceph.com centos7 repository ?

2014-07-10 Thread Erik Logtenberg
Hi, the RHEL7 repository works just as well. CentOS 7 is effectively a copy of RHEL7 anyway; packages for CentOS 7 wouldn't actually be any different. Erik. On 07/10/2014 06:14 AM, Alexandre DERUMIER wrote: Hi, I would like to know if a CentOS 7 repository will be available soon? Or can I
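
A minimal sketch of a yum repo file pointing CentOS 7 at the RHEL7 packages; the baseurl and gpgkey reflect the ceph.com layout of the time and the firefly release name is an assumption, so verify the paths before use:

    # /etc/yum.repos.d/ceph.repo
    [ceph]
    name=Ceph packages for $basearch
    baseurl=http://ceph.com/rpm-firefly/rhel7/$basearch
    enabled=1
    gpgcheck=1
    gpgkey=https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc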

Re: [ceph-users] Temporary degradation when adding OSD's

2014-07-10 Thread Erik Logtenberg
Yeah, Ceph will never voluntarily reduce the redundancy. I believe splitting the degraded state into separate wrongly placed and degraded (reduced redundancy) states is currently on the menu for the Giant release, but it's not been done yet. That would greatly improve the accuracy of ceph's

[ceph-users] [ANN] ceph-deploy 1.5.8 released

2014-07-10 Thread Alfredo Deza
Hi All, There is a new bug-fix release of ceph-deploy, the easy deployment tool for Ceph. The full list of fixes for this release can be found in the changelog: http://ceph.com/ceph-deploy/docs/changelog.html#id1 Make sure you update!
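
How to update depends on how ceph-deploy was installed; two common paths (assuming either a pip install or the ceph.com Debian/Ubuntu repo):

    # pip-based install
    pip install --upgrade ceph-deploy
    # package-based install on Debian/Ubuntu
    sudo apt-get update && sudo apt-get install ceph-deploy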

Re: [ceph-users] scrub error on firefly

2014-07-10 Thread Travis Rhoden
I can also say that after a recent upgrade to Firefly, I have experienced a massive uptick in scrub errors. The cluster was on cuttlefish for about a year, and had maybe one or two scrub errors. After upgrading to Firefly, we've probably seen 3 to 4 dozen in the last month or so (was getting 2-3 a

Re: [ceph-users] scrub error on firefly

2014-07-10 Thread Travis Rhoden
And actually just to follow-up, it does seem like there are some additional smarts beyond just using the primary to overwrite the secondaries... Since I captured md5 sums before and after the repair, I can say that in this particular instance, the secondary copy was used to overwrite the primary.
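
A rough sketch of that kind of before/after check, assuming filestore OSDs; the PG id and object-name pattern below are hypothetical:

    # which OSDs hold the suspect PG?
    ceph pg map 3.75
    # on each of those hosts, checksum the object's on-disk file before the repair
    find /var/lib/ceph/osd/ceph-5/current/3.75_head -name '*rb.0.1234*' | xargs md5sum
    # run the repair, then re-run the checksums and compare
    ceph pg repair 3.75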

[ceph-users] Ceph wthin vmware-cluster

2014-07-10 Thread Marc Hansen
Hi, we are trying to set up a web cluster; could somebody give a hint? Three web servers running Typo3, with the Typo3 cache on central storage backed by Ceph. Would this be useful within a VMware cluster? I need something different from a central NFS store; that is too slow. Regards Marc

Re: [ceph-users] Creating a bucket on a non-master region in a multi-region configuration with unified namespace/replication

2014-07-10 Thread Bachelder, Kurt
OK - I found this information (http://comments.gmane.org/gmane.comp.file-systems.ceph.user/4992): When creating a bucket users can specify which placement target they want to use with that specific bucket (by using the S3 create bucket location constraints field, format is
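
As a concrete illustration of the quoted advice, the location-constraint field can be passed with s3cmd's --bucket-location flag; the bucket, region and placement-target names here are hypothetical:

    s3cmd mb s3://my-bucket --bucket-location=us-secondary:ssd-placement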

[ceph-users] ceph mons die unexpected

2014-07-10 Thread Iban Cabrillo
Hi, I am having some trouble with ceph mon stability. Every couple of days the mons die. I only see this error in the logs: 2014-07-08 14:24:53.056805 7f713bb5b700 -1 mon.cephmon02@1(peon) e2 *** Got Signal Interrupt *** 2014-07-08 14:24:53.061795 7f713bb5b700 1 mon.cephmon02@1(peon) e2
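
To gather more context around crashes like this, one option is to raise monitor debug logging at runtime (the mon name is taken from the log above) and check syslog on that host for whatever is delivering the signal:

    ceph tell mon.cephmon02 injectargs '--debug-mon 10 --debug-ms 1'
    # or persistently in ceph.conf under [mon]: debug mon = 10, debug ms = 1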

[ceph-users] inktank-mellanox webinar access ?

2014-07-10 Thread Alexandre DERUMIER
Hi, sorry to spam the mailing list, but there is an Inktank/Mellanox webinar in 10 minutes, and I have not received access even though I registered yesterday (same for my co-worker), and the Mellanox webinar contact email (conta...@mellanox.com) does not exist. Maybe somebody from

Re: [ceph-users] scrub error on firefly

2014-07-10 Thread Christian Eichelmann
I can also confirm that after upgrading to firefly, both of our clusters (test and live) went from 0 scrub errors each for about 6 months to about 9-12 per week... This also makes me kind of nervous, since as far as I know all that ceph pg repair does is copy the primary object to all

[ceph-users] Placing different pools on different OSDs in the same physical servers

2014-07-10 Thread Nikola Pajtic
Hello to all, I was wondering whether it is possible to place different pools on different OSDs, but using only two physical servers? I was thinking about this: http://tinypic.com/r/30tgt8l/8 I would like to use osd.0 and osd.1 for the Cinder/RBD pool, and osd.2 and osd.3 for Nova instances. I was
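
Separating pools onto specific OSDs is usually done with custom CRUSH hierarchies and rules; a minimal sketch of the standard workflow, where the bucket names, rule ids and pool names are assumptions for illustration:

    # extract and decompile the current CRUSH map
    ceph osd getcrushmap -o crushmap.bin
    crushtool -d crushmap.bin -o crushmap.txt
    # edit crushmap.txt: e.g. add one root containing osd.0/osd.1 and another
    # containing osd.2/osd.3, each with its own replicated rule
    crushtool -c crushmap.txt -o crushmap.new
    ceph osd setcrushmap -i crushmap.new
    # point each pool at its rule (rule ids assumed to be 1 and 2)
    ceph osd pool set volumes crush_ruleset 1
    ceph osd pool set vms crush_ruleset 2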

Re: [ceph-users] inktank-mellanox webinar access ?

2014-07-10 Thread Georgios Dimitrakakis
The same here... neither I nor my colleagues have received access. G. On Thu, 10 Jul 2014 16:55:22 +0200 (CEST), Alexandre DERUMIER wrote: Hi, sorry to spam the mailing list, but there is an Inktank/Mellanox webinar in 10 minutes, and I have not received access even though I registered yesterday (same for my

Re: [ceph-users] inktank-mellanox webinar access ?

2014-07-10 Thread Alexandre DERUMIER
OK, sorry, we finally received the login, a bit late. Sorry again for spamming the mailing list. - Original Message - From: Alexandre DERUMIER aderum...@odiso.com To: ceph-users ceph-us...@ceph.com Sent: Thursday, 10 July 2014 16:55:22 Subject: [ceph-users] inktank-mellanox webinar access

Re: [ceph-users] Temporary degradation when adding OSD's

2014-07-10 Thread Gregory Farnum
On Thursday, July 10, 2014, Erik Logtenberg e...@logtenberg.eu wrote: Yeah, Ceph will never voluntarily reduce the redundancy. I believe splitting the degraded state into separate wrongly placed and degraded (reduced redundancy) states is currently on the menu for the Giant release, but

[ceph-users] logrotate

2014-07-10 Thread James Eckersall
Hi, I've just upgraded a ceph cluster from Ubuntu 12.04 with 0.72.1 to Ubuntu 14.04 with 0.80.1. I've noticed that the log rotation doesn't appear to work correctly. The OSDs are just not logging to the current ceph-osd-X.log file. If I restart the OSDs or run service ceph-osd reload id=X,
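
One possible workaround (not necessarily what was used here) is to have logrotate tell the daemons to reopen their log files via SIGHUP; a sketch of such a postrotate stanza, assuming the stock /var/log/ceph/*.log layout:

    /var/log/ceph/*.log {
        rotate 7
        daily
        compress
        sharedscripts
        postrotate
            killall -q -1 ceph-osd ceph-mon ceph-mds || true
        endscript
        missingok
        notifempty
    }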

Re: [ceph-users] inktank-mellanox webinar access ?

2014-07-10 Thread Georgios Dimitrakakis
That makes two of us... G. On Thu, 10 Jul 2014 17:12:08 +0200 (CEST), Alexandre DERUMIER wrote: OK, sorry, we finally received the login, a bit late. Sorry again for spamming the mailing list. - Original Message - From: Alexandre DERUMIER aderum...@odiso.com To: ceph-users

[ceph-users] Suggested best practise for Ceph node online/offline?

2014-07-10 Thread Joe Hewitt
Hi there, recently I hit a problem triggered by rebooting Ceph nodes, which eventually ended with rebuilding the cluster from the ground up. The too-long-didn't-read question here is: are there suggested best practices for taking a Ceph node online/offline? Following the official ceph doc, I set up a 4 node ceph (firefly)
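
For reference, the commonly recommended pattern for a planned node reboot is to stop the cluster from marking that node's OSDs out while it is down:

    ceph osd set noout
    # reboot the node and wait for its OSDs to come back up
    ceph osd unset noout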

Re: [ceph-users] Suggested best practise for Ceph node online/offline?

2014-07-10 Thread Gregory Farnum
On Thu, Jul 10, 2014 at 9:04 AM, Joe Hewitt joe.z.hew...@gmail.com wrote: Hi there Recently I got a problem triggered by rebooting ceph nodes, which eventually wound up by rebuilding from ground up. A too-long-don't-read question here is: is there suggested best practices for online/offline

Re: [ceph-users] scrub error on firefly

2014-07-10 Thread Samuel Just
Can you attach your ceph.conf for your osds? -Sam On Thu, Jul 10, 2014 at 8:01 AM, Christian Eichelmann christian.eichelm...@1und1.de wrote: I can also confirm that after upgrading to firefly both of our clusters (test and live) were going from 0 scrub errors each for about 6 Month to about

Re: [ceph-users] scrub error on firefly

2014-07-10 Thread Chahal, Sudip
I've a basic related question re: Firefly operation - would appreciate any insights: With three replicas, if checksum inconsistencies across replicas are found during deep-scrub then: a. does the majority win or is the primary always the winner and used to overwrite the secondaries

Re: [ceph-users] scrub error on firefly

2014-07-10 Thread Samuel Just
Repair I think will tend to choose the copy with the lowest osd number which is not obviously corrupted. Even with three replicas, it does not do any kind of voting at this time. -Sam On Thu, Jul 10, 2014 at 10:39 AM, Chahal, Sudip sudip.cha...@intel.com wrote: I've a basic related question re:

Re: [ceph-users] scrub error on firefly

2014-07-10 Thread Chahal, Sudip
Thanks - so it appears that the advantage of the 3rd replica (relative to 2 replicas) has to do much more with recovering from two concurrent OSD failures than with inconsistencies found during deep scrub - would you agree? Re: repair - do you mean the repair process during deep scrub - if

Re: [ceph-users] scrub error on firefly

2014-07-10 Thread Randy Smith
Greetings, Just a follow up on my original issue. =ceph pg repair ...= fixed the problem. However, today I got another inconsistent pg. It's interesting to me that this second error is in the same rbd image and appears to be close to the previously inconsistent pg. (Even more fun, osd.5 was the
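
One way to confirm that both inconsistencies really fall in the same image, assuming the scrub errors in the OSD logs name objects such as rb.0.1234.abcdef12.000000000042 (hypothetical), is to match that prefix against each image's block_name_prefix:

    for img in $(rbd ls rbd); do
        echo -n "$img: "
        rbd info rbd/"$img" | grep block_name_prefix
    done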

Re: [ceph-users] I have PGs that I can't deep-scrub

2014-07-10 Thread Craig Lewis
I fixed this issue by reformatting all of the OSDs. I changed the mkfs options from [osd] osd mkfs type = xfs osd mkfs options xfs = -l size=1024m -n size=64k -i size=2048 -s size=4096 to [osd] osd mkfs type = xfs osd mkfs options xfs = -s size=4096 (I have a mix of 512 and 4k sector
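
Formatted for readability, the change described above was from:

    [osd]
    osd mkfs type = xfs
    osd mkfs options xfs = -l size=1024m -n size=64k -i size=2048 -s size=4096

to:

    [osd]
    osd mkfs type = xfs
    osd mkfs options xfs = -s size=4096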

[ceph-users] logrotate

2014-07-10 Thread James Eckersall
Hi, I've just upgraded a ceph cluster from Ubuntu 12.04 with 0.73.1 to Ubuntu 14.04 with 0.80.1. I've noticed that the log rotation doesn't appear to work correctly. The OSDs are just not logging to the current ceph-osd-X.log file. If I restart the OSDs, they start logging, but then overnight,

Re: [ceph-users] scrub error on firefly

2014-07-10 Thread Samuel Just
It could be an indication of a problem on osd 5, but the timing is worrying. Can you attach your ceph.conf? Have there been any osds going down, new osds added, anything to cause recovery? Anything in dmesg to indicate an fs problem? Have you recently changed any settings? -Sam On Thu, Jul

Re: [ceph-users] ceph-users Digest, Vol 18, Issue 8

2014-07-10 Thread Lazuardi Nasution
Hi, I prefer to use bcache or a similar local write-back cache on SSD, since it is only related to the local HDDs. I think it will reduce the risk of errors on cache flushing compared to Ceph cache tiering, which still uses the cluster network when flushing. After the data has been written to the

Re: [ceph-users] Using large SSD cache tier instead of SSD

2014-07-10 Thread Lazuardi Nasution
Hi, I prefer to use bcache or a similar local write-back cache on SSD, since it is only related to the local HDDs. I think it will reduce the risk of errors on cache flushing compared to Ceph cache tiering, which still uses the cluster network when flushing. After the data has been written to the

[ceph-users] ceph mount not working anymore

2014-07-10 Thread Joshua McClintock
I upgraded my cluster to .80.1-2 (CentOS). My mount command just freezes and outputs an error: mount.ceph 192.168.0.14,192.168.0.15,192.168.0.16:/ /us-west01 -o name=chefwks01,secret=`ceph-authtool -p -n client.admin /etc/ceph/us-west01.client.admin.keyring` mount error 5 = Input/output error

Re: [ceph-users] ceph mount not working anymore

2014-07-10 Thread Sage Weil
Have you made any other changes after the upgrade? (Like adjusting tunables, or creating EC pools?) See if there is anything in 'dmesg' output. sage On Thu, 10 Jul 2014, Joshua McClintock wrote: I upgraded my cluster to .80.1-2 (CentOS).  My mount command just freezes and outputs an error:

Re: [ceph-users] I have PGs that I can't deep-scrub

2014-07-10 Thread Chris Dunlop
Hi Craig, On Thu, Jul 10, 2014 at 03:09:51PM -0700, Craig Lewis wrote: I fixed this issue by reformatting all of the OSDs. I changed the mkfs options from [osd] osd mkfs type = xfs osd mkfs options xfs = -l size=1024m -n size=64k -i size=2048 -s size=4096 to [osd] osd mkfs type

[ceph-users] qemu image create failed

2014-07-10 Thread Yonghua Peng
Hi, I tried to create a qemu image, but it failed. ceph@ceph:~/my-cluster$ qemu-img create -f rbd rbd:rbd/qemu 2G Formatting 'rbd:rbd/qemu', fmt=rbd size=2147483648 cluster_size=0 qemu-img: error connecting qemu-img: rbd:rbd/qemu: error while creating rbd: Input/output error Can you tell what's
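
When qemu-img cannot connect while the command-line tools can, it often comes down to client configuration; a minimal check, with the config path and cephx user below being assumptions:

    # confirm basic RBD access outside qemu first
    rbd ls rbd
    # be explicit about the cluster config and user in the rbd: URI
    qemu-img create -f rbd rbd:rbd/qemu:id=admin:conf=/etc/ceph/ceph.conf 2G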

Re: [ceph-users] ceph mount not working anymore

2014-07-10 Thread Sage Weil
That is CEPH_FEATURE_CRUSH_V2. Can you attach the output of ceph osd crush dump? Thanks! sage On Thu, 10 Jul 2014, Joshua McClintock wrote: Yes, I changed some of the mount options on my osds (xfs mount options), but I think this may be the answer from dmesg, sorta looks like a version

Re: [ceph-users] ceph mount not working anymore

2014-07-10 Thread Joshua McClintock
[root@chefwks01 ~]# ceph --cluster us-west01 osd crush dump
{ "devices": [
    { "id": 0, "name": "osd.0"},
    { "id": 1, "name": "osd.1"},
    { "id": 2, "name": "osd.2"},
    { "id": 3, "name": "osd.3"},
    { "id": 4, "name": "osd.4"}],
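
If the dump shows CRUSH tunables or rule steps newer than the client kernel understands (which is what the CEPH_FEATURE_CRUSH_V2 hint above suggests), one common resolution, at the cost of some data movement, is to fall back to older tunables:

    ceph osd crush show-tunables
    ceph osd crush tunables legacy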