[ceph-users] xfs corruption, data disaster!

2015-05-04 Thread Yujian Peng
Hi, I'm encountering a data disaster. I have a ceph cluster with 145 OSDs. The data center had a power problem yesterday, and all of the ceph nodes were down. But now I find that 6 disks (xfs) in 4 nodes have data corruption. Some disks are unable to mount, and some disks have IO errors in syslog.
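A minimal sketch of checking and repairing one affected XFS filesystem; the OSD id and device name below are placeholders, and the OSD must be stopped and the filesystem unmounted first:

    sudo stop ceph-osd id=12 || sudo /etc/init.d/ceph stop osd.12   # whichever init system applies
    sudo umount /var/lib/ceph/osd/ceph-12
    sudo xfs_repair -n /dev/sdX1    # dry run: report problems only
    sudo xfs_repair /dev/sdX1       # actual repair; -L (zero the log) only as a last resort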

Re: [ceph-users] Kernel version for CephFS client ?

2015-05-04 Thread Ilya Dryomov
On Mon, May 4, 2015 at 11:46 AM, Florent B flor...@coppint.com wrote: Hi, I would like to know which kernel version is needed to mount CephFS on a Hammer cluster? And if we use the 3.16 kernel of Debian Jessie, can we hope to use CephFS for the next few releases without problems? I would advise

Re: [ceph-users] xfs corruption, data disaster!

2015-05-04 Thread Alexandre DERUMIER
maybe this could help to repair pgs? http://www.sebastien-han.fr/blog/2015/04/27/ceph-manually-repair-object/ (6 disks at the same time seems pretty strange. Do you have some kind of writeback cache enabled on these disks?) - Original Message - From: Yujian Peng
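For the simpler cases, Ceph can be asked to repair an inconsistent PG itself before falling back to the manual procedure in the blog post above; the PG id below is a placeholder:

    ceph health detail | grep inconsistent   # list inconsistent PGs
    ceph pg repair 0.6                        # ask the primary OSD to repair that PG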

Re: [ceph-users] A pesky unfound object

2015-05-04 Thread Eino Tuominen
Hi everybody, Does anybody have any clue on this? I've run a deep scrub on the pg and the status is still showing one unfound object. This is a test cluster only, but I'd like to learn what has happened and why... Thanks, -- Eino Tuominen -Original Message- From: ceph-users
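A rough sketch of how to inspect (and, as a last resort, give up on) an unfound object; the PG id is a placeholder:

    ceph health detail                       # shows which PG has the unfound object
    ceph pg 2.5 list_missing                 # list the missing/unfound objects
    ceph pg 2.5 query                        # see which OSDs were probed
    ceph pg 2.5 mark_unfound_lost revert     # last resort: revert (or delete) the unfound object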

Re: [ceph-users] xfs corruption, data disaster!

2015-05-04 Thread Yujian Peng
Alexandre DERUMIER aderumier@... writes: maybe this could help to repair pgs ? http://www.sebastien-han.fr/blog/2015/04/27/ceph-manually-repair-object/ (6 disk at the same time seem pretty strange. do you have some kind of writeback cache enable of theses disks ?) The only writeback

Re: [ceph-users] xfs corruption, data disaster!

2015-05-04 Thread Steffen W Sørensen
On 04/05/2015, at 15.01, Yujian Peng pengyujian5201...@126.com wrote: Alexandre DERUMIER aderumier@... writes: maybe this could help to repair pgs ? http://www.sebastien-han.fr/blog/2015/04/27/ceph-manually-repair-object/ (6 disk at the same time seem pretty strange. do you have

Re: [ceph-users] xfs corruption, data disaster!

2015-05-04 Thread Christopher Kunz
On 04.05.15 at 09:00, Yujian Peng wrote: Hi, I'm encountering a data disaster. I have a ceph cluster with 145 osd. The data center had a power problem yesterday, and all of the ceph nodes were down. But now I find that 6 disks(xfs) in 4 nodes have data corruption. Some disks are unable to

Re: [ceph-users] xfs corruption, data disaster!

2015-05-04 Thread Emmanuel Florac
On Mon, 4 May 2015 07:00:32 +0000 (UTC), Yujian Peng pengyujian5201...@126.com wrote: I'm encountering a data disaster. I have a ceph cluster with 145 osd. The data center had a power problem yesterday, and all of the ceph nodes were down. But now I find that 6 disks(xfs) in 4 nodes have

[ceph-users] Rados Object gateway installation

2015-05-04 Thread MOSTAFA Ali (INTERN)
Hi Ceph users, I am new to ceph. I installed a small cluster (3 monitors with 5 OSDs). Now I am trying to install an Object Gateway server. I followed the steps in the documentation, but I am not able to launch the service using /etc/init.d/radosgw start; instead I am using sudo -E
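For reference, a hedged sketch of the two usual ways to start the gateway; the instance name client.radosgw.gateway is an assumption and must match the [client.radosgw.*] section in ceph.conf:

    sudo /etc/init.d/radosgw start
    # or run the daemon directly:
    sudo radosgw -c /etc/ceph/ceph.conf -n client.radosgw.gateway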

[ceph-users] how to display client io in hammer

2015-05-04 Thread Chad William Seys
Hi all, Looks like in Hammer 'ceph -s' no longer displays client IO and ops. How does one display that these days? Thanks, C.
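A quick sketch of the usual ways to watch client IO; note the figures only appear while traffic is actually flowing:

    ceph -s     # one-shot summary; includes a 'client io' line when there is activity
    ceph -w     # stream status updates continuously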

[ceph-users] The first infernalis dev release will be v9.0.0

2015-05-04 Thread Sage Weil
The first Ceph release back in Jan of 2008 was 0.1. That made sense at the time. We haven't revised the versioning scheme since then, however, and are now at 0.94.1 (first Hammer point release). To avoid reaching 0.99 (and 0.100 or 1.00?) we have a new strategy. This was discussed a bit on

Re: [ceph-users] how to display client io in hammer

2015-05-04 Thread Chad William Seys
Ooops! Turns out I forgot to mount the ceph rbd, so no client IO displayed! C.

Re: [ceph-users] Kicking 'Remapped' PGs

2015-05-04 Thread Gregory Farnum
On Sun, May 3, 2015 at 5:18 AM, Paul Evans p...@daystrom.com wrote: Thanks, Greg. Following your lead, we discovered the proper 'set_choose_tries xxx’ value had not been applied to *this* pool’s rule, and we updated the cluster accordingly. We then moved a random OSD out and back in to ‘kick’

Re: [ceph-users] Upgrade from Giant to Hammer and after some basic operations most of the OSD's went down

2015-05-04 Thread Tuomas Juntunen
Hi, below is the mds dump: dumped mdsmap epoch 1799 epoch 1799 flags 0 created 2014-12-10 12:44:34.188118 modified 2015-05-04 07:16:37.205350 tableserver 0 root 0 session_timeout 60 session_autoclose 300 max_file_size 1099511627776 last_failure 1794

Re: [ceph-users] The first infernalis dev release will be v9.0.0

2015-05-04 Thread Loic Dachary
+1 ;-) On 04/05/2015 18:09, Sage Weil wrote: The first Ceph release back in Jan of 2008 was 0.1. That made sense at the time. We haven't revised the versioning scheme since then, however, and are now at 0.94.1 (first Hammer point release). To avoid reaching 0.99 (and 0.100 or 1.00?) we

Re: [ceph-users] Upgrade from Giant to Hammer and after some basic operations most of the OSD's went down

2015-05-04 Thread Sage Weil
On Mon, 4 May 2015, Tuomas Juntunen wrote: 5827504:10.20.0.11:6800/3382530 'ceph1' mds.0.262 up:rejoin seq 33159 This is why it is 'degraded'... stuck in up:rejoin state. The active+clean+replay has been there for a day now, so there must be something that is not ok, if it should've

Re: [ceph-users] Upgrade from Giant to Hammer and after some basic operations most of the OSD's went down

2015-05-04 Thread Tuomas Juntunen
Hi, OK, restarting the OSDs did it. I thought I restarted the daemons after it was almost clean, but it seems I didn't. Now everything is running fine. Thanks again! Br, Tuomas -Original Message- From: Sage Weil [mailto:s...@newdream.net] Sent: 4 May 2015 20:21 To: Tuomas

[ceph-users] I have trouble using the teuthology ceph test tool

2015-05-04 Thread 박근영
Hi, Cephers! I need help using teuthology, the Ceph integration test framework. There are three nodes, as below: 1. paddles, pulpito server / OS: Ubuntu 14.04, IP: 11.0.0.100, user: teuth 2. target server / OS: Ubuntu 14.04, IP: 11.0.0.10, user: ubuntu 3. teuthology server / OS: Ubuntu

Re: [ceph-users] ERROR: missing keyring, cannot use cephx for authentication

2015-05-04 Thread Jesus Chavez (jeschave)
You have saved my day! Thank you so much :) It now seems to be working, not sure why it had that behavior... Thank you so much! Jesus Chavez SYSTEMS ENGINEER-C.SALES jesch...@cisco.com Phone: +52 55 5267 3146 Mobile: +51 1

Re: [ceph-users] Firefly - Giant : CentOS 7 : install failed ceph-deploy

2015-05-04 Thread Michael Kidd
For Firefly / Giant installs, I've had success with the following: yum install ceph ceph-common --disablerepo=base --disablerepo=epel Let us know if this works for you as well. Thanks, Michael J. Kidd Sr. Storage Consultant Inktank Professional Services - by Red Hat On Wed, Apr 8, 2015 at

[ceph-users] Ceph migration to AWS

2015-05-04 Thread Mike Travis
To those interested in a tricky problem, We have a Ceph cluster running at one of our data centers. One of our client's requirements is to have them hosted at AWS. My question is: How do we effectively migrate our data on our internal Ceph cluster to an AWS Ceph cluster? Ideas currently on the
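One possible sketch for copying RBD images between clusters over SSH; the pool, image name and remote host below are placeholders, not part of the original plan:

    rbd export rbd/myimage - | ssh user@aws-gateway rbd import - rbd/myimage
    # incremental follow-ups after the initial copy, using snapshots:
    rbd export-diff --from-snap snap1 rbd/myimage@snap2 - | ssh user@aws-gateway rbd import-diff - rbd/myimage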

Re: [ceph-users] Help with CEPH deployment

2015-05-04 Thread Venkateswara Rao Jujjuri
Thanks Mark. I switched to a completely different machine and started from scratch; things were much smoother this time. The cluster was up in 30 mins. I guess purgedata, droplets and purge are not enough to bring the machine back clean? That is what I was trying on the old machine to reset it. Thanks JV

Re: [ceph-users] strange benchmark problem : restarting osd daemon improve performance from 100k iops to 300k iops

2015-05-04 Thread Venkateswara Rao Jujjuri
If I want to use the librados API for performance testing, are there any existing benchmark tools which directly access librados (not through rbd or the gateway)? Thanks in advance, JV On Sun, Apr 26, 2015 at 10:46 PM, Alexandre DERUMIER aderum...@odiso.com wrote: I'll retest tcmalloc, because I was
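rados bench ships with Ceph and drives librados directly (no rbd or gateway in the path); the pool name below is a placeholder:

    ceph osd pool create benchpool 128
    rados bench -p benchpool 60 write -t 16 --no-cleanup
    rados bench -p benchpool 60 seq -t 16
    rados bench -p benchpool 60 rand -t 16
    rados -p benchpool cleanup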

[ceph-users] Using RAID Controller for OSD and JNL disks in Ceph Nodes

2015-05-04 Thread Sanjoy Dasgupta
Hi! This is an often discussed and clarified topic, but the reason I am asking is: if we use a RAID controller with a lot of cache (FBWC) and configure each drive as a single-drive RAID0, then writes to disks will benefit from the FBWC and accelerate I/O performance. Is this correct

Re: [ceph-users] ERROR: missing keyring, cannot use cephx for authentication

2015-05-04 Thread Jesus Chavez (jeschave)
How did you get the UUIDs without mounting the osds? Thanks Jesus Chavez SYSTEMS ENGINEER-C.SALES jesch...@cisco.com Phone: +52 55 5267 3146 Mobile: +51 1 5538883255 CCIE - 44433 On Apr 10, 2015, at 11:47 PM,
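A sketch of reading filesystem UUIDs straight from the block devices, without mounting anything; device names are placeholders:

    sudo blkid /dev/sdb1        # prints UUID, label and filesystem type
    ls -l /dev/disk/by-uuid/    # maps every known UUID back to its device node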

[ceph-users] How to add a slave to rgw

2015-05-04 Thread 周炳华
Hi, geeks: I have a ceph cluster for rgw service in production, which was set up according to the simple configuration tutorial, with only one default region and one default zone. Even worse, I enabled neither the meta logging nor the data logging. Now I want to add a slave zone to the rgw

[ceph-users] Upgrade from Giant to Hammer and after some basic operations most of the OSD's went down

2015-05-04 Thread tuomas . juntunen
I upgraded Ceph from 0.87 Giant to 0.94.1 Hammer, then created new pools and deleted some old ones. I also created one pool for a tier to be able to move data without an outage. After these operations all but 10 OSDs are down and are writing this kind of message to the logs; I get more than 100 GB of these

[ceph-users] OSDs remain down

2015-05-04 Thread Jesus Chavez (jeschave)
Hi all, I still have a lot of problems when the building power fails… There is only one simple node where the OSDs remain down after reboot, and there is something weird: 1 of the 12 OSDs comes up after reboot, but just one… Here is an example: Filesystem Size Used Avail Use% Mounted on

[ceph-users] How to Stop/start a specific OSD

2015-05-04 Thread MOSTAFA Ali (INTERN)
Hello all, I am a new user of Ceph. I am trying to stop an OSD. I can stop it (and all other OSDs on the node) using the command: sudo stop ceph-osd-all. But the command: sudo stop ceph-osd id=0 returns an error. user@node2:~$ ceph osd tree # id weight type name up/down reweight -1
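For reference, a hedged sketch of the per-OSD start/stop syntax on the two init systems shipped with these releases (the upstart form is the one attempted above); the id is a placeholder:

    # Ubuntu upstart:
    sudo stop ceph-osd id=0
    sudo start ceph-osd id=0
    # sysvinit:
    sudo /etc/init.d/ceph stop osd.0
    sudo /etc/init.d/ceph start osd.0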

[ceph-users] OSD failing to start [fclose error: (61) No data available]

2015-05-04 Thread Sourabh saryal
Hi, Trying to start an OSD , which is failing to restart. /etc/init.d/ceph start osd.140 === osd.140 === create-or-move updated item name 'osd.140' weight 3.63 at location {host=XXX,root=default} to crush map Starting Ceph osd.140 on ... starting osd.140 at :/0 osd_data /var/lib/ceph/osd/ceph-140

[ceph-users] Help with CEPH deployment

2015-05-04 Thread Venkateswara Rao Jujjuri
Started to install basic cluster from scratch, but running into keyring issues. Basically /etc/ceph/ceph.client.admin.keyring is not getting generated on monitor node. When I tried to create it on the monitor node, it fails with: $ ceph auth get-or-create client.admin mon 'allow *' mds 'allow *'
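A rough sketch of the usual ceph-deploy key-gathering sequence, with hypothetical hostnames (mon1, node1):

    ceph-deploy new mon1
    ceph-deploy mon create-initial      # should gather keys once the monitors form a quorum
    ceph-deploy gatherkeys mon1         # fetches ceph.client.admin.keyring and the bootstrap keys
    ceph-deploy admin node1             # pushes the admin keyring to /etc/ceph on node1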

[ceph-users] Rack awareness with different hardware layouts

2015-05-04 Thread Rogier Dikkes
Hello all, At this moment we have a scenario on which I would like your opinion. Scenario: currently we have a ceph environment with 1 rack of hardware; this rack contains a couple of OSD nodes with 4T disks. In a few weeks' time we will deploy 2 more racks with OSD nodes; these nodes have
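A minimal sketch of adding rack buckets to the CRUSH map and moving hosts into them so that rack becomes the failure domain; bucket, host and pool names are placeholders:

    ceph osd crush add-bucket rack1 rack
    ceph osd crush move rack1 root=default
    ceph osd crush move osdnode01 rack=rack1
    # new rule that spreads replicas across racks:
    ceph osd crush rule create-simple replicated-rack default rack
    ceph osd pool set mypool crush_ruleset 1    # rule id from 'ceph osd crush rule dump'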

Re: [ceph-users] Ceph migration to AWS

2015-05-04 Thread Kyle Bader
To those interested in a tricky problem, We have a Ceph cluster running at one of our data centers. One of our client's requirements is to have them hosted at AWS. My question is: How do we effectively migrate our data on our internal Ceph cluster to an AWS Ceph cluster? Ideas currently on

[ceph-users] NVMe Journal and Mixing IO

2015-05-04 Thread Atze de Vries
Hi, We are designing a new Ceph cluster. Some of the cluster will be used to run VMs and most of it will be used for file storage and object storage. We want to separate the workload for VMs (high IO / small block) from the bulk storage (big block, lots of latency) since mixing IO seems to be a bad
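One hedged way to keep the VM pool away from the bulk-storage disks is to split the CRUSH hierarchy and point each pool at its own rule; all names below are hypothetical:

    ceph osd crush add-bucket fast root
    ceph osd crush move ssdhost1 root=fast
    ceph osd crush rule create-simple fast-rule fast host
    ceph osd pool set vmpool crush_ruleset 2    # rule id from 'ceph osd crush rule dump'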

Re: [ceph-users] Help with CEPH deployment

2015-05-04 Thread Venkateswara Rao Jujjuri
Here is the output. I am still stuck at this step. :( (I have tried multiple times by purging and restarting from scratch.) vjujjuri@rgulistan-wsl10:~/ceph-cluster$ ceph-deploy mon create-initial [ceph_deploy.conf][DEBUG ] found configuration file at: /home/vjujjuri/.cephdeploy.conf

[ceph-users] about rgw region and zone

2015-05-04 Thread TERRY
Hi all: when I was configuring Federated Gateways, I got the error below: sudo radosgw-agent -c /etc/ceph/ceph-data-sync.conf ERROR:root:Could not retrieve region map from destination Traceback (most recent call last): File /usr/lib/python2.6/site-packages/radosgw_agent/cli.py, line
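For comparison, a hedged sketch of the region-map refresh that pre-Jewel federated setups usually need after editing region/zone definitions; the instance name is an assumption:

    radosgw-admin region get --name client.radosgw.us-east-1
    radosgw-admin regionmap update --name client.radosgw.us-east-1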

Re: [ceph-users] Kernel version for CephFS client ?

2015-05-04 Thread Chad William Seys
Hi Florent, Most likely Debian will release backported kernels for Jessie, as they have for Wheezy. E.g. Wheezy has had kernel 3.16 backported to it: https://packages.debian.org/search?suite=wheezy-backports&searchon=names&keywords=linux-image-amd64 C.
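A short sketch of pulling a backported kernel on Wheezy; the same pattern would apply to jessie-backports once it exists:

    echo 'deb http://http.debian.net/debian wheezy-backports main' | sudo tee /etc/apt/sources.list.d/backports.list
    sudo apt-get update
    sudo apt-get install -t wheezy-backports linux-image-amd64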

Re: [ceph-users] Help with CEPH deployment

2015-05-04 Thread Vasu Kulkarni
What are your initial monitor nodes? I.e., what nodes did you specify in the first step: ceph-deploy new {initial-monitor-node(s)} Did you specify rgulistan-wsl11 as your monitor node in that step? - Original Message - From: Venkateswara Rao Jujjuri jujj...@gmail.com To: ceph-devel

Re: [ceph-users] Shadow Files

2015-05-04 Thread Yehuda Sadeh-Weinraub
I've been working on a new tool that would detect leaked rados objects. It will take some time for it to be merged into an official release, or even into the master branch, but if anyone likes to play with it, it is in the wip-rgw-orphans branch. At the moment I recommend to not remove any

Re: [ceph-users] Ceph migration to AWS

2015-05-04 Thread Christian Balzer
On Mon, 4 May 2015 11:21:12 -0700 Kyle Bader wrote: To those interested in a tricky problem, We have a Ceph cluster running at one of our data centers. One of our client's requirements is to have them hosted at AWS. My question is: How do we effectively migrate our data on our internal

Re: [ceph-users] NVMe Journal and Mixing IO

2015-05-04 Thread Christian Balzer
Hello, On Wed, 15 Apr 2015 10:26:37 +0200 Atze de Vries wrote: Hi, We are designing a new Ceph cluster. Some of the cluster wil be used to run vms and most of it wil be used for file storage and object storage. We want to separate the workload for vms (high IO / small block) from the

Re: [ceph-users] xfs corruption, data disaster!

2015-05-04 Thread Yujian Peng
Emmanuel Florac eflorac@... writes: On Mon, 4 May 2015 07:00:32 +0000 (UTC), Yujian Peng pengyujian5201314 at 126.com wrote: I'm encountering a data disaster. I have a ceph cluster with 145 osd. The data center had a power problem yesterday, and all of the ceph nodes were down. But

Re: [ceph-users] Using RAID Controller for OSD and JNL disks in Ceph Nodes

2015-05-04 Thread Jake Young
On Monday, May 4, 2015, Christian Balzer ch...@gol.com wrote: On Mon, 13 Apr 2015 10:39:57 +0530 Sanjoy Dasgupta wrote: Hi! This is an often discussed and clarified topic, but Reason why I am asking is because If We use a RAID controller with Lot of Cache (FBWC) and Configure each

Re: [ceph-users] Using RAID Controller for OSD and JNL disks in Ceph Nodes

2015-05-04 Thread Christian Balzer
On Mon, 13 Apr 2015 10:39:57 +0530 Sanjoy Dasgupta wrote: Hi! This is an often discussed and clarified topic, but Reason why I am asking is because If We use a RAID controller with Lot of Cache (FBWC) and Configure each Drive as Single Drive RAID0, then Write to disks will benefit by

Re: [ceph-users] Btrfs defragmentation

2015-05-04 Thread Timofey Titovets
Hi list, Excuse me, what I'm saying is off topic. @Lionel, if you use btrfs, did you already try btrfs compression for the OSD? If yes, can you share your experience? 2015-05-05 3:24 GMT+03:00 Lionel Bouton lionel+c...@bouton.name: On 05/04/15 01:34, Sage Weil wrote: On Mon, 4 May 2015,
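In case it helps the discussion, a sketch of the usual btrfs compression knobs; paths and the OSD number are placeholders, and compressing an OSD store this way is an experiment rather than a recommendation:

    mount -o compress=lzo,autodefrag /dev/sdX /var/lib/ceph/osd/ceph-N
    # recompress/defragment data already on disk:
    btrfs filesystem defragment -r -clzo /var/lib/ceph/osd/ceph-N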

Re: [ceph-users] Help with CEPH deployment

2015-05-04 Thread Mark Kirkwood
On 05/05/15 04:16, Venkateswara Rao Jujjuri wrote: Thanks Mark. I switched to completely different machine and started from scratch, things were much smoother this time. Cluster was up in 30 mins. I guess purgedata , droplets and and purge is Not enough to bring the machine back clean? What I

Re: [ceph-users] Kernel version for CephFS client ?

2015-05-04 Thread cwseys
Hi Ilya, Any new features, development work and most of the enhancements are not backported. Only a selected bunch of bug fixes is. Not sure what you are trying to say. Wheezy was released with kernel 3.2 and bugfixes are applied to 3.2 by Debian throughout Wheezy's support cycle. But

Re: [ceph-users] Kernel version for CephFS client ?

2015-05-04 Thread Ilya Dryomov
On Mon, May 4, 2015 at 11:25 PM, cwseys cws...@physics.wisc.edu wrote: HI Illya, Any new features, development work and most of the enhancements are not backported. Only a selected bunch of bug fixes is. Not sure what you are trying to say. Wheezy was released with kernel 3.2 and

Re: [ceph-users] Kernel version for CephFS client ?

2015-05-04 Thread Ilya Dryomov
On Mon, May 4, 2015 at 9:40 PM, Chad William Seys cws...@physics.wisc.edu wrote: Hi Florent, Most likely Debian will release backported kernels for Jessie, as they have for Wheezy. E.g. Wheezy has had kernel 3.16 backported to it:

Re: [ceph-users] Kernel version for CephFS client ?

2015-05-04 Thread ceph
Linux 4.0 lives in Debian: 7% [jack:~] apt-cache policy linux-image-4.0.0-trunk-amd64 linux-image-4.0.0-trunk-amd64: Installed: (none) Candidate: 4.0-1~exp1 Version table: 4.0-1~exp1 0 1 http://ftp.fr.debian.org/debian/ experimental/main amd64 Packages On 04/05/2015

Re: [ceph-users] Btrfs defragmentation

2015-05-04 Thread Lionel Bouton
On 05/04/15 01:34, Sage Weil wrote: On Mon, 4 May 2015, Lionel Bouton wrote: Hi, we began testing one Btrfs OSD volume last week and for this first test we disabled autodefrag and began to launch manual btrfs fi defrag. [...] Cool.. let us know how things look after it ages! We had the