Hi,
I'm encountering a data disaster. I have a ceph cluster with 145 OSDs. The
data center had a power problem yesterday, and all of the ceph nodes were down.
But now I find that 6 disks (xfs) in 4 nodes have data corruption. Some disks
cannot be mounted, and some show IO errors in syslog.
On Mon, May 4, 2015 at 11:46 AM, Florent B flor...@coppint.com wrote:
Hi,
I would like to know which kernel version is needed to mount CephFS on a
Hammer cluster?
And if we use the 3.16 kernel of Debian Jessie, can we hope to use CephFS for
the next few releases without problems?
I would advise
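For reference, a kernel-client CephFS mount looks roughly like this (the
monitor address, mount point and secret file path below are placeholders):
sudo mkdir -p /mnt/cephfs
sudo mount -t ceph 192.168.0.1:6789:/ /mnt/cephfs -o name=admin,secretfile=/etc/ceph/admin.secret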
maybe this could help to repair pgs ?
http://www.sebastien-han.fr/blog/2015/04/27/ceph-manually-repair-object/
(6 disks at the same time seems pretty strange. Do you have some kind of
writeback cache enabled on these disks?)
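A rough sketch of the manual repair flow from the linked post, with a
placeholder pg id (2.1c) standing in for a real inconsistent pg:
ceph health detail | grep inconsistent   # list the inconsistent pgs
ceph pg 2.1c query                       # see which OSDs hold the pg
# on the primary OSD's host, move the corrupted copy of the object aside,
# then ask Ceph to rebuild it from a healthy replica:
ceph pg repair 2.1c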
- Original Message -
From: Yujian Peng
Hi everybody,
Does anybody have any clue on this? I've run a deep scrub on the pg and the
status is still showing one unfound object. This is a test cluster only, but
I'd like to learn what has happened and why...
Thanks,
--
Eino Tuominen
-Original Message-
From: ceph-users
Alexandre DERUMIER aderumier@... writes:
maybe this could help to repair pgs ?
http://www.sebastien-han.fr/blog/2015/04/27/ceph-manually-repair-object/
(6 disk at the same time seem pretty strange. do you have some kind of
writeback cache enable of theses disks ?)
The only writeback
On 04/05/2015, at 15.01, Yujian Peng pengyujian5201...@126.com wrote:
Alexandre DERUMIER aderumier@... writes:
maybe this could help to repair pgs ?
http://www.sebastien-han.fr/blog/2015/04/27/ceph-manually-repair-object/
(6 disk at the same time seem pretty strange. do you have
On 04.05.15 at 09:00, Yujian Peng wrote:
Hi,
I'm encountering a data disaster. I have a ceph cluster with 145 osd. The
data center had a power problem yesterday, and all of the ceph nodes were
down.
But now I find that 6 disks(xfs) in 4 nodes have data corruption. Some disks
are unable to
On Mon, 4 May 2015 07:00:32 + (UTC)
Yujian Peng pengyujian5201...@126.com wrote:
I'm encountering a data disaster. I have a ceph cluster with 145 osd.
The data center had a power problem yesterday, and all of the ceph
nodes were down. But now I find that 6 disks(xfs) in 4 nodes have
Hi Ceph users,
I am new to ceph; I installed a small cluster (3 monitors with 5 OSDs). Now I
am trying to install an Object Gateway server. I followed the steps in the
documentation, but I am not able to launch the service using
/etc/init.d/radosgw start; instead I am using sudo -E
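If the init script fails silently, one way to debug (assuming the gateway
instance is named client.radosgw.gateway in ceph.conf; adjust for your setup)
is to run the daemon in the foreground with verbose logging:
sudo radosgw -c /etc/ceph/ceph.conf -n client.radosgw.gateway -d --debug-rgw=20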
Hi all,
Looks like in Hammer 'ceph -s' no longer displays client IO and ops.
How does one display that these days?
Thanks,
C.
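For what it's worth, client IO only shows up in the status output while
clients are actually generating IO; one way to watch it live is:
ceph -w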
The first Ceph release back in Jan of 2008 was 0.1. That made sense at
the time. We haven't revised the versioning scheme since then, however,
and are now at 0.94.1 (first Hammer point release). To avoid reaching
0.99 (and 0.100 or 1.00?) we have a new strategy. This was discussed a
bit on
Ooops!
Turns out I forgot to mount the ceph rbd, so no client IO displayed!
C.
On Sun, May 3, 2015 at 5:18 AM, Paul Evans p...@daystrom.com wrote:
Thanks, Greg. Following your lead, we discovered the proper
'set_choose_tries xxx’ value had not been applied to *this* pool’s rule, and
we updated the cluster accordingly. We then moved a random OSD out and back
in to ‘kick’
Hi, below is the mds dump
dumped mdsmap epoch 1799
epoch 1799
flags 0
created 2014-12-10 12:44:34.188118
modified 2015-05-04 07:16:37.205350
tableserver 0
root 0
session_timeout 60
session_autoclose 300
max_file_size 1099511627776
last_failure 1794
+1 ;-)
On 04/05/2015 18:09, Sage Weil wrote:
The first Ceph release back in Jan of 2008 was 0.1. That made sense at
the time. We haven't revised the versioning scheme since then, however,
and are now at 0.94.1 (first Hammer point release). To avoid reaching
0.99 (and 0.100 or 1.00?) we
On Mon, 4 May 2015, Tuomas Juntunen wrote:
5827504:10.20.0.11:6800/3382530 'ceph1' mds.0.262 up:rejoin seq 33159
This is why it is 'degraded'... stuck in up:rejoin state.
The active+clean+replay has been there for a day now, so there must be
something that is not ok, if it should've
Hi
Ok, restarting the OSDs did it. I thought I restarted the daemons after it was
almost clean, but it seems I didn't.
Now everything is running fine.
Thanks again!
Br,
Tuomas
-Original Message-
From: Sage Weil [mailto:s...@newdream.net]
Sent: 4 May 2015 20:21
To: Tuomas
Hi, Cepher!
I need help using teuthology, the Ceph integration test framework.
There are three nodes, as below:
1. paddles, pulpito server / OS: Ubuntu 14.04, IP: 11.0.0.100, user: teuth
2. target server / OS: Ubuntu 14.04, IP: 11.0.0.10, user: ubuntu
3. teuthology server / OS: Ubuntu
You have saved my day! Thank you so much :) Now it seems to be working, not
sure why it has that behavior...
Thank you so much!
Jesus Chavez
SYSTEMS ENGINEER-C.SALES
jesch...@cisco.com
Phone: +52 55 5267 3146
Mobile: +51 1
For Firefly / Giant installs, I've had success with the following:
yum install ceph ceph-common --disablerepo=base --disablerepo=epel
Let us know if this works for you as well.
Thanks,
Michael J. Kidd
Sr. Storage Consultant
Inktank Professional Services
- by Red Hat
On Wed, Apr 8, 2015 at
To those interested in a tricky problem,
We have a Ceph cluster running at one of our data centers. One of our
client's requirements is to have them hosted at AWS. My question is: How do
we effectively migrate our data on our internal Ceph cluster to an AWS Ceph
cluster?
Ideas currently on the
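One approach that gets suggested for this sort of migration (an illustration
only; the image and host names are placeholders, and it assumes RBD images and
network reachability between the two clusters):
rbd export rbd/myimage - | ssh aws-gateway rbd import - rbd/myimage
# incremental follow-ups can then use rbd export-diff / import-diff against snapshots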
Thanks Mark. I switched to a completely different machine and started from
scratch; things were much smoother this time. The cluster was up in 30 mins.
I guess purgedata, droplets and purge are
not enough to bring the machine back clean?
That is what I was trying on the old machine to reset it.
Thanks
JV
If I want to use the librados API for performance testing, are there any
existing benchmark tools which directly access librados (not through
rbd or the gateway)?
Thanks in advance,
JV
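The rados CLI has a built-in benchmark that talks to librados directly;
'testpool' is a placeholder pool name:
rados bench -p testpool 60 write --no-cleanup
rados bench -p testpool 60 seq   # read back the objects just written
rados -p testpool cleanup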
On Sun, Apr 26, 2015 at 10:46 PM, Alexandre DERUMIER
aderum...@odiso.com wrote:
I'll retest tcmalloc, because I was
Hi, Cepher!
I need help using teuthology, the Ceph integration test framework.
There are three nodes, as below:
1. paddles, pulpito server / OS: Ubuntu 14.04, IP: 11.0.0.100, user: teuth
2. target server / OS: Ubuntu 14.04, IP: 11.0.0.10, user: ubuntu
3. teuthology server / OS: Ubuntu 14.04, IP:
Hi!
This is an often discussed and clarified topic, but the reason why I am asking
is because:
If we use a RAID controller with a lot of cache (FBWC) and configure each
drive as a single-drive RAID0, then writes to disks will benefit from the
FBWC and accelerate I/O performance. Is this correct
How did you get the UUIDs without mounting the OSDs?
Thanks
Jesus Chavez
SYSTEMS ENGINEER-C.SALES
jesch...@cisco.com
Phone: +52 55 5267 3146
Mobile: +51 1 5538883255
CCIE - 44433
On Apr 10, 2015, at 11:47 PM,
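For what it's worth, filesystem UUIDs can usually be read without mounting
anything, e.g. (the device name is a placeholder):
sudo blkid /dev/sdb1
lsblk -o NAME,UUID,MOUNTPOINT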
Hi, geeks:
I have a ceph cluster for rgw service in production, which was set up
according to the simple configuration tutorial, with only one default
region and one default zone. Even worse, I enabled neither the meta
logging nor the data logging. Now I want to add a slave zone to the rgw
I upgraded Ceph from 0.87 Giant to 0.94.1 Hammer
Then I created new pools and deleted some old ones. I also created one pool as
a tier to be able to move data without an outage.
After these operations all but 10 OSDs are down and producing this kind of
message in the logs; I get more than 100 GB of these
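For context, the tiering setup alluded to above is normally created with
commands along these lines (pool names are placeholders):
ceph osd tier add basepool cachepool
ceph osd tier cache-mode cachepool writeback
ceph osd tier set-overlay basepool cachepool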
Hi all, I still have a lot of problems when the building power fails…
There is only one simple node where OSDs remain down after reboot, and there is
something weird: 1 out of 12 OSDs comes up after reboot, but just one… Here is
an example:
Filesystem Size Used Avail Use% Mounted on
Hello all,
I am a new user of Ceph. I am trying to stop an OSD. I can stop it (and all
other OSDs on the node) using the command: sudo stop ceph-osd-all. But the
command: sudo stop ceph-osd id=0 returns an error
user@node2:~$ ceph osd tree
# id weight type name up/down reweight
-1
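On Ubuntu with upstart, the per-daemon job usually also wants the cluster name
spelled out (assuming the default cluster name 'ceph'):
sudo stop ceph-osd cluster=ceph id=0
sudo start ceph-osd cluster=ceph id=0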
Hi,
Trying to start an OSD, which is failing to restart.
/etc/init.d/ceph start osd.140
=== osd.140 ===
create-or-move updated item name 'osd.140' weight 3.63 at location
{host=XXX,root=default} to crush map
Starting Ceph osd.140 on ...
starting osd.140 at :/0 osd_data /var/lib/ceph/osd/ceph-140
Started to install a basic cluster from scratch, but running into keyring issues.
Basically /etc/ceph/ceph.client.admin.keyring is not getting generated
on monitor node.
When I tried to create it on the monitor node, it fails with:
$ ceph auth get-or-create client.admin mon 'allow *' mds 'allow *'
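For reference, with ceph-deploy the admin keyring normally comes out of these
steps (the hostname mon1 is a placeholder):
ceph-deploy new mon1
ceph-deploy mon create-initial
ceph-deploy gatherkeys mon1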
Hello all,
At the moment we have a scenario on which I would like your opinion.
Scenario:
Currently we have a ceph environment with 1 rack of hardware; this rack
contains a couple of OSD nodes with 4T disks. In a few weeks' time we will
deploy 2 more racks with OSD nodes; these nodes have
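A sketch of one way to lay out the new racks, assuming they are added as
'rack' buckets in the CRUSH map (bucket, host, rule and pool names and the
ruleset id are placeholders):
ceph osd crush add-bucket rack2 rack
ceph osd crush move rack2 root=default
ceph osd crush move node5 rack=rack2          # repeat for each new host
ceph osd crush rule create-simple replicate_by_rack default rack
ceph osd pool set mypool crush_ruleset 1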
To those interested in a tricky problem,
We have a Ceph cluster running at one of our data centers. One of our
client's requirements is to have them hosted at AWS. My question is: How do
we effectively migrate our data on our internal Ceph cluster to an AWS Ceph
cluster?
Ideas currently on
Hi, Cepher!
I need help using teuthology, the Ceph integration test framework.
There are three nodes, as below:
1. paddles, pulpito server / OS: Ubuntu 14.04, IP: 11.0.0.100, user: teuth
2. target server / OS: Ubuntu 14.04, IP: 11.0.0.10, user: ubuntu
3. teuthology server / OS:
Hi,
We are designing a new Ceph cluster. Some of the cluster will be used to run
VMs and most of it will be used for file storage and object storage.
We want to separate the workload for VMs (high IO / small block) from the
bulk storage (big block, lots of latency) since mixing IO seems to be a bad
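A minimal sketch of separating the two workloads with distinct CRUSH roots and
pools (bucket, rule and pool names and pg counts are placeholders):
ceph osd crush add-bucket fast root     # e.g. the VM / small-block hosts
ceph osd crush add-bucket bulk root     # the big spinners for file/object data
ceph osd crush rule create-simple vm_rule fast host
ceph osd crush rule create-simple bulk_rule bulk host
ceph osd pool create vms 1024 1024 replicated vm_rule
ceph osd pool create bulkdata 4096 4096 replicated bulk_rule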
Here is the output... I am still stuck at this step. :(
(multiple times tried to by purging and restarting from scratch)
vjujjuri@rgulistan-wsl10:~/ceph-cluster$ ceph-deploy mon create-initial
[ceph_deploy.conf][DEBUG ] found configuration file at:
/home/vjujjuri/.cephdeploy.conf
Hi all:
When I was configuring Federated Gateways, I got the error below:
sudo radosgw-agent -c /etc/ceph/ceph-data-sync.conf
ERROR:root:Could not retrieve region map from destination
Traceback (most recent call last):
File /usr/lib/python2.6/site-packages/radosgw_agent/cli.py, line
Hi Florent,
Most likely Debian will release backported kernels for Jessie, as they
have for Wheezy.
E.g. Wheezy has had kernel 3.16 backported to it:
https://packages.debian.org/search?suite=wheezy-backports&searchon=names&keywords=linux-image-amd64
C.
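E.g., pulling the backported kernel on Wheezy looks like this (the same pattern
should apply to jessie-backports once it exists; linux-image-amd64 is the
metapackage):
echo "deb http://http.debian.net/debian wheezy-backports main" | sudo tee /etc/apt/sources.list.d/backports.list
sudo apt-get update
sudo apt-get -t wheezy-backports install linux-image-amd64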
What are your initial monitor nodes? I.e., what nodes did you specify in the
first step: ceph-deploy new {initial-monitor-node(s)}
Did you specify rgulistan-wsl11 as your monitor node in that step?
- Original Message -
From: Venkateswara Rao Jujjuri jujj...@gmail.com
To: ceph-devel
I've been working on a new tool that would detect leaked rados objects. It will
take some time for it to be merged into an official release, or even into the
master branch, but if anyone would like to play with it, it is in the
wip-rgw-orphans branch.
At the moment I recommend not removing any
On Mon, 4 May 2015 11:21:12 -0700 Kyle Bader wrote:
To those interested in a tricky problem,
We have a Ceph cluster running at one of our data centers. One of our
client's requirements is to have them hosted at AWS. My question is:
How do we effectively migrate our data on our internal
Hello,
On Wed, 15 Apr 2015 10:26:37 +0200 Atze de Vries wrote:
Hi,
We are designing a new Ceph cluster. Some of the cluster wil be used to
run vms and most of it wil be used for file storage and object storage.
We want to separate the workload for vms (high IO / small block) from the
Emmanuel Florac eflorac@... writes:
On Mon, 4 May 2015 07:00:32 + (UTC)
Yujian Peng pengyujian5201314 at 126.com wrote:
I'm encountering a data disaster. I have a ceph cluster with 145 osd.
The data center had a power problem yesterday, and all of the ceph
nodes were down. But
On Monday, May 4, 2015, Christian Balzer ch...@gol.com wrote:
On Mon, 13 Apr 2015 10:39:57 +0530 Sanjoy Dasgupta wrote:
Hi!
This is an often discussed and clarified topic, but Reason why I am
asking is because
If We use a RAID controller with Lot of Cache (FBWC) and Configure each
On Mon, 13 Apr 2015 10:39:57 +0530 Sanjoy Dasgupta wrote:
Hi!
This is an often discussed and clarified topic, but Reason why I am
asking is because
If We use a RAID controller with Lot of Cache (FBWC) and Configure each
Drive as Single Drive RAID0, then Write to disks will benefit by
Hi list,
Excuse me, what I'm saying is off topic
@Lionel, if you use btrfs, did you already try to use btrfs compression for the OSDs?
If yes, can you share your experience?
2015-05-05 3:24 GMT+03:00 Lionel Bouton lionel+c...@bouton.name:
On 05/04/15 01:34, Sage Weil wrote:
On Mon, 4 May 2015,
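For reference, btrfs compression is generally just a mount option (the OSD
path here is a placeholder; compress=zlib is the other common choice):
sudo mount -o remount,compress=lzo /var/lib/ceph/osd/ceph-0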
On 05/05/15 04:16, Venkateswara Rao Jujjuri wrote:
Thanks Mark. I switched to completely different machine and started from
scratch, things were much smoother this time. Cluster was up in 30 mins.
I guess purgedata , droplets and and purge is
Not enough to bring the machine back clean?
What I
Hi Illya,
Any new features, development work and most of the enhancements are not
backported. Only a selected bunch of bug fixes is.
Not sure what you are trying to say.
Wheezy was released with kernel 3.2 and bugfixes are applied to 3.2 by
Debian throughout Wheezy's support cycle.
But
On Mon, May 4, 2015 at 11:25 PM, cwseys cws...@physics.wisc.edu wrote:
HI Illya,
Any new features, development work and most of the enhancements are not
backported. Only a selected bunch of bug fixes is.
Not sure what you are trying to say.
Wheezy was released with kernel 3.2 and
On Mon, May 4, 2015 at 9:40 PM, Chad William Seys
cws...@physics.wisc.edu wrote:
Hi Florent,
Most likely Debian will release backported kernels for Jessie, as they
have for Wheezy.
E.g. Wheezy has had kernel 3.16 backported to it:
Linux 4.0 lives in Debian:
7% [jack:~]apt-cache policy linux-image-4.0.0-trunk-amd64
linux-image-4.0.0-trunk-amd64:
Installed: (none)
Candidate: 4.0-1~exp1
Version table:
4.0-1~exp1 0
1 http://ftp.fr.debian.org/debian/ experimental/main amd64
Packages
On 04/05/2015
On 05/04/15 01:34, Sage Weil wrote:
On Mon, 4 May 2015, Lionel Bouton wrote:
Hi,
we began testing one Btrfs OSD volume last week and for this first test
we disabled autodefrag and began to launch manual btrfs fi defrag.
[...]
Cool.. let us know how things look after it ages!
We had the
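The manual defrag mentioned above is along these lines (the OSD path is a
placeholder; -r recurses, -v is verbose):
sudo btrfs filesystem defragment -r -v /var/lib/ceph/osd/ceph-0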