Hi all,
I am trying to remove several rbd images from the cluster.
Unfortunately, that doesn't work:
$ rbd info foo
rbd image 'foo':
size 1024 GB in 262144 objects
order 22 (4096 kB objects)
block_name_prefix: rb.0.919443.238e1f29
format: 1
$ rbd rm foo
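For reference, the workaround I am considering is to delete the data
objects directly with rados and then let rbd rm clean up the header; a
sketch, assuming the image lives in the default 'rbd' pool (the prefix
is the block_name_prefix above):

$ rados -p rbd ls | grep '^rb.0.919443.238e1f29' | xargs -n 100 rados -p rbd rm
$ rbd rm foo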
at 11:30 AM, Christian Eichelmann
christian.eichelm...@1und1.de wrote:
[snip]
You could be aiming your gun at your foot with this!
Robert LeBlanc
GPG Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
On Mon, May 11, 2015 at 2:09 AM, Christian Eichelmann
christian.eichelm...@1und1.de wrote:
Hi all!
We are experiencing approximately 1 scrub error / inconsistent pg every
two days. As far as I know, to fix this you can issue a ceph pg
repair, which works fine for us. I have a few questions regarding the
behavior of the ceph cluster in such a case:
1. After ceph detects the scrub error,
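For context, the sequence we run is simply this (the pg id and acting
set below are made-up examples):

$ ceph health detail | grep inconsistent
pg 2.37c is active+clean+inconsistent, acting [11,38,105]
$ ceph pg repair 2.37c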
to dmesg ?
Cheers, Dan
On Mon, Apr 20, 2015 at 9:29 AM, Christian Eichelmann
christian.eichelm...@1und1.de wrote:
Hi Ceph-Users!
We currently have a problem where I am not sure whether it has its cause
in Ceph or something else. First, some information about our ceph-setup:
* ceph version
has
been running for years without a problem.
-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
Christian Eichelmann
Sent: 20 April 2015 14:41
To: Nick Fisk; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] 100% IO Wait with CEPH RBD
On Tue, Apr 21, 2015 at 9:13 AM, Christian Eichelmann
christian.eichelm...@1und1.de wrote:
Hi Dan,
we are already back on the kernel module since the same problems were
happening with fuse. I had no special ulimit settings for the
fuse-process, so that could have been an issue there.
I'm using xfs on the rbd disks.
They are between 1 and 10TB in size.
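For completeness, the setup per image is the usual map/mkfs/mount
sequence (device name and mount point here are examples, not our actual
config):

$ rbd map foo
$ mkfs.xfs /dev/rbd0
$ mount /dev/rbd0 /mnt/rsync-target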
Am 20.04.2015 um 14:32 schrieb Nick Fisk:
Ah ok, good point
What FS are you using on the RBD?
-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
Christian Eichelmann
might stop this from
happening.
Nick
-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
Christian Eichelmann
Sent: 20 April 2015 08:29
To: ceph-users@lists.ceph.com
Subject: [ceph-users] 100% IO Wait with CEPH RBD and RSYNC
Hi Ceph-Users
Hi Ceph-Users!
We currently have a problem where I am not sure whether it has its cause
in Ceph or something else. First, some information about our ceph-setup:
* ceph version 0.87.1
* 5 MON
* 12 OSD servers with 60x2TB disks each
* 2 RSYNC Gateways with 2x10G Ethernet (Kernel: 3.16.3-2~bpo70+1, Debian
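To narrow down where the wait comes from, we watch the block devices
and the kernel log; generic diagnostics, nothing ceph-specific:

$ iostat -x 5      # high %util/await on /dev/rbd* points at the cluster
$ dmesg | grep 'blocked for more than 120 seconds'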
can put this to catch more
users? Or maybe a warning issued by the osds themselves or something if
they see limits that are low?
sage
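For anyone hitting this via search, the knobs in question are the shell
limit and the ceph.conf option; 131072 is only an example value, size it
to the number of OSDs per host:

$ ulimit -n        # current per-process file-descriptor limit

# in /etc/ceph/ceph.conf:
[global]
    max open files = 131072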
- Karan -
On 09 Mar 2015, at 14:48, Christian Eichelmann
christian.eichelm...@1und1.de wrote:
Hi Karan,
as you are actually writing in your own
http://www.csc.fi/
--
Christian Eichelmann
Systemadministrator
1&1 Internet AG - IT Operations Mail Media Advertising Targeting
The behaviour is exactly the same on our system, so it looks like the
same issue.
We are currently running Giant by the way (0.87).
plus many other OSDs like that.
--
Christian Eichelmann
Systemadministrator
1&1 Internet AG - IT Operations Mail Media Advertising Targeting
Brauerstraße 48 · DE-76135 Karlsruhe
Hi all,
during some failover and configuration tests, we discovered a strange
phenomenon:
Restarting one of our monitors (5 in sum) triggers about 300 of the
following events:
osd.669 10.76.28.58:6935/149172 failed (20 reports from 20 peers after
22.005858 >= grace 20.00)
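The settings that control this can be tuned; a sketch with example
values only, not recommendations:

# /etc/ceph/ceph.conf
[mon]
    mon osd min down reporters = 30   # require more peers before marking an OSD down
[osd]
    osd heartbeat grace = 30          # allow longer silence before a failure report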
at 9:45 AM, Gregory Farnum g...@gregs42.com wrote:
On Tue, Jan 20, 2015 at 2:40 AM, Christian Eichelmann
christian.eichelm...@1und1.de wrote:
Hi all,
I want to understand what Ceph does if several OSDs are down. First
off, some words about our setup:
We have 5 monitors and 12 OSD servers, each with 60x2TB disks. These
servers are spread across 4 racks in our datacenter. Every rack holds 3
OSD servers. We have a replication factor of
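For reference, a rack-level failure domain can be expressed with a
simple CRUSH rule; the rule name is my choice, and the pool assignment
needs the rule id from the dump:

$ ceph osd crush rule create-simple replicated-racks default rack
$ ceph osd crush rule dump replicated-racks   # note the rule_id
$ ceph osd pool set rbd crush_ruleset <rule-id>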
Hi all,
after our cluster problems with incomplete placement groups, we've
decided to remove our pools and create new ones. This was going fine in
the beginning. After adding an additional OSD server, we now have 2 PGs
that are stuck in the peering state:
HEALTH_WARN 2 pgs peering; 2 pgs stuck
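The commands we use to inspect them, with a made-up pg id:

$ ceph pg dump_stuck inactive
$ ceph pg 2.1f query | less   # the recovery_state section shows why peering is stuck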
--
Christian Eichelmann
Systemadministrator
1&1 Internet AG - IT Operations Mail Media Advertising Targeting
Brauerstraße 48 · DE-76135 Karlsruhe
Telefon: +49 721 91374-8026
Hi all,
as mentioned last year, our ceph cluster is still broken and unusable.
We are still investigating what has happened and I am taking a deeper
look into the output of ceph pg <pgnum> query.
The problem is that I can find some information about what some of the
sections mean, but mostly I
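To make the output easier to digest, I dump it once and pull out single
sections; jq is an extra dependency here and the pg id is made up:

$ ceph pg 2.2bc query > pg.json
$ jq '.recovery_state[] | .name, .enter_time' pg.json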
again.
To tell the truth, I guess that will result in the end of our ceph
project (running for nine months already).
Regards,
Christian
Am 29.12.2014 15:59, schrieb Nico Schottelius:
Hey Christian,
Christian Eichelmann [Mon, Dec 29, 2014 at 10:56:59AM +0100]:
[incomplete PG / RBD hanging, osd
for each pool,
also different OSDs, maybe this way you can overcome the issue.
Cheers
Eneko
On 30/12/14 12:17, Christian Eichelmann wrote:
Hi Nico and all others who answered,
After some more attempts to somehow get the pgs into a working state (I've
tried force_create_pg, which was putting
in the new pool's image
format?
On 30/12/14 12:31, Christian Eichelmann wrote:
Hi Eneko,
I was trying an rbd cp before, but that was hanging as well. But I
couldn't find out whether the source image or the destination image was
causing the hang. That's why I decided to try a POSIX copy.
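One way to separate the two, with example pool/image names: an export
to nowhere touches only the source, while a copy touches both:

$ rbd export rbd/foo - > /dev/null   # reads only the source image
$ rbd cp rbd/foo rbd/foo-copy        # reads the source, writes the destination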
Our cluster
Hi all,
we have a ceph cluster with currently 360 OSDs in 11 systems. Last week
we were replacing one OSD system with a new one. During that, we had a
lot of problems with OSDs crashing on all of our systems. But that is
not our current problem.
After we got everything up and running again, we
it
doesn't seem to have gotten much traction in terms of informing users.
Regards
Nathan
On 15/09/2014 7:13 PM, Christian Eichelmann wrote:
Hi all,
I have no idea why running out of file handles should produce an out of
memory error, but well. I've increased the ulimit as you told me, and
nothing
disks, with OSD SSD journals)
with that kind of case and enjoy the fact that my OSDs never fail. ^o^
Christian (another one)
On 9/12/2014 10:15 AM, Christian Eichelmann wrote:
Hi,
I am running all commands as root, so there are no limits for the
processes.
Regards,
Christian
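Side note: root is not exempt from per-process limits, so it is worth
checking what the running daemon actually has; this assumes a standard
ceph-osd process name:

$ cat /proc/$(pidof -s ceph-osd)/limits | grep 'open files'
$ ls /proc/$(pidof -s ceph-osd)/fd | wc -l   # descriptors currently in use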
Hi Ceph-Users,
I have absolutely no idea what is going on on my systems...
Hardware:
45 x 4TB hard disks
2 x 6 Core CPUs
256GB Memory
When initializing all disks and joining them to the cluster, after
approximately 30 OSDs, other OSDs start crashing. When I try to start
them again I see different
___
From: Mariusz Gronczewski [mariusz.gronczew...@efigence.com]
Sent: Friday, 12 September 2014 15:33
To: Christian Eichelmann
Cc: ceph-users
I can also confirm that after upgrading to firefly both of our clusters (test
and live) went from 0 scrub errors each for about six months to about 9-12
per week...
This also makes me kind of nervous, since as far as I know, all ceph pg
repair does is copy the primary object to all
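That is why, before repairing, one can check by hand which replica
differs; a sketch with a made-up object name, pg id and OSD ids:

$ ceph osd map rbd rb.0.123456.000000000001
# -> prints the pg id and the acting set, e.g. pg 2.5d6, acting [11,38,105]
$ find /var/lib/ceph/osd/ceph-11/current/2.5d6_head -name '*123456*' -exec md5sum {} \;
# repeat on the other acting OSDs and compare the checksums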
--
Christian Eichelmann
Systemadministrator
1&1 Internet AG - IT Operations Mail Media Advertising Targeting
Brauerstraße 48 · DE-76135 Karlsruhe
Telefon: +49 721 91374-8026
christian.eichelm...@1und1.de
Amtsgericht Montabaur / HRB 6484
Executive Board:
Hi ceph users,
since our cluster has had a few inconsistent PGs recently, I was
wondering what ceph pg repair does, depending on the replication level.
So I just wanted to check if my assumptions are correct:
Replication 2x
Since the cluster cannot decide which version is the correct one, it
Hi all,
after coming back from a long weekend, I found my production cluster in
an error state, mentioning 6 scrub errors and 6 PGs in
active+clean+inconsistent state.
Strangely, my pre-live cluster, running on different hardware, is also
showing 1 scrub error and 1 inconsistent pg...
Hi again,
just found the ceph pg repair command :) Now both clusters are OK again.
Anyway, I'm really interested in the cause of the problem.
Regards,
Christian
On 10.06.2014 10:28, Christian Eichelmann wrote:
Hi all,
after coming back from a long weekend, I found my production cluster
Hi Folks!
For those of you who are using ceph-dash
(https://github.com/Crapworks/ceph-dash), I've created a Nagios-Plugin,
that uses the json endpoint to monitor your cluster remotely:
* https://github.com/Crapworks/check_ceph_dash
I think this can be easily adapted to use the ceph-rest-api as
I have written a small and lightweight GUI, which can also act as a JSON REST
API (for non-interactive monitoring):
https://github.com/Crapworks/ceph-dash
Maybe that's what you're searching for.
Regards,
Christian
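If you want to poke at the JSON endpoint from the shell, something like
this should work; host and port are whatever your deployment uses (5000
is just Flask's default):

$ curl -s -H 'Accept: application/json' http://cephdash.example.com:5000/ | python -m json.tool | head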
From: ceph-users