I think the root CA (COMODO RSA Certification Authority) is not available on
your Linux host? Connecting to https://ceph.com/ with Google Chrome works
fine.
No, it's a wget bug. I have now switched to LWP::UserAgent and it works perfectly.
___
I get the following error on standard Debian Wheezy
# wget https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc
--2015-02-13 07:19:04-- https://ceph.com/git/?p=ceph.git
Resolving ceph.com (ceph.com)... 208.113.241.137, 2607:f298:4:147::b05:fe2a
Connecting to ceph.com
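(Side note, not from the original thread: independent of any certificate issue, the unquoted URL above is split by the shell at the semicolons, so wget only requests https://ceph.com/git/?p=ceph.git. A minimal sketch of a quoted invocation, optionally pointing wget at Debian's CA bundle:)
# wget -O release.asc 'https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc'
# wget --ca-certificate=/etc/ssl/certs/ca-certificates.crt -O release.asc 'https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc'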
The more I think about this problem, the less I think there'll be an easy
answer; it's more likely that I'll have to reproduce the scenario and
actually pause myself next time in order to troubleshoot it.
It is even possible to simulate those crush problems. I reported a few examples
long
Hi all,
I just noticed that a snapshot rollback produces very high load on small
clusters. It seems all OSDs copy data at full speed, and client access speed
drops from 480MB/s to 10MB/s.
Is there a way to limit rollback speed/priority?
___
Ceph has nothing to do with an HA cluster based on Pacemaker.
It has a completely different logic built in.
The only similarity is that both use a quorum algorithm to detect split-brain
situations.
I am talking about cluster services like 'corosync', which provide membership and
quorum services.
For
Does Ceph rely on any multicasting? Appreciate the feedback.
Nope! All networking is point-to-point.
Besides, it would be great if ceph could use existing cluster stacks like
corosync, ...
Is there any plan to support that?
___
Some projects manually modify PRUNEPATHS in the init script, for example:
http://git.openvz.org/?p=vzctl;a=commitdiff;h=47334979b9b5340f84d84639b2d77a8a1f0bb7cf
It sounds like what is needed here is for the deb and rpm packages to add
/var/lib/ceph to the PRUNEPATHS in /etc/updatedb.conf.
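(Not from the original mail, just a sketch of the manual workaround on a single host; the existing PRUNEPATHS value shown here is only an example:)
# grep PRUNEPATHS /etc/updatedb.conf
PRUNEPATHS="/tmp /var/spool /media"
# sed -i 's|^PRUNEPATHS="|PRUNEPATHS="/var/lib/ceph |' /etc/updatedb.conf
# grep PRUNEPATHS /etc/updatedb.conf
PRUNEPATHS="/var/lib/ceph /tmp /var/spool /media"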
I am unable to start my OSDs on one node:
osd/PGLog.cc: 672: FAILED assert(last_e.version.version < e.version.version)
Does that mean there is something wrong with my journal disk? Or why can such a
thing happen?
After rebooting other nodes, all my OSDs are offline, showing exactly the same
After enabling debugging, I get:
...
-4> 2014-02-12 09:43:44.739648 7f7f8b848780 20 read_log 6100'1677
(6100'1676) modify 85949a17/rbd_data.dd6592ae8944a.01bd/head//25
by client.890681.0:76884 2014-01-26 16:44:08.412457
-3> 2014-02-12 09:43:44.739670 7f7f8b848780 20 read_log
This sounds like a bug that introduced an entry into the pg log that is not ordered
properly. I don't think I've seen this before... Sam, have you?
How many OSDs do you have?
12 OSDs, 3 nodes
Can you set 'debug osd = 20' in your ceph.conf, restart and reproduce the
crash,
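(As a sketch, not from the original mail: the setting goes into the [osd] section of ceph.conf before restarting, or can be injected into a running daemon, though that only helps if the OSD stays up long enough; osd.4 here is just the OSD from the log below:)
[osd]
    debug osd = 20
# ceph tell osd.4 injectargs '--debug-osd 20'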
The log I sent
OK, I upload the log to our server:
ftp://download.proxmox.com/tmp/ceph-osd.4.log
-----Original Message-----
From: ceph-users-boun...@lists.ceph.com [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Dietmar Maurer
Sent: Wednesday, February 12, 2014 18:41
To: Sage Weil
Cc: ceph-users
It would be great to get two logs from two different crashing OSDs for
comparison purposes.
ftp://download.proxmox.com/tmp/ceph-osd.4.log
ftp://download.proxmox.com/tmp/ceph-osd.10.log
and post the log somewhere? (You can use 'ceph-post-file <filename>' to
send it to us.)
# ceph-post-file
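(A usage sketch for completeness; the log path is just an example:)
# ceph-post-file /var/log/ceph/ceph-osd.4.log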
I guess I should also mention that there was a misconfiguration in
the network MTU setting of one
Did you change/check the crush map?
The crush map is OK (not changed).
___
I am unable to start my OSDs on one node:
osd/PGLog.cc: 672: FAILED assert(last_e.version.version < e.version.version)
Does that mean there is something wrong with my journal disk? Or why can such a
thing happen?
Here is the OSD log:
...
2014-02-12 07:04:39.376993 7f8236afe780 0 cls
On my test cluster, some PGs are stuck unclean forever (pool 24, size=2).
Directory /var/lib/ceph/osd/ceph-X/current/24.126_head/ is empty on all OSDs.
Any idea what is wrong? And how can I recover from that state?
The interesting thing is that all OSDs are up, and those PGs do not list
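(Not from the original mail, but the usual way to look at such PGs; the PG id 24.126 is taken from the directory name above:)
# ceph health detail
# ceph pg dump_stuck unclean
# ceph pg map 24.126
# ceph pg 24.126 query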
I think it would be great to add some OSD statistics (IO/s, ...); I think it's
possible through the Ceph API.
You can see IO/s in the log.
I also added latency stats for OSDs recently.
Also, maybe an email alerting system if an OSD state changes (up/down/...)?
Yes, and SMART,
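(As a pointer, not from the original mail: depending on the version, some of this is already exposed on the command line and via the admin socket; a sketch:)
# ceph osd perf        # per-OSD commit/apply latency
# ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok perf dump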
When using a pool size of 3, I get the following behavior when one OSD
fails:
* the affected PGs get marked active+degraded
* there is no data movement/backfill
Works as designed, if you have the default crush map in place (all replicas
must
be on DIFFERENT hosts). You need to
Sorry, it seems as if I had misread your question: Only a single OSD fails,
not the
whole server?
Yes, only a single OSD is down and marked out.
Then there should definitely be backfilling in place.
No, this does not happen. Many PGs stay in a degraded state (I tested this
several
Are you aware of this?
http://ceph.com/docs/master/rados/troubleshooting/troubleshooting-osd/
(see the section "Stopping w/out Rebalancing")
What do you think is wrong with my setup? I want to re-balance. The problem is
that it does not
happen at all!
I do exactly the same test with and without 'ceph osd
From: ceph-users-boun...@lists.ceph.com [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Dietmar Maurer
Sent: Tuesday, January 14, 2014 10:40
To: Wolfgang Hennerbichler; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] 3 node setup with pools size=3
Are you aware of this?
http://ceph.com/docs/master/rados/troubleshooting
It seems that marking an OSD as 'out' has different effects than removing the OSD
from the crush map.
I guess weights are not changed if the OSD is marked out?
So how can I test that with crushtool?
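(A sketch of one way to do that with crushtool; rule number 0 and the choice of osd.4 are assumptions, and as far as I know --weight <id> 0 simulates the device being marked out without touching its crush weight:)
# ceph osd getcrushmap -o crushmap.bin
# crushtool -i crushmap.bin --test --rule 0 --num-rep 3 --show-utilization
# crushtool -i crushmap.bin --test --rule 0 --num-rep 3 --show-utilization --weight 4 0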
___
We observe strange behavior with some configurations. PGs stay in a degraded
state after a single OSD failure.
I can also show the behavior using crushtool with the following map:
--crush map-
# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
I am still playing around with a small setup using 3 Nodes, each running 4 OSDs
(=12 OSDs).
When using a pool size of 3, I get the following behavior when one OSD fails:
* the affected PGs get marked active+degraded
* there is no data movement/backfill
Note: using 'ceph osd crush tunables
From the docs:
step [choose|chooseleaf] [firstn|indep] N bucket-type
What exactly is the difference between 'firstn' and 'indep'?
___
For Ceph releases up to Emperor[1], firstn is used and I'm not aware of a use
case requiring indep. As part of the effort to implement erasure coded pools,
firstn[2] and indep[3] were separated into two functions. The firstn method is
best suited for replicated pools. The indep method tries to
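(To illustrate, not from the original mail: in the rule text the only difference is the keyword; the comments describe my understanding of the behavior:)
step chooseleaf firstn 0 type host   # replicated pools: if a pick fails, later picks shift up
step chooseleaf indep 0 type host    # erasure coded pools: a failed pick is retried in place, other positions stay put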
the following distribution:
device 0: 423
device 1: 453
device 2: 430
device 3: 455
device 4: 657
device 5: 654
The host with only one OSD gets too much data.
On Fri, 3 Jan 2014, Dietmar Maurer wrote:
In both cases, you only get 2 replicas on the remaining 2 hosts
The host with only one OSD gets too much data.
I think this is just fundamentally a problem with distributing 3
replicas over only 4 hosts. Every piece of data in the system needs
to include either host 3 or 4 (and thus device 4 or 5) in order to
have 3 replicas (on separate hosts). Add more hosts or disks and the
distribution will
I try to understand the default crush rule:
rule data {
ruleset 0
type replicated
min_size 1
max_size 10
step take default
step chooseleaf firstn 0 type host
step emit
}
Is this the same as:
rule data {
ruleset 0
type
Having your journals on the same disk causes all data to be written twice,
i.e.
once to the journal and once to the
osd store. Notice that your tested throughput is slightly more than half your
expected maximum...
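(A rough worked example with made-up round numbers, just to illustrate the halving:)
disk sequential write:            ~110 MB/s
journal + data on the same disk:  ~110 / 2 = ~55 MB/s usable, minus extra seeks between journal and data area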
But AFAIK OSD bench already considers journal writes. The disk can write
IIRC, chooseleaf goes down the tree and descends into multiple leaves
to find what you are looking for.
choose goes into that leaf and tries to find what you are looking for
without going into subtrees.
Right. To a first approximation, these rules are equivalent. The difference
is
The other difference is if you have one of the two OSDs on the host marked
out.
In the choose case, the remaining OSD will get allocated 2x the data; in the
chooseleaf case, usage will remain proportional with the rest of the cluster
and
the data from the out OSD will be distributed across
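(For reference, the usual expansion of a chooseleaf rule into explicit choose steps looks roughly like this; as noted above, it is equivalent only to a first approximation:)
step take default
step choose firstn 0 type host
step choose firstn 1 type osd
step emit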
-----Original Message-----
From: Stefan Priebe [mailto:s.pri...@profihost.ag]
Sent: Thursday, January 2, 2014 18:36
To: Dietmar Maurer; Dino Yancey
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] rados benchmark question
Hi,
On 02.01.2014 17:10, Dietmar Maurer wrote:
# iostat -x 5 (after about 30 seconds)
Device:  rrqm/s  wrqm/s  r/s   w/s     rkB/s  wkB/s     avgrq-sz  avgqu-sz  await   r_await  w_await  svctm  %util
sdb      0.00    3.80    0.00  187.40  0.00   84663.60  903.56    157.62    796.93  0.00     796.93   5.34   100.00
So your disks are completely utilized and can't keep up; see %util and
await.
But it says it writes at 80MB/s, so that would be about 40MB/s for
data? And 40*6=240 (not 190)
Did you miss the replication factor? I think it should be:
40MB/s*6/3 = 80MB/s
My test pool uses size=1 (no replication).
OK, I'm out of ideas... ;-( Sorry.
What values do you get? (osd bench vs. rados bench with pool size=1)
___
In both cases, you only get 2 replicas on the remaining 2 hosts.
OK, I was able to reproduce this with crushtool.
The difference is if you have 4 hosts with 2 osds. In the choose case, you
have
some fraction of the data that chose the down host in the first step (most of
the
attempts,
I also don't really understand why crush selects OSDs with weight=0
host prox-ceph-3 {
id -4 # do not change unnecessarily
# weight 3.630
alg straw
hash 0 # rjenkins1
item osd.4 weight 0
}
root default {
id -1 # do not
Hi all,
I run 3 nodes connected with a 10Gbit network, each running 2 OSDs.
Disks are 4TB Seagate Constellation ST4000NM0033-9ZM (xfs, journal on same
disk).
# ceph tell osd.0 bench
{ "bytes_written": 1073741824,
  "blocksize": 4194304,
  "bytes_per_sec": 56494242.00}
So a single OSD can write
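(Converting the figure above, as a side note: 56494242 bytes/s / (1024*1024) ≈ 54 MB/s per OSD.)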