This is being output by one of the kernel clients, and it's just
saying that the connections to those two OSDs have died from
inactivity. Presumably the other OSD connections are either used a lot
more or aren't used at all.
In any case, it's not a problem; just a noisy notification. There's
not much you
In particular, we changed things post-Firefly so that the filesystem
isn't created automatically. You'll need to set it up (and its pools,
etc) explicitly to use it.
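A sketch of the explicit setup, assuming a release that has 'ceph fs new'; the pool names, the PG count of 64, and the filesystem name are placeholders (see the createfs docs for details):

ceph osd pool create cephfs_data 64
ceph osd pool create cephfs_metadata 64
ceph fs new cephfs cephfs_metadata cephfs_data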
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
On Mon, Aug 25, 2014 at 2:40 PM, Sean Crosby
I don't think the log messages you're showing are the actual cause of
the failure. The log file should have a proper stack trace (with
specific function references and probably a listed assert failure);
can you find that?
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
On Tue,
Hmm, that all looks basically fine. But why did you decide not to
segregate OSDs across hosts (according to your CRUSH rules)? I think
maybe it's the interaction of your map, setting choose_local_tries to
0, and trying to go straight to the OSDs instead of choosing hosts.
But I'm not super
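For reference, a rule that does segregate replicas across hosts would normally use chooseleaf over host buckets; a rough sketch, with an illustrative rule name and ruleset number:

rule replicated_by_host {
        ruleset 1
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type host
        step emit
}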
On Tue, Aug 26, 2014 at 5:07 AM, m.channappa.nega...@accenture.com wrote:
Hello all,
I have configured a Ceph storage cluster.
1. I created a volume. I would like to know: will replication of data
happen automatically in Ceph?
2. How do I configure a striped volume using Ceph?
[Re-added the list.]
I believe you'll find everything you need at
http://ceph.com/docs/master/cephfs/createfs/
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
On Tue, Aug 26, 2014 at 1:25 PM, LaBarre, James (CTR) A6IT
james.laba...@cigna.com wrote:
So is there a link
Hi Greg,
Good question: I started with a single-node test and had just left the setting
in across larger configs, as in earlier versions (e.g. Emperor) it didn't seem
to matter. I also had the same thought that it could be causing an issue with
the new default tunables in Firefly and did try
baijia...@126.com
Hello,
On Tue, 26 Aug 2014 20:21:39 +0200 Loic Dachary wrote:
Hi Craig,
I assume the reason for the 48-hour recovery time is to keep the cost
of the cluster low? I wrote 1h recovery time because it is roughly
the time it would take to move 4TB over a 10Gb/s link. Could you upgrade
your
On Tue, Aug 26, 2014 at 5:10 PM, Konrad Gutkowski
konrad.gutkow...@ffs.pl wrote:
Ceph-deploy should set a priority for the Ceph repository, but it doesn't;
as a result, the best available version from any repository usually gets installed.
Thanks Konrad for the tip. It took several goes (notably ceph-deploy
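One way to avoid that on Debian/Ubuntu is to pin the Ceph repository; a sketch, assuming the packages come from ceph.com (the file path is illustrative):

# /etc/apt/preferences.d/ceph.pref
Package: *
Pin: origin "ceph.com"
Pin-Priority: 1001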
Hello,
On Tue, 26 Aug 2014 16:12:11 +0200 Loic Dachary wrote:
Using percentages instead of numbers led me to calculation errors.
Here it is again using 1/100 instead of % for clarity ;-)
Assuming that:
* The pool is configured for three replicas (size = 3 which is the
default)
* It
Hi Gregory Farnum,
Thank you for your reply!
This is the log:
2014-08-26 16:22:39.103461 7f083752f700 -1 mds/CDir.cc: In function 'void
CDir::_committed(version_t)' thread 7f083752f700 time 2014-08-26
16:22:39.075809
mds/CDir.cc: 2071: FAILED assert(in->is_dirty() || in->last < ((__u64)(-2)))
In the docs [1], 'incomplete' is defined thusly:
Ceph detects that a placement group is missing a necessary period of
history from its log. If you see this state, report a bug, and try
to start any failed OSDs that may contain the needed information.
However, during an extensive review of
Hi all,
Is there a way to get rbd.ko and ceph.ko for CentOS 6.x,
or do I have to build them from source? What is the minimum kernel version required?
Thanks
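A quick way to check whether a given kernel already ships the modules (if these report the module is not found, the stock kernel does not include it):

modinfo rbd
modinfo ceph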
Hi.
I, like many people, use fio.
For Ceph RBD there is a special engine:
https://telekomcloud.github.io/ceph/2014/02/26/ceph-performance-analysis_fio_rbd.html
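A minimal job file for the rbd engine, along the lines of that article; the cephx user, pool name, and pre-created image name here are assumptions:

; rbd.fio
[global]
ioengine=rbd
clientname=admin
pool=rbd
rbdname=fio_test
rw=randwrite
bs=4k
iodepth=32

[rbd-job]

Run it with 'fio rbd.fio'.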
2014-08-26 12:15 GMT+04:00 yuelongguang fasts...@163.com:
Hi all,
I am planning to do a test on Ceph, including performance, throughput,
Hi,
In the meantime I already tried upgrading the cluster to 0.84, to
see if that made a difference, and it seems it did.
I can't reproduce the crashing OSDs by doing a 'rados -p ecdata ls' anymore.
But now the cluster detects it is inconsistent:
cluster
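The usual commands for inspecting and repairing inconsistent placement groups are along these lines (the pgid is a placeholder):

ceph health detail    # lists the inconsistent PGs
ceph pg repair <pgid> # asks the primary OSD to repair one of them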
Hi all,
there are a zillion OSD bug fixes. Things are looking pretty good for the
Giant release that is coming up in the next month.
Any chance of having a compilable cephfs kernel module for el7 for the
next major release?
stijn
Move the logs onto the SSD and you will immediately increase performance;
you are losing about 50% of performance to the logs. Also, for three
replicas, more than 5 hosts are recommended.
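If this means moving the OSD journals to SSD (as clarified later in the thread), the relevant ceph.conf setting is 'osd journal'; a sketch with an illustrative path, which only takes effect when an OSD's journal is (re)created:

[osd]
osd journal = /srv/ssd/$cluster-$id/journal
osd journal size = 10240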
2014-08-26 12:17 GMT+04:00 Mateusz Skała mateusz.sk...@budikom.net:
Hi, thanks for the reply.
From the top of my
Hmm, it looks like you hit this bug (http://tracker.ceph.com/issues/9223).
Sorry for the late message; I forgot that this fix was merged into 0.84.
Thanks for your patience :-)
On Tue, Aug 26, 2014 at 4:39 PM, Kenneth Waegeman
kenneth.waege...@ugent.be wrote:
Hi,
In the meantime I already
Do you mean to move /var/log/ceph/* to an SSD disk?
Hello,
On Tue, 26 Aug 2014 10:23:43 +1000 Blair Bethwaite wrote:
Message: 25
Date: Fri, 15 Aug 2014 15:06:49 +0200
From: Loic Dachary l...@dachary.org
To: Erik Logtenberg e...@logtenberg.eu, ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Best practice K/M-parameters EC pool
I'm sorry, of course I meant the journals :)
2014-08-26 13:16 GMT+04:00 Mateusz Skała mateusz.sk...@budikom.net:
Do you mean to move /var/log/ceph/* to an SSD disk?
Hello Gentlemen :-)
Let me point out one important aspect of this low-performance problem:
of all 4 nodes of our Ceph cluster, only one node shows bad metrics,
that is, very high latency on its OSDs (from 200-600ms), while the other
three nodes behave normally, that is, the latency of their OSDs is
Thanks, Irek Fasikhov.
Is it the only way to test ceph-rbd? An important aim of the test is to
find where the bottleneck is: qemu/librbd/ceph.
Could you share your test results with me?
Thanks
On 2014-08-26 04:22:22, Irek Fasikhov malm...@gmail.com wrote:
Hi.
I, like many people
For me, the bottleneck is single-threaded operation. Writes are more or
less solved by enabling the RBD cache, but there are problems with
reads. I think those problems can be solved with a cache pool, but I
have not tested it.
It follows that the more threads, the greater the
Sorry... Enter pressed :)
Continued...
No, it's not the only way to test, but it depends on what you want to use Ceph
2014-08-26 15:22 GMT+04:00 Irek Fasikhov malm...@gmail.com:
For me, the bottleneck is single-threaded operation. Writes are more or
less solved by enabling the
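For reference, the RBD cache mentioned above is enabled client-side in ceph.conf; a minimal sketch:

[client]
rbd cache = true
rbd cache writethrough until flush = true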
Hi all,
I have 5 OSDs and 3 mons; status was OK until now.
To be mentioned, this cluster has no data; I just deployed it to get
familiar with some command lines.
What is the problem and how do I fix it?
Thanks
--- environment ---
ceph-release-1-0.el6.noarch
ceph-deploy-1.5.11-0.noarch
Hi Blair,
Assuming that:
* The pool is configured for three replicas (size = 3 which is the default)
* It takes one hour for Ceph to recover from the loss of a single OSD
* Any other disk has a 0.001% chance to fail within the hour following the
failure of the first disk (assuming AFR
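A back-of-the-envelope sketch of where a 0.001%-per-hour figure can come from; the AFR value here is an assumption for illustration, not from the thread:

# Python: per-hour failure probability implied by an annualized failure rate
afr = 0.0876                  # assumed ~8.76% AFR
hours_per_year = 24 * 365     # 8760
p_hour = afr / hours_per_year
print(p_hour)                 # 1e-05, i.e. 0.001% per hour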
Using percentages instead of numbers led me to calculation errors. Here it is
again using 1/100 instead of % for clarity ;-)
Assuming that:
* The pool is configured for three replicas (size = 3 which is the default)
* It takes one hour for Ceph to recover from the loss of a single OSD
* Any
Hi all,
I have a cluster of 2 nodes on CentOS 6.5 with Ceph 0.67.10 (replication = 2).
When I add the 3rd node to the Ceph cluster, Ceph performs load balancing.
I have 3 MDSes on 3 nodes; the MDS process dies after a while with a
stack trace:
How far out are your clocks? It's showing a clock skew; if they're too
far out it can cause issues with cephx.
Otherwise you're probably going to need to check your cephx auth keys.
-Michael
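Two quick checks along those lines:

ceph health detail    # names the mon(s) reporting clock skew and by how much
ntpq -p               # on each mon host, verify NTP peers and offsets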
On 26/08/2014 12:26, yuelongguang wrote:
Hi all,
I have 5 OSDs and 3 mons; status was OK until now.
To be
My OSD rebuild time is more like 48 hours (4TB disks, 60% full, osd max
backfills = 1). I believe that increases my risk of failure by 48^2.
Since your numbers are failure rate per hour per disk, I need to consider
the risk for the whole time for each disk. So more formally, rebuild time
to
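A small sketch of that scaling argument, with illustrative numbers: with three replicas, losing data needs two further failures inside the recovery window, and each factor grows linearly with the window, hence roughly 48^2:

# Python: double-failure risk vs. recovery window length
p_hour = 1e-5                  # illustrative per-disk failure prob per hour
for window in (1, 48):         # recovery window in hours
    p_disk = p_hour * window   # per-disk probability over the window
    print(window, p_disk ** 2) # two further failures scale as window^2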
I had a similar problem once. I traced it to a failed battery
on my RAID card, which disabled write caching. One of the many things I
need to add to monitoring.
On Tue, Aug 26, 2014 at 3:58 AM, pawel.orzechow...@budikom.net wrote:
Hello Gentelmen:-)
Let me point one important
Hi Craig,
I assume the reason for the 48-hour recovery time is to keep the cost of the
cluster low? I wrote 1h recovery time because it is roughly the time it
would take to move 4TB over a 10Gb/s link. Could you upgrade your hardware to
reduce the recovery time to less than two hours? Or
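Checking the 1h figure, ignoring protocol overhead and assuming the link is the only bottleneck:

# Python: time to move 4 TB over a 10 Gb/s link
bits = 4e12 * 8              # 4 TB in bits
rate = 10e9                  # 10 Gb/s
print(bits / rate / 3600)    # ~0.89 hours, i.e. roughly one hour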
Ok, after some delays and the move to new network hardware I have an
update. I'm still seeing the same low bandwidth and high retransmissions
from iperf after moving to the Cisco 6001 (10Gb) and 2960 (1Gb). I've
narrowed it down to transmissions from a 10Gb connected host to a 1Gb
connected host.