On 08-11-12 08:29, Travis Rhoden wrote:
Hey folks,
I'm trying to set up a brand new Ceph cluster, based on v0.53. My
hardware has SSDs for journals, and I'm trying to get mkcephfs to
initialize everything for me. However, the command hangs forever and I
eventually have to kill it.
After
On 08/11/12 21:08, Wido den Hollander wrote:
On 08-11-12 08:29, Travis Rhoden wrote:
Hey folks,
I'm trying to set up a brand new Ceph cluster, based on v0.53. My
hardware has SSDs for journals, and I'm trying to get mkcephfs to
initialize everything for me. However, the command hangs forever
On 08.11.2012 01:59, Mark Nelson wrote:
There's also the context switching overhead. It'd be interesting to
know how much the writer processes were shifting around on cores.
What do you mean by that? I'm talking about the KVM guest not about the
ceph nodes.
Stefan, what tool were you
done:
http://tracker.newdream.net/issues/3461
On 08.11.2012 04:09, Josh Durgin wrote:
On 11/07/2012 08:26 AM, Stefan Priebe wrote:
On 07.11.2012 16:04, Mark Nelson wrote:
Whew, glad you found the problem, Stefan! I was starting to wonder what
was going on. :) Do you mind filing a bug
What do you mean by that? I'm talking about the KVM guest not about the
ceph nodes.
Have you tried comparing virtio-blk and virtio-scsi?
Have you tried directly from the host with the rbd kernel module?
- Original Mail -
From: Stefan Priebe - Profihost AG
On 08.11.2012 09:58, Alexandre DERUMIER wrote:
What do you mean by that? I'm talking about the KVM guest not about the
ceph nodes.
Have you tried comparing virtio-blk and virtio-scsi?
How do I change that? Right now I'm using the PVE defaults = scsi-hd.
Have you tried directly from the
Hello list,
is there any preferred way to do clock synchronisation?
I've tried running openntpd and ntpd on all servers but I'm still getting:
2012-11-08 09:55:38.255928 mon.0 [WRN] message from mon.2 was stamped
0.063136s in the future, clocks not synchronized
2012-11-08 09:55:39.328639 mon.0
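A minimal sketch of how such residual skew is often handled, assuming all
nodes sync against one common NTP server; the option below exists in ceph,
but the 0.1s value is only an illustrative assumption, and tightening the
actual clock sync is the better fix:

  [mon]
  ; assumption: tolerate up to 100 ms of monitor clock skew before warning
  mon clock drift allowed = 0.1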
Have you tried comparing virtio-blk and virtio-scsi?
How do I change that? Right now I'm using the PVE defaults = scsi-hd.
(virtio-blk is classic virtio ;)
Have you tried directly from the host with the rbd kernel module?
No, I don't know how to use it ;-)
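For anyone wanting to try it, a minimal sketch of a test directly from the
host with the rbd kernel module, assuming a pool named rbd and an existing
image named test; the device name /dev/rbd0 is an assumption to verify with
rbd showmapped:

  modprobe rbd
  rbd map test --pool rbd      # exposes the image as a /dev/rbd* block device
  rbd showmapped               # confirm which device it got
  fio --name=randwrite --filename=/dev/rbd0 --rw=randwrite --bs=4k --direct=1 --runtime=60
  rbd unmap /dev/rbd0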
On 08.11.2012 10:05, Alexandre DERUMIER wrote:
Have you tried comparing virtio-blk and virtio-scsi?
How do I change that? Right now I'm using the PVE defaults = scsi-hd.
(virtio-blk is classic virtio ;)
Have you tried directly from the host with the rbd kernel module?
No, I don't know how
On 08.11.2012 12:14, Adam Ochmański wrote:
Hi,
our test cluster gets stuck every time one of our OSD hosts goes
down; when the missing OSD comes back up and recovery reaches 100%, the
cluster is still not working properly.
I forgot to add the version of ceph I use: 0.53-422-g2d20f3a
--
Best,
blink
--
On 08-11-12 10:04, Stefan Priebe - Profihost AG wrote:
Hello list,
is there any preferred way to do clock synchronisation?
I've tried running openntpd and ntpd on all servers but I'm still getting:
2012-11-08 09:55:38.255928 mon.0 [WRN] message from mon.2 was stamped
0.063136s in the future,
On Thu, Nov 8, 2012 at 4:00 PM, Wido den Hollander w...@widodh.nl wrote:
On 08-11-12 10:04, Stefan Priebe - Profihost AG wrote:
Hello list,
is there any preferred way to do clock synchronisation?
I've tried running openntpd and ntpd on all servers but I'm still getting:
2012-11-08
On 08.11.2012 13:00, Wido den Hollander wrote:
On 08-11-12 10:04, Stefan Priebe - Profihost AG wrote:
Hello list,
is there any preferred way to do clock synchronisation?
I've tried running openntpd and ntpd on all servers but I'm still
getting:
2012-11-08 09:55:38.255928 mon.0 [WRN]
From: Yan, Zheng zheng.z@intel.com
Expiring a log segment before it's fully flushed may cause various
issues during log replay.
Signed-off-by: Yan, Zheng zheng.z@intel.com
---
src/leveldb | 2 +-
src/mds/MDLog.cc | 8 +---
2 files changed, 6 insertions(+), 4 deletions(-)
diff
On 11/08/2012 02:45 AM, Stefan Priebe - Profihost AG wrote:
On 08.11.2012 01:59, Mark Nelson wrote:
There's also the context switching overhead. It'd be interesting to
know how much the writer processes were shifting around on cores.
What do you mean by that? I'm talking about the KVM guest
On Nov 8, 2012, at 3:22 AM, Gandalf Corvotempesta
gandalf.corvotempe...@gmail.com wrote:
2012/11/8 Mark Nelson mark.nel...@inktank.com:
I haven't done much with IPoIB (just RDMA), but my understanding is that it
tends to top out at like 15Gb/s. Some others on this mailing list can
probably
These three changes are pretty trivial. -Alex
[PATCH 1/3] rbd: standardize rbd_request variable names
[PATCH 2/3] rbd: standardize ceph_osd_request variable names
[PATCH 3/3] rbd: be picky about osd request status type
There are two names used for variables of the rbd_request structure type:
req and req_data. The former name is also used for variables that are
pointers to struct ceph_osd_request.
Change all variables that have these names so they are instead
called rbd_req consistently.
Signed-off-by: Alex Elder
There are spots where a ceph_osd_request pointer variable is given
the name req. Since we're dealing with (at least) three types of
requests (block layer, rbd, and osd), I find this slightly
distracting.
Change such instances to use osd_req consistently to make the
abstraction represented a
The result field in a ceph osd reply header is a signed 32-bit type,
but rbd code often casually uses int to represent it.
The following changes the types of variables that handle this result
value to be s32 instead of int to be completely explicit about
it. Only at the point we pass that result
Some refactoring to improve readability. -Alex
[PATCH 1/2] rbd: encapsulate handling for a single request
[PATCH 2/2] rbd: a little more cleanup of rbd_rq_fn()
Now that a big hunk in the middle of rbd_rq_fn() has been moved
into its own routine we can simplify it a little more.
Signed-off-by: Alex Elder el...@inktank.com
---
drivers/block/rbd.c | 50 +++---
1 file changed, 23 insertions(+), 27 deletions(-)
Only one of the three callers of rbd_do_request() provides a
collection structure to aggregate status.
If an error occurs in rbd_do_request(), have the caller
take care of calling rbd_coll_end_req() if necessary in
that one spot.
Signed-off-by: Alex Elder el...@inktank.com
---
On 11/08/2012 07:55 AM, Atchley, Scott wrote:
On Nov 8, 2012, at 3:22 AM, Gandalf Corvotempesta
gandalf.corvotempe...@gmail.com wrote:
2012/11/8 Mark Nelson mark.nel...@inktank.com:
I haven't done much with IPoIB (just RDMA), but my understanding is that it
tends to top out at like 15Gb/s.
Is there any way to find out why a ceph-osd process takes around 10
times more load on rand 4k writes than on 4k reads?
Stefan
On 07.11.2012 21:41, Stefan Priebe wrote:
Hello list,
while benchmarking I was wondering why the ceph-osd load is so
extremely high during random 4k write
On Thu, 8 Nov 2012, Stefan Priebe - Profihost AG wrote:
Is there any way to find out why a ceph-osd process takes around 10 times more
load on rand 4k writes than on 4k reads?
Something like perf or oprofile is probably your best bet. perf can be
tedious to deploy, depending on where your
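A minimal sketch of such a perf session, assuming perf is available and only
one ceph-osd runs on the host; the 30-second window is arbitrary:

  # sample the running OSD with call graphs while the 4k random-write test runs
  perf record -g -p $(pidof ceph-osd) -- sleep 30
  # then inspect where the CPU time went
  perf report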
[osd]
osd journal size = 4000
Not sure if this is the problem, but when using a block device you don't
have to specify the size for the journal.
So happy to know that, Wido! I had hoped there was a way to skip that.
Tried without it -- only difference in the logs was seeing that
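A minimal sketch of the two journal variants being discussed, with
hypothetical device and file paths; only the file-backed journal needs an
explicit size (in MB):

  [osd.0]
  ; journal on a raw SSD partition: no size setting needed
  osd journal = /dev/sdb1

  [osd.1]
  ; journal in a file: size still required
  osd journal = /srv/osd.1/journal
  osd journal size = 4000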
On Nov 8, 2012, at 10:00 AM, Scott Atchley atchle...@ornl.gov wrote:
On Nov 8, 2012, at 9:39 AM, Mark Nelson mark.nel...@inktank.com wrote:
On 11/08/2012 07:55 AM, Atchley, Scott wrote:
On Nov 8, 2012, at 3:22 AM, Gandalf Corvotempesta
gandalf.corvotempe...@gmail.com wrote:
2012/11/8
On 08.11.2012 14:19, Mark Nelson wrote:
On 11/08/2012 02:45 AM, Stefan Priebe - Profihost AG wrote:
On 08.11.2012 01:59, Mark Nelson wrote:
There's also the context switching overhead. It'd be interesting to
know how much the writer processes were shifting around on cores.
What do you
On 08.11.2012 16:01, Mark Nelson wrote:
Hi Stefan,
You might want to try running sysprof or perf while the OSDs are running
during the tests and see where CPU time is being spent. Also, how are
you determining how much CPU is being used?
Hi Mark,
I have a 300MB perf.data file and no
So it is a problem with KVM, which lets the processes jump between cores a
lot.
Maybe numad from Red Hat can help?
http://fedoraproject.org/wiki/Features/numad
It tries to keep a process on the same NUMA node and I think it also does
some dynamic pinning.
- Original Mail -
From: Stefan
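For comparison, a minimal sketch of doing the pinning by hand instead of via
numad, assuming a libvirt-managed guest and NUMA node 0; the domain name, node
and vcpu numbers are placeholders:

  # start the guest bound to one NUMA node, CPUs and memory alike
  numactl --cpunodebind=0 --membind=0 <qemu command line>
  # or pin the vcpu threads of an already running libvirt guest
  virsh vcpupin <domain> 0 0
  virsh vcpupin <domain> 1 1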
On 11/08/2012 09:45 AM, Stefan Priebe - Profihost AG wrote:
On 08.11.2012 16:01, Sage Weil wrote:
On Thu, 8 Nov 2012, Stefan Priebe - Profihost AG wrote:
Is there any way to find out why a ceph-osd process takes around 10
times more
load on rand 4k writes than on 4k reads?
Something like
On Thu, Nov 8, 2012 at 7:02 PM, Atchley, Scott atchle...@ornl.gov wrote:
On Nov 8, 2012, at 10:00 AM, Scott Atchley atchle...@ornl.gov wrote:
On Nov 8, 2012, at 9:39 AM, Mark Nelson mark.nel...@inktank.com wrote:
On 11/08/2012 07:55 AM, Atchley, Scott wrote:
On Nov 8, 2012, at 3:22 AM,
Hi Liu,
Sorry for the late reply; I have had a very busy week. :)
On Thu, 1 Nov 2012, liu yaqi wrote:
Dear Mr. Weil
I am a student at the Institute of Computing Technology, Chinese Academy of
Sciences, and I am studying how snapshots are implemented in the Ceph system.
There are some things that
I have a branch for review that reworks the tests for the java bindings
and builds them if both --enable-cephfs-java and --with-debug are
specified. The tests can also be built and run via ant.
Branch name is wip-java-tests.
Regards,
-Joe Buck
Merged, thanks!
sage
On Thu, 8 Nov 2012, Joe Buck wrote:
I have a branch for review that reworks the tests for the java bindings and
builds them if both --enable-cephfs-java and --with-debug are specified. The
tests can also be built and run via ant.
Branch name is wip-java-tests.
Solved!
I stumbled into the solution while switching from a block device to a
file. I was being bitten by running mkcephfs multiple times -- it wasn't
really failing on the journal, it was failing because the OSD data
disk had been initialized before. I couldn't see that until I used a
file for the
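A minimal sketch of avoiding that trap when re-running mkcephfs against the
same disks, with placeholder device and mount paths; the wipe steps are
destructive and assume the old data is disposable:

  # recreate the OSD data filesystem so mkcephfs starts from a clean slate
  umount /srv/osd.0 2>/dev/null
  mkfs.xfs -f /dev/sdc1
  # clear any stale journal header on the SSD partition as well
  dd if=/dev/zero of=/dev/sdb1 bs=1M count=10
  mkcephfs -a -c /etc/ceph/ceph.conf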
On 11/08/2012 11:36 AM, Travis Rhoden wrote:
Solved!
I stumbled into the solution while switching from a block device to a
file. I was being bitten by running mkcephfs multiple times -- it wasn't
really failing on the journal, it was failing because the OSD data
disk had been initialized before. I
On Nov 8, 2012, at 11:19 AM, Andrey Korolyov and...@xdel.ru wrote:
On Thu, Nov 8, 2012 at 7:02 PM, Atchley, Scott atchle...@ornl.gov wrote:
On Nov 8, 2012, at 10:00 AM, Scott Atchley atchle...@ornl.gov wrote:
On Nov 8, 2012, at 9:39 AM, Mark Nelson mark.nel...@inktank.com wrote:
On
Sorry about that, I think it got chopped. Here's a full trace from
another run, using kernel 3.6.6 and definitely has the patch applied:
https://gist.github.com/4041120
There are no instances of sync_fs_one_sb skipping in the logs.
On Mon, Nov 5, 2012 at 1:29 AM, Sage Weil s...@inktank.com
On 9 November 2012 02:00, Atchley, Scott atchle...@ornl.gov wrote:
On Nov 8, 2012, at 9:39 AM, Mark Nelson mark.nel...@inktank.com wrote:
On 11/08/2012 07:55 AM, Atchley, Scott wrote:
On Nov 8, 2012, at 3:22 AM, Gandalf Corvotempesta
gandalf.corvotempe...@gmail.com wrote:
2012/11/8 Mark
On 08.11.2012 17:06, Mark Nelson wrote:
On 11/08/2012 09:45 AM, Stefan Priebe - Profihost AG wrote:
On 08.11.2012 16:01, Sage Weil wrote:
On Thu, 8 Nov 2012, Stefan Priebe - Profihost AG wrote:
Is there any way to find out why a ceph-osd process takes around 10
times more
load on rand 4k
On Wed, Nov 7, 2012 at 6:16 AM, Sławomir Skowron szi...@gmail.com wrote:
I have realized that requests from fastcgi in nginx from radosgw return:
HTTP/1.1 200, not HTTP/1.1 200 OK
Any other CGI that I run, for example PHP via fastcgi, returns this like the
RFC says, with OK.
Is someone
On 11/08/2012 01:27 PM, Stefan Priebe wrote:
On 08.11.2012 17:06, Mark Nelson wrote:
On 11/08/2012 09:45 AM, Stefan Priebe - Profihost AG wrote:
On 08.11.2012 16:01, Sage Weil wrote:
On Thu, 8 Nov 2012, Stefan Priebe - Profihost AG wrote:
Is there any way to find out why a ceph-osd
On Thu, Nov 8, 2012 at 7:53 PM, Alexandre DERUMIER aderum...@odiso.com wrote:
So it is a problem with KVM, which lets the processes jump between cores a
lot.
Maybe numad from Red Hat can help?
http://fedoraproject.org/wiki/Features/numad
It tries to keep a process on the same NUMA node and I think
On 9 November 2012 08:21, Dieter Kasper d.kas...@kabelmail.de wrote:
Joseph,
I've downloaded and read the presentation from 'Sean Hefty / Intel
Corporation'
about rsockets, which sounds very promising to me.
Can you please teach me how to get access to the rsockets source?
Thanks,
OK, I will dig into nginx, thanks.
On 8 Nov 2012 at 22:48, Yehuda Sadeh yeh...@inktank.com wrote:
On Wed, Nov 7, 2012 at 6:16 AM, Sławomir Skowron szi...@gmail.com wrote:
I have realized that requests from fastcgi in nginx from radosgw return:
HTTP/1.1 200, not HTTP/1.1 200 OK
On 08.11.2012 22:58, Mark Nelson wrote:
Also, I'm not sure what version you are running, but you may want to try
testing master and see if that helps. Sam has done some work on our
threading and locking code that might help.
This is git master (two hours old).
Stefan
We are seeing a somewhat random, but frequent hang on our systems
during startup. The hang happens at the point where an 'rbd map
rbdvol' command is run.
I've attached the ceph logs from the cluster. The map command happens
at Nov 8 18:41:09 on server 172.18.0.15. The process which hung can
be
I have a 3 line change to the file qa/workunits/libcephfs-java/test.sh that
tweaks how LD_LIBRARY_PATH is set for the test execution.
The branch is wip-java-test in ceph.git.
Best,
-Joe Buck
On 11/07/2012 07:28 AM, Stefan Priebe - Profihost AG wrote:
Hello,
I've added two nodes with 4 devices each and modified the crushmap.
But importing the new map results in:
crushmap max_devices 55 osdmap max_osd 35
What's wrong?
I think this is an obsolete check since
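For context, a minimal sketch of the usual round trip for editing and
re-importing a crushmap, with placeholder file names; max_devices is written
into the compiled map and normally tracks the highest device id referenced:

  ceph osd getcrushmap -o crush.bin
  crushtool -d crush.bin -o crush.txt
  # edit crush.txt to add the new hosts and devices
  crushtool -c crush.txt -o crush.new
  ceph osd setcrushmap -i crush.new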
Joseph,
I've downloaded and read the presentation from 'Sean Hefty / Intel Corporation'
about rsockets, which sounds very promising to me.
Can you please teach me how to get access to the rsockets source?
Thanks,
-Dieter
On Thu, Nov 08, 2012 at 09:12:45PM +0100, Joseph Glanville wrote:
On 9
On Wed, Oct 31, 2012 at 1:46 PM, Sage Weil s...@inktank.com wrote:
I would like to freeze v0.55, the bobtail stable release, at the end of
next week. If there is any functionality you are working on that should
be included, we need to get it into master (preferably well) before that.
There
On 11/08/2012 02:10 PM, Mandell Degerness wrote:
We are seeing a somewhat random, but frequent hang on our systems
during startup. The hang happens at the point where an 'rbd map
rbdvol' command is run.
I've attached the ceph logs from the cluster. The map command happens
at Nov 8 18:41:09 on
I've got wip_recovery_qos and wip_persist_missing that should go into bobtail.
wip_recovery_qos passed regression (mostly, failures due to fsx, a bug
fixed in master, and timeouts waiting for machines), and is waiting on
review.
wip_persist_missing has a teuthology test I'll push tomorrow
I needed this patch after some simple 1 OSD vstart environments
refused to allow clients to connect.
--
A minimum pool size of 2 was introduced by 13486857cf. This sets the
minimum to one so that basic vstart environments work.
Signed-off-by: Noah Watkins noahwatk...@gmail.com
diff
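For reference, a minimal sketch of the runtime workaround this patch makes
unnecessary, assuming the default data/metadata/rbd pools of a vstart setup
and that the running version already understands min_size:

  ceph osd pool set data min_size 1
  ceph osd pool set metadata min_size 1
  ceph osd pool set rbd min_size 1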