Thanks for the information.
-Sreenath
Date: Wed, 25 Mar 2015 04:11:11 +0100
From: Francois Lafont flafdiv...@free.fr
To: ceph-users ceph-us...@ceph.com
Subject: Re: [ceph-users] PG calculator queries
Message-ID: 5512274f.1000...@free.fr
Content-Type: text/plain;
Thanks for the answer. Now the meaning of MB data and MB used is
clear, and if all the pools have size=3 I expect a ratio of 1 to 3
between the two values.
I still can't understand why MB used is so big in my setup.
All my pools are size=3 but the ratio of MB data to MB used is 1 to
5 instead of 1 to 3.
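A quick sketch of how one might compare the two numbers (the OSD mount point
below is the default path and an assumption for this setup):
ceph df detail                  # per-pool MB data versus cluster-wide MB used
df -h /var/lib/ceph/osd/ceph-0  # raw usage on one OSD, journal and all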
Hi Don,
after a lot of trouble due to an unfinished setcrushmap, I was able to remove
the new EC pool.
I loaded the old crushmap and edited it again. After including a step
set_choose_tries 100 in the crushmap, the EC pool creation with
ceph osd pool create ec7archiv 1024 1024 erasure 7hostprofile
worked without
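For anyone else hitting this, a minimal sketch of the decompile/edit/recompile
cycle (the file names are arbitrary):
ceph osd getcrushmap -o crush.bin
crushtool -d crush.bin -o crush.txt
# add "step set_choose_tries 100" to the erasure-coded rule, then:
crushtool -c crush.txt -o crush.new
ceph osd setcrushmap -i crush.new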
On 26/03/2015, at 09.05, 池信泽 xmdx...@gmail.com wrote:
hi, ceph:
Currently, the command "ceph --admin-daemon
/var/run/ceph/ceph-osd.0.asok dump_historic_ops" may return as below:
{ "description": "osd_op(client.4436.1:11617
rb.0.1153.6b8b4567.0192 [] 2.8eb4757c ondisk+write e92)",
Hi ,
I'm just starting on a small Ceph implementation and wanted to know the
release date for Hammer.
Will it coincide with the release of OpenStack?
My conf (using 10G and jumbo frames on CentOS 7 / RHEL 7):
3x Mons (VMs) :
CPU - 2
Memory - 4G
Storage - 20 GB
4x OSDs :
CPU - Haswell Xeon
Memory - 8
Hi all,
due to a very silly approach, I removed the cache tier of a filled EC pool.
After recreating the pool and connecting it with the EC pool I don't see any
content.
How can I see the rbd_data and other files through the new SSD cache tier?
I think that I must recreate the rbd_directory (and fill
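For reference, a minimal sketch of wiring a cache pool back in front of the EC
pool (the pool names are taken from elsewhere in this thread and are
assumptions):
ceph osd tier add ec7archiv ssd-archiv
ceph osd tier cache-mode ssd-archiv writeback
ceph osd tier set-overlay ec7archiv ssd-archiv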
On 26-03-15 12:04, Stefan Priebe - Profihost AG wrote:
Hi Wido,
Am 26.03.2015 um 11:59 schrieb Wido den Hollander:
On 26-03-15 11:52, Stefan Priebe - Profihost AG wrote:
Hi,
in the past I read pretty often that it's not a good idea to run ceph
and qemu / the hypervisors on the same nodes.
Hi,
in the past I read pretty often that it's not a good idea to run ceph
and qemu / the hypervisors on the same nodes.
But why is this a bad idea? You save space and can better use the
resources you have in the nodes anyway.
Stefan
On 26-03-15 11:52, Stefan Priebe - Profihost AG wrote:
Hi,
in the past I read pretty often that it's not a good idea to run ceph
and qemu / the hypervisors on the same nodes.
But why is this a bad idea? You save space and can better use the
resources you have in the nodes anyway.
A word of caution: While normally my OSDs use very little CPU, I have
occasionally had an issue where the OSDs saturate the CPU (not necessarily
during a rebuild). This might be a kernel thing, or a driver thing specific
to our hosts, but were this to happen to you, it now impacts your VMs as
well
It's kind of a philosophical question. Technically there's nothing that
prevents you from putting ceph and the hypervisor on the same boxes.
It's a question of whether or not potential cost savings are worth
increased risk of failure and contention. You can minimize those things
through
On 26/03/2015, at 12.14, 池信泽 xmdx...@gmail.com wrote:
It is not so convenient to do the conversion in a custom way.
Because there are many kinds of log entries in ceph-osd.log, we only need
some of them, including latency.
But right now it is hard to grep the log for what we want and decode it.
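A rough sketch of pulling just the latency-relevant fields out of the admin
socket instead (assumption: the top-level list is named "Ops" in this release;
later releases renamed it "ops"):
ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok dump_historic_ops |
  python -c 'import json,sys
d = json.load(sys.stdin)
for op in d.get("Ops", d.get("ops", [])):
    print op["age"], op["description"]'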
I understand that Giant should have systemd service files, but I don't
see them in the CentOS 7 packages.
https://github.com/ceph/ceph/tree/giant/systemd
[ulhglive-root@mon1 systemd]# rpm -qa | grep --color=always ceph
ceph-common-0.93-0.el7.centos.x86_64
python-cephfs-0.93-0.el7.centos.x86_64
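A possible stopgap until the packages catch up (a sketch; the unit file names
are whatever that git directory contains, and the target path is an
assumption):
git clone -b giant https://github.com/ceph/ceph.git
sudo cp ceph/systemd/ceph-osd@.service ceph/systemd/ceph-mon@.service /etc/systemd/system/
sudo systemctl daemon-reload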
For that matter, is there a way to build Calamari without going the whole
vagrant path at all? Some way of just building it through command-line tools?
I would be building it on an Openstack instance, no GUI. Seems silly to have
to install an entire virtualbox environment inside something
I don't know why you're mucking about manually with the rbd directory;
the rbd tool and rados handle cache pools correctly as far as I know.
-Greg
On Thu, Mar 26, 2015 at 8:56 AM, Udo Lembke ulem...@polarzone.de wrote:
Hi Greg,
ok!
It looks like my problem is more
I used this as a guide for building calamari packages w/o using vagrant.
Worked great:
http://bryanapperson.com/blog/compiling-calamari-ceph-ubuntu-14-04/
On Thu, Mar 26, 2015 at 10:30 AM, Steffen W Sørensen ste...@me.com wrote:
On 26/03/2015, at 17.18, LaBarre, James (CTR) A6IT
We run many clusters in a similar config with shared Hypervisor/OSD/RGW/RBD
in production and in staging but we have been looking into moving our
storage to its own cluster so that we can scale independently. We used AWS
and scaled up a ton of virtual users using JMeter clustering to test
On 03/25/2015 05:44 PM, Gregory Farnum wrote:
On Wed, Mar 25, 2015 at 10:36 AM, Jake Grimmett j...@mrc-lmb.cam.ac.uk wrote:
Dear All,
Please forgive this post if it's naive, I'm trying to familiarise myself
with cephfs!
I'm using Scientific Linux 6.6. with Ceph 0.87.1
My first steps with
That one big server sounds great, but it also sounds like a single point of
failure. It's also not cheap. I've been able to build this cluster for
about $1400 per node, including the 10Gb networking gear, which is less
than what I see the _empty case_ you describe going for new. Even used, the
The first step is incorrect:
echo deb http://ppa.launchpad.net/saltstack/salt/ubuntu lsb_release -sc
main | sudo tee /etc/apt/sources.list.d/saltstack.list
should be
echo deb http://ppa.launchpad.net/saltstack/salt/ubuntu $(lsb_release -sc)
main | sudo tee /etc/apt/sources.list.d/saltstack.list
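To verify the substitution took (the release codename shown is just an
example):
cat /etc/apt/sources.list.d/saltstack.list
# deb http://ppa.launchpad.net/saltstack/salt/ubuntu trusty main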
On 26/03/2015, at 17.18, LaBarre, James (CTR) A6IT james.laba...@cigna.com
wrote:
For that matter, is there a way to build Calamari without going the whole
vagrant path at all? Some way of just building it through command-line
tools? I would be building it on an Openstack instance, no
Hi Greg,
On 26.03.2015 18:46, Gregory Farnum wrote:
I don't know why you're mucking about manually with the rbd directory;
the rbd tool and rados handle cache pools correctly as far as I know.
that's because I deleted the cache tier pool, so the files like
rbd_header.2cfc7ce74b0dc51 and
That's a great idea. I know I can setup cinder (the openstack volume
manager) as a multi-backend manager and migrate from one backend to the
other, each backend linking to different pools of the same ceph cluster.
What bugs me though is that I'm pretty sure the image store, glance,
wouldn't
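For what it's worth, a sketch of the multi-backend wiring in cinder.conf
(section names, backend names and pools are placeholders; the driver path is
the RBD driver of that era):
[DEFAULT]
enabled_backends = ceph-old,ceph-new
[ceph-old]
volume_driver = cinder.volume.drivers.rbd.RBDDriver
volume_backend_name = ceph-old
rbd_pool = volumes-old
[ceph-new]
volume_driver = cinder.volume.drivers.rbd.RBDDriver
volume_backend_name = ceph-new
rbd_pool = volumes-new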
Since I have been in ceph-land today, it reminded me that I needed to close
the loop on this. I was finally able to isolate this problem down to a
faulty NIC on the ceph cluster network. It worked, but it was
accumulating a huge number of Rx errors. My best guess is some receive
buffer cache
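For anyone chasing something similar, the counters are cheap to check (eth2
is an assumed interface name):
ip -s link show eth2            # RX errors column
ethtool -S eth2 | grep -i err   # driver-level error counters, where exposed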
For what it's worth, I don't think being patient was the answer. I was
having the same problem a couple of weeks ago, and I waited from before 5pm
one day until after 8am the next, and still got the same errors. I ended up
adding a new cephfs pool with a newly-created small pool, but was never
Has the OSD actually been detected as down yet?
You'll also need to set that min size on your existing pools (ceph
osd pool set <pool> min_size 1 or similar) to change their behavior;
the config option only takes effect for newly-created pools. (Thus the
default.)
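A minimal sketch, with "rbd" standing in for each of your pool names:
ceph osd pool set rbd min_size 1
ceph osd dump | grep min_size   # confirm the per-pool value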
On Thu, Mar 26, 2015 at 1:29 PM,
There have been bugs here in the recent past which have been fixed for
hammer, at least...it's possible we didn't backport it for the giant
point release. :(
But for users going forward that procedure should be good!
-Greg
On Thu, Mar 26, 2015 at 11:26 AM, Kyle Hutson kylehut...@ksu.edu wrote:
Am 26.03.2015 um 16:36 schrieb Mark Nelson:
I suspect a config like this where you only have 3 OSDs per node would
be more manageable than something denser.
I.e., theoretically a single E5-2697v3 is enough to run 36 OSDs in a 4U
super micro chassis for a semi-dense converged solution. You could
Hi Team,
I’ve just written a blog post regarding the integration of Ceph RBD device
management in an OpenSVC service:
http://www.flox-arts.net/article30/ceph-rbd-devices-management-with-opensvc-service
Next blog post
Well, we’re a RedHat shop, so I’ll have to see what’s adaptable from there.
(Mint on all my home systems, so I’m not totally lost with Ubuntu <g>)
From: Quentin Hartman [mailto:qhart...@direwolfdigital.com]
Sent: Thursday, March 26, 2015 1:15 PM
To: Steffen W Sørensen
Cc: LaBarre, James (CTR)
That's fair enough Greg, I'll keep upgrading when the opportunity arises, and
maybe it'll spring back to life someday :-)
-Original Message-
From: Gregory Farnum [mailto:g...@gregs42.com]
Sent: 20 March 2015 23:05
To: Chris Murray
Cc: ceph-users
Subject: Re: [ceph-users] More than 50%
Hi,
Lately I've been going back to work on one of my first ceph setups and
now I see that I have created way too many placement groups for the
pools on that setup (about 10 000 too many). I believe this may impact
performance negatively, as the performance on this ceph cluster is
abysmal.
I thought there was some discussion about this before. Something like
creating a new pool, then making your existing pool a cache overlay of the
new pool, and then flushing the overlay to the new pool. I haven't tried it
or know if it is possible (a rough sketch follows below).
The other option is to shut the VM down,
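A sketch of that overlay idea, untested here ("oldpool"/"newpool" are
placeholders; --force-nonempty is needed because the existing pool already
holds data, and note the snapshot caveat Greg raises later in the thread):
ceph osd pool create newpool 256
ceph osd tier add newpool oldpool --force-nonempty
ceph osd tier cache-mode oldpool forward
ceph osd tier set-overlay newpool oldpool
rados -p oldpool cache-flush-evict-all
ceph osd tier remove-overlay newpool
ceph osd tier remove newpool oldpool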
You shouldn't rely on rados ls when working with cache pools. It
doesn't behave properly and is a silly operation to run against a pool
of any size even when it does. :)
More specifically, rados ls is invoking the pgls operation. Normal
read/write ops will query the backing store for objects
On 26/03/2015, at 22.53, Steffen W Sørensen ste...@me.com wrote:
On 26/03/2015, at 21.07, J-P Methot jpmet...@gtcomm.net wrote:
That's a great idea. I know I can setup cinder (the openstack volume
manager) as a multi-backend manager and migrate from one
On Thu, Mar 26, 2015 at 2:53 PM, Steffen W Sørensen ste...@me.com wrote:
On 26/03/2015, at 21.07, J-P Methot jpmet...@gtcomm.net wrote:
That's a great idea. I know I can setup cinder (the openstack volume
manager) as a multi-backend manager and migrate from one backend to the
other, each
On 26/03/2015, at 23.01, Gregory Farnum g...@gregs42.com wrote:
On Thu, Mar 26, 2015 at 2:53 PM, Steffen W Sørensen ste...@me.com wrote:
On 26/03/2015, at 21.07, J-P Methot jpmet...@gtcomm.net wrote:
That's a great idea. I know I can setup cinder (the openstack
The procedure you've outlined won't copy snapshots, just the head
objects. Preserving the proper snapshot metadata and inter-pool
relationships on rbd images I think isn't actually possible when
trying to change pools.
On Thu, Mar 26, 2015 at 3:05 PM, Steffen W Sørensen ste...@me.com wrote:
On
On 26/03/2015, at 20.38, J-P Methot jpmet...@gtcomm.net wrote:
Lately I've been going back to work on one of my first ceph setups and now I
see that I have created way too many placement groups for the pools on that
setup (about 10 000 too many). I believe this may impact performance
I added the osd pool default min size = 1 to test the behavior when 2 of 3
OSDs are down, but the behavior is exactly the same as without it: when the
2nd OSD is killed, all client writes start to block and these
pipe.(stuff).fault messages begin:
2015-03-26 16:08:50.775848 7fce177fe700 0
Ah, thanks, got it. I wasn't considering that mons and osds on the same node
isn't a likely real-world thing.
You have to admit that pipe/fault log message is a bit cryptic.
Thanks,
Lee
On 26/03/2015, at 21.07, J-P Methot jpmet...@gtcomm.net wrote:
That's a great idea. I know I can setup cinder (the openstack volume manager)
as a multi-backend manager and migrate from one backend to the other, each
backend linking to different pools of the same ceph cluster. What bugs me
On Thu, Mar 26, 2015 at 4:40 PM, Gregory Farnum g...@gregs42.com wrote:
Has the OSD actually been detected as down yet?
I believe it has, however I can't directly check because ceph health
starts to hang when I down the second node.
You'll also need to set that min size on your existing
On Thu, Mar 26, 2015 at 2:30 PM, Lee Revell rlrev...@gmail.com wrote:
On Thu, Mar 26, 2015 at 4:40 PM, Gregory Farnum g...@gregs42.com wrote:
Has the OSD actually been detected as down yet?
I believe it has, however I can't directly check because ceph health
starts to hang when I down the
On 03/26/2015 10:46 AM, Gregory Farnum wrote:
I don't know why you're mucking about manually with the rbd directory;
the rbd tool and rados handle cache pools correctly as far as I know.
That's true, but the rados tool should be able to manipulate binary data
more easily. It should probably
On 26/03/2015, at 23.36, Somnath Roy somnath@sandisk.com wrote:
Got most of it, thanks!
But I'm still not able to get why, when the second node is down and a single
monitor is left in the cluster, the client is not able to connect.
1 monitor can form a quorum and should be sufficient for a
On Thu, Mar 26, 2015 at 3:22 PM, Somnath Roy somnath@sandisk.com wrote:
Greg,
Couple of dumb questions, maybe.
1. If you see, the clients are connecting fine with two monitors in the
cluster. 2 monitors can never form a quorum, but 1 can, so why with 1
monitor (which is I guess
On Thu, Mar 26, 2015 at 3:36 PM, Somnath Roy somnath@sandisk.com wrote:
Got most of it, thanks!
But I'm still not able to get why, when the second node is down and a single
monitor is left in the cluster, the client is not able to connect.
1 monitor can form a quorum and should be sufficient for a
Got most of it, thanks!
But I'm still not able to get why, when the second node is down and a single
monitor is left in the cluster, the client is not able to connect.
1 monitor can form a quorum and should be sufficient for a cluster to run.
Thanks & Regards
Somnath
-Original Message-
From:
Greg,
I think you got me wrong. I am not saying each monitor of a group of 3 should
be able to change the map. Here is the scenario.
1. Cluster up and running with 3 mons (quorum of 3), all fine.
2. One node (and mon) is down, quorum of 2, still connecting.
3. 2 nodes (and 2 mons) are down,
On Thu, Mar 26, 2015 at 3:54 PM, Somnath Roy somnath@sandisk.com wrote:
Greg,
I think you got me wrong. I am not saying each monitor of a group of 3 should
be able to change the map. Here is the scenario.
1. Cluster up and running with 3 mons (quorum of 3), all fine.
2. One node (and
On 26/03/2015, at 23.13, Gregory Farnum g...@gregs42.com wrote:
The procedure you've outlined won't copy snapshots, just the head
objects. Preserving the proper snapshot metadata and inter-pool
relationships on rbd images I think isn't actually possible when
trying to change pools.
This
Greg,
Couple of dumb questions, maybe.
1. If you see, the clients are connecting fine with two monitors in the
cluster. 2 monitors can never form a quorum, but 1 can, so why with 1 monitor
(which is I guess happening after taking 2 nodes down) is it not able to
connect?
2. Also, my
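For what it's worth, quorum is a strict majority of the monitors in the
monmap, not of those currently up: with 3 monitors defined, floor(3/2)+1 = 2
must be alive, so a single survivor can never form a quorum and clients
block. One way to watch it (the timeout just keeps the command from hanging
forever):
ceph --connect-timeout 10 quorum_status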
hi, ceph:
Currently, the command "ceph --admin-daemon
/var/run/ceph/ceph-osd.0.asok dump_historic_ops" may return as below:
{ "description": "osd_op(client.4436.1:11617
rb.0.1153.6b8b4567.0192 [] 2.8eb4757c ondisk+write e92)",
"received_at": "2015-03-25 19:41:47.146145",
"age":
On Wed, Mar 25, 2015 at 8:10 PM, Ridwan Rashid Noel ridwan...@gmail.com wrote:
Hi Greg,
Thank you for your response. I have understood that I should be starting
only the mapred daemons when using cephFS instead of HDFS. I have fixed that
and am trying to run a hadoop wordcount job using this
On Thu, 26 Mar 2015, Gregory Farnum wrote:
On Thu, Mar 26, 2015 at 7:44 AM, Lee Revell rlrev...@gmail.com wrote:
I have a virtual test environment of an admin node and 3 mon + osd nodes,
built by just following the quick start guide. It seems to work OK but ceph
is constantly complaining
I run a converged openstack / ceph cluster with 14 1U nodes. Each has 1 SSD
(os / journals), 3 1TB spinners (1 OSD each), 16 HT cores, 10Gb NICs for
ceph network, and 72GB of RAM. I configure openstack to leave 3GB of RAM
unused on each node for OSD / OS overhead. All the VMs are backed by ceph
On Thu, Mar 26, 2015 at 7:44 AM, Lee Revell rlrev...@gmail.com wrote:
I have a virtual test environment of an admin node and 3 mon + osd nodes,
built by just following the quick start guide. It seems to work OK but ceph
is constantly complaining about clock skew much greater than reality.
I have a virtual test environment of an admin node and 3 mon + osd nodes,
built by just following the quick start guide. It seems to work OK but
ceph is constantly complaining about clock skew much greater than reality.
Clocksource on the virtuals is kvm-clock and they also run ntpd.
On Thu, Mar 26, 2015 at 2:56 AM, Saverio Proto ziopr...@gmail.com wrote:
Thanks for the answer. Now the meaning of MB data and MB used is
clear, and if all the pools have size=3 I expect a ratio of 1 to 3
between the two values.
I still can't understand why MB used is so big in my setup.
All my
I have a virtual test environment of an admin node and 3 mon + osd nodes,
built by just following the quick start guide. It seems to work OK but
ceph is constantly complaining about clock skew much greater than reality.
Clocksource on the virtuals is kvm-clock and they also run ntpd.
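Two hedged mitigations while ntp catches up: delay the ceph start until ntpd
has stepped the clock, or widen the monitors' tolerance in ceph.conf (0.05s
is the default; raising it only masks skew rather than fixing it):
[mon]
mon clock drift allowed = 0.5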
I suspect a config like this where you only have 3 OSDs per node would
be more manageable than something denser.
I.e., theoretically a single E5-2697v3 is enough to run 36 OSDs in a 4U
super micro chassis for a semi-dense converged solution. You could
attempt to restrict the OSDs to one socket
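One way to sketch that pinning (an assumption: wrapping the daemon start by
hand with numactl; the OSD id is arbitrary):
numactl --cpunodebind=0 --membind=0 /usr/bin/ceph-osd -i 12 --cluster ceph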
Hi Greg,
ok!
It looks like my problem is more setomapval-related...
I must do something like
rados -p ssd-archiv setomapval rbd_directory name_vm-409-disk-2
\0x0f\0x00\0x00\0x002cfc7ce74b0dc51
but rados setomapval doesn't take the hex values - instead of this I got
rados -p ssd-archiv
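A sketch of one way around it, assuming a rados build that reads the value
from stdin when the value argument is omitted - note that printf wants \x0f,
not the \0x0f spelling above, to emit real bytes:
printf '\x0f\x00\x00\x002cfc7ce74b0dc51' |
  rados -p ssd-archiv setomapval rbd_directory name_vm-409-disk-2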
You just need to go look at one of your OSDs and see what data is
stored on it. Did you configure things so that the journals are using
a file on the same storage disk? If so, *that* is why the data used
is large.
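A one-liner to check (the default OSD path is an assumption):
ls -lh /var/lib/ceph/osd/ceph-0/journal
# a plain file here, rather than a symlink to a separate partition, means the
# journal shares (and inflates usage on) the data disk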
I followed your suggestion and this is the result of my troubleshooting.
Each
I think I solved the problem. The clock skew only happens when restarting a
node to simulate hardware failure. The virtual comes up with a skewed clock
and ceph services start before ntp has time to adjust it, then there's a
delay before ceph rechecks the clock skew.
Lee
On Thu, Mar 26, 2015 at