On Mon, 5 Jan 2015 22:36:29, Sanders, Bill wrote:
Hi Ceph Users,
We've got a Ceph cluster we've built, and we're experiencing issues with
slow or hung IOs, even when running 'rados bench' against the OSD cluster.
Things start out great, at ~600 MB/s, then throughput rapidly drops off as
the test waits for
Hello,
I am new to Ceph and have a problem with Ceph Object Gateway usage. I did not
find enough hints by googling, so I am sending an email here; thanks.
I have a Ceph server with the object gateway configured, and another client
node to test object access.
On the client, when I use s3cmd (a
On Mon, Jan 5, 2015 at 10:12 AM, Dan van der Ster
daniel.vanders...@cern.ch wrote:
Hi Sage,
On Tue, Dec 23, 2014 at 10:10 PM, Sage Weil sw...@redhat.com wrote:
This fun issue came up again in the form of 10422:
http://tracker.ceph.com/issues/10422
I think we have 3 main options:
Now that the holidays are over, I'm going to bump this message to see if
there are any good ideas on this.
Thanks,
Robert LeBlanc
On Thu, Dec 18, 2014 at 2:21 PM, Robert LeBlanc rob...@leblancnet.us
wrote:
Before we base thousands of VM image clones off of one or more snapshots,
I want to
On 05.01.15 at 16:37, Samuel Just wrote:
Can you post the output of 'ceph pg dump'?
-Sam
Hi,
it's at
https://www.christopher-kunz.de/tmp/pgdump.txt
Regards,
--ck
On Thu, Dec 18, 2014 at 1:21 PM, Robert LeBlanc rob...@leblancnet.us wrote:
Before we base thousands of VM image clones off of one or more snapshots, I
want to test what happens when the snapshot becomes corrupted. I don't
believe the snapshot will become corrupted through client access to the
On Sun, Jan 4, 2015 at 8:10 AM, Lionel Bouton lionel+c...@bouton.name wrote:
On 01/04/15 16:25, Jiri Kanicky wrote:
Hi.
I have been experiencing the same issues on both nodes over the past 2
days (never on both nodes at the same time). It seems the issue occurs
after some time when copying a
When you shrink the RBD, most of the time is spent in
librbd/internal.cc::trim_image(); in this function, the client iterates over
all the no-longer-needed objects (whether or not they exist) and deletes them.
So in this case, when Edwin shrank his RBD from 650PB to 650GB, there
are[ (650PB *
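For a rough sense of scale (my own back-of-envelope, assuming the default
4 MB RBD object size):
echo $(( 650 * 2**50 / (4 * 2**20) ))
# => 174483046400, i.e. roughly 1.7e11 object deletions to iterate through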
Hi all,
I have installed Ceph using the ceph-deploy utility. I have created three
VMs: one for monitor+MDS and the other two VMs for OSDs. The Ceph admin node
is another separate machine.
The status and health of Ceph are shown below. Can you please suggest what I
can infer from the status? I am a beginner
Hi Loic,
That's an interesting idea. I suppose the same could probably be achieved by
just creating more CRUSH host buckets for each actual host and then treating
the actual physical host as a chassis (Chassis-1 contains Host-1-A,
Host-1-B, etc.).
I was thinking about this some more and I don't
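A rough sketch of the commands involved, with hypothetical bucket names
(host-1-a, chassis-1) and assuming the chassis type exists in your map:
ceph osd crush add-bucket chassis-1 chassis
ceph osd crush add-bucket host-1-a host
ceph osd crush move host-1-a chassis=chassis-1
ceph osd crush create-or-move osd.0 1.0 host=host-1-a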
On Tue, 6 Jan 2015 12:07:26 AM Sanders, Bill wrote:
14 and 18 happened to show up during that run, but it's certainly not only
those OSDs. It seems to vary each run. Just from the runs I've done
today I've seen the following pairs of OSDs:
Could your OSD nodes be paging? I know from
On Tue, Dec 30, 2014 at 11:38 AM, Erik Logtenberg e...@logtenberg.eu wrote:
Hi Erik,
I have tiering working on a couple test clusters. It seems to be
working with Ceph v0.90 when I set:
ceph osd pool set POOL hit_set_type bloom
ceph osd pool set POOL hit_set_count 1
ceph osd pool set
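For reference, a minimal sketch of the remaining steps for a writeback tier,
with BASE and CACHE as placeholder pool names:
ceph osd pool set CACHE hit_set_period 3600
ceph osd tier add BASE CACHE
ceph osd tier cache-mode CACHE writeback
ceph osd tier set-overlay BASE CACHE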
Hi,
I just ran this test and found my system is no better. But I use
commodity hardware. The only difference is latency; you should look at
it.
Total time run: 62.412381
Total writes made: 919
Write size: 4194304
Bandwidth (MB/sec): 58.899
Stddev Bandwidth:
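A typical invocation that also reports average and max latency at the end of
the run (pool name is a placeholder; the seq read test needs a prior write
with --no-cleanup):
rados bench -p testpool 60 write -t 16 --no-cleanup
rados bench -p testpool 60 seq -t 16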
Hi Pankaj,
You can search for the lib using the 'yum provides' command, which accepts
wildcards.
[root@sl7 ~]# yum provides */lib64/libkeyutils*
Loaded plugins: langpacks
keyutils-libs-1.5.8-3.el7.x86_64 : Key utilities library
Repo: sl
Matched from:
Filename:
On Mon, 5 Jan 2015 23:41:17 +0400 ivan babrou wrote:
Rebalancing is almost finished, but things got even worse:
http://i.imgur.com/0HOPZil.png
Looking at that graph, only one OSD really kept growing and growing;
everything else seems a lot denser, less varied than before, as one
would
Ceph currently isn't very smart about ordering rebalancing operations. It
can fill a disk before moving things off of it. So if you are close to
the toofull line, it can push that OSD over. I think there is a blueprint
to help with this being worked on for Hammer.
You have a couple of
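A couple of things that may help in the meantime, assuming you want to nudge
data off the fullest OSD and give backfill a bit more headroom (the OSD id
and ratios below are only examples):
ceph health detail
ceph osd reweight 11 0.9
ceph tell osd.* injectargs '--osd-backfill-full-ratio 0.90'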
I think it's because ceph-disk or ceph-deploy doesn't support --osd-uuid.
On Wed, Dec 31, 2014 at 10:30 AM, Andrey Korolyov and...@xdel.ru wrote:
On Wed, Dec 31, 2014 at 8:20 PM, Wido den Hollander w...@42on.com wrote:
On 12/31/2014 05:54 PM, Andrey Korolyov wrote:
On Wed, Dec 31, 2014 at 7:34
I deleted some old backups and GC is returning some disk space back. But
the cluster state is still bad:
2015-01-06 13:35:54.102493 mon.0 [INF] pgmap v4017947: 5832 pgs: 23 active+remapped+wait_backfill, 1 active+remapped+wait_backfill+backfill_toofull, 2 active+remapped+backfilling, 5806
Hi Ajitha,
For one, it looks like you don't have enough OSDs for the number of
replicas you have specified in the config file. What is the value of
'osd pool default size' in your ceph.conf? If it's 3, for example,
then you need at least 3 hosts with 1 OSD each (with the default
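To check and adjust the replica count on an existing pool (using the default
'rbd' pool as an example):
ceph osd pool get rbd size
ceph osd pool set rbd size 2
or set it cluster-wide before creating pools, in ceph.conf:
osd pool default size = 2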
On Monday, January 5, 2015, Chen, Xiaoxi xiaoxi.c...@intel.com wrote:
When you shrink the RBD, most of the time is spent in
librbd/internal.cc::trim_image(); in this function, the client iterates
over all the no-longer-needed objects (whether or not they exist) and
deletes them. So in this case,
I would think that the RBD mounter would cache the directory listing,
which should always make it fast, unless there is so much memory
pressure that it is dropping it frequently.
How many entries are in your directory, and how many in total on the RBD?
ls | wc -l
find . | wc -l
What does your memory look
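To get a feel for whether dentries and inodes are being evicted, something
like this (purely illustrative):
free -m
sysctl vm.vfs_cache_pressure
slabtop -o | egrep 'dentry|inode'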
It does seem like the entries get cached for a certain period of time.
Here is the memory listing for the rbd client server:
root@cephmount1:~# free -m
             total       used       free     shared    buffers     cached
Mem:         11965      11816        149       3139
What fs are you running inside the RBD?
On Tue, Jan 6, 2015 at 8:29 AM, Shain Miley smi...@npr.org wrote:
Hello,
We currently have a 12 node (3 monitor+9 OSD) ceph cluster, made up of 107 x
4TB drives formatted with xfs. The cluster is running ceph version 0.80.7:
Cluster health:
cluster
Restarting the OSD fixed PGs that were stuck: http://i.imgur.com/qd5vuzV.png
Still, OSD disk usage varies a lot, 150–250 GB. Shall I double PGs again?
On 6 January 2015 at 17:12, ivan babrou ibob...@gmail.com wrote:
I deleted some old backups and GC is returning some disk space back. But
Hello,
We currently have a 12 node (3 monitor+9 OSD) ceph cluster, made up of 107 x
4TB drives formatted with xfs. The cluster is running ceph version 0.80.7:
Cluster health:
cluster 504b5794-34bd-44e7-a8c3-0494cf800c23
health HEALTH_WARN crush map has legacy tunables
monmap e1: 3
Robert,
Thanks again for the help.
I'll keep looking around. However, as you stated, it might be a matter of
trying to reduce OSD latency, instead of trying to find a tuning option on
the client. I've already increased the readahead values, played with the
scheduler, and mount options...so I'm
Christian,
Each of the OSD server nodes is a Dell R-720xd with 64 GB of RAM.
We have 107 OSDs, so I have not checked all of them; however, the ones I have
checked with xfs_db have shown anywhere from 1% to 4% fragmentation.
I'll try to upgrade the client server to 32 or 64 GB of
On 01/06/15 02:36, Gregory Farnum wrote:
[...]
filestore btrfs snap controls whether to use btrfs snapshots to keep
the journal and backing store in check. With that option disabled it
handles things in basically the same way we do with xfs.
filestore btrfs clone range I believe controls how
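For reference, the options under discussion as they would appear in
ceph.conf (to my understanding both default to true on btrfs; treat this as
a sketch, not verified):
[osd]
filestore btrfs snap = true
filestore btrfs clone range = true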
Robert,
xfs on the rbd image as well:
/dev/rbd0 on /mnt/ceph-block-device-archive type xfs (rw)
However looking at the mount options...it does not look like I've enabled
anything special in terms of mount options.
Thanks,
Shain
Shain Miley | Manager of Systems and Infrastructure, Digital
I'm afraid I don't know what would happen if you change those options.
Hopefully we've set it up so things continue to work, but we definitely
don't test it.
-Greg
On Tue, Jan 6, 2015 at 8:22 AM Lionel Bouton lionel+c...@bouton.name
wrote:
On 01/06/15 02:36, Gregory Farnum wrote:
[...]
On Tue, 6 Jan 2015 19:28:44 +0400 ivan babrou wrote:
Restarting the OSD fixed PGs that were stuck: http://i.imgur.com/qd5vuzV.png
Good to hear that.
Funny (not really) how often restarting OSDs fixes stuff like that.
Still, OSD disk usage varies a lot, 150–250 GB. Shall I double PGs again?
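If you do decide to double them, the usual two-step is (pool name and counts
are placeholders; pgp_num must follow pg_num):
ceph osd pool set rbd pg_num 2048
ceph osd pool set rbd pgp_num 2048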
I've also had some luck with the following crush ruleset; the erasure
profile failure domain is set to OSD:
rule ecpool_test2 {
ruleset 3
type erasure
min_size 3
max_size 20
step set_chooseleaf_tries 5
step set_choose_tries 100
step take ceph1
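# (the rest of the rule was cut off; presumably it closes with the OSD
# failure domain and an emit, along these lines:)
step chooseleaf indep 0 type osd
step emit
}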
On 01/06/15 18:26, Gregory Farnum wrote:
I'm afraid I don't know what would happen if you change those options.
Hopefully we've set it up so things continue to work, but we
definitely don't test it.
Thanks. That's not a problem: when the opportunity arises I'll just adapt
my tests accordingly.
Can't this be done in parallel? If the OSD doesn't have an object then
it is a no-op and should be pretty quick. The number of outstanding
operations could be limited to 100 or 1,000, which would provide a
balance between speed and performance impact if there is data to be
trimmed. I'm not a big fan
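As a crude illustration of that bounded-parallelism idea at the RADOS level
(pool name and object prefix are placeholders; this is not what librbd does
internally):
rados -p rbd ls | grep '^rb.0.1234' | xargs -P 100 -n 1 rados -p rbd rm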
On Mon, Jan 5, 2015 at 6:01 PM, Gregory Farnum g...@gregs42.com wrote:
On Thu, Dec 18, 2014 at 1:21 PM, Robert LeBlanc rob...@leblancnet.us wrote:
Before we base thousands of VM image clones off of one or more snapshots, I
want to test what happens when the snapshot becomes corrupted. I don't