On Fri, Nov 25, 2016 at 11:55 AM, Craig Chi wrote:
> Hi Brad,
>
> Thank you for your investigation.
>
> Here are the reasons why we thought the abnormal Ceph behavior was
> caused by memory exhaustion. The following link points to the dmesg
> output from a Ceph node that barely
Hi Brad,
Thank you for your investigation.
Here are the reasons why we thought the abnormal Ceph behavior was caused by
memory exhaustion. The following link points to the dmesg output from a Ceph
node that barely survived: http://pastebin.com/Aa1FDd4K
However, I cannot ensure that this is
Hi Nick,
I have seen the report before. If I understand correctly,
osd_map_cache_size introduces a roughly fixed amount of memory usage. We are
using the default value of 200, and a single OSD map I got from our cluster is
404 KB.
That is 404 KB * 200 * 90 (OSDs) = about 7 GB on
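As a hedged aside, one way to verify those two inputs on a live cluster
(osd.0 and the output path are placeholders, not from the thread):

ceph daemon osd.0 config get osd_map_cache_size       # maps cached per OSD daemon
ceph osd getmap -o /tmp/osdmap && ls -l /tmp/osdmap   # byte size of one full map epoch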
Patrick,
I remember hearing you talk about this site recently. Do you know who
can help with this query?
On Fri, Nov 25, 2016 at 2:13 AM, Nick Fisk wrote:
> Who is responsible for the metrics.ceph.com site? I noticed that the mailing
> list stats are still trying to retrieve
Two of these appear to be hung task timeouts and the other is an invalid
opcode.
There is no evidence here of memory exhaustion (it remains to be seen whether
that is a factor, but I'd expect to see evidence of shrinker activity in the
stacks), and I would speculate the increased memory
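As a hedged aside, a quick way to look for that kind of memory-pressure
evidence in the same dmesg dump (the grep pattern is illustrative):

dmesg | grep -iE 'out of memory|oom[-_ ]kill|shrink'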
There are a couple of things you can do to reduce memory usage by limiting the
number of OSD maps each OSD stores, but you will still be pushing up against
the limits of the RAM you have available. There is a CERN 30 PB test (it should
be findable on Google) which gives some details on some of the settings,
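As a hedged illustration of the kind of settings meant here (values are
invented, not the CERN-tested ones; verify before use):

ceph tell osd.* injectargs '--osd_map_cache_size 50 --osd_map_max_advance 25'
# runtime only; to persist, set the same options under [osd] in ceph.conf
# and restart the OSDs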
Hello,
I have some files that have been uploaded to a Ceph Jewel (10.2.2)
cluster but can't be downloaded afterwards. A HEAD on the file is
successful but a GET returns 404.
Here is the output from object stat for one of these files:
# radosgw-admin object stat --bucket=sam-storage-mtl-8m-00
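A hedged follow-up check, assuming the Jewel default data pool name (it may
differ on this cluster) and a placeholder object name fragment:

rados -p default.rgw.buckets.data ls | grep <object-name-fragment>
# if the RADOS tail/part objects are missing, a GET can 404 while a
# HEAD (answered from the head object's metadata) still succeeds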
Sorry to bring this up again - any ideas? Or should I try the IRC channel?
Cheers,
Thomas
Original Message
Subject: RadosGW not responding if ceph cluster in state health_error
Date: Mon, 21 Nov 2016 17:22:20 +1300
From: Thomas
To:
Hey Guys,
We have a small/medium cluster (~30 TB / 30 OSDs / 5 monitors) mainly
used as object storage through 4 RadosGW S3 endpoints. We'd like to add one or
two more RadosGW instances without S3 authentication in order to deliver these
objects using a simple HTTP CDN configuration.
Is it possible to
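As a hedged sketch of what an extra gateway instance usually looks like in
ceph.conf (the instance name and port here are invented); note that anonymous
reads are normally granted per bucket via ACLs rather than by a gateway-level
switch:

[client.rgw.cdn1]
rgw_frontends = "civetweb port=8080"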
I will try it, but I want to see if it stays stable for a few days. I'm not
sure whether I should report this bug or not.
On Thu, Nov 24, 2016 at 6:05 PM, Nick Fisk wrote:
> Can you add them with different IDs? It won't look pretty, but it might get
> you out of this situation.
>
>>
Hello,
Our cluster has been in the HEALTH_ERR state for a while now. We are trying
to figure out how to solve it without having to remove the affected
rbd image.
ceph -s
cluster e94277ae-3d38-4547-8add-2cf3306f3efd
health HEALTH_ERR
1 pgs inconsistent
5 scrub
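A hedged sketch of the usual triage for inconsistent PGs (pg 3.1a is a
placeholder id; rados list-inconsistent-obj needs Jewel or later):

ceph health detail | grep inconsistent          # find the affected pg ids
rados list-inconsistent-obj 3.1a --format=json-pretty
ceph pg repair 3.1a                             # after reviewing the inconsistency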
Who is responsible for the metrics.ceph.com site? I noticed that the mailing
list stats are still trying to retrieve data from the
gmane archives which are no longer active.
Nick
Can you add them with different IDs? It won't look pretty, but it might get
you out of this situation.
> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Daznis
> Sent: 24 November 2016 15:43
> To: Nick Fisk
> Cc: ceph-users
Hello.
We have a cluster: 32 OSDs, 80 TB of usable space, Firefly 0.80.9 release.
We use this cluster for RBD and RadosGW object storage with our OpenStack.
The data pools (volumes, compute, images) are fine, but the .rgw pool uses
101 MB of space and has 462,446 objects. The average size of an object in this
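A hedged way to double-check those numbers (pool name taken from the message
above):

rados df                      # per-pool usage and object counts
rados -p .rgw ls | head       # sample the object names to see what they hold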
Yes, unfortunately, it is. And the story continues. I have
noticed that only 4 OSDs are doing this, and zapping and re-adding them
does not solve the issue. Removing them completely from the cluster
solves it, but I can't reuse their IDs. If I add another one
with the same ID it starts
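For reference, a hedged sketch of the classic full-removal sequence that
frees an OSD id for reuse (N is a placeholder; stop the daemon on its host
first):

ceph osd out N
ceph osd crush remove osd.N
ceph auth del osd.N
ceph osd rm N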
Hi,
Just checked permissions:
> # ceph auth get client.cinder
> exported keyring for client.cinder
> [client.cinder]
> key = REDACTED
> caps mon = "allow r"
> caps osd = "allow class-read object_prefix rbd_children, allow rwx
> pool=cinder-volumes, allow rwx pool=cinder-vms, allow rx
Hi Nick,
Oh... In retrospect it makes sense in a way, but then again it does not. ;-)
To clarify: it makes sense since the cache is "just a pool", but it does
not since "it is an overlay and just a cache in between".
Anyway, this is something that should be well documented and warned about, if
you ask me.
> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Kees
> Meijs
> Sent: 24 November 2016 14:20
> To: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] Stalling IO with cache tier
>
> Hi Nick,
>
> All Ceph pools have very restrictive
Hi Nick,
All Ceph pools have very restrictive permissions for each OpenStack
service, indeed. Besides creating the cache pool and enabling it, no
additional parameters or configuration were applied.
Do I understand correctly that access parameters (e.g. cephx keys) are needed
for a cache tier? If yes, it
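A hedged sketch of extending the caps shown earlier to cover the cache pool
(the original caps line is truncated above, so only its visible portion is
reproduced here; verify the full caps before applying):

ceph auth caps client.cinder mon 'allow r' \
  osd 'allow class-read object_prefix rbd_children, allow rwx pool=cinder-volumes, allow rwx pool=cinder-vms, allow rwx pool=cache'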
Hi Burkhard,
A testing pool makes absolute sense, thank you.
About the complete setup, the documentation states:
> The cache tiering agent can flush or evict objects based upon the
> total number of bytes *or* the total number of objects. To specify a
> maximum number of bytes, execute the
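The commands the quoted documentation is leading up to, as a hedged example
(pool name from this thread, values invented):

ceph osd pool set cache target_max_bytes 1099511627776   # 1 TiB
ceph osd pool set cache target_max_objects 1000000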
> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Burkhard Linke
> Sent: 24 November 2016 14:06
> To: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] Stalling IO with cache tier
>
> Hi,
>
>
> *snipsnap*
>
>
> >> # ceph osd tier
Hi,
*snipsnap*
# ceph osd tier add cinder-volumes cache
pool 'cache' is now (or already was) a tier of 'cinder-volumes'
# ceph osd tier cache-mode cache writeback
set cache-mode for pool 'cache' to writeback
# ceph osd tier set-overlay cinder-volumes cache
overlay for 'cinder-volumes' is now
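As a hedged addition: on Jewel a writeback tier also expects hit_set
parameters to be configured, roughly like this (values illustrative):

ceph osd pool set cache hit_set_type bloom
ceph osd pool set cache hit_set_count 1
ceph osd pool set cache hit_set_period 3600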
Hi,
In addition, some log output was generated by the KVM processes:
> qemu: terminating on signal 15 from pid 2827
> osdc/ObjectCacher.cc: In function 'ObjectCacher::~ObjectCacher()'
> thread 7f265a77da80 time 2016-11-23 17:26:24.237542
> osdc/ObjectCacher.cc: 551: FAILED assert(i->empty())
> ceph
Hi list,
Our current Ceph production cluster seems to cope with performance
issues, so we decided to add a fully flash based cache tier (now running
with spinners and journals on separate SSDs).
We ordered SSDs (Intel), disk trays and read
On 24/11/16 11:23, John Spray wrote:
On Thu, Nov 24, 2016 at 11:09 AM, Stephen Harker
wrote:
Hi All,
This morning I went looking for information on the Ceph release timelines
and so on and was directed to this page by Google:
Hi,
On 24/11/16 at 12:09, Stephen Harker wrote:
Hi All,
This morning I went looking for information on the Ceph release
timelines and so on and was directed to this page by Google:
http://docs.ceph.com/docs/jewel/releases/
but this doesn't seem to have been updated for a long time.
On Thu, Nov 24, 2016 at 11:09 AM, Stephen Harker
wrote:
> Hi All,
>
> This morning I went looking for information on the Ceph release timelines
> and so on and was directed to this page by Google:
>
> http://docs.ceph.com/docs/jewel/releases/
Replace jewel with
Hi All,
This morning I went looking for information on the Ceph release
timelines and so on and was directed to this page by Google:
http://docs.ceph.com/docs/jewel/releases/
but this doesn't seem to have been updated for a long time. Is there
somewhere else I should be looking?
Hi Nick,
Thank you for your helpful information.
I knew that Ceph recommends 1 GB of RAM per 1 TB of storage, but we are not
going to change the hardware architecture now.
Are there any methods to limit the resources a single OSD can consume?
And for your question, we currently set system configuration as:
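(The poster's actual settings are cut off above.) As a generic, hedged sketch
of capping a single OSD daemon's memory with systemd (unit name per the
standard packaging; the limit value is invented):

systemctl set-property ceph-osd@3.service MemoryLimit=4G
# or put MemoryLimit=4G under [Service] in a drop-in created with
# systemctl edit ceph-osd@3.service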
Replying to myself.
On Wed, 23 Nov 2016 18:50:02 +0100
grin wrote:
> This is possibly some network issue, but I cannot see any indication of
> where to look. mon0 usually stands in quorum alone, and the other mons
> cannot join. They get the monmap, they intend to join, but it just
>
Hi Craig,
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Craig
Chi
Sent: 24 November 2016 08:34
To: ceph-users@lists.ceph.com
Subject: [ceph-users] Ceph OSDs cause kernel unresponsive
Hi Cephers,
We have encountered a kernel hang issue on our Ceph cluster. Just
radosgw supports keystone v3 in Jewel.
Can you give more details about the error? What is the exact command
you are trying?
A radosgw log with debug_rgw=20 and debug_ms=5 will be most helpful.
On Tue, Nov 22, 2016 at 10:24 AM, 한승진 wrote:
> I've figured out what the main reason is.
>
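A hedged sketch of enabling the suggested logging on a running gateway (the
admin socket path depends on the rgw instance name, shown as a placeholder):

ceph daemon /var/run/ceph/ceph-client.rgw.<name>.asok config set debug_rgw 20
ceph daemon /var/run/ceph/ceph-client.rgw.<name>.asok config set debug_ms 5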
Hi Cephers,
We have encountered a kernel hang issue on our Ceph cluster, just
like http://imgur.com/a/U2Flz, http://imgur.com/a/lyEko or http://imgur.com/a/IGXdu.
We believe it is caused by running out of memory, because we observed that when
OSDs went crazy, the available memory of each node was
On 11/12/2016 05:30 AM, Bill Sanders wrote:
> I'm curious what the relationship is between python_ceph_cfg[0] and
> DeepSea, which have some overlap in contributors and functionality (and
> supporting organizations?).
DeepSea and python-ceph-cfg look at ceph deployment from two different