I'd like to see a few of the cache tier counters exposed. You get
some info on cache activity in 'ceph -s', so it makes sense from my
perspective to have the same information available as exposed counters.
There's a tracker for this request (opened by me a while ago):
https://tracker.ceph.com/issues/37156
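In the meantime you can at least see which tier-related counters an OSD
already exposes via its admin socket (OSD id illustrative; the counters
simply stay at zero on OSDs not involved in a cache tier):

  ceph daemon osd.0 perf dump | grep -i tier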
at 12:53 PM Benjeman Meekhof wrote:
>
> Ceph Nautilus, 14.2.2, RGW civetweb.
> Trying to read from the RGW admin api /metadata/user with request URL like:
> GET /admin/metadata/user?key=someuser&format=json
>
> But I'm getting a 403 denied error from RGW. Shouldn't the caps below
> be sufficient, or am I missing something?
Ceph Nautilus, 14.2.2, RGW civetweb.
Trying to read from the RGW admin api /metadata/user with request URL like:
GET /admin/metadata/user?key=someuser&format=json
But I'm getting a 403 denied error from RGW. Shouldn't the caps below
be sufficient, or am I missing something?
"caps": [
{
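If it helps, the cap that usually gates the /admin/metadata endpoints
can be added like so (uid illustrative):

  radosgw-admin caps add --uid=adminuser --caps="metadata=read"
  radosgw-admin user info --uid=adminuser   # the cap should then show under "caps"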
I suggest having a look at this thread, which argues that sizes 'in
between' the requirements of different RocksDB levels have no net
effect, and sizing accordingly.
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-October/030740.html
My impression is that 28GB is good (L0+L1+L2+L3), or 280GB for the next level up.
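For reference, those round numbers fall out of RocksDB's default level
sizing (assuming BlueStore's defaults of a 256MB level base and a 10x
multiplier; check bluestore_rocksdb_options if you've changed them):

  echo $(( 256 + 2560 + 25600 )) MB            # 28416 MB, ~28GB: room through L3
  echo $(( 256 + 2560 + 25600 + 256000 )) MB   # 284416 MB, ~278GB: room through L4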
Hi Vlad,
If a user creates a bucket then only that user can see the bucket
unless an S3 ACL is applied giving additional permissions, but I'd
guess you are asking a more complex question than that.
If you are looking to apply some kind of policy over-riding whatever
ACL a user might apply to a
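For the simple case, the bucket owner can grant another user access
with any S3 client; e.g. with s3cmd, configured with the owner's keys
(bucket and uid illustrative):

  s3cmd setacl s3://somebucket --acl-grant=read:otheruser
  s3cmd info s3://somebucket    # shows the resulting ACL grants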
We have a user syncing data with some kind of rsync + hardlink based
system creating/removing large numbers of hard links. We've
encountered many of the issues with stray inode re-integration as
described in the thread and tracker below.
As noted, one fix is to increase mds_bal_fragment_size_max.
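E.g., on Mimic or later this can be set through the config database
(value illustrative; the default is 100000):

  ceph config set mds mds_bal_fragment_size_max 200000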
Hi all,
I'm looking to keep some extra meta-data associated with radosgw users
created by radosgw-admin. I saw in the output of 'radosgw-admin
metadata get user:someuser' there is an 'attrs' structure that looked
promising. However it seems to be strict about what it accepts so I
wonder if
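For context, the round trip being attempted is essentially this
(user id illustrative):

  radosgw-admin metadata get user:someuser > user.json
  # edit user.json (e.g. the "attrs" array), then write it back with:
  radosgw-admin metadata put user:someuser < user.json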
Version: Mimic 13.2.2
Lately during any kind of cluster change, particularly adding OSDs in
this most recent instance, I'm seeing our mons (all of them) showing
100% usage on a single core but not at all using any of the other
available cores on the system. Cluster commands are slow to respond
Hi Rishabh,
You might want to check out these examples for python boto3 which include SSE-C:
https://github.com/boto/boto3/blob/develop/boto3/examples/s3.rst
As already noted, use 'radosgw-admin' to retrieve the access key and secret
key to plug into your client. If you are not an administrator on
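For example (uid illustrative):

  radosgw-admin user info --uid=someuser
  # the access_key / secret_key pair is in the "keys" array of the output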
MiB) Bytes in malloc metadata
MALLOC:   ------------
MALLOC: =  12869599232 (12273.4 MiB) Actual memory used (physical + swap)
MALLOC: +    436740096 (  416.5 MiB) Bytes released to OS (aka unmapped)
MALLOC:   ------------
MALLOC: =  13306339328 (12689.9 MiB) Virtual address space used
378986760 (  361.4 MiB) Bytes in central cache freelist
MALLOC: +      4713472 (    4.5 MiB) Bytes in transfer cache freelist
MALLOC: +     20722016 (   19.8 MiB) Bytes in thread cache freelists
MALLOC: +     62652416 (   59.8 MiB) Bytes in malloc metadata
MALLOC:   ------------
MALLOC: = 128
Lately I've been encountering much higher than expected memory usage
on our MDS, which doesn't align with mds_cache_memory_limit even
accounting for potential overruns. Our memory limit is 4GB but the
MDS process is steadily at around 11GB used.
Coincidentally we also have a new user heavily
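For reference, tcmalloc stats like the ones quoted above can be pulled
from an MDS with the heap admin command, and memory tcmalloc is holding
can be handed back with heap release (MDS name illustrative):

  ceph tell mds.mds1 heap stats
  ceph tell mds.mds1 heap release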
I can comment on that docker image: We built that to bake in a
certain amount of config regarding nfs-ganesha serving CephFS and
using LDAP to do idmap lookups (example LDAP entries are in the README).
At least as we use it the server-side uid/gid information is pulled
from sssd using a config file
to the question
might be interesting for future reference.
thanks,
Ben
On Thu, Jun 21, 2018 at 11:32 AM, Benjeman Meekhof wrote:
> Thanks very much John! Skipping over the corrupt entry by setting a
> new expire_pos seems to have worked. The journal expire_pos is now
> advancing
integrity.
As recommended I did take an export of the journal first and I'll take
a stab at using a hex editor on it near future. Worst case we go
through the tag/scan if necessary.
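The export itself is just the standard cephfs-journal-tool dump
(output filename illustrative):

  cephfs-journal-tool journal export backup.bin
  cephfs-journal-tool journal inspect   # quick integrity check before/after edits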
thanks,
Ben
On Thu, Jun 21, 2018 at 9:04 AM, John Spray wrote:
> On Wed, Jun 20, 2018 at 2:17 PM Benjeman Meek
out": {
"stripe_unit": 4194304,
"stripe_count": 1,
"object_size": 4194304,
"pool_id": 64,
"pool_ns": ""
}
}
thanks,
Ben
On Fri, Jun 15, 2018 at 11:54 AM, John Spray wrote:
> On Fri, Jun 15, 2018
I've seen some posts and issue trackers related to this topic in the
past but haven't been able to put it together to resolve the issue I'm
having. All on Luminous 12.2.5 (upgraded over time from past
releases). We are going to upgrade to Mimic in the near future if that would
somehow resolve the
I see that luminous RPM packages are up at download.ceph.com for
ganesha-ceph 2.6 but there is nothing in the Deb area. Any estimates
on when we might see those packages?
http://download.ceph.com/nfs-ganesha/deb-V2.6-stable/luminous/
thanks,
Ben
Hi Marc,
I can't speak to your other questions, but as far as the user auth caps
go, those are still kept in the radosgw metadata outside of LDAP. As far
as I know all that LDAP gives you is a way to authenticate users with
a user/password combination.
So, for example, if you create a user
Hi Marc,
You mentioned following the instructions 'except for doing this ldap
token'. Do I read that correctly that you did not generate / use an
LDAP token with your client? I think that is a necessary part of
triggering the LDAP authentication (Section 3.2 and 3.3 of the doc you
linked). I
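Generating the token from the LDAP credentials looks roughly like this
(values illustrative), and the printed blob is what the S3 client then
sends as its access key:

  export RGW_ACCESS_KEY_ID="ldapuser"
  export RGW_SECRET_ACCESS_KEY="ldappassword"
  radosgw-token --encode --ttype=ldap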
e:
> Hi Benjeman,
>
> It is -intended- to work, identically to the standalone radosgw
> server. I can try to verify whether there could be a bug affecting
> this path.
>
> Matt
>
> On Fri, Mar 9, 2018 at 12:01 PM, Benjeman Meekhof <bmeek...@umich.edu> wrote
I'm having issues exporting a radosgw bucket if the configured user is
authenticated using the rgw ldap connectors. I've verified that this
same ldap token works ok for other clients, and as I'll note below it
seems like the rgw instance is contacting the LDAP server and
successfully
We use this one, now heavily modified in our own fork. I'd sooner
point you at the original unless it is missing something you need.
Ours has diverged a bit and makes no attempt to support anything
outside our specific environment (RHEL7).
https://github.com/openstack/puppet-ceph
The 'cannot stat' messages are normal at startup, we see them also in
our working setup with mgr influx module. Maybe they could be fixed
by delaying the module startup, or having it check for some other
'all good' status, but I haven't looked into it. You should only be
seeing them when the mgr
In our case I think we grabbed the SRPM from Fedora and rebuilt it on
Scientific Linux (another RHEL derivative). Presumably the binary
didn't work or I would have installed it directly. I'm not quite sure
why it hasn't migrated to EPEL yet.
I haven't tried the SRPM for latest releases, we're
Hi Reed,
Someone in our group originally wrote the plugin and put in PR. Since
our commit the plugin was 'forward-ported' to master and made
incompatible with Luminous so we've been using our own version of the
plugin while waiting for the necessary pieces to be back-ported to
Luminous to use
module log: mgr get_python
Python module requested unknown data 'pg_status'
thanks,
Ben
On Thu, Oct 5, 2017 at 8:42 AM, John Spray <jsp...@redhat.com> wrote:
> On Wed, Oct 4, 2017 at 7:14 PM, Gregory Farnum <gfar...@redhat.com> wrote:
>> On Wed, Oct 4, 2017 at 9:14 AM, Be
Wondering if anyone can tell me how to summarize recovery
bytes/ops/objects from counters available in the ceph-mgr python
interface? To put it another way, how does the ceph -s command put
together that information, and can I access that information from a
counter queryable by the ceph-mgr python
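For comparison, the recovery numbers ceph -s shows come from the pgmap
section of the cluster status; while recovery is active they are visible
in the status JSON as well (field names from memory, check your own output):

  ceph status -f json | python -m json.tool | grep recovering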
Some of this thread seems to contradict the documentation and confuses
me. Is the statement below correct?
"The BlueStore journal will always be placed on the fastest device
available, so using a DB device will provide the same benefit that the
WAL device would while also allowing additional
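If that statement holds (and it matches my understanding), specifying
only a DB device is enough and the WAL will live there too; e.g. with
ceph-volume (device paths illustrative):

  ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/nvme0n1
  # no separate --block.wal needed unless you have an even faster device for it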
p ceph-sn1.example.com:/dev/mapper/disk1
> ceph-deploy osd prepare ceph-sn1.example.com:/dev/mapper/disk1
>
> Best wishes,
> Bruno
>
>
> -----Original Message-----
> From: Benjeman Meekhof [mailto:bmeek...@umich.edu]
> Sent: 11 July 2017 18:46
> To: Canning, Bruno
Hi Bruno,
We have similar types of nodes and minimal configuration is required
(RHEL7-derived OS). Install device-mapper-multipath or equivalent
package, configure /etc/multipath.conf and enable 'multipathd'. If it is
working correctly, the command 'multipath -ll' should output multipath
devices and
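On an EL7 system the whole setup is roughly (package and command names
for RHEL7 derivatives; adjust for your distro):

  yum install device-mapper-multipath
  mpathconf --enable           # writes a default /etc/multipath.conf
  systemctl enable multipathd
  systemctl start multipathd
  multipath -ll                # should list the multipath devices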
Hi Sage,
We did at one time run multiple clusters on our OSD nodes and RGW
nodes (with Jewel). We accomplished this by putting code in our
puppet-ceph module that would create additional systemd units with
appropriate CLUSTER=name environment settings for clusters not named
ceph. I.e., if the
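Done by hand instead of puppet, it's roughly (cluster name and OSD id
illustrative):

  cp /usr/lib/systemd/system/ceph-osd@.service /etc/systemd/system/cluster2-osd@.service
  # in the copy, change Environment=CLUSTER=ceph to Environment=CLUSTER=cluster2
  systemctl daemon-reload
  systemctl start cluster2-osd@12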
Hi all,
Even with debug_osd 0/0 as well as every other debug_ setting at 0/0 I
still get logs like those pasted below in
/var/log/ceph/ceph-osd.<id>.log when the relevant situation arises
(release 11.2.0).
Any idea what toggle switches these off? I went through and set
every single debug_ setting
Hi,
I'm seeing some SELinux denials for ops to nvme devices. They only
occur at OSD start; they are not ongoing. I'm not sure it's causing
an issue, though I did try a few tests with SELinux in permissive mode
to see if it made any difference with startup/recovery CPU loading we
have seen since
far...@redhat.com> wrote:
> On Thu, Feb 16, 2017 at 9:19 AM, Benjeman Meekhof <bmeek...@umich.edu> wrote:
>> I tried starting up just a couple OSD with debug_osd = 20 and
>> debug_filestore = 20.
>>
>> I pasted a sample of the ongoing log here. To my eyes it do
ded+inconsistent
1 active+degraded+inconsistent
On Thu, Feb 16, 2017 at 5:08 PM, Shinobu Kinjo <ski...@redhat.com> wrote:
> Would you simply do?
>
> * ceph -s
>
> On Fri, Feb 17, 2017 at 6:26 AM, Benjeman Meekhof <bmeek...@umich.edu> wrote:
>> As I'm
share_map_peer 0x7fc68f4c1000 already has epoch 152609
2017-02-16 16:23:35.577356 7fc6704e4700 20 osd.564 152609
share_map_peer 0x7fc68f4c1000 already has epoch 152609
thanks,
Ben
On Thu, Feb 16, 2017 at 12:19 PM, Benjeman Meekhof <bmeek...@umich.edu> wrote:
> I tried starting up just a c
to revert to Jewel except perhaps one host to
continue testing.
thanks,
Ben
On Tue, Feb 14, 2017 at 3:55 PM, Gregory Farnum <gfar...@redhat.com> wrote:
> On Tue, Feb 14, 2017 at 11:38 AM, Benjeman Meekhof <bmeek...@umich.edu> wrote:
>> Hi all,
>>
>> We encountered an
Hi all,
I'd also not like to see cache tiering in the current form go away.
We've explored using it in situations where we have a data pool with
replicas spread across WAN sites which we then overlay with a fast
cache tier local to the site where most clients will be using the
pool. This
Hi all,
We encountered an issue updating our OSD from Jewel (10.2.5) to Kraken
(11.2.0). OS was RHEL derivative. Prior to this we updated all the
mons to Kraken.
After updating ceph packages I restarted the 60 OSD on the box with
'systemctl restart ceph-osd.target'. Very soon after the system
more RADOS handles?
>>
>> rgw_num_rados_handles = 8
>>
>> That with more RGW threads as Mark mentioned.
>>
>> Wido
>>
>> > I believe some folks are considering trying to migrate rgw to a
>> > threadpool/event processing model but it sounds lik
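In ceph.conf terms those suggestions amount to something like this
(section name and values illustrative):

  [client.rgw.gateway1]
  rgw thread pool size = 512
  rgw num rados handles = 8
  rgw frontends = civetweb port=7480 num_threads=512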
Hi all,
We're doing some stress testing with clients hitting our rados gw
nodes with simultaneous connections. When the number of client
connections exceeds about 5400 we start seeing 403 forbidden errors
and log messages like the following:
2017-02-09 08:53:16.915536 7f8c667bc700 0 NOTICE:
Hi Daniel,
50 ms of latency is going to introduce a big performance hit though
things will still function. We did a few tests which are documented
at http://www.osris.org/performance/latency
thanks,
Ben
On Tue, Feb 7, 2017 at 12:17 PM, Daniel Picolli Biazus
wrote:
> Hi
Hi Nick,
We have a Ceph cluster spread across 3 datacenters at 3 institutions
in Michigan (UM, MSU, WSU). It certainly is possible. As noted you
will have increased latency for write operations and overall reduced
throughput as latency increases. Latency between our sites is 3-5ms.
We did
+1 to this, it would be useful
On Tue, Oct 18, 2016 at 8:31 AM, Wido den Hollander wrote:
>
>> On 18 October 2016 at 14:06, Dan van der Ster wrote:
>>
>>
>> +1 I would find this warning useful.
>>
>
> +1 Probably make it configurable, say, you want at least
For automatically collecting stats like this you might also look into
collectd. It has many plugins for different system statistics
including one for collecting stats from Ceph daemon admin sockets.
There are several ways to collect and view the data from collectd. We
are pointing clients at
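The admin-socket collection is collectd's ceph plugin; a minimal config
sketch (daemon name and socket path illustrative):

  LoadPlugin ceph
  <Plugin ceph>
    <Daemon "osd.0">
      SocketPath "/var/run/ceph/ceph-osd.0.asok"
    </Daemon>
  </Plugin>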
> Hello,
>
> On Wed, 18 May 2016 12:32:25 -0400 Benjeman Meekhof wrote:
>
>> Hi Lionel,
>>
>> These are all very good points we should consider, thanks for the
>> analysis. Just a couple clarifications:
>>
>> - NVMe in this system are actually slot
cases).
regards,
Ben
On Wed, May 18, 2016 at 12:02 PM, Lionel Bouton <lionel+c...@bouton.name> wrote:
> Hi,
>
> I'm not yet familiar with Jewel, so take this with a grain of salt.
>
> On 18/05/2016 16:36, Benjeman Meekhof wrote:
>> We're in process of tuning a clus
Hi Michael,
The systemctl pattern for OSDs with Infernalis or higher is 'systemctl
start ceph-osd@<id>' (or status, restart).
It will start the OSD in the default cluster 'ceph', or in another cluster
if you have set 'CLUSTER=<name>' in /etc/sysconfig/ceph.
If by chance you have 2 clusters on the same hardware you'll have
Hi Michael,
The partprobe issue was resolved for me by updating parted to the
package from Fedora 22: parted-3.2-16.fc22.x86_64. It shouldn't
require updating any other dependencies to install on EL7 varieties.
http://tracker.ceph.com/issues/15176
regards,
Ben
On Thu, Apr 14, 2016 at 12:35