[ceph-users] Re: Re: Ceph user management question

2016-09-28 Thread 卢 迪
ok. thanks.


From: Daleep Singh Bais
Sent: 28 September 2016 8:14:53
To: 卢 迪; ceph-users@lists.ceph.com
Subject: Re: Re: [ceph-users] Ceph user management question

Hi Dillon,

Please check 
http://docs.ceph.com/docs/firefly/rados/operations/auth-intro/#ceph-authorization-caps

http://docs.ceph.com/docs/jewel/rados/operations/user-management/

This might provide some information on permissions.

Thanks,
Daleep Singh Bais

On 09/28/2016 11:28 AM, 卢 迪 wrote:

Hi Daleep,



Thank you for reply.

I have read through the document. Let me try to clarify this.



In my case, I only assigned the mon 'allow r' permission to the account appuser. But I 
can still mount CephFS and see the directory created before (the folder name is 
"test").


And I can create a folder under this folder too (the folder is "test2").

However, when I created and edited a text file ("test.txt"), I got a read-only 
error. When I quit with "q!", I still see the file with 0 bytes.

 [inline screenshot attachment]

I'm wondering whether I misunderstand something. I thought I shouldn't see this 
folder "test" because the user doesn't have read/write permission against 
any pool in this cluster. I also shouldn't be able to create "test.txt" in this 
folder because of permissions (but I CREATED it, with no content).



Compare this to assigning permissions to an OS user (for example, on Linux): I have to 
give read permission if a user wants to read a file; if it has to execute a script, I 
have to grant the execute permission. I want to understand when and why I 
should assign which permission to a user for a specific task. Can I find 
this kind of document?



Thanks,

Dillon


From: Daleep Singh Bais
Sent: 27 September 2016 6:55:10
To: 卢 迪; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Ceph user management question

Hi Dillon,

Ceph uses CephX authentication, which gives users permission to read/write on
selected pools. We give mon 'allow r' so the client can get the cluster/CRUSH maps.

You can refer to the URL below for more information on CephX and creating user 
keyrings for access to selected / specific pools.

http://docs.ceph.com/docs/jewel/rados/configuration/auth-config-ref/




The below URL will give you information on various permissions which can be 
applied while creating a CephX authentication key.

http://docs.ceph.com/docs/firefly/rados/operations/auth-intro/
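
For reference, a minimal sketch of creating a CephX user restricted to one pool. The
user name, pool name, and keyring path are placeholders; adjust the caps to what the
client actually needs:

ceph auth get-or-create client.appuser \
    mon 'allow r' \
    mds 'allow' \
    osd 'allow rw pool=cephfs_data' \
    -o /etc/ceph/ceph.client.appuser.keyring
# inspect what was granted
ceph auth get client.appuser
# adjust later without recreating the user (note: the caps given here replace the
# existing set, so repeat every cap you want to keep)
ceph auth caps client.appuser mon 'allow r' osd 'allow r pool=cephfs_data'

Note that CephFS directory and file-creation operations go through the MDS, which
writes the metadata itself, so what a client can see or create is not controlled by
its OSD pool caps alone; actual data writes do need the osd caps, which is consistent
with the 0-byte file described below.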
Ceph Authentication & Authorization — Ceph 
Documentation
docs.ceph.com
Ceph Authentication & Authorization¶ Ceph is a distributed storage system where 
a typical deployment involves a relatively small quorum of monitors, scores of 
...




Hope this will give some insight and way forward to proceed.

Thanks,

Daleep Singh Bais

On 09/27/2016 12:02 PM, 卢 迪 wrote:

Hello all,


I'm a newbie to Ceph. I read the documentation and created a Ceph cluster on 
VMs. I have a question about how to apply user management to the cluster. I'm 
not asking how to create or modify users or user privileges; I have already found 
that in the Ceph documentation.


I want to know:


1. Is there a way to learn the usage of all privileges? For example, I created 
a user client.appuser with mon "allow r", and this user can access Ceph; if I 
remove the mon "allow r", it times out (in this case, I mount the cluster with 
CephFS). If someone has this information, could you please share it with me?


2. In what kind of situation would you create different users for the cluster? 
Currently, I use the admin user to access the whole cluster, e.g. to start the 
cluster, mount the file system, etc. It looks like the appuser (which I created 
above) can mount the file system too. Is it possible to create a user like an OS 
user or a database user, so that when one user uploads some data, the others 
can't see it or can only read it?


ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] KVM VM using RBD volume hangs for 120s when one of the nodes crashes

2016-09-28 Thread wei li
Hi, colleagues!
  I'm using Ceph 10.0.2 and have built a Ceph cluster in order to use it in a
production environment.
  I'm using the OpenStack L release. I tested a Ceph OSD node crash, e.g.
pulling out the power supply or the network cable.
  At the same time, commands I try to run inside the VM hang. After 120s
they become OK again, and the console prints "blocked for more
than 120 seconds."
  I configured "mon osd report timeout" to 20 seconds, so when the network
cable is pulled out, the OSD is marked down after less than 30 seconds.
  Is there any way to adjust some parameters to shorten this hang period to
20-30 seconds?
Mars
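
For reference, the failure-detection settings involved here, as a ceph.conf sketch.
The values shown are illustrative, not recommendations; lowering them too far can
cause OSD flapping:

[global]
# how long an OSD waits for a peer's heartbeat before reporting it failed (default 20s)
osd heartbeat grace = 20
# how many distinct OSDs must report a peer down before the monitors mark it down
mon osd min down reporters = 2
# grace period before the monitors mark OSDs down when they stop reporting in
mon osd report timeout = 30

Client I/O to the affected PGs resumes once a new OSD map marking the OSD down reaches
the clients, so the hang is roughly bounded by failure-detection time plus peering time.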
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph with Cache pool - disk usage / cleanup

2016-09-28 Thread Christian Balzer

Hello,

On Wed, 28 Sep 2016 19:36:28 +0200 Sascha Vogt wrote:

> Hi Christian,
> 
> Am 28.09.2016 um 16:56 schrieb Christian Balzer:
> > 0.94.5 has a well known and documented bug, it doesn't rotate the omap log
> > of the OSDs.
> > 
> > Look into "/var/lib/ceph/osd/ceph-xx/current/omap/" of the cache tier and 
> > most likely discover a huge "LOG" file.
> You're right, it was around 200 MB on each OSD (so in total 3,2 GB).
> Double restart of the OSDs fixed that. Can this log file keep 0-byte
> files alive? In other words could this LOG file be the reason for the
> missing 840 GB - or did you think that the log itself might be in the GB
> range?
> 
The latter, if not watched long enough. 
But yeah, that doesn't explain the amounts you were mentioning.

I don't think the LOG is keeping the 0-byte files alive, though.

In general these are objects that have been evicted from the cache and if
it's very busy you will wind up with each object that's on the backing
pool also being present (if just as 0-byte file) in your cache tier.

Similarly, objects that get created on the cache tier (writes) and have
not been flushed yet will have a 0-byte file on the backing pool.

So that is going to eat up space in a fashion. 

In your particular case, I'd expect objects that are deleted to be gone,
maybe with some delay.

Can you check/verify that the deleted objects are actually gone on the
backing pool?
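
A quick way to spot-check that, reusing the pool and block-name prefix from the
earlier mail (a sketch; the object that gets stat'ed is simply whichever one the
listing returns first):

# list what the backing pool still reports under the deleted image's prefix
rados -p ephemeral-vms ls | grep rbd_data.2c383a0238e1f29 | tee leftover.txt | wc -l
# stat one of them; a truly deleted object returns "No such file or directory"
rados -p ephemeral-vms stat "$(head -1 leftover.txt)"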


> Anyway, already thanks for the hint about the log file. We'll keep an
> eye on that one and try to upgrade to Hammer soon!
> 
Well you're already on Hammer. ^o^
Just don't upgrade to 0.94.6, whatever you do (lethal cache tier bug).

If you don't have too many OSDs (see the various threads here), upgrading
to 0.94.9, or the to-be-released .10 which addresses the encoding storms,
should be fine.

At this point in time I think Jewel still has too many rough edges, but
that's me.
Take note (search the ML archives) that Jewel massively changes the cache
tiering behavior (not promoting things as readily as Hammer), so make sure
you don't get surprised there.

Christian

> Greetings
> -Sascha-
> 


-- 
Christian Balzer        Network/Systems Engineer
ch...@gol.com   Global OnLine Japan/Rakuten Communications
http://www.gol.com/
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] OSD Down but not marked down by cluster

2016-09-28 Thread Tyler Bishop
S1148 is down but the cluster does not mark it as such. 

cluster 3aac8ab8-1011-43d6-b281-d16e7a61b2bd 
health HEALTH_WARN 
3888 pgs backfill 
196 pgs backfilling 
6418 pgs degraded 
52 pgs down 
52 pgs peering 
1 pgs recovery_wait 
3653 pgs stuck degraded 
52 pgs stuck inactive 
6088 pgs stuck unclean 
3653 pgs stuck undersized 
6417 pgs undersized 
186 requests are blocked > 32 sec 
recovery 42096983/185765821 objects degraded (22.661%) 
recovery 49940341/185765821 objects misplaced (26.883%) 
16/330 in osds are down 
monmap e1: 3 mons at 
{ceph0-mon0=10.1.8.40:6789/0,ceph0-mon1=10.1.8.41:6789/0,ceph0-mon2=10.1.8.42:6789/0}
 
election epoch 13550, quorum 0,1,2 ceph0-mon0,ceph0-mon1,ceph0-mon2 
osdmap e236889: 370 osds: 314 up, 330 in; 4096 remapped pgs 
pgmap v47890297: 20920 pgs, 19 pools, 316 TB data, 85208 kobjects 
530 TB used, 594 TB / 1125 TB avail 
42096983/185765821 objects degraded (22.661%) 
49940341/185765821 objects misplaced (26.883%) 
14390 active+clean 
3846 active+undersized+degraded+remapped+wait_backfill 
2375 active+undersized+degraded 
196 active+undersized+degraded+remapped+backfilling 
52 down+peering 
42 active+remapped+wait_backfill 
11 active+remapped 
7 active+clean+scrubbing+deep 
1 active+recovery_wait+degraded+remapped 
recovery io 2408 MB/s, 623 objects/s 


-43 304.63928 host ceph0-s1148 
303 5.43999 osd.303 down 0 1.0 
304 5.43999 osd.304 down 0 1.0 
305 5.43999 osd.305 down 0 1.0 
306 5.43999 osd.306 down 0 1.0 
307 5.43999 osd.307 down 0 1.0 
308 5.43999 osd.308 down 0 1.0 
309 5.43999 osd.309 down 0 1.0 
310 5.43999 osd.310 down 0 1.0 
311 5.43999 osd.311 down 0 1.0 
312 5.43999 osd.312 down 0 1.0 
313 5.43999 osd.313 down 0 1.0 
314 5.43999 osd.314 down 0 1.0 
315 5.43999 osd.315 down 0 1.0 
316 5.43999 osd.316 down 0 1.0 
317 5.43999 osd.317 down 0 1.0 
318 5.43999 osd.318 down 0 1.0 
319 5.43999 osd.319 down 0 1.0 
320 5.43999 osd.320 down 0 1.0 
321 5.43999 osd.321 down 0 1.0 
322 5.43999 osd.322 down 0 1.0 
323 5.43999 osd.323 down 0 1.0 
324 5.43999 osd.324 down 0 1.0 
325 5.43999 osd.325 down 0 1.0 
326 5.43999 osd.326 down 0 1.0 
327 5.43999 osd.327 down 0 1.0 
328 5.43999 osd.328 down 0 1.0 
329 5.43999 osd.329 down 0 1.0 
330 5.43999 osd.330 down 0 1.0 
331 5.43999 osd.331 down 0 1.0 
332 5.43999 osd.332 down 1.0 1.0 
333 5.43999 osd.333 down 1.0 1.0 
334 5.43999 osd.334 down 1.0 1.0 
335 5.43999 osd.335 down 0 1.0 
337 5.43999 osd.337 down 1.0 1.0 
338 5.43999 osd.338 down 0 1.0 
339 5.43999 osd.339 down 1.0 1.0 
340 5.43999 osd.340 down 0 1.0 
341 5.43999 osd.341 down 0 1.0 
342 5.43999 osd.342 down 0 1.0 
343 5.43999 osd.343 down 0 1.0 
344 5.43999 osd.344 down 0 1.0 
345 5.43999 osd.345 down 0 1.0 
346 5.43999 osd.346 down 0 1.0 
347 5.43999 osd.347 down 1.0 1.0 
348 5.43999 osd.348 down 1.0 1.0 
349 5.43999 osd.349 down 0 1.0 
350 5.43999 osd.350 down 1.0 1.0 
351 5.43999 osd.351 down 1.0 1.0 
352 5.43999 osd.352 down 1.0 1.0 
353 5.43999 osd.353 down 1.0 1.0 
354 5.43999 osd.354 down 1.0 1.0 
355 5.43999 osd.355 down 1.0 1.0 
356 5.43999 osd.356 down 1.0 1.0 
357 5.43999 osd.357 down 1.0 1.0 
358 5.43999 osd.358 down 0 1.0 
369 5.43999 osd.369 down 1.0 1.0 
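
A few things worth checking when dead OSDs are not being marked down (a sketch; the
monitor name is taken from the monmap above, and the ceph daemon calls must run on
the host carrying that monitor):

# are the nodown/noout flags set cluster-wide?
ceph osd dump | grep flags
# what do the monitors require before marking an OSD down?
ceph daemon mon.ceph0-mon0 config get mon_osd_min_down_reporters
ceph daemon mon.ceph0-mon0 config get mon_osd_down_out_interval
# blocked requests often point at the OSDs everyone is waiting for
ceph health detail | grep -i blocked | head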







Tyler Bishop 
Chief Technical Officer 
513-299-7108 x10 



tyler.bis...@beyondhosting.net 


If you are not the intended recipient of this transmission you are notified 
that disclosing, copying, distributing or taking any action in reliance on the 
contents of this information is strictly prohibited. 


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Same pg scrubbed over and over (Jewel)

2016-09-28 Thread Arvydas Opulskis
Hi,

we have the same situation with one PG on a different cluster of ours. Scrubs and
deep-scrubs are running over and over for the same PG (38.34). I've logged some
period with the deep-scrub and some scrubs repeating. The OSD log from the primary
OSD can be found here:
https://www.dropbox.com/s/njmixbgzkfo1wws/ceph-osd.377.log.gz?dl=0

Cluster is Jewel 10.2.2. Btw, restarting primary osd service doesn't help.

Br,
Arvydas
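
For anyone else hitting this, the scrub timestamps and the relevant intervals can be
checked like this (a sketch, using the PG and primary OSD ids from this thread; the
daemon call must run on the host carrying osd.377):

# when this PG was last scrubbed / deep-scrubbed
ceph pg 38.34 query | grep -i scrub_stamp
# scrub interval settings on the primary OSD
ceph daemon osd.377 config show | grep -E 'osd_scrub_min_interval|osd_scrub_max_interval|osd_deep_scrub_interval'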


On Wed, Sep 21, 2016 at 2:35 PM, Samuel Just  wrote:

> Ah, same question then.  If we can get logging on the primary for one
> of those pgs, it should be fairly obvious.
> -Sam
>
> On Wed, Sep 21, 2016 at 4:08 AM, Pavan Rallabhandi
>  wrote:
> > We find this as well in our fresh built Jewel clusters, and seems to
> happen only with a handful of PGs from couple of pools.
> >
> > Thanks!
> >
> > On 9/21/16, 3:14 PM, "ceph-users on behalf of Tobias Böhm" <
> ceph-users-boun...@lists.ceph.com on behalf of t...@robhost.de> wrote:
> >
> > Hi,
> >
> > there is an open bug in the tracker: http://tracker.ceph.com/issues/16474
> >
> > It also suggests restarting OSDs as a workaround. We faced the same
> issue after increasing the number of PGs in our cluster and restarting OSDs
> solved it as well.
> >
> > Tobias
> >
> > > On 21.09.2016 at 11:26, Dan van der Ster <d...@vanderster.com> wrote:
> > >
> > > There was a thread about this a few days ago:
> > > http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-September/012857.html
> > > And the OP found a workaround.
> > > Looks like a bug though... (by default PGs scrub at most once per
> day).
> > >
> > > -- dan
> > >
> > >
> > >
> > > On Tue, Sep 20, 2016 at 10:43 PM, Martin Bureau <
> mbur...@stingray.com> wrote:
> > >> Hello,
> > >>
> > >>
> > >> I noticed that the same pg gets scrubbed repeatedly on our new
> Jewel
> > >> cluster:
> > >>
> > >>
> > >> Here's an excerpt from log:
> > >>
> > >>
> > >> 2016-09-20 20:36:31.236123 osd.12 10.1.82.82:6820/14316 150514 :
> cluster
> > >> [INF] 25.3f scrub ok
> > >> 2016-09-20 20:36:32.232918 osd.12 10.1.82.82:6820/14316 150515 :
> cluster
> > >> [INF] 25.3f scrub starts
> > >> 2016-09-20 20:36:32.236876 osd.12 10.1.82.82:6820/14316 150516 :
> cluster
> > >> [INF] 25.3f scrub ok
> > >> 2016-09-20 20:36:33.233268 osd.12 10.1.82.82:6820/14316 150517 :
> cluster
> > >> [INF] 25.3f deep-scrub starts
> > >> 2016-09-20 20:36:33.242258 osd.12 10.1.82.82:6820/14316 150518 :
> cluster
> > >> [INF] 25.3f deep-scrub ok
> > >> 2016-09-20 20:36:36.233604 osd.12 10.1.82.82:6820/14316 150519 :
> cluster
> > >> [INF] 25.3f scrub starts
> > >> 2016-09-20 20:36:36.237221 osd.12 10.1.82.82:6820/14316 150520 :
> cluster
> > >> [INF] 25.3f scrub ok
> > >> 2016-09-20 20:36:41.234490 osd.12 10.1.82.82:6820/14316 150521 :
> cluster
> > >> [INF] 25.3f deep-scrub starts
> > >> 2016-09-20 20:36:41.243720 osd.12 10.1.82.82:6820/14316 150522 :
> cluster
> > >> [INF] 25.3f deep-scrub ok
> > >> 2016-09-20 20:36:45.235128 osd.12 10.1.82.82:6820/14316 150523 :
> cluster
> > >> [INF] 25.3f deep-scrub starts
> > >> 2016-09-20 20:36:45.352589 osd.12 10.1.82.82:6820/14316 150524 :
> cluster
> > >> [INF] 25.3f deep-scrub ok
> > >> 2016-09-20 20:36:47.235310 osd.12 10.1.82.82:6820/14316 150525 :
> cluster
> > >> [INF] 25.3f scrub starts
> > >> 2016-09-20 20:36:47.239348 osd.12 10.1.82.82:6820/14316 150526 :
> cluster
> > >> [INF] 25.3f scrub ok
> > >> 2016-09-20 20:36:49.235538 osd.12 10.1.82.82:6820/14316 150527 :
> cluster
> > >> [INF] 25.3f deep-scrub starts
> > >> 2016-09-20 20:36:49.243121 osd.12 10.1.82.82:6820/14316 150528 :
> cluster
> > >> [INF] 25.3f deep-scrub ok
> > >> 2016-09-20 20:36:51.235956 osd.12 10.1.82.82:6820/14316 150529 :
> cluster
> > >> [INF] 25.3f deep-scrub starts
> > >> 2016-09-20 20:36:51.244201 osd.12 10.1.82.82:6820/14316 150530 :
> cluster
> > >> [INF] 25.3f deep-scrub ok
> > >> 2016-09-20 20:36:52.236076 osd.12 10.1.82.82:6820/14316 150531 :
> cluster
> > >> [INF] 25.3f scrub starts
> > >> 2016-09-20 20:36:52.239376 osd.12 10.1.82.82:6820/14316 150532 :
> cluster
> > >> [INF] 25.3f scrub ok
> > >> 2016-09-20 20:36:56.236740 osd.12 10.1.82.82:6820/14316 150533 :
> cluster
> > >> [INF] 25.3f scrub starts
> > >>
> > >>
> > >> How can I troubleshoot / resolve this ?
> > >>
> > >>
> > >> Regards,
> > >>
> > >> Martin
> > >>
> > >>
> > >>
> > >> ___
> > >> ceph-users mailing list
> > >> ceph-users@lists.ceph.com
> > >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > >>
> > > ___
> > > ceph-users mailing list
> > > ceph-users@lists.ceph.com
> 

[ceph-users] Attempt to access beyond end of device

2016-09-28 Thread Brady Deetz
The question:
Is this something I need to investigate further, or am I being paranoid?
Seems bad to me.


I have a fairly new cluster built using ceph-deploy 1.5.34-0, ceph
10.2.2-0, and centos 7.2.1511.

I recently noticed on every one of my osd nodes alarming dmesg log entries
for each osd on each node on some kind of periodic basis:
attempt to access beyond end of device
sda1: rw=0, want=11721043088, limit=11721043087

For instance one node had entries at times:
Sep 27 05:40:34
Sep 27 07:10:32
Sep 27 08:10:30
Sep 27 09:40:28
Sep 27 12:40:24
Sep 27 15:40:19

In every case, the "want" is 1 sector greater than the "limit"... My first
thought was 'could this be an off-by-one bug somewhere in Ceph?' But, after
thinking about the way stuff works and the data below, that seems unlikely.

Digging around I found and followed this redhat article:
https://access.redhat.com/solutions/21135

--
Error Message Device Size:
11721043087 * 512 = 6001174060544


Current Device Size:
cat /proc/partitions | grep sda1
8 1 5860521543 sda1

5860521543 * 1024 = 6001174060032


Filesystem Size:
sudo xfs_info /dev/sda1 | grep data | grep blocks
data = bsize=4096 blocks=1465130385, imaxpct=5

1465130385 * 4096 = 6001174056960
--

(EMDS != CDS) == true
Red Hat says device naming may have changed. All but 2 disks in the node are
identical. Those 2 disks are md raided and not exhibiting the issue. So, I
don't think this is the issue.

(FSS > CDS) == false
My filesystem is not larger than the device size or the error message
device size.
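
The same checks as a few shell one-liners, using the numbers above (sizes in bytes):

echo $((11721043087 * 512))    # error-message device size  -> 6001174060544
echo $((5860521543 * 1024))    # /proc/partitions size      -> 6001174060032
echo $((1465130385 * 4096))    # XFS filesystem size        -> 6001174056960
# the error-message size is exactly one 512-byte sector larger than the partition:
echo $((11721043087 * 512 - 5860521543 * 1024))   # -> 512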

Thanks,
Brady
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph with Cache pool - disk usage / cleanup

2016-09-28 Thread Sascha Vogt
Hi Christian,

Am 28.09.2016 um 16:56 schrieb Christian Balzer:
> 0.94.5 has a well known and documented bug, it doesn't rotate the omap log
> of the OSDs.
> 
> Look into "/var/lib/ceph/osd/ceph-xx/current/omap/" of the cache tier and 
> most likely discover a huge "LOG" file.
You're right, it was around 200 MB on each OSD (so in total 3,2 GB).
Double restart of the OSDs fixed that. Can this log file keep 0-byte
files alive? In other words could this LOG file be the reason for the
missing 840 GB - or did you think that the log itself might be in the GB
range?

Anyway, already thanks for the hint about the log file. We'll keep an
eye on that one and try to upgrade to Hammer soon!

Greetings
-Sascha-
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph Very Small Cluster

2016-09-28 Thread Vasu Kulkarni
On Wed, Sep 28, 2016 at 8:03 AM, Ranjan Ghosh  wrote:
> Hi everyone,
>
> Up until recently, we were using GlusterFS to have two web servers in sync
> so we could take one down and switch back and forth between them - e.g. for
> maintenance or failover. Usually, both were running, though. The performance
> was abysmal, unfortunately. Copying many small files on the file system
> caused outages for several minutes - simply unacceptable. So I found Ceph.
> It's fairly new but I thought I'd give it a try. I liked especially the
> good, detailed documentation, the configurability and the many command-line
> tools which allow you to find out what is going on with your Cluster. All of
> this is severly lacking with GlusterFS IMHO.
>
> Because we're on a very tiny budget for this project we cannot currently
> have more than two file system servers. I added a small Virtual Server,
> though, only for monitoring. So at least we have 3 monitoring nodes. I also
> created 3 MDS's, though as far as I understood, two are only for standby. To
> sum it up, we have:
>
> server0: Admin (Deployment started from here) + Monitor + MDS
> server1: Monitor + MDS + OSD
> server2: Monitor + MDS + OSD
>
> So, the OSD is on server1 and server2 which are next to each other connected
> by a local GigaBit-Ethernet connection. The cluster is mounted (also on
> server1 and server2) as /var/www and Apache is serving files off the
> cluster.
>
> I've used these configuration settings:
>
> osd pool default size = 2
> osd pool default min_size = 1
>
> My idea was that by default everything should be replicated on 2 servers
> i.e. each file is normally written on server1 and server2. In case of
> emergency though (one server has a failure), it's better to keep operating
> and only write the file to one server. Therefore, I set min_size = 1. My
> further understanding is (correct me if I'm wrong), that when the server
> comes back online, the files that were written to only 1 server during the
> outage will automatically be replicated to the server that has come back
> online.
>
> So far, so good. With two servers now online, the performance is light-years
> away from sluggish GlusterFS. I've also worked with XtreemFS, OCFS2, AFS and
> never had such a good performance with any Cluster. In fact it's so
> blazingly fast, that I had to check twice I really had the cluster mounted
> and wasn't accidentally working on the hard drive. Impressive. I can edit
> files on server1 and they are immediately changed on server2 and vice versa.
> Great!
>
Nice!, Thanks for sharing your details.

> Unfortunately, when I'm now stopping all ceph-Services on server1, the
> websites on server2 start to hang/freeze. And "ceph health" shows "#x
> blocked requests". Now, what I don't understand: Why is it blocking?
> Shouldn't both servers have the file? And didn't I set min_size to "1"? And
> if there are a few files (could be some unimportant stuff) that's missing on
> one of the servers: How can I abort the blocking? I'd rather have a missing
> file or whatever, than a completely blocking website.

Are all the pools using min_size 1? Did you check pg stat to see which ones
are waiting? Some steps to debug further are at
http://docs.ceph.com/docs/jewel/rados/operations/monitoring-osd-pg/

Also, did you shut down the server abruptly while it was busy?
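
A minimal set of commands for the checks above (a sketch):

ceph health detail                 # which PGs / requests are blocked, and why
ceph osd pool ls detail            # confirm size and min_size per pool
ceph pg dump_stuck unclean         # list the PGs that are not active+clean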


>
> Are my files really duplicated 1:1 - or are they perhaps spread evenly
> between both OSDs? Do I have to edit the crushmap to achieve a real
> "RAID-1"-type of replication? Is there a command to find out for a specific
> file where it actually resides and whether it has really been replicated?


>
> Thank you!
> Ranjan
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] fixing zones

2016-09-28 Thread Michael Parson

On Wed, 28 Sep 2016, Orit Wasserman wrote:

see below

On Tue, Sep 27, 2016 at 8:31 PM, Michael Parson  wrote:





We googled around a bit and found the fix-zone script:

https://raw.githubusercontent.com/yehudasa/ceph/wip-fix-default-zone/src/fix-zone

Which ran fine until the last command, which errors out with:

+ radosgw-admin zone default --rgw-zone=default
WARNING: failed to initialize zonegroup



That is a known issue (the default zone is a realm property); it should not
affect you because radosgw uses the "default" zone
if it doesn't find any zone.


the 'default' rgw-zone seems OK:

$ sudo radosgw-admin zone get --zone-id=default
{
"id": "default",
"name": "default",
"domain_root": ".rgw_",


the underscore doesn't look good here and in the other pools.
Are you sure these are the pools you used before?


The underscores were done by the script referenced above, but you're
right, I don't see the underscores in my osd pool list:

$ sudo ceph osd pool ls | grep rgw | sort
default.rgw.buckets.data
.rgw
.rgw.buckets
.rgw.buckets.extra
.rgw.buckets.index
.rgw.control
.rgw.gc
.rgw.meta
.rgw.root
.rgw.root.backup
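
If the zone really is pointing at pool names with a trailing underscore that don't
exist, one way to correct it is to edit the zone definition by hand and feed it back
in (a sketch; the period commit only applies to realm/period-based setups):

radosgw-admin zone get --rgw-zone=default > zone.json
# edit zone.json so the pool names match the output of 'ceph osd pool ls'
radosgw-admin zone set --rgw-zone=default --infile zone.json
radosgw-admin period update --commit   # skip if no realm/period is configured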

--
Michael Parson
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Ceph Very Small Cluster

2016-09-28 Thread Ranjan Ghosh

Hi everyone,

Up until recently, we were using GlusterFS to have two web servers in 
sync so we could take one down and switch back and forth between them - 
e.g. for maintenance or failover. Usually, both were running, though. 
The performance was abysmal, unfortunately. Copying many small files on 
the file system caused outages for several minutes - simply 
unacceptable. So I found Ceph. It's fairly new but I thought I'd give it 
a try. I liked especially the good, detailed documentation, the 
configurability and the many command-line tools which allow you to find 
out what is going on with your Cluster. All of this is severly lacking 
with GlusterFS IMHO.


Because we're on a very tiny budget for this project we cannot currently 
have more than two file system servers. I added a small Virtual Server, 
though, only for monitoring. So at least we have 3 monitoring nodes. I 
also created 3 MDS's, though as far as I understood, two are only for 
standby. To sum it up, we have:


server0: Admin (Deployment started from here) + Monitor + MDS
server1: Monitor + MDS + OSD
server2: Monitor + MDS + OSD

So, the OSD is on server1 and server2 which are next to each other 
connected by a local GigaBit-Ethernet connection. The cluster is mounted 
(also on server1 and server2) as /var/www and Apache is serving files 
off the cluster.


I've used these configuration settings:

osd pool default size = 2
osd pool default min_size = 1

My idea was that by default everything should be replicated on 2 servers 
i.e. each file is normally written on server1 and server2. In case of 
emergency though (one server has a failure), it's better to keep 
operating and only write the file to one server. Therefore, I set 
min_size = 1. My further understanding is (correct me if I'm wrong), 
that when the server comes back online, the files that were written to 
only 1 server during the outage will automatically be replicated to the 
server that has come back online.


So far, so good. With two servers now online, the performance is 
light-years away from sluggish GlusterFS. I've also worked with 
XtreemFS, OCFS2, AFS and never had such a good performance with any 
Cluster. In fact it's so blazingly fast, that I had to check twice I 
really had the cluster mounted and wasn't accidentally working on the 
hard drive. Impressive. I can edit files on server1 and they are 
immediately changed on server2 and vice versa. Great!


Unfortunately, when I'm now stopping all ceph-Services on server1, the 
websites on server2 start to hang/freeze. And "ceph health" shows "#x 
blocked requests". Now, what I don't understand: Why is it blocking? 
Shouldn't both servers have the file? And didn't I set min_size to "1"? 
And if there are a few files (could be some unimportant stuff) that's 
missing on one of the servers: How can I abort the blocking? I'd rather 
have a missing file or whatever, than a completely blocking website.


Are my files really duplicated 1:1 - or are they perhaps spread evenly 
between both OSDs? Do I have to edit the crushmap to achieve a real 
"RAID-1"-type of replication? Is there a command to find out for a 
specific file where it actually resides and whether it has really been 
replicated?
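
For the last question, a hedged sketch of mapping a CephFS file to its RADOS objects
and the OSDs holding them (the data pool name and the hex inode shown are placeholders
for whatever your setup actually uses):

# CephFS stores a file as objects named <inode-in-hex>.<stripe-index>
printf '%x\n' "$(stat -c %i /var/www/index.html)"        # e.g. 10000000abc
# list those objects in the data pool and ask where one of them lives
rados -p cephfs_data ls | grep '^10000000abc\.' | head
ceph osd map cephfs_data 10000000abc.00000000            # prints the PG and its OSD set

If both OSDs show up in the acting set reported there, the file's objects are indeed
replicated to both servers.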


Thank you!
Ranjan
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph with Cache pool - disk usage / cleanup

2016-09-28 Thread Christian Balzer
On Wed, 28 Sep 2016 14:08:43 +0200 Sascha Vogt wrote:

> Hi all,
> 
> we currently experience a few "strange" things on our Ceph cluster and I
> wanted to ask if anyone has recommendations for further tracking them
> down (or maybe even an explanation already ;) )
> 
> Ceph version is 0.94.5 and we have a HDD based pool with a cache pool on
> NVMe SSDs in front of it.
> 
> ceph df detail lists a "used" size on the ssd pool (the cache) of
> currently 3815 GB. We have a replication size of 2, so effectively this
> should take around 7630 GB on disk. Doing a df on all OSDs and summing
> them up gives 8501 GB, which is 871 GB more than expected.
> 
> Last week the difference was around 840 GB, the week before that around
> 780 GB. So it looks like the difference is constantly growing.
> 

0.94.5 has a well known and documented bug, it doesn't rotate the omap log
of the OSDs.

Look into "/var/lib/ceph/osd/ceph-xx/current/omap/" of the cache tier and 
most likely discover a huge "LOG" file.

Upgrading to the latest Hammer will fix this, alas that is fraught with peril
as well if you have lots of OSDs.

Otherwise, restart the cache OSDs (twice) one by one.
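
A quick way to check whether that is what is eating the space, run on each cache-tier
OSD host (a sketch):

# size of the omap LOG file for every OSD on this host
ls -lh /var/lib/ceph/osd/ceph-*/current/omap/LOG
# or the whole omap directory per OSD
du -sh /var/lib/ceph/osd/ceph-*/current/omap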

Christian

> Doing a for date in `ceph pg dump | grep active | awk '{print $20}'`; do
> date +%A -d $date; done | sort | uniq -c
> 
> Returns
> 
> 2002 Tuesday
> 1390 Wednesday
> 
> So scrubbing and deepscrubbing is regularly done.
> 
> A thing I noticed which might or might not be related is the following:
> The pool is used for OpenStack ephemeral disks and I had created a 1 TB
> VM (1TB ephemeral, not a cinder volume ;) )
> 
> I looked up the RBD device and noted down the block prefix name.
> 
> > rbd info ephemeral-vms/0edd1080-9f84-48d2-8714-34b1cd7d50df_disk
> > rbd image '0edd1080-9f84-48d2-8714-34b1cd7d50df_disk':
> > size 1024 GB in 262144 objects
> > order 22 (4096 kB objects)
> > block_name_prefix: rbd_data.2c383a0238e1f29
> > format: 2
> > features: layering
> > flags:
> 
> After I had deleted the VM I regularly checked the amount of objects in
> rados via "rados -p ephemeral-vms ls | grep rbd_data.2c383a0238e1f29 |
> wc -l"
> 
> and it still returns a large amount of objects:
> 
> > Mon Sep 19 09:10:43 CEST 2016 - 138937
> > Tue Sep 20 16:11:55 CEST 2016 - 135818
> > Thu Sep 22 09:59:03 CEST 2016 - 135791
> > Wed Sep 28 12:15:07 CEST 2016 - 133862
> 
> I did a "stat" AND a "rm" on each and every of those objects, but they
> all returned:
> 
> >  rados -p ephemeral-vms stat rbd_data.2c383a0238e1f29.f8b8
> >  error stat-ing ephemeral-vms/rbd_data.2c383a0238e1f29.f8b8: 
> > (2) No such file or directory
> 
> So why is rados still return those objects via an ls?
> 
> Even worse, counting the objects on the ssd pool I get:
> rados -p ssd ls | grep rbd_data.2c383a0238e1f29 | wc -l
> Wed Sep 28 12:54:07 CEST 2016 - 246681
> 
> I did a find on one of the OSDs data dir:
> > find . -name "*data.2c383a0238e1f29*" | wc -l
> > 33060
> 
> And checked a few, all of them were 0-byte files
> 
> e.g.
> > ls -lha 
> > ./11.1d_head/DIR_D/DIR_1/DIR_0/DIR_7/DIR_9/rbd\\udata.2c383a0238e1f29.00019bf7__head_87C9701D__b
> > -rw-r--r-- 1 root root 0 Sep  9 11:21 
> > ./11.1d_head/DIR_D/DIR_1/DIR_0/DIR_7/DIR_9/rbd\udata.2c383a0238e1f29.00019bf7__head_87C9701D__b
> 
> But even a 0-byte file takes some space on the disk, might those be the
> reason?
> 
> Any feedback welcome.
> Greetings
> -Sascha-
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 


-- 
Christian Balzer        Network/Systems Engineer
ch...@gol.com   Global OnLine Japan/Rakuten Communications
http://www.gol.com/
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RGW multisite replication failures

2016-09-28 Thread Ben Morrice
Hello Orit,

Thanks for your help so far. The bug you referenced was not included in
10.2.3. I cherry-picked the commits mentioned in
http://tracker.ceph.com/issues/16742 into the 10.2.3 release and
deployed this radosgw on the servers affected.

Unfortunately it's still failing, now the sync and subsequent retries of
the sync fail with a return code of -5.

Any other suggestions?
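
(For anyone debugging similar failures: the per-stage status commands in Jewel can help
narrow down where the -5 (EIO) comes from. A sketch; the zone name is the one from this
thread, and 'sync error list' may not be present in every build:)

radosgw-admin sync status
radosgw-admin metadata sync status
radosgw-admin data sync status --source-zone=bbp-gva-master
radosgw-admin sync error list     # if available, shows the shards/objects that failed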


2016-09-28 16:14:52.145933 7f84609e3700 20 rgw meta sync: entry:
name=20160928:bbp-gva-master.106061599.1
2016-09-28 16:14:52.145994 7f84609e3700 20 rgw meta sync: entry:
name=20160928:bbp-gva-master.106061599.1
2016-09-28 16:14:52.145998 7f84609e3700 20 rgw meta sync: entry:
name=20160928
2016-09-28 16:14:52.146001 7f84609e3700 20 rgw meta sync: entry:
name=20160928
2016-09-28 16:14:52.151393 7f84609e3700 20 rgw meta sync:
incremental_sync:1576: shard_id=11 log_entry:
1_1475072090.900125_1286081.1:bucket.instance:20160928:bbp-gva-master.106061599.1:2016-09-28
16:14:50.900125
2016-09-28 16:14:52.151507 7f84609e3700 20 rgw meta sync:
incremental_sync:1576: shard_id=11 log_entry:
1_1475072090.914533_1286082.1:bucket.instance:20160928:bbp-gva-master.106061599.1:2016-09-28
16:14:50.914533
2016-09-28 16:14:52.151524 7f84609e3700 20 rgw meta sync: fetching
remote metadata: bucket.instance:20160928:bbp-gva-master.106061599.1
2016-09-28 16:14:52.151533 7f84609e3700 20 rgw meta sync:
incremental_sync:1576: shard_id=11 log_entry:
1_1475072090.918249_1286083.1:bucket:20160928:2016-09-28 16:14:50.918249
2016-09-28 16:14:52.151700 7f84609e3700 10 get_canon_resource():
dest=/admin/metadata/bucket.instance/20160928:bbp-gva-master.106061599.1
/admin/metadata/bucket.instance/20160928:bbp-gva-master.106061599.1
2016-09-28 16:14:52.151756 7f84609e3700 20 sending request to
https://bbpobjectstorage.epfl.ch:443/admin/metadata/bucket.instance/20160928:bbp-gva-master.106061599.1?key=20160928%3Abbp-gva-master.106061599.1=bbp-gva
2016-09-28 16:14:52.151814 7f84609e3700 20 rgw meta sync:
incremental_sync:1576: shard_id=11 log_entry:
1_1475072090.933082_1286084.1:bucket:20160928:2016-09-28 16:14:50.933082
2016-09-28 16:14:52.151839 7f84609e3700 20 rgw meta sync: fetching
remote metadata: bucket:20160928
2016-09-28 16:14:52.152030 7f84609e3700 10 get_canon_resource():
dest=/admin/metadata/bucket/20160928
/admin/metadata/bucket/20160928
2016-09-28 16:14:52.152086 7f84609e3700 20 sending request to
https://bbpobjectstorage.epfl.ch:443/admin/metadata/bucket/20160928?key=20160928=bbp-gva
2016-09-28 16:14:52.299619 7f8471ffb700 20 get_system_obj_state:
rctx=0x7f8471ff9200 obj=.bbp-gva-secondary.domain.rgw:20160928
state=0x7f842c24b7f8 s->prefetch_data=0
2016-09-28 16:14:52.398943 7f846a7fc700 20 reading from
.bbp-gva-secondary.domain.rgw:.bucket.meta.20160928:bbp-gva-master.106061599.1
2016-09-28 16:14:52.398975 7f846a7fc700 20 get_system_obj_state:
rctx=0x7f846a7fa150
obj=.bbp-gva-secondary.domain.rgw:.bucket.meta.20160928:bbp-gva-master.106061599.1
state=0x7f8420de2af8 s->prefetch_data=0
2016-09-28 16:15:38.111569 7f828ed65700 20 execute(): read data:
[{"key":16,"val":["20160928:bbp-gva-master.106061599.1"]}]
2016-09-28 16:15:38.111836 7f828ed65700 20 execute(): modified
key=20160928:bbp-gva-master.106061599.1
2016-09-28 16:15:38.111839 7f828ed65700 20 wakeup_data_sync_shards:
source_zone=bbp-gva-master,
shard_ids={16=20160928:bbp-gva-master.106061599.1}
2016-09-28 16:15:38.111919 7f845b7fe700 20 incremental_sync(): async
update notification: 20160928:bbp-gva-master.106061599.1
2016-09-28 16:15:38.112248 7f845b7fe700  5
Sync:bbp-gva-:data:Bucket:20160928:bbp-gva-master.106061599.1:start
2016-09-28 16:15:38.112661 7f8473fff700 20 get_system_obj_state:
rctx=0x7f84283c17f8
obj=.bbp-gva-secondary.log:bucket.sync-status.bbp-gva-master:20160928:bbp-gva-master.106061599.1
state=0x7f844405b338 s->prefetch_data=0
2016-09-28 16:15:38.115079 7f845b7fe700 20 operate(): sync status for
bucket 20160928:bbp-gva-master.106061599.1: 0
2016-09-28 16:15:38.115202 7f8491bf7700 20 reading from
.bbp-gva-secondary.domain.rgw:.bucket.meta.20160928:bbp-gva-master.106061599.1
2016-09-28 16:15:38.115232 7f8491bf7700 20 get_system_obj_state:
rctx=0x7f8491bf56d0
obj=.bbp-gva-secondary.domain.rgw:.bucket.meta.20160928:bbp-gva-master.106061599.1
state=0x7f84540808d8 s->prefetch_data=0
2016-09-28 16:15:38.118544 7f8491bf7700 20 get_system_obj_state:
rctx=0x7f8491bf56d0
obj=.bbp-gva-secondary.domain.rgw:.bucket.meta.20160928:bbp-gva-master.106061599.1
state=0x7f84540808d8 s->prefetch_data=0
2016-09-28 16:15:38.138092 7f845b7fe700 20 sending request to
https://bbpobjectstorage.epfl.ch:443/admin/log/?type=bucket-index=20160928%3Abbp-gva-master.106061599.1=bbp-gva
2016-09-28 16:15:38.223131 7f845b7fe700  5
Sync:bbp-gva-:data:BucketFull:20160928:bbp-gva-master.106061599.1:start
2016-09-28 16:15:38.227125 7f845b7fe700 10 get_canon_resource():
dest=/20160928?versions
/20160928?versions
2016-09-28 16:15:38.227201 7f845b7fe700 20 sending request to

Re: [ceph-users] Bcache, partitions and BlueStore

2016-09-28 Thread Wido den Hollander

> On 26 September 2016 at 19:51, Sam Yaple wrote:
> 
> 
> On Mon, Sep 26, 2016 at 5:44 PM, Wido den Hollander  wrote:
> 
> >
> > > On 26 September 2016 at 17:48, Sam Yaple wrote:
> > >
> > >
> > > On Mon, Sep 26, 2016 at 9:31 AM, Wido den Hollander 
> > wrote:
> > >
> > > > Hi,
> > > >
> > > > This has been discussed on the ML before [0], but I would like to bring
> > > > this up again with the outlook towards BlueStore.
> > > >
> > > > Bcache [1] allows for block device level caching in Linux. This can be
> > > > read/write(back) and vastly improves read and write performance to a
> > block
> > > > device.
> > > >
> > > > With the current layout of Ceph with FileStore you can already use
> > bcache,
> > > > but not with ceph-disk.
> > > >
> > > > The reason is that bcache currently does not support creating
> > partitions
> > > > on those devices. There are patches [2] out there, but they are not
> > > > upstream.
> > > >
> > > > I haven't tested it yet, but it looks like BlueStore can still benefit
> > > > quite good from Bcache and it would be a lot easier if the patches [2]
> > were
> > > > merged upstream.
> > > >
> > > > This way you would have:
> > > >
> > > > - bcache0p1: XFS/EXT4 OSD metadata
> > > > - bcache0p2: RocksDB
> > > > - bcache0p3: RocksDB WAL
> > > > - bcache0p4: BlueStore DATA
> > > >
> > > > With bcache you could create multiple bcache devices by creating
> > > > partitions on the backing disk and creating bcache devices for all of
> > them,
> > > > but that's a lot of work and not easy to automate with ceph-disk.
> > > >
> > > > So what I'm trying to find is the best route to get this upstream in
> > the
> > > > Linux kernel. That way next year when BlueStore becomes the default in
> > L
> > > > (luminous) users can use bcache underneath BlueStore easily.
> > > >
> > > > Does anybody know the proper route we need to take to get this fixed
> > > > upstream? Has any contacts with the bcache developers?
> > > >
> > >
> > > Kent is pretty heavy into developing bcachefs at the moment. But you can
> > > hit him up on IRC at OFTC #bcache . I've talked ot him about this before
> > > and he is 100% willing to accept any patch to solves this issue in the
> > > standard way the kernel typically allocs major/minors for disks. The blog
> > > post you listed from me does _not_ solve this in an upstream way, though
> > > the final result is pretty accurate from my understanding.
> > >
> >
> > No, I understood that the blog indeed doesn't solve that.
> >
> > > I will look into a more better way to patch this upstream since there is
> > > renew interested in this.
> > >
> >
> > That would be great! My kernel knowledge is to limited to look into this,
> > but if you could help with this it would be nice.
> >
> > If this hits the kernel somewhere in Nov/Dec we should be good for a
> > kernel release somewhere together with L for Ceph.
> >
> > > Also, checkout bcachefs if you like bcache. It's up and coming, but it is
> > > pretty sweet. My goal is to use bcachefs with bluestore in the future.
> > >
> >
> > bcachefs with bluestore? The OSD doesn't require a filesystem with
> > BlueStore, just a raw block device :)
> >
> > Well there are parts of the OSD that still use a file system that can
> benefit from the caching (RocksDB and the WAL). This is what I meant. There is a
> tiering system with bcachefs which currently only supports 2 tiers, but
> will eventually allow for 15 tiers, so you could have small and fast pci
> caching tier, followed by ssd, followed by spinning disk. Controlling what
> data can exist on what tier (and with writeback/writethrough potentially).
> Lots of room for configurations to improve performance.
> 

Interesting! Although RocksDB and its WAL can also be on a partition, which would 
be bcache again.

However, I sent a message to the linux-bcache mailing list [0]; hopefully we can get 
a proper patch into the kernel soon.

Any input, help or suggestions there would be nice!

Wido

[0]: https://marc.info/?l=linux-bcache=147507062812270
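
(For reference, the manual per-partition workaround mentioned above looks roughly like
this; device names are placeholders and bcache-tools is assumed to be installed:)

# format the cache device and the backing partitions in one call,
# which also attaches the backing devices to the new cache set
make-bcache -C /dev/nvme0n1 -B /dev/sdb1 /dev/sdb2
# if a device was formatted separately, register it and attach it by cache-set UUID
echo /dev/sdb1 > /sys/fs/bcache/register
bcache-super-show /dev/nvme0n1 | grep cset.uuid
echo <cset-uuid> > /sys/block/bcache0/bcache/attach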

> SamYaple
> 
> 
> > Wido
> >
> > >
> > > >
> > > > Thanks!
> > > >
> > > > Wido
> > > >
> > > > [0]: http://www.spinics.net/lists/ceph-devel/msg29550.html
> > > > [1]: https://bcache.evilpiepirate.org/
> > > > [2]: https://yaple.net/2016/03/31/bcache-partitions-and-dkms/
> > > >
> > >
> > >
> > > SamYaple
> >
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] v10.2.3 Jewel Released

2016-09-28 Thread Abhishek Lekshmanan
This point release fixes several important bugs in RBD mirroring, RGW 
multi-site, CephFS, and RADOS.

We recommend that all v10.2.x users upgrade.

Notable changes in this release include:
* build/ops: 60-ceph-partuuid-workaround-rules still needed by debian jessie 
(udev 215-17) (#16351, runsisi, Loic Dachary)
* build/ops: ceph Resource Agent does not work with systemd (#14828, Nathan 
Cutler)
* build/ops: ceph-base requires parted (#16095, Ken Dreyer)
* build/ops: ceph-osd-prestart.sh contains Upstart-specific code (#15984, 
Nathan Cutler)
* build/ops: mount.ceph: move from ceph-base to ceph-common and add symlink in 
/sbin for SUSE (#16598, #16645, Nathan Cutler, Dan Horák, Ricardo Dias, Kefu 
Chai)
* build/ops: need rocksdb commit 7ca731b12ce for ppc64le build (#17092, Nathan 
Cutler)
* build/ops: rpm: OBS needs ExclusiveArch (#16936, Michel Normand)
* cli: ceph command line tool chokes on ceph –w (the dash is unicode 'en dash' 
, copy-paste to reproduce) (#12287, Oleh Prypin, Kefu Chai)
* common: expose buffer const_iterator symbols (#16899, Noah Watkins)
* common: global-init: fixup chown of the run directory along with log and asok 
files (#15607, Karol Mroz)
* fs: ceph-fuse: link to libtcmalloc or jemalloc (#16655, Yan, Zheng)
* fs: client: crash in unmount when fuse_use_invalidate_cb is enabled (#16137, 
Yan, Zheng)
* fs: client: fstat cap release (#15723, Yan, Zheng, Noah Watkins)
* fs: essential backports for OpenStack Manila (#15406, #15614, #15615, John 
Spray, Ramana Raja, Xiaoxi Chen)
* fs: fix double-unlock on shutdown (#17126, Greg Farnum)
* fs: fix mdsmap print_summary with standby replays (#15705, John Spray)
* fs: fuse mounted file systems fails SAMBA CTDB ping_pong rw test with v9.0.2 
(#12653, #15634, Yan, Zheng)
* librados: Add cleanup message with time to rados bench output (#15704, 
Vikhyat Umrao)
* librados: Missing export for rados_aio_get_version in 
src/include/rados/librados.h (#15535, Jim Wright)
* librados: osd: bad flags can crash the osd (#16012, Sage Weil)
* librbd: Close journal and object map before flagging exclusive lock as 
released (#16450, Jason Dillaman)
* librbd: Crash when utilizing advisory locking API functions (#16364, Jason 
Dillaman)
* librbd: ExclusiveLock object leaked when switching to snapshot (#16446, Jason 
Dillaman)
* librbd: FAILED assert(object_no < m_object_map.size()) (#16561, Jason 
Dillaman)
* librbd: Image removal doesn't necessarily clean up all rbd_mirroring entries 
(#16471, Jason Dillaman)
* librbd: Object map/fast-diff invalidated if journal replays the same snap 
remove event (#16350, Jason Dillaman)
* librbd: Timeout sending mirroring notification shouldn't result in failure 
(#16470, Jason Dillaman)
* librbd: Whitelist EBUSY error from snap unprotect for journal replay (#16445, 
Jason Dillaman)
* librbd: cancel all tasks should wait until finisher is done (#16517, Haomai 
Wang)
* librbd: delay acquiring lock if image watch has failed (#16923, Jason 
Dillaman)
* librbd: fix missing return statement if failed to get mirror image state 
(#16600, runsisi)
* librbd: flag image as updated after proxying maintenance op (#16404, Jason 
Dillaman)
* librbd: mkfs.xfs slow performance with discards and object map (#16707, 
#16689, Jason Dillaman)
* librbd: potential use after free on refresh error (#16519, Mykola Golub)
* librbd: rbd-nbd does not properly handle resize notifications (#15715, Mykola 
Golub)
* librbd: the option 'rbd_cache_writethrough_until_flush=true' dosn't work 
(#16740, #16386, #16708, #16654, #16478, Mykola Golub, xinxin shu, Xiaowei 
Chen, Jason Dillaman)
* mds:  tell command blocks forever with async messenger 
(TestVolumeClient.test_evict_client failure) (#16288, Douglas Fuller)
* mds: Confusing MDS log message when shut down with stalled journaler reads 
(#15689, John Spray)
* mds: Deadlock on shutdown active rank while busy with metadata IO (#16042, 
Patrick Donnelly)
* mds: Failing file operations on kernel based cephfs mount point leaves 
unaccessible file behind on hammer 0.94.7 (#16013, Yan, Zheng)
* mds: Fix shutting down mds timed-out due to deadlock (#16396, Zhi Zhang)
* mds: MDSMonitor fixes (#16136, xie xingguo)
* mds: MDSMonitor::check_subs() is very buggy (#16022, Yan, Zheng)
* mds: Session::check_access() is buggy (#16358, Yan, Zheng)
* mds: StrayManager.cc: 520: FAILED assert(dnl->is_primary()) (#15920, Yan, 
Zheng)
* mds: enforce a dirfrag limit on entries (#16164, Patrick Donnelly)
* mds: fix SnapRealm::have_past_parents_open() (#16299, Yan, Zheng)
* mds: fix getattr starve setattr (#16154, Yan, Zheng)
* mds: wrongly treat symlink inode as normal file/dir when symlink inode is 
stale on kcephfs (#15702, Zhi Zhang)
* mon: "mon metadata" fails when only one monitor exists (#15866, John Spray, 
Kefu Chai)
* mon: Monitor: validate prefix on handle_command() (#16297, You Ji)
* mon: OSDMonitor: drop pg temps from not the current primary (#16127, Samuel 
Just)
* mon: prepare_pgtemp needs to only update up_thru 

[ceph-users] Radosgw Orphan and multipart objects

2016-09-28 Thread William Josefsson
Hi,

I'm on CentOS 7 / Hammer 0.94.9 (upgraded; the RGW S3 objects were created in
0.94.7) and I have radosgw multipart and shadow objects in
.rgw.buckets even though I deleted all buckets two weeks ago. Can
anybody advise on how to prune or garbage-collect the orphan and
multipart objects? Please help. Thanks, Will
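
(A hedged sketch of the usual cleanup steps: the orphans subcommand exists in recent
Hammer builds, the job id is arbitrary, and 'orphans find' only reports candidates, it
does not delete anything by itself:)

# see what radosgw garbage collection still has queued, and run a pass now
radosgw-admin gc list --include-all | head
radosgw-admin gc process
# scan for RADOS objects no longer referenced by any bucket index
radosgw-admin orphans find --pool=.rgw.buckets --job-id=orphans-1
radosgw-admin orphans finish --job-id=orphans-1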
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph with Cache pool - disk usage / cleanup

2016-09-28 Thread Sascha Vogt
Hi Burkhard,

thanks a lot for the quick response.

Am 28.09.2016 um 14:15 schrieb Burkhard Linke:
> someone correct me if I'm wrong, but removing objects in a cache tier
> setup results in empty objects which act as markers for deleting the
> object on the backing store. I've seen the same pattern you have
> described in the past.

Hm, but why do I still have a lot of those objects in the backing pool
with the same pattern? I mean, if the marker object in the cache pool
stays (i.e. due to an issue during deletion in the backing pool), why do I
have 0-byte objects in the backing pool and also get "file not found"
while stat'ing / rm'ing on the backing pool? How do I clean up the backing
pool?

> As a test you can try to evict all objects from the cache pool. This
> should trigger the actual removal of pending objects.

Hm, evicting our cache pool would take several hours if not days -
in the meantime OpenStack would not be usable. Is there any way we
could do that "asynchronously" without a complete downtime? Also, as the
used space is constantly growing anyway, would this have to be done
every few weeks?

Greetings
-Sascha-
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph with Cache pool - disk usage / cleanup

2016-09-28 Thread Burkhard Linke

Hi,


someone correct me if I'm wrong, but removing objects in a cache tier 
setup results in empty objects which act as markers for deleting the 
object on the backing store. I've seen the same pattern you have 
described in the past.



As a test you can try to evict all objects from the cache pool. This 
should trigger the actual removal of pending objects.
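
(A sketch of what that looks like; the pool name is the cache pool from the original
mail. The flush-evict call is heavy on a busy cluster; the pool settings let the
tiering agent do the same work gradually in the background instead:)

# flush dirty objects and evict everything from the cache pool (blocking, lots of I/O)
rados -p ssd cache-flush-evict-all
# or lower the agent's targets so flushing/eviction happens gradually
ceph osd pool set ssd cache_target_dirty_ratio 0.1
ceph osd pool set ssd cache_target_full_ratio 0.5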



Regards,

Burkhard


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Ceph with Cache pool - disk usage / cleanup

2016-09-28 Thread Sascha Vogt
Hi all,

we currently experience a few "strange" things on our Ceph cluster and I
wanted to ask if anyone has recommendations for further tracking them
down (or maybe even an explanation already ;) )

Ceph version is 0.94.5 and we have a HDD based pool with a cache pool on
NVMe SSDs in front of it.

ceph df detail lists a "used" size on the ssd pool (the cache) of
currently 3815 GB. We have a replication size of 2, so effectively this
should take around 7630 GB on disk. Doing a df on all OSDs and summing
them up gives 8501 GB, which is 871 GB more than expected.

Last week the difference was around 840 GB, the week before that around
780 GB. So it looks like the difference is constantly growing.

Doing a for date in `ceph pg dump | grep active | awk '{print $20}'`; do
date +%A -d $date; done | sort | uniq -c

Returns

2002 Tuesday
1390 Wednesday

So scrubbing and deepscrubbing is regularly done.

A thing I noticed which might or might not be related is the following:
The pool is used for OpenStack ephemeral disks and I had created a 1 TB
VM (1TB ephemeral, not a cinder volume ;) )

I looked up the RBD device and noted down the block prefix name.

> rbd info ephemeral-vms/0edd1080-9f84-48d2-8714-34b1cd7d50df_disk
> rbd image '0edd1080-9f84-48d2-8714-34b1cd7d50df_disk':
> size 1024 GB in 262144 objects
> order 22 (4096 kB objects)
> block_name_prefix: rbd_data.2c383a0238e1f29
> format: 2
> features: layering
> flags:

After I had deleted the VM I regularly checked the amount of objects in
rados via "rados -p ephemeral-vms ls | grep rbd_data.2c383a0238e1f29 |
wc -l"

and it still returns a large amount of objects:

> Mon Sep 19 09:10:43 CEST 2016 - 138937
> Tue Sep 20 16:11:55 CEST 2016 - 135818
> Thu Sep 22 09:59:03 CEST 2016 - 135791
> Wed Sep 28 12:15:07 CEST 2016 - 133862

I did a "stat" AND a "rm" on each and every of those objects, but they
all returned:

>  rados -p ephemeral-vms stat rbd_data.2c383a0238e1f29.f8b8
>  error stat-ing ephemeral-vms/rbd_data.2c383a0238e1f29.f8b8: (2) 
> No such file or directory

So why is rados still return those objects via an ls?

Even worse, counting the objects on the ssd pool I get:
rados -p ssd ls | grep rbd_data.2c383a0238e1f29 | wc -l
Wed Sep 28 12:54:07 CEST 2016 - 246681

I did a find on one of the OSDs data dir:
> find . -name "*data.2c383a0238e1f29*" | wc -l
> 33060

And checked a few, all of them were 0-byte files

e.g.
> ls -lha 
> ./11.1d_head/DIR_D/DIR_1/DIR_0/DIR_7/DIR_9/rbd\\udata.2c383a0238e1f29.00019bf7__head_87C9701D__b
> -rw-r--r-- 1 root root 0 Sep  9 11:21 
> ./11.1d_head/DIR_D/DIR_1/DIR_0/DIR_7/DIR_9/rbd\udata.2c383a0238e1f29.00019bf7__head_87C9701D__b

But even a 0-byte file takes some space on the disk, might those be the
reason?
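
One way to estimate how much space those leftover 0-byte files actually occupy, run
inside one OSD's current/ directory (a sketch; GNU find assumed):

# count the 0-byte leftovers for this image prefix
find . -name "*data.2c383a0238e1f29*" -size 0 | wc -l
# sum the blocks actually allocated for them (%b is in 512-byte blocks)
find . -name "*data.2c383a0238e1f29*" -size 0 -printf '%b\n' | awk '{s+=$1} END {print s*512, "bytes"}'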

Any feedback welcome.
Greetings
-Sascha-
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] rgw multi-site replication issues

2016-09-28 Thread Orit Wasserman
On Tue, Sep 27, 2016 at 10:19 PM, John Rowe  wrote:
> Hi Orit,
>
> It appears it must have been one of the known bugs in 10.2.2.  I just
> upgraded to 10.2.3 and bi-directional syncing now works.
>

Good

> I am still seeing some errors when I run synch-related commands but they
> don't seem to be affecting operations as of yet:
>
> radosgw-admin sync status
> 2016-09-27 16:17:15.270331 7fe5e83ad9c0  0 error in read_id for id  : (2) No
> such file or directory
> 2016-09-27 16:17:15.270883 7fe5e83ad9c0  0 error in read_id for id  : (2) No
> such file or directory
>   realm 3af93a86-916a-490f-b38f-17922b472b19 (my_realm)
>   zonegroup 235b010c-22e2-4b43-8fcc-8ae01939273e (us)
>zone 58aa3eef-fc1f-492c-a08e-9c6019e7c266 (us-phx)
>   metadata sync preparing for full sync
> full sync: 0/64 shards
> metadata is caught up with master
> incremental sync: 64/64 shards
>   data sync source: 6c830b44-4e39-4e19-9bd8-03c37c2021f2 (us-dfw)
> preparing for full sync
> full sync: 18/128 shards
> full sync: 0 buckets to sync
> incremental sync: 110/128 shards
> data is behind on 20 shards
> oldest incremental change not applied: 2016-09-27
> 16:17:08.0.922757s
>
>

data sync can take time if you have lots of data,
make sure it catches up

> I've also verified all of the existing data within each bucket has sync'd
> over as well
>
>
> On Tue, Sep 27, 2016 at 9:21 AM John Rowe  wrote:
>>
>> Hi Orit,
>>
>> That was my failed attempt at sanitizing :)
>>
>> They are actually all identical:
>>
>> Periods:
>> MD5 (cephrgw1-1-dfw-period.json) = 12ed481381c1f2937a27b57db0473d6d
>> MD5 (cephrgw1-1-phx-period.json) = 12ed481381c1f2937a27b57db0473d6d
>> MD5 (cephrgw1-2-dfw-period.json) = 12ed481381c1f2937a27b57db0473d6d
>> MD5 (cephrgw1-2-phx-period.json) = 12ed481381c1f2937a27b57db0473d6d
>> MD5 (cephrgw1-3-dfw-period.json) = 12ed481381c1f2937a27b57db0473d6d
>> MD5 (cephrgw1-3-phx-period.json) = 12ed481381c1f2937a27b57db0473d6d
>>
>> Realms:
>> MD5 (cephrgw1-1-dfw-realm.json) = 39a4e63bab64ed756961117d3629b109
>> MD5 (cephrgw1-1-phx-realm.json) = 39a4e63bab64ed756961117d3629b109
>> MD5 (cephrgw1-2-dfw-realm.json) = 39a4e63bab64ed756961117d3629b109
>> MD5 (cephrgw1-2-phx-realm.json) = 39a4e63bab64ed756961117d3629b109
>> MD5 (cephrgw1-3-dfw-realm.json) = 39a4e63bab64ed756961117d3629b109
>> MD5 (cephrgw1-3-phx-realm.json) = 39a4e63bab64ed756961117d3629b109
>>
>>
>> On Tue, Sep 27, 2016 at 5:32 AM Orit Wasserman 
>> wrote:
>>>
>>> see comment below
>>>
>>> On Mon, Sep 26, 2016 at 10:00 PM, John Rowe 
>>> wrote:
>>> > Hi Orit,
>>> >
>>> > Sure thing, please see below.
>>> >
>>> > Thanks!
>>> >
>>> >
>>> > DFW (Primary)
>>> > radosgw-admin zonegroupmap get
>>> > {
>>> > "zonegroups": [
>>> > {
>>> > "key": "235b010c-22e2-4b43-8fcc-8ae01939273e",
>>> > "val": {
>>> > "id": "235b010c-22e2-4b43-8fcc-8ae01939273e",
>>> > "name": "us",
>>> > "api_name": "us",
>>> > "is_master": "true",
>>> > "endpoints": [
>>> > "http:\/\/ELB_FQDN:80"
>>> > ],
>>> > "hostnames": [],
>>> > "hostnames_s3website": [],
>>> > "master_zone": "6c830b44-4e39-4e19-9bd8-03c37c2021f2",
>>> > "zones": [
>>> > {
>>> > "id": "58aa3eef-fc1f-492c-a08e-9c6019e7c266",
>>> > "name": "us-phx",
>>> > "endpoints": [
>>> > "http:\/\/cephrgw1-1:80"
>>> > ],
>>> > "log_meta": "false",
>>> > "log_data": "true",
>>> > "bucket_index_max_shards": 0,
>>> > "read_only": "false"
>>> > },
>>> > {
>>> > "id": "6c830b44-4e39-4e19-9bd8-03c37c2021f2",
>>> > "name": "us-dfw",
>>> > "endpoints": [
>>> > "http:\/\/cephrgw1-1-dfw:80"
>>> > ],
>>> > "log_meta": "true",
>>> > "log_data": "true",
>>> > "bucket_index_max_shards": 0,
>>> > "read_only": "false"
>>> > }
>>> > ],
>>> > "placement_targets": [
>>> > {
>>> > "name": "default-placement",
>>> > "tags": []
>>> > }
>>> > ],
>>> > "default_placement": 

Re: [ceph-users] fixing zones

2016-09-28 Thread Orit Wasserman
see below

On Tue, Sep 27, 2016 at 8:31 PM, Michael Parson  wrote:
> (I tried to start this discussion on irc, but I wound up with the wrong
> paste buffer and wound up getting kicked off for a paste flood, sorry,
> that was on me :(  )
>
> We were having some weirdness with our Ceph and did an upgrade up to
> 10.2.3, which fixed some, but not all of our problems.
>
> It looked like our users pool might have been corrupt, so we moved it
> aside and created a new set:
>
> $ sudo ceph osd pool rename .users old.users
> $ sudo ceph osd pool rename .users.email old.users.email
> $ sudo ceph osd pool rename .users.swift old.users.swift
> $ sudo ceph osd pool rename .users.uid old.users.uid
>
>
> $ sudo ceph osd pool create .users 16 16
> $ sudo ceph osd pool create .users.email 16 16
> $ sudo ceph osd pool create .users.swift 16 16
> $ sudo ceph osd pool create .users.uid 16 16
>
> This allowed me to create new users and swift subusers under them, but
> only the first one is allowing auth, all others are getting 403s when
> attempting to auth.
>
> We googled around a bit and found the fix-zone script:
>
> https://raw.githubusercontent.com/yehudasa/ceph/wip-fix-default-zone/src/fix-zone
>
> Which ran fine until the last command, which errors out with:
>
> + radosgw-admin zone default --rgw-zone=default
> WARNING: failed to initialize zonegroup
>

That is a known issue (the default zone is a realm property); it should not
affect you because radosgw uses the "default" zone
if it doesn't find any zone.

> the 'default' rgw-zone seems OK:
>
> $ sudo radosgw-admin zone get --zone-id=default
> {
> "id": "default",
> "name": "default",
> "domain_root": ".rgw_",

the underscore doesn't look good here and in the other pools.
Are you sure these are the pools you used before?

Orit

> "control_pool": ".rgw.control_",
> "gc_pool": ".rgw.gc_",
> "log_pool": ".log_",
> "intent_log_pool": ".intent-log_",
> "usage_log_pool": ".usage_",
> "user_keys_pool": ".users_",
> "user_email_pool": ".users.email_",
> "user_swift_pool": ".users.swift_",
> "user_uid_pool": ".users.uid_",
> "system_key": {
> "access_key": "",
> "secret_key": ""
> },
> "placement_pools": [
> {
> "key": "default-placement",
> "val": {
> "index_pool": ".rgw.buckets.index_",
> "data_pool": ".rgw.buckets_",
> "data_extra_pool": ".rgw.buckets.extra_",
> "index_type": 0
> }
> }
> ],
> "metadata_heap": ".rgw.meta",
> "realm_id": "a113de3d-c506-4112-b419-0d5c94ded7af"
> }
>

> Not really sure where to go from here, any help would be appreciated.
>
> --
> Michael Parson
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Adding new monitors to production cluster

2016-09-28 Thread Nick @ Deltaband
On 28 September 2016 at 19:22, Wido den Hollander  wrote:
>
>
> > On 28 September 2016 at 0:35, "Nick @ Deltaband" wrote:
> >
> >
> > Hi Cephers,
> >
> > We need to add two new monitors to a production cluster (0.94.9) which has
> > 3 existing monitors. It looks like it's as easy as ceph-deploy mon add <new mon>.
> >
>
> You are going to add two additional monitors? 3 to 5?
>

Yes, we're going to add two.

>
> > What's the best practice in terms of when to update the existing monitor
> > and osd ceph.conf file to include the new monitors in mon_initial_members
> > and mon_hosts? Before adding the new monitor, afterwards, or doesn't it
> > make any difference?
> >
>
> Add the new monitor using ceph-deploy and afterwards update the configuration 
> on all the nodes so that mon_host contains all the new Monitors.
>

Thanks for confirming.

> > Will it cause any disruption on the cluster, or is it 100% safe to do with
> > no disruption? Any steps we can take to minimise risk?
> >
>
> A very short interruption (a few seconds) might occur while the monitors are
> electing. Nothing is 100% safe of course, but this is a safe operation to do
> in production.
>

Understood, we'll give it a go.

Thanks again.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Adding new monitors to production cluster

2016-09-28 Thread Wido den Hollander

> On 28 September 2016 at 0:35, "Nick @ Deltaband" wrote:
> 
> 
> Hi Cephers,
> 
> We need to add two new monitors to a production cluster (0.94.9) which has
> 3 existing monitors. It looks like it's as easy as ceph-deploy mon add <new mon>.
> 

You are going to add two additional monitors? 3 to 5?

> What's the best practice in terms of when to update the existing monitor
> and osd ceph.conf file to include the new monitors in mon_initial_members
> and mon_hosts? Before adding the new monitor, afterwards, or doesn't it
> make any difference?
> 

Add the new monitor using ceph-deploy and afterwards update the configuration 
on all the nodes so that mon_host contains all the new Monitors.

> Will it cause any disruption on the cluster, or is it 100% safe to do with
> no disruption? Any steps we can take to minimise risk?
> 

A very short interruption (a few seconds) might occur while the monitors are
electing. Nothing is 100% safe of course, but this is a safe operation to do in
production.

Wido
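
A minimal sketch of that sequence, with placeholder hostnames (mon4, mon5,
and the existing nodes), run from the ceph-deploy admin node:

$ ceph-deploy mon add mon4
$ ceph-deploy mon add mon5

# add the new monitors to mon_initial_members / mon_host in ceph.conf on the
# admin node, then push the updated config to every node
$ ceph-deploy --overwrite-conf config push mon1 mon2 mon3 mon4 mon5 osd1 osd2

# check that all five monitors are in quorum
$ ceph quorum_status --format json-pretty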

> Thanks
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Troubles setting up radosgw

2016-09-28 Thread Iban Cabrillo
Dear Admins,
   Over the last day I have been trying to deploy a new radosgw, following the
jewel guide. The ceph cluster is healthy (3 mon and 2 osd servers).
   [root@cephrgw ceph]# ceph -v
ceph version 10.2.3 (ecc23778eb545d8dd55e2e4735b53cc93f92e65b)
   [root@cephrgw ceph]# rpm -qa | grep ceph
ceph-common-10.2.3-0.el7.x86_64
libcephfs1-10.2.3-0.el7.x86_64
ceph-deploy-1.5.36-0.noarch
ceph-release-1-1.el7.noarch
ceph-base-10.2.3-0.el7.x86_64
ceph-radosgw-10.2.3-0.el7.x86_64
python-cephfs-10.2.3-0.el7.x86_64
ceph-selinux-10.2.3-0.el7.x86_64

Civetweb is running on the default port:
[root@cephrgw ceph]# systemctl status ceph-radosgw@rgw.cephrgw.service
● ceph-radosgw@rgw.cephrgw.service - Ceph rados gateway
   Loaded: loaded (/usr/lib/systemd/system/ceph-radosgw@.service; enabled;
vendor preset: disabled)
   Active: active (running) since mié 2016-09-28 10:20:34 CEST; 2s ago
 Main PID: 29311 (radosgw)
   CGroup:
/system.slice/system-ceph\x2dradosgw.slice/ceph-radosgw@rgw.cephrgw.service
   └─29311 /usr/bin/radosgw -f --cluster ceph --name
client.rgw.cephrgw --setuser ceph --setgroup ceph

sep 28 10:20:34 cephrgw.ifca.es systemd[1]: Started Ceph rados gateway.
sep 28 10:20:34 cephrgw.ifca.es systemd[1]: Starting Ceph rados gateway...

And these pools were created on the ceph storage:
.rgw.root
default.rgw.control
default.rgw.data.root
default.rgw.gc
default.rgw.log
default.rgw.users.uid

But it seems that the zones are not well defined.

radosgw-admin zone get --zone-id=default
2016-09-28 10:24:07.142478 7fd810b219c0  0 failed reading obj info from
.rgw.root:zone_info.default: (2) No such file or directory
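
To see which multisite objects exist at all, these listing commands can be
checked (just a sketch; on a fresh jewel install with no realm configured,
the lists may simply come back empty):

radosgw-admin realm list
radosgw-admin zonegroup list
radosgw-admin zone list
radosgw-admin period get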


[root@cephrgw ~]# radosgw-admin zone get
2016-09-28 10:25:41.740162 7f18072799c0  1 -- :/0 messenger.start
2016-09-28 10:25:41.741262 7f18072799c0  1 -- :/945549824 -->
10.10.3.3:6789/0 -- auth(proto 0 30 bytes epoch 0) v1 -- ?+0 0x7f18085c9850
con 0x7f18085c85a0
2016-09-28 10:25:41.742048 7f180726f700  1 -- 10.10.3.4:0/945549824 learned
my addr 10.10.3.4:0/945549824
2016-09-28 10:25:41.743168 7f17eae03700  1 -- 10.10.3.4:0/945549824 <==
mon.2 10.10.3.3:6789/0 1  mon_map magic: 0 v1  495+0+0 (2693174994
0 0) 0x7f17d4000b90 con 0x7f18085c85a0
2016-09-28 10:25:41.743380 7f17eae03700  1 -- 10.10.3.4:0/945549824 <==
mon.2 10.10.3.3:6789/0 2  auth_reply(proto 2 0 (0) Success) v1 
33+0+0 (3801669063 0 0) 0x7f17d4001010 con 0x7f18085c85a0
2016-09-28 10:25:41.743696 7f17eae03700  1 -- 10.10.3.4:0/945549824 -->
10.10.3.3:6789/0 -- auth(proto 2 32 bytes epoch 0) v1 -- ?+0 0x7f17e0001730
con 0x7f18085c85a0
2016-09-28 10:25:41.744541 7f17eae03700  1 -- 10.10.3.4:0/945549824 <==
mon.2 10.10.3.3:6789/0 3  auth_reply(proto 2 0 (0) Success) v1 
206+0+0 (1705741500 0 0) 0x7f17d4001010 con 0x7f18085c85a0
2016-09-28 10:25:41.744765 7f17eae03700  1 -- 10.10.3.4:0/945549824 -->
10.10.3.3:6789/0 -- auth(proto 2 165 bytes epoch 0) v1 -- ?+0
0x7f17e0001bf0 con 0x7f18085c85a0
2016-09-28 10:25:41.745619 7f17eae03700  1 -- 10.10.3.4:0/945549824 <==
mon.2 10.10.3.3:6789/0 4  auth_reply(proto 2 0 (0) Success) v1 
393+0+0 (482591267 0 0) 0x7f17d40008c0 con 0x7f18085c85a0
2016-09-28 10:25:41.745783 7f17eae03700  1 -- 10.10.3.4:0/945549824 -->
10.10.3.3:6789/0 -- mon_subscribe({monmap=0+}) v2 -- ?+0 0x7f18085cd560 con
0x7f18085c85a0
2016-09-28 10:25:41.745967 7f18072799c0  1 -- 10.10.3.4:0/945549824 -->
10.10.3.3:6789/0 -- mon_subscribe({osdmap=0}) v2 -- ?+0 0x7f18085c9850 con
0x7f18085c85a0
2016-09-28 10:25:41.746521 7f17eae03700  1 -- 10.10.3.4:0/945549824 <==
mon.2 10.10.3.3:6789/0 5  mon_map magic: 0 v1  495+0+0 (2693174994
0 0) 0x7f17d40012b0 con 0x7f18085c85a0
2016-09-28 10:25:41.746669 7f17daffd700  2
RGWDataChangesLog::ChangesRenewThread: start
2016-09-28 10:25:41.746882 7f18072799c0 20 get_system_obj_state:
rctx=0x7ffe0afd57e0 obj=.rgw.root:default.realm state=0x7f18085cf4e8
s->prefetch_data=0
2016-09-28 10:25:41.746962 7f17eae03700  1 -- 10.10.3.4:0/945549824 <==
mon.2 10.10.3.3:6789/0 6  osd_map(5792..5792 src has 5225..5792) v3
 13145+0+0 (1223904398 0 0) 0x7f17d40008c0 con 0x7f18085c85a0
2016-09-28 10:25:41.747661 7f18072799c0  1 -- 10.10.3.4:0/945549824 -->
10.10.3.12:6810/8166 -- osd_op(client.2974205.0:1 26.85fca992 default.realm
[getxattrs,stat] snapc 0=[] ack+read+known_if_redirected e5792) v7 -- ?+0
0x7f18085d3410 con 

Re: [ceph-users] 答复: Ceph user manangerment question

2016-09-28 Thread Daleep Singh Bais
Hi Dillon,

Please check
http://docs.ceph.com/docs/firefly/rados/operations/auth-intro/#ceph-authorization-caps
 


http://docs.ceph.com/docs/jewel/rados/operations/user-management/

This might provide some information on permissions.
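
As a concrete illustration (a sketch only; the pool name cephfs_data is an
assumption, substitute the data pool your filesystem actually uses), a CephFS
client key that can read the cluster maps, talk to the MDS, and write only to
one data pool could look like this:

ceph auth get-or-create client.appuser \
    mon 'allow r' \
    mds 'allow rw' \
    osd 'allow rw pool=cephfs_data'

# show the resulting caps
ceph auth get client.appuser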

Thanks,
Daleep Singh Bais

On 09/28/2016 11:28 AM, 卢 迪 wrote:
>
> Hi Daleep,
>
>  
>
> Thank you for reply. 
>
> I have read the document for a moment. Let me try to clarify this. 
>
>  
>
> In my case, I only assign the mon 'allow r' permission to the account
> appuser. But I can still mount cephfs and see the directory created
> before (the folder name is "test").
>
>
> And I can create a folder under this folder too (the folder is "test2").
>
> However, when I created and edited a text file ("test.txt"), I got a
> read-only error. When I quit with "q!", I still see the file with 0 bytes.
>
>  
>
> I wonder whether I have misunderstood something. I thought I shouldn't
> see the folder "test", because the user doesn't have read/write
> permission on any pool in this cluster. I also shouldn't be able to create
> "test.txt" in this folder because of permissions. (But I CREATED it, with
> nothing in it.)
>
>  
>
> Compare it with assigning OS user permissions (in Linux, for example): I
> have to grant read permission if a user wants to read a file; if it has to
> execute a script, I have to grant the execute permission. I want to
> understand when and why I should assign which permission to a user for a
> particular task. Can I find this kind of document?
>
>  
>
> Thanks,
>
> Dillon
>
> 
> *From:* Daleep Singh Bais 
> *Sent:* 27 September 2016 6:55:10
> *To:* 卢 迪; ceph-users@lists.ceph.com
> *Subject:* Re: [ceph-users] Ceph user manangerment question
>  
> Hi Dillon,
>
> Ceph uses CephX authentication, which gives users permission to read/write
> on selected pools. We give mon 'allow r' so the client can fetch the
> cluster/CRUSH maps.
>
> You can refer to the URL below for more information on CephX and creating
> user keyrings for access to selected / specific pools.
>
> http://docs.ceph.com/docs/jewel/rados/configuration/auth-config-ref/
>
> The URL below will give you information on the various permissions that
> can be applied when creating a CephX authentication key.
>
> http://docs.ceph.com/docs/firefly/rados/operations/auth-intro/
>
> Hope this gives you some insight and a way forward.
>
> Thanks,
>
> Daleep Singh Bais
>
> On 09/27/2016 12:02 PM, 卢 迪 wrote:
>>
>> Hello all,
>>
>>
>> I'm a newbie to Ceph. I read the documentation and created a ceph cluster
>> on VMs. I have a question about how to apply user management to
>> the cluster. I'm not asking how to create or modify users or user
>> privileges; I have found that in the Ceph documentation.
>>
>>
>> I want to know:
>>
>>
>> 1. Is there a way to know what each privilege is used for? For example, I
>> created a user client.appuser with mon "allow r", and this user can
>> access the cluster; if I remove the mon "allow r", the mount times
>> out (in this case, I mount the cluster with cephfs). If someone has
>> this information, could you please share it with me?
>>
>>
>> 2. In what kind of situation would you create different users for the
>> cluster? Currently I use the admin user to access the whole cluster,
>> for things such as starting the cluster, mounting the file system, and
>> so on. It looks like the appuser (which I created above) can mount the
>> file system too. Is it possible to create users like OS users or database
>> users, so that when one user uploads some data, the others can't see it
>> or can only read it?
>>
>>
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com