Re: [ceph-users] scrub error: found clone without head

2013-05-23 Thread Olivier Bonvalet
Not yet. I keep it for now.

Le mercredi 22 mai 2013 à 15:50 -0700, Samuel Just a écrit :
 rb.0.15c26.238e1f29
 
 Has that rbd volume been removed?
 -Sam
 
 On Wed, May 22, 2013 at 12:18 PM, Olivier Bonvalet ceph.l...@daevel.fr 
 wrote:
  0.61-11-g3b94f03 (0.61-1.1), but the bug occurred with bobtail.
 
 
  Le mercredi 22 mai 2013 à 12:00 -0700, Samuel Just a écrit :
  What version are you running?
  -Sam
 
  On Wed, May 22, 2013 at 11:25 AM, Olivier Bonvalet ceph.l...@daevel.fr 
  wrote:
   Is it enough ?
  
   # tail -n500 -f /var/log/ceph/osd.28.log | grep -A5 -B5 'found clone 
   without head'
   2013-05-22 15:43:09.308352 7f707dd64700  0 log [INF] : 9.105 scrub ok
   2013-05-22 15:44:21.054893 7f707dd64700  0 log [INF] : 9.451 scrub ok
   2013-05-22 15:44:52.898784 7f707cd62700  0 log [INF] : 9.784 scrub ok
   2013-05-22 15:47:43.148515 7f707cd62700  0 log [INF] : 9.3c3 scrub ok
   2013-05-22 15:47:45.717085 7f707dd64700  0 log [INF] : 9.3d0 scrub ok
   2013-05-22 15:52:14.573815 7f707dd64700  0 log [ERR] : scrub 3.6b 
   ade3c16b/rb.0.15c26.238e1f29.9221/12d7//3 found clone without 
   head
   2013-05-22 15:55:07.230114 7f707d563700  0 log [ERR] : scrub 3.6b 
   261cc0eb/rb.0.15c26.238e1f29.3671/12d7//3 found clone without 
   head
   2013-05-22 15:56:56.456242 7f707d563700  0 log [ERR] : scrub 3.6b 
   b10deaeb/rb.0.15c26.238e1f29.86a2/12d7//3 found clone without 
   head
   2013-05-22 15:57:51.667085 7f707dd64700  0 log [ERR] : 3.6b scrub 3 
   errors
   2013-05-22 15:57:55.241224 7f707dd64700  0 log [INF] : 9.450 scrub ok
   2013-05-22 15:57:59.800383 7f707cd62700  0 log [INF] : 9.465 scrub ok
   2013-05-22 15:59:55.024065 7f707661a700  0 -- 192.168.42.3:6803/12142  
   192.168.42.5:6828/31490 pipe(0x2a689000 sd=108 :6803 s=2 pgs=200652 
   cs=73 l=0).fault with nothing to send, going to standby
   2013-05-22 16:01:45.542579 7f7022770700  0 -- 192.168.42.3:6803/12142  
   192.168.42.5:6828/31490 pipe(0x2a689280 sd=99 :6803 s=0 pgs=0 cs=0 
   l=0).accept connect_seq 74 vs existing 73 state standby
   --
   2013-05-22 16:29:49.544310 7f707dd64700  0 log [INF] : 9.4eb scrub ok
   2013-05-22 16:29:53.190233 7f707dd64700  0 log [INF] : 9.4f4 scrub ok
   2013-05-22 16:29:59.478736 7f707dd64700  0 log [INF] : 8.6bb scrub ok
   2013-05-22 16:35:12.240246 7f7022770700  0 -- 192.168.42.3:6803/12142  
   192.168.42.5:6828/31490 pipe(0x2a689280 sd=99 :6803 s=2 pgs=200667 cs=75 
   l=0).fault with nothing to send, going to standby
   2013-05-22 16:35:19.519019 7f707d563700  0 log [INF] : 8.700 scrub ok
   2013-05-22 16:39:15.422532 7f707dd64700  0 log [ERR] : scrub 3.1 
   b1869301/rb.0.15c26.238e1f29.0836/12d7//3 found clone without 
   head
   2013-05-22 16:40:04.995256 7f707cd62700  0 log [ERR] : scrub 3.1 
   bccad701/rb.0.15c26.238e1f29.9a00/12d7//3 found clone without 
   head
   2013-05-22 16:41:07.008717 7f707d563700  0 log [ERR] : scrub 3.1 
   8a9bec01/rb.0.15c26.238e1f29.9820/12d7//3 found clone without 
   head
   2013-05-22 16:41:42.460280 7f707c561700  0 log [ERR] : 3.1 scrub 3 errors
   2013-05-22 16:46:12.385678 7f7077735700  0 -- 192.168.42.3:6803/12142  
   192.168.42.5:6828/31490 pipe(0x2a689c80 sd=137 :6803 s=0 pgs=0 cs=0 
   l=0).accept connect_seq 76 vs existing 75 state standby
   2013-05-22 16:58:36.079010 7f707661a700  0 -- 192.168.42.3:6803/12142  
   192.168.42.3:6801/11745 pipe(0x2a689a00 sd=44 :6803 s=0 pgs=0 cs=0 
   l=0).accept connect_seq 40 vs existing 39 state standby
   2013-05-22 16:58:36.798038 7f707d563700  0 log [INF] : 9.50c scrub ok
   2013-05-22 16:58:40.104159 7f707c561700  0 log [INF] : 9.526 scrub ok
  
  
   Note: I have 8 scrub errors like that, on 4 impacted PGs, and all 
   impacted objects belong to the same RBD image (rb.0.15c26.238e1f29).
  
  
  
   Le mercredi 22 mai 2013 à 11:01 -0700, Samuel Just a écrit :
   Can you post your ceph.log with the period including all of these 
   errors?
   -Sam
  
   On Wed, May 22, 2013 at 5:39 AM, Dzianis Kahanovich
   maha...@bspu.unibel.by wrote:
Olivier Bonvalet пишет:
   
Le lundi 20 mai 2013 à 00:06 +0200, Olivier Bonvalet a écrit :
Le mardi 07 mai 2013 à 15:51 +0300, Dzianis Kahanovich a écrit :
I have 4 scrub errors (3 PGs - found clone without head) on one 
OSD. They are not repairing. How can I repair this without re-creating the OSD?
   
Right now it is easy to clean and re-create the OSD, but in theory - in case 
there are multiple
OSDs - it may cause data loss.
   
I have the same problem: 8 objects (4 PGs) with the error found clone 
without
head. How can I fix that?
Since pg repair doesn't handle that kind of error, is there a way 
to
manually fix it? (It's a production cluster.)
   
Trying to fix it manually, I caused assertions in the trimming process (the 
OSD died), and
many other troubles. So if you want to keep the cluster running, wait 
for
the developers' answer. IMHO.
   
About manual repair attempt: see issue #4937. Also 

Re: [ceph-users] mon problems after upgrading to cuttlefish

2013-05-23 Thread Smart Weblications GmbH - Florian Wiessner
Hi,


please do not forget to respond to the list ceph-users@lists.ceph.com

Please find my answer below.

Am 23.05.2013 17:16, schrieb Bryan Stillwell:
 This is what I currently have configured:
 
 
 # Ceph config file
 
 [global]
 auth cluster required = none
 auth service required = none
 #   auth client required = cephx
 
 [osd]
 osd journal size = 1000
 filestore xattr use omap = true
 osd mkfs type = xfs
 osd mkfs options xfs = noatime
 
 [mon.a]
 host = a1
 mon addr = 172.24.88.50:6789
 
 [osd.0]
 host = b1
 devs = /dev/sdb
 
 [osd.1]
 host = b1
 devs = /dev/sdc
 
 [osd.2]
 host = b1
 devs = /dev/sdd
 
 [osd.3]
 host = b1
 devs = /dev/sde
 
 [osd.4]
 host = b1
 devs = /dev/sdf
 
 [osd.5]
 host = b2
 devs = /dev/sdb
 
 [osd.6]
 host = b2
 devs = /dev/sdc
 
 [osd.7]
 host = b2
 devs = /dev/sdd1
 
 [osd.8]
 host = b2
 devs = /dev/sde
 
 [osd.9]
 host = b2
 devs = /dev/sdf
 
 [osd.10]
 host = b3
 devs = /dev/sdb1
 
 [osd.11]
 host = b3
 devs = /dev/sdc1
 
 [osd.12]
 host = b3
 devs = /dev/sdd1
 
 [osd.13]
 host = b3
 devs = /dev/sde1
 
 [osd.14]
 host = b3
 devs = /dev/sdf1
 
 [osd.15]
 host = b4
 devs = /dev/sdb1
 
 [osd.16]
 host = b4
 devs = /dev/sdc1
 
 [osd.17]
 host = b4
 devs = /dev/sdd1
 
 [osd.18]
 host = b4
 devs = /dev/sde1
 
 [osd.19]
 host = b4
 devs = /dev/sdf1
 
 [osd.20]
 host = b1
 devs = /dev/sda4
 
 [osd.21]
 host = b2
 devs = /dev/sda4
 
 [osd.22]
 host = b3
 devs = /dev/sda4
 
 [osd.23]
 host = b4
 devs = /dev/sda4
 
 [mds.a]
 host = a1
 
 #[client]
 #   debug ms = 1
 #   debug client = 20
 
 On Thu, May 23, 2013 at 4:00 AM, Smart Weblications GmbH - Florian
 Wiessner f.wiess...@smart-weblications.de wrote:
 Am 23.05.2013 07:45, schrieb Bryan Stillwell:
 I attempted to upgrade my bobtail cluster to cuttlefish tonight and I
 believe I'm running into some mon-related issues.  I did the original
 install manually instead of with mkcephfs or ceph-deploy, so I think
 that might have something to do with this error:

 root@a1:~# ceph-mon -d -c /etc/ceph/ceph.conf
 2013-05-22 23:37:29.283975 7f8fb97b3780  0 ceph version 0.61.2
 (fea782543a844bb277ae94d3391788b76c5bee60), process ceph-mon, pid 5531
 IO error: /var/lib/ceph/mon/ceph-admin/store.db/LOCK: No such file or 
 directory
 2013-05-22 23:37:29.286534 7f8fb97b3780  1 unable to open monitor
 store at /var/lib/ceph/mon/ceph-admin
 2013-05-22 23:37:29.286544 7f8fb97b3780  1 check for old monitor store 
 format
 2013-05-22 23:37:29.286550 7f8fb97b3780  1
 store(/var/lib/ceph/mon/ceph-admin) mount
 2013-05-22 23:37:29.286559 7f8fb97b3780  1
 store(/var/lib/ceph/mon/ceph-admin) basedir
 /var/lib/ceph/mon/ceph-admin dne
 2013-05-22 23:37:29.286564 7f8fb97b3780 -1 unable to mount monitor
 store: (2) No such file or directory
 2013-05-22 23:37:29.286577 7f8fb97b3780 -1 found errors while
 attempting to convert the monitor store: (2) No such file or directory
 root@a1:~# ls -l /var/lib/ceph/mon/
 total 4
 drwxr-xr-x 15 root root 4096 May 22 23:30 ceph-a


 I only have one mon daemon in this cluster as well.  I was planning on
 upgrading it to 3 tonight but when I try to run most commands they
 just hang now.

 I do see the store.db directory in the ceph-a directory if that helps:

 root@a1:~# ls -l  /var/lib/ceph/mon/ceph-a/
 total 868
 drwxr-xr-x 2 root root   4096 May 22 23:30 auth
 drwxr-xr-x 2 root root   4096 May 22 23:30 auth_gv
 -rw--- 1 root root 37 Feb  4 14:22 cluster_uuid
 -rw--- 1 root root  2 May 22 23:30 election_epoch
 -rw--- 1 root root120 Feb  4 14:22 feature_set
 -rw--- 1 root root  2 Dec 28 11:35 joined
 -rw--- 1 root root 77 May 22 22:30 keyring
 -rw--- 1 root root  0 Dec 28 11:35 lock
 drwxr-xr-x 2 root root  20480 May 22 23:30 logm
 drwxr-xr-x 2 root root  20480 May 22 23:30 logm_gv
 -rw--- 1 root root 21 Dec 28 11:35 magic
 drwxr-xr-x 2 root root  12288 May 22 23:30 mdsmap
 drwxr-xr-x 2 root root  12288 May 22 23:30 mdsmap_gv
 drwxr-xr-x 2 root root   4096 Dec 28 11:35 monmap
 drwxr-xr-x 2 root root 233472 May 22 23:30 osdmap
 drwxr-xr-x 2 root root 237568 May 22 23:30 osdmap_full
 drwxr-xr-x 2 root root 253952 May 22 23:30 osdmap_gv
 drwxr-xr-x 2 root root  20480 May 22 23:30 pgmap
 drwxr-xr-x 2 root root  20480 May 22 23:30 pgmap_gv
 drwxr-xr-x 2 root root   4096 May 22 23:36 store.db
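
As a hedged aside (a guess, not a confirmed fix): the listing shows the store
under ceph-a while the startup error looked for ceph-admin, which suggests
ceph-mon fell back to the id "admin" because no id was given on the command
line. Pointing it at the existing monitor explicitly would look something like:

  # start the existing monitor "a" so it opens /var/lib/ceph/mon/ceph-a
  ceph-mon -d -c /etc/ceph/ceph.conf -i a
  # or let the sysvinit script pick the id up from [mon.a] in ceph.conf
  service ceph start mon.a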



 what does your ceph.conf look like?


 store(/var/lib/ceph/mon/ceph-admin) mount
 2013-05-22 23:37:29.286559 7f8fb97b3780  1
 store(/var/lib/ceph/mon/ceph-admin) basedir
 /var/lib/ceph/mon/ceph-admin dne
 2013-05-22 23:37:29.286564 

[ceph-users] radosgw with nginx

2013-05-23 Thread Erdem Agaoglu
Hi all,

We are trying to run radosgw with nginx.
We've found an example, https://gist.github.com/guilhem/4964818,
and changed our nginx.conf as shown below:

http {
server {
listen 0.0.0.0:80;
server_name _;
access_log  off;
location / {
fastcgi_pass_header Authorization;
fastcgi_pass_request_headers on;
include fastcgi_params;
fastcgi_keep_conn on;
fastcgi_pass unix:/tmp/radosgw.sock;
}
}
}

But the simplest test gives the following error:

# curl -v http://x.x.x.x/bucket/test.jpg
* About to connect() to x.x.x.x port 80 (#0)
*   Trying x.x.x.x ... connected
> GET /bucket/test.jpg HTTP/1.1
> User-Agent: curl/7.22.0 (x86_64-pc-linux-gnu) libcurl/7.22.0
OpenSSL/1.0.1 zlib/1.2.3.4 libidn/1.23 librtmp/2.3
> Host: x.x.x.x
> Accept: */*
>
< HTTP/1.1 400
< Server: nginx/1.1.19
< Date: Thu, 23 May 2013 15:34:05 GMT
< Content-Type: application/json
< Content-Length: 26
< Connection: keep-alive
< Accept-Ranges: bytes
<
* Connection #0 to host x.x.x.x left intact
* Closing connection #0
{"Code":"InvalidArgument"}

radosgw logs show these:

2013-05-23 08:34:31.074037 7f0739c33780 20 enqueued request req=0x1e78870
2013-05-23 08:34:31.074044 7f0739c33780 20 RGWWQ:
2013-05-23 08:34:31.074045 7f0739c33780 20 req: 0x1e78870
2013-05-23 08:34:31.074047 7f0739c33780 10 allocated request req=0x1ec6490
2013-05-23 08:34:31.074084 7f0720ce8700 20 dequeued request req=0x1e78870
2013-05-23 08:34:31.074093 7f0720ce8700 20 RGWWQ: empty
2013-05-23 08:34:31.074098 7f0720ce8700  1 == starting new request
req=0x1e78870 =
2013-05-23 08:34:31.074140 7f0720ce8700  2 req 4:0.42initializing
2013-05-23 08:34:31.074174 7f0720ce8700  5 nothing to log for operation
2013-05-23 08:34:31.074178 7f0720ce8700  2 req 4:0.80::GET
/bucket/test.jpg::http status=400
2013-05-23 08:34:31.074192 7f0720ce8700  1 == req done req=0x1e78870
http_status=400 ==


Normally we would expect a well-formed 403 (because the request doesn't have an
Authorization header), but we get a 400 and cannot figure out why.
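
One hedged way to narrow this down is to raise the gateway's log verbosity so
the rejected request is dumped in detail, then repeat the curl. For example, in
ceph.conf on the gateway host (the section name below is an assumption; use
whatever name your rgw instance runs under):

  [client.radosgw.gateway]
      debug rgw = 20
      debug ms = 1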

Thanks in advance.

-- 
erdem agaoglu
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] ceph-deploy

2013-05-23 Thread Dewan Shamsul Alam
Hi,

I tried ceph-deploy all day and found that it has python-setuptools as a
dependency. I knew about python-pushy. But is there any other dependency
that I'm missing?

The problems I'm getting are as follows:

#ceph-deploy gatherkeys ceph0 ceph1 ceph2
returns the following error,
Unable to find /etc/ceph/ceph.client.admin.keyring on ['ceph0', 'ceph1',
'ceph2']
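
A hedged note on this error: gatherkeys can only collect keys that the monitors
have already generated, so it is worth confirming that "ceph-deploy mon create
ceph0 ceph1 ceph2" has run and the mon daemons are up, then checking on one of
the mon nodes:

  ls -l /etc/ceph/ceph.client.admin.keyring /var/lib/ceph/bootstrap-*/ceph.keyring

If those files are missing, the monitors probably never reached quorum to
create them.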

Once I got past this, I don't know why it only works sometimes. I have been
following the exact steps mentioned in the blog.

Then when I try to do

ceph-deploy osd create ceph0:/dev/sda3 ceph1:/dev/sda3 ceph2:/dev/sda3

It gets stuck.

I'm using Ubuntu 13.04 for ceph-deploy and 12.04 for the ceph nodes. I just
need to get cuttlefish working and am willing to change the OS if it is
required. Please help. :)

Best Regards,
Dewan Shamsul Alam
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] mkcephfs

2013-05-23 Thread Dewan Shamsul Alam
Hi,

I had a running Ceph cluster on bobtail (0.56.4). It is my test
cluster. I upgraded it to 0.56.6, and now mkcephfs doesn't work with the same
previously working configuration and the following command:

/sbin/mkcephfs -a -c /etc/ceph/ceph.conf

ceph.conf

[global]
auth supported = none
auth cluster required = none
auth service required = none
auth client required = none
[osd]
osd journal size = 1000
filestore xattr use omap = true
osd mkfs type = btrfs
osd mkfs options btrfs = -m raid0
osd mount options btrfs = rw, noatime
[mon.a]
host = ceph0
mon addr = 192.168.128.10:6789
[mon.b]
host = ceph1
mon addr = 192.168.128.11:6789
[mon.c]
host = ceph2
mon addr = 192.168.128.12:6789
[osd.0]
host = ceph0
devs = /dev/sda3
[osd.1]
host = ceph1
devs = /dev/sda3
[osd.2]
host = ceph2
devs = /dev/sda3
[mds.a]
host = ceph0
[mds.b]
host = ceph1
[mds.c]
host = ceph2

Best Regards,
Dewan Shamsul Alam
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] mkcephfs

2013-05-23 Thread Sage Weil
Can you be more specific?  How does it fail?

A copy of the actual output would be ideal, thanks!

sage

On Thu, 23 May 2013, Dewan Shamsul Alam wrote:

 Hi,
 
 I had a running ceph cluster on bobtail. It was on 0.56.4. It is my test
 cluster. I upgraded it to 0.56.6, now mkcephfs doesn't work with the same
 working configuration and the following command:
 
 /sbin/mkcephfs -a -c /etc/ceph/ceph.conf
 
 ceph.conf
 
 [global]
     auth supported = none
     auth cluster required = none
     auth service required = none
     auth client required = none
 [osd]
     osd journal size = 1000
     filestore xattr use omap = true
     osd mkfs type = btrfs
     osd mkfs options btrfs = -m raid0
     osd mount options btrfs = rw, noatime
 [mon.a]
     host = ceph0
     mon addr = 192.168.128.10:6789
 [mon.b]
     host = ceph1
     mon addr = 192.168.128.11:6789
 [mon.c]
     host = ceph2
     mon addr = 192.168.128.12:6789
 [osd.0]
     host = ceph0
     devs = /dev/sda3
 [osd.1]
     host = ceph1
     devs = /dev/sda3
 [osd.2]
     host = ceph2
     devs = /dev/sda3
 [mds.a]
     host = ceph0
 [mds.b]
     host = ceph1
 [mds.c]
     host = ceph2
 
 Best Regards,
 Dewan Shamsul Alam
 
 ___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] FW: About RBD

2013-05-23 Thread Mensah, Yao (CIV)
Thank you very much for your prompt response…

So basically I can't use a cluster-aware tool like Microsoft CSV on the RBD, is 
that correct?

What I am trying to understand is whether I can have 2 physical hosts (maybe Dell 
PowerEdge 2950s):

* host1 with VMs #0-10
* host2 with VMs #10-20

And both of these hosts accessing one big LUN or, in this case, a Ceph RBD?

Can host1 fail all its VMs over to host2 in case that machine has trouble, and 
still make its resources available to my users? This is very important to us if 
we really want to explore this new avenue with Ceph.

Thank you,

Yao Mensah
Systems Administrator II
OLS Servers
yao.men...@usdoj.gov
(202) 307 0354
MCITP
MCSE NT4.0 / 2000-2003
A+

From: Dave Spano [mailto:dsp...@optogenics.com]
Sent: Thursday, May 23, 2013 1:19 PM
To: Mensah, Yao (CIV)
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] FW: About RBD

Unless something changed, each RBD needs to be attached to one host at a time, 
like an iSCSI LUN.
Dave Spano
Optogenics



From: Yao Mensah (CIV) yao.men...@usdoj.gov
To: ceph-users@lists.ceph.com
Sent: Thursday, May 23, 2013 1:10:53 PM
Subject: [ceph-users] FW: About RBD
FYI

From: Mensah, Yao (CIV)
Sent: Wednesday, May 22, 2013 5:59 PM
To: 'i...@inktank.com'
Subject: About RBD

Hello,

I was doing some reading on your web site about Ceph and what it is capable of. I 
have one question and maybe you can help with this:

Can a Ceph RBD be used by 2 physical hosts at the same time? Or is Ceph RBD 
CSV (Clustered Shared Volumes) aware?

Thank you,

Yao Mensah
Systems Administrator II
OLS Servers


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] mkcephfs

2013-05-23 Thread Dewan Shamsul Alam
Hi,

This is what I get while building the cluster:

#/sbin/mkcephfs -a -c /etc/ceph/ceph.conf
temp dir is /tmp/mkcephfs.yzl9PFOJYo
preparing monmap in /tmp/mkcephfs.yzl9PFOJYo/monmap
/usr/bin/monmaptool --create --clobber --add a 192.168.128.10:6789 --add b
192.168.128.11:6789 --add c 192.168.128.12:6789 --print
/tmp/mkcephfs.yzl9PFOJYo/monmap
/usr/bin/monmaptool: monmap file /tmp/mkcephfs.yzl9PFOJYo/monmap
/usr/bin/monmaptool: generated fsid 09136333-16dc-476f-8773-90262ad0b80d
epoch 0
fsid 09136333-16dc-476f-8773-90262ad0b80d
last_changed 2013-05-23 23:36:41.325667
created 2013-05-23 23:36:41.325667
0: 192.168.128.10:6789/0 mon.a
1: 192.168.128.11:6789/0 mon.b
2: 192.168.128.12:6789/0 mon.c
/usr/bin/monmaptool: writing epoch 0 to /tmp/mkcephfs.yzl9PFOJYo/monmap (3
monitors)

WARNING: mkcephfs is now deprecated in favour of ceph-deploy. Please see:
 http://github.com/ceph/ceph-deploy
=== osd.0 ===
2013-05-23 23:36:41.851599 7fe81f143780 -1 journal FileJournal::_open:
disabling aio for non-block journal.  Use journal_force_aio to force use of
aio anyway
2013-05-23 23:36:42.576549 7fe81f143780 -1 journal FileJournal::_open:
disabling aio for non-block journal.  Use journal_force_aio to force use of
aio anyway
2013-05-23 23:36:42.577795 7fe81f143780 -1
filestore(/var/lib/ceph/osd/ceph-0) could not find
23c2fcde/osd_superblock/0//-1 in index: (2) No such file or directory
2013-05-23 23:36:42.918456 7fe81f143780 -1 created object store
/var/lib/ceph/osd/ceph-0 journal /var/lib/ceph/osd/ceph-0/journal for osd.0
fsid 09136333-16dc-476f-8773-90262ad0b80d
2013-05-23 23:36:42.918520 7fe81f143780 -1 auth: error reading file:
/var/lib/ceph/osd/ceph-0/keyring: can't open
/var/lib/ceph/osd/ceph-0/keyring: (2) No such file or directory
2013-05-23 23:36:42.918642 7fe81f143780 -1 created new key in keyring
/var/lib/ceph/osd/ceph-0/keyring

WARNING: mkcephfs is now deprecated in favour of ceph-deploy. Please see:
 http://github.com/ceph/ceph-deploy
=== osd.1 ===
pushing conf and monmap to
ceph1:/tmp/mkfs.ceph.HWyfcu95hsnB1jxdVqxGJNAOd2u3aj5I
2013-05-23 23:36:12.380573 7ff116b63780 -1 journal FileJournal::_open:
disabling aio for non-block journal.  Use journal_force_aio to force use of
aio anyway
2013-05-23 23:36:13.026598 7ff116b63780 -1 journal FileJournal::_open:
disabling aio for non-block journal.  Use journal_force_aio to force use of
aio anyway
2013-05-23 23:36:13.037762 7ff116b63780 -1
filestore(/var/lib/ceph/osd/ceph-1) could not find
23c2fcde/osd_superblock/0//-1 in index: (2) No such file or directory
2013-05-23 23:36:13.366445 7ff116b63780 -1 created object store
/var/lib/ceph/osd/ceph-1 journal /var/lib/ceph/osd/ceph-1/journal for osd.1
fsid 09136333-16dc-476f-8773-90262ad0b80d
2013-05-23 23:36:13.366510 7ff116b63780 -1 auth: error reading file:
/var/lib/ceph/osd/ceph-1/keyring: can't open
/var/lib/ceph/osd/ceph-1/keyring: (2) No such file or directory
2013-05-23 23:36:13.366621 7ff116b63780 -1 created new key in keyring
/var/lib/ceph/osd/ceph-1/keyring

WARNING: mkcephfs is now deprecated in favour of ceph-deploy. Please see:
 http://github.com/ceph/ceph-deploy
collecting osd.1 key
=== osd.2 ===
pushing conf and monmap to
ceph2:/tmp/mkfs.ceph.tNt36unRvZ6lVKmz65OjiOhrpUfsw7xz
2013-05-23 23:36:59.086209 7fe38a955780 -1 journal FileJournal::_open:
disabling aio for non-block journal.  Use journal_force_aio to force use of
aio anyway
2013-05-23 23:36:59.610999 7fe38a955780 -1 journal FileJournal::_open:
disabling aio for non-block journal.  Use journal_force_aio to force use of
aio anyway
2013-05-23 23:36:59.623725 7fe38a955780 -1
filestore(/var/lib/ceph/osd/ceph-2) could not find
23c2fcde/osd_superblock/0//-1 in index: (2) No such file or directory
2013-05-23 23:36:59.850510 7fe38a955780 -1 created object store
/var/lib/ceph/osd/ceph-2 journal /var/lib/ceph/osd/ceph-2/journal for osd.2
fsid 09136333-16dc-476f-8773-90262ad0b80d
2013-05-23 23:36:59.850574 7fe38a955780 -1 auth: error reading file:
/var/lib/ceph/osd/ceph-2/keyring: can't open
/var/lib/ceph/osd/ceph-2/keyring: (2) No such file or directory
2013-05-23 23:36:59.850688 7fe38a955780 -1 created new key in keyring
/var/lib/ceph/osd/ceph-2/keyring

WARNING: mkcephfs is now deprecated in favour of ceph-deploy. Please see:
 http://github.com/ceph/ceph-deploy
collecting osd.2 key
=== mds.a ===
creating private key for mds.a keyring /var/lib/ceph/mds/ceph-a/keyring
creating /var/lib/ceph/mds/ceph-a/keyring

WARNING: mkcephfs is now deprecated in favour of ceph-deploy. Please see:
 http://github.com/ceph/ceph-deploy
=== mds.b ===
pushing conf and monmap to
ceph1:/tmp/mkfs.ceph.QDX0IZSEBd3469OT6yw6ISlckxfmO6nu
creating private key for mds.b keyring /var/lib/ceph/mds/ceph-b/keyring
creating /var/lib/ceph/mds/ceph-b/keyring

WARNING: mkcephfs is now deprecated in favour of ceph-deploy. Please see:
 http://github.com/ceph/ceph-deploy
collecting mds.b key
=== mds.c ===
pushing conf and monmap to
ceph2:/tmp/mkfs.ceph.XimfAW4CrJR11rs8IAJhsHn0inBNJdhl
creating private 

Re: [ceph-users] RADOS Gateway Configuration

2013-05-23 Thread Daniel Curran
Hey John,

Thanks for the reply. I'll check out that other doc you have there. Just
for future reference, do you know where ceph-deploy puts the ceph keyring?

Daniel


On Wed, May 22, 2013 at 7:19 PM, John Wilkins john.wilk...@inktank.comwrote:

 Daniel,

 It looks like I need to update that portion of the docs too, as it
 links back to the 5-minute quick start. Once you are up and running
 with HEALTH OK on either the 5-minute Quick Start or Quick Ceph
 Deploy, your storage cluster is running fine. The remaining issues
 would likely be with authentication, chmod on the files, or with the
 RGW setup. There's a quick start for RGW, which I had verified here:
 http://ceph.com/docs/master/start/quick-rgw/. Someone else had a
 problem with the Rewrite rule on that example reported here:
 http://tracker.ceph.com/issues/4608. It's likely I need to run through
 with specific Ceph and Apache versions. There are also a few
 additional tips in the configuration section.
 http://ceph.com/docs/master/radosgw/config/

 There is an issue in some cases where keys have forward or backslash
 characters, and you may need to regenerate the keys.
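
 For example, regenerating the swift secret might look something like this (a
 hedged sketch; the --gen-secret flag and the johndoe names are assumptions
 taken from the commands quoted further down):

   radosgw-admin key create --subuser=johndoe:swift --key-type=swift --gen-secret
   radosgw-admin user info --uid=johndoe   # check that swift_keys now has a secret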



 On Wed, May 22, 2013 at 4:42 PM, Daniel Curran danielcurra...@gmail.com
 wrote:
 
  Hello,
 
  I just started using ceph recently and was trying to get the RADOS
 Gateway
  working in order to use the Swift-compatible API. I followed the install
  instructions found here (http://ceph.com/docs/master/start/quick-ceph-deploy/)
  and got to a point where ceph health gives me
  HEALTH_OK. This is all well and good but near the end of the rados gw
 setup
  (found here http://ceph.com/docs/master/radosgw/manual-install/) I need
 to
  execute the following line:
 
  sudo ceph -k /etc/ceph/ceph.keyring auth add client.radosgw.gateway -i
  /etc/ceph/keyring.radosgw.gateway
 
  Unfortunately, I don't believe ceph-deploy places the keyring at
  /etc/ceph/ceph.keyring. I tried to use the one from
  /var/lib/ceph/bootstrap-osd/ceph.keyring but it was unable to
 authenticate
  as client.admin. Is there another location that the keyring needs to be
  copied from or am I doing something totally wrong?
 
  I didn't want to be held back so I restarted and did the manual install
 from
  the 5-minute quick start where I was able to find the ring. I had more
  issues almost immediately. I have to execute the following steps to
 create
  some users for swift:
 
  radosgw-admin user create --uid=johndoe --display-name="John Doe"
  --email=j...@example.com
  sudo radosgw-admin subuser create --uid=johndoe --subuser=johndoe:swift
  --access=full
 
  sudo radosgw-admin key create --subuser=johndoe:swift --key-type=swift
 
  The first two gave me output I was expecting but the very last line had
 some
  weirdness that essentially made swift unusable. The expected output is
  something along these lines:
 
  { "user_id": "johndoe",
    "rados_uid": 0,
    "display_name": "John Doe",
    "email": "j...@example.com",
    "suspended": 0,
    "subusers": [
       { "id": "johndoe:swift",
         "permissions": "full-control"}],
    "keys": [
      { "user": "johndoe",
        "access_key": "QFAMEDSJP5DEKJO0DDXY",
        "secret_key": "iaSFLDVvDdQt6lkNzHyW4fPLZugBAI1g17LO0+87"}],
    "swift_keys": [
      { "user": "johndoe:swift",
        "secret_key": "E9T2rUZNu2gxUjcwUBO8n\/Ev4KX6\/GprEuH4qhu1"}]}
 
  Where that last secret key is what we hand the swift CLI as seen here:
 
  swift -V 1.0 -A http://radosgw.example.com/auth -U johndoe:swift -K
  E9T2rUZNu2gxUjcwUBO8n\/Ev4KX6\/GprEuH4qhu1 post test
 
  However, my output came out like this:
 
  { "user_id": "johndoe",
    "display_name": "John Doe",
    "email": "j...@example.com",
    "suspended": 0,
    "max_buckets": 1000,
    "auid": 0,
    "subusers": [
       { "id": "johndoe:swift",
         "permissions": "full-control"}],
    "keys": [
      { "user": "johndoe",
        "access_key": "SUEXWVL3WB2Z64CRAG97",
        "secret_key": "C\/jHFJ3wdPv4iJ+aq4JeZ52LEC3OdnhsYEnVkhBP"}],
    "swift_keys": [
      { "user": "johndoe:swift",
        "secret_key": ""}],
    "caps": []}
 
 
  Giving me no swift key to use. I don't believe the key is supposed to be
  blank because I tried that and received auth errors (to the best of my
  ability). I can't tell if this is my fault since I'm new nor am I able to
  find a way around it. It looks like there are definitely changes between
 the
  version used in the doc and mine so maybe it's all working as it should
 but
  the secret_key for swift lives somewhere else. If anyone knows anything
 I'd
  appreciate it a lot.
 
  Thank you,
  Daniel
 
 
 
  ___
  ceph-users mailing list
  ceph-users@lists.ceph.com
  http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 



 --
 John Wilkins
 Senior Technical Writer
 Intank
 john.wilk...@inktank.com
 (415) 425-9599
 http://inktank.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RADOS Gateway Configuration

2013-05-23 Thread John Wilkins
It puts it in the same directory where you executed ceph-deploy.

On Thu, May 23, 2013 at 10:57 AM, Daniel Curran
danielcurra...@gmail.com wrote:
 Hey John,

 Thanks for the reply. I'll check out that other doc you have there. Just for
 future reference do you know where ceph-deploy puts the ceph keyring?

 Daniel


 On Wed, May 22, 2013 at 7:19 PM, John Wilkins john.wilk...@inktank.com
 wrote:

 Daniel,

 It looks like I need to update that portion of the docs too, as it
 links back to the 5-minute quick start. Once you are up and running
 with HEALTH OK on either the 5-minute Quick Start or Quick Ceph
 Deploy, your storage cluster is running fine. The remaining issues
 would likely be with authentication, chmod on the files, or with the
 RGW setup. There's a quick start for RGW, which I had verified here:
 http://ceph.com/docs/master/start/quick-rgw/. Someone else had a
 problem with the Rewrite rule on that example reported here:
 http://tracker.ceph.com/issues/4608. It's likely I need to run through
 with specific Ceph and Apache versions. There are also a few
 additional tips in the configuration section.
 http://ceph.com/docs/master/radosgw/config/

 There is an issue in some cases where keys have forward or backslash
 characters, and you may need to regenerate the keys.



 On Wed, May 22, 2013 at 4:42 PM, Daniel Curran danielcurra...@gmail.com
 wrote:
 
  Hello,
 
  I just started using ceph recently and was trying to get the RADOS
  Gateway
  working in order to use the Swift compatible API. I followed the install
  instructions found here (http://ceph.com/docs/master
  /start/quick-ceph-deploy/) and got to a point where ceph health give
  me
  HEALTH_OK. This is all well and good but near the end of the rados gw
  setup
  (found here http://ceph.com/docs/master/radosgw/manual-install/) I need
  to
  execute the following line:
 
  sudo ceph -k /etc/ceph/ceph.keyring auth add client.radosgw.gateway -i
  /etc/ceph/keyring.radosgw.gateway
 
  Unfortunately, I don't believe ceph-deploy places the keyring at
  /etc/ceph/ceph.keyring. I tried to use the one from
  /var/lib/ceph/bootstrap-osd/ceph.keyring but it was unable to
  authenticate
  as client.admin. Is there another location that the keyring needs to be
  copied from or am I doing something totally wrong?
 
  I didn't want to be held back so I restarted and did the manual install
  from
  the 5-minute quick start where I was able to find the ring. I had more
  issues almost immediately. I have to execute the following steps to
  create
  some users for swift:
 
  radosgw-admin user create --uid=johndoe --display-name=John Doe
  --email=j...@example.com
  sudo radosgw-admin subuser create --uid=johndoe --subuser=johndoe:swift
  --access=full
 
  sudo radosgw-admin key create --subuser=johndoe:swift --key-type=swift
 
  The first two gave me output I was expecting but the very last line had
  some
  weirdness that essentially made swift unusable. The expected output is
  something along these lines:
 
  { user_id: johndoe,
rados_uid: 0,
display_name: John Doe,
email: j...@example.com,
suspended: 0,
subusers: [
   { id: johndoe:swift,
 permissions: full-control}],
keys: [
  { user: johndoe,
access_key: QFAMEDSJP5DEKJO0DDXY,
secret_key: iaSFLDVvDdQt6lkNzHyW4fPLZugBAI1g17LO0+87}],
swift_keys: [
  { user: johndoe:swift,
secret_key: E9T2rUZNu2gxUjcwUBO8n\/Ev4KX6\/GprEuH4qhu1}]}
 
  Where that last secret key is what we hand the swift CLI as seen here:
 
  swift -V 1.0 -A http://radosgw.example.com/auth -U johndoe:swift -K
  E9T2rUZNu2gxUjcwUBO8n\/Ev4KX6\/GprEuH4qhu1 post test
 
  However, my output came out like this:
 
  { user_id: johndoe,
display_name: John Doe,
email: j...@example.com,
suspended: 0,
max_buckets: 1000,
auid: 0,
   subusers: [
   { id: johndoe:swift,
 permissions: full-control}],
keys: [
  { user: johndoe,
access_key: SUEXWVL3WB2Z64CRAG97,
secret_key: C\/jHFJ3wdPv4iJ+aq4JeZ52LEC3OdnhsYEnVkhBP}],
swift_keys: [
  { user: johndoe:swift,
secret_key: }],
caps: []}
 
 
  Giving me no swift key to use. I don't believe the key is supposed to be
  blank because I tried that and received auth errors (to the best of my
  ability). I can't tell if this is my fault since I'm new nor am I able
  to
  find a way around it. It looks like there are definitely changes between
  the
  version used in the doc and mine so maybe it's all working as it should
  but
  the secret_key for swift lives somewhere else. If anyone knows anything
  I'd
  appreciate it a lot.
 
  Thank you,
  Daniel
 
 
 
  ___
  ceph-users mailing list
  ceph-users@lists.ceph.com
  http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 



 --
 John Wilkins
 Senior Technical Writer
 Intank
 john.wilk...@inktank.com
 (415) 425-9599
 http://inktank.com





-- 
John Wilkins
Senior Technical 

[ceph-users] ZFS on RBD?

2013-05-23 Thread Tim Bishop
Hi all,

I'm evaluating Ceph and one of my workloads is a server that provides
home directories to end users over both NFS and Samba. I'm looking at
whether this could be backed by Ceph-provided storage.

So to test this I built a single node Ceph instance (Ubuntu precise,
ceph.com packages) in a VM and popped a couple of OSDs on it. I then
built another VM and used it to mount an RBD from the Ceph node. No
problems... it all worked as described in the documentation.

Then I started to look at the filesystem I was using on top of the RBD.
I'd tested ext4 without any problems. I'd been testing ZFS (from stable
zfs-native PPA) separately against local storage on the client VM too,
so I thought I'd try that on top of the RBD. This is when I hit
problems, and the VM panicked (trace at the end of this email).
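
For anyone wanting to reproduce, the setup was essentially the following (a
hedged reconstruction; the image, pool and device names are assumptions):

  rbd create test --size 10240        # 10 GB image in the default rbd pool
  rbd map test                        # assumed to appear as /dev/rbd0
  zpool create tank /dev/rbd0         # ZFS pool directly on the RBD device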

Now I am just experimenting, so this isn't a huge deal right now. But
I'm wondering if this is something that should work? Am I overlooking
something? Is it a silly idea to even try it?

The trace looks to be in the ZFS code, so if there's a bug that needs
fixing it's probably over there rather than in Ceph, but I thought here
might be a good starting point for advice.

Thanks in advance everyone,

Tim.

[  504.644120] divide error:  [#1] SMP
[  504.644298] Modules linked in: coretemp(F) ppdev(F) vmw_balloon(F) 
microcode(F) psmouse(F) serio_raw(F) parport_pc(F) vmwgfx(F) i2c_piix4(F) 
mac_hid(F) ttm(F) shpchp(F) drm(F) rbd(F) libceph(F) lp(F) parport(F) zfs(POF) 
zcommon(POF) znvpair(POF) zavl(POF) zunicode(POF) spl(OF) floppy(F) e1000(F) 
mptspi(F) mptscsih(F) mptbase(F) btrfs(F) zlib_deflate(F) libcrc32c(F)
[  504.646156] CPU 0
[  504.646234] Pid: 2281, comm: txg_sync Tainted: PF   B  O 
3.8.0-21-generic #32~precise1-Ubuntu VMware, Inc. VMware Virtual Platform/440BX 
Desktop Reference Platform
[  504.646550] RIP: 0010:[a0258092]  [a0258092] 
spa_history_write+0x82/0x1d0 [zfs]
[  504.646816] RSP: 0018:88003ae3dab8  EFLAGS: 00010246
[  504.646940] RAX:  RBX:  RCX: 
[  504.647091] RDX:  RSI: 0020 RDI: 
[  504.647242] RBP: 88003ae3db28 R08: 88003b2afc00 R09: 0002
[  504.647423] R10: 88003b9a4512 R11: 6d206b6e61742066 R12: 88003add6600
[  504.647600] R13: 88003cfc2000 R14: 88003d3c9000 R15: 0008
[  504.647778] FS:  () GS:88003fc0() 
knlGS:
[  504.647997] CS:  0010 DS:  ES:  CR0: 8005003b
[  504.648153] CR2: 7fbc1ef54a38 CR3: 3bf3e000 CR4: 07f0
[  504.648380] DR0:  DR1:  DR2: 
[  504.648586] DR3:  DR6: 0ff0 DR7: 0400
[  504.648766] Process txg_sync (pid: 2281, threadinfo 88003ae3c000, task 
88003b7c45c0)
[  504.648990] Stack:
[  504.649087]  0002 a01e3360 88003b2afc00 
88003ae3dba0
[  504.649461]  88003d3c9000 0008 88003cfc2000 
5530ebc2
[  504.649835]  88003d22ac40 88003d22ac40 88003cfc2000 
88003b2afc00
[  504.650209] Call Trace:
[  504.650351]  [a0258415] spa_history_log_sync+0x235/0x650 [zfs]
[  504.650554]  [a023fdf3] dsl_sync_task_group_sync+0x123/0x210 [zfs]
[  504.650760]  [a0237deb] dsl_pool_sync+0x41b/0x530 [zfs]
[  504.650953]  [a024cfd8] spa_sync+0x3a8/0xa50 [zfs]
[  504.651117]  [810ae6ac] ? ktime_get_ts+0x4c/0xe0
[  504.651302]  [a025de3f] txg_sync_thread+0x2df/0x540 [zfs]
[  504.651501]  [a025db60] ? txg_init+0x250/0x250 [zfs]
[  504.651676]  [a0156c58] thread_generic_wrapper+0x78/0x90 [spl]
[  504.651856]  [a0156be0] ? __thread_create+0x310/0x310 [spl]
[  504.652029]  [8107f000] kthread+0xc0/0xd0
[  504.652174]  [8107ef40] ? flush_kthread_worker+0xb0/0xb0
[  504.652339]  [816facac] ret_from_fork+0x7c/0xb0
[  504.652492]  [8107ef40] ? flush_kthread_worker+0xb0/0xb0
[  504.652655] Code: 55 b0 48 89 fa 48 29 f2 48 01 c2 48 39 55 b8 0f 82 bc 00 
00 00 4c 8b 75 b0 41 bf 08 00 00 00 48 29 c8 31 d2 49 8b b5 70 08 00 00 48 f7 
f7 4c 8d 45 c0 4c 89 f7 48 01 ca 48 29 d3 48 83 fb 08 49
[  504.659810] RIP  [a0258092] spa_history_write+0x82/0x1d0 [zfs]
[  504.660045]  RSP 88003ae3dab8
[  504.660187] ---[ end trace e69c7eee3ba17773 ]---

-- 
Tim Bishop
http://www.bishnet.net/tim/
PGP Key: 0x5AE7D984
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] MDS dying on cuttlefish

2013-05-23 Thread Giuseppe 'Gippa' Paterno'
Hi!

I've got a cluster of two nodes on Ubuntu 12.04 with cuttlefish from the
ceph.com repo.
ceph version 0.61.2 (fea782543a844bb277ae94d3391788b76c5bee60)

The MDS process is dying after a while with a stack trace, but I can't
understand why.
I reproduced the same problem on Debian 7 with the same repository.

-3 2013-05-23 23:00:42.957679 7fa39e28e700  1 --
10.123.200.189:6800/28919 == osd.0 10.123.200.188:6802/27665 1 
osd_op_reply(5 200. [read 0~0] ack = -2 (No such file or
directory)) v4  111+0+0 (2261481792 0 0) 0x29afe00 con 0x29c4b00
-2 2013-05-23 23:00:42.957780 7fa39e28e700  0 mds.0.journaler(ro)
error getting journal off disk
-1 2013-05-23 23:00:42.960974 7fa39e28e700  1 --
10.123.200.189:6800/28919 == osd.0 10.123.200.188:6802/27665 2 
osd_op_reply(1 mds0_inotable [read 0~0] ack = -2 (No such file or
directory)) v4  112+0+0 (1612134461 0 0) 0x2a1c200 con 0x29c4b00
 0 2013-05-23 23:00:42.963326 7fa39e28e700 -1 mds/MDSTable.cc: In
function 'void MDSTable::load_2(int, ceph::bufferlist, Context*)'
thread 7fa39e28e700 time 2013-05-23 23:00:42.961076
mds/MDSTable.cc: 150: FAILED assert(0)

 ceph version 0.61.2 (fea782543a844bb277ae94d3391788b76c5bee60)
 1: (MDSTable::load_2(int, ceph::buffer::list, Context*)+0x3bb) [0x6dd2db]
 2: (Objecter::handle_osd_op_reply(MOSDOpReply*)+0xe1b) [0x7275bb]
 3: (MDS::handle_core_message(Message*)+0xae7) [0x513c57]
 4: (MDS::_dispatch(Message*)+0x33) [0x513d53]
 5: (MDS::ms_dispatch(Message*)+0xab) [0x515b3b]
 6: (DispatchQueue::entry()+0x393) [0x847ca3]
 7: (DispatchQueue::DispatchThread::entry()+0xd) [0x7caeed]
 8: (()+0x6b50) [0x7fa3a3376b50]
 9: (clone()+0x6d) [0x7fa3a1d24a7d]

Full logs here:
http://pastebin.com/C81g5jFd
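
For what it's worth, the -2 replies above are ENOENT on the MDS journal object
(200.*) and on mds0_inotable, so a hedged first check is whether those objects
exist in the metadata pool at all (pool name assumed to be the default
"metadata"):

  rados -p metadata ls | egrep '^200\.|mds0_inotable'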

I can't understand why, and I'd really appreciate a hint.
Thanks!
Regards,
  Giuseppe
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] scrub error: found clone without head

2013-05-23 Thread Samuel Just
Do all of the affected PGs share osd.28 as the primary?  I think the
only recovery is probably to manually remove the orphaned clones.
-Sam

On Thu, May 23, 2013 at 5:00 AM, Olivier Bonvalet ceph.l...@daevel.fr wrote:
 Not yet. I keep it for now.

 Le mercredi 22 mai 2013 à 15:50 -0700, Samuel Just a écrit :
 rb.0.15c26.238e1f29

 Has that rbd volume been removed?
 -Sam

 On Wed, May 22, 2013 at 12:18 PM, Olivier Bonvalet ceph.l...@daevel.fr 
 wrote:
  0.61-11-g3b94f03 (0.61-1.1), but the bug occured with bobtail.
 
 
  Le mercredi 22 mai 2013 à 12:00 -0700, Samuel Just a écrit :
  What version are you running?
  -Sam
 
  On Wed, May 22, 2013 at 11:25 AM, Olivier Bonvalet ceph.l...@daevel.fr 
  wrote:
   Is it enough ?
  
   # tail -n500 -f /var/log/ceph/osd.28.log | grep -A5 -B5 'found clone 
   without head'
   2013-05-22 15:43:09.308352 7f707dd64700  0 log [INF] : 9.105 scrub ok
   2013-05-22 15:44:21.054893 7f707dd64700  0 log [INF] : 9.451 scrub ok
   2013-05-22 15:44:52.898784 7f707cd62700  0 log [INF] : 9.784 scrub ok
   2013-05-22 15:47:43.148515 7f707cd62700  0 log [INF] : 9.3c3 scrub ok
   2013-05-22 15:47:45.717085 7f707dd64700  0 log [INF] : 9.3d0 scrub ok
   2013-05-22 15:52:14.573815 7f707dd64700  0 log [ERR] : scrub 3.6b 
   ade3c16b/rb.0.15c26.238e1f29.9221/12d7//3 found clone without 
   head
   2013-05-22 15:55:07.230114 7f707d563700  0 log [ERR] : scrub 3.6b 
   261cc0eb/rb.0.15c26.238e1f29.3671/12d7//3 found clone without 
   head
   2013-05-22 15:56:56.456242 7f707d563700  0 log [ERR] : scrub 3.6b 
   b10deaeb/rb.0.15c26.238e1f29.86a2/12d7//3 found clone without 
   head
   2013-05-22 15:57:51.667085 7f707dd64700  0 log [ERR] : 3.6b scrub 3 
   errors
   2013-05-22 15:57:55.241224 7f707dd64700  0 log [INF] : 9.450 scrub ok
   2013-05-22 15:57:59.800383 7f707cd62700  0 log [INF] : 9.465 scrub ok
   2013-05-22 15:59:55.024065 7f707661a700  0 -- 192.168.42.3:6803/12142 
192.168.42.5:6828/31490 pipe(0x2a689000 sd=108 :6803 s=2 pgs=200652 
   cs=73 l=0).fault with nothing to send, going to standby
   2013-05-22 16:01:45.542579 7f7022770700  0 -- 192.168.42.3:6803/12142 
192.168.42.5:6828/31490 pipe(0x2a689280 sd=99 :6803 s=0 pgs=0 cs=0 
   l=0).accept connect_seq 74 vs existing 73 state standby
   --
   2013-05-22 16:29:49.544310 7f707dd64700  0 log [INF] : 9.4eb scrub ok
   2013-05-22 16:29:53.190233 7f707dd64700  0 log [INF] : 9.4f4 scrub ok
   2013-05-22 16:29:59.478736 7f707dd64700  0 log [INF] : 8.6bb scrub ok
   2013-05-22 16:35:12.240246 7f7022770700  0 -- 192.168.42.3:6803/12142 
192.168.42.5:6828/31490 pipe(0x2a689280 sd=99 :6803 s=2 pgs=200667 
   cs=75 l=0).fault with nothing to send, going to standby
   2013-05-22 16:35:19.519019 7f707d563700  0 log [INF] : 8.700 scrub ok
   2013-05-22 16:39:15.422532 7f707dd64700  0 log [ERR] : scrub 3.1 
   b1869301/rb.0.15c26.238e1f29.0836/12d7//3 found clone without 
   head
   2013-05-22 16:40:04.995256 7f707cd62700  0 log [ERR] : scrub 3.1 
   bccad701/rb.0.15c26.238e1f29.9a00/12d7//3 found clone without 
   head
   2013-05-22 16:41:07.008717 7f707d563700  0 log [ERR] : scrub 3.1 
   8a9bec01/rb.0.15c26.238e1f29.9820/12d7//3 found clone without 
   head
   2013-05-22 16:41:42.460280 7f707c561700  0 log [ERR] : 3.1 scrub 3 
   errors
   2013-05-22 16:46:12.385678 7f7077735700  0 -- 192.168.42.3:6803/12142 
192.168.42.5:6828/31490 pipe(0x2a689c80 sd=137 :6803 s=0 pgs=0 cs=0 
   l=0).accept connect_seq 76 vs existing 75 state standby
   2013-05-22 16:58:36.079010 7f707661a700  0 -- 192.168.42.3:6803/12142 
192.168.42.3:6801/11745 pipe(0x2a689a00 sd=44 :6803 s=0 pgs=0 cs=0 
   l=0).accept connect_seq 40 vs existing 39 state standby
   2013-05-22 16:58:36.798038 7f707d563700  0 log [INF] : 9.50c scrub ok
   2013-05-22 16:58:40.104159 7f707c561700  0 log [INF] : 9.526 scrub ok
  
  
   Note : I have 8 scrub errors like that, on 4 impacted PG, and all 
   impacted objects are about the same RBD image (rb.0.15c26.238e1f29).
  
  
  
   Le mercredi 22 mai 2013 à 11:01 -0700, Samuel Just a écrit :
   Can you post your ceph.log with the period including all of these 
   errors?
   -Sam
  
   On Wed, May 22, 2013 at 5:39 AM, Dzianis Kahanovich
   maha...@bspu.unibel.by wrote:
Olivier Bonvalet пишет:
   
Le lundi 20 mai 2013 à 00:06 +0200, Olivier Bonvalet a écrit :
Le mardi 07 mai 2013 à 15:51 +0300, Dzianis Kahanovich a écrit :
I have 4 scrub errors (3 PGs - found clone without head), on 
one OSD. Not
repairing. How to repair it exclude re-creating of OSD?
   
Now it easy to clean+create OSD, but in theory - in case there 
are multiple
OSDs - it may cause data lost.
   
I have same problem : 8 objects (4 PG) with error found clone 
without
head. How can I fix that ?
since pg repair doesn't handle that kind of errors, is there a 
way to
manually fix that ? (it's a production cluster)
   
Trying to fix manually I cause 

Re: [ceph-users] scrub error: found clone without head

2013-05-23 Thread Olivier Bonvalet
No : 
pg 3.7c is active+clean+inconsistent, acting [24,13,39]
pg 3.6b is active+clean+inconsistent, acting [28,23,5]
pg 3.d is active+clean+inconsistent, acting [29,4,11]
pg 3.1 is active+clean+inconsistent, acting [28,19,5]

But I suppose that all these PGs *were* having osd.25 as primary (on the
same host), which is the (now disabled) buggy OSD.

Question: is 12d7 in the object path the snapshot id? If that's the
case, I haven't got any snapshot with this id for the
rb.0.15c26.238e1f29 image.

So, which files should I remove ?

Thanks for your help.
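
A hedged aside on the snapshot id: in the object name the snap id is printed in
hex, while rbd snap ls prints ids in decimal, so 12d7 would correspond to 4823.
Something like the following can cross-check it (pool and image names below are
placeholders for whatever maps to rb.0.15c26.238e1f29):

  rbd snap ls <pool>/<image>    # look for a snapshot with id 4823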


Le jeudi 23 mai 2013 à 15:17 -0700, Samuel Just a écrit :
 Do all of the affected PGs share osd.28 as the primary?  I think the
 only recovery is probably to manually remove the orphaned clones.
 -Sam
 
 On Thu, May 23, 2013 at 5:00 AM, Olivier Bonvalet ceph.l...@daevel.fr wrote:
  Not yet. I keep it for now.
 
  Le mercredi 22 mai 2013 à 15:50 -0700, Samuel Just a écrit :
  rb.0.15c26.238e1f29
 
  Has that rbd volume been removed?
  -Sam
 
  On Wed, May 22, 2013 at 12:18 PM, Olivier Bonvalet ceph.l...@daevel.fr 
  wrote:
   0.61-11-g3b94f03 (0.61-1.1), but the bug occured with bobtail.
  
  
   Le mercredi 22 mai 2013 à 12:00 -0700, Samuel Just a écrit :
   What version are you running?
   -Sam
  
   On Wed, May 22, 2013 at 11:25 AM, Olivier Bonvalet 
   ceph.l...@daevel.fr wrote:
Is it enough ?
   
# tail -n500 -f /var/log/ceph/osd.28.log | grep -A5 -B5 'found clone 
without head'
2013-05-22 15:43:09.308352 7f707dd64700  0 log [INF] : 9.105 scrub ok
2013-05-22 15:44:21.054893 7f707dd64700  0 log [INF] : 9.451 scrub ok
2013-05-22 15:44:52.898784 7f707cd62700  0 log [INF] : 9.784 scrub ok
2013-05-22 15:47:43.148515 7f707cd62700  0 log [INF] : 9.3c3 scrub ok
2013-05-22 15:47:45.717085 7f707dd64700  0 log [INF] : 9.3d0 scrub ok
2013-05-22 15:52:14.573815 7f707dd64700  0 log [ERR] : scrub 3.6b 
ade3c16b/rb.0.15c26.238e1f29.9221/12d7//3 found clone without 
head
2013-05-22 15:55:07.230114 7f707d563700  0 log [ERR] : scrub 3.6b 
261cc0eb/rb.0.15c26.238e1f29.3671/12d7//3 found clone without 
head
2013-05-22 15:56:56.456242 7f707d563700  0 log [ERR] : scrub 3.6b 
b10deaeb/rb.0.15c26.238e1f29.86a2/12d7//3 found clone without 
head
2013-05-22 15:57:51.667085 7f707dd64700  0 log [ERR] : 3.6b scrub 3 
errors
2013-05-22 15:57:55.241224 7f707dd64700  0 log [INF] : 9.450 scrub ok
2013-05-22 15:57:59.800383 7f707cd62700  0 log [INF] : 9.465 scrub ok
2013-05-22 15:59:55.024065 7f707661a700  0 -- 192.168.42.3:6803/12142 
 192.168.42.5:6828/31490 pipe(0x2a689000 sd=108 :6803 s=2 
pgs=200652 cs=73 l=0).fault with nothing to send, going to standby
2013-05-22 16:01:45.542579 7f7022770700  0 -- 192.168.42.3:6803/12142 
 192.168.42.5:6828/31490 pipe(0x2a689280 sd=99 :6803 s=0 pgs=0 cs=0 
l=0).accept connect_seq 74 vs existing 73 state standby
--
2013-05-22 16:29:49.544310 7f707dd64700  0 log [INF] : 9.4eb scrub ok
2013-05-22 16:29:53.190233 7f707dd64700  0 log [INF] : 9.4f4 scrub ok
2013-05-22 16:29:59.478736 7f707dd64700  0 log [INF] : 8.6bb scrub ok
2013-05-22 16:35:12.240246 7f7022770700  0 -- 192.168.42.3:6803/12142 
 192.168.42.5:6828/31490 pipe(0x2a689280 sd=99 :6803 s=2 pgs=200667 
cs=75 l=0).fault with nothing to send, going to standby
2013-05-22 16:35:19.519019 7f707d563700  0 log [INF] : 8.700 scrub ok
2013-05-22 16:39:15.422532 7f707dd64700  0 log [ERR] : scrub 3.1 
b1869301/rb.0.15c26.238e1f29.0836/12d7//3 found clone without 
head
2013-05-22 16:40:04.995256 7f707cd62700  0 log [ERR] : scrub 3.1 
bccad701/rb.0.15c26.238e1f29.9a00/12d7//3 found clone without 
head
2013-05-22 16:41:07.008717 7f707d563700  0 log [ERR] : scrub 3.1 
8a9bec01/rb.0.15c26.238e1f29.9820/12d7//3 found clone without 
head
2013-05-22 16:41:42.460280 7f707c561700  0 log [ERR] : 3.1 scrub 3 
errors
2013-05-22 16:46:12.385678 7f7077735700  0 -- 192.168.42.3:6803/12142 
 192.168.42.5:6828/31490 pipe(0x2a689c80 sd=137 :6803 s=0 pgs=0 
cs=0 l=0).accept connect_seq 76 vs existing 75 state standby
2013-05-22 16:58:36.079010 7f707661a700  0 -- 192.168.42.3:6803/12142 
 192.168.42.3:6801/11745 pipe(0x2a689a00 sd=44 :6803 s=0 pgs=0 cs=0 
l=0).accept connect_seq 40 vs existing 39 state standby
2013-05-22 16:58:36.798038 7f707d563700  0 log [INF] : 9.50c scrub ok
2013-05-22 16:58:40.104159 7f707c561700  0 log [INF] : 9.526 scrub ok
   
   
Note : I have 8 scrub errors like that, on 4 impacted PG, and all 
impacted objects are about the same RBD image (rb.0.15c26.238e1f29).
   
   
   
Le mercredi 22 mai 2013 à 11:01 -0700, Samuel Just a écrit :
Can you post your ceph.log with the period including all of these 
errors?
-Sam
   
On Wed, May 22, 2013 at 5:39 AM, Dzianis Kahanovich
maha...@bspu.unibel.by wrote:
 Olivier 

Re: [ceph-users] scrub error: found clone without head

2013-05-23 Thread Samuel Just
Can you send the filenames in the pg directories for those 4 pgs?
-Sam
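
A hedged sketch of how to gather those filenames, assuming the default
filestore layout under /var/lib/ceph and adjusting the osd id and PG directory
for each of the four PGs:

  find /var/lib/ceph/osd/ceph-28/current/3.6b_head \
    -name '*rb.0.15c26.238e1f29*' -ls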

On Thu, May 23, 2013 at 3:27 PM, Olivier Bonvalet ceph.l...@daevel.fr wrote:
 No :
 pg 3.7c is active+clean+inconsistent, acting [24,13,39]
 pg 3.6b is active+clean+inconsistent, acting [28,23,5]
 pg 3.d is active+clean+inconsistent, acting [29,4,11]
 pg 3.1 is active+clean+inconsistent, acting [28,19,5]

 But I suppose that all PG *was* having the osd.25 as primary (on the
 same host), which is (disabled) buggy OSD.

 Question : 12d7 in object path is the snapshot id, right ? If it's the
 case, I haven't got any snapshot with this id for the
 rb.0.15c26.238e1f29 image.

 So, which files should I remove ?

 Thanks for your help.


 Le jeudi 23 mai 2013 à 15:17 -0700, Samuel Just a écrit :
 Do all of the affected PGs share osd.28 as the primary?  I think the
 only recovery is probably to manually remove the orphaned clones.
 -Sam

 On Thu, May 23, 2013 at 5:00 AM, Olivier Bonvalet ceph.l...@daevel.fr 
 wrote:
  Not yet. I keep it for now.
 
  Le mercredi 22 mai 2013 à 15:50 -0700, Samuel Just a écrit :
  rb.0.15c26.238e1f29
 
  Has that rbd volume been removed?
  -Sam
 
  On Wed, May 22, 2013 at 12:18 PM, Olivier Bonvalet ceph.l...@daevel.fr 
  wrote:
   0.61-11-g3b94f03 (0.61-1.1), but the bug occured with bobtail.
  
  
   Le mercredi 22 mai 2013 à 12:00 -0700, Samuel Just a écrit :
   What version are you running?
   -Sam
  
   On Wed, May 22, 2013 at 11:25 AM, Olivier Bonvalet 
   ceph.l...@daevel.fr wrote:
Is it enough ?
   
# tail -n500 -f /var/log/ceph/osd.28.log | grep -A5 -B5 'found clone 
without head'
2013-05-22 15:43:09.308352 7f707dd64700  0 log [INF] : 9.105 scrub ok
2013-05-22 15:44:21.054893 7f707dd64700  0 log [INF] : 9.451 scrub ok
2013-05-22 15:44:52.898784 7f707cd62700  0 log [INF] : 9.784 scrub ok
2013-05-22 15:47:43.148515 7f707cd62700  0 log [INF] : 9.3c3 scrub ok
2013-05-22 15:47:45.717085 7f707dd64700  0 log [INF] : 9.3d0 scrub ok
2013-05-22 15:52:14.573815 7f707dd64700  0 log [ERR] : scrub 3.6b 
ade3c16b/rb.0.15c26.238e1f29.9221/12d7//3 found clone 
without head
2013-05-22 15:55:07.230114 7f707d563700  0 log [ERR] : scrub 3.6b 
261cc0eb/rb.0.15c26.238e1f29.3671/12d7//3 found clone 
without head
2013-05-22 15:56:56.456242 7f707d563700  0 log [ERR] : scrub 3.6b 
b10deaeb/rb.0.15c26.238e1f29.86a2/12d7//3 found clone 
without head
2013-05-22 15:57:51.667085 7f707dd64700  0 log [ERR] : 3.6b scrub 3 
errors
2013-05-22 15:57:55.241224 7f707dd64700  0 log [INF] : 9.450 scrub ok
2013-05-22 15:57:59.800383 7f707cd62700  0 log [INF] : 9.465 scrub ok
2013-05-22 15:59:55.024065 7f707661a700  0 -- 
192.168.42.3:6803/12142  192.168.42.5:6828/31490 pipe(0x2a689000 
sd=108 :6803 s=2 pgs=200652 cs=73 l=0).fault with nothing to send, 
going to standby
2013-05-22 16:01:45.542579 7f7022770700  0 -- 
192.168.42.3:6803/12142  192.168.42.5:6828/31490 pipe(0x2a689280 
sd=99 :6803 s=0 pgs=0 cs=0 l=0).accept connect_seq 74 vs existing 73 
state standby
--
2013-05-22 16:29:49.544310 7f707dd64700  0 log [INF] : 9.4eb scrub ok
2013-05-22 16:29:53.190233 7f707dd64700  0 log [INF] : 9.4f4 scrub ok
2013-05-22 16:29:59.478736 7f707dd64700  0 log [INF] : 8.6bb scrub ok
2013-05-22 16:35:12.240246 7f7022770700  0 -- 
192.168.42.3:6803/12142  192.168.42.5:6828/31490 pipe(0x2a689280 
sd=99 :6803 s=2 pgs=200667 cs=75 l=0).fault with nothing to send, 
going to standby
2013-05-22 16:35:19.519019 7f707d563700  0 log [INF] : 8.700 scrub ok
2013-05-22 16:39:15.422532 7f707dd64700  0 log [ERR] : scrub 3.1 
b1869301/rb.0.15c26.238e1f29.0836/12d7//3 found clone 
without head
2013-05-22 16:40:04.995256 7f707cd62700  0 log [ERR] : scrub 3.1 
bccad701/rb.0.15c26.238e1f29.9a00/12d7//3 found clone 
without head
2013-05-22 16:41:07.008717 7f707d563700  0 log [ERR] : scrub 3.1 
8a9bec01/rb.0.15c26.238e1f29.9820/12d7//3 found clone 
without head
2013-05-22 16:41:42.460280 7f707c561700  0 log [ERR] : 3.1 scrub 3 
errors
2013-05-22 16:46:12.385678 7f7077735700  0 -- 
192.168.42.3:6803/12142  192.168.42.5:6828/31490 pipe(0x2a689c80 
sd=137 :6803 s=0 pgs=0 cs=0 l=0).accept connect_seq 76 vs existing 
75 state standby
2013-05-22 16:58:36.079010 7f707661a700  0 -- 
192.168.42.3:6803/12142  192.168.42.3:6801/11745 pipe(0x2a689a00 
sd=44 :6803 s=0 pgs=0 cs=0 l=0).accept connect_seq 40 vs existing 39 
state standby
2013-05-22 16:58:36.798038 7f707d563700  0 log [INF] : 9.50c scrub ok
2013-05-22 16:58:40.104159 7f707c561700  0 log [INF] : 9.526 scrub ok
   
   
Note : I have 8 scrub errors like that, on 4 impacted PG, and all 
impacted objects are about the same RBD image (rb.0.15c26.238e1f29).
   
   
   
Le mercredi 22 mai 2013 à 11:01 -0700, Samuel Just a écrit :
Can you post 

Re: [ceph-users] mkcephfs

2013-05-23 Thread Dewan Shamsul Alam
Hi,

The previous log is based on cuttlefish. This one is based on bobtail. I'm
not using cephx; maybe that's what's causing the problem?

temp dir is /tmp/mkcephfs.xf5TsinRsL
preparing monmap in /tmp/mkcephfs.xf5TsinRsL/monmap
/usr/bin/monmaptool --create --clobber --add a 192.168.128.10:6789 --add b
192.168.128.11:6789 --add c 192.168.128.12:6789 --print
/tmp/mkcephfs.xf5TsinRsL/monmap
/usr/bin/monmaptool: monmap file /tmp/mkcephfs.xf5TsinRsL/monmap
/usr/bin/monmaptool: generated fsid 1168e717-5db5-488c-a1f7-0b61e7f19138
epoch 0
fsid 1168e717-5db5-488c-a1f7-0b61e7f19138
last_changed 2013-05-24 09:51:15.012839
created 2013-05-24 09:51:15.012839
0: 192.168.128.10:6789/0 mon.a
1: 192.168.128.11:6789/0 mon.b
2: 192.168.128.12:6789/0 mon.c
/usr/bin/monmaptool: writing epoch 0 to /tmp/mkcephfs.xf5TsinRsL/monmap (3
monitors)
=== osd.0 ===
2013-05-24 09:51:16.247443 7f19b2ce5780 -1
filestore(/var/lib/ceph/osd/ceph-0) could not find
23c2fcde/osd_superblock/0//-1 in index: (2) No such file or directory
2013-05-24 09:51:16.604586 7f19b2ce5780 -1 created object store
/var/lib/ceph/osd/ceph-0 journal /var/lib/ceph/osd/ceph-0/journal for osd.0
fsid 1168e717-5db5-488c-a1f7-0b61e7f19138
2013-05-24 09:51:16.604667 7f19b2ce5780 -1 auth: error reading file:
/var/lib/ceph/osd/ceph-0/keyring: can't open
/var/lib/ceph/osd/ceph-0/keyring: (2) No such file or directory
2013-05-24 09:51:16.604850 7f19b2ce5780 -1 created new key in keyring
/var/lib/ceph/osd/ceph-0/keyring
=== osd.1 ===
pushing conf and monmap to
ceph1:/tmp/mkfs.ceph.f0a8d758e9f1a3f32160f67a12149281
2013-05-24 09:50:46.405722 7fd7fa8c5780 -1
filestore(/var/lib/ceph/osd/ceph-1) could not find
23c2fcde/osd_superblock/0//-1 in index: (2) No such file or directory
2013-05-24 09:50:46.750885 7fd7fa8c5780 -1 created object store
/var/lib/ceph/osd/ceph-1 journal /var/lib/ceph/osd/ceph-1/journal for osd.1
fsid 1168e717-5db5-488c-a1f7-0b61e7f19138
2013-05-24 09:50:46.750945 7fd7fa8c5780 -1 auth: error reading file:
/var/lib/ceph/osd/ceph-1/keyring: can't open
/var/lib/ceph/osd/ceph-1/keyring: (2) No such file or directory
2013-05-24 09:50:46.751120 7fd7fa8c5780 -1 created new key in keyring
/var/lib/ceph/osd/ceph-1/keyring
collecting osd.1 key
=== osd.2 ===
pushing conf and monmap to
ceph2:/tmp/mkfs.ceph.e07be4351777982bb28d1cc7ab52e01b
2013-05-24 09:51:33.623922 7fa231b87780 -1
filestore(/var/lib/ceph/osd/ceph-2) could not find
23c2fcde/osd_superblock/0//-1 in index: (2) No such file or directory
2013-05-24 09:51:33.859703 7fa231b87780 -1 created object store
/var/lib/ceph/osd/ceph-2 journal /var/lib/ceph/osd/ceph-2/journal for osd.2
fsid 1168e717-5db5-488c-a1f7-0b61e7f19138
2013-05-24 09:51:33.859772 7fa231b87780 -1 auth: error reading file:
/var/lib/ceph/osd/ceph-2/keyring: can't open
/var/lib/ceph/osd/ceph-2/keyring: (2) No such file or directory
2013-05-24 09:51:33.859930 7fa231b87780 -1 created new key in keyring
/var/lib/ceph/osd/ceph-2/keyring
collecting osd.2 key
=== mds.a ===
creating private key for mds.a keyring /var/lib/ceph/mds/ceph-a/keyring
creating /var/lib/ceph/mds/ceph-a/keyring
=== mds.b ===
pushing conf and monmap to
ceph1:/tmp/mkfs.ceph.5031963c92bcc7e98cd4422ee99d1220
creating private key for mds.b keyring /var/lib/ceph/mds/ceph-b/keyring
creating /var/lib/ceph/mds/ceph-b/keyring
collecting mds.b key
=== mds.c ===
pushing conf and monmap to
ceph2:/tmp/mkfs.ceph.a7475d8f01e340fc9e410fd60a3d8a80
creating private key for mds.c keyring /var/lib/ceph/mds/ceph-c/keyring
creating /var/lib/ceph/mds/ceph-c/keyring
collecting mds.c key
Building generic osdmap from /tmp/mkcephfs.xf5TsinRsL/conf
/usr/bin/osdmaptool: osdmap file '/tmp/mkcephfs.xf5TsinRsL/osdmap'
/usr/bin/osdmaptool: writing epoch 1 to /tmp/mkcephfs.xf5TsinRsL/osdmap
Generating admin key at /tmp/mkcephfs.xf5TsinRsL/keyring.admin
creating /tmp/mkcephfs.xf5TsinRsL/keyring.admin
Building initial monitor keyring
added entity mds.a auth auth(auid = 18446744073709551615
key=AQC9455RSHUiFhAAcxXX2jmAnPk69KGQ7rUczA== with 0 caps)
added entity mds.b auth auth(auid = 18446744073709551615
key=AQCe455RcAEhCRAARPQhkIuPvYIXDElerq+zJg== with 0 caps)
added entity mds.c auth auth(auid = 18446744073709551615
key=AQDM455R0KRfEhAAPByEb/CBqaUK68tssLH/ug== with 0 caps)
added entity osd.0 auth auth(auid = 18446744073709551615
key=AQC0455RWKsKJBAANgtn7Xq8g3u1CPXekMCy7g== with 0 caps)
added entity osd.1 auth auth(auid = 18446744073709551615
key=AQCW455RQJ7CLBAAQzY2cAMjFwHLd36s1w8m6g== with 0 caps)
added entity osd.2 auth auth(auid = 18446744073709551615
key=AQDF455RoDM/MxAAaC/sEyCE4xFpeNHCqkyZSA== with 0 caps)
=== mon.a ===
/usr/bin/ceph-mon: created monfs at /var/lib/ceph/mon/ceph-a for mon.a
=== mon.b ===
pushing everything to ceph1
/usr/bin/ceph-mon: created monfs at /var/lib/ceph/mon/ceph-b for mon.b
=== mon.c ===
pushing everything to ceph2
/usr/bin/ceph-mon: created monfs at /var/lib/ceph/mon/ceph-c for mon.c
placing client.admin keyring in /etc/ceph/keyring
2013-05-24 09:51:36.973708 7f774d118700  0 -- :/21213 

Re: [ceph-users] ceph-deploy

2013-05-23 Thread Dewan Shamsul Alam
I just found that

#ceph-deploy gatherkeys ceph0 ceph1 ceph2

works only if I have bobtail. cuttlefish can't find
ceph.client.admin.keyring

and then when I try this on bobtail, it says,

root@cephdeploy:~/12.04# ceph-deploy osd create ceph0:/dev/sda3
ceph1:/dev/sda3 ceph2:/dev/sda3
ceph-disk: Error: Device is mounted: /dev/sda3
Traceback (most recent call last):
  File /usr/bin/ceph-deploy, line 22, in module
main()
  File /usr/lib/pymodules/python2.7/ceph_deploy/cli.py, line 112, in main
return args.func(args)
  File /usr/lib/pymodules/python2.7/ceph_deploy/osd.py, line 293, in osd
prepare(args, cfg, activate_prepared_disk=True)
  File /usr/lib/pymodules/python2.7/ceph_deploy/osd.py, line 177, in
prepare
dmcrypt_dir=args.dmcrypt_key_dir,
  File /usr/lib/python2.7/dist-packages/pushy/protocol/proxy.py, line
255, in lambda
(conn.operator(type_, self, args, kwargs))
  File /usr/lib/python2.7/dist-packages/pushy/protocol/connection.py,
line 66, in operator
return self.send_request(type_, (object, args, kwargs))
  File /usr/lib/python2.7/dist-packages/pushy/protocol/baseconnection.py,
line 323, in send_request
return self.__handle(m)
  File /usr/lib/python2.7/dist-packages/pushy/protocol/baseconnection.py,
line 639, in __handle
raise e
pushy.protocol.proxy.ExceptionProxy: Command '['ceph-disk-prepare', '--',
'/dev/sda3']' returned non-zero exit status 1
root@cephdeploy:~/12.04#
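
A hedged note on that last error: ceph-disk-prepare refuses to touch a
partition that is already mounted, so unmounting it on each node first (and
removing any fstab entry that remounts it) may get past this particular
failure:

  umount /dev/sda3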




On Thu, May 23, 2013 at 10:49 PM, Dewan Shamsul Alam 
dewan.sham...@gmail.com wrote:

 Hi,

 I tried ceph-deploy all day. Found that it has a python-setuptools as
 dependency. I knew about python-pushy. But is there any other dependency
 that I'm missing?

 The problem I'm getting are as follows:

 #ceph-deploy gatherkeys ceph0 ceph1 ceph2
 returns the following error,
 Unable to find /etc/ceph/ceph.client.admin.keyring on ['ceph0', 'ceph1',
 'ceph2']

 Once I got passed this, I don't know why it works sometimes. I have been
 following the exact steps as mentioned in the blog.

 Then when I try to do

 ceph-deploy osd create ceph0:/dev/sda3 ceph1:/dev/sda3 ceph2:/dev/sda3

 It gets stuck.

 I'm using Ubuntu 13.04 for ceph-deploy and 12.04 for ceph nodes. I just
 need to get the cuttlefish working and willing to change the OS if it is
 required. Please help. :)

 Best Regards,
 Dewan Shamsul Alam

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com