Re: [ceph-users] Ceph Supermicro hardware recommendation

2015-02-14 Thread Scott Laird
I ended up destroying the EC pool and starting over.  It was killing all of
my OSD machines, and I couldn't keep anything working right with EC in
use.  So, no core dumps and I'm not in a place to reproduce easily
anymore.  This was with Giant on Ubuntu 14.04.

On Thu Feb 12 2015 at 7:07:38 AM Mark Nelson mnel...@redhat.com wrote:

 On 02/08/2015 10:41 PM, Scott Laird wrote:
  Does anyone have a good recommendation for per-OSD memory for EC?  My EC
  test blew up in my face when my OSDs suddenly spiked to 10+ GB per OSD
  process as soon as any reconstruction was needed.  Which (of course)
  caused OSDs to OOM, which meant more reconstruction, which fairly
  immediately led to a dead cluster.  This was with Giant.  Is this
 typical?

 Doh, that shouldn't happen.  Can you reproduce it?  Would be especially
 nice if we could get a core dump or if you could make it happen under
 valgrind.  If the CPUs are spinning, even a perf report might prove useful.

 
  On Fri Feb 06 2015 at 2:41:50 AM Mohamed Pakkeer mdfakk...@gmail.com wrote:
 
  Hi all,
 
  We are building an EC cluster with a cache tier for CephFS. We are
  planning to use the following 1U chassis along with Intel SSD DC
  S3700 drives for the cache tier. It has 10 x 2.5" slots. Could you
  recommend a suitable Intel processor and amount of RAM for 10 SSDs?
 
  http://www.supermicro.com/products/system/1U/1028/SYS-1028R-WTRT.cfm
 
 
  Regards
 
  K.Mohamed Pakkeer
 
 
 
  On Fri, Feb 6, 2015 at 2:57 PM, Stephan Seitz
  s.se...@heinlein-support.de wrote:
 
  Hi,
 
  On Tuesday, 03.02.2015 at 15:16 +, Colombo Marco wrote:
   Hi all,
   I have to build a new Ceph storage cluster. After reading the hardware
   recommendations and some mail from this mailing list, I would like to
   buy these servers:
 
  just FYI:
 
  SuperMicro already targets Ceph with a product line:
  http://www.supermicro.com/solutions/datasheet_Ceph.pdf
  http://www.supermicro.com/solutions/storage_ceph.cfm
 
 
 
  regards,
 
 
  Stephan Seitz
 
  --
 
  Heinlein Support GmbH
  Schwedter Str. 8/9b, 10119 Berlin
 
  http://www.heinlein-support.de
 
  Tel: 030 / 405051-44
  Fax: 030 / 405051-19
 
  Mandatory disclosures per §35a GmbHG: HRB 93818 B / Amtsgericht
  Berlin-Charlottenburg,
  Managing director: Peer Heinlein -- Registered office: Berlin
 
 
  ___
  ceph-users mailing list
  ceph-users@lists.ceph.com
  http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 
 
 
 
  --
  Thanks & Regards
  K.Mohamed Pakkeer
  Mobile- 0091-8754410114
 


Re: [ceph-users] Having problem to start Radosgw

2015-02-14 Thread B L
Hello Yehuda,

Thanks for your response!

This is my RGW configuration:
https://gist.github.com/anonymous/c0f62783feac88e069c7
and this is the Tengine configuration:
https://gist.github.com/anonymous/90b77c168ed0606db03d

Please let me know if you need anything else.

Best!

 On Feb 14, 2015, at 6:22 PM, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote:
 
 
 
 - Original Message -
 From: B L super.itera...@gmail.com
 To: ceph-users@lists.ceph.com
 Sent: Friday, February 13, 2015 11:55:22 PM
 Subject: [ceph-users] Having problem to start Radosgw
 
 Hi all,
 
 I’m having a problem starting radosgw; it gives me an error that I can’t diagnose:
 
 $ radosgw -c ceph.conf -d
 
 2015-02-14 07:46:58.435802 7f9d739557c0 0 ceph version 0.80.7
 (6c0127fcb58008793d3c8b62d925bc91963672a3), process radosgw, pid 27609
 2015-02-14 07:46:58.437284 7f9d739557c0 -1 asok(0x7f9d74da80a0)
 AdminSocketConfigObs::init: failed: AdminSocket::bind_and_listen: failed to
 bind the UNIX domain socket to '/var/run/ceph/ceph-client.admin.asok': (17)
 File exists
 2015-02-14 07:46:58.499004 7f9d739557c0 0 framework: fastcgi
 2015-02-14 07:46:58.499016 7f9d739557c0 0 starting handler: fastcgi
 2015-02-14 07:46:58.501160 7f9d477fe700 0 ERROR: FCGX_Accept_r returned -9
 2015-02-14 07:46:58.594271 7f9d648ab700 -1 failed to list objects
 pool_iterate returned r=-2
 2015-02-14 07:46:58.594276 7f9d648ab700 0 ERROR: lists_keys_next(): ret=-2
 2015-02-14 07:46:58.594278 7f9d648ab700 0 ERROR: sync_all_users() returned
 ret=-2
 ^C2015-02-14 07:47:29.119185 7f9d47fff700 1 handle_sigterm
 2015-02-14 07:47:29.119214 7f9d47fff700 1 handle_sigterm set alarm for 120
 2015-02-14 07:47:29.119222 7f9d739557c0 -1 shutting down
 2015-02-14 07:47:29.142726 7f9d739557c0 1 final shutdown
 
 
 since it complains that this file exists:
 /var/run/ceph/ceph-client.admin.asok, I removed it, but now, I get this
 error:
 
 $ radosgw -c ceph.conf -d
 
 2015-02-14 07:47:55.140276 7f31cc0637c0 0 ceph version 0.80.7
 (6c0127fcb58008793d3c8b62d925bc91963672a3), process radosgw, pid 27741
 2015-02-14 07:47:55.201561 7f31cc0637c0 0 framework: fastcgi
 2015-02-14 07:47:55.201567 7f31cc0637c0 0 starting handler: fastcgi
 2015-02-14 07:47:55.203443 7f319effd700 0 ERROR: FCGX_Accept_r returned -9
 
 Error 9 is EBADF (bad file number). Looks like there's an issue with the 
 socket created for the fastcgi communication. How did you configure it?
 
 Yehuda
 
 2015-02-14 07:47:55.304048 7f319700 -1 failed to list objects
 pool_iterate returned r=-2
 2015-02-14 07:47:55.304054 7f319700 0 ERROR: lists_keys_next(): ret=-2
 2015-02-14 07:47:55.304060 7f319700 0 ERROR: sync_all_users() returned
 ret=-2
 
 
 Can somebody help me figure out where to start fixing this?
 
 Thanks!
 


Re: [ceph-users] CRUSHMAP for chassis balance

2015-02-14 Thread Luke Kao
Hi Gregory,
Thanks for the direction. I ended up with 3 different rules in one ruleset, one
for each range of replica sizes. I tested them: there are no bad mappings, and
hosts / OSDs are correctly balanced between the 2 chassis.

Not sure if it can be optimized, but I am happy with the current result:
rule rule_rep2 {
    ruleset 0
    type replicated
    min_size 2
    max_size 2
    step take chassis1
    step chooseleaf firstn 1 type host
    step emit
    step take chassis2
    step chooseleaf firstn 1 type host
    step emit
}
rule rule_rep34 {
    ruleset 0
    type replicated
    min_size 3
    max_size 4
    step take default
    step choose firstn 2 type chassis
    step chooseleaf firstn 2 type host
    step emit
}
rule rule_rep56 {
    ruleset 0
    type replicated
    min_size 5
    max_size 6
    step take default
    step choose firstn 3 type chassis
    step chooseleaf firstn 3 type host
    step emit
}
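
As a rough illustration of why rule_rep34 spreads replicas over both chassis, the following Python sketch mimics the shape of that rule: pick chassis, pick hosts inside each chassis, take one OSD per host, then truncate to the pool size. It is not the real CRUSH algorithm. The seeded random.Random only stands in for CRUSH's hashing, and the two-OSDs-per-host layout, the OSD ids, and the names TOPOLOGY and map_pg are assumptions made for this example.

```python
import random

# Hypothetical topology mirroring the thread: two chassis with three hosts
# each. The OSD ids (two per host) are invented for this sketch.
TOPOLOGY = {
    "chassis1": {"host1": [0, 1], "host2": [2, 3], "host3": [4, 5]},
    "chassis2": {"host4": [6, 7], "host5": [8, 9], "host6": [10, 11]},
}


def map_pg(pg_seed, n_chassis, n_hosts, pool_size):
    """Imitate 'choose firstn N type chassis' + 'chooseleaf firstn M type host'.

    Returns (chassis, host, osd) tuples, truncated to pool_size the way extra
    IDs returned by a rule are dropped.
    """
    rng = random.Random(pg_seed)  # stand-in for CRUSH hashing, NOT the real thing
    picked = []
    for chassis in rng.sample(sorted(TOPOLOGY), min(n_chassis, len(TOPOLOGY))):
        hosts = TOPOLOGY[chassis]
        for host in rng.sample(sorted(hosts), min(n_hosts, len(hosts))):
            picked.append((chassis, host, rng.choice(hosts[host])))
    return picked[:pool_size]


for size in (3, 4):
    print(size, map_pg(pg_seed=42, n_chassis=2, n_hosts=2, pool_size=size))
```

In this model a pool size of 3 or 4 always touches both chassis, while truncating the same selection to 2 would keep both replicas in one chassis, which is presumably why size 2 gets its own rule_rep2.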


Luke

From: Gregory Farnum [mailto:g...@gregs42.com]
Sent: Friday, February 13, 2015 11:01 PM
To: Luke Kao; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] CRUSHMAP for chassis balance

With sufficiently new CRUSH versions (all the latest point releases on LTS?) I
think you can simply have the rule return extra IDs, which are dropped if they
exceed the number required. So you can choose two chassis, then have those both
choose two leaf OSDs, and return those 4 from the rule.
-Greg
On Fri, Feb 13, 2015 at 6:13 AM Luke Kao luke@mycom-osi.com wrote:
Dear cephers,
Currently I am working on a crushmap to make sure that at least one copy goes
to a different chassis.
Say chassis1 has host1,host2,host3, and chassis2 has host4,host5,host6.

With replication=2 it’s not a problem; I can use the following steps in the rule:
step take chassis1
step chooseleaf firstn 1 type host
step emit
step take chassis2
step chooseleaf firstn 1 type host
step emit

But for replication=3, I tried
step take chassis1
step chooseleaf firstn 1 type host
step emit
step take chassis2
step chooseleaf firstn 1 type host
step emit
step take default
step chooseleaf firstn 1 type host
step emit

In the end, the 3rd OSD returned in the rule test is always a duplicate of the
1st or the 2nd.

Any ideas, or a direction for moving forward?
Thanks in advance

BR,
Luke
MYCOM-OSI




This electronic message contains information from Mycom which may be privileged 
or confidential. The information is intended to be for the use of the 
individual(s) or entity named above. If you are not the intended recipient, be 
aware that any disclosure, copying, distribution or any other use of the 
contents of this information is prohibited. If you have received this 
electronic message in error, please notify us by post or telephone (to the 
numbers or correspondence address above) or by email (at the email address 
above) immediately.





Re: [ceph-users] Having problem to start Radosgw

2015-02-14 Thread Yehuda Sadeh-Weinraub


- Original Message -
 From: B L super.itera...@gmail.com
 To: ceph-users@lists.ceph.com
 Sent: Friday, February 13, 2015 11:55:22 PM
 Subject: [ceph-users] Having problem to start Radosgw
 
 Hi all,
 
 I’m having a problem starting radosgw; it gives me an error that I can’t diagnose:
 
 $ radosgw -c ceph.conf -d
 
 2015-02-14 07:46:58.435802 7f9d739557c0 0 ceph version 0.80.7
 (6c0127fcb58008793d3c8b62d925bc91963672a3), process radosgw, pid 27609
 2015-02-14 07:46:58.437284 7f9d739557c0 -1 asok(0x7f9d74da80a0)
 AdminSocketConfigObs::init: failed: AdminSocket::bind_and_listen: failed to
 bind the UNIX domain socket to '/var/run/ceph/ceph-client.admin.asok': (17)
 File exists
 2015-02-14 07:46:58.499004 7f9d739557c0 0 framework: fastcgi
 2015-02-14 07:46:58.499016 7f9d739557c0 0 starting handler: fastcgi
 2015-02-14 07:46:58.501160 7f9d477fe700 0 ERROR: FCGX_Accept_r returned -9
 2015-02-14 07:46:58.594271 7f9d648ab700 -1 failed to list objects
 pool_iterate returned r=-2
 2015-02-14 07:46:58.594276 7f9d648ab700 0 ERROR: lists_keys_next(): ret=-2
 2015-02-14 07:46:58.594278 7f9d648ab700 0 ERROR: sync_all_users() returned
 ret=-2
 ^C2015-02-14 07:47:29.119185 7f9d47fff700 1 handle_sigterm
 2015-02-14 07:47:29.119214 7f9d47fff700 1 handle_sigterm set alarm for 120
 2015-02-14 07:47:29.119222 7f9d739557c0 -1 shutting down
 2015-02-14 07:47:29.142726 7f9d739557c0 1 final shutdown
 
 
 since it complains that this file exists:
 /var/run/ceph/ceph-client.admin.asok, I removed it, but now, I get this
 error:
 
 $ radosgw -c ceph.conf -d
 
 2015-02-14 07:47:55.140276 7f31cc0637c0 0 ceph version 0.80.7
 (6c0127fcb58008793d3c8b62d925bc91963672a3), process radosgw, pid 27741
 2015-02-14 07:47:55.201561 7f31cc0637c0 0 framework: fastcgi
 2015-02-14 07:47:55.201567 7f31cc0637c0 0 starting handler: fastcgi
 2015-02-14 07:47:55.203443 7f319effd700 0 ERROR: FCGX_Accept_r returned -9

Error 9 is EBADF (bad file number). Looks like there's an issue with the socket 
created for the fastcgi communication. How did you configure it?

Yehuda
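
The numeric codes in the log above can be looked up the same way as the EBADF value. This is a small Python sketch, assuming the log values are negated errno numbers and that the lookup runs on Linux, where 9, 2 and 17 have these meanings. The helper name describe is made up for the example.

```python
import errno
import os


def describe(ret):
    """Translate a (possibly negated) errno value into (symbolic name, message)."""
    code = abs(ret)
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


# -9 from FCGX_Accept_r, -2 from pool_iterate / sync_all_users,
# 17 from the admin socket bind failure.
for ret in (-9, -2, 17):
    name, message = describe(ret)
    print(f"{ret:>3}: {name} ({message})")
```

Read this way, ret=-2 is ENOENT (something that was looked up does not exist) and (17) is EEXIST, which matches the "File exists" text printed for the admin socket.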

 2015-02-14 07:47:55.304048 7f319700 -1 failed to list objects
 pool_iterate returned r=-2
 2015-02-14 07:47:55.304054 7f319700 0 ERROR: lists_keys_next(): ret=-2
 2015-02-14 07:47:55.304060 7f319700 0 ERROR: sync_all_users() returned
 ret=-2
 
 
 Can somebody help me figure out where to start fixing this?
 
 Thanks!
 


Re: [ceph-users] Having problem to start Radosgw

2015-02-14 Thread B L
Shall I run it like this:

sudo radosgw -c ceph.conf -d strace -F -T -tt -o/tmp/strace.out radosgw -f


 On Feb 14, 2015, at 6:55 PM, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote:
 
 strace -F -T -tt -o/tmp/strace.out radosgw -f
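
strace wraps the program it traces: strace's own options come first, followed by the traced command and that command's arguments. The Python sketch below only assembles such a command line from the flags quoted in this thread. The helper name strace_command is hypothetical, and nothing is executed.

```python
import shlex


def strace_command(program_argv, trace_file="/tmp/strace.out"):
    """Build 'sudo strace <strace options> <program> <program args>'."""
    strace_argv = ["strace", "-F", "-T", "-tt", f"-o{trace_file}"]
    # Everything after the strace options is the traced command line.
    return ["sudo", *strace_argv, *program_argv]


cmd = strace_command(["radosgw", "-c", "ceph.conf", "-f"])
print(shlex.join(cmd))
```

Whatever parameters radosgw is normally started with belong in the program_argv list, after the strace options rather than before them.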



Re: [ceph-users] Having problem to start Radosgw

2015-02-14 Thread B L
Hello Yehuda,

The strace command you suggested shows this:
https://gist.github.com/anonymous/8e9f1ced485996a263bb

Additionally, I traced this log file: 
/var/log/radosgw/ceph-client.radosgw.gateway

it has the following:

2015-02-12 18:23:32.247679 7fecca5257c0 -1 did not load config file, using 
default settings.
2015-02-12 18:23:32.247745 7fecca5257c0  0 ceph version 0.80.7 
(6c0127fcb58008793d3c8b62d925bc91963672a3), process radosgw, pid 20477
2015-02-12 18:23:32.251192 7fecca5257c0 -1 Couldn't init storage provider 
(RADOS)
2015-02-12 18:23:58.494026 7faab31377c0 -1 did not load config file, using 
default settings.
2015-02-12 18:23:58.494092 7faab31377c0  0 ceph version 0.80.7 
(6c0127fcb58008793d3c8b62d925bc91963672a3), process radosgw, pid 20509
2015-02-12 18:23:58.497420 7faab31377c0 -1 Couldn't init storage provider 
(RADOS)
2015-02-14 17:13:03.478688 7f86f09567c0 -1 did not load config file, using 
default settings.
2015-02-14 17:13:03.478778 7f86f09567c0  0 ceph version 0.80.7 
(6c0127fcb58008793d3c8b62d925bc91963672a3), process radosgw, pid 2989
2015-02-14 17:13:03.482850 7f86f09567c0 -1 Couldn't init storage provider 
(RADOS)
2015-02-14 17:13:29.477530 7ff18226a7c0 -1 did not load config file, using 
default settings.
2015-02-14 17:13:29.477595 7ff18226a7c0  0 ceph version 0.80.7 
(6c0127fcb58008793d3c8b62d925bc91963672a3), process radosgw, pid 3033
2015-02-14 17:13:29.481173 7ff18226a7c0 -1 Couldn't init storage provider 
(RADOS)
2015-02-14 17:21:00.950847 7ffee3a3b7c0 -1 did not load config file, using 
default settings.
2015-02-14 17:21:00.950916 7ffee3a3b7c0  0 ceph version 0.80.7 
(6c0127fcb58008793d3c8b62d925bc91963672a3), process radosgw, pid 3086
2015-02-14 17:21:00.954085 7ffee3a3b7c0 -1 Couldn't init storage provider 
(RADOS)



It turns out that the last line of the log is emitted by this piece of code in
rgw_main.cc:

…
…

  FCGX_Init();

  RGWStoreManager store_manager;

  if (!store_manager.init(rados, g_ceph_context)) {
    derr << "Couldn't init storage provider (RADOS)" << dendl;
    return EIO;
  }

  RGWProcess process(g_ceph_context, 20);

  process.run();

  return 0;

N.B. You can find it in
http://workbench.dachary.org/ceph/ceph/raw/8d63e140777bbdd061baa6845d57e6c3cc771f76/src/rgw/rgw_main.cc
(10th line from the bottom).

Is that by any means related to the problem?



 On Feb 14, 2015, at 7:24 PM, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote:
 
 sudo strace -F -T -tt -o/tmp/strace.out radosgw -c ceph.conf -f



Re: [ceph-users] Having problem to start Radosgw

2015-02-14 Thread Yehuda Sadeh-Weinraub
- Original Message -

 From: B L super.itera...@gmail.com
 To: Yehuda Sadeh-Weinraub yeh...@redhat.com
 Cc: ceph-users@lists.ceph.com
 Sent: Saturday, February 14, 2015 11:03:42 AM
 Subject: Re: [ceph-users] Having problem to start Radosgw

 Hello Yehuda,

 The strace command you referred to me, shows this:
 https://gist.github.com/anonymous/8e9f1ced485996a263bb

 Additionally, I traced this log file:
 /var/log/radosgw/ceph-client.radosgw.gateway

 it has the following:

 2015-02-12 18:23:32.247679 7fecca5257c0 -1 did not load config file, using
 default settings.
 2015-02-12 18:23:32.247745 7fecca5257c0 0 ceph version 0.80.7
 (6c0127fcb58008793d3c8b62d925bc91963672a3), process radosgw, pid 20477
 2015-02-12 18:23:32.251192 7fecca5257c0 -1 Couldn't init storage provider
 (RADOS)
 2015-02-12 18:23:58.494026 7faab31377c0 -1 did not load config file, using
 default settings.
 2015-02-12 18:23:58.494092 7faab31377c0 0 ceph version 0.80.7
 (6c0127fcb58008793d3c8b62d925bc91963672a3), process radosgw, pid 20509
 2015-02-12 18:23:58.497420 7faab31377c0 -1 Couldn't init storage provider
 (RADOS)
 2015-02-14 17:13:03.478688 7f86f09567c0 -1 did not load config file, using
 default settings.
 2015-02-14 17:13:03.478778 7f86f09567c0 0 ceph version 0.80.7
 (6c0127fcb58008793d3c8b62d925bc91963672a3), process radosgw, pid 2989
 2015-02-14 17:13:03.482850 7f86f09567c0 -1 Couldn't init storage provider
 (RADOS)
 2015-02-14 17:13:29.477530 7ff18226a7c0 -1 did not load config file, using
 default settings.
 2015-02-14 17:13:29.477595 7ff18226a7c0 0 ceph version 0.80.7
 (6c0127fcb58008793d3c8b62d925bc91963672a3), process radosgw, pid 3033
 2015-02-14 17:13:29.481173 7ff18226a7c0 -1 Couldn't init storage provider
 (RADOS)
 2015-02-14 17:21:00.950847 7ffee3a3b7c0 -1 did not load config file, using
 default settings.
 2015-02-14 17:21:00.950916 7ffee3a3b7c0 0 ceph version 0.80.7
 (6c0127fcb58008793d3c8b62d925bc91963672a3), process radosgw, pid 3086
 2015-02-14 17:21:00.954085 7ffee3a3b7c0 -1 Couldn't init storage provider
 (RADOS)

 It turns out that the last line of the log is emitted by this piece of
 code in rgw_main.cc:

 …
 …

 FCGX_Init();

 RGWStoreManager store_manager;

 if (!store_manager.init(rados, g_ceph_context)) {
 derr << "Couldn't init storage provider (RADOS)" << dendl;
 return EIO;
 }

 RGWProcess process(g_ceph_context, 20);

 process.run();

 return 0;

 N.B. You can find it in
 http://workbench.dachary.org/ceph/ceph/raw/8d63e140777bbdd061baa6845d57e6c3cc771f76/src/rgw/rgw_main.cc
 (10th line from the bottom).

 Is that by any means related to the problem?

Not related. This actually means that it couldn't connect to the rados backend,
so there's a different issue now. The strace log doesn't tell us much about the
original issue, as this run never got to that part. You can try bumping up the
debug level (debug rgw = 20, debug ms = 1). I assume the issue you're seeing is
that the wrong rados user and/or the wrong cephx keys are being used. Try
running it again as you usually do, check which params are normally passed when
starting radosgw, and use those when running the strace command.

Yehuda 

  On Feb 14, 2015, at 7:24 PM, Yehuda Sadeh-Weinraub  yeh...@redhat.com 
  wrote:
 

  sudo strace -F -T -tt -o/tmp/strace.out radosgw -c ceph.conf -f
 


Re: [ceph-users] Having problem to start Radosgw

2015-02-14 Thread B L
That’s what I usually do to check whether rgw is running without problems: sudo
radosgw -c ceph.conf -d

I already bumped up the log level, but I can’t see any change or increase in the
verbosity of the logs; I still get the same:

2015-02-14 22:27:57.513151 7f26c79d27c0  0 ceph version 0.80.7 
(6c0127fcb58008793d3c8b62d925bc91963672a3), process radosgw, pid 7924
2015-02-14 22:27:57.573564 7f26c79d27c0  0 framework: fastcgi
2015-02-14 22:27:57.573569 7f26c79d27c0  0 starting handler: fastcgi
2015-02-14 22:27:57.575349 7f269affd700  0 ERROR: FCGX_Accept_r returned -9
2015-02-14 22:27:57.670610 7f269bfff700  0 ERROR: can't read user header: ret=-2
2015-02-14 22:27:57.670613 7f269bfff700  0 ERROR: sync_user() failed, 
user=cephtest ret=-2
2015-02-14 22:27:57.671382 7f269bfff700  0 ERROR: can't read user header: ret=-2
2015-02-14 22:27:57.671384 7f269bfff700  0 ERROR: sync_user() failed, 
user=cephtestss ret=-2
^C2015-02-14 22:28:30.693140 7f269b7fe700  1 handle_sigterm
2015-02-14 22:28:30.693170 7f269b7fe700  1 handle_sigterm set alarm for 120
2015-02-14 22:28:30.693179 7f26c79d27c0 -1 shutting down
2015-02-14 22:28:30.717340 7f26c79d27c0  1 final shutdown

Please let me know if I can do something more.

Now I have 2 questions:
1- Which RADOS user are you referring to?
2- How would I know that I am using the wrong cephx keys unless I see an
authentication error or a relevant warning?

Thanks!
Beanos

 On Feb 14, 2015, at 11:29 PM, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote:
 
 
 
 From: B L super.itera...@gmail.com
 To: Yehuda Sadeh-Weinraub yeh...@redhat.com
 Cc: ceph-users@lists.ceph.com
 Sent: Saturday, February 14, 2015 11:03:42 AM
 Subject: Re: [ceph-users] Having problem to start Radosgw
 
 Hello Yehuda,
 
 The strace command you suggested shows this:
 https://gist.github.com/anonymous/8e9f1ced485996a263bb
 
 Additionally, I traced this log file: 
 /var/log/radosgw/ceph-client.radosgw.gateway
 
 it has the following:
 
 2015-02-12 18:23:32.247679 7fecca5257c0 -1 did not load config file, using 
 default settings.
 2015-02-12 18:23:32.247745 7fecca5257c0  0 ceph version 0.80.7 
 (6c0127fcb58008793d3c8b62d925bc91963672a3), process radosgw, pid 20477
 2015-02-12 18:23:32.251192 7fecca5257c0 -1 Couldn't init storage provider 
 (RADOS)
 2015-02-12 18:23:58.494026 7faab31377c0 -1 did not load config file, using 
 default settings.
 2015-02-12 18:23:58.494092 7faab31377c0  0 ceph version 0.80.7 
 (6c0127fcb58008793d3c8b62d925bc91963672a3), process radosgw, pid 20509
 2015-02-12 18:23:58.497420 7faab31377c0 -1 Couldn't init storage provider 
 (RADOS)
 2015-02-14 17:13:03.478688 7f86f09567c0 -1 did not load config file, using 
 default settings.
 2015-02-14 17:13:03.478778 7f86f09567c0  0 ceph version 0.80.7 
 (6c0127fcb58008793d3c8b62d925bc91963672a3), process radosgw, pid 2989
 2015-02-14 17:13:03.482850 7f86f09567c0 -1 Couldn't init storage provider 
 (RADOS)
 2015-02-14 17:13:29.477530 7ff18226a7c0 -1 did not load config file, using 
 default settings.
 2015-02-14 17:13:29.477595 7ff18226a7c0  0 ceph version 0.80.7 
 (6c0127fcb58008793d3c8b62d925bc91963672a3), process radosgw, pid 3033
 2015-02-14 17:13:29.481173 7ff18226a7c0 -1 Couldn't init storage provider 
 (RADOS)
 2015-02-14 17:21:00.950847 7ffee3a3b7c0 -1 did not load config file, using 
 default settings.
 2015-02-14 17:21:00.950916 7ffee3a3b7c0  0 ceph version 0.80.7 
 (6c0127fcb58008793d3c8b62d925bc91963672a3), process radosgw, pid 3086
 2015-02-14 17:21:00.954085 7ffee3a3b7c0 -1 Couldn't init storage provider 
 (RADOS)
 
 
 
 It turns out that the last line of the log is emitted by this piece of
 code in rgw_main.cc:
 
 …
 …
 
   FCGX_Init();
 
   RGWStoreManager store_manager;
 
   if (!store_manager.init(rados, g_ceph_context)) {
     derr << "Couldn't init storage provider (RADOS)" << dendl;
     return EIO;
   }
 
   RGWProcess process(g_ceph_context, 20);
 
   process.run();
 
   return 0;
 
 N.B. You can find it in
 http://workbench.dachary.org/ceph/ceph/raw/8d63e140777bbdd061baa6845d57e6c3cc771f76/src/rgw/rgw_main.cc
 (10th line from the bottom).
 
 Is that by any means related to the problem?
 
 Not related. This actually means that it couldn't connect to the rados 
 backend, so there's a different issue now. The strace log doesn't provide 
 much with regard to the original issue as it didn't get to that part now. You 
 can try bumping up the debug level (debug rgw = 20, debug ms = 1). I assume 
 that the issue that you're seeing is that the wrong rados user and/or wrong 
 cephx keys are being used. Try to run it again as you do usually, and see 
 what the regular params that are being passed when starting radosgw; use 
 these when running the strace command.
 
 Yehuda
 
 
 
 On Feb 14, 2015, at 7:24 PM, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote:
 
 sudo 

Re: [ceph-users] Having problem to start Radosgw

2015-02-14 Thread B L
Hello Yehuda,

this is the resulting output after adding “-n client.radosgw.gateway”:
https://gist.github.com/anonymous/f16701d6cacc8911620f

I can see only one problem in the above output: -1 Couldn't init storage
provider (RADOS). Please check the output; you can probably find something
useful.



 On Feb 15, 2015, at 1:28 AM, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote:
 
 
 Add the '-n client.radosgw.gateway' param when you're running the gateway;
 all your settings are under that user.
 
 Yehuda
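
One way to picture why the '-n' name matters: settings are looked up by section name, so options placed under [client.radosgw.gateway] are not seen by a process that runs under the default client.admin name (the name visible in the admin socket path earlier in this thread). The Python sketch below uses configparser as a stand-in for Ceph's own config parser. The sample config text and the function effective_settings are hypothetical, and the merge order (global, then the type section, then the full name) is my understanding of Ceph's behaviour rather than something stated in the thread.

```python
import configparser

# Hypothetical ceph.conf fragment; the real file is only linked, not quoted.
SAMPLE_CONF = """
[global]
fsid = 17bea68b-1634-4cd1-8b2a-00a60ef4761d

[client.radosgw.gateway]
debug rgw = 20
debug ms = 1
"""


def effective_settings(conf_text, name):
    """Merge [global], [<type>] and [<name>] sections for one entity name."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    entity_type = name.split(".", 1)[0]  # "client" for "client.radosgw.gateway"
    merged = {}
    for section in ("global", entity_type, name):
        if parser.has_section(section):
            merged.update(parser.items(section))
    return merged


for name in ("client.admin", "client.radosgw.gateway"):
    print(name, "->", effective_settings(SAMPLE_CONF, name).get("debug rgw"))
```

Under these assumptions the debug settings only show up for client.radosgw.gateway, which would also explain why raising the log level appeared to have no effect when the gateway was started without '-n'.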
 
 - Original Message -
 From: B L super.itera...@gmail.com
 To: Yehuda Sadeh-Weinraub yeh...@redhat.com
 Cc: ceph-users@lists.ceph.com
 Sent: Saturday, February 14, 2015 2:56:54 PM
 Subject: Re: [ceph-users] Having problem to start Radosgw
 
 Yehuda ..
 
 In case you will need to know more about my system
 
 Here is my full cluster configuration:
 https://gist.github.com/anonymous/fb4c314320d7df75569a
 
 And, that’s my ceph cluster status:
 
 $ ceph -s
 
 cluster 17bea68b-1634-4cd1-8b2a-00a60ef4761d
 health HEALTH_WARN 203 pgs degraded; 203 pgs stuck unclean; recovery 6/151
 objects degraded (3.974%)
 monmap e1: 1 mons at {ceph-node1=172.31.0.84:6789/0}, election epoch 2,
 quorum 0 ceph-node1
 osdmap e93: 6 osds: 6 up, 6 in
 pgmap v3676: 1920 pgs, 16 pools, 10241 kB data, 51 objects
 279 MB used, 18086 MB / 18365 MB avail
 6/151 objects degraded (3.974%)
 203 active+degraded
 1717 active+clean
 
 It was fully healthy before I added the radosgw pools. Still, I can put
 objects into the cluster (without using RGW).
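
The figures in the status output can be cross-checked with a little arithmetic. The Python sketch below copies the numbers from the 'ceph -s' output above; the variable names are mine.

```python
# Figures copied from the 'ceph -s' output quoted above.
degraded_objects, total_objects = 6, 151
pg_states = {"active+degraded": 203, "active+clean": 1717}
used_mb, avail_mb, total_mb = 279, 18086, 18365

object_pct = 100 * degraded_objects / total_objects
pg_total = sum(pg_states.values())
pg_pct = 100 * pg_states["active+degraded"] / pg_total

print(f"degraded objects: {object_pct:.3f}%")
print(f"placement groups: {pg_total} total, {pg_pct:.2f}% degraded")
print(f"used + avail == total: {used_mb + avail_mb == total_mb}")
```

The object percentage and the PG total agree with the reported 3.974% and 1920 pgs, so roughly one PG in ten is degraded even though few objects are affected.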
 
 Best!
 
 
 
 
 
 On Feb 15, 2015, at 12:39 AM, B L  super.itera...@gmail.com  wrote:
 
 That’s what I usually do to check if rgw is running with no problems: sudo
 radosgw -c ceph.conf -d
 
 I already bumped up the log level, but I can’t see any change or increase in
 the verbosity of the logs; I still get the same:
 
 2015-02-14 22:27:57.513151 7f26c79d27c0 0 ceph version 0.80.7
 (6c0127fcb58008793d3c8b62d925bc91963672a3), process radosgw, pid 7924
 2015-02-14 22:27:57.573564 7f26c79d27c0 0 framework: fastcgi
 2015-02-14 22:27:57.573569 7f26c79d27c0 0 starting handler: fastcgi
 2015-02-14 22:27:57.575349 7f269affd700 0 ERROR: FCGX_Accept_r returned -9
 2015-02-14 22:27:57.670610 7f269bfff700 0 ERROR: can't read user header:
 ret=-2
 2015-02-14 22:27:57.670613 7f269bfff700 0 ERROR: sync_user() failed,
 user=cephtest ret=-2
 2015-02-14 22:27:57.671382 7f269bfff700 0 ERROR: can't read user header:
 ret=-2
 2015-02-14 22:27:57.671384 7f269bfff700 0 ERROR: sync_user() failed,
 user=cephtestss ret=-2
 ^C2015-02-14 22:28:30.693140 7f269b7fe700 1 handle_sigterm
 2015-02-14 22:28:30.693170 7f269b7fe700 1 handle_sigterm set alarm for 120
 2015-02-14 22:28:30.693179 7f26c79d27c0 -1 shutting down
 2015-02-14 22:28:30.717340 7f26c79d27c0 1 final shutdown
 
 Please let me know if I can do something more ..
 
 Now I have 2 questions:
 1- Which RADOS user are you referring to?
 2- How would I know that I am using the wrong cephx keys unless I see an
 authentication error or a relevant warning?
 
 Thanks!
 Beanos
 
 
 
 
 On Feb 14, 2015, at 11:29 PM, Yehuda Sadeh-Weinraub  yeh...@redhat.com 
 wrote:
 
 
 
 
 
 
 From: B L  super.itera...@gmail.com 
 To: Yehuda Sadeh-Weinraub  yeh...@redhat.com 
 Cc: ceph-users@lists.ceph.com
 Sent: Saturday, February 14, 2015 11:03:42 AM
 Subject: Re: [ceph-users] Having problem to start Radosgw
 
 Hello Yehuda,
 
 The strace command you referred to me, shows this:
 https://gist.github.com/anonymous/8e9f1ced485996a263bb
 
 Additionally, I traced this log file:
 /var/log/radosgw/ceph-client.radosgw.gateway
 
 it has the following:
 
 2015-02-12 18:23:32.247679 7fecca5257c0 -1 did not load config file, using
 default settings.
 2015-02-12 18:23:32.247745 7fecca5257c0 0 ceph version 0.80.7
 (6c0127fcb58008793d3c8b62d925bc91963672a3), process radosgw, pid 20477
 2015-02-12 18:23:32.251192 7fecca5257c0 -1 Couldn't init storage provider
 (RADOS)
 2015-02-12 18:23:58.494026 7faab31377c0 -1 did not load config file, using
 default settings.
 2015-02-12 18:23:58.494092 7faab31377c0 0 ceph version 0.80.7
 (6c0127fcb58008793d3c8b62d925bc91963672a3), process radosgw, pid 20509
 2015-02-12 18:23:58.497420 7faab31377c0 -1 Couldn't init storage provider
 (RADOS)
 2015-02-14 17:13:03.478688 7f86f09567c0 -1 did not load config file, using
 default settings.
 2015-02-14 17:13:03.478778 7f86f09567c0 0 ceph version 0.80.7
 (6c0127fcb58008793d3c8b62d925bc91963672a3), process radosgw, pid 2989
 2015-02-14 17:13:03.482850 7f86f09567c0 -1 Couldn't init storage provider
 (RADOS)
 2015-02-14 17:13:29.477530 7ff18226a7c0 -1 did not load config file, using
 default settings.
 2015-02-14 17:13:29.477595 7ff18226a7c0 0 ceph version 0.80.7
 (6c0127fcb58008793d3c8b62d925bc91963672a3), process radosgw, pid 3033
 2015-02-14 17:13:29.481173 7ff18226a7c0 -1 Couldn't init storage provider
 (RADOS)
 2015-02-14 17:21:00.950847 7ffee3a3b7c0 -1 did not load config