Re: [ceph-users] Core dump while getting a volume real size with a python script

2015-10-29 Thread Giuseppe Civitella
... and this is the core dump output while executing the "rbd diff" command:
http://paste.openstack.org/show/477604/

Regards,
Giuseppe

2015-10-28 16:46 GMT+01:00 Giuseppe Civitella <giuseppe.civite...@gmail.com>:

> Hi all,
>
> I'm trying to get the real disk usage of a Cinder volume by converting
> these bash commands to Python:
> http://cephnotes.ksperis.com/blog/2013/08/28/rbd-image-real-size
>
> I wrote a small test function which has already worked in many cases but
> it stops with a core dump while trying to calculate the real size of a
> particular volume.
>
> This is the function:
> http://paste.openstack.org/show/477563/
>
> this is the error I get:
> http://paste.openstack.org/show/477567/
>
> and this is the related rbd info:
> http://paste.openstack.org/show/477568/
>
> Can anyone help me to debug the problem?
>
> Thanks
> Giuseppe
>


[ceph-users] Core dump while getting a volume real size with a python script

2015-10-28 Thread Giuseppe Civitella
Hi all,

I'm trying to get the real disk usage of a Cinder volume by converting
these bash commands to Python:
http://cephnotes.ksperis.com/blog/2013/08/28/rbd-image-real-size

I wrote a small test function which has already worked in many cases but it
stops with a core dump while trying to calculate the real size of a
particular volume.

This is the function:
http://paste.openstack.org/show/477563/

this is the error I get:
http://paste.openstack.org/show/477567/

and this is the related rbd info:
http://paste.openstack.org/show/477568/

Can anyone help me to debug the problem?

Thanks
Giuseppe
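
For reference, here is a minimal sketch of the same calculation done directly
with the python-rbd bindings instead of parsing the "rbd diff" output. Pool
and image names are placeholders, error handling is deliberately minimal, and
whether this sidesteps the crash reported above is untested (if librbd itself
is faulting, the same fault may well reproduce here):

import rados
import rbd

def real_disk_usage(pool_name, image_name, conf='/etc/ceph/ceph.conf'):
    """Sum the allocated extents of an RBD image, like 'rbd diff | awk'."""
    used = {'bytes': 0}

    def _extent_cb(offset, length, exists):
        # diff_iterate reports each allocated extent; unallocated ranges
        # are simply never reported, so summing the lengths gives the
        # real on-disk usage.
        if exists:
            used['bytes'] += length

    cluster = rados.Rados(conffile=conf)
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx(pool_name)
        try:
            image = rbd.Image(ioctx, image_name)
            try:
                # Diffing against no snapshot (None) walks the whole image.
                image.diff_iterate(0, image.size(), None, _extent_cb)
            finally:
                image.close()
        finally:
            ioctx.close()
    finally:
        cluster.shutdown()
    return used['bytes']

if __name__ == '__main__':
    print(real_disk_usage('volumes', 'volume-00000000'))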


[ceph-users] pgs stuck unclean on a new pool despite the pool size reconfiguration

2015-10-02 Thread Giuseppe Civitella
Hi all,
I have a Firefly cluster which has been upgraded from Emperor.
It has 2 OSD hosts and 3 monitors.
The cluster uses the default values for pool size and min_size.
Once upgraded to Firefly, I created a new pool called bench2:
ceph osd pool create bench2 128 128
and set its sizes:
ceph osd pool set bench2 size 2
ceph osd pool set bench2 min_size 1

this is the state of the pools:
pool 0 'data' replicated size 2 min_size 1 crush_ruleset 0 object_hash
rjenkins pg_num 64 pgp_num 64 last_change 1 crash_replay_interval 45
stripe_width 0
pool 1 'metadata' replicated size 2 min_size 1 crush_ruleset 1 object_hash
rjenkins pg_num 64 pgp_num 64 last_change 1 stripe_width 0
pool 2 'rbd' replicated size 2 min_size 1 crush_ruleset 2 object_hash
rjenkins pg_num 64 pgp_num 64 last_change 1 stripe_width 0
pool 3 'volumes' replicated size 2 min_size 1 crush_ruleset 0 object_hash
rjenkins pg_num 384 pgp_num 384 last_change 2568 stripe_width 0
removed_snaps [1~75]
pool 4 'images' replicated size 2 min_size 1 crush_ruleset 0 object_hash
rjenkins pg_num 384 pgp_num 384 last_change 1895 stripe_width 0
pool 8 'bench2' replicated size 2 min_size 1 crush_ruleset 0 object_hash
rjenkins pg_num 128 pgp_num 128 last_change 2580 flags hashpspool
stripe_width 0

Despite this, I still get a warning about 128 pgs stuck unclean.
"ceph health detail" shows me the stuck PGs, so I picked one to find the
OSDs involved:

pg 8.38 is stuck unclean since forever, current state active, last acting
[22,7]

If I restart the OSD with id 22, PG 8.38 reaches an active+clean state.

This is incorrect behavior, AFAIK. The cluster should apply the new size
and min_size values without any manual intervention. So my question is: any
idea why this happens and how to restore the default behavior? Do I need to
restart all of the OSDs to get back to a healthy state?

thanks a lot
Giuseppe
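
A quick way to collect all of the stuck PGs and their acting OSDs, instead of
copying them out of the output by hand, is to scrape the same "ceph health
detail" lines quoted above. A rough Python sketch; the exact wording of the
health lines can differ between releases, so the regex may need adjusting:

import re
import subprocess

PG_LINE = re.compile(r'^pg (\S+) is stuck unclean .* last acting \[([\d,]+)\]')

def stuck_unclean_pgs():
    out = subprocess.check_output(['ceph', 'health', 'detail']).decode('utf-8')
    stuck = []
    for line in out.splitlines():
        m = PG_LINE.match(line.strip())
        if m:
            pgid = m.group(1)
            acting = [int(n) for n in m.group(2).split(',')]
            stuck.append((pgid, acting))
    return stuck

if __name__ == '__main__':
    for pgid, acting in stuck_unclean_pgs():
        print('%s -> acting OSDs %s' % (pgid, acting))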


Re: [ceph-users] pgs stuck unclean on a new pool despite the pool size reconfiguration

2015-10-02 Thread Giuseppe Civitella
Hi Warren,

a simple:
ceph osd pool set bench2 hashpspool false
solved my problem.

Thanks a lot
Giuseppe
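
For anyone hitting the same thing: before unsetting the flag it can help to
check which pools actually carry it, e.g. by scanning the plain-text "ceph
osd dump" output for the same "flags hashpspool" text visible in the pool
listing quoted below. A rough sketch, assuming the ceph CLI and an admin
keyring are available:

import subprocess

def pools_with_hashpspool():
    out = subprocess.check_output(['ceph', 'osd', 'dump']).decode('utf-8')
    pools = []
    for line in out.splitlines():
        # Pool lines look like:
        # pool 8 'bench2' replicated size 2 ... flags hashpspool stripe_width 0
        if line.startswith('pool ') and 'hashpspool' in line:
            pools.append(line.split("'")[1])
    return pools

if __name__ == '__main__':
    for name in pools_with_hashpspool():
        print('%s has hashpspool set' % name)

Each of those pools can then be cleared the same way, with "ceph osd pool set
<pool> hashpspool false".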

2015-10-02 16:18 GMT+02:00 Warren Wang - ISD <warren.w...@walmart.com>:

> You probably don’t want hashpspool automatically set, since your clients
> may still not understand that crush map feature. You can try to unset it
> for that pool and see what happens, or create a new pool without hashpspool
> enabled from the start.  Just a guess.
>
> Warren
>
> From: Giuseppe Civitella <giuseppe.civite...@gmail.com>
> Date: Friday, October 2, 2015 at 10:05 AM
> To: ceph-users <ceph-us...@ceph.com>
> Subject: [ceph-users] pgs stuck unclean on a new pool despite the pool
> size reconfiguration
>
> Hi all,
> I have a Firefly cluster which has been upgraded from Emperor.
> It has 2 OSD hosts and 3 monitors.
> The cluster uses the default values for pool size and min_size.
> Once upgraded to Firefly, I created a new pool called bench2:
> ceph osd pool create bench2 128 128
> and set its sizes:
> ceph osd pool set bench2 size 2
> ceph osd pool set bench2 min_size 1
>
> this is the state of the pools:
> pool 0 'data' replicated size 2 min_size 1 crush_ruleset 0 object_hash
> rjenkins pg_num 64 pgp_num 64 last_change 1 crash_replay_interval 45
> stripe_width 0
> pool 1 'metadata' replicated size 2 min_size 1 crush_ruleset 1 object_hash
> rjenkins pg_num 64 pgp_num 64 last_change 1 stripe_width 0
> pool 2 'rbd' replicated size 2 min_size 1 crush_ruleset 2 object_hash
> rjenkins pg_num 64 pgp_num 64 last_change 1 stripe_width 0
> pool 3 'volumes' replicated size 2 min_size 1 crush_ruleset 0 object_hash
> rjenkins pg_num 384 pgp_num 384 last_change 2568 stripe_width 0
> removed_snaps [1~75]
> pool 4 'images' replicated size 2 min_size 1 crush_ruleset 0 object_hash
> rjenkins pg_num 384 pgp_num 384 last_change 1895 stripe_width 0
> pool 8 'bench2' replicated size 2 min_size 1 crush_ruleset 0 object_hash
> rjenkins pg_num 128 pgp_num 128 last_change 2580 flags hashpspool
> stripe_width 0
>
> Despite this, I still get a warning about 128 pgs stuck unclean.
> "ceph health detail" shows me the stuck PGs, so I picked one to find the
> OSDs involved:
>
> pg 8.38 is stuck unclean since forever, current state active, last acting
> [22,7]
>
> If I restart the OSD with id 22, PG 8.38 reaches an active+clean state.
>
> This is incorrect behavior, AFAIK. The cluster should apply the new size
> and min_size values without any manual intervention. So my question is: any
> idea why this happens and how to restore the default behavior? Do I need to
> restart all of the OSDs to get back to a healthy state?
>
> thanks a lot
> Giuseppe
>


Re: [ceph-users] Binding a pool to certain OSDs

2015-04-15 Thread Giuseppe Civitella
So it was a PG problem. I added a couple of OSDs per host, reconfigured the
CRUSH map, and the cluster began to work properly.

Thanks
Giuseppe
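
For context, the sizing guideline Saverio refers to below is usually stated
as roughly 100 PGs per OSD in total, divided by the replica count and rounded
up to a power of two. A small sketch of the arithmetic (the 100-per-OSD
target is a common rule of thumb, not a hard limit):

def suggested_total_pgs(num_osds, pool_size, target_per_osd=100):
    # Total PGs across all pools: ~100 per OSD, divided by the replica
    # count, rounded up to the next power of two.
    raw = num_osds * target_per_osd / float(pool_size)
    power = 1
    while power < raw:
        power *= 2
    return power

# The 4-OSD lab discussed below, with size=2 pools, works out to 256 PGs
# in total, while the cluster was already carrying 576 across 6 pools.
print(suggested_total_pgs(4, 2))   # -> 256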

2015-04-14 19:02 GMT+02:00 Saverio Proto ziopr...@gmail.com:

 No error message. You just exhaust the RAM and blow up the cluster
 because of too many PGs.

 Saverio

 2015-04-14 18:52 GMT+02:00 Giuseppe Civitella 
 giuseppe.civite...@gmail.com:
  Hi Saverio,
 
  I first made a test on my test staging lab where I have only 4 OSD.
  On my mon servers (which run other services) I have 16GB RAM, 15GB used
  but 5GB cached. On the OSD servers I have 3GB RAM, 3GB used but 2GB cached.
  ceph -s tells me nothing about PGs, shouldn't I get an error message
 from
  its output?
 
  Thanks
  Giuseppe
 
  2015-04-14 18:20 GMT+02:00 Saverio Proto ziopr...@gmail.com:
 
  You only have 4 OSDs ?
  How much RAM per server ?
  I think you already have too many PGs. Check your RAM usage.
 
  Check the Ceph wiki guidelines to size the number of PGs correctly.
  Remember that every time you create a new pool you add PGs to the
  system.
 
  Saverio
 
 
  2015-04-14 17:58 GMT+02:00 Giuseppe Civitella
  giuseppe.civite...@gmail.com:
   Hi all,
  
   I've been following this tutorial to realize my setup:
  
  
 http://www.sebastien-han.fr/blog/2014/08/25/ceph-mix-sata-and-ssd-within-the-same-box/
  
   I got this CRUSH map from my test lab:
   http://paste.openstack.org/show/203887/
  
   then I modified the map and uploaded it. This is the final version:
   http://paste.openstack.org/show/203888/
  
    When I applied the new CRUSH map, after some rebalancing, I got this
    health status:
   [- avalon1 root@controller001 Ceph -] # ceph -s
   cluster af09420b-4032-415e-93fc-6b60e9db064e
health HEALTH_WARN crush map has legacy tunables;
 mon.controller001
   low
   disk space; clock skew detected on mon.controller002
monmap e1: 3 mons at
  
   {controller001=
 10.235.24.127:6789/0,controller002=10.235.24.128:6789/0,controller003=10.235.24.129:6789/0
 },
   election epoch 314, quorum 0,1,2
   controller001,controller002,controller003
osdmap e3092: 4 osds: 4 up, 4 in
 pgmap v785873: 576 pgs, 6 pools, 71548 MB data, 18095 objects
   8842 MB used, 271 GB / 279 GB avail
576 active+clean
  
   and this osd tree:
   [- avalon1 root@controller001 Ceph -] # ceph osd tree
    # id    weight  type name   up/down reweight
   -8  2   root sed
   -5  1   host ceph001-sed
   2   1   osd.2   up  1
   -7  1   host ceph002-sed
   3   1   osd.3   up  1
   -1  2   root default
   -4  1   host ceph001-sata
   0   1   osd.0   up  1
   -6  1   host ceph002-sata
   1   1   osd.1   up  1
  
    which does not seem like a bad situation. The problem arises when I try
    to create a new pool: the command ceph osd pool create sed 128 128 gets
    stuck and never ends. And I noticed that my Cinder installation is not
    able to create volumes anymore.
   I've been looking in the logs for errors and found nothing.
   Any hint about how to proceed to restore my ceph cluster?
   Is there something wrong with the steps I take to update the CRUSH
 map?
   Is
   the problem related to Emperor?
  
   Regards,
   Giuseppe
  
  
  
  
   2015-04-13 18:26 GMT+02:00 Giuseppe Civitella
   giuseppe.civite...@gmail.com:
  
   Hi all,
  
   I've got a Ceph cluster which serves volumes to a Cinder
 installation.
   It
   runs Emperor.
   I'd like to be able to replace some of the disks with OPAL disks and
   create a new pool which uses exclusively the latter kind of disk. I'd
   like
   to have a traditional pool and a secure one coexisting on the
 same
   ceph
   host. I'd then use Cinder multi backend feature to serve them.
   My question is: how is it possible to realize such a setup? How can I
   bind
   a pool to certain OSDs?
  
   Thanks
   Giuseppe
  
  
  
  
 
 



Re: [ceph-users] Binding a pool to certain OSDs

2015-04-14 Thread Giuseppe Civitella
Hi Saverio,

I first made a test on my test staging lab where I have only 4 OSD.
On my mon servers (which run other services) I have 16GB RAM, 15GB used but
5GB cached. On the OSD servers I have 3GB RAM, 3GB used but 2GB cached.
ceph -s tells me nothing about PGs, shouldn't I get an error message from
its output?

Thanks
Giuseppe

2015-04-14 18:20 GMT+02:00 Saverio Proto ziopr...@gmail.com:

 You only have 4 OSDs ?
 How much RAM per server ?
 I think you already have too many PGs. Check your RAM usage.

 Check the Ceph wiki guidelines to size the number of PGs correctly.
 Remember that every time you create a new pool you add PGs to the
 system.

 Saverio


 2015-04-14 17:58 GMT+02:00 Giuseppe Civitella 
 giuseppe.civite...@gmail.com:
  Hi all,
 
  I've been following this tutorial to realize my setup:
 
 http://www.sebastien-han.fr/blog/2014/08/25/ceph-mix-sata-and-ssd-within-the-same-box/
 
  I got this CRUSH map from my test lab:
  http://paste.openstack.org/show/203887/
 
  then I modified the map and uploaded it. This is the final version:
  http://paste.openstack.org/show/203888/
 
  When I applied the new CRUSH map, after some rebalancing, I got this
  health status:
  [- avalon1 root@controller001 Ceph -] # ceph -s
  cluster af09420b-4032-415e-93fc-6b60e9db064e
   health HEALTH_WARN crush map has legacy tunables; mon.controller001
 low
  disk space; clock skew detected on mon.controller002
   monmap e1: 3 mons at
  {controller001=
 10.235.24.127:6789/0,controller002=10.235.24.128:6789/0,controller003=10.235.24.129:6789/0
 },
  election epoch 314, quorum 0,1,2
 controller001,controller002,controller003
   osdmap e3092: 4 osds: 4 up, 4 in
pgmap v785873: 576 pgs, 6 pools, 71548 MB data, 18095 objects
  8842 MB used, 271 GB / 279 GB avail
   576 active+clean
 
  and this osd tree:
  [- avalon1 root@controller001 Ceph -] # ceph osd tree
  # id    weight  type name   up/down reweight
  -8  2   root sed
  -5  1   host ceph001-sed
  2   1   osd.2   up  1
  -7  1   host ceph002-sed
  3   1   osd.3   up  1
  -1  2   root default
  -4  1   host ceph001-sata
  0   1   osd.0   up  1
  -6  1   host ceph002-sata
  1   1   osd.1   up  1
 
  which does not seem like a bad situation. The problem arises when I try to
  create a new pool: the command ceph osd pool create sed 128 128 gets stuck
  and never ends. And I noticed that my Cinder installation is not able to
  create volumes anymore.
  I've been looking in the logs for errors and found nothing.
  Any hint about how to proceed to restore my ceph cluster?
  Is there something wrong with the steps I take to update the CRUSH map?
 Is
  the problem related to Emperor?
 
  Regards,
  Giuseppe
 
 
 
 
  2015-04-13 18:26 GMT+02:00 Giuseppe Civitella
  giuseppe.civite...@gmail.com:
 
  Hi all,
 
  I've got a Ceph cluster which serves volumes to a Cinder installation.
 It
  runs Emperor.
  I'd like to be able to replace some of the disks with OPAL disks and
  create a new pool which uses exclusively the latter kind of disk. I'd
 like
  to have a traditional pool and a secure one coexisting on the same
 ceph
  host. I'd then use Cinder multi backend feature to serve them.
  My question is: how is it possible to realize such a setup? How can I
 bind
  a pool to certain OSDs?
 
  Thanks
  Giuseppe
 
 
 
 



Re: [ceph-users] Binding a pool to certain OSDs

2015-04-14 Thread Giuseppe Civitella
Hi all,

I've been following this tutorial to realize my setup:
http://www.sebastien-han.fr/blog/2014/08/25/ceph-mix-sata-and-ssd-within-the-same-box/

I got this CRUSH map from my test lab:
http://paste.openstack.org/show/203887/

then I modified the map and uploaded it. This is the final version:
http://paste.openstack.org/show/203888/

When I applied the new CRUSH map, after some rebalancing, I got this health
status:
[- avalon1 root@controller001 Ceph -] # ceph -s
cluster af09420b-4032-415e-93fc-6b60e9db064e
 health HEALTH_WARN crush map has legacy tunables; mon.controller001
low disk space; clock skew detected on mon.controller002
 monmap e1: 3 mons at {controller001=
10.235.24.127:6789/0,controller002=10.235.24.128:6789/0,controller003=10.235.24.129:6789/0},
election epoch 314, quorum 0,1,2 controller001,controller002,controller003
 osdmap e3092: 4 osds: 4 up, 4 in
  pgmap v785873: 576 pgs, 6 pools, 71548 MB data, 18095 objects
8842 MB used, 271 GB / 279 GB avail
 576 active+clean

and this osd tree:
[- avalon1 root@controller001 Ceph -] # ceph osd tree
# id    weight  type name   up/down reweight
-8  2   root sed
-5  1   host ceph001-sed
2   1   osd.2   up  1
-7  1   host ceph002-sed
3   1   osd.3   up  1
-1  2   root default
-4  1   host ceph001-sata
0   1   osd.0   up  1
-6  1   host ceph002-sata
1   1   osd.1   up  1

which does not seem like a bad situation. The problem arises when I try to
create a new pool: the command ceph osd pool create sed 128 128 gets stuck
and never ends. And I noticed that my Cinder installation is not able to
create volumes anymore.
I've been looking in the logs for errors and found nothing.
Any hint about how to proceed to restore my ceph cluster?
Is there something wrong with the steps I take to update the CRUSH map? Is
the problem related to Emperor?

Regards,
Giuseppe




2015-04-13 18:26 GMT+02:00 Giuseppe Civitella giuseppe.civite...@gmail.com:

 Hi all,

 I've got a Ceph cluster which serves volumes to a Cinder installation. It
 runs Emperor.
 I'd like to be able to replace some of the disks with OPAL disks and
 create a new pool which uses exclusively the latter kind of disk. I'd like
 to have a traditional pool and a secure one coexisting on the same ceph
 host. I'd then use Cinder multi backend feature to serve them.
 My question is: how is it possible to realize such a setup? How can I bind
 a pool to certain OSDs?

 Thanks
 Giuseppe



[ceph-users] Binding a pool to certain OSDs

2015-04-13 Thread Giuseppe Civitella
Hi all,

I've got a Ceph cluster which serves volumes to a Cinder installation. It
runs Emperor.
I'd like to be able to replace some of the disks with OPAL disks and create
a new pool which uses exclusively the latter kind of disk. I'd like to have
a traditional pool and a secure one coexisting on the same ceph host.
I'd then use the Cinder multi-backend feature to serve them.
My question is: how is it possible to realize such a setup? How can I bind
a pool to certain OSDs?

Thanks
Giuseppe
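
One way to do the binding, following the SATA/SSD split approach discussed in
the follow-ups, is to keep the target OSDs under their own CRUSH root, add a
rule that only draws from that root, and point the pool at that rule. A rough
sketch driving the ceph CLI from Python, assuming a root named "sed" already
exists in the CRUSH map (all names here are illustrative):

import json
import subprocess

def ceph(*args):
    return subprocess.check_output(('ceph',) + args).decode('utf-8')

# 1. A rule that only chooses OSDs living under the "sed" root,
#    replicating across hosts.
ceph('osd', 'crush', 'rule', 'create-simple', 'sed_rule', 'sed', 'host')

# 2. Look up the ruleset id the new rule was given.
rules = json.loads(ceph('osd', 'crush', 'rule', 'dump', '--format', 'json'))
ruleset = next(r['ruleset'] for r in rules if r['rule_name'] == 'sed_rule')

# 3. Create the dedicated pool and bind it to the rule, so its PGs can
#    only ever be mapped to OSDs below the "sed" root.
ceph('osd', 'pool', 'create', 'sed', '128', '128')
ceph('osd', 'pool', 'set', 'sed', 'crush_ruleset', str(ruleset))

Cinder's multi-backend configuration can then point one backend at the
dedicated pool and another at the existing one.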


[ceph-users] Rbd image's data deletion

2015-03-03 Thread Giuseppe Civitella
Hi all,

what happens to the data contained in an rbd image when the image itself
gets deleted?
Is the data just unlinked, or is it destroyed in a way that makes it
unreadable?

thanks
Giuseppe
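
One way to see for yourself is to note the image's block_name_prefix and list
the backing RADOS objects before and after removing it: the objects are simply
deleted from the pool, and whether the freed blocks are ever overwritten on
disk is a property of the underlying OSD filesystem rather than of RBD. A
rough sketch with placeholder names:

import rados
import rbd

def image_backing_objects(pool_name, image_name, conf='/etc/ceph/ceph.conf'):
    """List the RADOS objects whose names start with the image's prefix."""
    cluster = rados.Rados(conffile=conf)
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx(pool_name)
        try:
            image = rbd.Image(ioctx, image_name)
            try:
                prefix = image.stat()['block_name_prefix']
            finally:
                image.close()
            # Every data object of the image shares this prefix; after
            # "rbd rm" these objects no longer show up in the pool.
            return [obj.key for obj in ioctx.list_objects()
                    if obj.key.startswith(prefix)]
        finally:
            ioctx.close()
    finally:
        cluster.shutdown()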


[ceph-users] Ceph, LIO, VMWARE anyone?

2015-01-14 Thread Giuseppe Civitella
Hi all,

I'm working on a lab setup where Ceph serves rbd images as iSCSI datastores
to VMware via a LIO box. Has anyone already done something similar and is
willing to share some knowledge? Any production deployments? What about
LIO's HA and LUN performance?

Thanks
Giuseppe


[ceph-users] Ceph-deploy install and pinning on Ubuntu 14.04

2014-12-20 Thread Giuseppe Civitella
Hi all,

I'm using ceph-deploy on Ubuntu 14.04. When I do a ceph-deploy install I
see packages getting installed from the Ubuntu repositories instead of
Ceph's. Am I missing something? Do I need to set up pinning on the
repositories?

Thanks
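
Pinning is one common way to make apt prefer the ceph.com packages; a minimal
sketch of an apt preferences entry (the file path and origin are illustrative
and should match the repository line ceph-deploy actually configured):

# /etc/apt/preferences.d/ceph.pref (illustrative path)
Package: *
Pin: origin ceph.com
Pin-Priority: 1001

A priority above 1000 makes apt prefer packages from that origin even when
they are older than what is installed or available elsewhere.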


Re: [ceph-users] active+degraded on an empty new cluster

2014-12-10 Thread Giuseppe Civitella
Craig, Gregory,

my disks were a bit smaller than 10GB; I replaced them with 20GB disks and
the cluster's health went back to OK.

Thanks a lot
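
The arithmetic behind Craig's explanation quoted below, as a tiny sketch: the
initial CRUSH weight is the disk size in TiB truncated to two decimal places,
so anything under roughly 10 GiB ends up with weight 0.00 and never gets PGs
mapped to it. Re-weighting by hand ("ceph osd crush reweight osd.N 0.01")
would have been an alternative to swapping in bigger virtual disks.

def default_crush_weight(disk_bytes):
    # Disk size in TiB, truncated (not rounded) to two decimal places.
    tib = disk_bytes / float(2 ** 40)
    return int(tib * 100) / 100.0

print(default_crush_weight(8 * 2 ** 30))    # ~8 GiB disk  -> 0.0
print(default_crush_weight(20 * 2 ** 30))   # ~20 GiB disk -> 0.01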

2014-12-10 0:08 GMT+01:00 Craig Lewis cle...@centraldesktop.com:

 When I first created a test cluster, I used 1 GiB disks.  That causes
 problems.

 Ceph has a CRUSH weight.  By default, the weight is the size of the disk
 in TiB, truncated to 2 decimal places.  ie, any disk smaller than 10 GiB
 will have a weight of 0.00.

 I increased all of my virtual disks to 10 GiB.  After rebooting the nodes
 (to see the changes), everything healed.


 On Tue, Dec 9, 2014 at 9:45 AM, Gregory Farnum g...@gregs42.com wrote:

 It looks like your OSDs all have weight zero for some reason. I'd fix
 that. :)
 -Greg

 On Tue, Dec 9, 2014 at 6:24 AM Giuseppe Civitella 
 giuseppe.civite...@gmail.com wrote:

 Hi,

 thanks for the quick answer.
  I did try force_create_pg on a PG, but it is stuck on creating:
 root@ceph-mon1:/home/ceph# ceph pg dump |grep creating
 dumped all in format plain
 2.2f0   0   0   0   0   0   0   creating
2014-12-09 13:11:37.384808  0'0 0:0 []  -1  []
-1  0'0 0.000'0  0.00

 root@ceph-mon1:/home/ceph# ceph pg 2.2f query
 { state: active+degraded,
   epoch: 105,
   up: [
 0],
   acting: [
 0],
   actingbackfill: [
 0],
   info: { pgid: 2.2f,
   last_update: 0'0,
   last_complete: 0'0,
   log_tail: 0'0,
   last_user_version: 0,
   last_backfill: MAX,
   purged_snaps: [],
   last_scrub: 0'0,
   last_scrub_stamp: 2014-12-06 14:15:11.499769,
   last_deep_scrub: 0'0,
   last_deep_scrub_stamp: 2014-12-06 14:15:11.499769,
   last_clean_scrub_stamp: 0.00,
   log_size: 0,
   ondisk_log_size: 0,
   stats_invalid: 0,
   stat_sum: { num_bytes: 0,
   num_objects: 0,
   num_object_clones: 0,
   num_object_copies: 0,
   num_objects_missing_on_primary: 0,
   num_objects_degraded: 0,
   num_objects_unfound: 0,
   num_objects_dirty: 0,
   num_whiteouts: 0,
   num_read: 0,
   num_read_kb: 0,
   num_write: 0,
   num_write_kb: 0,
   num_scrub_errors: 0,
   num_shallow_scrub_errors: 0,
   num_deep_scrub_errors: 0,
   num_objects_recovered: 0,
   num_bytes_recovered: 0,
   num_keys_recovered: 0,
   num_objects_omap: 0,
   num_objects_hit_set_archive: 0},
   stat_cat_sum: {},
   up: [
 0],
   acting: [
 0],
   up_primary: 0,
   acting_primary: 0},
   empty: 1,
   dne: 0,
   incomplete: 0,
   last_epoch_started: 104,
   hit_set_history: { current_last_update: 0'0,
   current_last_stamp: 0.00,
   current_info: { begin: 0.00,
   end: 0.00,
   version: 0'0},
   history: []}},
   peer_info: [],
   recovery_state: [
 { name: Started\/Primary\/Active,
   enter_time: 2014-12-09 12:12:52.760384,
   might_have_unfound: [],
   recovery_progress: { backfill_targets: [],
   waiting_on_backfill: [],
   last_backfill_started: 0\/\/0\/\/-1,
   backfill_info: { begin: 0\/\/0\/\/-1,
   end: 0\/\/0\/\/-1,
   objects: []},
   peer_backfill_info: [],
   backfills_in_flight: [],
   recovering: [],
   pg_backend: { pull_from_peer: [],
   pushing: []}},
   scrub: { scrubber.epoch_start: 0,
   scrubber.active: 0,
   scrubber.block_writes: 0,
   scrubber.finalizing: 0,
   scrubber.waiting_on: 0,
   scrubber.waiting_on_whom: []}},
 { name: Started,
   enter_time: 2014-12-09 12:12:51.845686}],
   agent_state: {}}root@ceph-mon1:/home/ceph#



 2014-12-09 13:01 GMT+01:00 Irek Fasikhov malm...@gmail.com:

 Hi.

 http://ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/

 ceph pg force_create_pg pgid


 2014-12-09 14:50 GMT+03:00 Giuseppe Civitella 
 giuseppe.civite...@gmail.com:

 Hi all,

  last week I installed a new ceph cluster on 3 VMs running Ubuntu 14.04
  with the default kernel.
  There is a ceph monitor and two OSD hosts. Here are some details:
 ceph -s
 cluster c46d5b02-dab1-40bf-8a3d-f8e4a77b79da
  health HEALTH_WARN 192 pgs degraded; 192 pgs stuck unclean
  monmap e1: 1 mons at {ceph-mon1=10.1.1.83:6789/0}, election
 epoch 1, quorum 0 ceph-mon1
  osdmap e83: 6 osds: 6 up, 6 in
   pgmap v231: 192 pgs, 3 pools, 0 bytes data, 0 objects
 207 MB used, 30446 MB / 30653 MB avail
  192 active+degraded

 root@ceph-mon1:/home/ceph# ceph

[ceph-users] active+degraded on an empty new cluster

2014-12-09 Thread Giuseppe Civitella
Hi all,

last week I installed a new ceph cluster on 3 VMs running Ubuntu 14.04 with
the default kernel.
There is a ceph monitor and two OSD hosts. Here are some details:
ceph -s
cluster c46d5b02-dab1-40bf-8a3d-f8e4a77b79da
 health HEALTH_WARN 192 pgs degraded; 192 pgs stuck unclean
 monmap e1: 1 mons at {ceph-mon1=10.1.1.83:6789/0}, election epoch 1,
quorum 0 ceph-mon1
 osdmap e83: 6 osds: 6 up, 6 in
  pgmap v231: 192 pgs, 3 pools, 0 bytes data, 0 objects
207 MB used, 30446 MB / 30653 MB avail
 192 active+degraded

root@ceph-mon1:/home/ceph# ceph osd dump
epoch 99
fsid c46d5b02-dab1-40bf-8a3d-f8e4a77b79da
created 2014-12-06 13:15:06.418843
modified 2014-12-09 11:38:04.353279
flags
pool 0 'data' replicated size 2 min_size 1 crush_ruleset 0 object_hash
rjenkins pg_num 64 pgp_num 64 last_change 18 flags hashpspool
crash_replay_interval 45 stripe_width 0
pool 1 'metadata' replicated size 2 min_size 1 crush_ruleset 0 object_hash
rjenkins pg_num 64 pgp_num 64 last_change 19 flags hashpspool stripe_width 0
pool 2 'rbd' replicated size 2 min_size 1 crush_ruleset 0 object_hash
rjenkins pg_num 64 pgp_num 64 last_change 20 flags hashpspool stripe_width 0
max_osd 6
osd.0 up   in  weight 1 up_from 90 up_thru 90 down_at 89
last_clean_interval [58,89) 10.1.1.84:6805/995 10.1.1.84:6806/4000995
10.1.1.84:6807/4000995 10.1.1.84:6808/4000995 exists,up
e3895075-614d-48e2-b956-96e13dbd87fe
osd.1 up   in  weight 1 up_from 88 up_thru 0 down_at 87 last_clean_interval
[8,87) 10.1.1.85:6800/23146 10.1.1.85:6815/7023146 10.1.1.85:6816/7023146
10.1.1.85:6817/7023146 exists,up 144bc6ee-2e3d-4118-a460-8cc2bb3ec3e8
osd.2 up   in  weight 1 up_from 61 up_thru 0 down_at 60 last_clean_interval
[11,60) 10.1.1.85:6805/26784 10.1.1.85:6802/5026784 10.1.1.85:6811/5026784
10.1.1.85:6812/5026784 exists,up 8d5c7108-ef11-4947-b28c-8e20371d6d78
osd.3 up   in  weight 1 up_from 95 up_thru 0 down_at 94 last_clean_interval
[57,94) 10.1.1.84:6800/810 10.1.1.84:6810/3000810 10.1.1.84:6811/3000810
10.1.1.84:6812/3000810 exists,up bd762b2d-f94c-4879-8865-cecd63895557
osd.4 up   in  weight 1 up_from 97 up_thru 0 down_at 96 last_clean_interval
[74,96) 10.1.1.84:6801/9304 10.1.1.84:6802/2009304 10.1.1.84:6803/2009304
10.1.1.84:6813/2009304 exists,up 7d28a54b-b474-4369-b958-9e6bf6c856aa
osd.5 up   in  weight 1 up_from 99 up_thru 0 down_at 98 last_clean_interval
[79,98) 10.1.1.85:6801/19513 10.1.1.85:6808/2019513 10.1.1.85:6810/2019513
10.1.1.85:6813/2019513 exists,up f4d76875-0e40-487c-a26d-320f8b8d60c5

root@ceph-mon1:/home/ceph# ceph osd tree
# id    weight  type name   up/down reweight
-1  0   root default
-2  0   host ceph-osd1
0   0   osd.0   up  1
3   0   osd.3   up  1
4   0   osd.4   up  1
-3  0   host ceph-osd2
1   0   osd.1   up  1
2   0   osd.2   up  1
5   0   osd.5   up  1

The current HEALTH_WARN state has said 192 active+degraded since I rebooted
an OSD host. Previously it was incomplete. The cluster has never reached a
HEALTH_OK state.
Any hint about what to do next to get a healthy cluster?


Re: [ceph-users] active+degraded on an empty new cluster

2014-12-09 Thread Giuseppe Civitella
Hi,

thanks for the quick answer.
I did try force_create_pg on a PG, but it is stuck on creating:
root@ceph-mon1:/home/ceph# ceph pg dump |grep creating
dumped all in format plain
2.2f0   0   0   0   0   0   0   creating
 2014-12-09 13:11:37.384808  0'0 0:0 []  -1  []
 -1  0'0 0.000'0  0.00

root@ceph-mon1:/home/ceph# ceph pg 2.2f query
{ state: active+degraded,
  epoch: 105,
  up: [
0],
  acting: [
0],
  actingbackfill: [
0],
  info: { pgid: 2.2f,
  last_update: 0'0,
  last_complete: 0'0,
  log_tail: 0'0,
  last_user_version: 0,
  last_backfill: MAX,
  purged_snaps: [],
  last_scrub: 0'0,
  last_scrub_stamp: 2014-12-06 14:15:11.499769,
  last_deep_scrub: 0'0,
  last_deep_scrub_stamp: 2014-12-06 14:15:11.499769,
  last_clean_scrub_stamp: 0.00,
  log_size: 0,
  ondisk_log_size: 0,
  stats_invalid: 0,
  stat_sum: { num_bytes: 0,
  num_objects: 0,
  num_object_clones: 0,
  num_object_copies: 0,
  num_objects_missing_on_primary: 0,
  num_objects_degraded: 0,
  num_objects_unfound: 0,
  num_objects_dirty: 0,
  num_whiteouts: 0,
  num_read: 0,
  num_read_kb: 0,
  num_write: 0,
  num_write_kb: 0,
  num_scrub_errors: 0,
  num_shallow_scrub_errors: 0,
  num_deep_scrub_errors: 0,
  num_objects_recovered: 0,
  num_bytes_recovered: 0,
  num_keys_recovered: 0,
  num_objects_omap: 0,
  num_objects_hit_set_archive: 0},
  stat_cat_sum: {},
  up: [
0],
  acting: [
0],
  up_primary: 0,
  acting_primary: 0},
  empty: 1,
  dne: 0,
  incomplete: 0,
  last_epoch_started: 104,
  hit_set_history: { current_last_update: 0'0,
  current_last_stamp: 0.00,
  current_info: { begin: 0.00,
  end: 0.00,
  version: 0'0},
  history: []}},
  peer_info: [],
  recovery_state: [
{ name: Started\/Primary\/Active,
  enter_time: 2014-12-09 12:12:52.760384,
  might_have_unfound: [],
  recovery_progress: { backfill_targets: [],
  waiting_on_backfill: [],
  last_backfill_started: 0\/\/0\/\/-1,
  backfill_info: { begin: 0\/\/0\/\/-1,
  end: 0\/\/0\/\/-1,
  objects: []},
  peer_backfill_info: [],
  backfills_in_flight: [],
  recovering: [],
  pg_backend: { pull_from_peer: [],
  pushing: []}},
  scrub: { scrubber.epoch_start: 0,
  scrubber.active: 0,
  scrubber.block_writes: 0,
  scrubber.finalizing: 0,
  scrubber.waiting_on: 0,
  scrubber.waiting_on_whom: []}},
{ name: Started,
  enter_time: 2014-12-09 12:12:51.845686}],
  agent_state: {}}root@ceph-mon1:/home/ceph#



2014-12-09 13:01 GMT+01:00 Irek Fasikhov malm...@gmail.com:

 Hi.

 http://ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/

 ceph pg force_create_pg pgid


 2014-12-09 14:50 GMT+03:00 Giuseppe Civitella 
 giuseppe.civite...@gmail.com:

 Hi all,

  last week I installed a new ceph cluster on 3 VMs running Ubuntu 14.04
  with the default kernel.
  There is a ceph monitor and two OSD hosts. Here are some details:
 ceph -s
 cluster c46d5b02-dab1-40bf-8a3d-f8e4a77b79da
  health HEALTH_WARN 192 pgs degraded; 192 pgs stuck unclean
  monmap e1: 1 mons at {ceph-mon1=10.1.1.83:6789/0}, election epoch
 1, quorum 0 ceph-mon1
  osdmap e83: 6 osds: 6 up, 6 in
   pgmap v231: 192 pgs, 3 pools, 0 bytes data, 0 objects
 207 MB used, 30446 MB / 30653 MB avail
  192 active+degraded

 root@ceph-mon1:/home/ceph# ceph osd dump
 epoch 99
 fsid c46d5b02-dab1-40bf-8a3d-f8e4a77b79da
 created 2014-12-06 13:15:06.418843
 modified 2014-12-09 11:38:04.353279
 flags
 pool 0 'data' replicated size 2 min_size 1 crush_ruleset 0 object_hash
 rjenkins pg_num 64 pgp_num 64 last_change 18 flags hashpspool
 crash_replay_interval 45 stripe_width 0
 pool 1 'metadata' replicated size 2 min_size 1 crush_ruleset 0
 object_hash rjenkins pg_num 64 pgp_num 64 last_change 19 flags hashpspool
 stripe_width 0
 pool 2 'rbd' replicated size 2 min_size 1 crush_ruleset 0 object_hash
 rjenkins pg_num 64 pgp_num 64 last_change 20 flags hashpspool stripe_width 0
 max_osd 6
 osd.0 up   in  weight 1 up_from 90 up_thru 90 down_at 89
 last_clean_interval [58,89) 10.1.1.84:6805/995 10.1.1.84:6806/4000995
 10.1.1.84:6807/4000995 10.1.1.84:6808/4000995 exists,up
 e3895075-614d-48e2-b956-96e13dbd87fe
 osd.1 up   in  weight 1 up_from 88 up_thru 0 down_at 87
 last_clean_interval