Re: [ceph-users] Erasure coded PGs incomplete

2015-01-09 Thread Nick Fisk
Hi Italo,

 

If you check for a post from me from a couple of days back, you'll see I have done 
exactly this.

 

I created a k=5, m=3 pool over 4 hosts. This ensured that I could lose a whole host 
and then an OSD on another host and the cluster would still be fully operational.

 

I’m not sure if the method I used in the CRUSH map was the best way to achieve 
this, but it seemed to work.
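
For reference, one common way to express that kind of layout in a CRUSH rule is to pick the 
hosts first and then two OSDs on each, so the 8 chunks of a k=5, m=3 pool land on 4 hosts. 
The rule below is only a sketch of that approach (rule name and ruleset number are 
illustrative, not Nick's actual rule):

===
rule ecpool-k5m3 {
        ruleset 1
        type erasure
        min_size 8
        max_size 8
        step set_chooseleaf_tries 5
        step take default
        step choose indep 4 type host
        step chooseleaf indep 2 type osd
        step emit
}
===

Laid out this way, a whole host failure removes at most 2 of the 8 chunks, so losing one host 
plus one more OSD elsewhere still leaves 5 chunks, which is enough to serve and rebuild the data.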

 

Nick

 

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Italo 
Santos
Sent: 08 January 2015 22:35
To: Loic Dachary
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Erasure coded PGs incomplete

 

Thanks for your answer, but another doubt has come up…

 

Suppose I have 4 hosts with an erasure pool created with k=3, m=1 and failure 
domain by host, and I lose a host. In this case I’ll face the same issue as at the 
beginning of this thread, because k+m > number of hosts, right?

 

- In this scenario, with one host less, will I still be able to read and write data 
on the cluster?

- To solve the issue, will I need to add another host to the cluster?
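
Whether I/O keeps working in that situation comes down to how many chunks each PG still has 
compared to the pool's min_size. A quick way to check, assuming the pool is called ecpool 
(the name is only an example):

===
ceph osd pool get ecpool min_size
ceph health detail
===

With k=3, m=1 and one of four hosts down, each PG is left with exactly 3 chunks; if min_size 
is higher than that, the PGs go incomplete, much like the output quoted later in this thread.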

 

Regards.

 

Italo Santos

http://italosantos.com.br/

 

On Wednesday, December 17, 2014 at 20:19, Loic Dachary wrote:

 

 

On 17/12/2014 19:46, Italo Santos wrote:
Understood.

Thanks for your help, the cluster is healthy now :D

 

Also, using for example k=6, m=1 and failure domain by host, I’ll be able to lose 
all OSDs on the same host, but if I lose 2 disks on different hosts I can lose data, 
right? So, is it possible to have a failure domain which allows me to lose an 
OSD or a host?

 

That's actually a good way to put it :-)

 

 

Regards.

 

*Italo Santos*

http://italosantos.com.br/

 

On Wednesday, December 17, 2014 at 4:27 PM, Loic Dachary wrote:

 

 

 

On 17/12/2014 19:22, Italo Santos wrote:

Loic,

 

So, if I want to have a failure domain by host, I’ll need to set up an erasure 
profile where k+m = total number of hosts I have, right?

 

Yes, k+m has to be <= number of hosts.
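
A quick sanity check is to compare k+m against the number of host buckets CRUSH knows about, 
for example (output format varies between versions, so the grep is only a rough sketch):

===
ceph osd tree | grep -c 'host '
===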

 

 

Regards.

 

*Italo Santos*

http://italosantos.com.br/

 

On Wednesday, December 17, 2014 at 3:24 PM, Loic Dachary wrote:

 

 

 

On 17/12/2014 18:18, Italo Santos wrote:

Hello,

 

I’ve taken a look at this documentation (which helped a lot) and, if I understand 
right, when I set a profile like:

 

===

ceph osd erasure-code-profile set isilon k=8 m=2 ruleset-failure-domain=host

===

 

And create a pool following the recommendations in the doc, I’ll need (100*16)/2 = 
800 PGs. Will I need a sufficient number of hosts to support creating that total number of PGs?

 

You will need k+m = 10 hosts, one OSD per host for each PG. If you only have 10 hosts 
that should be ok and the 800 PGs will use these 10 OSDs in various orders. It also means 
that you will end up having 800 PGs per OSD, which is a bit too much. If you have 20 
OSDs that will be better: each PG will get 10 OSDs out of 20 and each OSD will 
have 400 PGs. Ideally you want the number of PGs per OSD to be in the range 
(approximately) [20,300].
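
Roughly, the rule of thumb behind those numbers is PGs per OSD ~ pg_num * (k+m) / number of 
OSDs. A sketch of the arithmetic for the two cases above:

===
# 800 PGs of width k+m=10 spread over 10 OSDs:
echo $(( 800 * 10 / 10 ))   # 800 PGs per OSD -- too many
# the same pool over 20 OSDs:
echo $(( 800 * 10 / 20 ))   # 400 PGs per OSD -- better, though still above ~300
===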

 

Cheers

 

 

Regards.

 

*Italo Santos*

http://italosantos.com.br/

 

On Wednesday, December 17, 2014 at 2:42 PM, Loic Dachary wrote:

 

Hi,

 

Thanks for the update : good news are much appreciated :-) Would you have time 
to review the documentation at https://github.com/ceph/ceph/pull/3194/files ? 
It was partly motivated by the problem you had.

 

Cheers

 

On 17/12/2014 14:03, Italo Santos wrote:

Hello Loic,

 

Thanks for your help. I’ve taken a look at my crush map, replaced “step 
chooseleaf indep 0 type osd” with “step choose indep 0 type osd”, and all PGs were 
created successfully.
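
For anyone wanting to reproduce that kind of change, the usual round trip is to dump, 
decompile, edit and re-inject the CRUSH map; a sketch of the workflow (file names are 
illustrative):

===
ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt
# edit the erasure rule in crushmap.txt, e.g. change
#   step chooseleaf indep 0 type osd
# to
#   step choose indep 0 type osd
crushtool -c crushmap.txt -o crushmap.new
ceph osd setcrushmap -i crushmap.new
===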

 

At.

 

*Italo Santos*

http://italosantos.com.br/

 

On Tuesday, December 16, 2014 at 8:39 PM, Loic Dachary wrote:

 

Hi,

 

The 2147483647 means that CRUSH did not find enough OSD for a given PG. If you 
check the crush rule associated with the erasure coded pool, you will most 
probably find why.
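
To see which rule the pool is using and what its steps are, something along these lines is 
usually enough (the pool name ecpool is taken from the health output quoted later in the 
thread):

===
ceph osd pool get ecpool crush_ruleset
ceph osd crush rule dump
===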

 

Cheers

 

On 16/12/2014 23:32, Italo Santos wrote:

Hello,

 

I'm trying to create an erasure pool following 
http://docs.ceph.com/docs/master/rados/operations/erasure-code/, but when I try to 
create a pool with a specific erasure-code-profile (myprofile) the PGs end up in 
the incomplete state.

 

Anyone can help me?

 

Below the profile I created:

root@ceph0001:~# ceph osd erasure-code-profile get myprofile

directory=/usr/lib/ceph/erasure-code

k=6

m=2

plugin=jerasure

technique=reed_sol_van
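
For context, a profile and pool like this are normally created with commands along the 
following lines; the failure domain and PG count shown here are placeholders, since the 
original commands are not included in the thread (12 matches the number of PGs in the health 
output below):

===
ceph osd erasure-code-profile set myprofile k=6 m=2 ruleset-failure-domain=osd
ceph osd pool create ecpool 12 12 erasure myprofile
===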

 

The status of cluster:

root@ceph0001:~# ceph health

HEALTH_WARN 12 pgs incomplete; 12 pgs stuck inactive; 12 pgs stuck unclean

 

health detail:

root@ceph0001:~# ceph health detail

HEALTH_WARN 12 pgs incomplete; 12 pgs stuck inactive; 12 pgs stuck unclean

pg 2.9 is stuck inactive since forever, current state incomplete, last acting 
[4,10,15,2147483647,3,2147483647,2147483647,2147483647]

pg 2.8 is stuck inactive since forever, current state incomplete, last acting 
[0,2147483647,4,2147483647,10,2147483647,15,2147483647]

pg 2.b is stuck inactive since forever, current state incomplete, last acting

Re: [ceph-users] Erasure coded PGs incomplete

2015-01-08 Thread Italo Santos
Thanks for your answer, but another doubt has come up…

Suppose I have 4 hosts with an erasure pool created with k=3, m=1 and failure 
domain by host, and I lose a host. In this case I’ll face the same issue as at the 
beginning of this thread, because k+m > number of hosts, right?

- In this scenario, with one host less, will I still be able to read and write data 
on the cluster?
- To solve the issue, will I need to add another host to the cluster?

Regards.

Italo Santos
http://italosantos.com.br/


On Wednesday, December 17, 2014 at 20:19, Loic Dachary wrote:

  
  
 On 17/12/2014 19:46, Italo Santos wrote:
  Understood.
  Thanks for your help, the cluster is healthy now :D
   
  Also, using for example k=6, m=1 and failure domain by host, I’ll be able to 
  lose all OSDs on the same host, but if I lose 2 disks on different hosts I 
  can lose data, right? So, is it possible to have a failure domain which allows 
  me to lose an OSD or a host?
  
 That's actually a good way to put it :-)
  
   
  Regards.
   
  *Italo Santos*
  http://italosantos.com.br/
   
  On Wednesday, December 17, 2014 at 4:27 PM, Loic Dachary wrote:
   


   On 17/12/2014 19:22, Italo Santos wrote:
Loic,
 
So, if I want to have a failure domain by host, I’ll need to set up an erasure 
profile where k+m = total number of hosts I have, right?

   Yes, k+m has to be <= number of hosts.

 
Regards.
 
*Italo Santos*
http://italosantos.com.br/
 
On Wednesday, December 17, 2014 at 3:24 PM, Loic Dachary wrote:
 
  
  
 On 17/12/2014 18:18, Italo Santos wrote:
  Hello,
   
  I’ve taken a look at this documentation (which helped a lot) and, if I 
  understand right, when I set a profile like:
   
  ===
  ceph osd erasure-code-profile set isilon k=8 m=2 
  ruleset-failure-domain=host
  ===
   
  And create a pool following the recommendations in the doc, I’ll need 
  (100*16)/2 = 800 PGs. Will I need a sufficient number of hosts to 
  support creating that total number of PGs?
  
 You will need k+m = 10 hosts, one OSD per host for each PG. If you only have 
 10 hosts that should be ok and the 800 PGs will use these 10 OSDs in various 
 orders. It also means that you will end up having 800 PGs per OSD which is a 
 bit too much. If you have 20 OSDs that will be better: each PG will 
 get 10 OSDs out of 20 and each OSD will have 400 PGs. Ideally you want 
 the number of PGs per OSD to be in the range (approximately) [20,300].
  
 Cheers
  
   
  Regards.
   
  *Italo Santos*
  http://italosantos.com.br/
   
  On Wednesday, December 17, 2014 at 2:42 PM, Loic Dachary wrote:
   
   Hi,

   Thanks for the update : good news are much appreciated :-) Would 
   you have time to review the documentation at 
   https://github.com/ceph/ceph/pull/3194/files ? It was partly 
   motivated by the problem you had.

   Cheers

   On 17/12/2014 14:03, Italo Santos wrote:
Hello Loic,
 
Thanks for your help. I’ve taken a look at my crush map and 
replaced “step chooseleaf indep 0 type osd” with “step choose 
indep 0 type osd”, and all PGs were created successfully.
 
At.
 
*Italo Santos*
http://italosantos.com.br/
 
On Tuesday, December 16, 2014 at 8:39 PM, Loic Dachary wrote:
 
 Hi,
  
 The 2147483647 means that CRUSH did not find enough OSD for a 
 given PG. If you check the crush rule associated with the 
 erasure coded pool, you will most probably find why.
  
 Cheers
  
 On 16/12/2014 23:32, Italo Santos wrote:
  Hello,
   
  I'm trying to create an erasure pool following 
  http://docs.ceph.com/docs/master/rados/operations/erasure-code/, 
  but when I try to create a pool with a specific 
  erasure-code-profile (myprofile) the PGs end up in the 
  incomplete state.
   
  Anyone can help me?
   
  Below the profile I created:
  root@ceph0001:~# ceph osd erasure-code-profile get myprofile
  directory=/usr/lib/ceph/erasure-code
  k=6
  m=2
  plugin=jerasure
  technique=reed_sol_van
   
  The status of cluster:
  root@ceph0001:~# ceph health
  HEALTH_WARN 12 pgs incomplete; 12 pgs stuck inactive; 12 
  pgs stuck unclean
   
  health detail:
  root@ceph0001:~# ceph health detail
  HEALTH_WARN 12 pgs incomplete; 12 pgs stuck inactive; 12 
  pgs stuck unclean
  pg 2.9 is stuck inactive since forever, current state 
  incomplete, last acting 
  [4,10,15,2147483647,3,2147483647,2147483647,2147483647]
  pg 2.8 is stuck inactive since forever, current state 
  incomplete, last acting 
  

Re: [ceph-users] Erasure coded PGs incomplete

2014-12-17 Thread Italo Santos
Hello Loic,  

Thanks for your help. I’ve taken a look at my crush map and replaced “step 
chooseleaf indep 0 type osd” with “step choose indep 0 type osd”, and all PGs were 
created successfully.

At.

Italo Santos
http://italosantos.com.br/


On Tuesday, December 16, 2014 at 8:39 PM, Loic Dachary wrote:

 Hi,
  
 The 2147483647 means that CRUSH did not find enough OSD for a given PG. If 
 you check the crush rule associated with the erasure coded pool, you will 
 most probably find why.
  
 Cheers
  
 On 16/12/2014 23:32, Italo Santos wrote:
  Hello,
   
  I'm trying to create an erasure pool following 
  http://docs.ceph.com/docs/master/rados/operations/erasure-code/, but when I 
  try to create a pool with a specific erasure-code-profile (myprofile) the PGs 
  end up in the incomplete state.
   
  Anyone can help me?
   
  Below the profile I created:
  root@ceph0001:~# ceph osd erasure-code-profile get myprofile
  directory=/usr/lib/ceph/erasure-code
  k=6
  m=2
  plugin=jerasure
  technique=reed_sol_van
   
  The status of cluster:
  root@ceph0001:~# ceph health
  HEALTH_WARN 12 pgs incomplete; 12 pgs stuck inactive; 12 pgs stuck unclean
   
  health detail:
  root@ceph0001:~# ceph health detail
  HEALTH_WARN 12 pgs incomplete; 12 pgs stuck inactive; 12 pgs stuck unclean
  pg 2.9 is stuck inactive since forever, current state incomplete, last 
  acting [4,10,15,2147483647,3,2147483647,2147483647,2147483647]
  pg 2.8 is stuck inactive since forever, current state incomplete, last 
  acting [0,2147483647,4,2147483647,10,2147483647,15,2147483647]
  pg 2.b is stuck inactive since forever, current state incomplete, last 
  acting [8,3,14,2147483647,5,2147483647,2147483647,2147483647]
  pg 2.a is stuck inactive since forever, current state incomplete, last 
  acting [11,7,2,2147483647,2147483647,2147483647,15,2147483647]
  pg 2.5 is stuck inactive since forever, current state incomplete, last 
  acting [12,8,5,1,2147483647,2147483647,2147483647,2147483647]
  pg 2.4 is stuck inactive since forever, current state incomplete, last 
  acting [5,2147483647,13,1,2147483647,2147483647,8,2147483647]
  pg 2.7 is stuck inactive since forever, current state incomplete, last 
  acting [12,2,10,7,2147483647,2147483647,2147483647,2147483647]
  pg 2.6 is stuck inactive since forever, current state incomplete, last 
  acting [9,15,2147483647,4,2,2147483647,2147483647,2147483647]
  pg 2.1 is stuck inactive since forever, current state incomplete, last 
  acting [2,4,2147483647,13,2147483647,10,2147483647,2147483647]
  pg 2.0 is stuck inactive since forever, current state incomplete, last 
  acting [14,1,2147483647,4,10,2147483647,2147483647,2147483647]
  pg 2.3 is stuck inactive since forever, current state incomplete, last 
  acting [14,11,6,2147483647,2147483647,2147483647,2,2147483647]
  pg 2.2 is stuck inactive since forever, current state incomplete, last 
  acting [13,5,11,2147483647,2147483647,3,2147483647,2147483647]
  pg 2.9 is stuck unclean since forever, current state incomplete, last 
  acting [4,10,15,2147483647,3,2147483647,2147483647,2147483647]
  pg 2.8 is stuck unclean since forever, current state incomplete, last 
  acting [0,2147483647,4,2147483647,10,2147483647,15,2147483647]
  pg 2.b is stuck unclean since forever, current state incomplete, last 
  acting [8,3,14,2147483647,5,2147483647,2147483647,2147483647]
  pg 2.a is stuck unclean since forever, current state incomplete, last 
  acting [11,7,2,2147483647,2147483647,2147483647,15,2147483647]
  pg 2.5 is stuck unclean since forever, current state incomplete, last 
  acting [12,8,5,1,2147483647,2147483647,2147483647,2147483647]
  pg 2.4 is stuck unclean since forever, current state incomplete, last 
  acting [5,2147483647,13,1,2147483647,2147483647,8,2147483647]
  pg 2.7 is stuck unclean since forever, current state incomplete, last 
  acting [12,2,10,7,2147483647,2147483647,2147483647,2147483647]
  pg 2.6 is stuck unclean since forever, current state incomplete, last 
  acting [9,15,2147483647,4,2,2147483647,2147483647,2147483647]
  pg 2.1 is stuck unclean since forever, current state incomplete, last 
  acting [2,4,2147483647,13,2147483647,10,2147483647,2147483647]
  pg 2.0 is stuck unclean since forever, current state incomplete, last 
  acting [14,1,2147483647,4,10,2147483647,2147483647,2147483647]
  pg 2.3 is stuck unclean since forever, current state incomplete, last 
  acting [14,11,6,2147483647,2147483647,2147483647,2,2147483647]
  pg 2.2 is stuck unclean since forever, current state incomplete, last 
  acting [13,5,11,2147483647,2147483647,3,2147483647,2147483647]
  pg 2.9 is incomplete, acting 
  [4,10,15,2147483647,3,2147483647,2147483647,2147483647] (reducing pool 
  ecpool min_size from 6 may help; search ceph.com/docs 
  (http://ceph.com/docs) for 'incomplete')
  pg 2.8 is incomplete, acting 
  [0,2147483647,4,2147483647,10,2147483647,15,2147483647] (reducing pool 
  ecpool min_size from 6 may help; search ceph.com/docs 
  (http://ceph.com/docs) for 

Re: [ceph-users] Erasure coded PGs incomplete

2014-12-17 Thread Loic Dachary
Hi,

Thanks for the update : good news are much appreciated :-) Would you have time 
to review the documentation at https://github.com/ceph/ceph/pull/3194/files ? 
It was partly motivated by the problem you had.

Cheers

On 17/12/2014 14:03, Italo Santos wrote:
 Hello Loic,
 
 Thanks for your help. I’ve taken a look at my crush map and replaced “step 
 chooseleaf indep 0 type osd” with “step choose indep 0 type osd”, and all PGs 
 were created successfully.
 
 At.
 
 *Italo Santos*
 http://italosantos.com.br/
 
 On Tuesday, December 16, 2014 at 8:39 PM, Loic Dachary wrote:
 
 Hi,

 The 2147483647 means that CRUSH did not find enough OSD for a given PG. If 
 you check the crush rule associated with the erasure coded pool, you will 
 most probably find why.

 Cheers

 On 16/12/2014 23:32, Italo Santos wrote:
 Hello,

 I'm trying to create an erasure pool following 
 http://docs.ceph.com/docs/master/rados/operations/erasure-code/, but when I 
 try to create a pool with a specific erasure-code-profile (myprofile) the PGs 
 end up in the incomplete state.

 Anyone can help me?

 Below the profile I created:
 root@ceph0001:~# ceph osd erasure-code-profile get myprofile
 directory=/usr/lib/ceph/erasure-code
 k=6
 m=2
 plugin=jerasure
 technique=reed_sol_van

 The status of cluster:
 root@ceph0001:~# ceph health
 HEALTH_WARN 12 pgs incomplete; 12 pgs stuck inactive; 12 pgs stuck unclean

 health detail:
 root@ceph0001:~# ceph health detail
 HEALTH_WARN 12 pgs incomplete; 12 pgs stuck inactive; 12 pgs stuck unclean
 pg 2.9 is stuck inactive since forever, current state incomplete, last 
 acting [4,10,15,2147483647,3,2147483647,2147483647,2147483647]
 pg 2.8 is stuck inactive since forever, current state incomplete, last 
 acting [0,2147483647,4,2147483647,10,2147483647,15,2147483647]
 pg 2.b is stuck inactive since forever, current state incomplete, last 
 acting [8,3,14,2147483647,5,2147483647,2147483647,2147483647]
 pg 2.a is stuck inactive since forever, current state incomplete, last 
 acting [11,7,2,2147483647,2147483647,2147483647,15,2147483647]
 pg 2.5 is stuck inactive since forever, current state incomplete, last 
 acting [12,8,5,1,2147483647,2147483647,2147483647,2147483647]
 pg 2.4 is stuck inactive since forever, current state incomplete, last 
 acting [5,2147483647,13,1,2147483647,2147483647,8,2147483647]
 pg 2.7 is stuck inactive since forever, current state incomplete, last 
 acting [12,2,10,7,2147483647,2147483647,2147483647,2147483647]
 pg 2.6 is stuck inactive since forever, current state incomplete, last 
 acting [9,15,2147483647,4,2,2147483647,2147483647,2147483647]
 pg 2.1 is stuck inactive since forever, current state incomplete, last 
 acting [2,4,2147483647,13,2147483647,10,2147483647,2147483647]
 pg 2.0 is stuck inactive since forever, current state incomplete, last 
 acting [14,1,2147483647,4,10,2147483647,2147483647,2147483647]
 pg 2.3 is stuck inactive since forever, current state incomplete, last 
 acting [14,11,6,2147483647,2147483647,2147483647,2,2147483647]
 pg 2.2 is stuck inactive since forever, current state incomplete, last 
 acting [13,5,11,2147483647,2147483647,3,2147483647,2147483647]
 pg 2.9 is stuck unclean since forever, current state incomplete, last 
 acting [4,10,15,2147483647,3,2147483647,2147483647,2147483647]
 pg 2.8 is stuck unclean since forever, current state incomplete, last 
 acting [0,2147483647,4,2147483647,10,2147483647,15,2147483647]
 pg 2.b is stuck unclean since forever, current state incomplete, last 
 acting [8,3,14,2147483647,5,2147483647,2147483647,2147483647]
 pg 2.a is stuck unclean since forever, current state incomplete, last 
 acting [11,7,2,2147483647,2147483647,2147483647,15,2147483647]
 pg 2.5 is stuck unclean since forever, current state incomplete, last 
 acting [12,8,5,1,2147483647,2147483647,2147483647,2147483647]
 pg 2.4 is stuck unclean since forever, current state incomplete, last 
 acting [5,2147483647,13,1,2147483647,2147483647,8,2147483647]
 pg 2.7 is stuck unclean since forever, current state incomplete, last 
 acting [12,2,10,7,2147483647,2147483647,2147483647,2147483647]
 pg 2.6 is stuck unclean since forever, current state incomplete, last 
 acting [9,15,2147483647,4,2,2147483647,2147483647,2147483647]
 pg 2.1 is stuck unclean since forever, current state incomplete, last 
 acting [2,4,2147483647,13,2147483647,10,2147483647,2147483647]
 pg 2.0 is stuck unclean since forever, current state incomplete, last 
 acting [14,1,2147483647,4,10,2147483647,2147483647,2147483647]
 pg 2.3 is stuck unclean since forever, current state incomplete, last 
 acting [14,11,6,2147483647,2147483647,2147483647,2,2147483647]
 pg 2.2 is stuck unclean since forever, current state incomplete, last 
 acting [13,5,11,2147483647,2147483647,3,2147483647,2147483647]
 pg 2.9 is incomplete, acting 
 [4,10,15,2147483647,3,2147483647,2147483647,2147483647] (reducing pool 
 ecpool min_size from 6 may help; search ceph.com/docs 
 http://ceph.com/docs for 'incomplete')
 pg 2.8 is 

Re: [ceph-users] Erasure coded PGs incomplete

2014-12-17 Thread Italo Santos
Hello,  

I’ve taken a look at this documentation (which helped a lot) and, if I understand 
right, when I set a profile like:

===
ceph osd erasure-code-profile set isilon k=8 m=2 ruleset-failure-domain=host
===

And create a pool following the recommendations in the doc, I’ll need (100*16)/2 = 
800 PGs. Will I need a sufficient number of hosts to support creating that total number of PGs?

Regards.

Italo Santos
http://italosantos.com.br/


On Wednesday, December 17, 2014 at 2:42 PM, Loic Dachary wrote:

 Hi,
  
 Thanks for the update : good news are much appreciated :-) Would you have 
 time to review the documentation at 
 https://github.com/ceph/ceph/pull/3194/files ? It was partly motivated by the 
 problem you had.
  
 Cheers
  
 On 17/12/2014 14:03, Italo Santos wrote:
  Hello Loic,
   
  Thanks for your help. I’ve taken a look at my crush map and replaced “step 
  chooseleaf indep 0 type osd” with “step choose indep 0 type osd”, and all PGs 
  were created successfully.
   
  At.
   
  *Italo Santos*
  http://italosantos.com.br/
   
  On Tuesday, December 16, 2014 at 8:39 PM, Loic Dachary wrote:
   
   Hi,

   The 2147483647 means that CRUSH did not find enough OSD for a given PG. 
   If you check the crush rule associated with the erasure coded pool, you 
   will most probably find why.

   Cheers

   On 16/12/2014 23:32, Italo Santos wrote:
Hello,
 
I'm trying to create an erasure pool following 
http://docs.ceph.com/docs/master/rados/operations/erasure-code/, but 
when I try to create a pool with a specific erasure-code-profile 
(myprofile) the PGs end up in the incomplete state.
 
Anyone can help me?
 
Below the profile I created:
root@ceph0001:~# ceph osd erasure-code-profile get myprofile
directory=/usr/lib/ceph/erasure-code
k=6
m=2
plugin=jerasure
technique=reed_sol_van
 
The status of cluster:
root@ceph0001:~# ceph health
HEALTH_WARN 12 pgs incomplete; 12 pgs stuck inactive; 12 pgs stuck 
unclean
 
health detail:
root@ceph0001:~# ceph health detail
HEALTH_WARN 12 pgs incomplete; 12 pgs stuck inactive; 12 pgs stuck 
unclean
pg 2.9 is stuck inactive since forever, current state incomplete, last 
acting [4,10,15,2147483647,3,2147483647,2147483647,2147483647]
pg 2.8 is stuck inactive since forever, current state incomplete, last 
acting [0,2147483647,4,2147483647,10,2147483647,15,2147483647]
pg 2.b is stuck inactive since forever, current state incomplete, last 
acting [8,3,14,2147483647,5,2147483647,2147483647,2147483647]
pg 2.a is stuck inactive since forever, current state incomplete, last 
acting [11,7,2,2147483647,2147483647,2147483647,15,2147483647]
pg 2.5 is stuck inactive since forever, current state incomplete, last 
acting [12,8,5,1,2147483647,2147483647,2147483647,2147483647]
pg 2.4 is stuck inactive since forever, current state incomplete, last 
acting [5,2147483647,13,1,2147483647,2147483647,8,2147483647]
pg 2.7 is stuck inactive since forever, current state incomplete, last 
acting [12,2,10,7,2147483647,2147483647,2147483647,2147483647]
pg 2.6 is stuck inactive since forever, current state incomplete, last 
acting [9,15,2147483647,4,2,2147483647,2147483647,2147483647]
pg 2.1 is stuck inactive since forever, current state incomplete, last 
acting [2,4,2147483647,13,2147483647,10,2147483647,2147483647]
pg 2.0 is stuck inactive since forever, current state incomplete, last 
acting [14,1,2147483647,4,10,2147483647,2147483647,2147483647]
pg 2.3 is stuck inactive since forever, current state incomplete, last 
acting [14,11,6,2147483647,2147483647,2147483647,2,2147483647]
pg 2.2 is stuck inactive since forever, current state incomplete, last 
acting [13,5,11,2147483647,2147483647,3,2147483647,2147483647]
pg 2.9 is stuck unclean since forever, current state incomplete, last 
acting [4,10,15,2147483647,3,2147483647,2147483647,2147483647]
pg 2.8 is stuck unclean since forever, current state incomplete, last 
acting [0,2147483647,4,2147483647,10,2147483647,15,2147483647]
pg 2.b is stuck unclean since forever, current state incomplete, last 
acting [8,3,14,2147483647,5,2147483647,2147483647,2147483647]
pg 2.a is stuck unclean since forever, current state incomplete, last 
acting [11,7,2,2147483647,2147483647,2147483647,15,2147483647]
pg 2.5 is stuck unclean since forever, current state incomplete, last 
acting [12,8,5,1,2147483647,2147483647,2147483647,2147483647]
pg 2.4 is stuck unclean since forever, current state incomplete, last 
acting [5,2147483647,13,1,2147483647,2147483647,8,2147483647]
pg 2.7 is stuck unclean since forever, current state incomplete, last 
acting [12,2,10,7,2147483647,2147483647,2147483647,2147483647]
pg 2.6 is stuck unclean since forever, current state incomplete, last 
acting [9,15,2147483647,4,2,2147483647,2147483647,2147483647]
 

Re: [ceph-users] Erasure coded PGs incomplete

2014-12-17 Thread Loic Dachary


On 17/12/2014 18:18, Italo Santos wrote:
 Hello,
 
 I’ve taken a look at this documentation (which helped a lot) and, if I understand 
 right, when I set a profile like:
 
 ===
 ceph osd erasure-code-profile set isilon k=8 m=2 ruleset-failure-domain=host
 ===
 
 And create a pool following the recommendations in the doc, I’ll need (100*16)/2 
 = 800 PGs. Will I need a sufficient number of hosts to support creating that 
 total number of PGs?

You will need k+m = 10 hosts, one OSD per host for each PG. If you only have 10 
hosts that should be ok and the 800 PGs will use these 10 OSDs in various orders. 
It also means that you will end up having 800 PGs per OSD which is a bit too much. 
If you have 20 OSDs that will be better: each PG will get 10 OSDs out of 20 and 
each OSD will have 400 PGs. Ideally you want the number of PGs per OSD to be in 
the range (approximately) [20,300].

Cheers

 
 Regards.
 
 *Italo Santos*
 http://italosantos.com.br/
 
 On Wednesday, December 17, 2014 at 2:42 PM, Loic Dachary wrote:
 
 Hi,

 Thanks for the update : good news are much appreciated :-) Would you have 
 time to review the documentation at 
 https://github.com/ceph/ceph/pull/3194/files ? It was partly motivated by 
 the problem you had.

 Cheers

 On 17/12/2014 14:03, Italo Santos wrote:
 Hello Loic,

 Thanks for your help. I’ve taken a look at my crush map and replaced “step 
 chooseleaf indep 0 type osd” with “step choose indep 0 type osd”, and all PGs 
 were created successfully.

 At.

 *Italo Santos*
 http://italosantos.com.br/

 On Tuesday, December 16, 2014 at 8:39 PM, Loic Dachary wrote:

 Hi,

 The 2147483647 means that CRUSH did not find enough OSD for a given PG. If 
 you check the crush rule associated with the erasure coded pool, you will 
 most probably find why.

 Cheers

 On 16/12/2014 23:32, Italo Santos wrote:
 Hello,

 I'm trying to create an erasure pool following 
 http://docs.ceph.com/docs/master/rados/operations/erasure-code/, but when 
 I try to create a pool with a specific erasure-code-profile (myprofile) the 
 PGs end up in the incomplete state.

 Anyone can help me?

 Below the profile I created:
 root@ceph0001:~# ceph osd erasure-code-profile get myprofile
 directory=/usr/lib/ceph/erasure-code
 k=6
 m=2
 plugin=jerasure
 technique=reed_sol_van

 The status of cluster:
 root@ceph0001:~# ceph health
 HEALTH_WARN 12 pgs incomplete; 12 pgs stuck inactive; 12 pgs stuck unclean

 health detail:
 root@ceph0001:~# ceph health detail
 HEALTH_WARN 12 pgs incomplete; 12 pgs stuck inactive; 12 pgs stuck unclean
 pg 2.9 is stuck inactive since forever, current state incomplete, last 
 acting [4,10,15,2147483647,3,2147483647,2147483647,2147483647]
 pg 2.8 is stuck inactive since forever, current state incomplete, last 
 acting [0,2147483647,4,2147483647,10,2147483647,15,2147483647]
 pg 2.b is stuck inactive since forever, current state incomplete, last 
 acting [8,3,14,2147483647,5,2147483647,2147483647,2147483647]
 pg 2.a is stuck inactive since forever, current state incomplete, last 
 acting [11,7,2,2147483647,2147483647,2147483647,15,2147483647]
 pg 2.5 is stuck inactive since forever, current state incomplete, last 
 acting [12,8,5,1,2147483647,2147483647,2147483647,2147483647]
 pg 2.4 is stuck inactive since forever, current state incomplete, last 
 acting [5,2147483647,13,1,2147483647,2147483647,8,2147483647]
 pg 2.7 is stuck inactive since forever, current state incomplete, last 
 acting [12,2,10,7,2147483647,2147483647,2147483647,2147483647]
 pg 2.6 is stuck inactive since forever, current state incomplete, last 
 acting [9,15,2147483647,4,2,2147483647,2147483647,2147483647]
 pg 2.1 is stuck inactive since forever, current state incomplete, last 
 acting [2,4,2147483647,13,2147483647,10,2147483647,2147483647]
 pg 2.0 is stuck inactive since forever, current state incomplete, last 
 acting [14,1,2147483647,4,10,2147483647,2147483647,2147483647]
 pg 2.3 is stuck inactive since forever, current state incomplete, last 
 acting [14,11,6,2147483647,2147483647,2147483647,2,2147483647]
 pg 2.2 is stuck inactive since forever, current state incomplete, last 
 acting [13,5,11,2147483647,2147483647,3,2147483647,2147483647]
 pg 2.9 is stuck unclean since forever, current state incomplete, last 
 acting [4,10,15,2147483647,3,2147483647,2147483647,2147483647]
 pg 2.8 is stuck unclean since forever, current state incomplete, last 
 acting [0,2147483647,4,2147483647,10,2147483647,15,2147483647]
 pg 2.b is stuck unclean since forever, current state incomplete, last 
 acting [8,3,14,2147483647,5,2147483647,2147483647,2147483647]
 pg 2.a is stuck unclean since forever, current state incomplete, last 
 acting [11,7,2,2147483647,2147483647,2147483647,15,2147483647]
 pg 2.5 is stuck unclean since forever, current state incomplete, last 
 acting [12,8,5,1,2147483647,2147483647,2147483647,2147483647]
 pg 2.4 is stuck unclean since forever, current state incomplete, last 
 acting [5,2147483647,13,1,2147483647,2147483647,8,2147483647]
 pg 2.7 is stuck unclean since forever, 

Re: [ceph-users] Erasure coded PGs incomplete

2014-12-17 Thread Italo Santos
Loic,

So, if I want to have a failure domain by host, I’ll need to set up an erasure profile 
where k+m = total number of hosts I have, right?

Regards.

Italo Santos
http://italosantos.com.br/


On Wednesday, December 17, 2014 at 3:24 PM, Loic Dachary wrote:

  
  
 On 17/12/2014 18:18, Italo Santos wrote:
  Hello,
   
  I’ve taken a look at this documentation (which helped a lot) and, if I 
  understand right, when I set a profile like:
   
  ===
  ceph osd erasure-code-profile set isilon k=8 m=2 ruleset-failure-domain=host
  ===
   
  And create a pool following the recommendations in the doc, I’ll need 
  (100*16)/2 = 800 PGs. Will I need a sufficient number of hosts to support 
  creating that total number of PGs?
  
 You will need k+m = 10 hosts, one OSD per host for each PG. If you only have 10 
 hosts that should be ok and the 800 PGs will use these 10 OSDs in various orders. 
 It also means that you will end up having 800 PGs per OSD which is a bit too much. 
 If you have 20 OSDs that will be better: each PG will get 10 OSDs out of 20 and each 
 OSD will have 400 PGs. Ideally you want the number of PGs per OSD to be in the 
 range (approximately) [20,300].
  
 Cheers
  
   
  Regards.
   
  *Italo Santos*
  http://italosantos.com.br/
   
  On Wednesday, December 17, 2014 at 2:42 PM, Loic Dachary wrote:
   
   Hi,

   Thanks for the update : good news are much appreciated :-) Would you have 
   time to review the documentation at 
   https://github.com/ceph/ceph/pull/3194/files ? It was partly motivated by 
   the problem you had.

   Cheers

   On 17/12/2014 14:03, Italo Santos wrote:
Hello Loic,
 
Thanks for your help. I’ve taken a look at my crush map and replaced 
“step chooseleaf indep 0 type osd” with “step choose indep 0 type osd”, 
and all PGs were created successfully.
 
At.
 
*Italo Santos*
http://italosantos.com.br/
 
On Tuesday, December 16, 2014 at 8:39 PM, Loic Dachary wrote:
 
 Hi,
  
 The 2147483647 means that CRUSH did not find enough OSD for a given 
 PG. If you check the crush rule associated with the erasure coded 
 pool, you will most probably find why.
  
 Cheers
  
 On 16/12/2014 23:32, Italo Santos wrote:
  Hello,
   
  I'm trying to create an erasure pool following 
  http://docs.ceph.com/docs/master/rados/operations/erasure-code/, 
  but when I try to create a pool with a specific erasure-code-profile 
  (myprofile) the PGs end up in the incomplete state.
   
  Anyone can help me?
   
  Below the profile I created:
  root@ceph0001:~# ceph osd erasure-code-profile get myprofile
  directory=/usr/lib/ceph/erasure-code
  k=6
  m=2
  plugin=jerasure
  technique=reed_sol_van
   
  The status of cluster:
  root@ceph0001:~# ceph health
  HEALTH_WARN 12 pgs incomplete; 12 pgs stuck inactive; 12 pgs stuck 
  unclean
   
  health detail:
  root@ceph0001:~# ceph health detail
  HEALTH_WARN 12 pgs incomplete; 12 pgs stuck inactive; 12 pgs stuck 
  unclean
  pg 2.9 is stuck inactive since forever, current state incomplete, 
  last acting [4,10,15,2147483647,3,2147483647,2147483647,2147483647]
  pg 2.8 is stuck inactive since forever, current state incomplete, 
  last acting [0,2147483647,4,2147483647,10,2147483647,15,2147483647]
  pg 2.b is stuck inactive since forever, current state incomplete, 
  last acting [8,3,14,2147483647,5,2147483647,2147483647,2147483647]
  pg 2.a is stuck inactive since forever, current state incomplete, 
  last acting [11,7,2,2147483647,2147483647,2147483647,15,2147483647]
  pg 2.5 is stuck inactive since forever, current state incomplete, 
  last acting [12,8,5,1,2147483647,2147483647,2147483647,2147483647]
  pg 2.4 is stuck inactive since forever, current state incomplete, 
  last acting [5,2147483647,13,1,2147483647,2147483647,8,2147483647]
  pg 2.7 is stuck inactive since forever, current state incomplete, 
  last acting [12,2,10,7,2147483647,2147483647,2147483647,2147483647]
  pg 2.6 is stuck inactive since forever, current state incomplete, 
  last acting [9,15,2147483647,4,2,2147483647,2147483647,2147483647]
  pg 2.1 is stuck inactive since forever, current state incomplete, 
  last acting [2,4,2147483647,13,2147483647,10,2147483647,2147483647]
  pg 2.0 is stuck inactive since forever, current state incomplete, 
  last acting [14,1,2147483647,4,10,2147483647,2147483647,2147483647]
  pg 2.3 is stuck inactive since forever, current state incomplete, 
  last acting [14,11,6,2147483647,2147483647,2147483647,2,2147483647]
  pg 2.2 is stuck inactive since forever, current state incomplete, 
  last acting [13,5,11,2147483647,2147483647,3,2147483647,2147483647]
  pg 2.9 is stuck unclean since forever, current state incomplete, 
  last acting [4,10,15,2147483647,3,2147483647,2147483647,2147483647]
  pg 2.8 is stuck unclean 

Re: [ceph-users] Erasure coded PGs incomplete

2014-12-17 Thread Loic Dachary


On 17/12/2014 19:22, Italo Santos wrote:
 Loic,
 
 So, if I want to have a failure domain by host, I’ll need to set up an erasure 
 profile where k+m = total number of hosts I have, right?

Yes, k+m has to be <= number of hosts.

 
 Regards.
 
 *Italo Santos*
 http://italosantos.com.br/
 
 On Wednesday, December 17, 2014 at 3:24 PM, Loic Dachary wrote:
 


 On 17/12/2014 18:18, Italo Santos wrote:
 Hello,

 I’ve taken a look at this documentation (which helped a lot) and, if I 
 understand right, when I set a profile like:

 ===
 ceph osd erasure-code-profile set isilon k=8 m=2 ruleset-failure-domain=host
 ===

 And create a pool following the recommendations in the doc, I’ll need 
 (100*16)/2 = 800 PGs. Will I need a sufficient number of hosts to support 
 creating that total number of PGs?

 You will need k+m = 10 hosts, one OSD per host for each PG. If you only have 
 10 hosts that should be ok and the 800 PGs will use these 10 OSDs in various 
 orders. It also means that you will end up having 800 PGs per OSD which is a 
 bit too much. If you have 20 OSDs that will be better: each PG will get 10 OSDs 
 out of 20 and each OSD will have 400 PGs. Ideally you want the number of PGs 
 per OSD to be in the range (approximately) [20,300].

 Cheers


 Regards.

 *Italo Santos*
 http://italosantos.com.br/

 On Wednesday, December 17, 2014 at 2:42 PM, Loic Dachary wrote:

 Hi,

 Thanks for the update : good news are much appreciated :-) Would you have 
 time to review the documentation at 
 https://github.com/ceph/ceph/pull/3194/files ? It was partly motivated by 
 the problem you had.

 Cheers

 On 17/12/2014 14:03, Italo Santos wrote:
 Hello Loic,

 Thanks for your help. I’ve taken a look at my crush map and replaced “step 
 chooseleaf indep 0 type osd” with “step choose indep 0 type osd”, and all 
 PGs were created successfully.

 At.

 *Italo Santos*
 http://italosantos.com.br/

 On Tuesday, December 16, 2014 at 8:39 PM, Loic Dachary wrote:

 Hi,

 The 2147483647 means that CRUSH did not find enough OSD for a given PG. 
 If you check the crush rule associated with the erasure coded pool, you 
 will most probably find why.

 Cheers

 On 16/12/2014 23:32, Italo Santos wrote:
 Hello,

 I'm trying to create an erasure pool following 
 http://docs.ceph.com/docs/master/rados/operations/erasure-code/, but 
 when I try to create a pool with a specific erasure-code-profile 
 (myprofile) the PGs end up in the incomplete state.

 Anyone can help me?

 Below the profile I created:
 root@ceph0001:~# ceph osd erasure-code-profile get myprofile
 directory=/usr/lib/ceph/erasure-code
 k=6
 m=2
 plugin=jerasure
 technique=reed_sol_van

 The status of cluster:
 root@ceph0001:~# ceph health
 HEALTH_WARN 12 pgs incomplete; 12 pgs stuck inactive; 12 pgs stuck 
 unclean

 health detail:
 root@ceph0001:~# ceph health detail
 HEALTH_WARN 12 pgs incomplete; 12 pgs stuck inactive; 12 pgs stuck 
 unclean
 pg 2.9 is stuck inactive since forever, current state incomplete, last 
 acting [4,10,15,2147483647,3,2147483647,2147483647,2147483647]
 pg 2.8 is stuck inactive since forever, current state incomplete, last 
 acting [0,2147483647,4,2147483647,10,2147483647,15,2147483647]
 pg 2.b is stuck inactive since forever, current state incomplete, last 
 acting [8,3,14,2147483647,5,2147483647,2147483647,2147483647]
 pg 2.a is stuck inactive since forever, current state incomplete, last 
 acting [11,7,2,2147483647,2147483647,2147483647,15,2147483647]
 pg 2.5 is stuck inactive since forever, current state incomplete, last 
 acting [12,8,5,1,2147483647,2147483647,2147483647,2147483647]
 pg 2.4 is stuck inactive since forever, current state incomplete, last 
 acting [5,2147483647,13,1,2147483647,2147483647,8,2147483647]
 pg 2.7 is stuck inactive since forever, current state incomplete, last 
 acting [12,2,10,7,2147483647,2147483647,2147483647,2147483647]
 pg 2.6 is stuck inactive since forever, current state incomplete, last 
 acting [9,15,2147483647,4,2,2147483647,2147483647,2147483647]
 pg 2.1 is stuck inactive since forever, current state incomplete, last 
 acting [2,4,2147483647,13,2147483647,10,2147483647,2147483647]
 pg 2.0 is stuck inactive since forever, current state incomplete, last 
 acting [14,1,2147483647,4,10,2147483647,2147483647,2147483647]
 pg 2.3 is stuck inactive since forever, current state incomplete, last 
 acting [14,11,6,2147483647,2147483647,2147483647,2,2147483647]
 pg 2.2 is stuck inactive since forever, current state incomplete, last 
 acting [13,5,11,2147483647,2147483647,3,2147483647,2147483647]
 pg 2.9 is stuck unclean since forever, current state incomplete, last 
 acting [4,10,15,2147483647,3,2147483647,2147483647,2147483647]
 pg 2.8 is stuck unclean since forever, current state incomplete, last 
 acting [0,2147483647,4,2147483647,10,2147483647,15,2147483647]
 pg 2.b is stuck unclean since forever, current state incomplete, last 
 acting [8,3,14,2147483647,5,2147483647,2147483647,2147483647]
 pg 2.a is stuck unclean since forever, current state incomplete, last 
 acting 

Re: [ceph-users] Erasure coded PGs incomplete

2014-12-17 Thread Italo Santos
Understood.
Thanks for your help, the cluster is healthy now :D

Also, using for example k=6, m=1 and failure domain by host, I’ll be able to lose 
all OSDs on the same host, but if I lose 2 disks on different hosts I can lose 
data, right? So, is it possible to have a failure domain which allows me to lose an 
OSD or a host?

Regards.

Italo Santos
http://italosantos.com.br/


On Wednesday, December 17, 2014 at 4:27 PM, Loic Dachary wrote:

  
  
 On 17/12/2014 19:22, Italo Santos wrote:
  Loic,
   
  So, if I want to have a failure domain by host, I’ll need to set up an erasure 
  profile where k+m = total number of hosts I have, right?
  
 Yes, k+m has to be <= number of hosts.
  
   
  Regards.
   
  *Italo Santos*
  http://italosantos.com.br/
   
  On Wednesday, December 17, 2014 at 3:24 PM, Loic Dachary wrote:
   


   On 17/12/2014 18:18, Italo Santos wrote:
Hello,
 
I’ve taken a look at this documentation (which helped a lot) and, if I 
understand right, when I set a profile like:
 
===
ceph osd erasure-code-profile set isilon k=8 m=2 
ruleset-failure-domain=host
===
 
And create a pool following the recommendations in the doc, I’ll need 
(100*16)/2 = 800 PGs. Will I need a sufficient number of hosts to 
support creating that total number of PGs?

    You will need k+m = 10 hosts, one OSD per host for each PG. If you only 
    have 10 hosts that should be ok and the 800 PGs will use these 10 OSDs in 
    various orders. It also means that you will end up having 800 PGs per OSD 
    which is a bit too much. If you have 20 OSDs that will be better: each PG 
    will get 10 OSDs out of 20 and each OSD will have 400 PGs. Ideally you want 
    the number of PGs per OSD to be in the range (approximately) [20,300].

   Cheers

 
Regards.
 
*Italo Santos*
http://italosantos.com.br/
 
On Wednesday, December 17, 2014 at 2:42 PM, Loic Dachary wrote:
 
 Hi,
  
 Thanks for the update : good news are much appreciated :-) Would you 
 have time to review the documentation at 
 https://github.com/ceph/ceph/pull/3194/files ? It was partly 
 motivated by the problem you had.
  
 Cheers
  
 On 17/12/2014 14:03, Italo Santos wrote:
  Hello Loic,
   
  Thanks for your help. I’ve taken a look at my crush map and replaced 
  “step chooseleaf indep 0 type osd” with “step choose indep 0 type 
  osd”, and all PGs were created successfully.
   
  At.
   
  *Italo Santos*
  http://italosantos.com.br/
   
  On Tuesday, December 16, 2014 at 8:39 PM, Loic Dachary wrote:
   
   Hi,

   The 2147483647 means that CRUSH did not find enough OSD for a 
   given PG. If you check the crush rule associated with the erasure 
   coded pool, you will most probably find why.

   Cheers

   On 16/12/2014 23:32, Italo Santos wrote:
Hello,
 
I'm trying to create an erasure pool following 
http://docs.ceph.com/docs/master/rados/operations/erasure-code/, 
but when I try to create a pool with a specific 
erasure-code-profile (myprofile) the PGs end up in the incomplete 
state.
 
Anyone can help me?
 
Below the profile I created:
root@ceph0001:~# ceph osd erasure-code-profile get myprofile
directory=/usr/lib/ceph/erasure-code
k=6
m=2
plugin=jerasure
technique=reed_sol_van
 
The status of cluster:
root@ceph0001:~# ceph health
HEALTH_WARN 12 pgs incomplete; 12 pgs stuck inactive; 12 pgs 
stuck unclean
 
health detail:
root@ceph0001:~# ceph health detail
HEALTH_WARN 12 pgs incomplete; 12 pgs stuck inactive; 12 pgs 
stuck unclean
pg 2.9 is stuck inactive since forever, current state 
incomplete, last acting 
[4,10,15,2147483647,3,2147483647,2147483647,2147483647]
pg 2.8 is stuck inactive since forever, current state 
incomplete, last acting 
[0,2147483647,4,2147483647,10,2147483647,15,2147483647]
pg 2.b is stuck inactive since forever, current state 
incomplete, last acting 
[8,3,14,2147483647,5,2147483647,2147483647,2147483647]
pg 2.a is stuck inactive since forever, current state 
incomplete, last acting 
[11,7,2,2147483647,2147483647,2147483647,15,2147483647]
pg 2.5 is stuck inactive since forever, current state 
incomplete, last acting 
[12,8,5,1,2147483647,2147483647,2147483647,2147483647]
pg 2.4 is stuck inactive since forever, current state 
incomplete, last acting 
[5,2147483647,13,1,2147483647,2147483647,8,2147483647]
pg 2.7 is stuck inactive since forever, current state 
incomplete, last acting 
[12,2,10,7,2147483647,2147483647,2147483647,2147483647]
pg 2.6 is stuck inactive since forever, current state 

Re: [ceph-users] Erasure coded PGs incomplete

2014-12-17 Thread Loic Dachary


On 17/12/2014 19:46, Italo Santos wrote:
 Understood.
 Thanks for your help, the cluster is healthy now :D
 
 Also, using for example k=6, m=1 and failure domain by host, I’ll be able to lose 
 all OSDs on the same host, but if I lose 2 disks on different hosts I can lose 
 data, right? So, is it possible to have a failure domain which allows me to lose 
 an OSD or a host?

That's actually a good way to put it :-)

 
 Regards.
 
 *Italo Santos*
 http://italosantos.com.br/
 
 On Wednesday, December 17, 2014 at 4:27 PM, Loic Dachary wrote:
 


 On 17/12/2014 19:22, Italo Santos wrote:
 Loic,

 So, if I want to have a failure domain by host, I’ll need to set up an erasure 
 profile where k+m = total number of hosts I have, right?

 Yes, k+m has to be <= number of hosts.


 Regards.

 *Italo Santos*
 http://italosantos.com.br/

 On Wednesday, December 17, 2014 at 3:24 PM, Loic Dachary wrote:



 On 17/12/2014 18:18, Italo Santos wrote:
 Hello,

 I’ve taken a look at this documentation (which helped a lot) and, if I 
 understand right, when I set a profile like:

 ===
 ceph osd erasure-code-profile set isilon k=8 m=2 
 ruleset-failure-domain=host
 ===

 And create a pool following the recommendations in the doc, I’ll need 
 (100*16)/2 = 800 PGs. Will I need a sufficient number of hosts to support 
 creating that total number of PGs?

 You will need k+m = 10 hosts, one OSD per host for each PG. If you only have 
 10 hosts that should be ok and the 800 PGs will use these 10 OSDs in various 
 orders. It also means that you will end up having 800 PGs per OSD which is a bit 
 too much. If you have 20 OSDs that will be better: each PG will get 10 OSDs out 
 of 20 and each OSD will have 400 PGs. Ideally you want the number of PGs per 
 OSD to be in the range (approximately) [20,300].

 Cheers


 Regards.

 *Italo Santos*
 http://italosantos.com.br/

 On Wednesday, December 17, 2014 at 2:42 PM, Loic Dachary wrote:

 Hi,

 Thanks for the update : good news are much appreciated :-) Would you 
 have time to review the documentation at 
 https://github.com/ceph/ceph/pull/3194/files ? It was partly motivated 
 by the problem you had.

 Cheers

 On 17/12/2014 14:03, Italo Santos wrote:
 Hello Loic,

 Thanks for your help. I’ve taken a look at my crush map and replaced 
 “step chooseleaf indep 0 type osd” with “step choose indep 0 type osd”, 
 and all PGs were created successfully.

 At.

 *Italo Santos*
 http://italosantos.com.br/

 On Tuesday, December 16, 2014 at 8:39 PM, Loic Dachary wrote:

 Hi,

 The 2147483647 means that CRUSH did not find enough OSD for a given 
 PG. If you check the crush rule associated with the erasure coded 
 pool, you will most probably find why.

 Cheers

 On 16/12/2014 23:32, Italo Santos wrote:
 Hello,

 I'm trying to create an erasure pool following 
 http://docs.ceph.com/docs/master/rados/operations/erasure-code/, but 
 when I try to create a pool with a specific erasure-code-profile 
 (myprofile) the PGs end up in the incomplete state.

 Anyone can help me?

 Below the profile I created:
 root@ceph0001:~# ceph osd erasure-code-profile get myprofile
 directory=/usr/lib/ceph/erasure-code
 k=6
 m=2
 plugin=jerasure
 technique=reed_sol_van

 The status of cluster:
 root@ceph0001:~# ceph health
 HEALTH_WARN 12 pgs incomplete; 12 pgs stuck inactive; 12 pgs stuck 
 unclean

 health detail:
 root@ceph0001:~# ceph health detail
 HEALTH_WARN 12 pgs incomplete; 12 pgs stuck inactive; 12 pgs stuck 
 unclean
 pg 2.9 is stuck inactive since forever, current state incomplete, 
 last acting [4,10,15,2147483647,3,2147483647,2147483647,2147483647]
 pg 2.8 is stuck inactive since forever, current state incomplete, 
 last acting [0,2147483647,4,2147483647,10,2147483647,15,2147483647]
 pg 2.b is stuck inactive since forever, current state incomplete, 
 last acting [8,3,14,2147483647,5,2147483647,2147483647,2147483647]
 pg 2.a is stuck inactive since forever, current state incomplete, 
 last acting [11,7,2,2147483647,2147483647,2147483647,15,2147483647]
 pg 2.5 is stuck inactive since forever, current state incomplete, 
 last acting [12,8,5,1,2147483647,2147483647,2147483647,2147483647]
 pg 2.4 is stuck inactive since forever, current state incomplete, 
 last acting [5,2147483647,13,1,2147483647,2147483647,8,2147483647]
 pg 2.7 is stuck inactive since forever, current state incomplete, 
 last acting [12,2,10,7,2147483647,2147483647,2147483647,2147483647]
 pg 2.6 is stuck inactive since forever, current state incomplete, 
 last acting [9,15,2147483647,4,2,2147483647,2147483647,2147483647]
 pg 2.1 is stuck inactive since forever, current state incomplete, 
 last acting [2,4,2147483647,13,2147483647,10,2147483647,2147483647]
 pg 2.0 is stuck inactive since forever, current state incomplete, 
 last acting [14,1,2147483647,4,10,2147483647,2147483647,2147483647]
 pg 2.3 is stuck inactive since forever, current state incomplete, 
 last acting [14,11,6,2147483647,2147483647,2147483647,2,2147483647]
 pg 2.2 is stuck inactive since forever, current state incomplete, 
 last acting 

Re: [ceph-users] Erasure coded PGs incomplete

2014-12-16 Thread Loic Dachary
Hi,

The 2147483647 means that CRUSH did not find enough OSD for a given PG. If you 
check the crush rule associated with the erasure coded pool, you will most 
probably find why.

Cheers

On 16/12/2014 23:32, Italo Santos wrote:
 Hello,
 
 I'm trying to create an erasure pool following 
 http://docs.ceph.com/docs/master/rados/operations/erasure-code/, but when I 
 try to create a pool with a specific erasure-code-profile (myprofile) the PGs 
 end up in the incomplete state.
 
 Anyone can help me?
 
 Below the profile I created:
 root@ceph0001:~# ceph osd erasure-code-profile get myprofile
 directory=/usr/lib/ceph/erasure-code
 k=6
 m=2
 plugin=jerasure
 technique=reed_sol_van
 
 The status of cluster:
 root@ceph0001:~# ceph health
 HEALTH_WARN 12 pgs incomplete; 12 pgs stuck inactive; 12 pgs stuck unclean
 
 health detail:
 root@ceph0001:~# ceph health detail
 HEALTH_WARN 12 pgs incomplete; 12 pgs stuck inactive; 12 pgs stuck unclean
 pg 2.9 is stuck inactive since forever, current state incomplete, last acting 
 [4,10,15,2147483647,3,2147483647,2147483647,2147483647]
 pg 2.8 is stuck inactive since forever, current state incomplete, last acting 
 [0,2147483647,4,2147483647,10,2147483647,15,2147483647]
 pg 2.b is stuck inactive since forever, current state incomplete, last acting 
 [8,3,14,2147483647,5,2147483647,2147483647,2147483647]
 pg 2.a is stuck inactive since forever, current state incomplete, last acting 
 [11,7,2,2147483647,2147483647,2147483647,15,2147483647]
 pg 2.5 is stuck inactive since forever, current state incomplete, last acting 
 [12,8,5,1,2147483647,2147483647,2147483647,2147483647]
 pg 2.4 is stuck inactive since forever, current state incomplete, last acting 
 [5,2147483647,13,1,2147483647,2147483647,8,2147483647]
 pg 2.7 is stuck inactive since forever, current state incomplete, last acting 
 [12,2,10,7,2147483647,2147483647,2147483647,2147483647]
 pg 2.6 is stuck inactive since forever, current state incomplete, last acting 
 [9,15,2147483647,4,2,2147483647,2147483647,2147483647]
 pg 2.1 is stuck inactive since forever, current state incomplete, last acting 
 [2,4,2147483647,13,2147483647,10,2147483647,2147483647]
 pg 2.0 is stuck inactive since forever, current state incomplete, last acting 
 [14,1,2147483647,4,10,2147483647,2147483647,2147483647]
 pg 2.3 is stuck inactive since forever, current state incomplete, last acting 
 [14,11,6,2147483647,2147483647,2147483647,2,2147483647]
 pg 2.2 is stuck inactive since forever, current state incomplete, last acting 
 [13,5,11,2147483647,2147483647,3,2147483647,2147483647]
 pg 2.9 is stuck unclean since forever, current state incomplete, last acting 
 [4,10,15,2147483647,3,2147483647,2147483647,2147483647]
 pg 2.8 is stuck unclean since forever, current state incomplete, last acting 
 [0,2147483647,4,2147483647,10,2147483647,15,2147483647]
 pg 2.b is stuck unclean since forever, current state incomplete, last acting 
 [8,3,14,2147483647,5,2147483647,2147483647,2147483647]
 pg 2.a is stuck unclean since forever, current state incomplete, last acting 
 [11,7,2,2147483647,2147483647,2147483647,15,2147483647]
 pg 2.5 is stuck unclean since forever, current state incomplete, last acting 
 [12,8,5,1,2147483647,2147483647,2147483647,2147483647]
 pg 2.4 is stuck unclean since forever, current state incomplete, last acting 
 [5,2147483647,13,1,2147483647,2147483647,8,2147483647]
 pg 2.7 is stuck unclean since forever, current state incomplete, last acting 
 [12,2,10,7,2147483647,2147483647,2147483647,2147483647]
 pg 2.6 is stuck unclean since forever, current state incomplete, last acting 
 [9,15,2147483647,4,2,2147483647,2147483647,2147483647]
 pg 2.1 is stuck unclean since forever, current state incomplete, last acting 
 [2,4,2147483647,13,2147483647,10,2147483647,2147483647]
 pg 2.0 is stuck unclean since forever, current state incomplete, last acting 
 [14,1,2147483647,4,10,2147483647,2147483647,2147483647]
 pg 2.3 is stuck unclean since forever, current state incomplete, last acting 
 [14,11,6,2147483647,2147483647,2147483647,2,2147483647]
 pg 2.2 is stuck unclean since forever, current state incomplete, last acting 
 [13,5,11,2147483647,2147483647,3,2147483647,2147483647]
 pg 2.9 is incomplete, acting 
 [4,10,15,2147483647,3,2147483647,2147483647,2147483647] (reducing pool ecpool 
 min_size from 6 may help; search ceph.com/docs for 'incomplete')
 pg 2.8 is incomplete, acting 
 [0,2147483647,4,2147483647,10,2147483647,15,2147483647] (reducing pool ecpool 
 min_size from 6 may help; search ceph.com/docs for 'incomplete')
 pg 2.b is incomplete, acting 
 [8,3,14,2147483647,5,2147483647,2147483647,2147483647] (reducing pool ecpool 
 min_size from 6 may help; search ceph.com/docs for 'incomplete')
 pg 2.a is incomplete, acting 
 [11,7,2,2147483647,2147483647,2147483647,15,2147483647] (reducing pool ecpool 
 min_size from 6 may help; search ceph.com/docs for 'incomplete')
 pg 2.5 is incomplete, acting