[ceph-users] erasure coded pool

2015-02-20 Thread Deneau, Tom
Is it possible to run an erasure coded pool using default k=2, m=2 profile on a 
single node?
(this is just for functionality testing). The single node has 3 OSDs. 
Replicated pools run fine.

ceph.conf does contain:
   osd crush chooseleaf type = 0


-- Tom Deneau



Re: [ceph-users] erasure coded pool

2015-02-20 Thread Loic Dachary
Hi Tom,

On 20/02/2015 22:59, Deneau, Tom wrote:
 Is it possible to run an erasure coded pool using default k=2, m=2 profile on 
 a single node?
 (this is just for functionality testing). The single node has 3 OSDs. 
 Replicated pools run fine.

For k=2 m=2 to work you need four (k+m) OSDs. As long as the crush rule allows 
it, you can have them on the same host.
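
A minimal sketch of what that looks like (profile and pool names here are
invented, and it assumes a fourth OSD has been added so all k+m chunks can be
placed):

# ceph osd erasure-code-profile set testprofile k=2 m=2 ruleset-failure-domain=osd
# ceph osd erasure-code-profile get testprofile
# ceph osd pool create ecpool 12 12 erasure testprofile

ruleset-failure-domain=osd is what allows several chunks to land on the same
host; the pg count of 12 is only meant for a throwaway test cluster.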

Cheers

 
 ceph.conf does contain:
osd crush chooseleaf type = 0
 
 
 -- Tom Deneau
 
 

-- 
Loïc Dachary, Artisan Logiciel Libre





Re: [ceph-users] erasure coded pool why ever k>1?

2015-01-22 Thread Loic Dachary

Hi,

On 22/01/2015 16:37, Chad William Seys wrote:
 Hi Loic,
 The size of each chunk is object size / K. If you have K=1 and M=2 it will
 be the same as 3 replicas with none of the advantages ;-)
 
 Interesting!  I did not see this explained so explicitly.
 
 So is the general explanation of k and m something like:
 k, m: fault tolerance of m+1 replicas, space of 1/k*(m+k) replicas,  plus 
 slowness
 ?

I'm not sure I understand the space formula, but it looks like you got the idea.

 So one should never bother with k=1 b/c:
 k=1, m:  fault tolerance of m+1, space of m+1 replicas, plus slowness.
 (therefore, just use m+1 replicas!)
 
 but
 k=2, m=1:
 might be useful instead of 2 replicas b/c it has fault tolerance of 2 
 replicas, space of 1/2*(1+2) = 3/2 = 1.5 replicas, plus slowness.
 
 And
 k=2, m=2:
 which should be as tolerant as 3 replicas,  but take up as much space as 
 (1/2)*(2+2)=2 replicas (right?).

That's also how I understand it :-)
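
For what it's worth, Chad's space factor written out for a k+m profile:

\[
\text{raw space per unit of data} = \frac{k+m}{k}
\qquad\text{e.g.}\quad
\frac{2+1}{2}=1.5,\quad \frac{2+2}{2}=2,\quad \text{vs. } 3 \text{ for 3-way replication,}
\]

and in each case the pool survives the loss of m chunks (a replicated pool
survives the loss of all but one copy).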

Cheers

 Thanks again!
 Chad.
 

-- 
Loïc Dachary, Artisan Logiciel Libre





Re: [ceph-users] erasure coded pool why ever k>1?

2015-01-22 Thread Chad William Seys
Hi Loic,
 The size of each chunk is object size / K. If you have K=1 and M=2 it will
 be the same as 3 replicas with none of the advantages ;-)

Interesting!  I did not see this explained so explicitly.

So is the general explanation of k and m something like:
k, m: fault tolerance of m+1 replicas, space of 1/k*(m+k) replicas,  plus 
slowness
?

So one should never bother with k=1 b/c:
k=1, m:  fault tolerance of m+1, space of m+1 replicas, plus slowness.
(therefore, just use m+1 replicas!)

but
k=2, m=1:
might be useful instead of 2 replicas b/c it has fault tolerance of 2 
replicas, space of 1/2*(1+2) = 3/2 = 1.5 replicas, plus slowness.

And
k=2, m=2:
which should be as tolerant as 3 replicas,  but take up as much space as 
(1/2)*(2+2)=2 replicas (right?).

Thanks again!
Chad.


[ceph-users] erasure coded pool why ever k>1?

2015-01-21 Thread Chad William Seys
Hello all,
  What reasons would one want k>1?
  I read that m determines the number of OSDs which can fail before loss.  But 
I don't see it explained how to choose k.  Any benefits for choosing k>1?

Thanks!
Chad.


Re: [ceph-users] erasure coded pool why ever k>1?

2015-01-21 Thread Loic Dachary


On 21/01/2015 22:42, Chad William Seys wrote:
 Hello all,
   What reasons would one want k>1?
   I read that m determines the number of OSDs which can fail before loss.  But 
 I don't see it explained how to choose k.  Any benefits for choosing k>1?

The size of each chunk is object size / K. If you have K=1 and M=2 it will be 
the same as 3 replicas with none of the advantages ;-)

Cheers

-- 
Loïc Dachary, Artisan Logiciel Libre





Re: [ceph-users] erasure coded pool why ever k>1?

2015-01-21 Thread Don Doerner
Well, look at it this way: with 3X replication, for each TB of data you need 3 
TB disk.  With (for example) 10+3 EC, you get better protection, and for each 
TB of data you need 1.3 TB disk.
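
Spelled out as commands, a 10+3 pool would look something like this (profile
and pool names are arbitrary, and with the default host-level failure domain it
needs at least 13 hosts, one per chunk):

# ceph osd erasure-code-profile set ec10p3 k=10 m=3 ruleset-failure-domain=host
# ceph osd pool create ecpool 1024 1024 erasure ec10p3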

-don-


-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Loic 
Dachary
Sent: 21 January, 2015 15:18
To: Chad William Seys; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] erasure coded pool why ever k>1?



On 21/01/2015 22:42, Chad William Seys wrote:
 Hello all,
   What reasons would one want k>1?
   I read that m determines the number of OSDs which can fail before 
 loss.  But I don't see it explained how to choose k.  Any benefits for choosing 
 k>1?

The size of each chunk is object size / K. If you have K=1 and M=2 it will be 
the same as 3 replicas with none of the advantages ;-)

Cheers

--
Loïc Dachary, Artisan Logiciel Libre



[ceph-users] erasure coded pool k=7,m=5

2014-12-23 Thread Stéphane DUGRAVOT
Hi all, 

Soon we should have a ceph cluster spread across 3 datacenters (dc), with 4 hosts 
in each dc. Each host will have 12 OSDs. 

We can accept the loss of one datacenter and one host on the remaining 2 
datacenters. 
In order to use an erasure coded pool: 


1. Is a strategy of k = 7, m = 5 acceptable? 
2. Is it the only one that guarantees our premise? 
3. And more generally, is there a formula (based on the number of dc, hosts 
and OSDs) that allows us to calculate the profile? 

Thanks. 
Stephane. 

-- 
Université de Lorraine 
Stéphane DUGRAVOT - Direction du numérique - Infrastructure 
Jabber : stephane.dugra...@univ-lorraine.fr 
Tél.: +33 3 83 68 20 98 



Re: [ceph-users] erasure coded pool k=7,m=5

2014-12-23 Thread Loic Dachary
Hi Stéphane,

On 23/12/2014 14:34, Stéphane DUGRAVOT wrote:
 Hi all,
 
 Soon we should have a ceph cluster spread across 3 datacenters (dc), with 4 
 hosts in each dc. Each host will have 12 OSDs.
 
 We can accept the loss of one datacenter and one host on the remaining 2 
 datacenters.
 In order to use an erasure coded pool:
 
  1. Is a strategy of k = 7, m = 5 acceptable?

If you want to sustain the loss of one datacenter, k=2,m=1 is what you want, 
with a ruleset that requires that no two shards end up in the same datacenter. 
It also sustains the loss of one host within a datacenter: the missing chunk on 
the lost host will be reconstructed from the two other chunks in the two other 
datacenters.

If, in addition, you want to sustain the loss of one machine while a datacenter 
is down, you would need to use the LRC plugin.
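
A sketch of that k=2,m=1 layout (profile and pool names are invented, and the
crush map must of course contain the three datacenter buckets):

# ceph osd erasure-code-profile set dcprofile k=2 m=1 ruleset-failure-domain=datacenter
# ceph osd pool create ecpool 1024 1024 erasure dcprofile

With ruleset-failure-domain=datacenter each of the three chunks is placed in a
different datacenter. The LRC plugin mentioned above (plugin=lrc) takes its own
profile parameters for adding per-datacenter locality; see the erasure-code-lrc
documentation for those.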

  2. Is it the only one that guarantees our premise?
  3. And more generally, is there a formula (based on the number of dc, hosts 
 and OSDs) that allows us to calculate the profile?

I don't think there is such a formula.

Cheers

 Thanks.
 Stephane.
 
 -- 
 Université de Lorraine
 Stéphane DUGRAVOT - Direction du numérique - Infrastructure
 Jabber : stephane.dugra...@univ-lorraine.fr
 Tél.: +33 3 83 68 20 98
 
 
 
 

-- 
Loïc Dachary, Artisan Logiciel Libre





Re: [ceph-users] Erasure coded pool suitable for MDS?

2014-06-20 Thread Loic Dachary


On 20/06/2014 00:06, Erik Logtenberg wrote:
 Hi Loic,
 
 That is a nice idea. And if I then use newfs against that replicated
 cache pool, it'll work reliably?

It will not be limited by the erasure coded pool features, indeed.
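
For reference, the tiering setup being discussed looks roughly like this (pool
names follow Erik's ecdata plus a replicated cachedata pool invented here;
hit_set and sizing parameters are omitted):

# ceph osd pool create cachedata 128 128
# ceph osd tier add ecdata cachedata
# ceph osd tier cache-mode cachedata writeback
# ceph osd tier set-overlay ecdata cachedata

The MDS's small in-place updates then land on the replicated tier, and the
erasure coded pool only sees whole objects flushed from the cache.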

Cheers

 
 Kind regards,
 
 Erik.
 
 
 On 06/19/2014 11:09 PM, Loic Dachary wrote:


 On 19/06/2014 22:51, Wido den Hollander wrote:




 Op 19 jun. 2014 om 16:10 heeft Erik Logtenberg
 e...@logtenberg.eu het volgende geschreven:

 Hi,

 Are erasure coded pools suitable for use with MDS?


 I don't think so. It does in-place updates of objects and that
 doesn't work with EC pools.

 Hi Erik,

 This is correct. You can however set the replicated pool to be the
 cache of an erasure coded pool.

 https://ceph.com/docs/master/dev/cache-pool/

 Cheers



 I tried to give it a go by creating two new pools like so:

 # ceph osd pool create ecdata 128 128 erasure # ceph osd pool
 create ecmetadata 128 128 erasure

 Then looked up their id's:

 # ceph osd lspools ..., 6 ecdata,7 ecmetadata

 # ceph mds newfs 7 6 --yes-i-really-mean-it

 But then when I start MDS, it crashes horribly. I did notice
 that MDS created a couple of objects in the ecmetadata pool:

 # rados ls -p ecmetadata mds0_sessionmap mds0_inotable 
 1..inode 200. mds_anchortable mds_snaptable 
 100..inode

 However it crashes immediately after. I started mds manually to
 try and see what's up:

 # ceph-mds -i 0 -d

 This spews out so much information that I saved it in a
 logfile, added as an attachment.

 Kind regards,

 Erik. mds.log 



-- 
Loïc Dachary, Artisan Logiciel Libre





[ceph-users] Erasure coded pool suitable for MDS?

2014-06-19 Thread Erik Logtenberg
Hi,

Are erasure coded pools suitable for use with MDS?

I tried to give it a go by creating two new pools like so:

# ceph osd pool create ecdata 128 128 erasure
# ceph osd pool create ecmetadata 128 128 erasure

Then looked up their id's:

# ceph osd lspools
..., 6 ecdata,7 ecmetadata

# ceph mds newfs 7 6 --yes-i-really-mean-it

But then when I start MDS, it crashes horribly. I did notice that MDS
created a couple of objects in the ecmetadata pool:

# rados ls -p ecmetadata
mds0_sessionmap
mds0_inotable
1..inode
200.
mds_anchortable
mds_snaptable
100..inode

However it crashes immediately after. I started mds manually to try and
see what's up:

# ceph-mds -i 0 -d

This spews out so much information that I saved it in a logfile, added
as an attachment.

Kind regards,

Erik.
2014-06-19 22:07:34.492328 7f3572f6e7c0  0 ceph version 0.80.1 (a38fe1169b6d2ac98b427334c12d7cf81f809b74), process ceph-mds, pid 2943
starting mds.0 at :/0
2014-06-19 22:07:35.793309 7f356dd88700  1 mds.-1.0 handle_mds_map standby
2014-06-19 22:07:35.876689 7f356dd88700  1 mds.0.15 handle_mds_map i am now mds.0.15
2014-06-19 22:07:35.876695 7f356dd88700  1 mds.0.15 handle_mds_map state change up:standby --> up:creating
2014-06-19 22:07:35.876931 7f356dd88700  0 mds.0.cache creating system inode with ino:1
2014-06-19 22:07:35.877204 7f356dd88700  0 mds.0.cache creating system inode with ino:100
2014-06-19 22:07:35.877209 7f356dd88700  0 mds.0.cache creating system inode with ino:600
2014-06-19 22:07:35.877369 7f356dd88700  0 mds.0.cache creating system inode with ino:601
2014-06-19 22:07:35.877455 7f356dd88700  0 mds.0.cache creating system inode with ino:602
2014-06-19 22:07:35.877519 7f356dd88700  0 mds.0.cache creating system inode with ino:603
2014-06-19 22:07:35.877566 7f356dd88700  0 mds.0.cache creating system inode with ino:604
2014-06-19 22:07:35.877606 7f356dd88700  0 mds.0.cache creating system inode with ino:605
2014-06-19 22:07:35.877683 7f356dd88700  0 mds.0.cache creating system inode with ino:606
2014-06-19 22:07:35.877723 7f356dd88700  0 mds.0.cache creating system inode with ino:607
2014-06-19 22:07:35.877780 7f356dd88700  0 mds.0.cache creating system inode with ino:608
2014-06-19 22:07:35.877819 7f356dd88700  0 mds.0.cache creating system inode with ino:609
2014-06-19 22:07:35.877858 7f356dd88700  0 mds.0.cache creating system inode with ino:200
mds/CDir.cc: In function 'virtual void C_Dir_Committed::finish(int)' thread 7f356dd88700 time 2014-06-19 22:07:35.881337
mds/CDir.cc: 1809: FAILED assert(r == 0)
 ceph version 0.80.1 (a38fe1169b6d2ac98b427334c12d7cf81f809b74)
 1: ceph-mds() [0x75c6f1]
 2: (Context::complete(int)+0x9) [0x56cff9]
 3: (C_Gather::sub_finish(Context*, int)+0x1f7) [0x56e9a7]
 4: (C_Gather::C_GatherSub::finish(int)+0x12) [0x56eab2]
 5: (Context::complete(int)+0x9) [0x56cff9]
 6: (Objecter::handle_osd_op_reply(MOSDOpReply*)+0xf4e) [0x7d26ee]
 7: (MDS::handle_core_message(Message*)+0xb1f) [0x58e5ef]
 8: (MDS::_dispatch(Message*)+0x32) [0x58e7f2]
 9: (MDS::ms_dispatch(Message*)+0xa3) [0x5901d3]
 10: (DispatchQueue::entry()+0x57a) [0x99d9da]
 11: (DispatchQueue::DispatchThread::entry()+0xd) [0x8be63d]
 12: (()+0x7c53) [0x7f3572366c53]
 13: (clone()+0x6d) [0x7f3571257dbd]
 NOTE: a copy of the executable, or `objdump -rdS executable` is needed to interpret this.
2014-06-19 22:07:35.883239 7f356dd88700 -1 mds/CDir.cc: In function 'virtual void C_Dir_Committed::finish(int)' thread 7f356dd88700 time 2014-06-19 22:07:35.881337
mds/CDir.cc: 1809: FAILED assert(r == 0)

 ceph version 0.80.1 (a38fe1169b6d2ac98b427334c12d7cf81f809b74)
 1: ceph-mds() [0x75c6f1]
 2: (Context::complete(int)+0x9) [0x56cff9]
 3: (C_Gather::sub_finish(Context*, int)+0x1f7) [0x56e9a7]
 4: (C_Gather::C_GatherSub::finish(int)+0x12) [0x56eab2]
 5: (Context::complete(int)+0x9) [0x56cff9]
 6: (Objecter::handle_osd_op_reply(MOSDOpReply*)+0xf4e) [0x7d26ee]
 7: (MDS::handle_core_message(Message*)+0xb1f) [0x58e5ef]
 8: (MDS::_dispatch(Message*)+0x32) [0x58e7f2]
 9: (MDS::ms_dispatch(Message*)+0xa3) [0x5901d3]
 10: (DispatchQueue::entry()+0x57a) [0x99d9da]
 11: (DispatchQueue::DispatchThread::entry()+0xd) [0x8be63d]
 12: (()+0x7c53) [0x7f3572366c53]
 13: (clone()+0x6d) [0x7f3571257dbd]
 NOTE: a copy of the executable, or `objdump -rdS executable` is needed to interpret this.

--- begin dump of recent events ---
  -144 2014-06-19 22:07:34.489920 7f3572f6e7c0  5 asok(0x1e0) register_command perfcounters_dump hook 0x1dc8010
  -143 2014-06-19 22:07:34.489992 7f3572f6e7c0  5 asok(0x1e0) register_command 1 hook 0x1dc8010
  -142 2014-06-19 22:07:34.490003 7f3572f6e7c0  5 asok(0x1e0) register_command perf dump hook 0x1dc8010
  -141 2014-06-19 22:07:34.490015 7f3572f6e7c0  5 asok(0x1e0) register_command perfcounters_schema hook 0x1dc8010
  -140 2014-06-19 22:07:34.490027 7f3572f6e7c0  5 asok(0x1e0) register_command 2 hook 0x1dc8010
  -139 2014-06-19 22:07:34.490035 7f3572f6e7c0  5 asok(0x1e0)