Hi Gregory,

Many thanks for your reply. I couldn't find anything on those links that
describes or shows how to successfully write or append to an EC pool with the
librados API. Do you know of any such examples or resources? Or is it simply
not possible?

Best regards,

James Norman

> On 6 Oct 2016, at 19:17, Gregory Farnum <gfar...@redhat.com> wrote:
> On Thu, Oct 6, 2016 at 4:08 AM, James Norman <ja...@storagemadeeasy.com> wrote:
>> Hi there,
>> I am developing a web application that supports browsing, uploading,
>> downloading, moving files in Ceph Rados pool. Internally to write objects we
>> use rados_append, as it's often too memory intensive for us to have the full
>> file in memory to do a rados_write_full.
>> We do not control our customer's Ceph installations, such as whether they
>> use replicated pools, EC pools etc. We've found that when dealing with an EC
>> pool, our rados_append calls return error code 95 and message "Operation not
>> supported".
>> I've had several discussions with members in the IRC chatroom regarding
>> this, and the general consensus I've got is:
>> 1) Use write alignment.
>> 2) Put a replicated pool in front of the EC pool
>> 3) EC pools have a limited feature set
>> Regarding point 1), are there any actual code examples for how you would
>> handle this in the context of rados_append? I have struggled to find even
>> one. This seems to me something that should be handled by either the API
>> libraries, or Ceph itself, not the client trying to write some data.
> librados requires a fair bit of knowledge from the user applications,
> yes. One thing you mention that sounds concerning is that you can't
> hold the objects in-memory — RADOS is not comfortable with very large
> objects and you'll find that things like backfill might not perform as
> you expect. (At this point everything will *probably* function, but it
> may be so slow as to make no difference to you when it hits that
> situation.) Certainly if your objects do not all fit neatly into
> buckets of a particular size and you have some that are very large,
> you will have a very non-uniform balance.
> But, if you want to learn about EC pools there is some documentation
> at http://docs.ceph.com/docs/master/dev/osd_internals/erasure_coding/
> (or in ceph.git/doc/dev/osd_internals/erasure_coding) from when they
> were being created.
>> Regarding point 2) This seems to be a workaround, and generally not
>> something we want to recommend to our customers. Is it detrimental to use an
>> EC pool without a replicated pool? What are the performance costs of doing
>> so?
> Yeah, don't do that. Cache pools are really tricky to use properly and
> turned out not to perform very well.
>> Regarding point 3) Can you point me towards resources that describe what
>> features / abilities you lose by adopting an EC pool?
> Same as above links, apparently. But really, you can read from and
> append to them. There are no object classes, no arbitrary overwrites,
> no omaps.
> -Greg
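
For anyone who finds this thread later, here is a minimal sketch of what
"use write alignment" could look like on the client side. It is not taken
from the Ceph documentation. It assumes that an EC pool rejects an append
when the object's current size is not a multiple of the pool's required
alignment, so every append except the last must have an aligned length. The
names AlignedAppender and append_fn are hypothetical, and 4096 is only an
illustrative value. With the C API the real alignment would come from
rados_ioctx_pool_requires_alignment() and
rados_ioctx_pool_required_alignment(), and append_fn would wrap rados_append().

```python
from typing import Callable, List


class AlignedAppender:
    """Buffer a stream and emit appends whose sizes are multiples of `alignment`.

    `append_fn` is a hypothetical callback standing in for the real append
    call (for example a thin wrapper around rados_append).
    """

    def __init__(self, alignment: int, append_fn: Callable[[bytes], None]) -> None:
        if alignment <= 0:
            raise ValueError("alignment must be positive")
        self.alignment = alignment
        self.append_fn = append_fn
        self._pending = bytearray()

    def feed(self, data: bytes) -> None:
        """Accept any amount of data; forward only the aligned prefix."""
        self._pending += data
        usable = len(self._pending) - (len(self._pending) % self.alignment)
        if usable:
            self.append_fn(bytes(self._pending[:usable]))
            del self._pending[:usable]

    def close(self) -> None:
        """Flush the unaligned tail. This must be the last append to the object."""
        if self._pending:
            self.append_fn(bytes(self._pending))
            self._pending.clear()


# Demo with an in-memory list in place of a real pool.
calls: List[bytes] = []
writer = AlignedAppender(alignment=4096, append_fn=calls.append)
for chunk in (b"a" * 5000, b"b" * 5000, b"c" * 300):
    writer.feed(chunk)
writer.close()
sizes = [len(c) for c in calls]
print(sizes)
```

Only one alignment unit plus the most recent chunk is held in memory at any
time, so the full file never needs to be buffered. After close() the object
size is generally unaligned, so under the assumption above no further
appends to that object would succeed.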


ceph-users mailing list