Re: Ceph performance

2012-10-30 Thread Roman Alekseev
On 29.10.2012 22:57, Sam Lang wrote: Hi Roman, Is this with the ceph fuse client or the ceph kernel module? It's not surprising that the local file system (/home) is so much faster than a mounted ceph volume, especially the first time the directory tree is traversed (metadata results are

Fwd: Delivery Status Notification (Failure)

2012-10-30 Thread hemant surale
-- Forwarded message -- From: hemant surale hemant.sur...@gmail.com Date: Tue, Oct 30, 2012 at 2:28 PM Subject: Re: Delivery Status Notification (Failure) To: Dan Mick dan.m...@inktank.com Sir, now mkcephfs is working fine, but even after that, when I tried to execute service
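
For readers hitting the same spot, a minimal sketch of the usual mkcephfs/service sequence on Ceph of this era; the paths and the -a flag are illustrative assumptions, not taken from Hemant's setup:

    # build the cluster from the conf file, pushing keys/data to all hosts
    mkcephfs -a -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.keyring
    # then start every daemon named in ceph.conf, on all hosts
    service ceph -a start
    # confirm the mons, osds and mds actually came up
    ceph -s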

Re: Ceph performance

2012-10-30 Thread Gregory Farnum
On Tue, Oct 30, 2012 at 9:27 AM, Roman Alekseev rs.aleks...@gmail.com wrote: On 29.10.2012 22:57, Sam Lang wrote: Hi Roman, Is this with the ceph fuse client or the ceph kernel module? It's not surprising that the local file system (/home) is so much faster than a mounted ceph volume,

Limitation of CephFS

2012-10-30 Thread Eric_YH_Chen
Hi all: I have some questions about the limitations of CephFS. Would you please help answer these questions? Thanks! 1. Max file size 2. Max number of files 3. Max filename length 4. Filename character set, ex: any byte except null and / 5. Max pathname length And one question about RBD 1. max

Re: Ceph performance

2012-10-30 Thread Maciej Gałkiewicz
I have been experiencing the same problem. Tell us more about your disks. Are they shared with the OS and/or the mds or journal? Paste ceph.conf. regards Maciej Galkiewicz
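
For context, the ceph.conf detail being asked about here is where the OSD data and journal live; a hedged sketch with made-up host and device names:

    [osd.0]
        host = serverA
        # OSD data on its own spindle, not the OS disk
        osd data = /var/lib/ceph/osd/ceph-0
        # journal on a separate (ideally fast) partition or device
        osd journal = /dev/sdb1
        osd journal size = 1000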

Re: Limitation of CephFS

2012-10-30 Thread Gregory Farnum
On Tue, Oct 30, 2012 at 10:45 AM, eric_yh_c...@wiwynn.com wrote: Hi all: I have some questions about the limitations of CephFS. Would you please help answer these questions? Thanks! 1. Max file size It's currently set to an (arbitrary) 1TB. It can be set wherever you like but is limited by
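
The 1TB ceiling Greg mentions is a config value; a hedged ceph.conf sketch of raising it (the byte count below is just an illustration, 4TB):

    [mds]
        # raise the maximum CephFS file size from the default 1TB
        mds max file size = 4398046511104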

Re: Ceph performance

2012-10-30 Thread Roman Alekseev
On 30.10.2012 13:10, Gregory Farnum wrote: On Tue, Oct 30, 2012 at 9:27 AM, Roman Alekseev rs.aleks...@gmail.com wrote: On 29.10.2012 22:57, Sam Lang wrote: Hi Roman, Is this with the ceph fuse client or the ceph kernel module? It's not surprising that the local file system (/home) is so

Re: Ceph performance

2012-10-30 Thread Gregory Farnum
On Tue, Oct 30, 2012 at 11:04 AM, Roman Alekseev rs.aleks...@gmail.com wrote: On 30.10.2012 13:10, Gregory Farnum wrote: On Tue, Oct 30, 2012 at 9:27 AM, Roman Alekseev rs.aleks...@gmail.com wrote: On 29.10.2012 22:57, Sam Lang wrote: Hi Roman, Is this with the ceph fuse client or the

Re: Monitor issue

2012-10-30 Thread Joao Eduardo Luis
On 10/30/2012 06:06 AM, Roman Alekseev wrote: On 29.10.2012 18:59, Wido den Hollander wrote: On 10/29/2012 03:48 PM, Roman Alekseev wrote: Hello, I have 3 monitors on different nodes, and when 'mon.a' was stopped the whole cluster stopped working too. My conf: http://pastebin.com/hT3qEhUF Could
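
With 3 monitors, losing only mon.a should still leave a 2-of-3 quorum, so a stalled cluster points at something else (clock skew, networking, or clients configured to talk only to mon.a). A hedged sketch of the commands to check quorum from a surviving node:

    # overall cluster state, including monitor membership
    ceph -s
    # detailed quorum view as reported by the monitors themselves
    ceph quorum_status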

Re: Ceph performance

2012-10-30 Thread Maciej Gałkiewicz
ServerA(mon+osd): Filesystem Size Used Avail Use% Mounted on /dev/sda1 9.2G 2.4G 6.4G 27% / tmpfs 5.9G 0 5.9G 0% /lib/init/rw udev 5.9G 148K 5.9G 1% /dev tmpfs 5.9G 0 5.9G 0% /dev/shm /dev/sda7

Re: Ceph performance

2012-10-30 Thread Roman Alekseev
On 30.10.2012 14:47, Maciej Gałkiewicz wrote: ServerA(mon+osd): Filesystem Size Used Avail Use% Mounted on /dev/sda1 9.2G 2.4G 6.4G 27% / tmpfs 5.9G 0 5.9G 0% /lib/init/rw udev 5.9G 148K 5.9G 1% /dev tmpfs

Re: Ceph performance

2012-10-30 Thread Maciej Gałkiewicz
Give me a moment to check this out, and thank you for opening my eyes. Let me notify you about the results of implementing your recommendations. I am looking forward to hearing from you. One more thing. Run rados bench right now to see what performance level you are starting from. I suggest 2
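
For reference, a typical invocation for the kind of baseline Maciej suggests; the pool name and duration here are placeholders:

    # 60-second write benchmark against an existing pool (4MB objects by default)
    rados -p data bench 60 write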

Re: [PATCH, resend] rbd: simplify rbd_rq_fn()

2012-10-30 Thread Alex Elder
On 10/29/2012 03:29 PM, Josh Durgin wrote: This is much easier to read now. It might be useful to add messages for the different failure cases in bio_chain_clone_range later. I've added this to my growing list of cleanup tasks. Reviewed-by: Josh Durgin josh.dur...@inktank.com

Re: [PATCH 6/8] rbd: define image specification structure

2012-10-30 Thread Alex Elder
On 10/29/2012 05:13 PM, Josh Durgin wrote: A couple notes below, but looks good. I responded to all of your notes below. And I will update the code/comments as appropriate after we have a chance to talk about it but for the time being I'm going to commit what I posted as-is.

Re: [PATCH 7/8] rbd: add reference counting to rbd_spec

2012-10-30 Thread Alex Elder
On 10/29/2012 05:20 PM, Josh Durgin wrote: On 10/26/2012 04:03 PM, Alex Elder wrote: With layered images we'll share rbd_spec structures, so add a reference count to it. It neatens up some code also. Could you explain your plan for these data structures? What will the structs and their

Re: [PATCH 8/8] rbd: fill rbd_spec in rbd_add_parse_args()

2012-10-30 Thread Alex Elder
On 10/29/2012 05:30 PM, Josh Durgin wrote: On 10/26/2012 04:03 PM, Alex Elder wrote: Pass the address of an rbd_spec structure to rbd_add_parse_args(). Use it to hold the information defining the rbd image to be mapped in an rbd_add() call. Use the result in the caller to initialize the

Re: production ready?

2012-10-30 Thread Gregory Farnum
Not a lot of people are publicly discussing their sizes on things like that, unfortunately. I believe DreamHost is still the most open. They have an (RGW-based) object storage service which is backed by ~800 OSDs and are currently beta-testing a compute service using RBD, which you can see

Re: production ready?

2012-10-30 Thread Gregory Farnum
On Tue, Oct 30, 2012 at 2:36 PM, Gandalf Corvotempesta gandalf.corvotempe...@gmail.com wrote: 2012/10/30 Gregory Farnum g...@inktank.com: Not a lot of people are publicly discussing their sizes on things like that, unfortunately. I believe DreamHost is still the most open. They have an

Re: production ready?

2012-10-30 Thread Stefan Priebe - Profihost AG
Am 30.10.2012 14:36, schrieb Gandalf Corvotempesta: 2012/10/30 Gregory Farnum g...@inktank.com: Not a lot of people are publicly discussing their sizes on things like that, unfortunately. I believe DreamHost is still the most open. They have an (RGW-based) object storage service which is backed

Re: production ready?

2012-10-30 Thread Gregory Farnum
On Tue, Oct 30, 2012 at 2:38 PM, Stefan Priebe - Profihost AG s.pri...@profihost.ag wrote: Am 30.10.2012 14:36, schrieb Gandalf Corvotempesta: 2012/10/30 Gregory Farnum g...@inktank.com: Not a lot of people are publicly discussing their sizes on things like that, unfortunately. I believe

Re: production ready?

2012-10-30 Thread 袁冬
Nothing prevents me from offering a service directly based on the RADOS API, if S3 compatibility is not needed, right? Correct, that is librados. What I don't understand is how I can access a single file from RGW. If LibRBD and RGW are 'gateways' to a RADOS store, I'll have access to a block
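
To make the librados point concrete, a hedged sketch of touching a single object in RADOS directly with the rados CLI; the pool and object names are invented:

    # store a local file as one RADOS object
    rados -p mypool put greeting.txt ./greeting.txt
    # read the same object back
    rados -p mypool get greeting.txt ./greeting-copy.txt
    # list the objects in the pool
    rados -p mypool ls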

Re: production ready?

2012-10-30 Thread 袁冬
In this case, can a single block device (for example a huge virtual machine image) be striped across many OSDs to achieve better performance in reading? An image striped across 3 disks should get 3*IOPS when reading Yes, but network (and many other issues) must be considered. Another
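
RBD already stripes: an image is chopped into objects (4MB by default) that RADOS distributes across OSDs, so parallel reads fan out without extra setup. A hedged sketch of creating an image and inspecting that object size; the name, size, and pool are illustrative:

    # create a 10GB image; --order 22 means 2^22 = 4MB objects
    rbd create vmimage --size 10240 --order 22 --pool rbd
    # show the object size and object count the image was created with
    rbd info vmimage --pool rbd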

ceph-fuse on os x

2012-10-30 Thread Sage Weil
It is probably a relatively straightforward porting job (fixing up #includes, etc.) to get ceph-fuse working under OS X with macfuse or osxfuse or whatever the latest and greatest is. Any Mac users out there interested? sage

RGW in Bobtail

2012-10-30 Thread Yehuda Sadeh
We've been quite busy in the last few months, and the next Ceph long-term release is right around the corner, so here's a list of some of the new features rgw is getting: - Garbage collection This removes the requirement of running a periodic cleanup process to purge stale data, as rgw now handles it by

Re: RGW in Bobtail

2012-10-30 Thread Wido den Hollander
Hi, On 30-10-12 18:36, Yehuda Sadeh wrote: We've been quite busy in the last few months, and the next Ceph long-term release is right around the corner, so here's a list of some of the new features rgw is getting: - Garbage collection This removes the requirement of running a periodic cleanup

Re: RGW in Bobtail

2012-10-30 Thread Yehuda Sadeh
On Tue, Oct 30, 2012 at 10:54 AM, Wido den Hollander w...@widodh.nl wrote: Hi, On 30-10-12 18:36, Yehuda Sadeh wrote: We've been quite busy in the last few months, and the next Ceph long-term release is right around the corner, so here's a list of some of the new features rgw is getting: -

First Ceph workshop is this Friday in Amsterdam!

2012-10-30 Thread Sage Weil
If you're in Europe and want to check out the first day-long Ceph workshop, now is your last chance to get in on the action! http://cephworkshops.eventbrite.nl/# I'll be there, as well as Wido (who is organizing the event), Greg, Ross, and many other exciting people--developers and

Re: [PATCH 6/8] rbd: define image specification structure

2012-10-30 Thread Josh Durgin
On 10/30/2012 05:40 AM, Alex Elder wrote: On 10/29/2012 05:13 PM, Josh Durgin wrote: A couple notes below, but looks good. I responded to all of your notes below. And I will update the code/comments as appropriate after we have a chance to talk about it but for the time being I'm going to

Re: [PATCH 7/8] rbd: add reference counting to rbd_spec

2012-10-30 Thread Josh Durgin
On 10/30/2012 05:59 AM, Alex Elder wrote: On 10/29/2012 05:20 PM, Josh Durgin wrote: On 10/26/2012 04:03 PM, Alex Elder wrote: With layered images we'll share rbd_spec structures, so add a reference count to it. It neatens up some code also. Could you explain your plan for these data

Re: [PATCH 8/8] rbd: fill rbd_spec in rbd_add_parse_args()

2012-10-30 Thread Josh Durgin
On 10/30/2012 06:09 AM, Alex Elder wrote: On 10/29/2012 05:30 PM, Josh Durgin wrote: On 10/26/2012 04:03 PM, Alex Elder wrote: Pass the address of an rbd_spec structure to rbd_add_parse_args(). Use it to hold the information defining the rbd image to be mapped in an rbd_add() call. Use the

Re: production ready?

2012-10-30 Thread Stefan Priebe
Am 30.10.2012 14:45, schrieb Gregory Farnum: But there's still the problem of slow random write IOP/s. At least I haven't seen any good benchmarks. It's not magic — I haven't done extensive testing but I believe people see aggregate IOPs of about what you can calculate: (number of storage
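
Greg's formula is cut off in this preview; as a loudly hypothetical illustration of the usual back-of-the-envelope reasoning (invented numbers, not necessarily his exact expression):

    # 24 spinning disks x ~100 write IOPS each  = 2400 raw IOPS
    # divide by 2x replication                  = 1200
    # divide by ~2 for journal double-writes    = ~600 aggregate write IOPS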

Re: production ready?

2012-10-30 Thread Dan Mick
On 10/30/2012 07:59 AM, Gandalf Corvotempesta wrote: 2012/10/30 袁冬 yuandong1...@gmail.com: Yes, but network (and many other issues) must be considered. Obviously 3 is suggested. Any contraindication to running a mon on the same OSD server? Generally that's considered OK. ceph-mon
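
A hedged sketch of what colocating a monitor with an OSD looks like in ceph.conf; host names and addresses are placeholders:

    [mon.a]
        host = server1
        mon addr = 192.168.0.1:6789

    [osd.0]
        host = server1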

Re: Different geoms for an rbd block device

2012-10-30 Thread Josh Durgin
On 10/28/2012 03:02 AM, Andrey Korolyov wrote: Hi, Should the following behavior be considered normal? $ rbd map test-rack0/debiantest --user qemukvm --secret qemukvm.key $ fdisk /dev/rbd1 Command (m for help): p Disk /dev/rbd1: 671 MB, 671088640 bytes 255 heads, 63 sectors/track, 81

rbd: wrap up of initialization rework

2012-10-30 Thread Alex Elder
These four patches sort of finish out this bunch of patches that have been reworking how rbd initialization of devices is done. With this foundation in place, probing for parents of a layered device fits in fairly neatly. [PATCH 1/4] rbd: don't pass rbd_dev to rbd_get_client() This makes

[PATCH 1/4] rbd: don't pass rbd_dev to rbd_get_client()

2012-10-30 Thread Alex Elder
The only reason rbd_dev is passed to rbd_get_client() is so its rbd_client field can get assigned. Instead, just return the rbd_client pointer as a result and have the caller do the assignment. Change rbd_put_client() so it takes an rbd_client structure, so it follows the more typical symmetry with

[PATCH 2/4] rbd: consolidate rbd_dev init in rbd_add()

2012-10-30 Thread Alex Elder
Group the allocation and initialization of fields of the rbd device structure created in rbd_add(). Move the grouped code down later in the function, just prior to the call to rbd_dev_probe(). This is for the most part simple code movement. Signed-off-by: Alex Elder el...@inktank.com ---

[PATCH 3/4] rbd: define rbd_dev_{create,destroy}() helpers

2012-10-30 Thread Alex Elder
Encapsulate the creation/initialization and destruction of rbd device structures. The rbd_client and the rbd_spec structures provided on creation hold references whose ownership is transferred to the new rbd_device structure. Signed-off-by: Alex Elder el...@inktank.com --- drivers/block/rbd.c |

Re: production ready?

2012-10-30 Thread Sage Weil
On Tue, 30 Oct 2012, Gandalf Corvotempesta wrote: 2012/10/30 Dan Mick dan.m...@inktank.com: Generally that's considered OK. ceph-mon doesn't use very much disk or CPU or network bandwidth. In this case, should I reserve some space for ceph-mon (a partition or a dedicated disk) or ceph-mon
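
If you do carve out a partition for the monitor, the relevant ceph.conf knob is the monitor data path; a hedged sketch with a made-up mount point:

    [mon.a]
        host = server1
        # keep the monitor store on its own small partition/filesystem
        mon data = /srv/ceph-mon/mon.a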

Re: Different geoms for an rbd block device

2012-10-30 Thread Andrey Korolyov
On Wed, Oct 31, 2012 at 1:07 AM, Josh Durgin josh.dur...@inktank.com wrote: On 10/28/2012 03:02 AM, Andrey Korolyov wrote: Hi, Should the following behavior be considered normal? $ rbd map test-rack0/debiantest --user qemukvm --secret qemukvm.key $ fdisk /dev/rbd1 Command (m for help): p

Re: Different geoms for an rbd block device

2012-10-30 Thread Josh Durgin
On 10/30/2012 02:41 PM, Andrey Korolyov wrote: On Wed, Oct 31, 2012 at 1:07 AM, Josh Durgin josh.dur...@inktank.com wrote: On 10/28/2012 03:02 AM, Andrey Korolyov wrote: Hi, Should the following behavior be considered normal? $ rbd map test-rack0/debiantest --user qemukvm --secret

[PATCH 0/6] rbd: version 2 parent probing

2012-10-30 Thread Alex Elder
This series puts in place a few remaining pieces before finally implementing the call to rbd_dev_probe() for the parent of a layered rbd image if present. -Alex [PATCH 1/6] rbd: skip getting image id if known [PATCH 2/6] rbd: allow null image name

[PATCH 1/6] rbd: skip getting image id if known

2012-10-30 Thread Alex Elder
We will know the image id for a format 2 parent image, but won't initially know its image name. Avoid making the query for an image id in rbd_dev_image_id() if it's already known. Signed-off-by: Alex Elder el...@inktank.com --- drivers/block/rbd.c |8 1 file changed, 8 insertions(+)

[PATCH 2/6] rbd: allow null image name

2012-10-30 Thread Alex Elder
Format 2 parent images are partially identified by their image id, but it may not be possible to determine their image name. The name is not strictly needed for correct operation, so we won't be treating it as an error if we don't know it. Handle this case gracefully in rbd_name_show().

[PATCH 3/6] rbd: get parent spec for version 2 images

2012-10-30 Thread Alex Elder
Add support for getting the information identifying the parent image for rbd images that have them. The child image holds a reference to its parent image specification structure. Create a new entry parent in /sys/bus/rbd/image/N/ to report the identifying information for the parent image, if

[PATCH 4/6] libceph: define ceph_pg_pool_name_by_id()

2012-10-30 Thread Alex Elder
Define and export function ceph_pg_pool_name_by_id() to supply the name of a pg pool whose id is given. This will be used by the next patch. Signed-off-by: Alex Elder el...@inktank.com --- include/linux/ceph/osdmap.h |1 + net/ceph/osdmap.c | 16 2 files

[PATCH 5/6] rbd: get additional info in parent spec

2012-10-30 Thread Alex Elder
When a layered rbd image has a parent, that parent is identified only by its pool id, image id, and snapshot id. Images that have been mapped also record *names* for those three id's. Add code to look up these names for parent images so they match mapped images more closely. Skip doing this for

[PATCH 6/6] rbd: probe the parent of an image if present

2012-10-30 Thread Alex Elder
Call the probe function for the parent device. Signed-off-by: Alex Elder el...@inktank.com --- drivers/block/rbd.c | 79 +-- 1 file changed, 76 insertions(+), 3 deletions(-) diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c index