Re: [ceph-users] About conf parameter mon_initial_members

2014-10-21 Thread Nicheal
Hi guys, I tried to bootstrap the monitor without setting the parameter mon_initial_members, or leaving it as none, but the mon can still be created and runs correctly. So can the osd. Actually, I find that the command-line tool and the osd hunt for the mon based on the setting below, e.g.: [mon.b] host = ceph0
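For readers following along, the setup described above can be sketched as a small ceph.conf fragment. This is only an illustration, not the poster's actual file: the truncated message shows just `[mon.b]` and `host = ceph0`, so the `mon addr` line, its address, and the temporary file are assumptions added here.

```shell
#!/bin/sh
# Write a minimal, hypothetical ceph.conf fragment to a temporary file.
conf="$(mktemp)"
cat > "$conf" <<'EOF'
[global]
# mon_initial_members is intentionally left unset, as in the message above

[mon.b]
host = ceph0
# hypothetical address, not from the original message
mon addr = 192.168.0.10:6789
EOF

# Print each [mon.*] section together with the host it names.
awk '/^\[mon\./ { sec = $0 } /^host[ ]*=/ { print sec, $3 }' "$conf"
```

The awk line is just a quick way to check which monitor sections a config file declares before testing whether clients can find them.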

[ceph-users] About conf parameter mon_initial_members

2014-10-21 Thread Nicheal
Hi guys, ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] why the erasure code pool not support random write?

2014-10-21 Thread Nicheal
2014-10-21 7:40 GMT+08:00 Lionel Bouton lionel+c...@bouton.name: Hi, On 21/10/2014 01:10, 池信泽 wrote: Thanks. Another reason is that the checksum in the attr of the object, used for deep scrub in EC pools, should be computed when the object is modified. When supporting the random write, We should

Re: [ceph-users] why the erasure code pool not support random write?

2014-10-21 Thread Nicheal
2014-10-20 22:39 GMT+08:00 Wido den Hollander w...@42on.com: On 10/20/2014 03:25 PM, 池信泽 wrote: hi, cephers: When I looked into the ceph source code, I found that the erasure code pool does not support random writes; it only supports append writes. Why? Is that random write of is erasure

Re: [ceph-users] recovery process stops

2014-10-21 Thread Harald Rößler
Hi all, thank you for your support, now the file system is not degraded any more. Now I have a minus degrading :-) 2014-10-21 10:15:22.303139 mon.0 [INF] pgmap v43376478: 3328 pgs: 3281 active+clean, 47 active+remapped; 1609 GB data, 5022 GB used, 1155 GB / 6178 GB avail; 8034B/s rd, 3548KB/s

[ceph-users] ceph-deploy problem on centos6

2014-10-21 Thread Luca Mazzaferro
Dear Users, whenever I run the ceph-deploy command, with whatever option, I get this error message at the end: Error in sys.exitfunc: The command seems to work. Can it be ignored? Has anyone else had this problem? I'm running CentOS 6.5. Thank you. Cheers, Luca Mazzaferro

[ceph-users] Giant release schedule

2014-10-21 Thread Andrei Mikhailovsky
Hello cephers, Does anyone know when is the planned release date for Giant? Cheers Andrei

Re: [ceph-users] real beginner question

2014-10-21 Thread Ranju Upadhyay
Hi Dan and Ashish, Many thanks for the replies. It is partly for learning (and perhaps a production system in future) that we are playing with ceph. I guess, to start with, a simple setup of maybe 4 VMs, where one VM is the admin node, one is the mon, and 2 are OSDs (as suggested in

Re: [ceph-users] real beginner question

2014-10-21 Thread Christian Balzer
Hello, On Tue, 21 Oct 2014 10:53:54 +0100 Ranju Upadhyay wrote: Hi Dan and Ashish, Many thanks for the replies. It is partly for the learning (and perhaps a production system in future) that we are playing with ceph. Learning about Ceph (configuration, operation wise) with VMs is fine.

Re: [ceph-users] real beginner question

2014-10-21 Thread Dan Geist
Agreed. The getting started instructions walk you through creating most of a cluster and then adding one more node piecemeal, but this seems needlessly complex for a beginner. I found the easiest way to get up and running was with three VMs with one or two logical partitions for OSDs on each and

Re: [ceph-users] Giant release schedule

2014-10-21 Thread Sage Weil
On Tue, 21 Oct 2014, Andrei Mikhailovsky wrote: Hello cephers, Does anyone know when is the planned release date for Giant? Looking at another week or two. There are still a few outstanding issues we'd like to squash! sage

[ceph-users] pgs stuck in 'incomplete' state, blocked ops, query command hangs

2014-10-21 Thread Lincoln Bryant
Hi cephers, We have two pgs that are stuck in 'incomplete' state across two different pools: pg 2.525 is stuck inactive since forever, current state incomplete, last acting [55,89] pg 0.527 is stuck inactive since forever, current state incomplete, last acting [55,89] pg 0.527 is stuck

Re: [ceph-users] Giant release schedule

2014-10-21 Thread Andrei Mikhailovsky
Thanks, Sage, Can't wait to try it out and see if there are any improvements in the caching pool tier. Cheers Andrei - Original Message - From: Sage Weil s...@newdream.net To: Andrei Mikhailovsky and...@arhont.com Cc: ceph-users ceph-us...@ceph.com Sent: Tuesday, 21 October,

Re: [ceph-users] CRUSH depends on host + OSD?

2014-10-21 Thread Chad Seys
Hi Craig, It's part of the way the CRUSH hashing works. Any change to the CRUSH map causes the algorithm to change slightly. Dan@cern could not replicate my observations, so I plan to follow his procedure (fake create an OSD, wait for rebalance, remove fake OSD) in the near future to see if

Re: [ceph-users] recovery process stops

2014-10-21 Thread Craig Lewis
That will fix itself over time. remapped just means that Ceph is moving the data around. It's normal to see PGs in the remapped and/or backfilling state after OSD restarts. They should go down steadily over time. How long depends on how much data is in the PGs, how fast your hardware is, how

Re: [ceph-users] Same rbd mount from multiple servers

2014-10-21 Thread Alexandre DERUMIER
Thank you for your quick response! Okay I see, is there any preferred clustered FS in this case? OCFS2, GFS? Hi, I'm using ocfs2 on top of rbd in production, works fine. (you need to disable writeback/rbd_cache) - Original Message - From: Mihály Árva-Tóth

Re: [ceph-users] CRUSH depends on host + OSD?

2014-10-21 Thread Gregory Farnum
On Tuesday, October 21, 2014, Chad Seys cws...@physics.wisc.edu wrote: Hi Craig, It's part of the way the CRUSH hashing works. Any change to the CRUSH map causes the algorithm to change slightly. Dan@cern could not replicate my observations, so I plan to follow his procedure (fake

Re: [ceph-users] Few questions.

2014-10-21 Thread Robert LeBlanc
I'm still pretty new at Ceph so take this with a grain of salt. 1. In our experience, we have tried SSD journals and bcache; we have had more stability and performance by just using SSD journals. We have created an SSD pool with the rest of the space and it did not perform much better

Re: [ceph-users] pgs stuck in 'incomplete' state, blocked ops, query command hangs

2014-10-21 Thread Lincoln Bryant
A small update on this, I rebooted all of the Ceph nodes and was able to then query one of the misbehaving pgs. I've attached the query for pg 2.525.

Re: [ceph-users] pgs stuck in 'incomplete' state, blocked ops, query command hangs

2014-10-21 Thread Lincoln Bryant
Whoops, didn't mean to attach the file as rtf. Plaintext attached { state: incomplete, epoch: 224352, up: [ 55, 89], acting: [ 55, 89], info: { pgid: 2.525, last_update: 0'0, last_complete: 0'0, log_tail: 0'0, last_user_version: 0,

[ceph-users] Question/idea about performance problems with a few overloaded OSDs

2014-10-21 Thread Lionel Bouton
Hi, I've yet to install 0.80.7 on one node to confirm its stability and use the new IO priority tuning parameters enabling prioritized access to data from client requests. In the meantime, faced with large slowdowns caused by resync or external IO load (although external IO load is not expected

[ceph-users] OSDs will not come up

2014-10-21 Thread tsuraan
I configured a three-monitor Ceph cluster following the manual instructions at http://ceph.com/docs/v0.80.5/install/manual-deployment/ and http://ceph.com/docs/master/rados/operations/add-or-rm-mons/ . The monitor cluster came up without a problem, and seems to be fine. ceph -s currently shows

[ceph-users] Extremely slow small files rewrite performance

2014-10-21 Thread Sergey Nazarov
Hi, I just built a new cluster using these quickstart instructions: http://ceph.com/docs/master/start/ And here is what I am seeing: # time for i in {1..10}; do echo $i > $i.txt ; done real 0m0.081s user 0m0.000s sys 0m0.004s And if I try to repeat the same command (when the files are already created):

Re: [ceph-users] Question/idea about performance problems with a few overloaded OSDs

2014-10-21 Thread Gregory Farnum
On Tue, Oct 21, 2014 at 10:15 AM, Lionel Bouton lionel+c...@bouton.name wrote: Hi, I've yet to install 0.80.7 on one node to confirm its stability and use the new IO priority tuning parameters enabling prioritized access to data from client requests. In the meantime, faced with large

Re: [ceph-users] Extremely slow small files rewrite performance

2014-10-21 Thread Gregory Farnum
Are these tests conducted using a local fs on RBD, or using CephFS? If CephFS, do you have multiple clients mounting the FS, and what are they doing? What client (kernel or ceph-fuse)? -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Tue, Oct 21, 2014 at 9:05 AM, Sergey

Re: [ceph-users] Extremely slow small files rewrite performance

2014-10-21 Thread Sergey Nazarov
It is CephFS mounted via ceph-fuse. I am getting the same results regardless of how many other clients have this fs mounted or what they are doing. The cluster is running on Debian Wheezy, kernel 3.2.0-4-amd64. On Tue, Oct 21, 2014 at 1:44 PM, Gregory Farnum g...@inktank.com wrote: Are these

Re: [ceph-users] Question/idea about performance problems with a few overloaded OSDs

2014-10-21 Thread Lionel Bouton
Hi Gregory, On 21/10/2014 19:39, Gregory Farnum wrote: On Tue, Oct 21, 2014 at 10:15 AM, Lionel Bouton lionel+c...@bouton.name wrote: [...] Any thoughts? Is it based on wrong assumptions? Would it prove to be a can of worms if someone tried to implement it? Yeah, there's one big thing

Re: [ceph-users] Extremely slow small files rewrite performance

2014-10-21 Thread Gregory Farnum
Can you enable debugging on the client (debug ms = 1, debug client = 20) and mds (debug ms = 1, debug mds = 20), run this test again, and post them somewhere for me to look at? While you're at it, can you try rados bench and see what sort of results you get? -Greg Software Engineer #42 @
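The debug levels requested above could be set in ceph.conf roughly as follows. This is a sketch under assumptions: the message names only the four settings, so placing them in `[client]` and `[mds]` sections and writing them to a temporary file are choices made here for illustration.

```shell
#!/bin/sh
# Write the requested debug settings to a temporary fragment that could be
# merged into ceph.conf on the client and MDS nodes.
frag="$(mktemp)"
cat > "$frag" <<'EOF'
[client]
debug ms = 1
debug client = 20

[mds]
debug ms = 1
debug mds = 20
EOF

# Show the fragment so it can be reviewed before merging.
cat "$frag"
```

Logging at level 20 is very verbose, so it is usually worth reverting these settings once the test run has been captured.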

Re: [ceph-users] Question/idea about performance problems with a few overloaded OSDs

2014-10-21 Thread Mark Nelson
On 10/21/2014 01:06 PM, Lionel Bouton wrote: Hi Gregory, On 21/10/2014 19:39, Gregory Farnum wrote: On Tue, Oct 21, 2014 at 10:15 AM, Lionel Bouton lionel+c...@bouton.name wrote: [...] Any thoughts? Is it based on wrong assumptions? Would it prove to be a can of worms if someone tried to

Re: [ceph-users] recovery process stops

2014-10-21 Thread Harald Rößler
After more than 10 hours the situation is the same; I don’t think it will fix itself over time. How can I find out what the problem is? On 21.10.2014 at 17:28, Craig Lewis cle...@centraldesktop.com wrote: That will fix itself over time. remapped just means that Ceph

Re: [ceph-users] recovery process stops

2014-10-21 Thread Robert LeBlanc
I've had issues magically fix themselves overnight after waiting/trying things for hours. On Tue, Oct 21, 2014 at 1:02 PM, Harald Rößler harald.roess...@btd.de wrote: After more than 10 hours the situation is the same; I don’t think it will fix itself over time. How can I find out what the problem is?

Re: [ceph-users] Extremely slow small files rewrite performance

2014-10-21 Thread Sergey Nazarov
I enabled logging and performed the same tests. Here is a link to an archive with the logs; they are only from one node (the node where the active MDS was running): https://www.dropbox.com/s/80axovtoofesx5e/logs.tar.gz?dl=0 Rados bench results: # rados bench -p test 10 write Maintaining 16 concurrent

Re: [ceph-users] Extremely slow small files rewrite performance

2014-10-21 Thread Sergey Nazarov
Ouch, I think the client log is missing. Here it goes: https://www.dropbox.com/s/650mjim2ldusr66/ceph-client.admin.log.gz?dl=0 On Tue, Oct 21, 2014 at 3:22 PM, Sergey Nazarov nataraj...@gmail.com wrote: I enabled logging and performed the same tests. Here is a link to an archive with the logs; they are only

Re: [ceph-users] recovery process stops

2014-10-21 Thread Craig Lewis
In that case, take a look at ceph pg dump | grep remapped. In the up or acting column, there should be one or two common OSDs between the stuck PGs. Try restarting those OSD daemons. I've had a few OSDs get stuck scheduling recovery, particularly around toofull situations. I've also had
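The advice above (look for OSDs that the stuck PGs have in common) can be sketched as a small filter. It assumes that the up and acting sets are the only bracketed, comma-separated lists on each line, as in the `[55,89]` examples elsewhere in this listing; the helper name `count_osds` is hypothetical and not part of Ceph.

```shell
#!/bin/sh
# Count how often each OSD id appears in the bracketed up/acting sets of the
# lines read on stdin, most frequent first.
count_osds() {
    grep -o '\[[0-9,]*\]' |   # keep only the bracketed lists, e.g. [55,89]
        tr -d '[]' |          # drop the brackets
        tr ',' '\n' |         # one OSD id per line
        grep -v '^$' |        # ignore empty sets
        sort -n | uniq -c | sort -rn
}

# Intended use on a live cluster (not run here):
#   ceph pg dump | grep remapped | count_osds
```

OSDs at the top of the resulting list are the ones shared by the most remapped PGs, and so are the first candidates for a restart.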

Re: [ceph-users] RADOS pool snaps and RBD

2014-10-21 Thread Xavier Trilla
Hi Sage, Yes, I know about rbd diff, but the motivation behind this was to be able to dump an entire RBD pool via RADOS to another cluster, as our primary cluster uses quite expensive SSD storage and we would like to avoid constantly keeping one snapshot for every RBD image. The idea would

Re: [ceph-users] Few questions.

2014-10-21 Thread Christian Balzer
On Mon, 20 Oct 2014 11:07:43 +0200 Leszek Master wrote: 1) If I want to use a cache tier, should I use it with SSD journaling, or can I get better performance using more SSD GB for the cache tier? From reading what others on this ML experienced and what Robert already pointed out, cache tiering is

[ceph-users] mon_osd_down_out_subtree_limit stuck at rack

2014-10-21 Thread Christian Balzer
Hello, I'm trying to change the value of mon_osd_down_out_subtree_limit from rack to something, anything else with ceph 0.80.(6|7). Using injectargs it tells me that this isn't a runtime supported change and changing it in the config file (global section) to either host or room has no effect.