Re: Client can't reboot when rbd volume is mounted.

2013-02-11 Thread Roman Alekseev
On 11.02.2013 17:52, Sage Weil wrote: On Mon, 11 Feb 2013, Roman Alekseev wrote: On 11.02.2013 09:36, Sage Weil wrote: On Mon, 11 Feb 2013, Roman Alekseev wrote: Hi, When I try to reboot a client server without unmounting of rbd volume manually its services stop working but server doesn't re

slow requests, hunting for new mon

2013-02-11 Thread Chris Dunlop
Hi, What are likely causes for "slow requests" and "monclient: hunting for new mon" messages? E.g.: 2013-02-12 16:27:07.318943 7f9c0bc16700 0 monclient: hunting for new mon ... 2013-02-12 16:27:45.892314 7f9c13c26700 0 log [WRN] : 6 slow requests, 6 included below; oldest blocked for > 30.3838

Re: .gitignore issues

2013-02-11 Thread Josh Durgin
On 02/11/2013 06:28 PM, David Zafman wrote: After updating to latest master I have the following files listed by git status: These are mostly renamed binaries. If you run 'make clean' on the version before the name changes (133295ed001a950e3296f4e88a916ab2405be0cc) they'll be removed. If yo

.gitignore issues

2013-02-11 Thread David Zafman
After updating to latest master I have the following files listed by git status: $ git status # On branch master # Untracked files: # (use "git add ..." to include in what will be committed) # # src/bench_log # src/ceph-filestore-dump # src/ceph.conf # src/dupstore #

Re: osd down (for 2 about 2 minutes) error after adding a new host to my cluster

2013-02-11 Thread Isaac Otsiabah
Yes, there were osd daemons running on the same node that the monitor was running on. If that is the case then I will run a test case with the monitor running on a different node where no osd is running and see what happens. Thank you. Isaac From: Gregory F

Re: File exists not handled in 0.48argonaut1

2013-02-11 Thread Samuel Just
The actual problem appears to be a corrupted log file. You should rename out of the way the directory: /mnt/osd97/current/corrupt_log_2013-02-08_18:50_2.fa8. Then, restart the osd with debug osd = 20, debug filestore = 20, and debug ms = 1 in the [osd] section of the ceph.conf. -Sam On Mon, Feb
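
For reference, the debug settings named in the message above might be laid out in ceph.conf roughly like this. This is only a sketch: the three option names and values are taken from the message, and everything else in the file is assumed to stay unchanged.

```
[osd]
    ; verbose logging while reproducing the problem, as suggested above
    debug osd = 20
    debug filestore = 20
    debug ms = 1
```

These levels produce large logs, so they are usually set back to their previous values once the log has been captured.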

Re: rest mgmt api

2013-02-11 Thread Sage Weil
On Mon, 11 Feb 2013, Gregory Farnum wrote: > [...] > ...but my instinct is to want one canonical code path in the monitors, > not two. Two allows for discrepancies in what each method allows to > [...] Yeah, I'm convinced. Just chatted with Dan and Josh a bit about this. Josh had the interesting

Re: Crash and strange things on MDS

2013-02-11 Thread Kevin Decherf
On Mon, Feb 11, 2013 at 02:47:13PM -0800, Gregory Farnum wrote: > On Mon, Feb 11, 2013 at 2:24 PM, Kevin Decherf wrote: > > On Mon, Feb 11, 2013 at 12:25:59PM -0800, Gregory Farnum wrote: > > Yes, there is a dump of 100,000 events for this backtrace in the linked > > archive (I need 7 hours to upl

Re: Unable to mount cephfs - can't read superblock

2013-02-11 Thread Ross David Turk
On Feb 9, 2013, at 3:25 AM, Adam Nielsen wrote: > I will use that list as soon as it appears on GMane, since I find their NNTP > interface a lot easier than managing a bunch of mailing list subscriptions! > Maybe someone with more authority than myself can add it? > > http://gmane.org/subscr

Re: Crash and strange things on MDS

2013-02-11 Thread Gregory Farnum
On Mon, Feb 11, 2013 at 2:24 PM, Kevin Decherf wrote: > On Mon, Feb 11, 2013 at 12:25:59PM -0800, Gregory Farnum wrote: >> On Mon, Feb 4, 2013 at 10:01 AM, Kevin Decherf wrote: >> > References: >> > [1] http://www.spinics.net/lists/ceph-devel/msg04903.html >> > [2] ceph version 0.56.1 (e4a541624d

Re: rest mgmt api

2013-02-11 Thread Gregory Farnum
On Mon, Feb 11, 2013 at 2:00 PM, Sage Weil wrote: > On Mon, 11 Feb 2013, Gregory Farnum wrote: >> On Wed, Feb 6, 2013 at 12:14 PM, Sage Weil wrote: >> > On Wed, 6 Feb 2013, Dimitri Maziuk wrote: >> >> On 02/06/2013 01:34 PM, Sage Weil wrote: >> >> >> >> > I think the one caveat here is that havin

Re: rest mgmt api

2013-02-11 Thread Dimitri Maziuk
On 02/11/2013 04:00 PM, Sage Weil wrote: > On Mon, 11 Feb 2013, Gregory Farnum wrote: ... > That doesn't really help; it means the mon still has to understand the > CLI grammar. > > What we are talking about is the difference between: > > [ 'osd', 'down', '123' ] > > and > > { > URI: '/osd/

Re: [PATCH] fs: encode_fh: return FILEID_INVALID if invalid fid_type

2013-02-11 Thread Namjae Jeon
2013/2/12, Dave Chinner : > On Mon, Feb 11, 2013 at 05:25:58PM +0900, Namjae Jeon wrote: >> From: Namjae Jeon >> >> This patch is a follow up on below patch: >> >> [PATCH] exportfs: add FILEID_INVALID to indicate invalid fid_type >> commit: 216b6cbdcbd86b1db0754d58886b466ae31f5a63 > >> diff -

Re: Crash and strange things on MDS

2013-02-11 Thread Kevin Decherf
On Mon, Feb 11, 2013 at 12:25:59PM -0800, Gregory Farnum wrote: > On Mon, Feb 4, 2013 at 10:01 AM, Kevin Decherf wrote: > > References: > > [1] http://www.spinics.net/lists/ceph-devel/msg04903.html > > [2] ceph version 0.56.1 (e4a541624df62ef353e754391cbbb707f54b16f7) > > 1: /usr/bin/ceph-mds(

Re: File exists not handled in 0.48argonaut1

2013-02-11 Thread Mandell Degerness
Since the attachment didn't work, apparently, here is a link to the log: http://dl.dropbox.com/u/766198/error17.log.gz On Mon, Feb 11, 2013 at 1:42 PM, Samuel Just wrote: > I don't see the more complete log. > -Sam > > On Mon, Feb 11, 2013 at 11:12 AM, Mandell Degerness > wrote: >> Anyone have

Re: chain_fsetxattr extra chunk removal

2013-02-11 Thread Loic Dachary
Hi, I amended the unit tests ( https://github.com/ceph/ceph/pull/40/files ) to cover the code below. A review would be much appreciated :-) Cheers On 02/11/2013 09:08 PM, Loic Dachary wrote: > > > On 02/11/2013 06:13 AM, Yehuda Sadeh wrote: >> On Thu, Feb 7, 2013 at 12:59 PM, Loic Dachary wr

Re: rest mgmt api

2013-02-11 Thread Sage Weil
On Mon, 11 Feb 2013, Gregory Farnum wrote: > On Wed, Feb 6, 2013 at 12:14 PM, Sage Weil wrote: > > On Wed, 6 Feb 2013, Dimitri Maziuk wrote: > >> On 02/06/2013 01:34 PM, Sage Weil wrote: > >> > >> > I think the one caveat here is that having a single registry for commands > >> > in the monitor mea

Re: File exists not handled in 0.48argonaut1

2013-02-11 Thread Samuel Just
I don't see the more complete log. -Sam On Mon, Feb 11, 2013 at 11:12 AM, Mandell Degerness wrote: > Anyone have any thoughts on this??? It looks like I may have to wipe > out the OSDs affected and rebuild them, but I'm afraid that may result > in data loss because of the old OSD first crush map

Re: [PATCH] fs: encode_fh: return FILEID_INVALID if invalid fid_type

2013-02-11 Thread Dave Chinner
On Mon, Feb 11, 2013 at 05:25:58PM +0900, Namjae Jeon wrote: > From: Namjae Jeon > > This patch is a follow up on below patch: > > [PATCH] exportfs: add FILEID_INVALID to indicate invalid fid_type > commit: 216b6cbdcbd86b1db0754d58886b466ae31f5a63 > diff --git a/fs/xfs/xfs_export.c b/fs/xf

Re: OSD Weights

2013-02-11 Thread Gregory Farnum
On Mon, Feb 11, 2013 at 12:43 PM, Holcombe, Christopher wrote: > Hi Everyone, > > I just wanted to confirm my thoughts on the ceph osd weightings. My > understanding is they are a statistical distribution number. My current > setup has 3TB hard drives and they all have the default weight of 1.

OSD Weights

2013-02-11 Thread Holcombe, Christopher
Hi Everyone, I just wanted to confirm my thoughts on the ceph osd weightings. My understanding is they are a statistical distribution number. My current setup has 3TB hard drives and they all have the default weight of 1. I was thinking that if I mixed in 4TB hard drives in the future it wou

Re: osd down (for 2 about 2 minutes) error after adding a new host to my cluster

2013-02-11 Thread Gregory Farnum
Isaac, I'm sorry I haven't been able to wrangle any time to look into this more yet, but Sage pointed out in a related thread that there might be some buggy handling of things like this if the OSD and the monitor are located on the same host. Am I correct in assuming that with your small cluster,

Re: Crash and strange things on MDS

2013-02-11 Thread Gregory Farnum
On Mon, Feb 4, 2013 at 10:01 AM, Kevin Decherf wrote: > References: > [1] http://www.spinics.net/lists/ceph-devel/msg04903.html > [2] ceph version 0.56.1 (e4a541624df62ef353e754391cbbb707f54b16f7) > 1: /usr/bin/ceph-mds() [0x817e82] > 2: (()+0xf140) [0x7f9091d30140] > 3: (MDCache::requ

Re: Unable to mount cephfs - can't read superblock

2013-02-11 Thread Gregory Farnum
On Sat, Feb 9, 2013 at 2:13 PM, Adam Nielsen wrote: $ ceph -s health HEALTH_WARN 192 pgs degraded; 192 pgs stuck unclean monmap e1: 1 mons at {0=192.168.0.6:6789/0}, election epoch 0, quorum 0 0 osdmap e3: 1 osds: 1 up, 1 in pgmap v119: 192 pgs: 192 active+degraded; 0

Re: preferred OSD

2013-02-11 Thread Gregory Farnum
On Fri, Feb 8, 2013 at 4:45 PM, Sage Weil wrote: > Hi Marcus- > > On Fri, 8 Feb 2013, Marcus Sorensen wrote: >> I know people have been discussing on and off about providing a >> "preferred OSD" for things like multi-datacenter, or even within a >> datacenter, choosing an OSD that would avoid tra

Re: chain_fsetxattr extra chunk removal

2013-02-11 Thread Loic Dachary
On 02/11/2013 06:13 AM, Yehuda Sadeh wrote: > On Thu, Feb 7, 2013 at 12:59 PM, Loic Dachary wrote: >> Hi, >> >> While writing unit tests for chain_xattr.cc I tried to understand how to >> create the conditions to trigger this part of the chain_fsetxattr function: >> >> /* if we're exactly at

Re: rest mgmt api

2013-02-11 Thread Gregory Farnum
On Wed, Feb 6, 2013 at 12:14 PM, Sage Weil wrote: > On Wed, 6 Feb 2013, Dimitri Maziuk wrote: >> On 02/06/2013 01:34 PM, Sage Weil wrote: >> >> > I think the one caveat here is that having a single registry for commands >> > in the monitor means that commands can come in two flavors: vector >> > (

Re: File exists not handled in 0.48argonaut1

2013-02-11 Thread Mandell Degerness
Anyone have any thoughts on this??? It looks like I may have to wipe out the OSDs affected and rebuild them, but I'm afraid that may result in data loss because of the old OSD first crush map in place :(. On Fri, Feb 8, 2013 at 1:36 PM, Mandell Degerness wrote: > We ran into an error which appea

Re: Crash and strange things on MDS

2013-02-11 Thread Kevin Decherf
On Mon, Feb 11, 2013 at 11:00:15AM -0600, Sam Lang wrote: > Hi Kevin, sorry for the delayed response. > This looks like the mds cache is thrashing quite a bit, and with > multiple MDSs the tree partitioning is causing those estale messages. > In your case, you should probably run with just a single

Re: Crash and strange things on MDS

2013-02-11 Thread Sam Lang
On Mon, Feb 11, 2013 at 7:05 AM, Kevin Decherf wrote: > On Mon, Feb 04, 2013 at 07:01:54PM +0100, Kevin Decherf wrote: >> Hey everyone, >> >> It's my first post here to expose a potential issue I found today using >> Ceph 0.56.1. >> >> The cluster configuration is, briefly: 27 osd of ~900GB and 3

Re: [ceph-users] snapshot, clone and mount a VM-Image

2013-02-11 Thread Sage Weil
On Mon, 11 Feb 2013, Wolfgang Hennerbichler wrote: > > > On 02/11/2013 03:02 PM, Wido den Hollander wrote: > > > You are looking at a way to "extract" the snapshot, correct? > > No. > > > Why would > > you want to mount it and backup the files? > > because then I can do things like incrementa

Re: IPv6 address confusion in OSDs

2013-02-11 Thread Sage Weil
On Mon, 11 Feb 2013, Simon Leinen wrote: > Sage Weil writes: > > On Mon, 11 Feb 2013, Simon Leinen wrote: > >> We run a ten-node 64-OSD Ceph cluster and use IPv6 where possible. > > I should have mentioned that this is under Ubuntu 12.10 with version > 0.56.1-1quantal of the ceph packages. Sorry

Re: [PATCH 01/15] kv_flat_btree_async.cc: use vector instead of VLA's

2013-02-11 Thread Sage Weil
On Mon, 11 Feb 2013, Danny Al-Gaaf wrote: > Am 10.02.2013 06:57, schrieb Sage Weil: > > On Thu, 7 Feb 2013, Danny Al-Gaaf wrote: > >> Fix "variable length array of non-POD element type" errors caused by > >> using librados::ObjectWriteOperation VLAs. (-Wvla) > >> > >> Signed-off-by: Danny Al-Gaaf

Re: IPv6 address confusion in OSDs

2013-02-11 Thread Simon Leinen
Sage Weil writes: > On Mon, 11 Feb 2013, Simon Leinen wrote: >> We run a ten-node 64-OSD Ceph cluster and use IPv6 where possible. I should have mentioned that this is under Ubuntu 12.10 with version 0.56.1-1quantal of the ceph packages. Sorry about the omission. >> Today I noticed this error me

Re: Client can't reboot when rbd volume is mounted.

2013-02-11 Thread Sage Weil
On Mon, 11 Feb 2013, Roman Alekseev wrote: > On 11.02.2013 09:36, Sage Weil wrote: > > On Mon, 11 Feb 2013, Roman Alekseev wrote: > > > Hi, > > > > > > When I try to reboot a client server without unmounting of rbd volume > > > manually > > > its services stop working but server doesn't reboot co

Re: IPv6 address confusion in OSDs

2013-02-11 Thread Sage Weil
On Mon, 11 Feb 2013, Simon Leinen wrote: > We run a ten-node 64-OSD Ceph cluster and use IPv6 where possible. > > Today I noticed this error message from an OSD just after I restarted > it (in an attempt to resolve an issue with some "stuck" pgs that > included that OSD): > > 2013-02-11 09:24:57.

Re: [PATCH] fs: encode_fh: return FILEID_INVALID if invalid fid_type

2013-02-11 Thread Sage Weil
Acked-by: Sage Weil On Mon, 11 Feb 2013, Namjae Jeon wrote: > From: Namjae Jeon > > This patch is a follow up on below patch: > > [PATCH] exportfs: add FILEID_INVALID to indicate invalid fid_type > commit: 216b6cbdcbd86b1db0754d58886b466ae31f5a63 > > Signed-off-by: Namjae Jeon > Signed-off

Re: [PATCH 01/15] kv_flat_btree_async.cc: use vector instead of VLA's

2013-02-11 Thread Danny Al-Gaaf
Am 10.02.2013 06:57, schrieb Sage Weil: > On Thu, 7 Feb 2013, Danny Al-Gaaf wrote: >> Fix "variable length array of non-POD element type" errors caused by >> using librados::ObjectWriteOperation VLAs. (-Wvla) >> >> Signed-off-by: Danny Al-Gaaf >> --- >> src/key_value_store/kv_flat_btree_async.cc

Re: Crash and strange things on MDS

2013-02-11 Thread Kevin Decherf
On Mon, Feb 04, 2013 at 07:01:54PM +0100, Kevin Decherf wrote: > Hey everyone, > > It's my first post here to expose a potential issue I found today using > Ceph 0.56.1. > > The cluster configuration is, briefly: 27 osd of ~900GB and 3 MON/MDS. > All nodes are running Exherbo (source-based distri

Re: Client can't reboot when rbd volume is mounted.

2013-02-11 Thread Roman Alekseev
On 11.02.2013 09:36, Sage Weil wrote: On Mon, 11 Feb 2013, Roman Alekseev wrote: Hi, When I try to reboot a client server without unmounting of rbd volume manually its services stop working but server doesn't reboot completely and show the following logs in KVM console: [235618.0202207] libce

IPv6 address confusion in OSDs

2013-02-11 Thread Simon Leinen
We run a ten-node 64-OSD Ceph cluster and use IPv6 where possible. Today I noticed this error message from an OSD just after I restarted it (in an attempt to resolve an issue with some "stuck" pgs that included that OSD): 2013-02-11 09:24:57.232811 osd.35 [ERR] map e768 had wrong cluster addr ([

[PATCH] fs: encode_fh: return FILEID_INVALID if invalid fid_type

2013-02-11 Thread Namjae Jeon
From: Namjae Jeon This patch is a follow up on below patch: [PATCH] exportfs: add FILEID_INVALID to indicate invalid fid_type commit: 216b6cbdcbd86b1db0754d58886b466ae31f5a63 Signed-off-by: Namjae Jeon Signed-off-by: Vivek Trivedi Acked-by: Steven Whitehouse --- fs/btrfs/export.c |4