What versions of all the Ceph pieces are you using? (Kernel
client/ceph-fuse, MDS, etc)
Can you provide more details on exactly what the program is doing on
which nodes?
-Greg
On Fri, Jan 9, 2015 at 5:15 PM, Lorieri lori...@gmail.com wrote:
first 3 stat commands show blocks and size changing,
On Wed, Jan 7, 2015 at 9:55 PM, Christian Balzer ch...@gol.com wrote:
On Wed, 7 Jan 2015 17:07:46 -0800 Craig Lewis wrote:
On Mon, Dec 29, 2014 at 4:49 PM, Alexandre Oliva ol...@gnu.org wrote:
However, I suspect that temporarily setting min size to a lower number
could be enough for the
On Fri, Jan 9, 2015 at 1:24 AM, Christian Eichelmann
christian.eichelm...@1und1.de wrote:
Hi all,
as mentioned last year, our ceph cluster is still broken and unusable.
We are still investigating what has happened and I am taking a deeper
look into the output of ceph pg <pgnum> query.
The
100GB objects (or ~40 on a hard drive!) are way too large for you to
get an effective random distribution.
-Greg
On Thu, Jan 8, 2015 at 5:25 PM, Mark Nelson mark.nel...@inktank.com wrote:
On 01/08/2015 03:35 PM, Michael J Brewer wrote:
Hi all,
I'm working on filling a cluster to near
On Thu, Jan 8, 2015 at 5:46 AM, Zeeshan Ali Shah zas...@pdc.kth.se wrote:
I just finished configuring ceph up to 100 TB with openstack ... Since we
are also using Lustre in our HPC machines, just wondering what is the
bottleneck in ceph going to petabyte scale like Lustre.
Any idea? Or
On Fri, Jan 9, 2015 at 2:00 AM, Nico Schottelius
nico-ceph-us...@schottelius.org wrote:
Lionel, Christian,
we have exactly the same trouble as Christian,
namely
Christian Eichelmann [Fri, Jan 09, 2015 at 10:43:20AM +0100]:
We still don't know what caused this specific error...
and
:15 PM, Gregory Farnum g...@gregs42.com wrote:
On Thu, Jan 8, 2015 at 5:46 AM, Zeeshan Ali Shah zas...@pdc.kth.se
wrote:
I just finished configuring ceph up to 100 TB with openstack ... Since we
are also using Lustre in our HPC machines, just wondering what is the
bottleneck in ceph
perf reset on the admin socket. I'm not sure what version it went in
to; you can check the release logs if it doesn't work on whatever you
have installed. :)
-Greg
On Mon, Jan 12, 2015 at 2:26 PM, Shain Miley smi...@npr.org wrote:
Is there a way to 'reset' the osd perf counters?
The numbers
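For reference, a rough sketch of resetting and re-reading OSD perf counters over the admin socket; the exact argument to perf reset and the release it first appeared in may vary, so check "ceph daemon osd.0 help" on your version first:
  # dump the current counters for osd.0 via its admin socket
  ceph daemon osd.0 perf dump
  # reset all counters (the "all" argument is an assumption; verify with the help output)
  ceph daemon osd.0 perf reset all
  # the long form works if the ceph wrapper cannot find the socket automatically
  ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok perf dump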
12, 2015 at 5:14 PM, Gregory Farnum g...@gregs42.com wrote:
What versions of all the Ceph pieces are you using? (Kernel
client/ceph-fuse, MDS, etc)
Can you provide more details on exactly what the program is doing on
which nodes?
-Greg
On Fri, Jan 9, 2015 at 5:15 PM, Lorieri lori...@gmail.com
Awesome, thanks for the bug report and the fix, guys. :)
-Greg
On Mon, Jan 12, 2015 at 11:18 PM, 严正 z...@redhat.com wrote:
I tracked down the bug. Please try the attached patch
Regards
Yan, Zheng
On 13 Jan 2015, at 07:40, Gregory Farnum g...@gregs42.com wrote:
Zheng, this looks like a kernel
On Mon, Jan 12, 2015 at 8:25 AM, Dan Van Der Ster
daniel.vanders...@cern.ch wrote:
On 12 Jan 2015, at 17:08, Sage Weil s...@newdream.net wrote:
On Mon, 12 Jan 2015, Dan Van Der Ster wrote:
Moving forward, I think it would be good for Ceph to at least document
this behaviour, but better would
Unmapping is an operation local to the host and doesn't communicate
with the cluster at all (at least, in the kernel you're running...in
very new code it might involve doing an unwatch, which will require
communication). That means there's no need for a keyring, since its
purpose is to validate
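To illustrate the point above, a minimal sketch assuming a device already mapped by the kernel client at /dev/rbd0 (the device name is hypothetical):
  # list kernel-mapped devices; this only reads local state
  rbd showmapped
  # unmapping likewise acts on the local host, so no keyring or monitor access is needed
  rbd unmap /dev/rbd0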
There are a lot of next steps on
http://ceph.com/docs/master/rados/troubleshooting/troubleshooting-osd/
You probably want to look at the bits about using the admin socket,
and diagnosing slow requests. :)
-Greg
On Sun, Feb 8, 2015 at 8:48 PM, Matthew Monaco m...@monaco.cx wrote:
Hello!
***
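The admin-socket commands referred to above, as a short sketch (substitute the OSD id named in the slow-request warnings for osd.12):
  # operations currently in flight on osd.12
  ceph daemon osd.12 dump_ops_in_progress
  # recently completed (including slow) operations, with per-step timestamps
  ceph daemon osd.12 dump_historic_ops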
On Fri, Feb 6, 2015 at 3:37 PM, David J. Arias david.ar...@getecsa.co wrote:
Hello!
I am sysadmin for a small IT consulting enterprise in México.
We are trying to integrate three servers running RHEL 5.9 into a new
CEPH cluster.
I downloaded the source code and tried compiling it, though I
On Sun, Feb 8, 2015 at 6:00 PM, Sumit Gaur sumitkg...@gmail.com wrote:
Hi
I have installed a 6 node ceph cluster and am doing a performance benchmark for
the same using Nova VMs. What I have observed is that FIO random write reports
around 250 MBps for 1M block size and PGs 4096, and 650 MBps for 1M
On Mon, Feb 9, 2015 at 7:12 PM, Matthew Monaco m...@monaco.cx wrote:
On 02/09/2015 08:20 AM, Gregory Farnum wrote:
There are a lot of next steps on
http://ceph.com/docs/master/rados/troubleshooting/troubleshooting-osd/
You probably want to look at the bits about using the admin socket
With sufficiently new CRUSH versions (all the latest point releases on
LTS?) I think you can simply have the rule return extra IDs which are
dropped if they exceed the number required. So you can choose two chassis,
then have those both choose two OSDs, and return those 4 from the rule.
-Greg
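A sketch of how such a rule could be edited and tested offline; the rule body, the rule number, and the bucket type "chassis" are assumptions based on the description above, not a tested ruleset:
  ceph osd getcrushmap -o crushmap.bin
  crushtool -d crushmap.bin -o crushmap.txt
  # in crushmap.txt the rule steps would look roughly like:
  #   step take default
  #   step choose firstn 2 type chassis
  #   step chooseleaf firstn 2 type osd
  #   step emit
  crushtool -c crushmap.txt -o crushmap.new
  # check what rule 1 returns for 4 replicas before injecting the new map
  crushtool -i crushmap.new --test --rule 1 --num-rep 4 --show-mappings
  ceph osd setcrushmap -i crushmap.new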
It's not entirely clear, but it looks like all the ops are just your
caching pool OSDs trying to promote objects, and your backing pool OSDs
aren't fast enough to satisfy all the IO demanded of them. You may be
overloading the system.
-Greg
On Fri, Feb 13, 2015 at 6:06 AM Mohamed Pakkeer
On Mon, Feb 9, 2015 at 11:58 AM, Christopher Armstrong
ch...@opdemand.com wrote:
Hi folks,
One of our users is seeing machine crashes almost daily. He's using Ceph
v0.87 giant, and is seeing this crash:
What version of Ceph are you running? It's varied by a bit.
But I think you want to just turn off the MDS and run the fail
command — deactivate is actually the command for removing a logical
MDS from the cluster, and you can't do that for a lone MDS because
there's nobody to pass off the data to.
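A rough sketch of that sequence, assuming a release new enough to have the ceph fs commands and a filesystem named "cephfs" (both assumptions):
  # stop the ceph-mds daemon on its host, then mark the rank failed
  ceph mds fail 0
  # remove the filesystem itself
  ceph fs rm cephfs --yes-i-really-mean-it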
Message-
From: Gregory Farnum [mailto:g...@gregs42.com]
Sent: 12 February 2015 16:25
To: Jeffs, Warren (STFC,RAL,ISIS)
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] CephFS removal.
What version of Ceph are you running? It's varied by a bit.
But I think you want to just turn
I'm afraid I don't know what would happen if you change those options.
Hopefully we've set it up so things continue to work, but we definitely
don't test it.
-Greg
On Tue, Jan 6, 2015 at 8:22 AM Lionel Bouton lionel+c...@bouton.name
wrote:
On 01/06/15 02:36, Gregory Farnum wrote
On Wed, Feb 18, 2015 at 3:30 PM, Florian Haas flor...@hastexo.com wrote:
On Wed, Feb 18, 2015 at 11:41 PM, Gregory Farnum g...@gregs42.com wrote:
On Wed, Feb 18, 2015 at 1:58 PM, Florian Haas flor...@hastexo.com wrote:
On Wed, Feb 18, 2015 at 10:28 PM, Oliver Schulz osch...@mpp.mpg.de wrote
On Wed, Mar 18, 2015 at 3:28 AM, Chris Murray chrismurra...@gmail.com wrote:
Hi again Greg :-)
No, it doesn't seem to progress past that point. I started the OSD again a
couple of nights ago:
2015-03-16 21:34:46.221307 7fe4a8aa7780 10 journal op_apply_finish 13288339
open_ops 1 - 0,
On Wed, Mar 18, 2015 at 8:04 AM, Nick Fisk n...@fisk.me.uk wrote:
Hi Greg,
Thanks for your input and completely agree that we cannot expect developers
to fully document what impact each setting has on a cluster, particularly in
a performance related way
That said, if you or others could
On Wed, Mar 11, 2015 at 8:40 AM, Artem Savinov asavi...@asdco.ru wrote:
hello.
ceph puts an osd node into the down status by default after receiving 3
reports about failed nodes. Reports are sent every 'osd heartbeat grace'
seconds, but with the setting mon_osd_adjust_heartbeat_grace = true,
On Wed, Mar 11, 2015 at 2:25 PM, Nick Fisk n...@fisk.me.uk wrote:
I’m not sure if it’s something I’m doing wrong or just experiencing an
oddity, but when my cache tier flushes dirty blocks out to the base tier, the
writes seem to hit the OSDs straight away instead of coalescing in the
On Wed, Mar 11, 2015 at 3:49 PM, Francois Lafont flafdiv...@free.fr wrote:
Hi,
I was always in the same situation: I couldn't remove an OSD without
having some PGs definitely stuck in the active+remapped state.
But I remembered I read on IRC that, before marking out an OSD, it
could be
The information you're giving sounds a little contradictory, but my
guess is that you're seeing the impacts of object promotion and
flushing. You can sample the operations the OSDs are doing at any
given time by running ops_in_progress (or similar, I forget exact
phrasing) command on the OSD admin
intervals without also increasing the
filestore_wbthrottle_* limits is not going to work well for you.
-Greg
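To see what those limits are set to (and raise them at runtime), something like the following should work; the option name in the example is from memory, so confirm it against the config show output:
  # list the write-back throttle settings on a running OSD
  ceph daemon osd.0 config show | grep filestore_wbthrottle
  # raise one of the flusher trigger points on all OSDs at runtime (example value only)
  ceph tell osd.* injectargs '--filestore_wbthrottle_xfs_bytes_start_flusher 419430400'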
On Mon, Mar 16, 2015 at 3:58 PM, Nick Fisk n...@fisk.me.uk wrote:
-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
Gregory Farnum
Sent
On Fri, Mar 20, 2015 at 4:03 PM, Chris Murray chrismurra...@gmail.com wrote:
Ah, I was wondering myself if compression could be causing an issue, but I'm
reconsidering now. My latest experiment should hopefully help troubleshoot.
So, I remembered that ZLIB is slower, but is more 'safe for old
On Thu, Mar 19, 2015 at 4:46 AM, Matthijs Möhlmann
matth...@cacholong.nl wrote:
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1
Hi,
- From the documentation:
Cache Tier readonly:
Read-only Mode: When admins configure tiers with readonly mode, Ceph
clients write data to the backing tier.
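For context, a read-only tier is wired up roughly like this; the pool names are placeholders, and newer releases may also ask for a --yes-i-really-mean-it confirmation:
  ceph osd tier add coldpool hotpool
  ceph osd tier cache-mode hotpool readonly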
On Wed, Mar 18, 2015 at 11:10 PM, Christian Balzer ch...@gol.com wrote:
Hello,
On Wed, 18 Mar 2015 11:05:47 -0700 Gregory Farnum wrote:
On Wed, Mar 18, 2015 at 8:04 AM, Nick Fisk n...@fisk.me.uk wrote:
Hi Greg,
Thanks for your input and completely agree that we cannot expect
On Thu, Mar 19, 2015 at 2:41 PM, Nick Fisk n...@fisk.me.uk wrote:
I'm looking at trialling OSDs with a small flashcache device over them to
hopefully reduce the impact of metadata updates when doing small block io.
Inspiration from here:-
On Fri, Mar 20, 2015 at 12:39 PM, Daniel Takatori Ohara
dtoh...@mochsl.org.br wrote:
Hello,
Can anybody help me, please? Some messages appear in the log of my mds,
and afterwards the shells of my clients freeze.
2015-03-20 12:23:54.068005 7f1608d49700 0 log_channel(default) log [WRN] :
client.3197487
On Fri, Mar 20, 2015 at 1:05 PM, Ridwan Rashid ridwan...@gmail.com wrote:
Gregory Farnum greg@... writes:
On Thu, Mar 19, 2015 at 5:57 PM, Ridwan Rashid ridwan064@... wrote:
Hi,
I have a 5 node ceph(v0.87) cluster and am trying to deploy hadoop with
cephFS. I have installed hadoop
On Thu, Mar 19, 2015 at 5:57 PM, Ridwan Rashid ridwan...@gmail.com wrote:
Hi,
I have a 5 node ceph(v0.87) cluster and am trying to deploy hadoop with
cephFS. I have installed hadoop-1.1.1 in the nodes and changed the
conf/core-site.xml file according to the ceph documentation
On Mon, Mar 16, 2015 at 11:14 AM, Georgios Dimitrakakis
gior...@acmac.uoc.gr wrote:
Hi all!
I have recently updated to CEPH version 0.80.9 (latest Firefly release)
which presumably
supports direct upload.
I've tried to upload a file using this functionality and it seems that it is
working
On Mon, Mar 16, 2015 at 12:12 PM, Craig Lewis cle...@centraldesktop.com wrote:
Out of curiosity, what's the frequency of the peaks and troughs?
RadosGW has configs on how long it should wait after deleting before garbage
collecting, how long between GC runs, and how many objects it can GC in
This might be related to the backtrace assert, but that's the problem
you need to focus on. In particular, both of these errors are caused
by the scrub code, which Sage suggested temporarily disabling — if
you're still getting these messages, you clearly haven't done so
successfully.
That said,
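If the scrubbing in question is ordinary OSD scrub (an assumption, since the thread is truncated here), the cluster-wide switches are:
  ceph osd set noscrub
  ceph osd set nodeep-scrub
  # and once the underlying problem is fixed:
  ceph osd unset noscrub
  ceph osd unset nodeep-scrub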
On Tue, Mar 10, 2015 at 4:20 AM, Florent B flor...@coppint.com wrote:
Hi all,
I'm testing flock() locking system on CephFS (Giant) using Fuse.
It seems that lock works per client, and not over all clients.
Am I right or is it supposed to work over different clients? Does the MDS
have such a
On Tue, Mar 24, 2015 at 12:13 AM, Christian Balzer ch...@gol.com wrote:
On Tue, 24 Mar 2015 09:41:04 +0300 Kamil Kuramshin wrote:
Yes I read it and do not understand what you mean when you say *verify
this*? All 3335808 inodes are definitely files and directories created by
the ceph OSD process:
On Wed, Mar 25, 2015 at 1:20 AM, Udo Lembke ulem...@polarzone.de wrote:
Hi,
due to two more hosts (now 7 storage nodes) I want to create a new
ec-pool and get a strange effect:
ceph@admin:~$ ceph health detail
HEALTH_WARN 2 pgs degraded; 2 pgs stuck degraded; 2 pgs stuck unclean; 2
pgs
Yes.
On Wed, Mar 25, 2015 at 4:13 AM, Frédéric Nass
frederic.n...@univ-lorraine.fr wrote:
Hi Greg,
Thank you for this clarification. It helps a lot.
Does this "can't think of any issues" apply to both rbd and pool snapshots?
Frederic.
On Tue, Mar 24,
On Wed, Mar 25, 2015 at 1:24 AM, Saverio Proto ziopr...@gmail.com wrote:
Hello there,
I started to push data into my ceph cluster. There is something I
cannot understand in the output of ceph -w.
When I run ceph -w I get this kind of output:
2015-03-25 09:11:36.785909 mon.0 [INF] pgmap
On Wed, Mar 25, 2015 at 10:36 AM, Jake Grimmett j...@mrc-lmb.cam.ac.uk wrote:
Dear All,
Please forgive this post if it's naive, I'm trying to familiarise myself
with cephfs!
I'm using Scientific Linux 6.6. with Ceph 0.87.1
My first steps with cephfs using a replicated pool worked OK.
Now
are
noticeably slow.
-Greg
On Fri, Mar 27, 2015 at 4:50 PM, Gregory Farnum g...@gregs42.com wrote:
On Fri, Mar 27, 2015 at 2:46 PM, Barclay Jameson
almightybe...@gmail.com wrote:
Yes it's the exact same hardware except for the MDS server (although I
tried using the MDS on the old node).
I
On Mon, Mar 30, 2015 at 1:01 PM, Garg, Pankaj
pankaj.g...@caviumnetworks.com wrote:
Hi,
I’m benchmarking my small cluster with HDDs vs HDDs with SSD Journaling. I
am using both RADOS bench and Block device (using fio) for testing.
I am seeing significant Write performance improvements, as
On Mon, Mar 30, 2015 at 3:15 PM, Francois Lafont flafdiv...@free.fr wrote:
Hi,
Gregory Farnum wrote:
The MDS doesn't have any data tied to the machine you're running it
on. You can either create an entirely new one on a different machine,
or simply copy the config file and cephx keyring
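A sketch of standing up the replacement MDS on another host; the id "newhost" and the capability string are assumptions modelled on the usual manual-deployment steps, so check them against the documentation for your release:
  mkdir -p /var/lib/ceph/mds/ceph-newhost
  ceph auth get-or-create mds.newhost mon 'allow profile mds' osd 'allow rwx' mds 'allow' \
      -o /var/lib/ceph/mds/ceph-newhost/keyring
  # then start the daemon (init integration varies by distro)
  ceph-mds -i newhost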
On Mon, Mar 30, 2015 at 1:51 PM, Steve Hindle mech...@gmail.com wrote:
Hi!
I mistakenly created my MDS node on the 'wrong' server a few months back.
Now I realized I placed it on a machine lacking IPMI and would like to move
it to another node in my cluster.
Is it possible to
On Mon, Mar 30, 2015 at 8:02 PM, Lindsay Mathieson
lindsay.mathie...@gmail.com wrote:
On Tue, 31 Mar 2015 02:42:27 AM Kai KH Huang wrote:
Hi, all
I have a two-node Ceph cluster, and both are monitor and osd. When
they're both up, osd are all up and in, everything is fine... almost:
Two
?
regards
Udo
Am 26.03.2015 15:03, schrieb Gregory Farnum:
You shouldn't rely on rados ls when working with cache pools. It
doesn't behave properly and is a silly operation to run against a pool
of any size even when it does. :)
More specifically, rados ls is invoking the pgls operation
On Tue, Mar 31, 2015 at 7:50 AM, Quentin Hartman
qhart...@direwolfdigital.com wrote:
I'm working on redeploying a 14-node cluster. I'm running giant 0.87.1. Last
friday I got everything deployed and all was working well, and I set noout
and shut all the OSD nodes down over the weekend.
On Tue, Mar 31, 2015 at 2:50 AM, 张皓宇 zhanghaoyu1...@hotmail.com wrote:
Who can help me?
One monitor in my ceph cluster can not be started.
Before that, I added '[mon] mon_compact_on_start = true' to
/etc/ceph/ceph.conf on three monitor hosts. Then I did 'ceph tell
mon.computer05 compact ' on
involved in the fix.
-Greg
QH
On Tue, Mar 31, 2015 at 1:35 PM, Gregory Farnum g...@gregs42.com wrote:
On Tue, Mar 31, 2015 at 7:50 AM, Quentin Hartman
qhart...@direwolfdigital.com wrote:
I'm working on redeploying a 14-node cluster. I'm running giant 0.87.1.
Last
friday I got everything
On Mon, Mar 23, 2015 at 6:21 AM, Olivier Bonvalet ceph.l...@daevel.fr wrote:
Hi,
I'm still trying to find out why there are many more write operations on
the filestore since Emperor/Firefly than with Dumpling.
Do you have any history around this? It doesn't sound familiar,
although I bet it's because
On Sun, Mar 22, 2015 at 2:55 AM, Saverio Proto ziopr...@gmail.com wrote:
Hello,
I started to work with CEPH a few weeks ago, so I might ask a very newbie
question, but I could not find an answer in the docs or in the ml
archive for this.
Quick description of my setup:
I have a ceph cluster with
On Mon, Mar 23, 2015 at 4:31 AM, f...@univ-lr.fr f...@univ-lr.fr wrote:
Hi Somnath,
Thank you, please find my answers below
Somnath Roy somnath@sandisk.com wrote on 22/03/15 18:16:
Hi Frederick,
Need some information here.
1. Just to clarify, you are saying it is happening in
On Sun, Mar 22, 2015 at 11:22 AM, Somnath Roy somnath@sandisk.com wrote:
You should have replicated copies on other OSDs (disks), so no need to
worry about data loss. Add a new drive and follow the steps in the
following link (either 1 or 2)
Except that's not the case if
On Sat, Mar 21, 2015 at 10:46 AM, shylesh kumar shylesh.mo...@gmail.com wrote:
Hi ,
I was going through this simplified crush algorithm given on the ceph website.
def crush(pg):
all_osds = ['osd.0', 'osd.1', 'osd.2', ...]
result = []
# size is the number of copies; primary+replicas
a failure. :)
-Greg
Thanks to all for helping !
Saverio
2015-03-23 14:58 GMT+01:00 Gregory Farnum g...@gregs42.com:
On Sun, Mar 22, 2015 at 2:55 AM, Saverio Proto ziopr...@gmail.com wrote:
Hello,
I started to work with CEPH a few weeks ago, so I might ask a very newbie
question, but I could
The ceph tool got moved into ceph-common at some point, so it
shouldn't be in the ceph rpm. I'm not sure what step in the
installation process should have handled that, but I imagine it's your
problem.
-Greg
On Mon, Mar 2, 2015 at 11:24 AM, Michael Kuriger mk7...@yp.com wrote:
Hi all,
When
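An easy way to confirm that on an RPM-based system (a sketch; the exact package split depends on the release):
  # which package owns the ceph CLI?
  rpm -qf /usr/bin/ceph
  # if the binary is missing entirely, ceph-common should provide it
  yum install ceph-common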
PM Bill Sanders billysand...@gmail.com wrote:
Forgive me if this is unhelpful, but could it be something to do with
permissions of the directory and not Ceph at all?
http://superuser.com/a/528467
Bill
On Mon, Mar 2, 2015 at 3:47 PM, Gregory Farnum g...@gregs42.com wrote:
On Mon, Mar 2
On Mon, Mar 2, 2015 at 3:39 PM, Scottix scot...@gmail.com wrote:
We have a file system running CephFS, and for a while we have had this issue where
doing an ls -la we get question marks in the response.
-rw-r--r-- 1 wwwrun root14761 Feb 9 16:06
data.2015-02-08_00-00-00.csv.bz2
-? ? ?
On Mon, Mar 2, 2015 at 7:15 PM, Nathan O'Sullivan nat...@mammoth.com.au wrote:
On 11/02/2015 1:46 PM, 杨万元 wrote:
Hello!
We use Ceph+Openstack in our private cloud. Recently we upgraded our
centos6.5 based cluster from Ceph Emperor to Ceph Firefly.
At first, we used the redhat yum repo epel
On Tue, Mar 3, 2015 at 9:24 AM, John Spray john.sp...@redhat.com wrote:
On 03/03/2015 14:07, Daniel Takatori Ohara wrote:
$ls test-daniel-old/
total 0
drwx-- 1 rmagalhaes BioInfoHSL Users0 Mar 2 10:52 ./
drwx-- 1 rmagalhaes BioInfoHSL Users 773099838313 Mar 2 11:41 ../
Sounds good!
-Greg
On Sat, Feb 28, 2015 at 10:55 AM David da...@visions.se wrote:
Hi!
I’m about to do maintenance on a Ceph Cluster, where we need to shut it
all down fully.
We're currently only using it for rados block devices to KVM hypervisors.
Are these steps sane?
Shutting it down
This is probably LevelDB being slow. The monitor has some options to
compact the store on startup and I thought the osd handled it
automatically, but you could try looking for something like that and see if
it helps.
-Greg
On Fri, Feb 27, 2015 at 5:02 AM Corin Langosch corin.lango...@netskin.com
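The options alluded to above, roughly; both are used elsewhere in this archive, so they should be present on Firefly-era releases:
  # one-off compaction of a running monitor's store
  ceph tell mon.<id> compact
  # or compact automatically at startup, in ceph.conf:
  # [mon]
  #     mon compact on start = true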
To: Gregory Farnum
Cc: ceph-users
Subject: Re: [ceph-users] More than 50% osds down, CPUs still busy;will
the cluster recover without help?
A little further logging:
2015-02-27 10:27:15.745585 7fe8e3f2f700 20 osd.11 62839 update_osd_stat
osd_stat(1305 GB used, 1431 GB avail, 2789 GB total, peers
On Mon, Mar 2, 2015 at 7:56 AM, Erdem Agaoglu erdem.agao...@gmail.com wrote:
Hi all, especially devs,
We have recently pinpointed one of the causes of slow requests in our
cluster. It seems deep-scrubs on pgs that contain the index file for a
large radosgw bucket lock the osds. Increasing op
On Mon, Mar 2, 2015 at 8:44 AM, Daniel Schneller
daniel.schnel...@centerdevice.com wrote:
On our Ubuntu 14.04/Firefly 0.80.8 cluster we are seeing
a problem with log file rotation for the rados gateway.
The /etc/logrotate.d/radosgw script gets called, but
it does not work correctly. It spits
On Fri, Feb 27, 2015 at 5:03 AM, Mark Wu wud...@gmail.com wrote:
I am wondering how the value of journal_align_min_size affects
journal padding. Is there any document describing the disk layout of
journal?
Not much, unfortunately. Just looking at the code, the journal will
align any
On Tue, Mar 3, 2015 at 9:26 AM, Garg, Pankaj
pankaj.g...@caviumnetworks.com wrote:
Hi,
I have a ceph cluster that is contained within a rack (1 Monitor and 5 OSD
nodes). I kept the same public and private address for configuration.
I do have 2 NICS and 2 valid IP addresses (one internal only
Just to get more specific: the reason you can apparently write stuff
to a file when you can't write to the pool it's stored in is because
the file data is initially stored in cache. The flush out to RADOS,
when it happens, will fail.
It would definitely be preferable if there was some way to
Yes. :)
-Greg
On Wed, Feb 25, 2015 at 8:33 AM Jordan A Eliseo jaeli...@us.ibm.com wrote:
Hi all,
Quick question, does the Crush map always strive for proportionality when
rebalancing a cluster? i.e. Say I have 8 OSDs (with a two node cluster - 4
OSDs per host - at ~90% utilization (which I
IIRC these global values for total size and available are just summations
from the programmatic equivalent of running df on each machine locally,
but the used values are based on actual space used by each PG. That has
occasionally produced some odd results depending on how you've configured
your
On Tue, Feb 24, 2015 at 6:21 AM, Xavier Villaneau
xavier.villan...@fr.clara.net wrote:
Hello ceph-users,
I am currently making tests on a small cluster, and Cache Tiering is one of
those tests. The cluster runs Ceph 0.87 Giant on three Ubuntu 14.04 servers
with the 3.16.0 kernel, for a total
For everybody else's reference, this is addressed in
http://tracker.ceph.com/issues/10944. That kernel has several known
bugs.
-Greg
On Tue, Feb 24, 2015 at 12:02 PM, Ilja Slepnev islep...@gmail.com wrote:
Dear All,
Configuration of MDS and CephFS client is the same:
OS: CentOS 7.0.1406
On Thu, Feb 19, 2015 at 8:30 PM, Christian Balzer ch...@gol.com wrote:
Hello,
I have a cluster currently at 0.80.1 and would like to upgrade it to
0.80.7 (Debian as you can guess), but for a number of reasons I can't
really do it all at the same time.
In particular I would like to upgrade
That's pretty strange, especially since the monitor is getting the
failure reports. What version are you running? Can you bump up the
monitor debugging and provide its output from around that time?
-Greg
On Fri, Feb 20, 2015 at 3:26 AM, Sudarshan Pathak sushan@gmail.com wrote:
Hello
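Bumping the monitor debug levels can be done at runtime, roughly like this; the log volume grows quickly, so turn it back down once the window of interest has been captured:
  ceph tell mon.<id> injectargs '--debug-mon 10 --debug-ms 1'
  # back toward the defaults afterwards
  ceph tell mon.<id> injectargs '--debug-mon 1 --debug-ms 0'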
You can try searching the archives and tracker.ceph.com for hints
about repairing these issues, but your disk stores have definitely
been corrupted and it's likely to be an adventure. I'd recommend
examining your local storage stack underneath Ceph and figuring out
which part was ignoring
On Fri, Feb 20, 2015 at 3:50 AM, Luis Periquito periqu...@gmail.com wrote:
Hi Dan,
I remember http://tracker.ceph.com/issues/9945 introducing some issues with
running cephfs between different versions of giant/firefly.
https://www.mail-archive.com/ceph-users@lists.ceph.com/msg14257.html
On Wed, Feb 25, 2015 at 3:11 PM, Deneau, Tom tom.den...@amd.com wrote:
I need to set up a cluster where the rados client (for running rados
bench) may be on a different architecture and hence running a different
ceph version from the osd/mon nodes. Is there a list of which ceph
versions work
On Mon, Feb 23, 2015 at 8:59 AM, Chris Murray chrismurra...@gmail.com wrote:
... Trying to send again after reporting bounce backs to dreamhost ...
... Trying to send one more time after seeing mails come through the
list today ...
Hi all,
First off, I should point out that this is a 'small
[mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
Chris Murray
Sent: 25 February 2015 12:58
To: Gregory Farnum
Cc: ceph-users
Subject: Re: [ceph-users] More than 50% osds down, CPUs still busy;will
the cluster recover without help?
Thanks Greg
After seeing some recommendations I found
Are all your monitors running? Usually a temporary hang means that the Ceph
client tries to reach a monitor that isn't up, then times out and contacts
a different one.
I have also seen it just be slow if the monitors are processing so many
updates that they're behind, but that's usually on a very
So this is exactly the same test you ran previously, but now it's on
faster hardware and the test is slower?
Do you have more data in the test cluster? One obvious possibility is
that previously you were working entirely in the MDS' cache, but now
you've got more dentries and so it's kicking data
On Wed, Mar 25, 2015 at 3:14 AM, Frédéric Nass
frederic.n...@univ-lorraine.fr wrote:
Hello,
I have a few questions regarding snapshots and fstrim with cache tiers.
In the cache tier and erasure coding FAQ related to ICE 1.2 (based on
Firefly), Inktank says Snapshots are not supported in
there are and what they have permissions on and check; otherwise
you'll have to figure it out from the client side.
-Greg
Thanks for the input!
On Fri, Mar 27, 2015 at 3:04 PM, Gregory Farnum g...@gregs42.com wrote:
So this is exactly the same test you ran previously, but now it's on
faster hardware
Has the OSD actually been detected as down yet?
You'll also need to set that min size on your existing pools (ceph
osd pool set <pool> min_size 1 or similar) to change their behavior;
the config option only takes effect for newly-created pools. (Thus the
default.)
On Thu, Mar 26, 2015 at 1:29 PM,
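Concretely, per pool (the pool name "rbd" is just an example):
  # see what the pool currently requires
  ceph osd pool get rbd size
  ceph osd pool get rbd min_size
  # allow I/O to continue with a single surviving replica
  ceph osd pool set rbd min_size 1
  # only newly created pools pick up the default from ceph.conf:
  # [global]
  #     osd pool default min size = 1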
able to actually remove cephfs altogether.
On Thu, Mar 26, 2015 at 12:45 PM, Jake Grimmett j...@mrc-lmb.cam.ac.uk
wrote:
On 03/25/2015 05:44 PM, Gregory Farnum wrote:
On Wed, Mar 25, 2015 at 10:36 AM, Jake Grimmett j...@mrc-lmb.cam.ac.uk
wrote:
Dear All,
Please forgive this post if it's
You shouldn't rely on rados ls when working with cache pools. It
doesn't behave properly and is a silly operation to run against a pool
of any size even when it does. :)
More specifically, rados ls is invoking the pgls operation. Normal
read/write ops will go query the backing store for objects
On Thu, Mar 26, 2015 at 2:53 PM, Steffen W Sørensen ste...@me.com wrote:
On 26/03/2015, at 21.07, J-P Methot jpmet...@gtcomm.net wrote:
That's a great idea. I know I can setup cinder (the openstack volume
manager) as a multi-backend manager and migrate from one backend to the
other, each
:
On 26/03/2015, at 23.01, Gregory Farnum g...@gregs42.com wrote:
On Thu, Mar 26, 2015 at 2:53 PM, Steffen W Sørensen ste...@me.com wrote:
On 26/03/2015, at 21.07, J-P Methot jpmet...@gtcomm.net wrote:
That's a great idea. I know I can setup cinder (the openstack volume
manager) as a multi
On Thu, Mar 26, 2015 at 2:30 PM, Lee Revell rlrev...@gmail.com wrote:
On Thu, Mar 26, 2015 at 4:40 PM, Gregory Farnum g...@gregs42.com wrote:
Has the OSD actually been detected as down yet?
I believe it has, however I can't directly check because ceph health
starts to hang when I down
Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
Gregory Farnum
Sent: Thursday, March 26, 2015 2:40 PM
To: Lee Revell
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] All client writes block when 2 of 3 OSDs down
On Thu, Mar 26, 2015 at 2:30 PM, Lee
the
tradeoff between consistency and availability. The monitors are a
Paxos cluster and Ceph is a 100% consistent system.
-Greg
Thanks Regards
Somnath
-Original Message-
From: Gregory Farnum [mailto:g...@gregs42.com]
Sent: Thursday, March 26, 2015 3:29 PM
To: Somnath Roy
Cc: Lee
, should we say that in a cluster with say 3
monitors it should be able to tolerate only one mon failure ?
Yes, that is the case.
Let me know if I am missing a point here.
Thanks Regards
Somnath
-Original Message-
From: Gregory Farnum [mailto:g...@gregs42.com]
Sent: Thursday
On Tue, Mar 24, 2015 at 12:09 PM, Brendan Moloney molo...@ohsu.edu wrote:
Hi Loic and Markus,
By the way, Inktank does not support snapshots of a pool with cache tiering:
*
https://download.inktank.com/docs/ICE%201.2%20-%20Cache%20and%20Erasure%20Coding%20FAQ.pdf
Hi,
You seem to be
On Tue, Mar 24, 2015 at 10:48 AM, Robert LeBlanc rob...@leblancnet.us wrote:
I'm not sure why crushtool --test --simulate doesn't match what the
cluster actually does, but the cluster seems to be executing the rules
even though crushtool doesn't. Just kind of stinks that you have to
test the