Hi all,
I understand the present pool tiering infrastructure is intended to work
for 2 layers? We're presently considering backup strategies for large
pools and wondered how much of a stretch it would be to have a base tier
sitting in e.g. an S3 store... I imagine a pg in the base+1 tier mapping
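For reference, I understand today's two-tier setup is wired up roughly like
this ('basepool' and 'cachepool' are placeholder names):

ceph osd tier add basepool cachepool
ceph osd tier cache-mode cachepool writeback
ceph osd tier set-overlay basepool cachepool
ceph osd pool set cachepool hit_set_type bloom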
On 04/20/2015 01:46 PM, Robert LeBlanc wrote:
On Mon, Apr 20, 2015 at 2:34 PM, Colin Corr co...@pc-doctor.com
mailto:co...@pc-doctor.com wrote:
On 04/20/2015 11:02 AM, Robert LeBlanc wrote:
We have a similar issue, but we wanted three copies across two racks.
Turns out,
You usually won't end up with more than the size number of replicas, even
in a failure situation. Although technically more than the size number of
OSDs may hold the data (if an OSD comes back into service, its journal may
be used to quickly bring it back up to speed), those extra copies would not
be active.
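For example, you can confirm the replication settings on a pool with (pool
name 'rbd' is just an example):

ceph osd pool get rbd size
ceph osd pool get rbd min_size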
On 04/20/2015 11:02 AM, Robert LeBlanc wrote:
We have a similar issue, but we wanted three copies across two racks. It turns
out that we increased size to 4 and left min_size at 2. We didn't want to
risk having fewer than two copies, and if we only had three copies, losing a
rack would block
On Mon, Apr 20, 2015 at 2:34 PM, Colin Corr co...@pc-doctor.com wrote:
On 04/20/2015 11:02 AM, Robert LeBlanc wrote:
We have a similar issue, but we wanted three copies across two racks.
It turns out that we increased size to 4 and left min_size at 2. We didn't
want to risk having fewer than
On Mon, 20 Apr 2015 13:17:18 -0400 J-P Methot wrote:
On 4/20/2015 11:01 AM, Christian Balzer wrote:
Hello,
On Mon, 20 Apr 2015 10:30:41 -0400 J-P Methot wrote:
Hi,
This is similar to another thread running right now, but since our
current setup is completely different from the
We’ve been hard at work on CephFS over the last year since Firefly was
released, and with Hammer coming out it seemed like a good time to go over some
of the big developments users will find interesting. Much of this is cribbed
from John’s Linux Vault talk
Hey Cephers,
Just to remind you, our monthly Online Ceph Tech Talk is coming up
again this Thursday at 1pm EDT via the BlueJeans tool. We will be
recording it and publishing to YouTube for those who can't make it,
but if you'd like to ask questions make sure you are there!
This month we'll be
Cephers,
Many people have told me that Ceph is ready for production except for CephFS.
Is this true? And why is that? Can anyone explain this to me? Thanks a lot.
Best Regards
-- Ray
Thanks for all your hard work on CephFS. This progress is very exciting to
hear about. I am constantly amazed at the amount of work that gets done in
Ceph in so short an amount of time.
On Mon, Apr 20, 2015 at 6:26 PM, Gregory Farnum gfar...@redhat.com wrote:
We’ve been hard at work on CephFS
Hi All,
Ceph: 80.9 (as of Monday 13th) previously 80.8
OS: Ubuntu 12.04, 3.13.0-44-generic #73~precise1-Ubuntu SMP Wed Dec 17 00:39:15
UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
Servers: DELL R515, 1 x 2.7GHz 6C AMD CPU w/ 32GB RAM, 10 x 3TB OSDs w/ 2x
Intel DC S3700 100GB journals (5 OSDs per SSD)
Thanks Kurt. I did not use the --repo-url, which I should have in retrospect,
but instead I had edited install.py and changed all occurrences of ceph.com to
eu.ceph.com
And Wido, Kurt has your answers (see below) about the one file that was
missing for him.
Dan
Dan Ferber
Software
Hi!
In case of a scrub error, we get some PGs in an inconsistent state.
What is the best method to check which RBD volumes are mapped into an
inconsistent PG?
Currently we use a long and cumbersome way to do this:
- from 'ceph health detail' we take the PG numbers in the inconsistent state
- we check logs for
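A rough sketch of the mapping with standard tools (pool name 'rbd' and PG id
'2.3f' are placeholders here):

ceph health detail | grep inconsistent    # note the PG id, e.g. 2.3f
rados -p rbd ls | while read obj; do
    ceph osd map rbd "$obj" | grep -q '(2\.3f)' && echo "$obj"
done
# then match the rbd_data.<id> prefix of the matching objects against
rbd info rbd/<image> | grep block_name_prefix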
On Mon, Apr 20, 2015 at 2:10 PM, Nikola Ciprich
nikola.cipr...@linuxbox.cz wrote:
Hello Ilya,
Have you set your crush tunables to hammer?
I've set crush tunables to optimal (therefore I guess they got set
to hammer).
Your crushmap has straw2 buckets (alg straw2). That's going to be
Hi Pedro,
Without knowing much about your actual Ceph config file setup (ceph.conf) or
any other factors (pool/replication setup) I'd say you're probably suffering
due to the journal sitting on your OSDs. As in, you made the OSDs and didn't
specify an SSD (or other disk) as the journal
If possible, it might be worth trying an EXT4 formatted RBD. I've had
problems with XFS hanging in the past on simple LVM volumes and never really
got to the bottom of it, whereas the same volumes formatted with EXT4 have
been running for years without a problem.
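If you want to try it, a minimal sequence would be something like this
(pool/image names are placeholders):

rbd map rbd/myimage
mkfs.ext4 /dev/rbd/rbd/myimage
mount /dev/rbd/rbd/myimage /mnt/test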
-Original Message-
Hi,
John Spray wrote:
As far as I can see, this is only meaningful for cache pools; an object is
dirty in the sense of having been created or modified since its last
flush. For a non-cache-tier pool, everything is logically dirty since it is
never flushed.
I hadn't noticed
I'm using xfs on the rbd disks.
They are between 1 and 10TB in size.
Am 20.04.2015 um 14:32 schrieb Nick Fisk:
Ah ok, good point
What FS are you using on the RBD?
-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
Christian Eichelmann
Your crushmap has straw2 buckets (alg straw2). That's going to be
supported in 4.1 kernel - when 3.18 was released none of the straw2
stuff existed.
I see... maybe this is a bit too radical a setting for the optimal preset?
Well, it depends on how you look at it. Generally optimal is
Hi Nick,
I forgot to mention that I was also trying a workaround using the
userland client (rbd-fuse). The behaviour was exactly the same (it worked
fine for several hours of testing parallel reading and writing, then IO wait
and system load increased).
This is why I don't think it is an issue with the rbd
Ah ok, good point
What FS are you using on the RBD?
-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
Christian Eichelmann
Sent: 20 April 2015 13:16
To: Nick Fisk; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] 100% IO Wait with CEPH
Hello,
I'm quite new to ceph, so please forgive my ignorance.
Yesterday, I deployed a small test cluster (3 nodes, 2 SATA + 1 SSD OSD / node).
I enabled the MDS server, created the cephfs data + metadata pools, and
created the filesystem.
However, upon mount requests, I'm getting the following error:
[Apr20
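In case it helps, the kernel-client mount I'm attempting looks like this
(monitor address and secret file are placeholders):

mount -t ceph 192.168.0.1:6789:/ /mnt/cephfs -o name=admin,secretfile=/etc/ceph/admin.secret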
They're kind of big; here are links:
https://dl.dropboxusercontent.com/u/104949139/osdmap
https://dl.dropboxusercontent.com/u/104949139/ceph-osd.36.log
On Sun, Apr 19, 2015 at 8:42 PM Samuel Just sj...@redhat.com wrote:
I have a suspicion about what caused this. Can you restart one of the
Hi Ceph-Users!
We currently have a problem where I am not sure whether it has its cause
in Ceph or something else. First, some information about our ceph-setup:
* ceph version 0.87.1
* 5 MON
* 12 OSD hosts with 60x 2TB each
* 2 RSYNC Gateways with 2x10G Ethernet (Kernel: 3.16.3-2~bpo70+1, Debian
Hello Ilya,
Have you set your crush tunables to hammer?
I've set crush tunables to optimal (therefore I guess they got set
to hammer).
Your crushmap has straw2 buckets (alg straw2). That's going to be
supported in 4.1 kernel - when 3.18 was released none of the straw2
stuff existed.
I
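For reference, the tunables profile can be inspected and changed with (the
profile name depends on what your client kernels support):

ceph osd crush show-tunables
ceph osd crush tunables firefly    # or hammer/optimal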
Hi,
for writes, Ceph writes twice to the disk: once for the journal and once for
the data (so half the write bandwidth),
and the journal is written with O_DSYNC
(you should test your disk with fio --sync=1 to compare).
That's why the recommendation is to use SSDs for the journal disks.
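For example, something like this (WARNING: this writes to the raw device;
/dev/sdX is a placeholder):

fio --name=journal-test --filename=/dev/sdX --direct=1 --sync=1 \
    --rw=write --bs=4k --iodepth=1 --numjobs=1 --runtime=60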
- Original Message -
From:
Hi Christian,
A very non-technical answer but as the problem seems related to the RBD
client it might be worth trying the latest Kernel if possible. The RBD
client is Kernel based and so there may be a fix which might stop this from
happening.
Nick
-Original Message-
From: ceph-users
On 19/04/2015 05:33, Francois Lafont wrote:
If I understand well, all objects in the cluster are dirty.
Is it normal?
What is a dirty object?
As far as I can see, this is only meaningful for cache pools, and an object
is dirty in the sense of having been created or modified since its
last
Hi all!!
I'm setting up a Ceph (version 0.80.6) cluster and I'm benchmarking the
infrastructure and Ceph itself. I've got 3 rack servers (Dell R630), each
with its own disks in enclosures.
The cluster network bandwidth is 10Gbps, and the bandwidth between the RAID
controller (Dell H830) and
On Mon, Apr 20, 2015 at 1:33 PM, Nikola Ciprich
nikola.cipr...@linuxbox.cz wrote:
Hello,
I'm quite new to ceph, so please forgive my ignorance.
Yesterday, I deployed a small test cluster (3 nodes, 2 SATA + 1 SSD OSD /
node).
I enabled the MDS server, created the cephfs data + metadata pools, and
Are your journals on separate disks? What is your ratio of journal disks to
data disks? Are you doing replication size 3?
On Mon, Apr 20, 2015 at 9:30 AM, J-P Methot jpmet...@gtcomm.net wrote:
Hi,
This is similar to another thread running right now, but since our current
setup is completely
My journals are on-disk, each disk being an SSD. The reason I didn't go
with dedicated drives for journals is that when designing the setup, I
was told that having dedicated journal SSDs on a full-SSD setup would
not give me performance increases.
So that makes the journal disk to data disk
The big question is how fast these drives can do O_DSYNC writes. The
basic gist of this is that for every write to the journal, an
ATA_CMD_FLUSH call is made to ensure that the device (or potentially the
controller) knows that this data really needs to be stored safely before
the flush is
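A quick way to measure that is the usual dd test (destructive; /dev/sdX is a
placeholder for the journal device):

dd if=/dev/zero of=/dev/sdX bs=4k count=100000 oflag=direct,dsync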
fio_test 4096M 2
template-win2k8-20150420 40960M 2
template-win2k8-20150420@snap 40960M 2
[root@vfnphav1a ~]# rbd snap protect ssd2r/template-win2k8-20150420@snap
rbd: protecting snap failed: 2015-04-20 16:47:31.587489 7f5e9e4fa760 -1 librbd
Hi,
This is similar to what you would observe if you hit the ulimit on
open files/sockets in a Ceph client. Though that normally only affects
clients in user mode, not the kernel. What are the ulimits of your
rbd-fuse client? Also, you could increase the client logging debug
levels to see why the
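Something along these lines would check the limits and turn up client
logging (the values here are examples only):

ulimit -n            # check the current open-file limit
ulimit -n 65536      # raise it in the shell that starts rbd-fuse
# and in ceph.conf, to get more detail from the client:
[client]
    debug rbd = 20
    debug ms = 1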
Hi,
Check the XFS fragmentation factor for the rbd disks, i.e.
xfs_db -c frag -r /dev/sdX
If it is high, try defragmenting:
xfs_fsr /dev/sdX
Regards,
Onur.
On 4/20/2015 4:41 PM, Nick Fisk wrote:
If possible, it might be worth trying an EXT4 formatted RBD. I've had
problems with XFS hanging in the past
Hi,
This is similar to another thread running right now, but since our
current setup is completely different from the one described in the
other thread, I thought it may be better to start a new one.
We are running Ceph Firefly 0.80.8 (soon to be upgraded to 0.80.9). We
have 6 OSD hosts
Hi,
I'm currently benchmarking a full-SSD setup (not finished yet),
but with 4 OSDs on Intel S3500 SSDs (replication x1), with 4M randwrite I'm
around 550MB/s;
with random 4K, I'm around 4 iops (1 iops per OSD; the limit is the disk's
O_DSYNC write speed).
This is with Hammer.
- Mail
Hello,
On Mon, 20 Apr 2015 10:30:41 -0400 J-P Methot wrote:
Hi,
This is similar to another thread running right now, but since our
current setup is completely different from the one described in the
other thread, I thought it may be better to start a new one.
We are running Ceph
Can you please run 'rbd info' on template-win2k8-20150420 and
template-win2k8-20150420@snap? I just want to verify which RBD features are
currently enabled on your images. Have you overridden the value of
rbd_default_features in your ceph.conf? Did you use the new rbd CLI option
'--image
We have a similar issue, but we wanted three copies across two racks. It turns
out that we increased size to 4 and left min_size at 2. We didn't want to
risk having fewer than two copies, and if we only had three copies, losing a
rack would block I/O. Once we expand to a third rack, we will adjust our
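For the record, the change itself is just (pool name is a placeholder):

ceph osd pool set mypool size 4
ceph osd pool set mypool min_size 2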
On Mon, Apr 20, 2015 at 11:17 AM, Dan van der Ster d...@vanderster.com wrote:
I haven't tried, but wouldn't something like this work:
step take default
step chooseleaf firstn 2 type host
step emit
step take default
step chooseleaf firstn -2 type osd
step emit
We use something like that
Using rados benchmark. It's just a test pool anyway. I will stick with my
current OSD setup (16 HDDs, 4 SSDs, for a 1:4 ratio of SSD to HDD). I can get
800 MB/s write and about 1 GB/s read.
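For reference, the rados bench runs look something like this (pool name is a
placeholder; the seq read needs the objects left behind by --no-cleanup):

rados bench -p testpool 60 write --no-cleanup
rados bench -p testpool 60 seq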
On Mon, Apr 20, 2015 at 11:19 AM, Mark Nelson mnel...@redhat.com wrote:
How are you measuring the 300MB/s and
Greetings Cephers,
I have hit a bit of a wall between the available documentation and my
understanding of it with regards to CRUSH rules. I am trying to determine if it
is possible to replicate 3 copies across 2 hosts, such that if one host is
completely lost, at least 1 copy is available. The
On Mon, Apr 20, 2015 at 10:46 AM, Colin Corr co...@pc-doctor.com wrote:
Greetings Cephers,
I have hit a bit of a wall between the available documentation and my
understanding of it with regards to CRUSH rules. I am trying to determine if
it is possible to replicate 3 copies across 2 hosts,
Hello Jason,
On Mon, Apr 20, 2015 at 01:48:14PM -0400, Jason Dillaman wrote:
Can you please run 'rbd info' on template-win2k8-20150420 and
template-win2k8-20150420@snap? I just want to verify which RBD features are
currently enabled on your images. Have you overridden the value
I haven't tried, but wouldn't something like this work:
step take default
step chooseleaf firstn 2 type host
step emit
step take default
step chooseleaf firstn -2 type osd
step emit
We use something like that for an asymmetric multi-room rule.
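Spelled out as a complete rule it would look roughly like this (rule name and
ruleset number are placeholders):

rule replicated_hosts_then_osds {
        ruleset 1
        type replicated
        min_size 2
        max_size 4
        step take default
        step chooseleaf firstn 2 type host
        step emit
        step take default
        step chooseleaf firstn -2 type osd
        step emit
}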
Cheers, Dan
On Apr 20, 2015 20:02, Robert LeBlanc
Hi,
I have an issue with my Ceph cluster where two nodes were lost by accident
and have been recreated.
ceph osd tree
# id    weight  type name               up/down reweight
-1      14.56   root default
-6      14.56       datacenter dc1
-7      14.56           row row1
-9      14.56               rack
On Mon, Apr 20, 2015 at 3:38 AM, John Spray john.sp...@redhat.com wrote:
I hadn't noticed that we presented this as nonzero for regular pools
before; it is a bit weird. Perhaps we should show zero here instead for
non-cache-tier pools.
I have always planned to add a cold EC tier later,
[root@vfnphav1a ~]# rbd ls -l ssd2r
NAME                           SIZE PARENT FMT PROT LOCK
fio_test                       4096M       2
template-win2k8-20150420      40960M       2
template-win2k8-20150420@snap 40960M       2
[root@vfnphav1a ~]# rbd snap
I have an SSD pool for testing (only 8 drives), but when I do 1 SSD with the
journal and 1 SSD with data I get 300 MB/s write. When I change all 8
disks to house their own journals I get 184 MB/s write.
On Mon, Apr 20, 2015 at 10:16 AM, Mark Nelson mnel...@redhat.com wrote:
The big question is how fast
How are you measuring the 300MB/s and 184MB/s? I.e., is it per drive, or
the client throughput? Also what controller do you have? We've seen
some controllers from certain manufacturers start to top out at around
1-2GB/s with write cache enabled.
Mark
On 04/20/2015 11:15 AM, Barclay Jameson
snapshot
I'm getting the following error:
[root@vfnphav1a ~]# rbd ls -l ssd2r
NAME                           SIZE PARENT FMT PROT LOCK
fio_test                       4096M       2
template-win2k8-20150420      40960M       2
template-win2k8-20150420@snap 40960M       2
On Apr 20, 2015 20:22, Gregory Farnum g...@gregs42.com wrote:
On Mon, Apr 20, 2015 at 11:17 AM, Dan van der Ster d...@vanderster.com
wrote:
I haven't tried, but wouldn't something like this work:
step take default
step chooseleaf firstn 2 type host
step emit
step take default
step