On Mon, Aug 20, 2018 at 10:23 AM, Daznis wrote:
> Hello,
>
> It appears that something is horribly wrong with the cluster itself. I
> can't create or add any new osds to it at all.
Have you added new monitors? Or replaced monitors? I would check that
all your versions match, something seems to
On 20/08/18 06:44, John Spray wrote:
On Sun, Aug 19, 2018 at 9:21 PM Alfredo Daniel Rezinovsky
wrote:
both in ubuntu 16.04 and 18.04, ceph-mgr fails to start when the package
python-routes is not installed
I guess you mean that the dashboard doesn't work, as opposed to the
whole ceph-mgr process
On Mon, Aug 20, 2018 at 6:50 PM Alfredo Daniel Rezinovsky
wrote:
>
>
>
> On 20/08/18 06:44, John Spray wrote:
> > On Sun, Aug 19, 2018 at 9:21 PM Alfredo Daniel Rezinovsky
> > wrote:
> >> both in ubuntu 16.04 and 18.04, ceph-mgr fails to start when the package
> >> python-routes is not installed
> >
Hello,
AFAIK, removing big RBD images would lead Ceph to produce blocked
requests - I don't mean ones caused by poor disks.
Is this still the case with "Luminous (12.2.4)"?
I have a few images with
- 2 terabyte
- 5 terabyte
and
- 20 terabyte
in size and have to delete the images.
Would
Folks,
Today I found that "ceph -s" is really slow, just hanging for a minute or
two before giving me output; the same goes for "ceph osd tree" - the
command just hangs a long time before giving me output.
This is the output I am seeing: one OSD is down, not sure why it is down
and what the relation is with
The general talk about the rados cleanup command is to clean things up
after benchmarking. Could this command also be used for deleting an old
RGW bucket or an RBD? For instance, a bucket with a prefix of
`25ff9eff-058b-41e3-8724-cfffecb979c0.9709451.1` such that all objects in
the
Hello,
Medic shows everything fine. Whole cluster is on the latest mimic
version. It was updated to mimic when the stable version of mimic was
released, and recently it was updated to "ceph version 13.2.1
(5533ecdc0fda920179d7ad84e0aa65a127b20d77) mimic (stable)". For some
reason one mgr service is
This issue first started while using Luminous 12.2.5, I upgraded to 12.2.7 and
it's still present. This issue is _not_ present in 12.2.4.
With Ceph 12.2.4, using QEMU/KVM + Libvirt, I'm able to mount an rbd image
using the following syntax and populated xml:
virsh attach-device $vm foo.xml
Hello,
since Loic seems to have left Ceph development and his wonderful crush
optimization tool isn't working anymore, I'm trying to get a good
distribution with the ceph balancer.
Sadly, it does not work as well as I want.
# ceph osd df | sort -k8
shows 75 to 83% usage, which is an 8% difference
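For illustration, that spread can be computed mechanically from the %USE column of `ceph osd df`. This is a hedged sketch; the sample lines below are made up, not taken from this cluster:

```python
# Hypothetical sketch: compute the OSD utilization spread from
# `ceph osd df` output (the %USE column is the 8th field, matching
# the `sort -k8` in the message). Sample data is invented.
def use_spread(osd_df_lines):
    """Return (min, max, spread) of the %USE column."""
    usages = []
    for line in osd_df_lines:
        fields = line.split()
        if len(fields) >= 8:
            try:
                usages.append(float(fields[7]))
            except ValueError:
                continue  # skip header or summary lines
    return min(usages), max(usages), max(usages) - min(usages)

sample = [
    "ID CLASS WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR PGS",
    "0  hdd   7.27   1.0      7.2T 5.4T 1.8T  75.0 1.0 180",
    "1  hdd   7.27   1.0      7.2T 6.0T 1.2T  83.0 1.1 195",
]
print(use_spread(sample))  # -> (75.0, 83.0, 8.0)
```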
On Mon, Aug 20, 2018 at 4:52 PM Dietmar Rieder
wrote:
>
> Hi Cephers,
>
>
> I wonder if the cephfs client in RedHat/CentOS 7.5 will be updated to
> luminous?
> As far as I see there is some luminous related stuff that was
> backported, however,
> the "ceph features" command just reports "jewel"
That message has been there since 2014. We should lower the log level though.
Yehuda
On Mon, Aug 20, 2018 at 6:08 AM, David Turner wrote:
> In luminous they consolidated a lot of the rgw metadata pools by using
> namespace inside of the pools. I would say that the GC pool was consolidated
>
Hi Eugen,
I think it does have a positive effect on the messages, since I get
fewer messages than before.
Eugen Block wrote on Monday, 20 Aug 2018, at 9:29 PM:
> Update: we are getting these messages again.
>
> So the search continues...
>
>
> Quoting Eugen Block:
>
> > Hi,
> >
> > Depending on your kernel
There was an existing bug reported for this one, and it's fixed on master:
http://tracker.ceph.com/issues/23801
It will be backported to luminous and mimic.
On Mon, Aug 20, 2018 at 9:25 AM, Yehuda Sadeh-Weinraub
wrote:
> That message has been there since 2014. We should lower the log level
On 08/20/2018 05:20 PM, David Turner wrote:
> The general talk about the rados cleanup command is to clean things up
> after benchmarking. Could this command also be used for deleting an old
> RGW bucket or an RBD? For instance, a bucket with a prefix of
>
On Mon, Aug 20, 2018 at 5:37 PM Ilya Dryomov wrote:
>
> On Mon, Aug 20, 2018 at 4:52 PM Dietmar Rieder
> wrote:
> >
> > Hi Cephers,
> >
> >
> > I wonder if the cephfs client in RedHat/CentOS 7.5 will be updated to
> > luminous?
> > As far as I see there is some luminous related stuff that was
>
On 20.08.2018 22:13, David Turner wrote:
> You might just have too much data per PG. If a single PG can account
> for 4% of your OSD, then 9% difference in used space on your OSDs is
> caused by an OSD having only 2 more PGs than another OSD. If you do
> have very large PGs, increasing your
On Tue, Aug 21, 2018 at 2:37 AM, Satish Patel wrote:
> Folks,
>
> Today I found that "ceph -s" is really slow, just hanging for a minute or
> two before giving me output; the same goes for "ceph osd tree" - the
> command just hangs a long time before giving me output.
>
> This is the output I am seeing: one
I didn't ask how many PGs per OSD; I asked how large your PGs are in
comparison to your OSDs. For instance, my primary data pool in my home
cluster has 10914GB of data in it and has 256 PGs. That means that each PG
accounts for 42GB of data. I'm using 5TB disks in this cluster. Each PG
on an
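The arithmetic in this message can be sketched as follows (a minimal illustration using only the numbers quoted above; replication is ignored for simplicity):

```python
# The arithmetic from the message: a pool holding 10914 GB across
# 256 PGs stores roughly 42-43 GB per PG.
def gb_per_pg(pool_data_gb, pg_num):
    """Average data per placement group, in GB."""
    return pool_data_gb / pg_num

per_pg = gb_per_pg(10914, 256)
print(round(per_pg, 1))  # -> 42.6

# On a 5 TB (~5000 GB) disk, a difference of two PGs between OSDs
# therefore shifts roughly 1.7% of the OSD's capacity:
print(round(2 * per_pg / 5000 * 100, 1))  # -> 1.7
```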
Hi again,
I'm starting to feel really unlucky here...
At the moment, the situation is "sort of okay":
1387 active+clean
11 active+clean+inconsistent
7 active+recovery_wait+degraded
1
On 20.08.2018 21:52, Sage Weil wrote:
> On Mon, 20 Aug 2018, Stefan Priebe - Profihost AG wrote:
>> Hello,
>>
>> since Loic seems to have left Ceph development and his wonderful crush
>> optimization tool isn't working anymore, I'm trying to get a good
>> distribution with the ceph balancer.
On 20.08.2018 22:38, Dan van der Ster wrote:
> On Mon, Aug 20, 2018 at 10:19 PM Stefan Priebe - Profihost AG
> wrote:
>>
>>
>> On 20.08.2018 21:52, Sage Weil wrote:
>>> On Mon, 20 Aug 2018, Stefan Priebe - Profihost AG wrote:
Hello,
since loic seems to have left ceph
You might just have too much data per PG. If a single PG can account for
4% of your OSD, then 9% difference in used space on your OSDs is caused by
an OSD having only 2 more PGs than another OSD. If you do have very large
PGs, increasing your PG count in those pools should improve your data
Hi there,
A few hours ago I started the given OSD again and gave it weight
1.0. Backfilling started and more PGs became active+clean.
After a while the same crashing behaviour started to act up so I stopped
the backfilling.
Running with noout,nobackfill,norebalance,noscrub,nodeep-scrub
Thanks Brad,
This is what I found: the issue was MTU. I had set MTU 9000 on all my OSD
nodes and the mon node, but somehow it got reverted back to 1500 on the
mon node. The mismatched MTU caused some strange communication issues
between the OSD and mon nodes.
After fixing the MTU on the mon, things started
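A check like the one described can be automated. This is a hypothetical sketch; the node names and the `ip -o link`-style sample lines are made up, and in practice you would collect one line per node (e.g. over ssh):

```python
# Hedged sketch: flag nodes whose interface MTU differs from the
# expected value (9000 in the message). Sample data is invented.
import re

def mtu_mismatches(node_lines, expected=9000):
    """Return {node: mtu} for every node whose MTU differs from expected."""
    bad = {}
    for node, line in node_lines.items():
        m = re.search(r"mtu (\d+)", line)
        if m and int(m.group(1)) != expected:
            bad[node] = int(m.group(1))
    return bad

sample = {
    "osd1": "2: eth0: <BROADCAST,MULTICAST,UP> mtu 9000 qdisc mq",
    "osd2": "2: eth0: <BROADCAST,MULTICAST,UP> mtu 9000 qdisc mq",
    "mon1": "2: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq",
}
print(mtu_mismatches(sample))  # -> {'mon1': 1500}
```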
On 20/08/18 03:50, Bastiaan Visser wrote:
you should only use the 18.04 repo in 18.04, and remove the 16.04 repo.
use:
https://download.ceph.com/debian-luminous bionic main
- Bastiaan
Right. But if I came from a working 16.04 system upgraded to 18.04, the
ceph (xenial) packages are already
Hi Lincoln,
We're looking at (now existing) RBD support using KVM/QEMU, so this is
an upgrade path.
Regards,
Kees
On 20-08-18 16:37, Lincoln Bryant wrote:
What interfaces do your Hammer clients need? If you're looking at
CephFS, we have had reasonable success moving our older clients (EL6)
Hi Kees,
What interfaces do your Hammer clients need? If you're looking at
CephFS, we have had reasonable success moving our older clients (EL6)
to NFS Ganesha with the Ceph FSAL.
--Lincoln
On Mon, 2018-08-20 at 12:22 +0200, Kees Meijs wrote:
> Good afternoon Cephers,
>
> While I'm fixing our
On Mon, Aug 20, 2018 at 06:13:26AM -0400, David Turner wrote:
:There is a thread from the ceph-large users ML that covered a way to do
:this change without shifting data for an HDD only cluster. Hopefully it
:will be helpful for you.
:
:
The correct URL should be:
http://lists.ceph.com/pipermail/ceph-large-ceph.com/2018-April/000106.html
Quoting Jonathan Proulx:
On Mon, Aug 20, 2018 at 06:13:26AM -0400, David Turner wrote:
:There is a thread from the ceph-large users ML that covered a way to do
:this change without
Hi Cephers,
I wonder if the cephfs client in RedHat/CentOS 7.5 will be updated to
luminous?
As far as I see there is some luminous related stuff that was
backported, however,
the "ceph features" command just reports "jewel" as release of my cephfs
clients running CentOS 7.5 (kernel
you should only use the 18.04 repo in 18.04, and remove the 16.04 repo.
use:
https://download.ceph.com/debian-luminous bionic main
- Bastiaan
- Original Message -
From: "Alfredo Daniel Rezinovsky"
To: "ceph-users"
Sent: Sunday, August 19, 2018 10:15:00 PM
Subject: [ceph-users]
Hello K,
We have found our issue - we were only fixing the main RBD image in our script
rather than the snapshots. Working fine now.
Thanks for your help.
Kind regards,
Glen Baars
From: Konstantin Shalygin
Sent: Friday, 17 August 2018 11:20 AM
To: ceph-users@lists.ceph.com; Glen Baars
OK... after a bit more searching, I realized you can specify the username
directly in the constructor of the "Rados" object. I'm still not entirely
clear how one would do it through the config file, but this works for me as
well.
import rados
cluster = rados.Rados(conffile="python_ceph.conf",
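The truncated snippet above can be completed roughly like this. It is a sketch, not the poster's exact code: the conf path comes from the message, while the client id "python" and the use of the `rados_id` keyword are my assumptions.

```python
# Hedged sketch: passing the client name to the Rados constructor,
# since (per this thread) it cannot be set via the config file.
# "python_ceph.conf" is from the original message; the rados_id value
# "python" is a made-up example.
def client_kwargs(user="python", conffile="python_ceph.conf"):
    """Build keyword arguments for rados.Rados()."""
    return {"conffile": conffile, "rados_id": user}

try:
    import rados  # requires the python-rados package
    cluster = rados.Rados(**client_kwargs())
    cluster.connect()
    print(cluster.get_fsid())
    cluster.shutdown()
except Exception as exc:  # no bindings installed, or no cluster reachable
    print("skipping live connect:", exc)
```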
Hi David,
Thanks for your advice. My end goal is BlueStore so to upgrade to Jewel
and then Luminous would be ideal.
Currently all monitors are (successfully) running Infernalis, one OSD
node is running Infernalis, and all other OSD nodes have Hammer.
I'll try freeing up one Infernalis OSD at
Bad news: I've got a PG stuck in down+peering now.
Please advise.
K.
On 20-08-18 12:12, Kees Meijs wrote:
> Thanks for your advice. My end goal is BlueStore so to upgrade to Jewel
> and then Luminous would be ideal.
>
> Currently all monitors are (successfully) running Infernalis, one OSD
> node
All of the data moves because all of the crush IDs for the hosts and osds
change when you configure a crush rule to only use SSDs or HDDs. Crush
creates shadow hosts and shadow osds in the crush map that only have each
class of osd. So if you had node1 with osd.0 as an hdd and osd.1 as an
SSD,
As mentioned here recently, the sizing recommendations for BlueStore
have been updated:
http://docs.ceph.com/docs/master/rados/configuration/bluestore-config-ref/#sizing
In our ceph cluster, we have some ratios that are much lower, like 20GB
of SSD (WAL and DB) per 7TB of spinning space. This
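The ratio described can be put into numbers. A minimal sketch: the 20 GB / 7 TB figure is from the message, while the 4% guideline used for comparison is a commonly quoted BlueStore DB sizing figure, not a number taken from this thread.

```python
# Compare the cluster's actual SSD (WAL+DB) ratio against a 4%
# sizing guideline. 20 GB per 7 TB is from the message above.
def db_ratio_percent(db_gb, data_tb):
    """SSD (DB/WAL) capacity as a percentage of backing data capacity."""
    return db_gb / (data_tb * 1000) * 100

print(round(db_ratio_percent(20, 7), 3))  # -> 0.286 (well under 4%)

guideline_gb = 7 * 1000 * 4 / 100  # 4% of 7 TB, in GB
print(guideline_gb)  # -> 280.0
```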
The given PG is back online, phew...
Meanwhile, some OSDs still on Hammer seem to crash with errors alike:
> 2018-08-20 13:06:33.819569 7f8962b2f700 -1 osd/ReplicatedPG.cc: In
> function 'void ReplicatedPG::scan_range(int, int,
> PG::BackfillInterval*, ThreadPool::TPHandle&)' thread 7f8962b2f700
Good afternoon Cephers,
While I'm fixing our upgrade-semi-broken cluster (see thread Upgrade to
Infernalis: failed to pick suitable auth object) I'm wondering about
ensuring client compatibility.
My end goal is BlueStore (i.e. running Luminous) and unfortunately I'm
obliged to offer Hammer
Hello,
right now we have multiple HDD-only clusters with either filestore journals
on SSDs or, on newer installations, WAL etc. on SSD.
I plan to extend our ceph clusters with SSDs to provide SSD-only pools. In
luminous we have device classes, so I should be able to do this without
editing
Hmm, then it is not really an option for me. Maybe someone from the devs
can shed some light on why it does a migration as long as you only have
OSDs of the same class? I have a few petabytes of storage in each cluster.
When it starts migrating everything again, that will result in a super big
performance
It isn't possible in the config file. You have to do it via the rados
constructor. You came to the correct conclusion.
On Mon, Aug 20, 2018, 2:59 AM Benjamin Cherian
wrote:
> Ok...after a bit more searching. I realized you can specify the username
> directly in the constructor of the "Rados"
Issue tracker http://tracker.ceph.com/issues/23801.
Still don't know why only particular OSDs write this information to log
files.
Jakub
On Wed, Aug 8, 2018 at 12:02 PM Jakub Jaszewski
wrote:
> Hi All, exactly the same story today, same 8 OSDs and a lot of garbage
> collection objects to
I'm assuming you use RGW and that you have a GC pool for RGW. It might
also be assumed that your GC pool only has 8 PGs. Are any of those guesses
correct?
On Mon, Aug 20, 2018, 5:13 AM Jakub Jaszewski
wrote:
> Issue tracker http://tracker.ceph.com/issues/23801.
> Still don't know why only
Hi again,
Overnight, some other PGs seem to be inconsistent as well after being deep
scrubbed.
All affected OSDs log similar errors like:
> log [ERR] : 3.13 soid -5/0013/temp_3.13_0_16175425_287/head:
> failed to pick suitable auth object
Since there's temp in the name and we're running a
Ehrm, that should of course be rebuilding. (I.e. removing the OSD,
reformat, re-add.)
On 20-08-18 11:51, Kees Meijs wrote:
> Since there's temp in the name and we're running a 3-replica cluster,
> I'm thinking of just reboiling the comprised OSDs.
My suggestion would be to remove the osds and let the cluster recover from
all of the other copies. I would deploy the node back to Hammer instead of
Infernalis. Either that or remove these osds, let the cluster backfill, and
then upgrade to Jewel, and then luminous, and maybe mimic if you're
I just recently did the same. Take into account that everything starts
migrating. However weird it may be, I had an HDD-only test cluster and
changed the crush rule to the hdd class. It took a few days, totally
unnecessary as far as I am concerned.
-Original Message-
From: Enrico Kern
On Sun, Aug 19, 2018 at 9:21 PM Alfredo Daniel Rezinovsky
wrote:
>
> both in ubuntu 16.04 and 18.04, ceph-mgr fails to start when the package
> python-routes is not installed
I guess you mean that the dashboard doesn't work, as opposed to the
whole ceph-mgr process not starting? If it's the latter
It will automatically spill over to the slower storage if necessary;
it's better to have some fast storage for the DB than just slow
storage.
Paul
2018-08-20 13:07 GMT+02:00 Harald Staub :
> As mentioned here recently, the sizing recommendations for BlueStore have
> been updated:
>
Willem Jan, hello.
On 16 Aug 2018, at 12:07, Willem Jan Withagen wrote:
In the mean time I have uploaded a PR to fix this in the manual, which
should read:
gpart create -s GPT ada1
gpart add -t freebsd-zfs -l osd.1 ada1
zpool create osd.1 gpt/osd.1
zfs create -o
Hi Konstantin,
Thank you for looking into my question.
I was trying to understand how to set up CRUSH hierarchies and set
rules for different failure domains. I am particularly confused by the
'step take' and 'step choose|chooseleaf' settings, which I think
are the keys for defining a failure
In luminous they consolidated a lot of the rgw metadata pools by using
namespace inside of the pools. I would say that the GC pool was
consolidated into the log pool based on the correlation you've found with
the primary osds. At least that mystery is solved as to why those 8 osds.
I don't know
Update: we are getting these messages again.
So the search continues...
Quoting Eugen Block:
Hi,
Depending on your kernel (memory leaks with CephFS) increasing the
mds_cache_memory_limit could be of help. What is your current
setting now?
ceph:~ # ceph daemon mds. config show |
Hello,
It appears that something is horribly wrong with the cluster itself. I
can't create or add any new osds to it at all.
On Mon, Aug 20, 2018 at 11:04 AM Daznis wrote:
>
> Hello,
>
>
> Zapping the journal didn't help. I tried to create the journal after
> zapping it. Also failed. I'm not