Hello Vlad,
Ceph clients send reads to the primary OSD of each PG. If you create a
CRUSH rule for building1 and one for building2 that takes an OSD from
the same building as the first one, your reads to the pool will always
stay in the same building (if the cluster is healthy) and only write
requests get
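For illustration, a minimal sketch of such a rule (the bucket names
building1/building2 and the rule id are assumptions; a pool size of 3 is
assumed):

  rule building1_primary {
      id 1
      type replicated
      min_size 1
      max_size 10
      step take building1
      step chooseleaf firstn 1 type host
      step emit
      step take building2
      step chooseleaf firstn -1 type host
      step emit
  }

The first emit selects the primary, and therefore the read target, from
building1; the second fills the remaining replicas from building2. A
mirror-image rule with the buildings swapped would serve building2-local
pools.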
>> If you're using the kernel client for CephFS, I strongly advise having the
>> client on the same subnet as the Ceph public one, i.e. all traffic should be
>> on the same subnet/VLAN. Even if your firewall situation is good, if you have
>> to cross subnets or VLANs, you will run into weird problems later.
Are you sure the down OSD didn't happen to hold any data required for the
rebalance to complete? How long had the now-removed OSD been down? Since
before or after you increased the PG count?
If you do "ceph health detail" and then pick a stuck PG, what does "ceph pg
<pgid> query" output?
Has your ceph -s
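As a concrete sketch of that diagnostic sequence (the PG id 2.1f is a
placeholder):

  ceph health detail   # lists the stuck PGs with their ids and states
  ceph pg 2.1f query   # per-PG detail, including which OSDs block recovery
  ceph -s              # overall cluster health and recovery progress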
How does one repair an rstats mismatch detected by 'scrub_path' (caused by a
previous failure to write the journal)?
And how bad is an rstats mismatch? What are rstats used for? One thing the
mismatch apparently does is make it impossible to delete the directory,
as CephFS says it isn't
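For reference, a hedged sketch of a repair attempt via the MDS admin socket
(mds.a and the path are placeholders; check the scrub flags against your
release's documentation):

  ceph daemon mds.a scrub_path /stuck/dir recursive repair   # re-scrubs the subtree and repairs rstats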
I am trying to test replicated Ceph with servers in different buildings,
and I have a read problem.
Reads from one building go to OSDs in the other building and vice versa,
making reads slower than writes! This makes reads as slow as the slowest node.
Is there a way to
- disable parallel read (so it reads onl
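One knob worth trying here is primary affinity: lowering it on the remote
building's OSDs steers PG primaries, and therefore reads, to local OSDs.
A sketch, assuming osd.10 sits in the remote building (pre-Luminous clusters
may also need mon_osd_allow_primary_affinity = true):

  ceph osd primary-affinity osd.10 0   # never choose this OSD as a PG primary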
If you're using the kernel client for CephFS, I strongly advise having the client
on the same subnet as the Ceph public one, i.e. all traffic should be on the same
subnet/VLAN. Even if your firewall situation is good, if you have to cross
subnets or VLANs, you will run into weird problems later. Fuse
OK,
it seems to come from the firewall:
I'm seeing dropped sessions exactly 15 min before the log entries.
The dropped sessions are the ones to the OSDs; the sessions to the mon and MDS are fine.
It seems that keepalive2 is used to monitor the mon session
https://patchwork.kernel.org/patch/7105641/
but I'm not sure about the OSD sessions
Kernel 4.13+ (I tested up to 4.18) is missing some non-essential feature (explained
by a Ceph dev on this ML) that was added in Luminous, so the clients show up as
Jewel, but otherwise they're fully compatible with upmap. We have a few hundred
nodes on the kernel client with CephFS, and we also run balancer with
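The usual enabling sequence, as a sketch (only safe once you've confirmed
every connected client really is upmap-capable despite reporting Jewel):

  ceph osd set-require-min-compat-client luminous --yes-i-really-mean-it   # override the Jewel self-reporting
  ceph balancer mode upmap
  ceph balancer on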
To be more precise,
the log entries occur when the hang finishes.
I have looked at stats on 10 different hangs, and the duration is always around
15 minutes.
Maybe related to:
ms tcp read timeout
Description: If a client or daemon makes a request to another Ceph daemon
and does not drop an un
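Notably, the default for that option is 900 seconds, i.e. exactly 15 minutes,
which matches the hang duration you're seeing. A ceph.conf sketch to test with
a shorter timeout (the value is illustrative):

  [global]
  ms tcp read timeout = 60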
Thanks for the extra info, but I find the question more nuanced than that.
For example, in my case I ended up with 12.2.9 on my last handful of
newly-installed servers (and that was simply using yum rather than
explicitly using ceph-deploy).
We are replacing hardware nodes, so the cluster is actively
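If it helps, one way to pin fresh installs to the current version in the
meantime, as a sketch (assuming the versionlock plugin is available on your
distro):

  yum install yum-plugin-versionlock
  yum versionlock add 'ceph*'   # lock ceph packages at their installed version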
Hi,
On 08/11/2018 22:38, Ken Dreyer wrote:
What's the full apt-get command you're running?
I wasn't using apt-get, because the ceph repository has the broken
12.2.9 packages in it (and I didn't want to install them, obviously); so
I downloaded all the .debs I needed, installed the dependenc
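For the record, the manual route looks roughly like this sketch (the package
file names are illustrative):

  dpkg -i ./ceph-common_12.2.8-1xenial_amd64.deb ./radosgw_12.2.8-1xenial_amd64.deb
  apt-get -f install   # pull in any missing dependencies afterwards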
Hi Matthew,
What's the full apt-get command you're running?
On Thu, Nov 8, 2018 at 9:31 AM Matthew Vernon wrote:
>
> Hi,
>
> in Jewel, /etc/bash_completion.d/radosgw-admin is in the radosgw package
> In Luminous, /etc/bash_completion.d/radosgw-admin is in the ceph-common
> package
>
> ...so if yo
On Thursday 08/11/2018 at 06:17, Marc Roos wrote:
> And that is why I don't like ceph-deploy. Unless you have maybe hundreds
> of disks, I don’t see why you cannot install it "manually".
We do have another cluster with 600+ disks, this one has 91 so far.
We actually started using ceph-deploy
On Thu, Nov 8, 2018 at 12:16 PM, Ricardo J. Barberis wrote:
> Hi Neha, thank you for the info.
>
> I'd like to clarify that we didn't actually upgrade to 12.2.9, we just
> installed 4 more OSD servers and those got 12.2.9, so we have a mixture
> of 12.2.9 and 12.2.8.
>
> Should we:
> - keep as is
Hi Neha, thank you for the info.
I'd like to clarify that we didn't actually upgrade to 12.2.9, we just
installed 4 more OSD servers and those got 12.2.9, so we have a mixture
of 12.2.9 and 12.2.8.
Should we:
- keep as is and wait for 12.2.10+ before proceeding?
- downgrade our newest OSDs from 1
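Whichever option you choose, it's worth confirming the exact mix per daemon
first; a quick sketch:

  ceph versions   # summarizes the running versions of mon/mgr/osd daemons across the cluster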
On 08/11/2018 16:31, Matthew Vernon wrote:
Hi,
in Jewel, /etc/bash_completion.d/radosgw-admin is in the radosgw package
In Luminous, /etc/bash_completion.d/radosgw-admin is in the ceph-common
package
...so if you try and upgrade, you get:
Unpacking ceph-common (12.2.8-1xenial) over (10.2.9-0
Hi,
we are currently testing CephFS with the kernel module (4.17 and 4.18) instead
of FUSE (which worked fine),
and we have hangs; iowait jumps like crazy for around 20 min.
The client is a QEMU 2.12 VM with a virtio-net interface.
In the client logs, we are seeing this kind of entry:
[Thu Nov 8 12:20:18 2018] lib
On Thu, Nov 8, 2018 at 5:10 PM Stefan Kooman wrote:
>
> Quoting Stefan Kooman (ste...@bit.nl):
> > I'm pretty sure it isn't. I'm trying to do the same (force luminous
> > clients only) but ran into the same issue. Even when running 4.19 kernel
> > it's interpreted as a jewel client. Here is the li
On 08/11/2018 16:31, Matthew Vernon wrote:
The exact versioning would depend on when the move was made (I presume
either Jewel -> Kraken or Kraken -> Luminous). Does anyone know?
To answer my own question, this went into 12.0.3 via
https://github.com/ceph/ceph/commit/9fd30b93f7281fad70b93512f0
Hi,
in Jewel, /etc/bash_completion.d/radosgw-admin is in the radosgw package
In Luminous, /etc/bash_completion.d/radosgw-admin is in the ceph-common
package
...so if you try and upgrade, you get:
Unpacking ceph-common (12.2.8-1xenial) over (10.2.9-0ubuntu0.16.04.1) ...
dpkg: error processing
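The proper fix would be for the Luminous ceph-common packaging to declare
Breaks/Replaces against the old radosgw package; as a client-side workaround,
a hedged sketch (forces dpkg to accept the file changing owner, use with care):

  apt-get -o Dpkg::Options::="--force-overwrite" install ceph-common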
Quoting Stefan Kooman (ste...@bit.nl):
> I'm pretty sure it isn't. I'm trying to do the same (force luminous
> clients only) but ran into the same issue. Even when running 4.19 kernel
> it's interpreted as a jewel client. Here is the list I made so far:
>
> Kernel 4.13 / 4.15:
> "featu
On Thu, Nov 8, 2018 at 10:00, Joao Eduardo Luis wrote:
> Hello Gesiel,
>
> Welcome to Ceph!
>
> In the future, you may want to address the ceph-users list
> (`ceph-users@lists.ceph.com`) for this sort of issue.
>
>
Thank you, I will.
On 11/08/2018 11:18 AM, Gesiel Galvão Bernardes wr
On Thu, Nov 8, 2018 at 2:15 PM Stefan Kooman wrote:
>
> Quoting Ilya Dryomov (idryo...@gmail.com):
> > On Sat, Nov 3, 2018 at 10:41 AM wrote:
> > >
> > > Hi.
> > >
> > > I tried to enable the "new smart balancing" - backend are on RH luminous
> > > clients are Ubuntu 4.15 kernel.
> [cut]
> > > ok
Quoting Ilya Dryomov (idryo...@gmail.com):
> On Sat, Nov 3, 2018 at 10:41 AM wrote:
> >
> > Hi.
> >
> > I tried to enable the "new smart balancing" - backend are on RH luminous
> > clients are Ubuntu 4.15 kernel.
[cut]
> > ok, so 4.15 kernel connects as a "hammer" (<1.0) client? Is there a
> > hu
On Thu, Nov 8, 2018 at 3:02 AM Janne Johansson wrote:
>
> Den ons 7 nov. 2018 kl 18:43 skrev David Turner :
> >
> > My big question is that we've had a few of these releases this year that
> > are bugged and shouldn't be upgraded to... They don't have any release
> > notes or announcement and th
On 11/8/18 12:28 PM, Hector Martin wrote:
> On 11/8/18 5:52 PM, Wido den Hollander wrote:
>> [osd]
>> bluestore_cache_size_ssd = 1G
>>
>> The BlueStore cache size for SSD has been set to 1GB, so the OSDs
>> shouldn't use more than that.
>>
>> When dumping the mem pools each OSD claims to be usin
On 11/8/18 1:05 PM, ST Wong (ITSC) wrote:
> Hi,
>
>
>
> We created a testing rbd block device image as follows:
>
>
>
> - cut here ---
>
> # rbd create 4copy/foo --size 10G
>
> # rbd feature disable 4copy/foo object-map fast-diff deep-flatten
>
> # rbd --image 4copy/foo info
What command are you using to mount /dev/rbd0 in the first place? You seem
to have missed that in your copy and paste.
On Thu, Nov 8, 2018 at 8:06 PM ST Wong (ITSC) wrote:
> Hi,
>
>
>
> We created a testing rbd block device image as follows:
>
>
>
> - cut here ---
>
> # rbd create 4copy
Hi,
We created a testing rbd block device image as follows:
- cut here ---
# rbd create 4copy/foo --size 10G
# rbd feature disable 4copy/foo object-map fast-diff deep-flatten
# rbd --image 4copy/foo info
rbd image 'foo':
size 10 GiB in 2560 objects
order 22 (4 MiB object
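For completeness, the usual map-and-mount sequence after that would be
something like this sketch (filesystem type and mountpoint are illustrative):

  # rbd map 4copy/foo
  /dev/rbd0
  # mkfs.ext4 /dev/rbd0
  # mount /dev/rbd0 /mnt/foo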
I'm experimenting with single-host Ceph use cases, where HA is not
important but data durability is.
How does a Ceph cluster react to its (sole) mon being rolled back to an
earlier state? The idea here is that the mon storage may not be
redundant but would be (atomically, e.g. lvm snapshot and dum
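As a sketch of that backup idea (the VG/LV names and sizes are assumptions,
and the mon store is assumed to live on its own LV):

  lvcreate --snapshot --name mon-snap --size 5G /dev/vg0/mon-lv   # atomic point-in-time copy of the mon store
  mount -o ro /dev/vg0/mon-snap /mnt/mon-snap
  tar -czf /backup/mon-store.tar.gz -C /mnt/mon-snap .
  umount /mnt/mon-snap && lvremove -y /dev/vg0/mon-snap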
Hello Gesiel,
Welcome to Ceph!
In the future, you may want to address the ceph-users list
(`ceph-users@lists.ceph.com`) for this sort of issue.
On 11/08/2018 11:18 AM, Gesiel Galvão Bernardes wrote:
> Hi everyone,
>
> I am a beginner in Ceph. I made an increase of pg_num in a pool, and
> after
On 11/8/18 5:52 PM, Wido den Hollander wrote:
> [osd]
> bluestore_cache_size_ssd = 1G
>
> The BlueStore cache size for SSD has been set to 1GB, so the OSDs
> shouldn't use more than that.
>
> When dumping the mem pools each OSD claims to be using between 1.8GB and
> 2.2GB of memory.
>
> $ ceph d
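For context, the mempool numbers mentioned above come from the OSD admin
socket; a sketch (osd.0 is a placeholder):

  ceph daemon osd.0 dump_mempools   # JSON breakdown: bluestore cache, pglog, osdmap, etc.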
Hello Marc,
> - You can use this separately from the command line?
yes, we don't take away any feature or way of working, but we don't recommend it
> - And if I modify something from the command line, are these changes visible
> in the web interface?
yes, we just ask Ceph/Linux for its current state
On 11/8/18 11:34 AM, Stefan Kooman wrote:
> Quoting Wido den Hollander (w...@42on.com):
>> Hi,
>>
>> Recently I've seen a Ceph cluster experience a few outages due to memory
>> issues.
>>
>> The machines:
>>
>> - Intel Xeon E3 CPU
>> - 32GB Memory
>> - 8x 1.92TB SSD
>> - Ubuntu 16.04
>> - Ceph 1
Quoting Wido den Hollander (w...@42on.com):
> Hi,
>
> Recently I've seen a Ceph cluster experience a few outages due to memory
> issues.
>
> The machines:
>
> - Intel Xeon E3 CPU
> - 32GB Memory
> - 8x 1.92TB SSD
> - Ubuntu 16.04
> - Ceph 12.2.8
What kernel version is running? What network card
Hmm, interesting, maybe.
- You can use this separately from the command line?
- And if I modify something from the command line, are these changes
visible in the web interface?
- Can I easily remove/add this web interface? I mean, sometimes you have
these tools that just customize the whole enviro
Sorry to say this, but that's why there's the croit management
interface (free community edition feature).
You don't have to worry about problems that are absolutely critical
for reliable and stable operation. It doesn't matter if you run a
cluster with 10 or 1000 hard disks, it just has to run!
O
I know Ceph is meant to operate at scale, that's why we are all here.
But if you have a 180-disk cluster, you have 6-9 nodes; adding a node is
nothing. I would just do the manual install, especially with a
production environment, considering all the 'little' bugs surfacing
here. I do
Hello,
since the upgrade from Jewel to Luminous 12.2.8, some errors related to
"scrub mismatch" are reported in the logs, every day at the same time.
I have 5 mons (mon.0 to mon.4) and I need help to identify and recover
from this problem.
This is the log:
2018-11-07 15:13:53.808128 [ERR] mon.4
On 08/11/2018 09:17, Marc Roos wrote:
And that is why I don't like ceph-deploy. Unless you have maybe hundreds
of disks, I don’t see why you cannot install it "manually".
...as the recent ceph survey showed, plenty of people have hundreds of
disks! Ceph is meant to be operated at scale, whi
On 08/11/2018 09:17, Marc Roos wrote:
And that is why I don't like ceph-deploy. Unless you have maybe hundreds
of disks, I don’t see why you cannot install it "manually".
On 07/11/2018 22:22, Ricardo J. Barberis wrote:
Also relevant: if you use ceph-deploy like I do with CentOS 7, it
ins
And that is why I don't like ceph-deploy. Unless you have maybe hundreds
of disks, I don’t see why you cannot install it "manually".
-Original Message-
From: Ricardo J. Barberis [mailto:rica...@palmtx.com.ar]
Sent: Wednesday, 7 November 2018 23:23
To: ceph-users@lists.ceph.com
Subject
Hi,
Recently I've seen a Ceph cluster experience a few outages due to memory
issues.
The machines:
- Intel Xeon E3 CPU
- 32GB Memory
- 8x 1.92TB SSD
- Ubuntu 16.04
- Ceph 12.2.8
Looking at one of the machines:
root@ceph22:~# free -h
              total        used        free      shared  buff
Sure.
Seems that there is a bug in the test itself:
https://jenkins.ceph.com/job/ceph-pull-requests-arm64/25498/console
Best Wishes
- Original Message -
From: "Ilya Dryomov"
To: "xiang.dai"
Cc: "ceph-users"
Sent: Wednesday, November 7, 2018 10:40:13 PM
Subject: Re: [ceph-users] [bug] mount.c
In the past few days I have noticed that every single automated deep scrub
comes back as inconsistent; once I run a manual deep-scrub it finishes fine
and the PG is marked as clean.
I am running the latest Mimic but have noticed someone else under Luminous
is facing the same issue:
http://lists.cep
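When a PG is flagged inconsistent, the offending objects can be listed before
deciding on a repair; a sketch (the PG id 2.1f is a placeholder):

  rados list-inconsistent-obj 2.1f --format=json-pretty   # shows which object/shard mismatched and why
  ceph pg repair 2.1f                                     # only once the mismatch is understood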
I'll second that.
We are in the process of upgrading after just receiving new hardware, and
it looks like right now, digging up information on what to do and exactly
how will take literally hundreds of times more time and effort than the
upgrade itself, once you know.
On 08.11.2018 10:02, Janne
Den ons 7 nov. 2018 kl 18:43 skrev David Turner :
>
> My big question is that we've had a few of these releases this year that are
> bugged and shouldn't be upgraded to... They don't have any release notes or
> announcement and the only time this comes out is when users finally ask about
> it we