We encounter a strange behavior on our Mimic 13.2.6 cluster. At any
time, and without any load, some OSDs become unreachable from only
some hosts. It lasts about 10 minutes and then the problem vanishes.
It's not always the same OSDs or the same hosts. There is no network
failure on any of the hosts (because
The documentation says to size the DB to 4% of the disk data, i.e. 240 GB
for a 6 TB disk. Please give more explanation when your answer disagrees
with the documentation!
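As a quick sanity check of the 4% guideline (a sketch; the 6 TB disk size is from the thread, decimal units assumed):

```python
# BlueStore DB sizing per the 4% guideline (a sketch; decimal TB/GB assumed).
disk_tb = 6                      # raw HDD capacity from the thread
db_gb = disk_tb * 1000 * 0.04    # 4% of 6 TB
print(db_gb)                     # 240.0 GB, matching the figure quoted above
```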
On Mon, Nov 25, 2019 at 11:00, Konstantin Shalygin wrote:
>
> I have a Ceph cluster which was designed for FileStore. Each
I have a Ceph cluster which was designed for FileStore. Each host
has 5 write-intensive SSDs of 400 GB and 20 HDDs of 6 TB, so each HDD
has a WAL of 5 GB on SSD.
If I want to put BlueStore on this cluster, I can only allocate ~75 GB
of WAL and DB on SSD for each HDD, which is far below the 4% limit
complete manual
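The per-HDD share of SSD space on the layout described above can be sketched like this (the ~75 GB figure from the post presumably accounts for the existing WAL partitions and overhead; the numbers below are the raw ones from the thread):

```python
# Raw SSD capacity available per HDD on the host layout described above
# (5 x 400 GB SSDs shared by 20 HDDs); a sketch ignoring partition overhead.
ssd_gb, n_ssd, n_hdd = 400, 5, 20
share_gb = ssd_gb * n_ssd / n_hdd      # 100.0 GB per HDD at best
recommended_gb = 6 * 1000 * 0.04       # 240.0 GB per the 4% guideline
print(share_gb < recommended_gb)       # True: far below the recommendation
```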
On Mon, Oct 15, 2018 at 14:26, Matthew Vernon wrote:
>
> Hi,
>
> On 15/10/18 11:44, Vincent Godin wrote:
> > Does a man page exist for ceph-objectstore-tool? If yes, where can I find it?
>
> No, but there is some --help output:
>
> root@sto-1-1:~
Does a man page exist for ceph-objectstore-tool? If yes, where can I find it?
Thanks
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Ceph cluster on Jewel 10.2.11
Mons & hosts are on CentOS 7.5.1804, kernel 3.10.0-862.6.3.el7.x86_64
Every day, we can see in ceph.log on a monitor a lot of log entries like these:
2018-10-02 16:07:08.882374 osd.478 192.168.1.232:6838/7689 386 :
cluster [WRN] map e612590 wrongly marked me down
2018-10-02
Hello Cephers,
if I had to go to production today, which release should I choose:
Luminous or Mimic?
Two months ago, we had a simple crush map:
- one root
- one region
- two datacenters
- one room per datacenter
- two pools per room (one SATA and one SSD)
- hosts in SATA pool only
- osds in host
So we created a Ceph pool at the SATA level on each site.
After some disk problems which impacted
Hi,
As I understand it, you'll have one RAID1 of two SSDs for 12 HDDs. The
WAL is used for all writes on your host. If you have good SSDs, they
can handle 450-550 MB/s. Your 12 SATA HDDs can handle 12 x 100 MB/s,
that is to say 1200 MB/s. So your RAID1 will be the bottleneck with
this design. A
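The bandwidth argument above can be sketched numerically (the throughput figures are the rough estimates from the post, not benchmarks):

```python
# Rough sequential-throughput comparison for the design discussed above.
ssd_mb_s = 500                     # one write-intensive SSD, ~450-550 MB/s
hdd_mb_s = 100                     # one SATA HDD
n_hdd = 12
hdd_aggregate = n_hdd * hdd_mb_s   # 1200 MB/s across all HDDs
# A RAID1 pair still writes at the speed of a single SSD, so the
# WAL device caps host write throughput well below the HDD aggregate.
print(hdd_aggregate, ssd_mb_s)     # 1200 500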
Hello Alex,
We have a similar design: two datacenters at short distance (sharing
the same layer 2 network) and one datacenter at long range (more than
100 km) for our Ceph cluster. Let's call these sites A1, A2 and B.
We set 2 mons on A1, 2 mons on A2 and 1 mon on B. A1 and A2 share the
same level
As no response was given, I will explain what I found; maybe it
will help other people.
The .dirXXX object is an index marker with a data size of 0. The
metadata associated with this object (located in the LevelDB of the
OSDs currently holding this marker) is the index of the bucket
corresponding to
How can we know the usage of an indexless bucket? We need this
information for our billing process.
Yesterday we had an outage on our Ceph cluster. One OSD was looping on "[call
rgw.bucket_complete_op] snapc 0=[]
ack+ondisk+write+known_if_redirected e359833) currently waiting for
degraded object" for hours, blocking all the requests to this OSD, and
then ...
We had to delete the degraded
Yesterday we just encountered this bug. One OSD was looping on
"2018-01-03 16:20:59.148121 7f011a6a1700 0 log_channel(cluster) log
[WRN] : slow request 30.254269 seconds old, received at 2018-01-03
16:20:28.883837: osd_op(client.48285929.0:14601958 35.8abfc02e
We have some scrub errors on our cluster. A ceph pg repair x.xxx is
taken into account only after hours. It seems to be linked to deep scrubs
which are running at the same time. It looks like it has to wait for
a slot before launching the repair. I then have two questions:
is it possible to launch
In addition to the points that you made:
I noticed on RAID0 disks that read I/O errors are not always trapped by
Ceph, leading to unexpected behaviour of the impacted OSD daemon.
On both RAID0 and non-RAID disks, an I/O error is logged in /var/log/messages
Oct 2 15:20:37 os-ceph05 kernel: sd
If you have at least 2 hosts per room, you can use k=3 and m=3 and
place 2 shards per room (one on each host). You'll need only 3 shards to
read the data: you can lose a room and one host in one of the two other
rooms and still get your data. It covers double faults, which is
better.
It will take more
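The shard arithmetic above can be checked with a quick sketch (the placement counts come from the k=3/m=3 layout described in the post; this is not a CRUSH rule, just the math):

```python
# Fault-tolerance check for EC k=3, m=3 spread over 3 rooms, 2 hosts/room.
k, m = 3, 3
shards = k + m                 # 6 shards total, 2 per room
per_room = 2
# Worst tolerated double fault: one whole room down (2 shards)
# plus one host down in another room (1 more shard).
surviving = shards - per_room - 1
print(surviving >= k)          # True: the data is still readable
```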
We had a similar problem a few months ago when migrating from Hammer to
Jewel. We encountered some old bugs (which were declared closed on
Hammer!). We had some OSDs refusing to start because of a missing pg
map like yours, and some others which were completely busy and started
declaring valid OSDs lost =>
Hi,
If you're using ceph-deploy, just run the command:
ceph-deploy osd prepare --overwrite-conf {your_host}:/dev/sdaa:/dev/sdaf2
When we use a replicated pool of size 3, for example, each piece of data, a
block of 4 MB, is written to one PG which is distributed across 3 hosts (by
default). The OSD holding the primary copy will replicate the block to the
OSDs holding the second and third copies.
With erasure coding, let's take a RAID5-like schema with k=2 and
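A quick sketch of the raw-capacity overhead of the two layouts discussed above (the k=2, m=1 parameters stand in for the RAID5-like schema and are illustrative):

```python
# Raw capacity used per 4 MB block: replication (size 3) vs EC (k=2, m=1).
block_mb = 4
replicated_raw = block_mb * 3            # full copy on each of 3 OSDs
k, m = 2, 1                              # RAID5-like: 2 data + 1 parity shard
ec_raw = (k + m) * block_mb / k          # each shard is block/k in size
print(replicated_raw, ec_raw)            # 12 6.0
```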
When you replace a failed OSD, it has to recover all of its PGs, so it
is pretty busy. Is it possible to tell the OSD not to become primary for
any of its already synchronized PGs until every PG of the OSD has
recovered? It should accelerate the rebuild process because the OSD won't
have to
First of all, don't do a Ceph upgrade while your cluster is in a warning or
error state. An upgrade must be done from a clean cluster.
Don't stay with a replica count of 2. The majority of problems come from that
point: just look at the advice given by experienced users on the list. You
should set a
I created 2 users, jack & bob, inside tenant_A.
jack created a bucket named BUCKET_A and wants to give read access to the
user bob.
With s3cmd, I can grant access to a user without a tenant easily: s3cmd setacl
--acl-grant=read:user s3://BUCKET_A
But with an explicit tenant, I tried:
--acl-grant=read:bob
read
> deadlock within librbd.
>
> On Mon, Jan 16, 2017 at 1:12 PM, Vincent Godin <vince.ml...@gmail.com>
> wrote:
> > We are using librbd on a host with CentOS 7.2 via virtio-blk. This server
> > hosts the VMs on which we are doing our tests. But we have exactly th
Ceph version is Jewel 10.2.3
> > Ceph clients, mons and servers have the kernel
> 3.10.0-327.36.3.el7.x86_64
> > on CentOS 7.2
> >
> > 2017-01-13 20:07 GMT+01:00 Jason Dillaman <jdill...@redhat.com>:
> >>
> >> You might be hitting this issue [1] where mk
> You might be hitting this issue [1] where mkfs is issuing lots of
> discard operations. If you get a chance, can you retest w/ the "-E
> nodiscard" option?
>
> Thanks
>
> [1] http://tracker.ceph.com/issues/16689
>
> On Fri, Jan 13, 2017 at 12:5
We are using a production cluster which started on Firefly, then moved to
Giant, Hammer and finally Jewel. So our images have different features
corresponding to the value of "rbd_default_features" in the version when
they were created.
We currently have three sets of features activated:
image
Hello,
I didn't look at your video but I can already give you some leads:
1 - There is a bug in 10.2.2 which makes the client cache not work. The
client cache behaves as if it never received a flush, so it will stay in
writethrough mode. This bug is fixed in 10.2.3.
2 - 2 SSDs in JBOD and 12 x 4TB
Hello,
We had our cluster fail again this morning. It took almost the whole day to
stabilize. Here are some problems we encountered in the OSDs' logs:
*Some OSDs refused to start:*
-1> 2016-11-23 15:50:49.507588 7f5f5b7a5800 -1 osd.27 196774 load_pgs: have
pgid 9.268 at epoch 196874, but missing
Hello,
We now have a full cluster (mons, OSDs & clients) on Jewel 10.2.2 (the
initial release was Hammer 0.94.5) but we still have some big problems in our
production environment:
- some Ceph filesystems are not mounted at startup and we have to mount
them with the "/bin/sh -c 'flock /var/lock/ceph-disk
After a test in a non-production environment, we decided to upgrade our
running cluster to Jewel 10.2.3. Our cluster has 3 monitors and 8 nodes of
20 disks. The cluster is on Hammer 0.94.5 with tunables set to "bobtail".
As the cluster is in production and it wasn't possible to upgrade the Ceph
client
We have an OpenStack deployment which uses Ceph for Cinder and Glance. Ceph
is on the Hammer release and we need to upgrade to Jewel. My question is:
are Hammer clients compatible with Jewel servers? (upgrading the mons and
then the Ceph servers first)
As the upgrade of the Ceph clients needs a reboot of all the
When you increase your pg number, the new PGs will have to peer first, and
during this time they will be unreachable. So you need to put the cluster in
maintenance mode for this operation.
The way to increase the number of PGs and PGPs of a running cluster is:
- First, it's very important to
Hi,
In fact, when you increase your pg number, the new PGs will have to peer
first, and during this time a lot of PGs will be unreachable. The best way to
increase the number of PGs of a cluster (you'll need to adjust the number of
PGPs too) is:
- Don't forget to apply Goncalo's advice to keep
I restarted osd.80 and up to now: no backfill_toofull anymore.
2016-07-25 17:46 GMT+02:00 M Ranga Swami Reddy <swamire...@gmail.com>:
> Can you restart osd.80 and check to see if the recovery proceeds?
>
> Thanks
> Swami
>
> On Mon, Jul 25, 2016 at 9:05 PM, Vincent Godin <vin
The OSD 140 is 73.61% used and its backfill_full_ratio is 0.85 too
-- Forwarded message --
From: Vincent Godin <vince.ml...@gmail.com>
Date: 2016-07-25 17:35 GMT+02:00
Subject: 1
active+undersized+degraded+remapped+wait_backfill+backfill_toofull ???
To: ceph-users@lists.ce
Hi,
I'm facing this problem. The cluster is on Hammer 0.94.5.
When I do a ceph health detail, I can see:
pg 8.c1 is stuck unclean for 21691.555742, current state
active+undersized+degraded+remapped+wait_backfill+backfill_toofull, last
acting [140]
pg 8.c1 is stuck undersized for 21327.027365,
Hello.
I've been testing the Intel 3500 as a journal store for a few HDD-based OSDs.
I stumbled on issues with multiple partitions (>4) and udev (sda5, sda6, etc.
sometimes do not appear after partition creation). And I'm thinking that
partitions are not that useful for OSD management, because Linux does not
allow
Is there now a stable version of Ceph in Hammer and/or Infernalis with
which we can safely use a cache tier in writeback mode?
I saw a post a few months ago saying that we had to wait for a future
release to use it safely.
u learn something new everyday.
>
>
> [1] https://www.mail-archive.com/ceph-users@lists.ceph.com/msg26017.html
> -
> Robert LeBlanc
> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
>
>
> On Wed, Jan 20, 2016 at 7:11 AM, Vincent Godin
Hi,
I need to import a new crush map in production (the old one is the default
one) to define two datacenters and to isolate SSDs from SATA disks. What is
the best way to do this without starting a hurricane on the platform?
Until now, I was just using hosts (SATA OSDs) in one datacenter with the