Hi,
since the upgrade to Luminous 12.2.2 the mons have been complaining about
scrub errors:
2017-12-13 08:49:27.169184 mon.ceph-storage-03 [ERR] scrub mismatch
2017-12-13 08:49:27.169203 mon.ceph-storage-03 [ERR] mon.0
ScrubResult(keys {logm=87,mds_health=13} crc
{logm=4080463437,mds_health=221
The cluster is unusable because of inactive PGs. How can we correct it?
=
ceph pg dump_stuck inactive
ok
PG_STAT STATE               UP           UP_PRIMARY ACTING       ACTING_PRIMARY
1.4b    activating+remapped [5,2,0,13,1] 5          [5,2,13,1,4] 5
1.35    activating+remapped [2,7,0,1,12]
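A reasonable next step (standard ceph CLI; the PG id is taken from the
listing above) is to query one of the stuck PGs and check what peering
is blocked on:
ceph pg 1.4b query
The "recovery_state" section of the output normally names the OSDs the PG
is still waiting for.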
On Wed, Dec 13, 2017 at 9:27 AM, 13605702...@163.com
<13605702...@163.com> wrote:
> hi
>
> since Jewel, cephfs is considered production ready.
> but can anybody tell me which version of ceph is better? Jewel? kraken? or
> Luminous?
>
luminous, version 12.2.2
> thanks
>
I fixed this inconsistency error. It seems Ceph didn't delete a mismatched
object that depended on the deleted snapshot. This caused the "unexpected
clone" error, which resulted in the inconsistency report.
Log:
2017-12-12 20:14:06.651942 7fc7eff7e700 -1 log_channel(cluster) log [ERR] :
deep-scrub 4.1b42
4:42db3
Hello,
We added a new disk to the cluster, and while rebalancing we are seeing
health errors.
=
Overall status: HEALTH_ERR
REQUEST_SLOW: 1824 slow requests are blocked > 32 sec
REQUEST_STUCK: 1022 stuck requests are blocked > 4096 sec
==
The load in the servers seems to
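One hedged way to see where the slow requests are piling up (standard ceph
CLI; the osd id is illustrative):
ceph health detail
ceph daemon osd.0 dump_ops_in_flight
The first command lists the OSDs implicated in REQUEST_SLOW/REQUEST_STUCK;
the second shows the in-flight ops on one of them.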
hi
since Jewel, cephfs is considered production ready.
but can anybody tell me which version of ceph is better? Jewel? kraken? or
Luminous?
thanks
13605702...@163.com
On Tue, Dec 12, 2017 at 3:36 PM wrote:
> From: Gregory Farnum
> Date: Tuesday, 12 December 2017 at 19:24
> To: "Vasilakakos, George (STFC,RAL,SC)"
> Cc: "ceph-users@lists.ceph.com"
> Subject: Re: [ceph-users] Sudden omap growth on some OSDs
>
> On Tue, Dec 12, 2017 at 3:16 AM george.vasilaka.
From: Gregory Farnum
Date: Tuesday, 12 December 2017 at 19:24
To: "Vasilakakos, George (STFC,RAL,SC)"
Cc: "ceph-users@lists.ceph.com"
Subject: Re: [ceph-users] Sudden omap growth on some OSDs
On Tue, Dec 12, 2017 at 3:16 AM, george.vasilaka...@stfc.ac.uk wrote:
On 11 Dec 2017, at 18:2
Hi,
As a follow-up, this PR for librbd seems to be what needs to be applied
to krbd too. As noted in the PR, the bug is readily reproducible following
Jason Dillaman's suggestion.
Regards,
Florian
Florian Margaine writes:
> Hi,
>
> We're hitting an odd issue on our ceph cluster:
>
> - We have mac
On Tue, Dec 12, 2017 at 12:33 PM Nick Fisk wrote:
>
> > That doesn't look like an RBD object -- any idea who is
> > "client.34720596.1:212637720"?
>
> So I think these might be proxy ops from the cache tier, as there are also
> block ops on one of the cache tier OSDs, but this time it actually l
Quoting Nick Fisk (n...@fisk.me.uk):
> Hi All,
>
> Has anyone been testing the bluestore pool compression option?
>
> I have set compression=snappy on a RBD pool. When I add a new bluestore OSD,
> data is not being compressed when backfilling, confirmed by looking at the
> perf dump results. If I
> That doesn't look like an RBD object -- any idea who is
> "client.34720596.1:212637720"?
So I think these might be proxy ops from the cache tier, as there are also
block ops on one of the cache tier OSDs, but this time it actually lists
the object name. Block op on cache tier:
"de
On Tue, Dec 12, 2017 at 8:18 PM, fcid wrote:
> Hello everyone,
>
> We had an incident regarding a client which rebooted after experiencing some
> issues with a ceph cluster.
>
> The other clients who consume RBD images from the same ceph cluster showed
> an error at the time of the reboot in logs r
On Tue, Dec 12, 2017 at 3:16 AM wrote:
>
> On 11 Dec 2017, at 18:24, Gregory Farnum <gfar...@redhat.com> wrote:
>
> Hmm, this does all sound odd. Have you tried just restarting the primary
> OSD yet? That frequently resolves transient oddities like this.
> If not, I'll go poke at the kraken sour
Hello everyone,
We had an incident regarding a client which rebooted after experiencing
some issues with a ceph cluster.
The other clients who consume RBD images from the same ceph cluster
showed an error at the time of the reboot in logs related to libceph.
The errors look like this:
Dec
Jason was more diligent than me and dug enough to realize that we print out
the "raw pg", which we are printing out because we haven't gotten far
enough in the pipeline to decode the actual object name. You'll note that
it ends with the same characters as the PG does, and unlike a pgid, the raw
pg
That doesn't look like an RBD object -- any idea who is
"client.34720596.1:212637720"?
On Tue, Dec 12, 2017 at 12:36 PM, Nick Fisk wrote:
> Does anyone know what this object (0.ae78c1cf) might be, it's not your
> normal run of the mill RBD object and I can't seem to find it in the pool
> using ra
On Tue, Dec 12, 2017 at 9:37 AM Nick Fisk wrote:
> Does anyone know what this object (0.ae78c1cf) might be, it's not your
> normal run of the mill RBD object and I can't seem to find it in the pool
> using rados --all ls . It seems to be leaving the 0.1cf PG stuck in an
> activating+remapped stat
We have a project using cephfs (ceph-fuse) in kubernetes containers. For
us the throughput was limited by the mount point and not the cluster.
Serving the containers through a single shared mount point would cap them
at the throughput of that one mount point, so we ended up mounting cephfs
inside the containers.
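For anyone trying the same approach, a ceph-fuse mount inside the container
looks roughly like this (monitor address, client id and mount point are
illustrative):
ceph-fuse -m mon-host:6789 --id admin /mnt/cephfs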
Hi All,
Has anyone been testing the bluestore pool compression option?
I have set compression=snappy on a RBD pool. When I add a new bluestore OSD,
data is not being compressed when backfilling, confirmed by looking at the
perf dump results. If I then set again the compression type on the pool to
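For reference, pool-level compression in Luminous is controlled through the
pool options; assuming the pool is simply named rbd, the settings described
above would be:
ceph osd pool set rbd compression_algorithm snappy
ceph osd pool set rbd compression_mode aggressive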
Does anyone know what this object (0.ae78c1cf) might be, it's not your
normal run of the mill RBD object and I can't seem to find it in the pool
using rados --all ls . It seems to be leaving the 0.1cf PG stuck in an
activating+remapped state and blocking IO. Pool 0 is just a pure RBD pool
with a ca
Hello, everyone!
We have recently started to use CephFS (Jewel, v12.2.1) from a few LXD
containers. We have mounted it on the host servers and then exposed it in
the LXD containers.
Do you have any recommendations (dos and don'ts) on this way of using
CephFS?
Thank you, in advance!
Kind regards
Hi,
We're hitting an odd issue on our ceph cluster:
- We have machine1 mapping an exclusive-lock RBD.
- Machine2 wants to take a snapshot of the RBD, but fails to take the lock.
Stracing the rbd snap process on machine2 shows it looping on sending
"lockget" commands, without ever moving forward.
Hi,
My ceph cluster has an inconsistent PG. I tried to deep-scrub and repair
the PG, but that did not fix the problem.
I found that the object making the PG inconsistent depends on a snapshot
(snap id 2ccac = 183468) of an image. I deleted that snapshot, then queried
the inconsistent PG and it showed empty, but my ceph
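For reference, the usual commands for inspecting and repairing an
inconsistent PG look like this (the PG id is illustrative):
rados list-inconsistent-obj 4.1b42 --format=json-pretty
ceph pg repair 4.1b42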
On 12/12/2017 02:18 PM, David Turner wrote:
I always back up my crush map. Someone making a mistake to the crush map
will happen and being able to restore last night's crush map has been
wonderful. That's all I really back up.
Yes, that's what I would suggest as well. Just have a daily CRO
I always back up my crush map. Someone making a mistake to the crush map
will happen and being able to restore last night's crush map has been
wonderful. That's all I really back up.
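For reference, backing up and restoring the CRUSH map is one command each
(file names are illustrative):
ceph osd getcrushmap -o crush.bin
crushtool -d crush.bin -o crush.txt
ceph osd setcrushmap -i crush.bin
The crushtool step just keeps a human-readable copy alongside the binary map.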
On Tue, Dec 12, 2017, 5:53 AM Wolfgang Lendl <
wolfgang.le...@meduniwien.ac.at> wrote:
> hello,
>
> I'm looking fo
To delete objects quickly, I set up a multi-threaded Python script, but
then I learned about --bypass-gc, so I've been trying to use that
instead of putting all of the objects into the GC to be deleted. Deleting
with radosgw-admin is not multi-threaded.
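For reference, the flag is passed straight to the bucket removal, e.g. for
the bucket named earlier in this thread:
radosgw-admin bucket rm --bucket=bucket-3 --purge-objects --bypass-gc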
On Tue, Dec 12, 2017, 5:43 AM Rafał Wądoł
Thank you very much! I feel optimistic that I now have what I need to get
things back working again.
I'll report back...
Best regards,
Tobi
On 12/12/2017 02:08 PM, Yan, Zheng wrote:
On Tue, Dec 12, 2017 at 8:29 PM, Tobias Prousa wrote:
Hi Zheng,
the more you tell me the more what I se
On Tue, Dec 12, 2017 at 8:29 PM, Tobias Prousa wrote:
> Hi Zheng,
>
> the more you tell me the more what I see begins to makes sens to me. Thank
> you very much.
>
> Could you please be a little more verbose about how to use rados rmomapkey?
> What to use for <object> and what to use for <key>? Here is what m
Hi!
(By the way, a second bucket now has this problem; it apparently occurs
when automatic resharding commences while data is being written to
the bucket.)
On 12.12.17 at 09:53, Orit Wasserman wrote:
On Mon, Dec 11, 2017 at 11:45 AM, Martin Emrich
wrote:
This is after resharding th
Hi Zheng,
the more you tell me, the more what I see begins to make sense to me. Thank
you very much.
Could you please be a little more verbose about how to use rados rmomapkey?
What to use for <object> and what to use for <key>? Here is what my
dir_frag looks like:
{
"damage_type": "dir_frag"
On Tue, Dec 12, 2017 at 4:22 PM, Tobias Prousa wrote:
> Hi there,
>
> regarding my ML post from yesterday (Upgrade from 12.2.1 to 12.2.2 broke my
> CephFs) I was able to get a little further with the suggested
> "cephfs-table-tool take_inos ". This made the whole issue with
> loads of "falsely fre
On 11 Dec 2017, at 18:24, Gregory Farnum <gfar...@redhat.com> wrote:
Hmm, this does all sound odd. Have you tried just restarting the primary OSD
yet? That frequently resolves transient oddities like this.
If not, I'll go poke at the kraken source and one of the developers more
familiar
hello,
I'm looking for a recommendation about what parts/configuration/etc. to
back up from a ceph cluster in case of a disaster.
I know this depends heavily on the type of disaster and I'm not talking
about backup of payload stored on osds.
currently I have my admin key stored somewhere outside th
Doh!
The activate command needs the *osd* fsid, not the cluster fsid.
So this works:
ceph-volume lvm activate 0 6608c0cf-3827-4967-94fd-5a3336f604c3
Is an "activate-all" equivalent planned?
-- Dan
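For anyone hitting the same thing, the osd fsid can be read from the
inventory output (look for the "osd fsid" line under the OSD in question):
ceph-volume lvm list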
On Tue, Dec 12, 2017 at 11:35 AM, Dan van der Ster wrote:
> Hi all,
>
> Did anyone successful
Hi,
Is there any known fast procedure to delete objects in large buckets? I
have about 40 million objects. I used:
radosgw-admin bucket rm --bucket=bucket-3 --purge-objects
but it is very slow. I am using ceph luminous (12.2.1).
Is it working in parallel?
--
BR,
Rafał Wądołowski
Hi all,
Did anyone successfully prepare a new OSD with ceph-volume in 12.2.2?
We are trying the simplest thing possible and not succeeding :(
# ceph-volume lvm prepare --bluestore --data /dev/sdb
# ceph-volume lvm list
====== osd.0 ======
[block]
/dev/ceph-4da6fd06-b069-49af-901f-c9513b
Hi,
On Mon, Dec 11, 2017 at 11:45 AM, Martin Emrich
wrote:
> Hi!
>
> On 10.12.17, 11:54, "Orit Wasserman" wrote:
>
> Hi Martin,
>
> On Thu, Dec 7, 2017 at 5:05 PM, Martin Emrich
> wrote:
>
> It could be this issue: http://tracker.ceph.com/issues/21619
> The workaround is running r
Hi there,
regarding my ML post from yesterday (Upgrade from 12.2.1 to 12.2.2 broke
my CephFs) I was able to get a little further with the suggested
"cephfs-table-tool take_inos ". This made the whole issue with
loads of "falsely free-marked inodes" go away.
I then restarted MDS, kept all cli
On Mon, Dec 11, 2017 at 5:44 PM, Sam Wouters wrote:
> On 11-12-17 16:23, Orit Wasserman wrote:
>> On Mon, Dec 11, 2017 at 4:58 PM, Sam Wouters wrote:
>>> Hi Orrit,
>>>
>>>
>>> On 04-12-17 18:57, Orit Wasserman wrote:
Hi Andreas,
On Mon, Dec 4, 2017 at 11:26 AM, Andreas Calminder