On Wed, Jul 18, 2018 at 2:57 AM, Troy Ablan wrote:
> I was on 12.2.5 for a couple weeks and started randomly seeing
> corruption, moved to 12.2.6 via yum update on Sunday, and all hell broke
> loose. I panicked and moved to Mimic, and when that didn't solve the
> problem, only then did I start
Your issue is different: not only do the omap digests of all the
replicas fail to match the omap digest from the auth object info, they
also all differ from each other.
What is the min_size of pool 67, and what can you tell us about the
events leading up to this?
On Mon, Jul 16, 2018 at 7:06 PM,
Ceph doesn't shut down systems, as in kill or reboot the box, if
that's what you're asking.
On Mon, Jul 23, 2018 at 5:04 PM, Nicolas Huillard wrote:
> On Monday, 23 July 2018 at 11:07 +0700, Konstantin Shalygin wrote:
>> > I don't even have a fancy kernel or device, just plain standard Debian.
>> >
On Thu, Jul 19, 2018 at 12:47 PM, Troy Ablan wrote:
>
>
> On 07/18/2018 06:37 PM, Brad Hubbard wrote:
>> On Thu, Jul 19, 2018 at 2:48 AM, Troy Ablan wrote:
>>>
>>>
>>> On 07/17/2018 11:14 PM, Brad Hubbard wrote:
>>>>
>>>>
On Wed, Jul 4, 2018 at 6:26 PM, Benjamin Naber wrote:
> Hi @all,
>
> I'm currently testing a setup for a production environment based on the
> following OSD nodes:
>
> CEPH Version: luminous 12.2.5
>
> 5x OSD Nodes with following specs:
>
> - 8-core Intel Xeon, 2.0 GHz
>
> - 96 GB RAM
>
> - 10x
rnel
exhibiting the problem.
>
> kind regards
>
> Ben
>
>> Brad Hubbard hat am 5. Juli 2018 um 01:16 geschrieben:
>>
>>
>> On Wed, Jul 4, 2018 at 6:26 PM, Benjamin Naber
>> wrote:
>> > Hi @all,
>> >
>> > I'm currently testing a setup for
On Sun, Jan 14, 2018 at 4:41 AM, Dyweni - Ceph-Users <6exbab4fy...@dyweni.com>
wrote:
> Hi,
>
> GLIBC 2.25-r9
> GCC 6.4.0-r1
>
> When compiling Ceph 12.2.2, the compilation hangs (cc1plus goes into an
> infinite loop and never finishes, requiring the process to be killed
> manually) while
On Wed, Jan 17, 2018 at 2:20 AM, Nikos Kormpakis <nk...@noc.grnet.gr> wrote:
> On 01/16/2018 12:53 AM, Brad Hubbard wrote:
>> On Tue, Jan 16, 2018 at 1:35 AM, Alexander Peters <apet...@sphinx.at> wrote:
>>> i created the dump output but it looks very cryptic to me so
On Mon, Jan 22, 2018 at 10:37 PM, Hüseyin Atatür YILDIRIM <
hyildi...@havelsan.com.tr> wrote:
>
> Hi again,
>
>
>
> In the “journalctl -xe” output:
>
>
>
> Jan 22 15:29:18 mon02 ceph-osd-prestart.sh[1526]: OSD data directory
> /var/lib/ceph/osd/ceph-1 does not exist; bailing out.
>
>
>
> Also in
On Thu, Mar 8, 2018 at 1:22 AM, Harald Staub wrote:
> "ceph pg repair" leads to:
> 5.7bd repair 2 errors, 0 fixed
>
> Only an empty list from:
> rados list-inconsistent-obj 5.7bd --format=json-pretty
>
> Inspired by http://tracker.ceph.com/issues/12577 , I tried again with
On Thu, Mar 8, 2018 at 5:01 PM, 赵贺东 wrote:
> Hi All,
>
> Every time after we activate an OSD, we get “Structure needs cleaning” in
> /var/lib/ceph/osd/ceph-xxx/current/meta.
>
>
> /var/lib/ceph/osd/ceph-xxx/current/meta
> # ls -l
> ls: reading directory .: Structure needs
On Thu, Mar 8, 2018 at 7:33 PM, 赵赵贺东 <zhaohed...@gmail.com> wrote:
> Hi Brad,
>
> Thank you for your attention.
>
>> On 8 March 2018, at 4:47 PM, Brad Hubbard <bhubb...@redhat.com> wrote:
>>
>> On Thu, Mar 8, 2018 at 5:01 PM, 赵贺东 <zhaohed...@gmail.com> wrote:
On Fri, Mar 9, 2018 at 3:54 AM, Subhachandra Chandra
wrote:
> I noticed a similar crash too. Unfortunately, I did not get much info in the
> logs.
>
> *** Caught signal (Segmentation fault) **
>
> Mar 07 17:58:26 data7 ceph-osd-run.sh[796380]: in thread 7f63a0a97700
>
On Tue, Mar 6, 2018 at 5:26 PM, Marco Baldini - H.S. Amiata <
mbald...@hsamiata.it> wrote:
> Hi
>
> I monitor dmesg on each of the 3 nodes; no hardware issues are reported. And
> the problem happens with various different OSDs in different nodes, so to me
> it is clear it's not a hardware problem.
>
debug_osd that is... :)
On Tue, Mar 6, 2018 at 7:10 PM, Brad Hubbard <bhubb...@redhat.com> wrote:
>
>
> On Tue, Mar 6, 2018 at 5:26 PM, Marco Baldini - H.S. Amiata <
> mbald...@hsamiata.it> wrote:
>
>> Hi
>>
>> I monitor dmesg in ea
method of http://docs.ceph.com/docs/jewel/rados/operations/add-or-rm-mons/
> (with id controller02)
>
> The logs provided are when the controller02 was added with the manual
> method.
>
> But the controller02 won't join the cluster
>
> Hope It helps understand
>
>
>
See the thread in this very ML titled "Ceph iSCSI is a prank?", last update
thirteen days ago.
If your questions are not answered by that thread let us know.
Please also remember that CentOS is not the only platform that ceph runs on
by a long shot, and that not all distros lag as much as it does (not
"NOTE: a copy of the executable, or `objdump -rdS ` is
needed to interpret this."
Have you ever wondered what this means and why it's there? :)
This is at least something you can try. It may provide useful
information, or it may not.
This stack looks like it is either corrupted, or possibly not in
d it probably will be correct again in the near future and, if
not, we can review and correct it as necessary.
> There is some confusion between the documentation's minimal
> requirements, what the dashboard suggests it is able to do, and what I read
> around about modded Ceph for ot
On Tue, Mar 27, 2018 at 9:46 PM, Brad Hubbard <bhubb...@redhat.com> wrote:
>
>
> On Tue, Mar 27, 2018 at 9:12 PM, Max Cuttins <m...@phoenixweb.it> wrote:
>
>> Hi Brad,
>>
>> that post was mine. I knew it quite well.
>>
> That Post was about
t us started. Getting late here for me so I'll
take a look at this tomorrow.
Thanks!
>
> http://tracker.ceph.com/issues/23431
>
> Maybe Oliver has something to add as well.
>
>
> Dietmar
>
>
> On 03/27/2018 11:37 AM, Brad Hubbard wrote:
>> "NOTE: a copy o
On Wed, Mar 28, 2018 at 6:53 PM, Max Cuttins <m...@phoenixweb.it> wrote:
> On 27/03/2018 13:46, Brad Hubbard wrote:
>
>
>
> On Tue, Mar 27, 2018 at 9:12 PM, Max Cuttins <m...@phoenixweb.it> wrote:
>>
>> Hi Brad,
>>
>> that post was mi
I'm not sure I completely understand your "test". What exactly are you
trying to achieve and what documentation are you following?
On Fri, Mar 30, 2018 at 10:49 PM, Julien Lavesque
<julien.laves...@objectif-libre.com> wrote:
> Brad,
>
> Thanks for your answer
>
> On
Can you update with the result of the following commands from all of the MONs?
# ceph --admin-daemon /var/run/ceph/ceph-mon.[whatever].asok mon_status
# ceph --admin-daemon /var/run/ceph/ceph-mon.[whatever].asok quorum_status
On Thu, Mar 29, 2018 at 3:11 PM, Gauvain Pocentek
"name": "controller03",
"addr": "172.18.8.7:6789\/0"
}
]
}
}
In the monmaps we are called 'controller02', not 'mon.controller02'.
These names need to be identical.
On Thu, Mar 29, 2018 at 7:23 PM, Julien Lav
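To inspect the names actually recorded in the monmap yourself, something
along these lines works (run from any node with cluster access):

```shell
# Fetch the cluster's current monmap and print it; each mon's
# name and address pair is listed in the output.
ceph mon getmap -o /tmp/monmap
monmaptool --print /tmp/monmap
```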
On Fri, Mar 2, 2018 at 3:54 PM, Alex Gorbachev wrote:
> On Thu, Mar 1, 2018 at 10:57 PM, David Turner wrote:
>> Blocked requests and slow requests are synonyms in ceph. They are 2 names
>> for the exact same thing.
>>
>>
>> On Thu, Mar 1, 2018,
provide from the time leading up to when the issue was first seen?
>
> Cheers
>
> Andrei
> - Original Message -
>> From: "Brad Hubbard"
>> To: "Andrei Mikhailovsky"
>> Cc: "ceph-users"
>> Sent: Thursday, 28 June, 2018 01:
What does "rados list-inconsistent-obj " say?
Note that you may have to do a deep scrub to populate the output.
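For reference, the sequence is roughly the following (the pgid is a
placeholder for the inconsistent pg):

```shell
# A deep scrub repopulates the inconsistency report; query it once
# the scrub has completed.
ceph pg deep-scrub <pgid>
rados list-inconsistent-obj <pgid> --format=json-pretty
```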
On Mon, Nov 12, 2018 at 5:10 AM K.C. Wong wrote:
>
> Hi folks,
>
> I would appreciate any pointer as to how I can resolve a
> PG stuck in “active+clean+inconsistent” state. This has
>
What do you get if you send "help" (without quotes) to
majord...@vger.kernel.org ?
On Sun, Nov 11, 2018 at 10:15 AM Cranage, Steve <
scran...@deepspacestorage.com> wrote:
> Can anyone tell me the secret? A colleague tried and failed many times so
> I tried and got this:
>
>
>
>
>
> Steve
C. Wong
>> kcw...@verseon.com
>> M: +1 (408) 769-8235
>>
>> -
>> Confidentiality Notice:
>> This message contains confidential information. If you are not the
>> intended recipient and received this message
> Clearly, on osd.67, the “attrs” array is empty. The question is,
> how do I fix this?
>
> Many thanks in advance,
>
> -kc
>
> K.C. Wong
> kcw...@verseon.com
> M: +1 (408) 769-8235
>
> -
> Confidentiality Notice:
&
On Tue, Sep 25, 2018 at 11:31 PM Josh Haft wrote:
>
> Hi cephers,
>
> I have a cluster of 7 storage nodes with 12 drives each and the OSD
> processes are regularly crashing. All 84 have crashed at least once in
> the past two days. Cluster is Luminous 12.2.2 on CentOS 7.4.1708,
> kernel version
On Tue, Sep 25, 2018 at 7:50 PM Sergey Malinin wrote:
>
> # rados list-inconsistent-obj 1.92
> {"epoch":519,"inconsistents":[]}
It's likely the epoch has changed since the last scrub and you'll need
to run another scrub to repopulate this data.
>
> Septem
http://www.brendangregg.com/blog/2017-08-08/linux-load-averages.html
should still be current enough and makes good reading on the subject.
On Mon, Jan 21, 2019 at 8:46 PM Stijn De Weirdt wrote:
>
> hi marc,
>
> > - how to prevent the D state process to accumulate so much load?
> you can't. in
On Tue, Dec 18, 2018 at 10:23 AM Mike O'Connor wrote:
>
> Hi All
>
> I have a ceph cluster which has been working with out issues for about 2
> years now, it was upgrade about 6 month ago to 10.2.11
>
> root@blade3:/var/lib/ceph/mon# ceph status
> 2018-12-18 10:42:39.242217 7ff770471700 0 --
Can you provide the complete OOM message from the dmesg log?
On Sat, Dec 22, 2018 at 7:53 AM Pardhiv Karri wrote:
>
>
> Thank You for the quick response Dyweni!
>
> We are using FileStore as this cluster is upgraded from
> Hammer-->Jewel-->Luminous 12.2.8. 16x2TB HDD per node for all nodes.
https://ceph.com/wp-content/uploads/2016/08/weil-crush-sc06.pdf
On Thu, Dec 6, 2018 at 8:11 PM Leon Robinson wrote:
>
> The most important thing to remember about CRUSH is that the H stands for
> hashing.
>
> If you hash the same object you're going to get the same result.
>
> e.g. cat
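Leon's point about determinism can be sketched in a few lines of Python.
This is only an illustration of "same input, same placement"; it is not
the real CRUSH/rjenkins algorithm, and all names and parameters here are
made up:

```python
import hashlib

def toy_place(object_name, num_pgs=64, num_osds=12, replicas=3):
    """Toy stand-in for CRUSH placement: hash the object name to pick a
    placement group, then derive an ordered, stable set of OSDs from the
    pg id. Real CRUSH uses the rjenkins hash plus the cluster map; this
    only demonstrates that hashing the same object always gives the
    same result."""
    def h(s):
        return int(hashlib.md5(s.encode()).hexdigest(), 16)

    pg = h(object_name) % num_pgs
    osds = []
    attempt = 0
    while len(osds) < replicas:
        osd = h(f"{pg}:{attempt}") % num_osds
        if osd not in osds:  # skip collisions, much as CRUSH retries
            osds.append(osd)
        attempt += 1
    return pg, osds

# Same name in, same placement out -- every time, on every host:
assert toy_place("cat") == toy_place("cat")
print(toy_place("cat"))
```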
On Fri, Jan 11, 2019 at 12:20 AM Rom Freiman wrote:
>
> Hey,
> After upgrading to centos7.6, I started encountering the following kernel
> panic
>
> [17845.147263] XFS (rbd4): Unmounting Filesystem
> [17846.860221] rbd: rbd4: capacity 3221225472 features 0x1
> [17847.109887] XFS (rbd4): Mounting
same setup, you might be hitting the same
> bug.
Thanks for that Jason, I wasn't aware of that bug. I'm interested to
see the details.
>
> On Thu, Jan 10, 2019 at 6:46 PM Brad Hubbard wrote:
> >
> > On Fri, Jan 11, 2019 at 12:20 AM Rom Freiman wrote:
> > >
>
Haha, in the email thread he says CentOS but the bug is opened against RHEL :P
Is it worth recommending a fix in skb_can_coalesce() upstream so other
modules don't hit this?
On Fri, Jan 11, 2019 at 7:39 PM Ilya Dryomov wrote:
>
> On Fri, Jan 11, 2019 at 1:38 AM Brad Hubbard
On Fri, Jan 11, 2019 at 8:58 PM Rom Freiman wrote:
>
> Same kernel :)
Not exactly the point I had in mind, but sure ;)
>
>
> On Fri, Jan 11, 2019, 12:49 Brad Hubbard wrote:
>>
>> Haha, in the email thread he says CentOS but the bug is opened against RHEL
>>
Nautilus will make this easier.
https://github.com/ceph/ceph/pull/18096
On Thu, Jan 3, 2019 at 5:22 AM Bryan Stillwell wrote:
>
> Recently on one of our bigger clusters (~1,900 OSDs) running Luminous
> (12.2.8), we had a problem where OSDs would frequently get restarted while
>
Are you using filestore or bluestore on the OSDs? If filestore what is
the underlying filesystem?
You could try setting debug_osd and debug_filestore to 20 and see if
that gives some more info?
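For example, something along these lines for a single OSD (high debug
levels log heavily, so revert once you have captured the event):

```shell
# Raise logging on osd.0, reproduce the problem, then restore defaults.
ceph tell osd.0 injectargs '--debug_osd 20 --debug_filestore 20'
# ... reproduce the issue, collect /var/log/ceph/ceph-osd.0.log ...
ceph tell osd.0 injectargs '--debug_osd 1/5 --debug_filestore 1/5'
```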
On Wed, Sep 19, 2018 at 12:36 PM fatkun chan wrote:
>
>
> ceph version 12.2.5
On Thu, Mar 21, 2019 at 12:11 AM Glen Baars wrote:
>
> Hello Ceph Users,
>
>
>
> Does anyone know what the flag point ‘Started’ is? Is that ceph osd daemon
> waiting on the disk subsystem?
This is set by "mark_started()" and roughly marks the point when the pg
starts processing the op. Might want to
Actually, the lag is between "sub_op_committed" and "commit_sent". Is
there any pattern to these slow requests? Do they involve the same
osd, or set of osds?
On Thu, Mar 21, 2019 at 3:37 PM Brad Hubbard wrote:
>
> On Thu, Mar 21, 2019 at 3:20 PM Glen Baars
> wrote:
>
> Does anyone know what that section is waiting for?
Hi Glen,
These are documented, to some extent, here.
http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-osd/
It looks like it may be taking a long time to communicate the commit
message back to the client? Are these sl
It would help to know what version you are running but, to begin with,
could you post the output of the following?
$ sudo ceph pg 10.2a query
$ sudo rados list-inconsistent-obj 10.2a --format=json-pretty
Also, have a read of
"last_epoch_clean": 20840,
> "parent": "0.0",
> "parent_split_bits": 0,
> "last_scrub": "21395'11835365",
> "last_scrub_stamp": "20
Do a "ps auwwx" to see how a running monitor was started and use the
equivalent command to try to start the MON that won't start. "ceph-mon
--help" will show you what you need. Most important is to get the ID
portion right and to add "-d" to get it to run in the foreground and
log to stdout. HTH
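Concretely, that might look like this (the mon id "controller02" is just
an example; use whatever id your monitor was deployed with):

```shell
# See how the working monitors were started on another node.
ps auwwx | grep ceph-mon
# Then start the broken one by hand: -i gives the mon id, -d keeps it
# in the foreground and sends the log to stdout.
ceph-mon -i controller02 -d
```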
If you want to do containers at the same time, or transition some/all
to containers at some point in future maybe something based on
kubevirt [1] would be more futureproof?
[1] http://kubevirt.io/
CNV is an example,
https://www.redhat.com/en/resources/container-native-virtualization
On Sat, Apr
ed+inconsistent+peering, and the other peer is active+clean+inconsistent
Per the document I linked previously if a pg remains remapped you
likely have a problem with your configuration. Take a good look at
your crushmap, pg distribution, pool configuration, etc.
>
>
> On Wed, Mar 27, 2019 at 4:1
{
> "osd": "7",
> "status": "not queried"
> },
> {
> "osd": "8",
> "status": "already probed"
> },
>
ther OSDs appear to be ok, I see
> them up and in, why do you see something wrong?
>
> On Mon, Mar 25, 2019 at 4:00 PM Brad Hubbard wrote:
>>
>> Hammer is no longer supported.
>>
>> What's the status of osds 7 and 17?
>>
>> On Tue, Mar 26, 2019 at 8:56 A
https://bugzilla.redhat.com/show_bug.cgi?id=1662496
On Wed, Mar 27, 2019 at 5:00 AM Andrew J. Hutton
wrote:
>
> More or less followed the install instructions with modifications as
> needed; but I'm suspecting that either a dependency was missed in the
> F29 package or something else is up. I
+Jos Collin
On Thu, Mar 7, 2019 at 9:41 AM Milanov, Radoslav Nikiforov
wrote:
> Can someone elaborate on
>
>
>
> From http://tracker.ceph.com/issues/38122
>
>
>
> Which exactly package is missing?
>
> And why is this happening ? In Mimic all dependencies are resolved by yum?
>
> - Rado
>
>
>
You could try reading the data from this object and writing it back
using rados get, then rados put.
On Fri, Mar 8, 2019 at 3:32 AM Herbert Alexander Faleiros
wrote:
>
> On Thu, Mar 07, 2019 at 01:37:55PM -0300, Herbert Alexander Faleiros wrote:
> > Hi,
> >
> > # ceph health detail
> > HEALTH_ERR
On Tue, Mar 19, 2019 at 7:54 PM Zhenshi Zhou wrote:
>
> Hi,
>
> I mount cephfs on my client servers. Some of the servers mount without any
> error whereas others don't.
>
> The error:
> # ceph-fuse -n client.kvm -m ceph.somedomain.com:6789 /mnt/kvm -r /kvm -d
> 2019-03-19 17:03:29.136
On Fri, Mar 8, 2019 at 4:46 AM Samuel Taylor Liston wrote:
>
> Hello All,
> I have recently had 32 large map objects appear in my default.rgw.log
> pool. Running luminous 12.2.8.
>
> Not sure what to think about these.I’ve done a lot of reading
> about how when these
21 16:51:56.862447",
> "age": 376.527241,
> "duration": 1.331278,
>
> Kind regards,
> Glen Baars
>
> -Original Message-
> From: Brad Hubbard
> Sent: Thursday, 21 March 2019 1:43 PM
> To: Glen Baars
> Cc: cep
Try capturing another log with debug_ms turned up. 1 or 5 should be OK
to start with.
On Fri, Feb 8, 2019 at 8:37 PM Massimo Sgaravatto
wrote:
>
> Our Luminous ceph cluster have been worked without problems for a while, but
> in the last days we have been suffering from continuous slow
On Sun, Feb 10, 2019 at 1:56 AM Ruben Rodriguez wrote:
>
> Hi there,
>
> Running 12.2.11-1xenial on a machine with 6 SSD OSD with bluestore.
>
> Today we had two disks fail out of the controller, and after a reboot
> they both seemed to come back fine but ceph-osd was only able to start
> in one
>
> 2019-02-09 07:35:14.627462 7f99972cc700 1 -- 192.168.222.204:6804/4159520
> <== osd.5 192.168.222.202:6816/157436 2527
> osd_repop(client.171725953.0:404377591 8.9b e1205833/1205735) v2
> 1050+0+123635 (1225076790 0 171428115) 0x5610f5128a00 con 0x5610fc5bf000
> 2019-02-0
A single OSD should be expendable and you should be able to just "zap"
it and recreate it. Was this not true in your case?
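The zap-and-recreate cycle is roughly as follows (osd id 5 and /dev/sdb
are placeholders; this assumes a bluestore OSD deployed with
ceph-volume, so adjust for your deployment method):

```shell
# Take the OSD out, stop it, remove it from the cluster maps,
# wipe the device, and redeploy it as a fresh OSD.
ceph osd out 5
systemctl stop ceph-osd@5
ceph osd purge 5 --yes-i-really-mean-it
ceph-volume lvm zap /dev/sdb --destroy
ceph-volume lvm create --data /dev/sdb
```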
On Wed, Feb 13, 2019 at 1:27 AM Ruben Rodriguez wrote:
>
>
>
> On 2/9/19 5:40 PM, Brad Hubbard wrote:
> > On Sun, Feb 10, 2019 at 1:
rong/misconfigured with the new switch: we
> would try to replicate the problem, possibly without a ceph deployment ...
>
> Thanks again for your help !
>
> Cheers, Massimo
>
> On Sun, Feb 10, 2019 at 12:07 AM Brad Hubbard wrote:
>>
>> The log ends at
>>
>>
Let's try to restrict discussion to the original thread
"backfill_toofull while OSDs are not full" and get a tracker opened up
for this issue.
On Sat, Feb 2, 2019 at 11:52 AM Fyodor Ustinov wrote:
>
> Hi!
>
> Right now, after adding OSD:
>
> # ceph health detail
> HEALTH_ERR 74197563/199392333
On Tue, Apr 16, 2019 at 7:38 AM solarflow99 wrote:
>
> Then why doesn't this work?
>
> # ceph tell 'osd.*' injectargs '--osd-recovery-max-active 4'
> osd.0: osd_recovery_max_active = '4' (not observed, change may require
> restart)
> osd.1: osd_recovery_max_active = '4' (not observed, change may
puzzled why it doesn't show any change when I run this no matter
> what I set it to:
>
> # ceph -n osd.1 --show-config | grep osd_recovery_max_active
> osd_recovery_max_active = 3
>
> in fact it doesn't matter if I use an OSD number that doesn't exist, same
> thing if I use c
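`--show-config` reports the configuration as computed for a new client
process (compiled-in defaults plus ceph.conf), not the running daemon's
live values, which is why a nonexistent OSD id gives the same output. To
see what a running daemon is actually using, ask it over its admin
socket, e.g.:

```shell
# Run on the node hosting the OSD; reads the daemon's runtime value.
ceph daemon osd.1 config get osd_recovery_max_active
```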
:15 libceph-common.so ->
> libceph-common.so.0
> -rwxr-xr-x. 1 root root 211853400 Apr 17 11:15 libceph-common.so.0
>
>
>
>
> Best,
> Can Zhang
>
> On Thu, Apr 18, 2019 at 7:00 AM Brad Hubbard wrote:
> >
> > On Wed, Apr 17, 2019 at 1:37 PM Can Zhang w
On Wed, Apr 17, 2019 at 1:37 PM Can Zhang wrote:
>
> Thanks for your suggestions.
>
> I tried to build libfio_ceph_objectstore.so, but it fails to load:
>
> ```
> $ LD_LIBRARY_PATH=./lib ./bin/fio --enghelp=libfio_ceph_objectstore.so
>
> fio: engine libfio_ceph_objectstore.so not loadable
> IO
> Notice the "U" and "V" from nm results.
>
>
>
>
> Best,
> Can Zhang
>
> On Thu, Apr 18, 2019 at 9:36 AM Brad Hubbard wrote:
> >
> > Does it define _ZTIN13PriorityCache8PriCacheE ? If it does, and all is
> > as you say, then it
relating to the clearing in mon, mgr, or osd logs.
> >
> > So, not entirely sure what fixed it, but it is resolved on its own.
> >
> > Thanks,
> >
> > Reed
> >
> > On Apr 30, 2019, at 8:01 PM, Brad Hubbard wrote:
> >
> > On Wed, May 1, 2019 at
On Wed, May 1, 2019 at 10:54 AM Brad Hubbard wrote:
>
> Which size is correct?
Sorry, accidental discharge =D
If the object info size is *incorrect* try forcing a write to the OI
with something like the following.
1. rados -p [name_of_pool_17] setomapval 10008536718.
tempora
Which size is correct?
On Tue, Apr 30, 2019 at 1:06 AM Reed Dier wrote:
>
> Hi list,
>
> Woke up this morning to two PG's reporting scrub errors, in a way that I
> haven't seen before.
>
> $ ceph versions
> {
> "mon": {
> "ceph version 13.2.5
If you can give me specific steps so I can reproduce this
from a freshly cloned tree I'd be happy to look further into it.
Good luck.
On Thu, Apr 18, 2019 at 7:00 PM Brad Hubbard wrote:
>
> Let me try to reproduce this on centos 7.5 with master and I'll let
> you know how I go.
>
>
I'd suggest creating a tracker similar to
http://tracker.ceph.com/issues/40554 which was created for the issue
in the thread you mentioned.
On Wed, Jul 3, 2019 at 12:29 AM Vandeir Eduardo
wrote:
>
> Hi,
>
> on client machines, when I use the command rbd, for example, rbd ls
> poolname, this
On Thu, Jun 27, 2019 at 8:58 PM nokia ceph wrote:
>
> Hi Team,
>
> We have a requirement to create multiple copies of an object and currently we
> are handling it in client side to write as separate objects and this causes
> huge network traffic between client and cluster.
> Is there
> application is responsible for any locking needed.
> -Greg
>
> On Tue, Jul 2, 2019 at 3:49 AM Brad Hubbard wrote:
> >
> > Yes, this should be possible using an object class which is also a
> > RADOS client (via the RADOS API). You'll still have some client
> >
>
> Best,
> Can Zhang
>
>
> On Fri, Apr 19, 2019 at 6:28 PM Brad Hubbard wrote:
> >
> > OK. So this works for me with master commit
> > bdaac2d619d603f53a16c07f9d7bd47751137c4c on Centos 7.5.1804.
> >
> > I cloned the repo and ran './install-deps.sh'
On Tue, Apr 16, 2019 at 6:03 PM Paul Emmerich wrote:
>
> This works, it just says that it *might* require a restart, but this
> particular option takes effect without a restart.
We've already looked at changing the wording once to make it more palatable.
http://tracker.ceph.com/issues/18424
>
>>> Thank you for your response, and we will check this video as well.
>>> Our requirement is that, while writing an object into the cluster, if we can
>>> provide the number of copies to be made, the network consumption between
>>> client and cluster will be only for one object write.
On Thu, Aug 15, 2019 at 2:09 AM Troy Ablan wrote:
>
> Paul,
>
> Thanks for the reply. All of these seemed to fail except for pulling
> the osdmap from the live cluster.
>
> -Troy
>
> -[~:#]- ceph-objectstore-tool --op get-osdmap --data-path
> /var/lib/ceph/osd/ceph-45/ --file osdmap45
>
Could you create a tracker for this?
Also, if you can reproduce this could you gather a log with
debug_osd=20 ? That should show us the superblock it was trying to
decode as well as additional details.
On Mon, Aug 12, 2019 at 6:29 AM huxia...@horebdata.cn
wrote:
>
> Dear folks,
>
> I had an OSD
https://tracker.ceph.com/issues/41255 is probably reporting the same issue.
On Thu, Aug 22, 2019 at 6:31 PM Lars Täuber wrote:
>
> Hi there!
>
> We also experience this behaviour of our cluster while it is moving pgs.
>
> # ceph health detail
> HEALTH_ERR 1 MDSs report slow metadata IOs; Reduced
https://tracker.ceph.com/issues/38724
On Fri, Aug 23, 2019 at 10:18 PM Paul Emmerich wrote:
>
> I've seen that before (but never on Nautilus), there's already an
> issue at tracker.ceph.com but I don't recall the id or title.
>
>
> Paul
>
> --
> Paul Emmerich
>
> Looking for help with your Ceph
On Wed, Sep 4, 2019 at 9:42 PM Andras Pataki
wrote:
>
> Dear ceph users,
>
> After upgrading our ceph-fuse clients to 14.2.2, we've been seeing sporadic
> segfaults with not super revealing stack traces:
>
> in thread 7fff5a7fc700 thread_name:ceph-fuse
>
> ceph version 14.2.2
On Thu, Sep 12, 2019 at 1:52 AM Benjamin Tayehanpour
wrote:
>
> Greetings!
>
> I had an OSD down, so I ran ceph osd status and got this:
>
> [root@ceph1 ~]# ceph osd status
> Error EINVAL: Traceback (most recent call last):
> File "/usr/lib64/ceph/mgr/status/module.py", line 313, in
-63> 2019-08-07 00:51:52.861 7fe987e49700 1 heartbeat_map
clear_timeout 'OSD::osd_op_tp thread 0x7fe987e49700' had suicide timed
out after 150
You hit a suicide timeout, that's fatal. On line 80 the process kills
the thread based on the assumption it's hung.
src/common/HeartbeatMap.cc:
66
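The mechanism is easy to model: each worker thread periodically
"touches" its heartbeat handle, and anything that hasn't touched within
the suicide grace is presumed hung. A toy Python sketch of the idea
follows; it is not Ceph's actual code (the real grace in this log was
150 seconds, and the real code aborts the process rather than merely
reporting):

```python
import threading
import time

SUICIDE_GRACE = 0.2  # seconds; stand-in for the OSD's 150s suicide timeout

class HeartbeatMap:
    """Toy model of the pattern in src/common/HeartbeatMap.cc: worker
    threads periodically touch their heartbeat, and a watchdog treats
    any thread whose last touch is older than the suicide grace as
    hung."""

    def __init__(self):
        self._last_touch = {}
        self._lock = threading.Lock()

    def touch(self, thread_name):
        # Called by the worker itself to prove it is still making progress.
        with self._lock:
            self._last_touch[thread_name] = time.monotonic()

    def hung_threads(self, grace=SUICIDE_GRACE):
        # Called by the watchdog: anything untouched for > grace is hung.
        now = time.monotonic()
        with self._lock:
            return [name for name, last in self._last_touch.items()
                    if now - last > grace]

hb = HeartbeatMap()
hb.touch("OSD::osd_op_tp")      # the op thread checks in once...
time.sleep(SUICIDE_GRACE * 2)   # ...then blocks for longer than the grace
print(hb.hung_threads())        # prints ['OSD::osd_op_tp']
```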
Removed ceph-de...@vger.kernel.org and added d...@ceph.io
On Tue, Oct 1, 2019 at 4:26 PM Alex Litvak wrote:
>
> Hello everyone,
>
> Can you shed some light on the cause of the crash? Could a client
> request actually trigger it?
>
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]: 2019-09-30
On Wed, Oct 2, 2019 at 1:15 AM Mattia Belluco wrote:
>
> Hi Jake,
>
> I am curious to see if your problem is similar to ours (despite the fact
> we are still on Luminous).
>
> Could you post the output of:
>
> rados list-inconsistent-obj
>
> and
>
> rados list-inconsistent-snapset
Make sure
9 at 8:03 AM Sasha Litvak
> wrote:
>>
>> It was hardware indeed. Dell server reported a disk being reset with power
>> on. Checking the usual suspects i.e. controller firmware, controller event
>> log (if I can get one), drive firmware.
>> I will report more when I g
On Tue, Oct 1, 2019 at 10:43 PM Del Monaco, Andrea <
andrea.delmon...@atos.net> wrote:
> Hi list,
>
> After the nodes ran OOM and after reboot, we are not able to restart the
> ceph-osd@x services anymore. (Details about the setup at the end).
>
> I am trying to do this manually, so we can see
up
> ([27,30,38], p27) acting ([30,25], p30)
>
> I also checked the logs of all OSDs already done and got the same logs
> about this object :
> * osd.4, last time : 2019-10-10 16:15:20
> * osd.32, last time : 2019-10-14 01:54:56
> * osd.33, last time : 2019-10-11 06:24:01
>
I'd suggest you open a tracker under the Bluestore component so
someone can take a look. I'd also suggest you include a log with
'debug_bluestore=20' added to the COT command line.
On Thu, Nov 7, 2019 at 6:56 PM Eugene de Beste wrote:
>
> Hi, does anyone have any feedback for me regarding this?
Yes, try and get the pgs healthy, then you can just re-provision the down OSDs.
Run a scrub on each of these pgs and then use the commands on the
following page to find out more information for each case.
https://docs.ceph.com/docs/luminous/rados/troubleshooting/troubleshooting-pg/
Focus on the
On Tue, Oct 29, 2019 at 9:09 PM Jérémy Gardais
wrote:
>
> Thus spake Brad Hubbard (bhubb...@redhat.com) on mardi 29 octobre 2019 à
> 08:20:31:
> > Yes, try and get the pgs healthy, then you can just re-provision the down
> > OSDs.
> >
> > Run a scrub
On Tue, Sep 24, 2019 at 10:51 PM M Ranga Swami Reddy
wrote:
>
> Interestingly - "rados list-inconsistent-obj ${PG} --format=json" is not
> showing any inconsistent objects.
> And "rados list-missing-obj ${PG} --format=json" is also not showing any
> missing or unfound objects.
Complete a
Does pool 6 have min_size = 1 set?
https://tracker.ceph.com/issues/24994#note-5 would possibly be helpful
here, depending on what the output of the following command looks
like.
# rados list-inconsistent-obj [pgid] --format=json-pretty
On Thu, Oct 10, 2019 at 8:16 PM Kenneth Waegeman
wrote:
>
ashpspool stripe_width 0 application cephfs
This looked like something min_size 1 could cause, but I guess that's
not the cause here.
> so inconsistents is empty, which is weird, no?
Try scrubbing the pg just before running the command.
>
> Thanks again!
>
> K
>
>
> On 10/10/2019
On Fri, Oct 4, 2019 at 6:09 PM Marc Roos wrote:
>
> >
> >Try something like the following on each OSD that holds a copy of
> >rbd_data.1f114174b0dc51.0974 and see what output you get.
> >Note that you can drop the bluestore flag if they are not bluestore
> >osds and you will need