With just the one ls listing and my memory it's not totally clear, but
I believe this is the output you get when you delete a snapshot folder but
it's still referenced by a different snapshot farther up the
hierarchy.
-Greg
On Mon, Dec 16, 2019 at 8:51 AM Marc Roos wrote:
>
>
> Am I the only lucky
Unfortunately RGW doesn't test against extended version differences
like this and I don't think it's compatible across more than one major
release. Basically it's careful to support upgrades between long-term
stable releases but nothing else is expected to work.
That said, getting off of Giant
t understanding why this started happening when memory usage had
> been so stable before.
>
> Thanks,
>
> Vlad
>
>
>
> On 10/9/19 11:51 AM, Gregory Farnum wrote:
> > On Mon, Oct 7, 2019 at 7:20 AM Vladimir Brik
> > wrote:
> >>
> >> > Do
s that nobody
else in the cluster cares about.
-Greg
> Am Montag, 7. Oktober 2019, 21:59:20 OESZ hat Gregory Farnum
> Folgendes geschrieben:
>
>
> On Sun, Oct 6, 2019 at 1:08 AM Philippe D'Anjou
> wrote:
> >
> > I had to use rocksdb repair tool before because
mory.
>
> It appears that memory is highly fragmented on the NUMA node 0 of all
> the servers. Some of the servers have no free pages higher than order 0.
> (Memory on NUMA node 1 of the servers appears much less fragmented.)
>
> The servers have 192GB of RAM, 2 NUMA nodes.
>
>
&
bit of data. :/
> What is meant with "turn it off and rebuild from remainder"?
If only one monitor is crashing, you can remove it from the quorum,
zap all the disks, and add it back so that it recovers from its
healthy peers.
-Greg
>
> Am Samstag, 5. Oktober 2019, 0
Do you have statistics on the size of the OSDMaps or count of them
which were being maintained by the OSDs? I'm not sure why having noout
set would change that if all the nodes were alive, but that's my bet.
-Greg
On Thu, Oct 3, 2019 at 7:04 AM Vladimir Brik
wrote:
>
> And, just as unexpectedly,
Hmm, that assert means the monitor tried to grab an OSDMap it had on
disk but it didn't work. (In particular, a "pinned" full map which we
kept around after trimming the others to save on disk space.)
That *could* be a bug where we didn't have the pinned map and should
have (or incorrectly
On Thu, Sep 19, 2019 at 12:06 AM Alex Xu wrote:
>
> Hi Cephers,
>
> We are testing the write performance of Ceph EC (Luminous, 8 + 4), and
> noticed that tail latency is extremely high. Say, avgtime of 10th
> commit is 40ms, acceptable as it's an all HDD cluster; 11th is 80ms,
> doubled; then 12th
On Tue, Sep 17, 2019 at 8:12 AM Sander Smeenk wrote:
>
> Quoting Paul Emmerich (paul.emmer...@croit.io):
>
> > Yeah, CephFS is much closer to POSIX semantics for a filesystem than
> > NFS. There's an experimental relaxed mode called LazyIO but I'm not
> > sure if it's applicable here.
>
> Out of
On Thu, Aug 29, 2019 at 4:57 AM Thomas Byrne - UKRI STFC
wrote:
>
> Hi all,
>
> I’m investigating an issue with our (non-Ceph) caching layers of our large EC
> cluster. It seems to be turning users requests for whole objects into lots of
> small byte range requests reaching the OSDs, but I’m
No; KStore is not for real use AFAIK.
On Wed, Aug 7, 2019 at 12:24 AM R.R.Yuan wrote:
>
> Hi, All,
>
>When deploying a development cluster, there are three types of OSD
> objectstore backend: filestore, bluestore and kstore.
>But there is no "--kstore" option when using
On Fri, Aug 2, 2019 at 12:13 AM Pierre Dittes wrote:
>
> Hi,
> we had some major up with our CephFS. Long story short..no Journal backup
> and journal was truncated.
> Now..I still see a metadata pool with all objects and datapool is fine, from
> what I know neither was corrupted. Last
On Thu, Aug 1, 2019 at 12:06 PM Eric Ivancich wrote:
>
> Hi Paul,
>
> I’ll interleave responses below.
>
> On Jul 31, 2019, at 2:02 PM, Paul Emmerich wrote:
>
> How could the bucket deletion of the future look like? Would it be possible
> to put all objects in buckets into RADOS namespaces and
>
> Thanks,
> Muthu
>
> On Wed, Jul 31, 2019 at 11:13 PM Gregory Farnum wrote:
>>
>>
>>
>> On Wed, Jul 31, 2019 at 1:32 AM nokia ceph wrote:
>>>
>>> Hi Greg,
>>>
>>> We were trying to implement this however having issue
> Thank you Greg, we will try this out.
>>
>> Thanks,
>> Muthu
>>
>> On Wed, Jul 3, 2019 at 11:12 PM Gregory Farnum
>> wrote:
>>
>>> Well, the RADOS interface doesn't have a great deal of documentation
>>> so I don't know if I can point y
On Wed, Jul 3, 2019 at 11:09 AM Austin Workman wrote:
> Decided that if all the data was going to move, I should adjust my jerasure
> ec profile from k=4, m=1 -> k=5, m=1 with force(is this even recommended vs.
> just creating new pools???)
>
> Initially it unset crush-device-class=hdd to be
Hi Greg,
>
> Can you please share the api details for COPY_FROM or any reference document?
>
> Thanks ,
> Muthu
>
> On Wed, Jul 3, 2019 at 4:12 AM Brad Hubbard wrote:
>>
>> On Wed, Jul 3, 2019 at 4:25 AM Gregory Farnum wrote:
>> >
>> > I'm not su
I'm not sure how or why you'd get an object class involved in doing
this in the normal course of affairs.
There's a copy_from op that a client can send and which copies an
object from another OSD into the target object. That's probably the
primitive you want to build on. Note that the OSD doesn't
On Fri, Jun 28, 2019 at 5:41 PM Jorge Garcia wrote:
>
> Ok, actually, the problem was somebody writing to the filesystem. So I moved
> their files and got to 0 objects. But then I tried to remove the original
> data pool and got an error:
>
> # ceph fs rm_data_pool cephfs cephfs-data
>
On Sat, Jun 29, 2019 at 8:13 PM Bryan Henderson wrote:
>
> > I'm not sure why the monitor did not mark it _out_ after 600 seconds
> > (default)
>
> Well, that part I understand. The monitor didn't mark the OSD out because the
> monitor still considered the OSD up. No reason to mark an up OSD
p [1], and
> I can provide other logs/files if anyone thinks they could be useful.
>
> Cheers,
> Tom
>
> [1] ceph-post-file: 1829bf40-cce1-4f65-8b35-384935d11446
>
> -Original Message-
> From: Gregory Farnum
> Sent: 24 June 2019 17:30
> To: Byrne, Thomas (STFC
On Mon, Jun 24, 2019 at 9:06 AM Thomas Byrne - UKRI STFC
wrote:
>
> Hi all,
>
>
>
> Some bluestore OSDs in our Luminous test cluster have started becoming
> unresponsive and booting very slowly.
>
>
>
> These OSDs have been used for stress testing for hardware destined for our
> production
Just nuke the monitor's store, remove it from the existing quorum, and
start over again. Injecting maps correctly is non-trivial and obviously
something went wrong, and re-syncing a monitor is pretty cheap.
On Thu, Jun 20, 2019 at 6:46 AM ☣Adam wrote:
> Anyone have any suggestions for how to
On Tue, Jun 18, 2019 at 2:26 AM ?? ?? wrote:
>
> Thank you very much! Can you point out where is the code of revoke?
The caps code is all over the code base as it's fundamental to the
filesystem's workings. You can get some more general background in my
recent Cephalocon talk "What are “caps”?
I think the mimic balancer doesn't include omap data when trying to
balance the cluster. (Because it doesn't get usable omap stats from
the cluster anyway; in Nautilus I think it does.) Are you using RGW or
CephFS?
-Greg
On Wed, Jun 5, 2019 at 1:01 PM Josh Haft wrote:
>
> Hi everyone,
>
> On my
On Wed, Jun 5, 2019 at 10:10 AM Jonas Jelten wrote:
>
> Hi!
>
> I'm also affected by this:
>
> HEALTH_WARN 13 pgs not deep-scrubbed in time; 13 pgs not scrubbed in time
> PG_NOT_DEEP_SCRUBBED 13 pgs not deep-scrubbed in time
> pg 6.b1 not deep-scrubbed since 0.00
> pg 7.ac not
These OSDs are far too small at only 10GiB for the balancer to do any
useful work. It's not uncommon for metadata like OSDMaps to exceed
that size in error states, and in any real deployment a single PG will
be at least that large.
There are probably parameters you can tweak to try and make it
You’re the second report I’ve seen of this, and while it’s confusing, you
should be able to resolve it by restarting your active manager daemon.
On Sun, May 26, 2019 at 11:52 PM Lars Täuber wrote:
> Fri, 24 May 2019 21:41:33 +0200
> Michel Raabe ==> Lars Täuber ,
> ceph-users@lists.ceph.com :
On Tue, May 14, 2019 at 11:03 AM Rainer Krienke wrote:
>
> Hello,
>
> for a fresh setup ceph cluster I see a strange difference in the number
> of existing pools in the output of ceph -s and what I know that should
> be there: no pools at all.
>
> I set up a fresh Nautilus cluster with 144 OSDs
On Mon, May 6, 2019 at 6:41 PM Kyle Brantley wrote:
>
> On 5/6/2019 6:37 PM, Gregory Farnum wrote:
> > Hmm, I didn't know we had this functionality before. It looks to be
> > changing quite a lot at the moment, so be aware this will likely
> > require reconfiguring
On Wed, May 8, 2019 at 10:05 AM Ansgar Jazdzewski
wrote:
>
> hi folks,
>
> we try to build a new NAS using the vfs_ceph module from samba 4.9.
>
> if I try to open the share I receive the error:
>
> May 8 06:58:44 nas01 smbd[375700]: 2019-05-08 06:58:44.732830
> 7ff3d5f6e700 0 --
On Wed, May 8, 2019 at 5:33 AM Dietmar Rieder
wrote:
>
> On 5/8/19 1:55 PM, Paul Emmerich wrote:
> > Nautilus properly accounts metadata usage, so nothing changed it just
> > shows up correctly now ;)
>
> OK, but then I'm not sure I understand why the increase was not sudden
> (with the update)
On Wed, May 8, 2019 at 2:37 AM Marco Stuurman
wrote:
>
> Hi,
>
> I've got an issue with the data in our pool. A RBD image containing 4TB+ data
> has moved over to a different pool after a crush rule set change, which
> should not be possible. Besides that it loops over and over to start
>
On Tue, May 7, 2019 at 6:54 AM Ignat Zapolsky wrote:
>
> Hi,
>
>
>
> We are looking at how to troubleshoot an issue with Ceph FS on k8s cluster.
>
>
>
> This filesystem is provisioned via rook 0.9.2 and have following behavior:
>
> If ceph fs is mounted on K8S master, then it is writeable
> If
Hmm, I didn't know we had this functionality before. It looks to be
changing quite a lot at the moment, so be aware this will likely
require reconfiguring later.
On Sun, May 5, 2019 at 10:40 AM Kyle Brantley wrote:
>
> I've been running luminous / ceph-12.2.11-0.el7.x86_64 on CentOS 7 for about
What's the output of "ceph -s" and "ceph osd tree"?
On Fri, May 3, 2019 at 8:58 AM Stefan Kooman wrote:
>
> Hi List,
>
> I'm playing around with CRUSH rules and device classes and I'm puzzled
> if it's working correctly. Platform specifics: Ubuntu Bionic with Ceph 14.2.1
>
> I created two new
::react_impl(boost::state
> > chart::event_base const&, void const*)+0x16a) [0x55766fe355ca]
> > 11:
> > (boost::statechart::state_machine > PG::RecoveryState::Initial, std::allocator,
> > boost::statechart::null_exception_translator>::process_event(b
> > oost::sta
Yes, check out the file layout options:
http://docs.ceph.com/docs/master/cephfs/file-layouts/
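Those layout parameters (stripe_unit, stripe_count, object_size) determine how
file offsets map onto RADOS objects. A sketch of the striping rules, not the
canonical client code:

```python
def offset_to_object(offset, stripe_unit, stripe_count, object_size):
    """Map a file byte offset to (object_index, offset_within_object).

    Mirrors the CephFS striping rules as I understand them; a sketch,
    not the canonical client implementation.
    """
    stripes_per_object = object_size // stripe_unit
    block_no = offset // stripe_unit        # which stripe unit overall
    stripe_no = block_no // stripe_count    # which stripe "row"
    stripe_pos = block_no % stripe_count    # which object within the set
    object_set = stripe_no // stripes_per_object
    object_index = object_set * stripe_count + stripe_pos
    off_in_object = (stripe_no % stripes_per_object) * stripe_unit \
        + offset % stripe_unit
    return object_index, off_in_object

MiB = 1024 * 1024
# Default layout: 4 MiB objects, stripe_count=1 -> byte 10 MiB lands
# 2 MiB into the third object of the file.
print(offset_to_object(10 * MiB, 4 * MiB, 1, 4 * MiB))  # (2, 2097152)
```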
On Mon, Apr 29, 2019 at 3:32 PM Daniel Williams wrote:
>
> Is the 4MB configurable?
>
> On Mon, Apr 29, 2019 at 4:36 PM Gregory Farnum wrote:
>>
>> CephFS automatically chunks
; 6: (PG::RecoveryState::Deleting::react(PG::DeleteSome const&)+0x38)
>>> [0x7c1476e528]
>>> 7: (boost::statechart::simple_state>> PG::RecoveryState::ToDelete, boost::mpl::list>> mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na,
>>>
CephFS automatically chunks objects into 4MB objects by default. For
an EC pool, RADOS internally will further subdivide them based on the
erasure code and striping strategy, with a layout that can vary. But
by default if you have eg an 8+3 EC code, you'll end up with a bunch
of (4MB/8=)512KB
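That arithmetic can be sketched directly (a toy calculation that ignores
stripe-width padding and any per-shard overhead):

```python
def ec_footprint(object_size, k, m):
    """Per-shard and total on-disk size for a k+m erasure-coded object.

    Back-of-the-envelope only: real shards are padded to the stripe
    width, which this ignores.
    """
    shard = object_size // k       # each data shard holds 1/k of the object
    total = shard * (k + m)        # k data shards plus m parity shards
    return shard, total

MiB = 1024 * 1024
shard, total = ec_footprint(4 * MiB, 8, 3)
print(shard // 1024, total / MiB)  # 512 KiB shards, 5.5 MiB raw
```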
You'll probably want to generate a log with "debug osd = 20" and
"debug bluestore = 20", then share that or upload it with
ceph-post-file, to get more useful info about which PGs are breaking
(is it actually the ones that were supposed to delete?).
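For reference, that logging could be enabled either in ceph.conf (taking
effect on restart) or injected at runtime without a restart:

```ini
# In ceph.conf, [osd] section (takes effect on OSD restart):
[osd]
debug osd = 20
debug bluestore = 20

# Or injected at runtime, e.g.:
#   ceph tell osd.* injectargs '--debug-osd 20 --debug-bluestore 20'
```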
If there's a particular set of PGs you need to
On Fri, Apr 26, 2019 at 10:55 AM Jan Pekař - Imatic wrote:
>
> Hi,
>
> yesterday my cluster reported slow request for minutes and after restarting
> OSDs (reporting slow requests) it stuck with peering PGs. Whole
> cluster was not responding and IO stopped.
>
> I also notice, that problem was
On Sat, Apr 20, 2019 at 9:29 AM Igor Podlesny wrote:
>
> I remember seeing reports in regards but it's being a while now.
> Can anyone tell?
No, this hasn't changed. It's unlikely it ever will; I think NFS
resolved the issue but it took a lot of ridiculous workarounds and
imposes a permanent
t incorrect?
I think they will come in and you're correct, but I haven't worked
with RGW in years so it's a bit out of my wheelhouse.
-Greg
>
> --
> Regards,
> Varun Singh
>
> On Sat, Apr 13, 2019 at 12:50 AM Gregory Farnum wrote:
> >
> > Yes, you would do this by
On Mon, Apr 15, 2019 at 1:52 PM Brent Kennedy wrote:
>
> I was looking around the web for the reason for some of the default pools in
> Ceph and I cant find anything concrete. Here is our list, some show no use
> at all. Can any of these be deleted ( or is there an article my googlefu
>
Yes, you would do this by setting up separate data pools for segregated
clients, giving those pools a CRUSH rule placing them on their own servers,
and if using S3 assigning the clients to them using either wholly separate
instances or perhaps separate zones and the S3 placement options.
-Greg
On
On Mon, Apr 8, 2019 at 3:19 PM Bryan Stillwell wrote:
>
> We have two separate RGW clusters running Luminous (12.2.8) that have started
> seeing an increase in PGs going active+clean+inconsistent with the reason
> being caused by an omap_digest mismatch. Both clusters are using FileStore
>
On Mon, Apr 1, 2019 at 4:04 AM Paul Emmerich wrote:
>
> There are no problems with mixed bluestore_min_alloc_size; that's an
> abstraction layer lower than the concept of multiple OSDs. (Also, you
> always have that when mixing SSDs and HDDs)
>
> I'm not sure about the real-world impacts of a
I believe our community manager Mike is in charge of that?
On Wed, Apr 3, 2019 at 6:49 AM Raphaël Enrici wrote:
>
> Dear all,
>
> is there somebody in charge of the ceph hosting here, or someone who
> knows the guy who knows another guy who may know...
>
> Saw this while reading the FOSDEM 2019
I think this got dealt with on irc, but for those following along at home:
I think the problem here is that you've set the central config to
disable authentication, but the client doesn't know what those config
options look like until it's connected — which it can't do, because
it's demanding
On Thu, Mar 21, 2019 at 2:45 PM Dan van der Ster wrote:
>
> On Thu, Mar 21, 2019 at 8:51 AM Gregory Farnum wrote:
> >
> > On Wed, Mar 20, 2019 at 6:06 PM Dan van der Ster
> > wrote:
> >>
> >> On Tue, Mar 19, 2019 at 9:43 AM Er
On Wed, Mar 20, 2019 at 6:06 PM Dan van der Ster wrote:
> On Tue, Mar 19, 2019 at 9:43 AM Erwin Bogaard
> wrote:
> >
> > Hi,
> >
> >
> >
> > For a number of application we use, there is a lot of file duplication.
> This wastes precious storage space, which I would like to avoid.
> >
> > When
On Tue, Mar 19, 2019 at 2:13 PM Erwin Bogaard
wrote:
> Hi,
>
>
>
> For a number of application we use, there is a lot of file duplication.
> This wastes precious storage space, which I would like to avoid.
>
> When using a local disk, I can use a hard link to let all duplicate files
> point to
On Mon, Mar 18, 2019 at 7:28 PM Yan, Zheng wrote:
>
> On Mon, Mar 18, 2019 at 9:50 PM Dylan McCulloch wrote:
> >
> >
> > >please run following command. It will show where is 4.
> > >
> > >rados -p -p hpcfs_metadata getxattr 4. parent >/tmp/parent
> > >ceph-dencoder import
You will either need access to a ceph.conf, or else have some way to
pass in on the CLI:
* monitor IP addresses
* a client ID
* a client key (or keyring file)
Your ceph.conf doesn't strictly need to be the same one used for other
things on the cluster, so you could assemble it yourself. Same goes
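A minimal self-assembled ceph.conf along those lines might look like this
(all addresses, the client name, and the keyring path are placeholders):

```ini
[global]
# Monitor addresses: enough for the client to find the quorum
mon host = 192.0.2.11,192.0.2.12,192.0.2.13

[client.myuser]
# Client ID "myuser" plus its key, kept in a keyring file
keyring = /etc/ceph/ceph.client.myuser.keyring
```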
I don’t have any idea what’s going on here or why it’s not working, but you
are using v0.94.7. That release is:
1) out of date for the Hammer cycle, which reached at least .94.10
2) prior to the release where we declared CephFS stable (Jewel, v10.2.0)
3) way past its supported expiration date.
Are they getting cleaned up? CephFS does not instantly delete files; they
go into a "purge queue" and get cleaned up later by the MDS.
-Greg
On Thu, Mar 7, 2019 at 2:00 AM Fyodor Ustinov wrote:
> Hi!
>
> After removing all files from cephfs I see that situation:
> #ceph df
> POOLS:
> NAME
In general, no, this is not an expected behavior.
My guess would be that something odd is happening with the other clients
you have to the system, and there's a weird pattern with the way the file
locks are being issued. Can you be more precise about exactly what workload
you're running, and get
This is probably a build issue of some kind, but I'm not quite sure how...
The MDS (and all the Ceph code) is just invoking the getgrnam_r function,
which is part of POSIX and implemented by glibc (or whatever other libc). So
any dependency on nscd is being created "behind our backs" somewhere.
The OSD tried to create a new thread, and the kernel told it no. You
probably need to turn up the limits on threads and/or file descriptors.
-Greg
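The knobs involved are the kernel-wide process/thread and file-descriptor
ceilings plus the per-user limits; a sketch of where they are commonly raised
(the exact values are illustrative, not recommendations):

```
# /etc/sysctl.d/90-ceph.conf (kernel-wide ceilings)
kernel.pid_max = 4194304
kernel.threads-max = 4194304
fs.file-max = 2621440

# /etc/security/limits.d/90-ceph.conf (per-user ceilings)
ceph  soft  nproc   1048576
ceph  hard  nproc   1048576
ceph  soft  nofile  1048576
ceph  hard  nofile  1048576
```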
On Wed, Feb 27, 2019 at 2:36 AM hnuzhoulin2 wrote:
> Hi, guys
>
> So far, there have been 10 osd service exit because of this error.
> the error
On Fri, Feb 15, 2019 at 1:39 AM Ilya Dryomov wrote:
> On Fri, Feb 15, 2019 at 12:05 AM Mike Perez wrote:
> >
> > Hi Marc,
> >
> > You can see previous designs on the Ceph store:
> >
> > https://www.proforma.com/sdscommunitystore
>
> Hi Mike,
>
> This site stopped working during DevConf and
Actually I think I misread what this was doing, sorry.
Can you do a “ceph osd tree”? It’s hard to see the structure via the text
dumps.
On Wed, Feb 13, 2019 at 10:49 AM Gregory Farnum wrote:
> Your CRUSH rule for EC pools is forcing that behavior with the line
>
> step chooseleaf ind
On Wed, Feb 13, 2019 at 10:37 AM Jason Dillaman wrote:
>
> For the future Ceph Octopus release, I would like to remove all
> remaining support for RBD image format v1 images baring any
> substantial pushback.
>
> The image format for new images has been defaulted to the v2 image
> format since
Your CRUSH rule for EC pools is forcing that behavior with the line
step chooseleaf indep 1 type ctnr
If you want different behavior, you’ll need a different crush rule.
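For example, a rule that instead spreads the k+m shards across hosts might
look like the following sketch (the `default` root and the `host` bucket type
are assumptions about this cluster's CRUSH map):

```
rule ecpool_by_host {
        id 1
        type erasure
        min_size 3
        max_size 12
        step set_chooseleaf_tries 5
        step take default
        step chooseleaf indep 0 type host
        step emit
}
```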
On Tue, Feb 12, 2019 at 5:18 PM hnuzhoulin2 wrote:
> Hi, cephers
>
>
> I am building a ceph EC cluster.when a disk is
On Tue, Feb 12, 2019 at 5:10 AM Hector Martin wrote:
> On 12/02/2019 06:01, Gregory Farnum wrote:
> > Right. Truncates and renames require sending messages to the MDS, and
> > the MDS committing to RADOS (aka its disk) the change in status, before
> > they can be complete
On Thu, Feb 7, 2019 at 3:31 AM Hector Martin wrote:
> On 07/02/2019 19:47, Marc Roos wrote:
> >
> > Is this difference not related to caching? And you filling up some
> > cache/queue at some point? If you do a sync after each write, do you
> > have still the same results?
>
> No, the slow
You can't tell from the client log here, but probably the MDS itself was
failing over to a new instance during that interval. There's not much
experience with it, but you could experiment with faster failover by
reducing the mds beacon and grace times. This may or may not work
reliably...
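The relevant options are the MDS beacon interval and grace (defaults of 4s
and 15s, if memory serves); tightening them might look like this, with the
same caveat that faster detection risks spurious failovers under load:

```ini
[global]
# Defaults: mds beacon interval = 4, mds beacon grace = 15.
# Lower values detect a dead MDS sooner but may fail over an
# MDS that is merely busy.
mds beacon interval = 2
mds beacon grace = 8
```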
On Sat,
On Mon, Feb 4, 2019 at 8:03 AM Mahmoud Ismail
wrote:
> On Mon, Feb 4, 2019 at 4:35 PM Gregory Farnum wrote:
>
>>
>>
>> On Mon, Feb 4, 2019 at 7:32 AM Mahmoud Ismail <
>> mahmoudahmedism...@gmail.com> wrote:
>>
>>> On Mon, Feb 4, 2019 at 4:16 P
On Mon, Feb 4, 2019 at 7:32 AM Mahmoud Ismail
wrote:
> On Mon, Feb 4, 2019 at 4:16 PM Gregory Farnum wrote:
>
>> On Fri, Feb 1, 2019 at 2:29 AM Mahmoud Ismail <
>> mahmoudahmedism...@gmail.com> wrote:
>>
>>> Hello,
>>>
>>> I'm a bit co
On Fri, Feb 1, 2019 at 2:29 AM Mahmoud Ismail
wrote:
> Hello,
>
> I'm a bit confused about how the journaling actually works in the MDS.
>
> I was reading about these two configuration parameters (journal write head
> interval) and (mds early reply). Does the MDS flush the journal
>
On Fri, Jan 11, 2019 at 11:24 AM Sergio A. de Carvalho Jr. <
scarvalh...@gmail.com> wrote:
> Thanks for the answers, guys!
>
> Am I right to assume msgr2 (http://docs.ceph.com/docs/mimic/dev/msgr2/)
> will provide encryption between Ceph daemons as well as between clients and
> daemons?
>
> Does
Looks like your network deactivated before the rbd volume was unmounted.
This is a known issue without a good programmatic workaround and you’ll
need to adjust your configuration.
On Tue, Jan 22, 2019 at 9:17 AM Gao, Wenjun wrote:
> I’m using krbd to map a rbd device to a VM, it appears when the
This doesn’t look familiar to me. Is the cluster still doing recovery so we
can at least expect them to make progress when the “out” OSDs get removed
from the set?
On Tue, Jan 22, 2019 at 2:44 PM Wido den Hollander wrote:
> Hi,
>
> I've got a couple of PGs which are stuck in backfill_toofull,
On Mon, Jan 21, 2019 at 12:52 AM Yan, Zheng wrote:
> On Mon, Jan 21, 2019 at 12:12 PM Albert Yue
> wrote:
> >
> > Hi Yan Zheng,
> >
> > 1. mds cache limit is set to 64GB
> > 2. we get the size of meta data pool by running `ceph df` and saw meta
> data pool just used 200MB space.
> >
>
> That's
On Fri, Jan 11, 2019 at 10:07 PM Brian Topping
wrote:
> Hi all,
>
> I have a simple two-node Ceph cluster that I’m comfortable with the care
> and feeding of. Both nodes are in a single rack and captured in the
> attached dump, it has two nodes, only one mon, all pools size 2. Due to
> physical
On Fri, Jan 4, 2019 at 1:19 PM Jan Kasprzak wrote:
>
> Gregory Farnum wrote:
> : On Wed, Jan 2, 2019 at 5:12 AM Jan Kasprzak wrote:
> :
> : > Thomas Byrne - UKRI STFC wrote:
> : > : I recently spent some time looking at this, I believe the 'summary' and
> : > : 'ov
The osd_map_cache_size controls the OSD’s cache of maps; the change in
13.2.3 is to the default for the monitors’.
On Mon, Jan 7, 2019 at 8:24 AM Anthony D'Atri wrote:
>
>
> > * The default memory utilization for the mons has been increased
> > somewhat. Rocksdb now uses 512 MB of RAM by
Yeah I think it’s a “planet”-style feed that incorporates some other blogs.
I don’t think it’s been maintained much since being launched though.
On Fri, Jan 4, 2019 at 1:21 PM Jan Kasprzak wrote:
> Gregory Farnum wrote:
> : It looks like ceph.com/feed is the RSS url?
>
>
On Wed, Jan 2, 2019 at 5:12 AM Jan Kasprzak wrote:
> Thomas Byrne - UKRI STFC wrote:
> : I recently spent some time looking at this, I believe the 'summary' and
> : 'overall_status' sections are now deprecated. The 'status' and 'checks'
> : fields are the ones to use now.
>
> OK, thanks.
You can also get more data by checking what the monitor logs for that
manager on the connect attempt (if you turn up its debug mon or debug
ms settings). If one of your managers is behaving, I'd examine its
configuration file and compare to the others. For instance, that
"Invalid argument" might
Regarding 13.2.3 specifically:
As Abhishek says, there are no known issues in the release. It went
through our full and proper release validation; nobody has spotted any
last-minute bugs. The release notes are available in the git
repository:
It looks like ceph.com/feed is the RSS url?
On Fri, Jan 4, 2019 at 5:52 AM Jan Kasprzak wrote:
> Hello,
>
> is there any RSS or Atom source for Ceph blog? I have looked inside
> the https://ceph.com/community/blog/ HTML source, but there is no
> or anything mentioning RSS or Atom.
>
>
y be most helpful is if you can dump out one of those
over-large incremental osdmaps and see what's using up all the space. (You
may be able to do it through the normal Ceph CLI by querying the monitor?
Otherwise if it's something very weird you may need to get the
ceph-dencoder tool and look at it
On Fri, Dec 14, 2018 at 6:44 PM Bryan Henderson
wrote:
> > Going back through the logs though it looks like the main reason we do a
> > 4MiB block size is so that we have a chance of reporting actual cluster
> > sizes to 32-bit systems,
>
> I believe you're talking about a different block size
CephFS is prepared for the statx interface that doesn’t necessarily fill in
every member of the stat structure, and allows you to make requests for
only certain pieces of information. The purpose is so that the client and
MDS can take less expensive actions than are required to satisfy a full
On Tue, Dec 18, 2018 at 1:11 AM Hector Martin wrote:
> Hi list,
>
> I'm running libvirt qemu guests on RBD, and currently taking backups by
> issuing a domfsfreeze, taking a snapshot, and then issuing a domfsthaw.
> This seems to be a common approach.
>
> This is safe, but it's impactful: the
On Thu, Dec 13, 2018 at 3:31 PM Bryan Henderson
wrote:
> I've searched the ceph-users archives and found no discussion to speak of
> of
> Cephfs block sizes, and I wonder how much people have thought about it.
>
> The POSIX 'stat' system call reports for each file a block size, which is
>
increased by 0.01 so we get 5 new pg_temp in osdmap.1357883 but size
> inc_osdmap so huge
>
> чт, 6 дек. 2018 г. в 06:20, Gregory Farnum :
> >
> > On Wed, Dec 5, 2018 at 3:32 PM Sergey Dolgov wrote:
> >>
> >> Hi guys
> >>
> >> I faced strange b
Well, it looks like you have different data in the MDSMap across your
monitors. That's not good on its face, but maybe there are extenuating
circumstances. Do you actually use CephFS, or just RBD/RGW? What's the
full output of "ceph -s"?
-Greg
On Thu, Dec 6, 2018 at 1:39 PM Marco Aroldi wrote:
>
On Wed, Dec 5, 2018 at 3:32 PM Sergey Dolgov wrote:
> Hi guys
>
> I faced strange behavior of crushmap change. When I change crush
> weight osd I sometimes get increment osdmap(1.2MB) which size is
> significantly bigger than size of osdmap(0.4MB)
>
This is probably because when CRUSH changes,
Yes, this is exactly it with the "reconnect denied".
-Greg
On Tue, Dec 4, 2018 at 3:00 AM NingLi wrote:
>
> Hi,maybe this reference can help you
>
>
> http://docs.ceph.com/docs/master/cephfs/troubleshooting/#disconnected-remounted-fs
>
>
> > On Dec 4, 2018, at 18:55, c...@jack.fr.eu.org wrote:
I’m pretty sure the monitor command there won’t move intermediate buckets
like the host. This is so if an osd has incomplete metadata it doesn’t
inadvertently move 11 other OSDs into a different rack/row/whatever.
So in this case, it finds the host osd0001 and matches it, but since the
crush map
On Mon, Nov 26, 2018 at 10:28 AM Vladimir Brik
wrote:
>
> Hello
>
> I am doing some Ceph testing on a near-full cluster, and I noticed that,
> after I brought down a node, some OSDs' utilization reached
> osd_failsafe_full_ratio (97%). Why didn't it stop at mon_osd_full_ratio
> (90%) if
On Tue, Nov 20, 2018 at 9:50 PM Vlad Kopylov wrote:
> I see the point, but not for the read case:
> no overhead for just choosing or let Mount option choose read replica.
>
> This is simple feature that can be implemented, that will save many
> people bandwidth in really distributed cases.
>
On Fri, Nov 23, 2018 at 11:01 AM ST Wong (ITSC) wrote:
> Hi all,
>
>
> We've 8 osd hosts, 4 in room 1 and 4 in room2.
>
> A pool with size = 3 using following crush map is created, to cater for
> room failure.
>
>
> rule multiroom {
> id 0
> type replicated
> min_size 2
>
As the monitors limit their transaction rates, I would tend for the
higher-durability drives. I don't think any monitor throughput issues have
been reported on clusters with SSDs for storage.
-Greg
On Mon, Nov 26, 2018 at 5:47 AM Valmar Kuristik wrote:
> Hello,
>
> Can anyone say how important
On Sun, Nov 25, 2018 at 2:41 PM Stefan Kooman wrote:
> Hi list,
>
> During cluster expansion (adding extra disks to existing hosts) some
> OSDs failed (FAILED assert(0 == "unexpected error", _txc_add_transaction
> error (39) Directory not empty not handled on operation 21 (op 1,
> counting from
On Mon, Nov 26, 2018 at 3:30 AM Janne Johansson wrote:
> Den sön 25 nov. 2018 kl 22:10 skrev Stefan Kooman :
> >
> > Hi List,
> >
> > Another interesting and unexpected thing we observed during cluster
> > expansion is the following. After we added extra disks to the cluster,
> > while
Looks like you’ve considered the essential points for bluestore OSDs, yep.
:)
My concern would just be the surprisingly-large block.db requirements for
rgw workloads that have been brought up. (300+GB per OSD, I think someone
saw/worked out?).
-Greg
On Tue, Nov 20, 2018 at 1:35 AM Dan van der