Re: [ceph-users] deleted snap dirs are back as _origdir_1099536400705

2019-12-16 Thread Gregory Farnum
With just the one ls listing and my memory it's not totally clear, but I believe this is the output you get when you delete a snapshot folder but it's still referenced by a different snapshot farther up the hierarchy. -Greg On Mon, Dec 16, 2019 at 8:51 AM Marc Roos wrote: > > > Am I the only lucky

Re: [ceph-users] osds way ahead of gateway version?

2019-12-03 Thread Gregory Farnum
Unfortunately RGW doesn't test against extended version differences like this and I don't think it's compatible across more than one major release. Basically it's careful to support upgrades between long-term stable releases but nothing else is expected to work. That said, getting off of Giant

Re: [ceph-users] Unexpected increase in the memory usage of OSDs

2019-10-09 Thread Gregory Farnum
t understanding why this started happening when memory usage had > been so stable before. > > Thanks, > > Vlad > > > > On 10/9/19 11:51 AM, Gregory Farnum wrote: > > On Mon, Oct 7, 2019 at 7:20 AM Vladimir Brik > > wrote: > >> > >> > Do

Re: [ceph-users] mon sudden crash loop - pinned map

2019-10-09 Thread Gregory Farnum
s that nobody else in the cluster cares about. -Greg > Am Montag, 7. Oktober 2019, 21:59:20 OESZ hat Gregory Farnum > Folgendes geschrieben: > > > On Sun, Oct 6, 2019 at 1:08 AM Philippe D'Anjou > wrote: > > > > I had to use rocksdb repair tool before because

Re: [ceph-users] Unexpected increase in the memory usage of OSDs

2019-10-09 Thread Gregory Farnum
mory. > > It appears that memory is highly fragmented on the NUMA node 0 of all > the servers. Some of the servers have no free pages higher than order 0. > (Memory on NUMA node 1 of the servers appears much less fragmented.) > > The servers have 192GB of RAM, 2 NUMA nodes. > > &

Re: [ceph-users] mon sudden crash loop - pinned map

2019-10-07 Thread Gregory Farnum
bit of data. :/ > What is meant with "turn it off and rebuild from remainder"? If only one monitor is crashing, you can remove it from the quorum, zap all the disks, and add it back so that it recovers from its healthy peers. -Greg > > Am Samstag, 5. Oktober 2019, 0

Re: [ceph-users] Unexpected increase in the memory usage of OSDs

2019-10-04 Thread Gregory Farnum
Do you have statistics on the size of the OSDMaps or count of them which were being maintained by the OSDs? I'm not sure why having noout set would change that if all the nodes were alive, but that's my bet. -Greg On Thu, Oct 3, 2019 at 7:04 AM Vladimir Brik wrote: > > And, just as unexpectedly,

Re: [ceph-users] mon sudden crash loop - pinned map

2019-10-04 Thread Gregory Farnum
Hmm, that assert means the monitor tried to grab an OSDMap it had on disk but it didn't work. (In particular, a "pinned" full map which we kept around after trimming the others to save on disk space.) That *could* be a bug where we didn't have the pinned map and should have (or incorrectly

Re: [ceph-users] RADOS EC: is it okay to reduce the number of commits required for reply to client?

2019-09-25 Thread Gregory Farnum
On Thu, Sep 19, 2019 at 12:06 AM Alex Xu wrote: > > Hi Cephers, > > We are testing the write performance of Ceph EC (Luminous, 8 + 4), and > noticed that tail latency is extremly high. Say, avgtime of 10th > commit is 40ms, acceptable as it's an all HDD cluster; 11th is 80ms, > doubled; then 12th

Re: [ceph-users] cephfs: apache locks up after parallel reloads on multiple nodes

2019-09-17 Thread Gregory Farnum
On Tue, Sep 17, 2019 at 8:12 AM Sander Smeenk wrote: > > Quoting Paul Emmerich (paul.emmer...@croit.io): > > > Yeah, CephFS is much closer to POSIX semantics for a filesystem than > > NFS. There's an experimental relaxed mode called LazyIO but I'm not > > sure if it's applicable here. > > Out of

Re: [ceph-users] Help understanding EC object reads

2019-09-09 Thread Gregory Farnum
On Thu, Aug 29, 2019 at 4:57 AM Thomas Byrne - UKRI STFC wrote: > > Hi all, > > I’m investigating an issue with our (non-Ceph) caching layers of our large EC > cluster. It seems to be turning users requests for whole objects into lots of > small byte range requests reaching the OSDs, but I’m

Re: [ceph-users] Can kstore be used as OSD objectstore backend when deploying a Ceph Storage Cluster? If can, how to?

2019-08-07 Thread Gregory Farnum
No; KStore is not for real use AFAIK. On Wed, Aug 7, 2019 at 12:24 AM R.R.Yuan wrote: > > Hi, All, > >When deploying a development cluster, there are three types of OSD > objectstore backend: filestore, bluestore and kstore. >But there is no "--kstore" option when using

Re: [ceph-users] CephFS Recovery/Internals Questions

2019-08-04 Thread Gregory Farnum
On Fri, Aug 2, 2019 at 12:13 AM Pierre Dittes wrote: > > Hi, > we had some major up with our CephFS. Long story short..no Journal backup > and journal was truncated. > Now..I still see a metadata pool with all objects and datapool is fine, from > what I know neither was corrupted. Last

Re: [ceph-users] Adventures with large RGW buckets

2019-08-01 Thread Gregory Farnum
On Thu, Aug 1, 2019 at 12:06 PM Eric Ivancich wrote: > > Hi Paul, > > I’ll interleave responses below. > > On Jul 31, 2019, at 2:02 PM, Paul Emmerich wrote: > > How could the bucket deletion of the future look like? Would it be possible > to put all objects in buckets into RADOS namespaces and

Re: [ceph-users] details about cloning objects using librados

2019-08-01 Thread Gregory Farnum
> > Thanks, > Muthu > > On Wed, Jul 31, 2019 at 11:13 PM Gregory Farnum wrote: >> >> >> >> On Wed, Jul 31, 2019 at 1:32 AM nokia ceph wrote: >>> >>> Hi Greg, >>> >>> We were trying to implement this however having issue

Re: [ceph-users] details about cloning objects using librados

2019-07-31 Thread Gregory Farnum
> Thank you Greg, we will try this out. >> >> Thanks, >> Muthu >> >> On Wed, Jul 3, 2019 at 11:12 PM Gregory Farnum >> wrote: >> >>> Well, the RADOS interface doesn't have a great deal of documentation >>> so I don't know if I can point y

Re: [ceph-users] OSD's won't start - thread abort

2019-07-05 Thread Gregory Farnum
On Wed, Jul 3, 2019 at 11:09 AM Austin Workman wrote: > Decided that if all the data was going to move, I should adjust my jerasure > ec profile from k=4, m=1 -> k=5, m=1 with force(is this even recommended vs. > just creating new pools???) > > Initially it unset crush-device-class=hdd to be

Re: [ceph-users] details about cloning objects using librados

2019-07-03 Thread Gregory Farnum
Hi Greg, > > Can you please share the api details for COPY_FROM or any reference document? > > Thanks , > Muthu > > On Wed, Jul 3, 2019 at 4:12 AM Brad Hubbard wrote: >> >> On Wed, Jul 3, 2019 at 4:25 AM Gregory Farnum wrote: >> > >> > I'm not su

Re: [ceph-users] details about cloning objects using librados

2019-07-02 Thread Gregory Farnum
I'm not sure how or why you'd get an object class involved in doing this in the normal course of affairs. There's a copy_from op that a client can send and which copies an object from another OSD into the target object. That's probably the primitive you want to build on. Note that the OSD doesn't

Re: [ceph-users] Migrating a cephfs data pool

2019-07-01 Thread Gregory Farnum
On Fri, Jun 28, 2019 at 5:41 PM Jorge Garcia wrote: > > Ok, actually, the problem was somebody writing to the filesystem. So I moved > their files and got to 0 objects. But then I tried to remove the original > data pool and got an error: > > # ceph fs rm_data_pool cephfs cephfs-data >

Re: [ceph-users] How does monitor know OSD is dead?

2019-07-01 Thread Gregory Farnum
On Sat, Jun 29, 2019 at 8:13 PM Bryan Henderson wrote: > > > I'm not sure why the monitor did not mark it _out_ after 600 seconds > > (default) > > Well, that part I understand. The monitor didn't mark the OSD out because the > monitor still considered the OSD up. No reason to mark an up OSD

Re: [ceph-users] OSDs taking a long time to boot due to 'clear_temp_objects', even with fresh PGs

2019-06-26 Thread Gregory Farnum
p [1], and > I can provide other logs/files if anyone thinks they could be useful. > > Cheers, > Tom > > [1] ceph-post-file: 1829bf40-cce1-4f65-8b35-384935d11446 > > -Original Message- > From: Gregory Farnum > Sent: 24 June 2019 17:30 > To: Byrne, Thomas (STFC

Re: [ceph-users] OSDs taking a long time to boot due to 'clear_temp_objects', even with fresh PGs

2019-06-24 Thread Gregory Farnum
On Mon, Jun 24, 2019 at 9:06 AM Thomas Byrne - UKRI STFC wrote: > > Hi all, > > > > Some bluestore OSDs in our Luminous test cluster have started becoming > unresponsive and booting very slowly. > > > > These OSDs have been used for stress testing for hardware destined for our > production

Re: [ceph-users] Monitor stuck at "probing"

2019-06-20 Thread Gregory Farnum
Just nuke the monitor's store, remove it from the existing quorum, and start over again. Injecting maps correctly is non-trivial and obviously something went wrong, and re-syncing a monitor is pretty cheap. On Thu, Jun 20, 2019 at 6:46 AM ☣Adam wrote: > Anyone have any suggestions for how to
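
For reference, a rough sketch of that re-sync, assuming a broken monitor named mon.c, default paths, and a systemd deployment (adjust names to your setup):

  systemctl stop ceph-mon@c                      # stop the broken monitor
  ceph mon remove c                              # drop it from the monmap / quorum
  rm -rf /var/lib/ceph/mon/ceph-c                # nuke its store
  ceph mon getmap -o /tmp/monmap                 # fetch the current monmap from the healthy quorum
  ceph auth get mon. -o /tmp/mon.keyring         # fetch the mon. keyring
  ceph-mon -i c --mkfs --monmap /tmp/monmap --keyring /tmp/mon.keyring
  chown -R ceph:ceph /var/lib/ceph/mon/ceph-c
  systemctl start ceph-mon@c                     # it rejoins and syncs from its peers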

Re: [ceph-users] How does cephfs ensure client cache consistency?

2019-06-18 Thread Gregory Farnum
On Tue, Jun 18, 2019 at 2:26 AM ?? ?? wrote: > > Thank you very much! Can you point out where is the code of revoke? The caps code is all over the code base as it's fundamental to the filesystem's workings. You can get some more general background in my recent Cephalocon talk "What are “caps”?

Re: [ceph-users] balancer module makes OSD distribution worse

2019-06-05 Thread Gregory Farnum
I think the mimic balancer doesn't include omap data when trying to balance the cluster. (Because it doesn't get usable omap stats from the cluster anyway; in Nautilus I think it does.) Are you using RGW or CephFS? -Greg On Wed, Jun 5, 2019 at 1:01 PM Josh Haft wrote: > > Hi everyone, > > On my

Re: [ceph-users] PG scrub stamps reset to 0.000000 in 14.2.1

2019-06-05 Thread Gregory Farnum
On Wed, Jun 5, 2019 at 10:10 AM Jonas Jelten wrote: > > Hi! > > I'm also affected by this: > > HEALTH_WARN 13 pgs not deep-scrubbed in time; 13 pgs not scrubbed in time > PG_NOT_DEEP_SCRUBBED 13 pgs not deep-scrubbed in time > pg 6.b1 not deep-scrubbed since 0.00 > pg 7.ac not

Re: [ceph-users] Balancer: uneven OSDs

2019-05-29 Thread Gregory Farnum
These OSDs are far too small at only 10GiB for the balancer to try and do any work. It's not uncommon for metadata like OSDMaps to exceed that size in error states and in any real deployment a single PG will be at least that large. There are probably parameters you can tweak to try and make it

Re: [ceph-users] inconsistent number of pools

2019-05-28 Thread Gregory Farnum
You’re the second report I’ve seen of this, and while it’s confusing, you should be able to resolve it by restarting your active manager daemon. On Sun, May 26, 2019 at 11:52 PM Lars Täuber wrote: > Fri, 24 May 2019 21:41:33 +0200 > Michel Raabe ==> Lars Täuber , > ceph-users@lists.ceph.com :

Re: [ceph-users] ceph -s finds 4 pools but ceph osd lspools says no pool which is the expected answer

2019-05-15 Thread Gregory Farnum
On Tue, May 14, 2019 at 11:03 AM Rainer Krienke wrote: > > Hello, > > for a fresh setup ceph cluster I see a strange difference in the number > of existing pools in the output of ceph -s and what I know that should > be there: no pools at all. > > I set up a fresh Nautilus cluster with 144 OSDs

Re: [ceph-users] Prioritized pool recovery

2019-05-08 Thread Gregory Farnum
On Mon, May 6, 2019 at 6:41 PM Kyle Brantley wrote: > > On 5/6/2019 6:37 PM, Gregory Farnum wrote: > > Hmm, I didn't know we had this functionality before. It looks to be > > changing quite a lot at the moment, so be aware this will likely > > require reconfiguring

Re: [ceph-users] ceph mimic and samba vfs_ceph

2019-05-08 Thread Gregory Farnum
On Wed, May 8, 2019 at 10:05 AM Ansgar Jazdzewski wrote: > > hi folks, > > we try to build a new NAS using the vfs_ceph modul from samba 4.9. > > if i try to open the share i recive the error: > > May 8 06:58:44 nas01 smbd[375700]: 2019-05-08 06:58:44.732830 > 7ff3d5f6e700 0 --

Re: [ceph-users] Nautilus: significant increase in cephfs metadata pool usage

2019-05-08 Thread Gregory Farnum
On Wed, May 8, 2019 at 5:33 AM Dietmar Rieder wrote: > > On 5/8/19 1:55 PM, Paul Emmerich wrote: > > Nautilus properly accounts metadata usage, so nothing changed it just > > shows up correctly now ;) > > OK, but then I'm not sure I understand why the increase was not sudden > (with the update)

Re: [ceph-users] Data moved pools but didn't move osds & backfilling+remapped loop

2019-05-08 Thread Gregory Farnum
On Wed, May 8, 2019 at 2:37 AM Marco Stuurman wrote: > > Hi, > > I've got an issue with the data in our pool. A RBD image containing 4TB+ data > has moved over to a different pool after a crush rule set change, which > should not be possible. Besides that it loops over and over to start >

Re: [ceph-users] Read-only CephFs on a k8s cluster

2019-05-07 Thread Gregory Farnum
On Tue, May 7, 2019 at 6:54 AM Ignat Zapolsky wrote: > > Hi, > > > > We are looking at how to troubleshoot an issue with Ceph FS on k8s cluster. > > > > This filesystem is provisioned via rook 0.9.2 and have following behavior: > > If ceph fs is mounted on K8S master, then it is writeable > If

Re: [ceph-users] Prioritized pool recovery

2019-05-06 Thread Gregory Farnum
Hmm, I didn't know we had this functionality before. It looks to be changing quite a lot at the moment, so be aware this will likely require reconfiguring later. On Sun, May 5, 2019 at 10:40 AM Kyle Brantley wrote: > > I've been running luminous / ceph-12.2.11-0.el7.x86_64 on CentOS 7 for about

Re: [ceph-users] CRUSH rule device classes mystery

2019-05-06 Thread Gregory Farnum
What's the output of "ceph -s" and "ceph osd tree"? On Fri, May 3, 2019 at 8:58 AM Stefan Kooman wrote: > > Hi List, > > I'm playing around with CRUSH rules and device classes and I'm puzzled > if it's working correctly. Platform specifics: Ubuntu Bionic with Ceph 14.2.1 > > I created two new

Re: [ceph-users] Nautilus (14.2.0) OSDs crashing at startup after removing a pool containing a PG with an unrepairable error

2019-05-06 Thread Gregory Farnum
::react_impl(boost::state > > chart::event_base const&, void const*)+0x16a) [0x55766fe355ca] > > 11: > > (boost::statechart::state_machine > PG::RecoveryState::Initial, std::allocator, > > boost::statechart::null_exception_translator>::process_event(b > > oost::sta

Re: [ceph-users] Cephfs on an EC Pool - What determines object size

2019-04-29 Thread Gregory Farnum
Yes, check out the file layout options: http://docs.ceph.com/docs/master/cephfs/file-layouts/ On Mon, Apr 29, 2019 at 3:32 PM Daniel Williams wrote: > > Is the 4MB configurable? > > On Mon, Apr 29, 2019 at 4:36 PM Gregory Farnum wrote: >> >> CephFS automatically chunks
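
For example (paths are illustrative), the object size is one of the layout xattrs and only affects files created after the change:

  getfattr -n ceph.file.layout /mnt/cephfs/somefile                       # inspect an existing file's layout
  setfattr -n ceph.dir.layout.object_size -v 8388608 /mnt/cephfs/mydir    # 8 MiB objects for new files under this dir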

Re: [ceph-users] Nautilus (14.2.0) OSDs crashing at startup after removing a pool containing a PG with an unrepairable error

2019-04-29 Thread Gregory Farnum
; 6: (PG::RecoveryState::Deleting::react(PG::DeleteSome const&)+0x38) >>> [0x7c1476e528] >>> 7: (boost::statechart::simple_state>> PG::RecoveryState::ToDelete, boost::mpl::list>> mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, >>>

Re: [ceph-users] Cephfs on an EC Pool - What determines object size

2019-04-29 Thread Gregory Farnum
CephFS automatically chunks objects into 4MB objects by default. For an EC pool, RADOS internally will further subdivide them based on the erasure code and striping strategy, with a layout that can vary. But by default if you have eg an 8+3 EC code, you'll end up with a bunch of (4MB/8=)512KB

Re: [ceph-users] Nautilus (14.2.0) OSDs crashing at startup after removing a pool containing a PG with an unrepairable error

2019-04-26 Thread Gregory Farnum
You'll probably want to generate a log with "debug osd = 20" and "debug bluestore = 20", then share that or upload it with ceph-post-file, to get more useful info about which PGs are breaking (is it actually the ones that were supposed to delete?). If there's a particular set of PGs you need to
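
A sketch of gathering that, assuming osd.12 is one of the crashing OSDs and default log paths:

  # /etc/ceph/ceph.conf on the OSD host:
  #   [osd]
  #       debug osd = 20
  #       debug bluestore = 20
  systemctl restart ceph-osd@12                    # reproduce the crash with verbose logging
  ceph-post-file /var/log/ceph/ceph-osd.12.log     # upload the log for the developers to look at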

Re: [ceph-users] PG stuck peering - OSD cephx: verify_authorizer key problem

2019-04-26 Thread Gregory Farnum
On Fri, Apr 26, 2019 at 10:55 AM Jan Pekař - Imatic wrote: > > Hi, > > yesterday my cluster reported slow request for minutes and after restarting > OSDs (reporting slow requests) it stuck with peering PGs. Whole > cluster was not responding and IO stopped. > > I also notice, that problem was

Re: [ceph-users] Were fixed CephFS lock ups when it's running on nodes with OSDs?

2019-04-22 Thread Gregory Farnum
On Sat, Apr 20, 2019 at 9:29 AM Igor Podlesny wrote: > > I remember seeing reports in regards but it's being a while now. > Can anyone tell? No, this hasn't changed. It's unlikely it ever will; I think NFS resolved the issue but it took a lot of ridiculous workarounds and imposes a permanent

Re: [ceph-users] Ceph Object storage for physically separating tenants storage infrastructure

2019-04-15 Thread Gregory Farnum
t incorrect? I think they will come in and you're correct, but I haven't worked with RGW in years so it's a bit out of my wheelhouse. -Greg > > -- > Regards, > Varun Singh > > On Sat, Apr 13, 2019 at 12:50 AM Gregory Farnum wrote: > > > > Yes, you would do this by

Re: [ceph-users] Default Pools

2019-04-15 Thread Gregory Farnum
On Mon, Apr 15, 2019 at 1:52 PM Brent Kennedy wrote: > > I was looking around the web for the reason for some of the default pools in > Ceph and I cant find anything concrete. Here is our list, some show no use > at all. Can any of these be deleted ( or is there an article my googlefu >

Re: [ceph-users] Ceph Object storage for physically separating tenants storage infrastructure

2019-04-12 Thread Gregory Farnum
Yes, you would do this by setting up separate data pools for segregated clients, giving those pools a CRUSH rule placing them on their own servers, and if using S3 assigning the clients to them using either wholly separate instances or perhaps separate zones and the S3 placement options. -Greg On
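
A rough sketch of that layout with a dedicated CRUSH root per tenant (the bucket, host, rule, and pool names below are made up):

  ceph osd crush add-bucket tenant-a root                              # separate CRUSH root for this tenant
  ceph osd crush move host-a1 root=tenant-a                            # move the tenant's hosts under it
  ceph osd crush move host-a2 root=tenant-a
  ceph osd crush rule create-replicated tenant-a-rule tenant-a host    # rule drawing only from that root
  ceph osd pool create tenant-a.buckets.data 128 128 replicated tenant-a-rule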

Re: [ceph-users] Inconsistent PGs caused by omap_digest mismatch

2019-04-08 Thread Gregory Farnum
On Mon, Apr 8, 2019 at 3:19 PM Bryan Stillwell wrote: > > We have two separate RGW clusters running Luminous (12.2.8) that have started > seeing an increase in PGs going active+clean+inconsistent with the reason > being caused by an omap_digest mismatch. Both clusters are using FileStore >

Re: [ceph-users] CephFS and many small files

2019-04-04 Thread Gregory Farnum
On Mon, Apr 1, 2019 at 4:04 AM Paul Emmerich wrote: > > There are no problems with mixed bluestore_min_alloc_size; that's an > abstraction layer lower than the concept of multiple OSDs. (Also, you > always have that when mixing SSDs and HDDs) > > I'm not sure about the real-world impacts of a

Re: [ceph-users] Wrong certificate delivered on https://ceph.io/

2019-04-04 Thread Gregory Farnum
I believe our community manager Mike is in charge of that? On Wed, Apr 3, 2019 at 6:49 AM Raphaël Enrici wrote: > > Dear all, > > is there somebody in charge of the ceph hosting here, or someone who > knows the guy who knows another guy who may know... > > Saw this while reading the FOSDEM 2019

Re: [ceph-users] Disable cephx with centralized configs

2019-04-04 Thread Gregory Farnum
I think this got dealt with on irc, but for those following along at home: I think the problem here is that you've set the central config to disable authentication, but the client doesn't know what those config options look like until it's connected — which it can't do, because it's demanding
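
In other words, the auth options have to be available locally before the client can even reach the mons; a minimal sketch of the client-side ceph.conf, assuming you really do want cephx off cluster-wide:

  # /etc/ceph/ceph.conf (on the client as well as the daemons)
  [global]
  auth_cluster_required = none
  auth_service_required = none
  auth_client_required = none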

Re: [ceph-users] CephFS: effects of using hard links

2019-03-22 Thread Gregory Farnum
On Thu, Mar 21, 2019 at 2:45 PM Dan van der Ster wrote: > > On Thu, Mar 21, 2019 at 8:51 AM Gregory Farnum wrote: > > > > On Wed, Mar 20, 2019 at 6:06 PM Dan van der Ster > > wrote: > >> > >> On Tue, Mar 19, 2019 at 9:43 AM Er

Re: [ceph-users] CephFS: effects of using hard links

2019-03-21 Thread Gregory Farnum
On Wed, Mar 20, 2019 at 6:06 PM Dan van der Ster wrote: > On Tue, Mar 19, 2019 at 9:43 AM Erwin Bogaard > wrote: > > > > Hi, > > > > > > > > For a number of application we use, there is a lot of file duplication. > This wastes precious storage space, which I would like to avoid. > > > > When

Re: [ceph-users] CephFS: effects of using hard links

2019-03-19 Thread Gregory Farnum
On Tue, Mar 19, 2019 at 2:13 PM Erwin Bogaard wrote: > Hi, > > > > For a number of application we use, there is a lot of file duplication. > This wastes precious storage space, which I would like to avoid. > > When using a local disk, I can use a hard link to let all duplicate files > point to

Re: [ceph-users] CephFS - large omap object

2019-03-18 Thread Gregory Farnum
On Mon, Mar 18, 2019 at 7:28 PM Yan, Zheng wrote: > > On Mon, Mar 18, 2019 at 9:50 PM Dylan McCulloch wrote: > > > > > > >please run following command. It will show where is 4. > > > > > >rados -p -p hpcfs_metadata getxattr 4. parent >/tmp/parent > > >ceph-dencoder import

Re: [ceph-users] Running ceph status as non-root user?

2019-03-15 Thread Gregory Farnum
You will either need access to a ceph.conf, or else have some way to pass in on the CLI: * monitor IP addresses * a client ID * a client key (or keyring file) Your ceph.conf doesn't strictly need to be the same one used for other things on the cluster, so you could assemble it yourself. Same goes
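
For example (monitor address, user name, and keyring path are placeholders), a status-only user that works without the cluster's ceph.conf:

  ceph auth get-or-create client.status mon 'allow r' -o ./client.status.keyring    # run once with admin rights
  ceph -s -m 192.0.2.10:6789 --id status --keyring ./client.status.keyring          # then any user who can read the keyring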

Re: [ceph-users] MDS segfaults on client connection -- brand new FS

2019-03-08 Thread Gregory Farnum
I don’t have any idea what’s going on here or why it’s not working, but you are using v0.94.7. That release is: 1) out of date for the Hammer cycle, which reached at least .94.10 2) prior to the release where we declared CephFS stable (Jewel, v10.2.0) 3) way past its supported expiration date.

Re: [ceph-users] garbage in cephfs pool

2019-03-07 Thread Gregory Farnum
Are they getting cleaned up? CephFS does not instantly delete files; they go into a "purge queue" and get cleaned up later by the MDS. -Greg On Thu, Mar 7, 2019 at 2:00 AM Fyodor Ustinov wrote: > Hi! > > After removing all files from cephfs I see that situation: > #ceph df > POOLS: > NAME
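
You can watch the purge queue drain via the MDS admin socket; a sketch, assuming the MDS is named mds.a (the counter section name may differ slightly by release):

  ceph daemon mds.a perf dump purge_queue     # run on the MDS host; shows queued vs. executed purges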

Re: [ceph-users] Can CephFS Kernel Client Not Read & Write at the Same Time?

2019-03-07 Thread Gregory Farnum
In general, no, this is not an expected behavior. My guess would be that something odd is happening with the other clients you have to the system, and there's a weird pattern with the way the file locks are being issued. Can you be more precise about exactly what workload you're running, and get

Re: [ceph-users] luminous 12.2.11 on debian 9 requires nscd?

2019-02-27 Thread Gregory Farnum
This is probably a build issue of some kind, but I'm not quite sure how... The MDS (and all the Ceph code) is just invoking the getgrnam_r function, which is part of POSIX and implemented by glibc (or whatever other libc). So any dependency on nscd is being created "behind our backs" somewhere.

Re: [ceph-users] osd exit common/Thread.cc: 160: FAILED assert(ret == 0)--10.2.10

2019-02-27 Thread Gregory Farnum
The OSD tried to create a new thread, and the kernel told it no. You probably need to turn up the limits on threads and/or file descriptors. -Greg On Wed, Feb 27, 2019 at 2:36 AM hnuzhoulin2 wrote: > Hi, guys > > So far, there have been 10 osd service exit because of this error. > the error
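
The limits involved are typically the kernel-wide thread/pid caps and the per-service limits on the OSD units; a sketch with illustrative values (persist the sysctls under /etc/sysctl.d/ on systemd hosts):

  sysctl -w kernel.pid_max=4194304
  sysctl -w kernel.threads-max=4194304
  sysctl -w vm.max_map_count=524288
  systemctl show ceph-osd@0 -p LimitNOFILE -p TasksMax    # check what the OSD units are currently allowed
  # raise them with a drop-in: systemctl edit ceph-osd@.service (e.g. LimitNOFILE=1048576, TasksMax=infinity)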

Re: [ceph-users] Ceph Nautilus Release T-shirt Design

2019-02-15 Thread Gregory Farnum
On Fri, Feb 15, 2019 at 1:39 AM Ilya Dryomov wrote: > On Fri, Feb 15, 2019 at 12:05 AM Mike Perez wrote: > > > > Hi Marc, > > > > You can see previous designs on the Ceph store: > > > > https://www.proforma.com/sdscommunitystore > > Hi Mike, > > This site stopped working during DevConf and

Re: [ceph-users] jewel10.2.11 EC pool out a osd, its PGs remap to the osds in the same host

2019-02-15 Thread Gregory Farnum
Actually I think I misread what this was doing, sorry. Can you do a “ceph osd tree”? It’s hard to see the structure via the text dumps. On Wed, Feb 13, 2019 at 10:49 AM Gregory Farnum wrote: > Your CRUSH rule for EC pools is forcing that behavior with the line > > step chooseleaf ind

Re: [ceph-users] RBD image format v1 EOL ...

2019-02-13 Thread Gregory Farnum
On Wed, Feb 13, 2019 at 10:37 AM Jason Dillaman wrote: > > For the future Ceph Octopus release, I would like to remove all > remaining support for RBD image format v1 images baring any > substantial pushback. > > The image format for new images has been defaulted to the v2 image > format since

Re: [ceph-users] jewel10.2.11 EC pool out a osd, its PGs remap to the osds in the same host

2019-02-13 Thread Gregory Farnum
Your CRUSH rule for EC pools is forcing that behavior with the line "step chooseleaf indep 1 type ctnr". If you want different behavior, you’ll need a different crush rule. On Tue, Feb 12, 2019 at 5:18 PM hnuzhoulin2 wrote: > Hi, cephers > > > I am building a ceph EC cluster.when a disk is
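
For illustration, one host-aware variant of such a rule, keeping the custom ctnr bucket type but picking distinct hosts first (this assumes you have at least k+m hosts, and the change will move data):

  ceph osd getcrushmap -o crushmap.bin
  crushtool -d crushmap.bin -o crushmap.txt
  # edit the EC rule's steps to something like:
  #   step take default
  #   step choose indep 0 type host
  #   step chooseleaf indep 1 type ctnr
  #   step emit
  crushtool -c crushmap.txt -o crushmap.new
  ceph osd setcrushmap -i crushmap.new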

Re: [ceph-users] CephFS overwrite/truncate performance hit

2019-02-12 Thread Gregory Farnum
On Tue, Feb 12, 2019 at 5:10 AM Hector Martin wrote: > On 12/02/2019 06:01, Gregory Farnum wrote: > > Right. Truncates and renames require sending messages to the MDS, and > > the MDS committing to RADOS (aka its disk) the change in status, before > > they can be complete

Re: [ceph-users] CephFS overwrite/truncate performance hit

2019-02-11 Thread Gregory Farnum
On Thu, Feb 7, 2019 at 3:31 AM Hector Martin wrote: > On 07/02/2019 19:47, Marc Roos wrote: > > > > Is this difference not related to chaching? And you filling up some > > cache/queue at some point? If you do a sync after each write, do you > > have still the same results? > > No, the slow

Re: [ceph-users] faster switch to another mds

2019-02-11 Thread Gregory Farnum
You can't tell from the client log here, but probably the MDS itself was failing over to a new instance during that interval. There's not much experience with it, but you could experiment with faster failover by reducing the mds beacon and grace times. This may or may not work reliably... On Sat,
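
The knobs in question, if you want to experiment (values are just examples and Mimic+ central config is assumed; set them in ceph.conf on older releases, and note that going too low risks spurious failovers):

  ceph config set global mds_beacon_interval 2
  ceph config set global mds_beacon_grace 10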

Re: [ceph-users] CephFS MDS journal

2019-02-04 Thread Gregory Farnum
On Mon, Feb 4, 2019 at 8:03 AM Mahmoud Ismail wrote: > On Mon, Feb 4, 2019 at 4:35 PM Gregory Farnum wrote: > >> >> >> On Mon, Feb 4, 2019 at 7:32 AM Mahmoud Ismail < >> mahmoudahmedism...@gmail.com> wrote: >> >>> On Mon, Feb 4, 2019 at 4:16 P

Re: [ceph-users] CephFS MDS journal

2019-02-04 Thread Gregory Farnum
On Mon, Feb 4, 2019 at 7:32 AM Mahmoud Ismail wrote: > On Mon, Feb 4, 2019 at 4:16 PM Gregory Farnum wrote: > >> On Fri, Feb 1, 2019 at 2:29 AM Mahmoud Ismail < >> mahmoudahmedism...@gmail.com> wrote: >> >>> Hello, >>> >>> I'm a bit co

Re: [ceph-users] CephFS MDS journal

2019-02-04 Thread Gregory Farnum
On Fri, Feb 1, 2019 at 2:29 AM Mahmoud Ismail wrote: > Hello, > > I'm a bit confused about how the journaling actually works in the MDS. > > I was reading about these two configuration parameters (journal write head > interval) and (mds early reply). Does the MDS flush the journal >

Re: [ceph-users] Encryption questions

2019-01-24 Thread Gregory Farnum
On Fri, Jan 11, 2019 at 11:24 AM Sergio A. de Carvalho Jr. < scarvalh...@gmail.com> wrote: > Thanks for the answers, guys! > > Am I right to assume msgr2 (http://docs.ceph.com/docs/mimic/dev/msgr2/) > will provide encryption between Ceph daemons as well as between clients and > daemons? > > Does

Re: [ceph-users] krbd reboot hung

2019-01-24 Thread Gregory Farnum
Looks like your network deactivated before the rbd volume was unmounted. This is a known issue without a good programmatic workaround and you’ll need to adjust your configuration. On Tue, Jan 22, 2019 at 9:17 AM Gao, Wenjun wrote: > I’m using krbd to map a rbd device to a VM, it appears when the
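
One way to express that ordering for an fstab-managed krbd mount (device path and mountpoint are examples): mark it as network-dependent so systemd unmounts it before taking the network down, and let the rbdmap unit handle map/unmap:

  # /etc/fstab
  /dev/rbd/rbd/myimage  /mnt/myimage  xfs  defaults,noatime,_netdev  0 0
  # with the image listed in /etc/ceph/rbdmap and mapped/unmapped at boot/shutdown by:
  systemctl enable rbdmap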

Re: [ceph-users] backfill_toofull while OSDs are not full

2019-01-24 Thread Gregory Farnum
This doesn’t look familiar to me. Is the cluster still doing recovery so we can at least expect them to make progress when the “out” OSDs get removed from the set? On Tue, Jan 22, 2019 at 2:44 PM Wido den Hollander wrote: > Hi, > > I've got a couple of PGs which are stuck in backfill_toofull,

Re: [ceph-users] MDS performance issue

2019-01-21 Thread Gregory Farnum
On Mon, Jan 21, 2019 at 12:52 AM Yan, Zheng wrote: > On Mon, Jan 21, 2019 at 12:12 PM Albert Yue > wrote: > > > > Hi Yan Zheng, > > > > 1. mds cache limit is set to 64GB > > 2. we get the size of meta data pool by running `ceph df` and saw meta > data pool just used 200MB space. > > > > That's

Re: [ceph-users] Offsite replication scenario

2019-01-14 Thread Gregory Farnum
On Fri, Jan 11, 2019 at 10:07 PM Brian Topping wrote: > Hi all, > > I have a simple two-node Ceph cluster that I’m comfortable with the care > and feeding of. Both nodes are in a single rack and captured in the > attached dump, it has two nodes, only one mon, all pools size 2. Due to > physical

Re: [ceph-users] ceph health JSON format has changed

2019-01-08 Thread Gregory Farnum
On Fri, Jan 4, 2019 at 1:19 PM Jan Kasprzak wrote: > > Gregory Farnum wrote: > : On Wed, Jan 2, 2019 at 5:12 AM Jan Kasprzak wrote: > : > : > Thomas Byrne - UKRI STFC wrote: > : > : I recently spent some time looking at this, I believe the 'summary' and > : > : 'ov

Re: [ceph-users] Questions re mon_osd_cache_size increase

2019-01-07 Thread Gregory Farnum
The osd_map_cache_size controls the OSD’s cache of maps; the change in 13.2.3 is to the default for the monitors’. On Mon, Jan 7, 2019 at 8:24 AM Anthony D'Atri wrote: > > > > * The default memory utilization for the mons has been increased > > somewhat. Rocksdb now uses 512 MB of RAM by

Re: [ceph-users] Ceph blog RSS/Atom URL?

2019-01-04 Thread Gregory Farnum
Yeah I think it’s a “planet”-style feed that incorporates some other blogs. I don’t think it’s been maintained much since being launched though. On Fri, Jan 4, 2019 at 1:21 PM Jan Kasprzak wrote: > Gregory Farnum wrote: > : It looks like ceph.com/feed is the RSS url? > >

Re: [ceph-users] ceph health JSON format has changed

2019-01-04 Thread Gregory Farnum
On Wed, Jan 2, 2019 at 5:12 AM Jan Kasprzak wrote: > Thomas Byrne - UKRI STFC wrote: > : I recently spent some time looking at this, I believe the 'summary' and > : 'overall_status' sections are now deprecated. The 'status' and 'checks' > : fields are the ones to use now. > > OK, thanks.

Re: [ceph-users] ceph-mgr fails to restart after upgrade to mimic

2019-01-04 Thread Gregory Farnum
You can also get more data by checking what the monitor logs for that manager on the connect attempt (if you turn up its debug mon or debug ms settings). If one of your managers is behaving, I'd examine its configuration file and compare to the others. For instance, that "Invalid argument" might

Re: [ceph-users] Mimic 13.2.3?

2019-01-04 Thread Gregory Farnum
Regarding 13.2.3 specifically: As Abhishek says, there are no known issues in the release. It went through our full and proper release validation; nobody has spotted any last-minute bugs. The release notes are available in the git repository:

Re: [ceph-users] Ceph blog RSS/Atom URL?

2019-01-04 Thread Gregory Farnum
It looks like ceph.com/feed is the RSS url? On Fri, Jan 4, 2019 at 5:52 AM Jan Kasprzak wrote: > Hello, > > is there any RSS or Atom source for Ceph blog? I have looked inside > the https://ceph.com/community/blog/ HTML source, but there is no > or anything mentioning RSS or Atom. > >

Re: [ceph-users] size of inc_osdmap vs osdmap

2019-01-02 Thread Gregory Farnum
y be most helpful is if you can dump out one of those over-large incremental osdmaps and see what's using up all the space. (You may be able to do it through the normal Ceph CLI by querying the monitor? Otherwise if it's something very weird you may need to get the ceph-dencoder tool and look at it

Re: [ceph-users] cephfs file block size: must it be so big?

2018-12-21 Thread Gregory Farnum
On Fri, Dec 14, 2018 at 6:44 PM Bryan Henderson wrote: > > Going back through the logs though it looks like the main reason we do a > > 4MiB block size is so that we have a chance of reporting actual cluster > > sizes to 32-bit systems, > > I believe you're talking about a different block size

Re: [ceph-users] why libcephfs API use "struct ceph_statx" instead of "struct stat"

2018-12-20 Thread Gregory Farnum
CephFS is prepared for the statx interface that doesn’t necessarily fill in every member of the stat structure, and allows you to make requests for only certain pieces of information. The purpose is so that the client and MDS can take less expensive actions than are required to satisfy a full

Re: [ceph-users] RBD snapshot atomicity guarantees?

2018-12-20 Thread Gregory Farnum
On Tue, Dec 18, 2018 at 1:11 AM Hector Martin wrote: > Hi list, > > I'm running libvirt qemu guests on RBD, and currently taking backups by > issuing a domfsfreeze, taking a snapshot, and then issuing a domfsthaw. > This seems to be a common approach. > > This is safe, but it's impactful: the

Re: [ceph-users] cephfs file block size: must it be so big?

2018-12-13 Thread Gregory Farnum
On Thu, Dec 13, 2018 at 3:31 PM Bryan Henderson wrote: > I've searched the ceph-users archives and found no discussion to speak of > of > Cephfs block sizes, and I wonder how much people have thought about it. > > The POSIX 'stat' system call reports for each file a block size, which is >

Re: [ceph-users] size of inc_osdmap vs osdmap

2018-12-12 Thread Gregory Farnum
increased by 0.01 so we get 5 new pg_temp in osdmap.1357883 but size > inc_osdmap so huge > > чт, 6 дек. 2018 г. в 06:20, Gregory Farnum : > > > > On Wed, Dec 5, 2018 at 3:32 PM Sergey Dolgov wrote: > >> > >> Hi guys > >> > >> I faced strange b

Re: [ceph-users] ERR scrub mismatch

2018-12-06 Thread Gregory Farnum
Well, it looks like you have different data in the MDSMap across your monitors. That's not good on its face, but maybe there are extenuating circumstances. Do you actually use CephFS, or just RBD/RGW? What's the full output of "ceph -s"? -Greg On Thu, Dec 6, 2018 at 1:39 PM Marco Aroldi wrote: >

Re: [ceph-users] size of inc_osdmap vs osdmap

2018-12-05 Thread Gregory Farnum
On Wed, Dec 5, 2018 at 3:32 PM Sergey Dolgov wrote: > Hi guys > > I faced strange behavior of crushmap change. When I change crush > weight osd I sometimes get increment osdmap(1.2MB) which size is > significantly bigger than size of osdmap(0.4MB) > This is probably because when CRUSH changes,

Re: [ceph-users] [cephfs] Kernel outage / timeout

2018-12-04 Thread Gregory Farnum
Yes, this is exactly it with the "reconnect denied". -Greg On Tue, Dec 4, 2018 at 3:00 AM NingLi wrote: > > Hi,maybe this reference can help you > > > http://docs.ceph.com/docs/master/cephfs/troubleshooting/#disconnected-remounted-fs > > > > On Dec 4, 2018, at 18:55, c...@jack.fr.eu.org wrote:

Re: [ceph-users] Customized Crush location hooks in Mimic

2018-11-30 Thread Gregory Farnum
I’m pretty sure the monitor command there won’t move intermediate buckets like the host. This is so if an osd has incomplete metadata it doesn’t inadvertently move 11 other OSDs into a different rack/row/whatever. So in this case, it finds the host osd0001 and matches it, but since the crush map
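
If the host really should move along with its reported location, one option is to move the host bucket explicitly (the rack name here is made up):

  ceph osd crush move osd0001 rack=rack2     # moves the host bucket and every OSD under it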

Re: [ceph-users] What could cause mon_osd_full_ratio to be exceeded?

2018-11-26 Thread Gregory Farnum
On Mon, Nov 26, 2018 at 10:28 AM Vladimir Brik wrote: > > Hello > > I am doing some Ceph testing on a near-full cluster, and I noticed that, > after I brought down a node, some OSDs' utilization reached > osd_failsafe_full_ratio (97%). Why didn't it stop at mon_osd_full_ratio > (90%) if

Re: [ceph-users] read performance, separate client CRUSH maps or limit osd read access from each client

2018-11-26 Thread Gregory Farnum
On Tue, Nov 20, 2018 at 9:50 PM Vlad Kopylov wrote: > I see the point, but not for the read case: > no overhead for just choosing or let Mount option choose read replica. > > This is simple feature that can be implemented, that will save many > people bandwidth in really distributed cases. >

Re: [ceph-users] will crush rule be used during object relocation in OSD failure ?

2018-11-26 Thread Gregory Farnum
On Fri, Nov 23, 2018 at 11:01 AM ST Wong (ITSC) wrote: > Hi all, > > > We've 8 osd hosts, 4 in room 1 and 4 in room2. > > A pool with size = 3 using following crush map is created, to cater for > room failure. > > > rule multiroom { > id 0 > type replicated > min_size 2 >

Re: [ceph-users] Monitor disks for SSD only cluster

2018-11-26 Thread Gregory Farnum
As the monitors limit their transaction rates, I would tend toward the higher-durability drives. I don't think any monitor throughput issues have been reported on clusters with SSDs for storage. -Greg On Mon, Nov 26, 2018 at 5:47 AM Valmar Kuristik wrote: > Hello, > > Can anyone say how important

Re: [ceph-users] No recovery when "norebalance" flag set

2018-11-26 Thread Gregory Farnum
On Sun, Nov 25, 2018 at 2:41 PM Stefan Kooman wrote: > Hi list, > > During cluster expansion (adding extra disks to existing hosts) some > OSDs failed (FAILED assert(0 == "unexpected error", _txc_add_transaction > error (39) Directory not empty not handled on operation 21 (op 1, > counting from

Re: [ceph-users] Degraded objects afte: ceph osd in $osd

2018-11-26 Thread Gregory Farnum
On Mon, Nov 26, 2018 at 3:30 AM Janne Johansson wrote: > Den sön 25 nov. 2018 kl 22:10 skrev Stefan Kooman : > > > > Hi List, > > > > Another interesting and unexpected thing we observed during cluster > > expansion is the following. After we added extra disks to the cluster, > > while

Re: [ceph-users] bucket indices: ssd-only or is a large fast block.db sufficient?

2018-11-20 Thread Gregory Farnum
Looks like you’ve considered the essential points for bluestore OSDs, yep. :) My concern would just be the surprisingly-large block.db requirements for rgw workloads that have been brought up. (300+GB per OSD, I think someone saw/worked out?). -Greg On Tue, Nov 20, 2018 at 1:35 AM Dan van der
