Re: [ceph-users] ceph-fuse segfaults ( jewel 10.2.2)

2016-07-04 Thread Patrick Donnelly
Hi Goncalo, I believe this segfault may be the one fixed here: https://github.com/ceph/ceph/pull/10027 (Sorry for brief top-post. I'm on mobile.) On Jul 4, 2016 9:16 PM, "Goncalo Borges" wrote: > > Dear All... > > We have recently migrated all our ceph infrastructure from 9.2.0 to 10.2.2. > > W

Re: [ceph-users] ceph-fuse segfaults ( jewel 10.2.2)

2016-07-06 Thread Patrick Donnelly
o 10.2.2) The locks were missing in 9.2.0. There were probably instances of the segfault unreported/unresolved. -- Patrick Donnelly

Re: [ceph-users] ceph-fuse segfaults ( jewel 10.2.2)

2016-07-07 Thread Patrick Donnelly
ry()+0xd) [0x7f5440275c6d] > 8: (()+0x7aa1) [0x7f543ecefaa1] > 9: (clone()+0x6d) [0x7f543df6893d] > NOTE: a copy of the executable, or `objdump -rdS ` is needed to > interpret this. This one looks like a very different problem. I've created an issue here: http:/

Re: [ceph-users] mds standby + standby-reply upgrade

2016-07-08 Thread Patrick Donnelly
pgrade start and next mon restart, active monitor falls with > "assert(info.state == MDSMap::STATE_STANDBY)" (even without running mds) . This is the first time you've upgraded your pool to jewel right? Straight from 9.X to 10.2.2? -- Patrick Donnelly

Re: [ceph-users] ceph-fuse segfaults ( jewel 10.2.2)

2016-07-11 Thread Patrick Donnelly
lly too fast that ceph-fuse segfaults before OOM Killer can > kill it. It's possible, but we have no evidence yet that ceph-fuse is using up all the memory on those machines, right? -- Patrick Donnelly

Re: [ceph-users] CephFS write performance

2016-07-19 Thread Patrick Donnelly
d-config-ref/ in particular: "osd client message size cap". Also: http://docs.ceph.com/docs/hammer/rados/configuration/journal-ref/ -- Patrick Donnelly

Re: [ceph-users] CephFS snapshot preferred behaviors

2016-07-27 Thread Patrick Donnelly
napshot at "/1/2/foo", it *will not* capture the file > data in bar. Is that okay? Doing otherwise is *exceedingly* difficult. This is only the case if /1/2/foo/ does not have the embedded inode for "bar", right? (That's normally the case but an intervening unlink of &

Re: [ceph-users] MDS crash

2016-08-10 Thread Patrick Donnelly
ble, or `objdump -rdS ` is needed to > interpret this. I have a bug report filed for this issue: http://tracker.ceph.com/issues/16983 I believe it should be straightforward to solve and we'll have a fix for it soon. Thanks for the report! -- Patrick Donnelly

Re: [ceph-users] MDS crash

2016-08-10 Thread Patrick Donnelly
Randy, are you using ceph-fuse or the kernel client (or something else)? On Wed, Aug 10, 2016 at 2:33 PM, Randy Orr wrote: > Great, thank you. Please let me know if I can be of any assistance in > testing or validating a fix. > > -Randy > > On Wed, Aug 10, 2016 at 1:21 PM

Re: [ceph-users] ceph mds log: dne in the mdsmap

2017-07-11 Thread Patrick Donnelly
ning that it has > been removed from the mdsmap. > > The message could definitely be better worded! Tracker ticket: http://tracker.ceph.com/issues/20583 -- Patrick Donnelly

Re: [ceph-users] updating the documentation

2017-07-12 Thread Patrick Donnelly
a doc check. A developer must comment on the PR to say it passes documentation requirements before the bot changes the check to pass. This addresses all three points in an automatic way. -- Patrick Donnelly

Re: [ceph-users] Stealth Jewel release?

2017-07-12 Thread Patrick Donnelly
0.2.9 ? I'm not aware of, nor do I see, any changes that would make downgrading back to 10.2.7 a problem, but the safest thing to do would be to replace the v10.2.8 ceph-mds binaries with the v10.2.7 binary. If that's not practical, I would recommend a cluster-wide downgrade to 10.2.7. -- Patrick Donnelly

Re: [ceph-users] Change the meta data pool of cephfs

2017-07-12 Thread Patrick Donnelly
t? How can this be best done? There is currently no way to change the metadata pool except through manual recovery into a new pool: http://docs.ceph.com/docs/master/cephfs/disaster-recovery/#using-an-alternate-metadata-pool-for-recovery I would strongly recommend backups before trying such a

Re: [ceph-users] Stealth Jewel release?

2017-07-14 Thread Patrick Donnelly
grade to that instead. -- Patrick Donnelly

Re: [ceph-users] Yet another performance tuning for CephFS

2017-07-17 Thread Patrick Donnelly
oo. Same results. > > > > $ dd if=/dev/zero of=/mnt/c/testfile bs=100M count=10 oflag=direct This looks like your problem: don't use oflag=direct. That will cause CephFS to do synchronous I/O at great cost to performance in order to av
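
For comparison, a minimal re-run of the same test without forcing direct I/O (the path, block size, and count are simply the values from the quoted command; conv=fdatasync is optional if you want the flush to the cluster included in the timing):

    dd if=/dev/zero of=/mnt/c/testfile bs=100M count=10
    # to time the flush to the cluster rather than just the page cache:
    dd if=/dev/zero of=/mnt/c/testfile bs=100M count=10 conv=fdatasync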

Re: [ceph-users] Yet another performance tuning for CephFS

2017-07-17 Thread Patrick Donnelly
t is the bandwidth limit of your local device rsync is reading from? -- Patrick Donnelly

Re: [ceph-users] Defining quota in CephFS - quota is ignored

2017-07-26 Thread Patrick Donnelly
100MB? I don't have a cluster to check this on now but perhaps because a sparse file (you wrote all zeros) does not consume its entire file size in the quota (only what it uses). Retry with /dev/urandom. (And the usual disclaimer: quotas only work with libcephfs/ceph-fuse. The kernel clie
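
A rough sketch of re-testing with non-sparse data, assuming a ceph-fuse mount at /mnt/cephfs and a 100MB quota (the paths and sizes are placeholders):

    setfattr -n ceph.quota.max_bytes -v 100000000 /mnt/cephfs/quota_test
    # write real (non-zero) data so the quota accounting reflects actual usage:
    dd if=/dev/urandom of=/mnt/cephfs/quota_test/file bs=1M count=200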

Re: [ceph-users] ceph-fuse hanging on df with ceph luminous >= 12.1.3

2017-08-22 Thread Patrick Donnelly
son why you would experience a hang in the client. You can try adding "debug client = 20" and "debug ms = 5" to your ceph.conf [2] to get more information. [1] https://github.com/ceph/ceph/pull/16378/ [2] http://docs.ceph.com/docs/master/rados/troubleshooting/log-and-debug/ -
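
For example, the quoted settings would go into ceph.conf on the client host roughly like this (section placement assumed):

    [client]
        debug client = 20
        debug ms = 5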

Re: [ceph-users] CephFS Segfault 12.2.0

2017-09-18 Thread Patrick Donnelly
We were able to then reboot clients (RHEL 7.4) and have them re-connect > to the file system. This looks like an instance of: http://tracker.ceph.com/issues/21070 Upcoming v12.2.1 has the fix. Until then, you will need to apply the patch locally. -- Patrick Donnelly

Re: [ceph-users] MDS crashes shortly after startup while trying to purge stray files.

2017-10-02 Thread Patrick Donnelly
you're using multiple active metadata servers? If so, that's not stable in Jewel. You may have tripped on one of many bugs fixed in Luminous for that configuration. -- Patrick Donnelly

Re: [ceph-users] what does associating ceph pool to application do?

2017-10-06 Thread Patrick Donnelly
g about changes which would allow multiple ceph file systems to use the same data pool by having each FS work in a separate namespace. See also: http://tracker.ceph.com/issues/15066 Support with CephFS and RBD using the same pool may follow that.

Re: [ceph-users] cephfs ceph-fuse performance

2017-10-18 Thread Patrick Donnelly
t to the kernel is one of our priorities for Mimic. -- Patrick Donnelly

Re: [ceph-users] luminous ceph-fuse crashes with "failed to remount for kernel dentry trimming"

2017-11-27 Thread Patrick Donnelly
. Are you running out of memory on these machines? [1] http://tracker.ceph.com/issues/17517 -- Patrick Donnelly

Re: [ceph-users] strange error on link() for nfs over cephfs

2017-11-29 Thread Patrick Donnelly
ng right now, a number of > machines still auto-mount users home directories from that nfsd. You need to try a newer kernel as there have been many fixes since 4.4 which probably have not been backported to your distribution's kernel. -- Patrick Donnelly

Re: [ceph-users] CephFS log jam prevention

2017-12-05 Thread Patrick Donnelly
(36252/30)max_segments: 30, > num_segments: 36252 See also: http://tracker.ceph.com/issues/21975 You can try doubling (several times if necessary) the MDS configs `mds_log_max_segments` and `mds_log_max_expiring` to make it more aggressively trim its journal. (That
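
A sketch of bumping both at runtime on a Luminous-era MDS (the daemon name and values are placeholders; the defaults are 30 and 20, so one doubling gives 60/40, and anything that helps should also be persisted in ceph.conf):

    ceph tell mds.<name> injectargs '--mds_log_max_segments 60 --mds_log_max_expiring 40'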

Re: [ceph-users] cephfs mds millions of caps

2017-12-14 Thread Patrick Donnelly
ttps://github.com/ceph/ceph/pull/17925 I suggest setting that config manually to false on all of your clients and ensure each client can remount itself to trim dentries (i.e. it's being run as root or with sufficient capabilities) which is a fall

Re: [ceph-users] cephfs mds millions of caps

2017-12-14 Thread Patrick Donnelly
On Thu, Dec 14, 2017 at 4:44 PM, Webert de Souza Lima wrote: > Hi Patrick, > > On Thu, Dec 14, 2017 at 7:52 PM, Patrick Donnelly > wrote: >> >> >> It's likely you're a victim of a kernel backport that removed a dentry >> invalidation mechanism

Re: [ceph-users] MDS cache size limits

2018-01-04 Thread Patrick Donnelly
ues/21402 The memory usage tracking is off by a constant factor. I'd suggest just lowering the limit so it's about where it should be for your system. -- Patrick Donnelly

Re: [ceph-users] MDS cache size limits

2018-01-05 Thread Patrick Donnelly
On Fri, Jan 5, 2018 at 3:54 AM, Stefan Kooman wrote: > Quoting Patrick Donnelly (pdonn...@redhat.com): >> >> It's expected but not desired: http://tracker.ceph.com/issues/21402 >> >> The memory usage tracking is off by a constant factor. I'd suggest >>

Re: [ceph-users] After Luminous upgrade: ceph-fuse clients failing to respond to cache pressure

2018-01-18 Thread Patrick Donnelly
create a tracker ticket and use ceph-post-file [1] to share logs. [1] http://docs.ceph.com/docs/hammer/man/8/ceph-post-file/ -- Patrick Donnelly
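
Usage is a one-liner; the description and log path here are placeholders:

    ceph-post-file -d "ceph-fuse failing to respond to cache pressure" /var/log/ceph/ceph-mds.*.log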

Re: [ceph-users] client with uid

2018-01-25 Thread Patrick Donnelly
//tracker.ceph.com/issues/22802 [2] http://tracker.ceph.com/issues/22801 -- Patrick Donnelly

Re: [ceph-users] client with uid

2018-02-06 Thread Patrick Donnelly
:_ll_drop_pins()+0x67) [0x558336e5dea7] > 4: (Client::unmount()+0x943) [0x558336e67323] > 5: (main()+0x7ed) [0x558336e02b0d] > 6: (__libc_start_main()+0xea) [0x7efc7a892f2a] > 7: (_start()+0x2a) [0x558336e0b73a] > ceph-fuse [25154]: (33) Numerical

Re: [ceph-users] MDS Problems

2016-11-04 Thread Patrick Donnelly
7911d] > 8: (()+0x76fa) [0x7f04026b06fa] > 9: (clone()+0x6d) [0x7f0400b71b5d] > NOTE: a copy of the executable, or `objdump -rdS ` is needed to > interpret this. This assert might be this issue: http://tracker.ceph.com/issues/17531 However, the exe_path debug line in your log would not indicate that bug. You would see something like: 2016-10-06 15:12:04.933212 7fd94f072700 1 mds.a exe_path /home/pdonnell/ceph/build/bin/ceph-mds (deleted) -- Patrick Donnelly

Re: [ceph-users] Best practices for use ceph cluster and directories with many! Entries

2016-11-15 Thread Patrick Donnelly
Does anyone have a timeline for the testing dir frag mds? Directory fragmentation is on track to be stable for the Luminous release. [1] https://github.com/ceph/ceph/pull/9789 -- Patrick Donnelly

Re: [ceph-users] mds daemon damaged

2018-07-12 Thread Patrick Donnelly
mds.ds29 was a standby and > has data it should become live. If it was not > I assume we will lose the filesystem at this point > > Why didn't the standby MDS failover? > > Just looking for any way to recover the cephfs, thanks! I think it's time to do a scrub on the PG containing that object. -- Patrick Donnelly
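
A sketch of locating and scrubbing that PG (pool and object names are placeholders; earlier in the thread the PG in question was 2.4):

    ceph osd map <metadata-pool> <object-name>   # shows which PG the object maps to
    ceph pg deep-scrub 2.4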

Re: [ceph-users] mds daemon damaged

2018-07-12 Thread Patrick Donnelly
On Thu, Jul 12, 2018 at 3:55 PM, Patrick Donnelly wrote: >> Recommends fixing error by hand. Tried running deep scrub on pg 2.4, it >> completes but still have the same issue above >> >> Final option is to attempt removing mds.ds27. If mds.ds29 was a standby and >>

Re: [ceph-users] Insane CPU utilization in ceph.fuse

2018-07-23 Thread Patrick Donnelly
those were in 12.2.6. How fast does your MDS reach 15GB? Your MDS cache size should be configured to 1-8GB (depending on your preference) so it's disturbing to see you set it so low. -- Patrick Donnelly

Re: [ceph-users] Secure way to wipe a Ceph cluster

2018-07-27 Thread Patrick Donnelly
the device firmware may retire bad blocks and make them inaccessible. It may not be possible for the device to physically destroy those blocks either even with SMART directives. You may be stuck with an industrial shredder to be compliant if the rules ar

Re: [ceph-users] CephFS Quota and ACL support

2018-08-27 Thread Patrick Donnelly
nux v4.17+. See also: https://github.com/ceph/ceph/pull/23728/files -- Patrick Donnelly

Re: [ceph-users] omap vs. xattr in librados

2018-09-11 Thread Patrick Donnelly
prefer using xattrs over omap (outside of ease of > use in the API), correct? You may prefer xattrs on bluestore if the metadata is small and you may need to store the xattrs on an EC pool. omap is not supported on ecpools. -- Patrick Donnelly
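
For a quick feel of the two interfaces from the CLI (pool and object names are placeholders; the same distinction applies through librados):

    rados -p <pool> setxattr myobj owner alice      # xattr: small metadata, usable on EC pools
    rados -p <pool> setomapval myobj key1 value1    # omap: key/value store, not supported on EC pools
    rados -p <pool> listomapvals myobj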

Re: [ceph-users] CephFS performance.

2018-10-04 Thread Patrick Donnelly
iping via layouts: http://docs.ceph.com/docs/master/cephfs/file-layouts/ -- Patrick Donnelly
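
Striping is controlled through the layout vxattrs; a sketch with placeholder paths and values:

    getfattr -n ceph.file.layout /mnt/cephfs/existing_file
    setfattr -n ceph.dir.layout.stripe_count -v 4 /mnt/cephfs/new_dir   # applies to files created there afterwards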

Re: [ceph-users] Don't upgrade to 13.2.2 if you use cephfs

2018-10-08 Thread Patrick Donnelly
on 13.2.2 are also affected but do not require immediate action. A procedure for handling upgrades of fresh deployments from 13.2.2 to 13.2.3 will be included in the release notes for 13.2.3. -- Patrick Donnelly

Re: [ceph-users] MDS hangs in "heartbeat_map" deadlock

2018-10-08 Thread Patrick Donnelly
s). Thanks for the detailed notes. It looks like the MDS is stuck somewhere it's not even outputting any log messages. If possible, it'd be helpful to get a coredump (e.g. by sending SIGQUIT to the MDS) or, if you're comfortable with gdb, a backtrace of any threads that look suspiciou
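
A sketch of both options, assuming you know the MDS pid and core dumps are enabled on the host:

    kill -QUIT <mds-pid>                                # SIGQUIT's default action is to dump core
    gdb -p <mds-pid> -batch -ex 'thread apply all bt'   # or attach and dump all thread backtraces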

Re: [ceph-users] Don't upgrade to 13.2.2 if you use cephfs

2018-10-17 Thread Patrick Donnelly
ade to 13.2.2 ? > > or better to wait to 13.2.3 ? or install 13.2.1 for now ? Upgrading to 13.2.1 would be safe. -- Patrick Donnelly

Re: [ceph-users] Balanced MDS, all as active and recomended client settings.

2018-02-21 Thread Patrick Donnelly
; like cache size, and maybe something to lower the metadata servers load. >> >> ## >> [mds] >> mds_cache_size = 25 >> mds_cache_memory_limit = 792723456 You should only specify one of those. See also: http://docs.ceph.com/docs/master/cephf
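
In other words, keep only the memory-based limit; a minimal sketch using the value quoted above:

    [mds]
        mds_cache_memory_limit = 792723456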

Re: [ceph-users] Balanced MDS, all as active and recomended client settings.

2018-02-22 Thread Patrick Donnelly
of > time read-only so MDS data can be also cached for a while. The MDS issues capabilities that allow clients to coherently cache metadata. -- Patrick Donnelly

Re: [ceph-users] Balanced MDS, all as active and recomended client settings.

2018-02-23 Thread Patrick Donnelly
any quotas then there is no added load on the client/mds. -- Patrick Donnelly

Re: [ceph-users] CephFS very unstable with many small files

2018-02-26 Thread Patrick Donnelly
(which includes the inode count in cache) by collecting a `perf dump` via the admin socket. Then you can begin to find out what's consuming all of the MDS memory. Additionally, I concur with John on digging into why the MDS is missing heartbeats by collecting debug logs (`debug mds = 15`) at that time. It may also shed light on the issue. Thanks for performing the test and letting us know the results. -- Patrick Donnelly
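
For example, on the MDS host (daemon name is a placeholder):

    ceph daemon mds.<name> perf dump               # inode/dentry counts are in the mds_mem section
    ceph daemon mds.<name> config set debug_mds 15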

Re: [ceph-users] Storage usage of CephFS-MDS

2018-02-26 Thread Patrick Donnelly
ode (file). Using that you can extrapolate how much space the data pool needs based on your file system usage. (If all you're doing is filling the file system with empty files, of course you're going to need an unusually large metadata pool.) -- Patrick Donnelly

Re: [ceph-users] Storage usage of CephFS-MDS

2018-02-26 Thread Patrick Donnelly
On Mon, Feb 26, 2018 at 7:59 AM, Patrick Donnelly wrote: > It seems in the above test you're using about 1KB per inode (file). > Using that you can extrapolate how much space the data pool needs s/data pool/metadata pool/ -- Patrick Donnelly

Re: [ceph-users] Updating standby mds from 12.2.2 to 12.2.4 caused up:active 12.2.2 mds's to suicide

2018-02-28 Thread Patrick Donnelly
o reduce the actives to 1 (max_mds -> 1; deactivate other ranks), shutdown standbys, upgrade the single active, then upgrade/start the standbys. Unfortunately this didn't get flagged in upgrade testing. Thanks for the report Dan. -- Patrick Donnelly
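
Roughly, in Luminous-era commands (file system name, rank, and host names are placeholders):

    ceph fs set <fs_name> max_mds 1
    ceph mds deactivate <fs_name>:1          # repeat for each rank above 0 and wait for it to stop
    systemctl stop ceph-mds@<standby-host>   # on each standby
    # upgrade and restart the single active MDS, then upgrade and start the standbys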

Re: [ceph-users] Ceph-Fuse and mount namespaces

2018-02-28 Thread Patrick Donnelly
hence close the mount > namespace), the "old" helper is finally freed. Once the last mount point is unmounted, FUSE will destroy the userspace helper. [1] http://docs.ceph.com/docs/master/rados/configuration/ceph-conf/?highlight=configuration#metavariab

Re: [ceph-users] Don't use ceph mds set max_mds

2018-03-07 Thread Patrick Donnelly
if called. > > The multi-fs stuff went in for Jewel, so maybe we should think about > removing the old commands in Mimic: any thoughts Patrick? These commands have already been removed (obsoleted) in master/Mimic. You can no longer use them. In Luminous, the commands are deprecated (basically,

Re: [ceph-users] Updating standby mds from 12.2.2 to 12.2.4 caused up:active 12.2.2 mds's to suicide

2018-03-14 Thread Patrick Donnelly
On Wed, Mar 14, 2018 at 5:48 AM, Lars Marowsky-Bree wrote: > On 2018-02-28T02:38:34, Patrick Donnelly wrote: > >> I think it will be necessary to reduce the actives to 1 (max_mds -> 1; >> deactivate other ranks), shutdown standbys, upgrade the single active, >> then

Re: [ceph-users] rctime not tracking inode ctime

2018-03-14 Thread Patrick Donnelly
t to reflect changes to directory inodes. Traditionally, modifying a file (truncate, write) does not involve metadata changes to a directory inode. Whether that is the intended behavior is a good question. Perhaps it should be changed? -- Patrick Donnelly
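
For reference, rctime is readable as a vxattr (path is a placeholder):

    getfattr -n ceph.dir.rctime /mnt/cephfs/some/dir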

Re: [ceph-users] Cephfs and number of clients

2018-03-20 Thread Patrick Donnelly
l2 > /vol3 > /vol4 > > At the moment the root of the cephfs filesystem is mounted to each web > server. The query is would there be a benefit to having separate mount > points for each folder like above? Performance benefit? No. Data isolation

Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?

2018-03-27 Thread Patrick Donnelly
": 2, > "traverse": 1591668547, > "traverse_hit": 1259482170, > "traverse_forward": 0, > "traverse_discover": 0, > "traverse_dir_fetch": 30827836, > "traverse_remote_ino": 7510

Re: [ceph-users] Is it possible to suggest the active MDS to move to a datacenter ?

2018-03-29 Thread Patrick Donnelly
et exist for NFS-Ganesha+CephFS outside of Openstack Queens deployments. -- Patrick Donnelly

Re: [ceph-users] ceph-fuse segfaults

2018-04-02 Thread Patrick Donnelly

Re: [ceph-users] Cephfs hardlink snapshot

2018-04-05 Thread Patrick Donnelly
Hardlink handling for snapshots will be in Mimic. -- Patrick Donnelly

Re: [ceph-users] cephfs snapshot format upgrade

2018-04-10 Thread Patrick Donnelly
ze() >1)) && latest_scrubbed_version < > mimic This sounds like the right approach to me. The mons should also be capable of performing the same test and raising a health error that pre-Mimic MDSs must be started and the number of actives be reduced to 1. -- Patrick Donnelly

Re: [ceph-users] Cluster degraded after Ceph Upgrade 12.2.1 => 12.2.2

2018-04-11 Thread Patrick Donnelly
mds-cluster -- Patrick Donnelly

Re: [ceph-users] Cluster degraded after Ceph Upgrade 12.2.1 => 12.2.2

2018-04-12 Thread Patrick Donnelly
On Thu, Apr 12, 2018 at 5:05 AM, Mark Schouten wrote: > On Wed, 2018-04-11 at 17:10 -0700, Patrick Donnelly wrote: >> No longer recommended. See: >> http://docs.ceph.com/docs/master/cephfs/upgrading/#upgrading-the-mds- >> cluster > > Shouldn't docs.ceph.com/docs/l

Re: [ceph-users] cephfs luminous 12.2.4 - multi-active MDSes with manual pinning

2018-04-24 Thread Patrick Donnelly
As Dan said, this is simply a spurious log message. Nothing is being exported. This will be fixed in 12.2.6 as part of several fixes to the load balancer: https://github.com/ceph/ceph/pull/21412/commits/cace918dd044b979cd0d54b16a6296094c8a9f90 -- Patrick Donnelly

Re: [ceph-users] Multi-MDS Failover

2018-04-26 Thread Patrick Donnelly
ccess ceph folder and everything seems to > be running. It depends(tm) on how the metadata is distributed and what locks are held by each MDS. Standbys are not optional in any production cluster. -- Patrick Donnelly

Re: [ceph-users] Multi-MDS Failover

2018-04-26 Thread Patrick Donnelly
error or becoming unavailable -- when the standbys exist to make the system available. There's nothing to enforce. A warning is sufficient for the operator that (a) they didn't configure any standbys or (b) MDS daemon processes/boxes are going away and not coming back as standbys

Re: [ceph-users] Multi-MDS Failover

2018-04-27 Thread Patrick Donnelly
the necessary locks. No metadata is lost. No inconsistency is created between clients. Full availability will be restored when the lost ranks come back online. -- Patrick Donnelly

Re: [ceph-users] 12.2.4 Both Ceph MDS nodes crashed. Please help.

2018-04-30 Thread Patrick Donnelly
dsend! Thanks for keeping the list apprised of your efforts. Since this is so easily reproduced for you, I would suggest that you next get higher debug logs (debug_mds=20/debug_ms=1) from the MDS. And, since this is a segmentation fault, a backtrace with debug symbols from gdb would als
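
A sketch of raising those settings and getting a symbolized backtrace from a core file (daemon name and core path are placeholders; the ceph debuginfo/dbg package is needed for useful symbols):

    ceph tell mds.<name> injectargs '--debug_mds 20 --debug_ms 1'
    gdb /usr/bin/ceph-mds /path/to/core -batch -ex 'thread apply all bt full'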

Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?

2018-05-10 Thread Patrick Donnelly
limit. If the cache size is larger than the limit (use `cache status` admin socket command) then we'd be interested in seeing a few seconds of the MDS debug log with higher debugging set (`config set debug_mds 20`). -- Patrick Donnelly
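
Both are admin socket commands on the MDS host (daemon name is a placeholder):

    ceph daemon mds.<name> cache status
    ceph daemon mds.<name> config set debug_mds 20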

Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?

2018-05-10 Thread Patrick Donnelly
process is using over 100GB RAM in my > 128GB host. I thought I was playing it safe by configuring at 80. What other > things consume a lot of RAM for this process? > > Let me know if I need to create a new thread. The cache size measurement is imprecise pre-12.2.5 [1]. Yo

Re: [ceph-users] Too many active mds servers

2018-05-15 Thread Patrick Donnelly
> > I've run ceph fs set cephfs max_mds 2 which set the max_mds from 3 to 2 but > has no effect on my running configuration. http://docs.ceph.com/docs/luminous/cephfs/multimds/#decreasing-the-number-of-ranks Note: the behavior is changing in Mimic to be automatic
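
The Luminous procedure at that link boils down to lowering max_mds and then deactivating the now-surplus highest rank (rank 2 here, since three were active):

    ceph fs set cephfs max_mds 2
    ceph mds deactivate cephfs:2   # the rank passes through up:stopping before disappearing
    ceph status                    # watch until only ranks 0 and 1 remain active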

Re: [ceph-users] (yet another) multi active mds advise needed

2018-05-18 Thread Patrick Donnelly
can handle the load for you without trying to micromanage things via pins. You can use pinning to isolate metadata load from other ranks as a stop-gap measure. [1] https://github.com/ceph/ceph/pull/21412 -- Patrick Donnelly
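
For reference, a pin is just a vxattr on a directory (path and rank are placeholders):

    setfattr -n ceph.dir.pin -v 1 /mnt/cephfs/busy_workload    # pin this subtree to rank 1
    setfattr -n ceph.dir.pin -v -1 /mnt/cephfs/busy_workload   # revert to the default (no pin)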

Re: [ceph-users] CephFS "move" operation

2018-05-25 Thread Patrick Donnelly
think). It may be possible to allow the rename in the MDS and check quotas there. I've filed a tracker ticket here: http://tracker.ceph.com/issues/24305 -- Patrick Donnelly

[ceph-users] Mimic (13.2.0) Release Notes Bug on CephFS Snapshot Upgrades

2018-06-07 Thread Patrick Donnelly
13-2-0-mimic-released/ [3] https://github.com/ceph/ceph/pull/22445/files -- Patrick Donnelly

Re: [ceph-users] CephFS mount in Kubernetes requires setenforce

2018-06-18 Thread Patrick Donnelly
he mounted > CephFS volumes unless I do "setenforce 0" on the CentOS hosts. Is this > expected? Is there a better way to enable pods to write to the CephFS > volumes? It's a known issue that the CephFS kernel client doesn't work with SELinux yet: http://tra

Re: [ceph-users] Review of Ceph on ZFS - or how not to deploy Ceph for RBD + OpenStack

2017-01-10 Thread Patrick Donnelly
ot sure if that's normally recommended.] Thanks for your writeup! -- Patrick Donnelly

Re: [ceph-users] systemd and ceph-mon autostart on Ubuntu 16.04

2017-01-25 Thread Patrick Donnelly
o automatically be enabled on package install? That doesn't sound good to me but I'm not familiar with Ubuntu's packaging rules. I would think the sysadmin must enable the services they install themselves. -- Patrick Donnelly

Re: [ceph-users] Passing LUA script via python rados execute

2017-02-19 Thread Patrick Donnelly
RADOS. Users would install locally and then upload the tree through some tool. -- Patrick Donnelly

Re: [ceph-users] Passing LUA script via python rados execute

2017-02-21 Thread Patrick Donnelly
in C++ > might be the best approach for this for now? FYI, since you are writing a book: Lua is not an acronym: https://www.lua.org/about.html#name -- Patrick Donnelly

Re: [ceph-users] purging strays faster

2017-03-07 Thread Patrick Donnelly
mmand descriptions: [Errno 2] No such file > or directory > > I am guessing there is a path set up incorrectly somewhere, but I do not > know where to look. You need to run the command on the machine where the daemon is running. -- Patrick Donnelly

Re: [ceph-users] FreeBSD port net/ceph-devel released

2017-04-04 Thread Patrick Donnelly
by ZFS. Which means that the used > bandwidth to the SSDs is double of what it could be. > > Had some discussion about this, but disabling the Ceph journal is not > just setting an option. Although I would like to test performance of an > OSD with just the ZFS journal. But I expec

Re: [ceph-users] CephFS: ceph-fuse segfaults

2017-04-07 Thread Patrick Donnelly
at > http://voms.simonsfoundation.org:50013/9SXnEpflYPmE6UhM9EgOR3us341eqym/ceph-20170328 This is a reference count bug. I'm afraid it won't be possible to debug it without a higher debug setting (probably "debug client = 0/20"). Be aware that will slow down yo

Re: [ceph-users] Race Condition(?) in CephFS

2017-04-25 Thread Patrick Donnelly
hing in CephFS first. To me, this looks like: http://tracker.ceph.com/issues/17858 Fortunately you should only need to upgrade to 10.2.6 or 10.2.7 to fix this. HTH, -- Patrick Donnelly

Re: [ceph-users] Failed to read JournalPointer - MDS error (mds rank 0 is damaged)

2017-05-02 Thread Patrick Donnelly
Looks like: http://tracker.ceph.com/issues/17236 The fix is in v10.2.6. -- Patrick Donnelly

Re: [ceph-users] Removing MDS

2018-10-30 Thread Patrick Donnelly
deactivation. To get more help, you need to describe your environment, version of Ceph in use, relevant log snippets, etc. -- Patrick Donnelly

Re: [ceph-users] mon:failed in thread_name:safe_timer

2018-11-19 Thread Patrick Donnelly
the OSDs): https://tracker.ceph.com/issues/35848 -- Patrick Donnelly

Re: [ceph-users] read performance, separate client CRUSH maps or limit osd read access from each client

2018-11-20 Thread Patrick Donnelly
You either need to accept that reads/writes will land on different data centers, ensure the primary OSD for a given pool is always in the desired data center, or use some other non-Ceph solution which will have either expensive, eventual, or false consistency. On Fri, Nov 16, 2018, 10:07 AM Vlad Kopylov This is

Re: [ceph-users] mon:failed in thread_name:safe_timer

2018-11-21 Thread Patrick Donnelly
t; confused about it. How did you restart the MDSs? If you used `ceph mds fail` then the executable version (v12.2.8) will not change. Also, the monitor failure requires updating the monitor to v12.2.9. What version are the mons? -- Patrick Donnelly

Re: [ceph-users] CephFS MDS optimal setup on Google Cloud

2019-01-07 Thread Patrick Donnelly
ing the MDS global lock, is it a single lock per MDS or is it a > global distributed lock for all MDSs? per-MDS -- Patrick Donnelly

Re: [ceph-users] [Ceph-maintainers] v13.2.4 Mimic released

2019-01-08 Thread Patrick Donnelly
you are already running v13.2.2, > >>upgrading to v13.2.3 does not require special action. > > Any special action for upgrading from 13.2.1 ? No special actions for CephFS are required for the upgrade. -- Patrick Donnelly

Re: [ceph-users] tuning ceph mds cache settings

2019-01-09 Thread Patrick Donnelly
a "daemon mds.blah > cache drop". The performance bump lasts for quite a long time--far longer > than it takes for the cache to "fill" according to the stats. What version of Ceph are you running? Can you expand on what this performance im

Re: [ceph-users] CEPH_FSAL Nfs-ganesha

2019-01-15 Thread Patrick Donnelly
d inodes) and the NFS server is expected to have a lot of load, breaking out the exports can have a positive impact on performance. If there are hard links, then the clients associated with the exports will potentially fight over capabilities which will add to request latency.) -- Patr

Re: [ceph-users] How to do multiple cephfs mounts.

2019-01-17 Thread Patrick Donnelly
; Using 2 kernel mounts on CentOS 7.6 It's unlikely this changes anything unless you also split the workload into two. That may allow the kernel to do parallel requests? -- Patrick Donnelly

Re: [ceph-users] Multi-filesystem wthin a cluster

2019-01-17 Thread Patrick Donnelly
On Thu, Jan 17, 2019 at 2:44 AM Dan van der Ster wrote: > > On Wed, Jan 16, 2019 at 11:17 PM Patrick Donnelly wrote: > > > > On Wed, Jan 16, 2019 at 1:21 AM Marvin Zhang wrote: > > > Hi CephFS experts, > > > From document, I know multi-fs within a cluste

Re: [ceph-users] Update / upgrade cluster with MDS from 12.2.7 to 12.2.11

2019-02-11 Thread Patrick Donnelly
inadvertently think something went wrong. If you don't mind seeing those errors and you're using 1 active MDS, then don't worry about it. Good luck! -- Patrick Donnelly

Re: [ceph-users] CephFS - read latency.

2019-02-18 Thread Patrick Donnelly
as a median, 32ms average is still on the high side, > but way, way better. I'll use this opportunity to point out that serial archive programs like tar are terrible for distributed file systems. It would be awesome if someone multithreaded tar or extended it for asynchronous

Re: [ceph-users] Understanding EC properties for CephFS / small files.

2019-02-18 Thread Patrick Donnelly
13c $ bin/rados -p cephfs.a.data stat 13c. cephfs.a.data/100003c. mtime 2019-02-18 14:02:11.00, size 211224 So the object holding "grep" still only uses ~200KB and not 4MB. -- Patrick Donnelly

Re: [ceph-users] faster switch to another mds

2019-02-20 Thread Patrick Donnelly
oes inform the monitors if it has been shutdown. If you pull the plug or SIGKILL, it does not. :) -- Patrick Donnelly

Re: [ceph-users] MDS_SLOW_METADATA_IO

2019-02-28 Thread Patrick Donnelly
dumped ops in flight on the MDSes but all ops that are printed are > finished in a split second (duration: 0.000152), flag_point": "acquired > locks". I believe you're looking at the wrong "ops" dump. You want to check "objecter_requests". -- Patric
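
That dump is also an admin socket command (daemon name is a placeholder):

    ceph daemon mds.<name> objecter_requests   # outstanding RADOS operations issued by the MDS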

Re: [ceph-users] How To Scale Ceph for Large Numbers of Clients?

2019-03-06 Thread Patrick Donnelly
ouble believing the MDS still at 3GB memory usage with that number of clients and mds_cache_memory_limit=17179869184 (16GB). -- Patrick Donnelly
