Re: [ceph-users] ceph df shows global-used more than real data size

2019-12-24 Thread Hector Martin
need to reduce bluestore_min_alloc_size.
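
As a rough illustration of the suggestion above (a sketch only, assuming a cluster with the centralized config database, i.e. Mimic or later; "osd.0" and the 4096 value are placeholders): the allocation size can be inspected and lowered like this, but it only takes effect for OSDs created after the change, so existing OSDs have to be redeployed.

    ceph daemon osd.0 config get bluestore_min_alloc_size_hdd
    ceph config set osd bluestore_min_alloc_size_hdd 4096
    ceph config set osd bluestore_min_alloc_size_ssd 4096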

Re: [ceph-users] CephFS deletion performance

2019-09-18 Thread Hector Martin
g cleaned up. I guess I'll see once I catch up on snapshot deletions.

Re: [ceph-users] CephFS deletion performance

2019-09-14 Thread Hector Martin
On 13/09/2019 16.25, Hector Martin wrote: > Is this expected for CephFS? I know data deletions are asynchronous, but > not being able to delete metadata/directories without an undue impact on > the whole filesystem performance is somewhat problematic. I think I'm getting a feelin

[ceph-users] CephFS deletion performance

2019-09-13 Thread Hector Martin
ch at that time, so I'm not sure what the bottleneck is here. Is this expected for CephFS? I know data deletions are asynchronous, but not being able to delete metadata/directories without an undue impact on the whole filesystem performance is somewhat problematic.

Re: [ceph-users] Ceph for "home lab" / hobbyist use?

2019-09-10 Thread Hector Martin
magine you should be able to get reasonable aggregate performance out of the whole thing, but I've never tried a setup like that. I'm actually considering this kind of thing in the future (moving from one monolithic server to a more cluster-like setup) but it's just an idea for now.

Re: [ceph-users] Stray count increasing due to snapshots (?)

2019-09-05 Thread Hector Martin
involve keeping two months worth of snapshots? That CephFS can't support this kind of use case (and in general that CephFS uses the stray subdir persistently for files in snapshots that could remain forever, while the stray dirs don't scale) sounds like a bug.
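
For reference, a quick way to watch the stray count discussed in this thread (a sketch; the MDS name "a" is a placeholder for whatever your active MDS is called):

    ceph daemon mds.a perf dump mds_cache | grep -i stray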

Re: [ceph-users] CephFS meltdown fallout: mds assert failure, kernel oopses

2019-08-15 Thread Hector Martin
dx 0xd788 <+536>: mov %edx,0x48(%r15) That means req->r_reply_info.filelock_reply was NULL.

[ceph-users] CephFS meltdown fallout: mds assert failure, kernel oopses

2019-08-13 Thread Hector Martin
about reconnections and such) and seems to be fine. I can't find these errors anywhere, so I'm guessing they're not known bugs?

Re: [ceph-users] MDS getattr op stuck in snapshot

2019-06-27 Thread Hector Martin
y and tested and everything seems fine. I deployed it to production and got rid of the drop_caches hack and I've seen no stuck ops for two days so far. If there is a bug or PR opened for this can you point me to it so I can track when it goes into a release? Thanks!

Re: [ceph-users] MDS getattr op stuck in snapshot

2019-06-19 Thread Hector Martin
On 13/06/2019 14.31, Hector Martin wrote: > On 12/06/2019 22.33, Yan, Zheng wrote: >> I have tracked down the bug. thank you for reporting this. 'echo 2 > >> /proc/sys/vm/drop_cache' should fix the hang. If you can compile ceph >> from source, please try following patch

Re: [ceph-users] Protecting against catastrophic failure of host filesystem

2019-06-18 Thread Hector Martin
if they need to be base64 decoded or what have you) if you really want to go this route.

Re: [ceph-users] Ceph Clients Upgrade?

2019-06-18 Thread Hector Martin

[ceph-users] Broken mirrors: hk, us-east, de, se, cz, gigenet

2019-06-16 Thread Hector Martin
://mirrors.gigenet.com/ceph/ This one is *way* behind on sync, it doesn't even have Nautilus. Perhaps there should be some monitoring for public mirror quality?

Re: [ceph-users] MDS getattr op stuck in snapshot

2019-06-12 Thread Hector Martin
NAP)) > cap->mark_needsnapflush(); > } That was quick, thanks! I can build from source but I won't have time to do so and test it until next week, if that's okay.

[ceph-users] MDS getattr op stuck in snapshot

2019-06-12 Thread Hector Martin
"event": "dispatched" }, { "time": "2019-06-12 16:15:59.096318", "event": "failed to rdlock, waiting" }, { "time": "2019-06-12 16:15:59.268368", "event": "failed to rdlock, waiting" } ] } } ], "num_ops": 1 } My guess is somewhere along the line of this process there's a race condition and the dirty client isn't properly flushing its data. A 'sync' on host2 does not clear the stuck op. 'echo 1 > /proc/sys/vm/drop_caches' does not either, while 'echo 2 > /proc/sys/vm/drop_caches' does fix it. So I guess the problem is a dentry/inode that is stuck dirty in the cache of host2? -- Hector Martin (hec...@marcansoft.com) Public Key: https://mrcn.st/pub ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] Erasure Coding failure domain (again)

2019-04-10 Thread Hector Martin
ves in, and you need to hit all 3). This is marginally higher than the ~ 0.00891% with uniformly distributed PGs, because you've eliminated all sets of OSDs which share a host.

Re: [ceph-users] Erasure Coding failure domain (again)

2019-04-02 Thread Hector Martin
t. https://www.memset.com/support/resources/raid-calculator/ I'll take a look tonight :)

Re: [ceph-users] Erasure Coding failure domain (again)

2019-04-02 Thread Hector Martin
EC encoding, but if you do lose a PG you'll lose more data because there are fewer PGs. Feedback on my math welcome.

Re: [ceph-users] Rebuild after upgrade

2019-03-17 Thread Hector Martin
/ In particular, you turned on CRUSH_TUNABLES5, which causes a large amount of data movement: http://docs.ceph.com/docs/master/rados/operations/crush-map/#jewel-crush-tunables5 Going from Firefly to Hammer has a much smaller impact (see the CRUSH_V4 section).
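
To see which tunables profile a cluster is on before changing anything, and to move to the intermediate Hammer profile mentioned above (a sketch; expect data movement whenever tunables change):

    ceph osd crush show-tunables
    ceph osd crush tunables hammer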

Re: [ceph-users] mount cephfs on ceph servers

2019-03-12 Thread Hector Martin
On Tue, Mar 12, 2019 at 10:07 AM Hector Martin <hec...@marcansoft.com> wrote: > > It's worth noting that most containerized deployments can effectively > > limit RAM for containers (cgroups

Re: [ceph-users] mount cephfs on ceph servers

2019-03-12 Thread Hector Martin
e per day), have one active metadata server, and change several TB daily - it's much, *much* faster than with fuse. Cluster has 10 OSD nodes, currently storing 2PB, using ec 8:2 coding. ta ta Jake On 3/6/19 11:10 AM, Hector Martin wrote: On 06/03/2019 12:07, Zhenshi Zhou wrote: Hi, I'm gon

Re: [ceph-users] mount cephfs on ceph servers

2019-03-06 Thread Hector Martin
been doing this on two machines (single-host Ceph clusters) for months with no ill effects. The FUSE client performs a lot worse than the kernel client, so I switched to the latter, and it's been working well with no deadlocks.
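
A minimal example of the kernel-client mount being compared here (the monitor address, mount point and secret file are placeholders):

    mount -t ceph 192.168.1.10:6789:/ /mnt/cephfs -o name=admin,secretfile=/etc/ceph/admin.secret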

Re: [ceph-users] Erasure coded pools and ceph failure domain setup

2019-03-04 Thread Hector Martin
m OSDs without regard for the hosts; you will be able to use effectively any EC widths you want, but there will be no guarantees of data durability if you lose a whole host.

Re: [ceph-users] Mimic and cephfs

2019-02-27 Thread Hector Martin
for months now without any issues in two single-host setups. I'm also in the process of testing and migrating a production cluster workload from a different setup to CephFS on 13.2.4 and it's looking good.

Re: [ceph-users] Cephfs recursive stats | rctime in the future

2019-02-27 Thread Hector Martin
2 Sept 2028, the day and month are also wrong. Obvious question: are you sure the date/time on your cluster nodes and your clients is correct? Can you track down which files (if any) have the ctime in the future by following the rctime down the filesystem
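
One way to follow rctime down the tree as suggested (a sketch; /mnt/cephfs is a placeholder mount point — compare the top-level ceph.dir.rctime against each child directory and descend into whichever one carries the future timestamp):

    getfattr -n ceph.dir.rctime /mnt/cephfs
    getfattr -n ceph.dir.rctime /mnt/cephfs/*/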

Re: [ceph-users] Files in CephFS data pool

2019-02-26 Thread Hector Martin
to know about all the files in a pool. As far as I can tell you *can* read the ceph.file.layout.pool xattr on any files in CephFS, even those that haven't had it explicitly set.
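
Reading that xattr looks like this (the file path is a placeholder):

    getfattr -n ceph.file.layout.pool /mnt/cephfs/some/file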

Re: [ceph-users] Controlling CephFS hard link "primary name" for recursive stat

2019-02-12 Thread Hector Martin
tat(), right. (I only just realized this :-)) Are there Python bindings for what ceph-dencoder does, or at least a C API? I could shell out to ceph-dencoder but I imagine that won't be too great for performance.

Re: [ceph-users] CephFS overwrite/truncate performance hit

2019-02-12 Thread Hector Martin
file (formerly variable length, now I just pad it to the full 128 bytes and rewrite it in-place). This is good information to know for optimizing things :-)

Re: [ceph-users] Downsizing a cephfs pool

2019-02-08 Thread Hector Martin
the cluster seed. > > > I appreciate small clusters are not the target use case of Ceph, but > everyone has to start somewhere!

[ceph-users] Controlling CephFS hard link "primary name" for recursive stat

2019-02-08 Thread Hector Martin
ust do the above dance for every hardlinked file to move the primaries off, but this seems fragile and likely to break in certain situations (or do needless work). Any other ideas? Thanks,

Re: [ceph-users] change OSD IP it uses

2019-02-08 Thread Hector Martin
an-it' and make sure there isn't a spurious entry for it in ceph.conf, then re-deploy. Once you do that there is no possible other place for the OSD to somehow remember its old IP.

Re: [ceph-users] Downsizing a cephfs pool

2019-02-08 Thread Hector Martin
lt data pool. The FSMap seems to store pools by ID, not by name, so renaming the pools won't work. This past thread has an untested procedure for migrating CephFS pools: https://www.spinics.net/lists/ceph-users/msg29536.html
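
The rough shape of that kind of migration, for reference (a sketch only — pool and filesystem names are placeholders, and the full procedure in the linked thread should be read first): add the new pool as a data pool, point a directory's layout at it, then copy files so they get rewritten into the new pool.

    ceph osd pool create cephfs_data_new 64
    ceph fs add_data_pool cephfs cephfs_data_new
    setfattr -n ceph.dir.layout.pool -v cephfs_data_new /mnt/cephfs/newdir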

Re: [ceph-users] change OSD IP it uses

2019-02-08 Thread Hector Martin
g to connect via the external IP of that node. Does your ceph.conf have the right network settings? Compare it with the other nodes. Also check that your network interfaces and routes are correctly configured on the problem node, of course.
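
A quick way to compare the relevant settings and routing across nodes (a sketch; the monitor IP is a placeholder):

    grep -E 'public_network|cluster_network|public_addr|cluster_addr' /etc/ceph/ceph.conf
    ip route get 192.168.1.10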

Re: [ceph-users] I get weird ls pool detail output 12.2.11

2019-02-07 Thread Hector Martin
apparently created on deletion (I wasn't aware of this). So for ~700 snapshots the output you're seeing is normal. It seems that using a "rolling snapshot" pattern in CephFS inherently creates a "one present, one deleted" pattern in the underlying pools.

Re: [ceph-users] CephFS overwrite/truncate performance hit

2019-02-07 Thread Hector Martin
without truncation does not.

Re: [ceph-users] I get weird ls pool detail output 12.2.11

2019-02-07 Thread Hector Martin
might want to go through your snapshots and check that you aren't leaking old snapshots forever, or deleting the wrong ones.
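
Listing what actually exists is the easy part of that audit (a sketch; the CephFS mount point is a placeholder):

    ls /mnt/cephfs/.snap
    ceph osd pool ls detail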

Re: [ceph-users] I get weird ls pool detail output 12.2.11

2019-02-07 Thread Hector Martin
ere's some discussion on this here: http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-September/020510.html

[ceph-users] CephFS overwrite/truncate performance hit

2019-02-06 Thread Hector Martin
they're atomic. Is there any documentation on what write operations incur significant overhead on CephFS like this, and why? This particular issue isn't mentioned in http://docs.ceph.com/docs/master/cephfs/app-best-practices/ (which seems like it mostly deals with reads, not writes).

Re: [ceph-users] Bluestore deploys to tmpfs?

2019-02-04 Thread Hector Martin
> /etc/conf.d/ceph-osd The Gentoo initscript setup for Ceph is unfortunately not very well documented. I've been meaning to write a blogpost about this to try to share what I've learned :-)

[ceph-users] CephFS performance vs. underlying storage

2019-01-30 Thread Hector Martin
s a slight disadvantage here because its chunk of the drives is logically after the traditional RAID, and HDDs get slower towards higher logical addresses, but this should be on the order of a 15-20% hit at most.

Re: [ceph-users] Boot volume on OSD device

2019-01-20 Thread Hector Martin
LVM on both ends!) Ultimately a lot of this is dictated by whatever tools you feel comfortable using :-)

Re: [ceph-users] Boot volume on OSD device

2019-01-18 Thread Hector Martin
On 19/01/2019 02.24, Brian Topping wrote: > > >> On Jan 18, 2019, at 4:29 AM, Hector Martin wrote: >> >> On 12/01/2019 15:07, Brian Topping wrote: >>> I’m a little nervous that BlueStore assumes it owns the partition table and >>> will not be happy tha

Re: [ceph-users] dropping python 2 for nautilus... go/no-go

2019-01-18 Thread Hector Martin
On 18/01/2019 22.33, Alfredo Deza wrote: > On Fri, Jan 18, 2019 at 7:07 AM Hector Martin wrote: >> >> On 17/01/2019 00:45, Sage Weil wrote: >>> Hi everyone, >>> >>> This has come up several times before, but we need to make a final >>> decis

Re: [ceph-users] dropping python 2 for nautilus... go/no-go

2019-01-18 Thread Hector Martin
) to hopefully squash more lurking Python 3 bugs. (just my 2c - maybe I got unlucky and otherwise things work well enough for everyone else in Py3; I'm certainly happy to get rid of Py2 ASAP).

Re: [ceph-users] Suggestions/experiences with mixed disk sizes and models from 4TB - 14TB

2019-01-18 Thread Hector Martin
so far in my home cluster, but I haven't finished setting things up yet. Those are definitely not SMR.

Re: [ceph-users] Boot volume on OSD device

2019-01-18 Thread Hector Martin
with some custom code, but then normal usage just uses ceph-disk (it certainly doesn't care about extra partitions once everything is set up). This was formerly FileStore and now BlueStore, but it's a legacy setup. I expect to move this over to ceph-volume at some point.

Re: [ceph-users] `ceph-bluestore-tool bluefs-bdev-expand` corrupts OSDs

2019-01-11 Thread Hector Martin
blem then, good to know it isn't *supposed* to work yet :-)

Re: [ceph-users] `ceph-bluestore-tool bluefs-bdev-expand` corrupts OSDs

2019-01-11 Thread Hector Martin
g-bluefs 20 --log-file bluefs-bdev-expand.log" Perhaps it makes sense to open a ticket at ceph bug tracker to proceed... Thanks, Igor On 12/27/2018 12:19 PM, Hector Martin wrote: Hi list, I'm slightly expanding the underlying LV for two OSDs and figured I could use ceph-bluestore-tool to avo
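
For anyone reproducing this, the debug run Igor asks for looks roughly like this (a sketch; osd.1 and its path are placeholders, the OSD must be stopped first, and on 13.2.1 this operation was not yet expected to work, so treat it as a diagnostic only):

    systemctl stop ceph-osd@1
    ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/osd/ceph-1 \
        --log-file bluefs-bdev-expand.log --debug-bluefs 20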

[ceph-users] `ceph-bluestore-tool bluefs-bdev-expand` corrupts OSDs

2018-12-27 Thread Hector Martin
this again with osd.1 if needed and see if I can get it fixed. Otherwise I'll just re-create it and move on.

# ceph --version
ceph version 13.2.1 (5533ecdc0fda920179d7ad84e0aa65a127b20d77) mimic (stable)

Re: [ceph-users] RBD snapshot atomicity guarantees?

2018-12-20 Thread Hector Martin
On 21/12/2018 03.02, Gregory Farnum wrote: > RBD snapshots are indeed crash-consistent. :) > -Greg Thanks for the confirmation! May I suggest putting this little nugget in the docs somewhere? This might help clarify things for others :)

Re: [ceph-users] RBD snapshot atomicity guarantees?

2018-12-18 Thread Hector Martin
ionally reset the VM if thawing fails. Ultimately this whole thing is kind of fragile, so if I can get away without freezing at all it would probably make the whole process a lot more robust.

Re: [ceph-users] RBD snapshot atomicity guarantees?

2018-12-18 Thread Hector Martin
5 minutes, we'll see if the problem recurs. Given this, it makes even more sense to just avoid the freeze if at all reasonable. There's no real way to guarantee that a fsfreeze will complete in a "reasonable" amount of time as far as I can tell.

[ceph-users] RBD snapshot atomicity guarantees?

2018-12-18 Thread Hector Martin
has higher impact but also probably a much lower chance of messing up (or having excess latency), since it doesn't involve the guest OS or the qemu agent at all...
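
The host-side-only variant being weighed here is just this (a sketch; pool, image and snapshot names are placeholders — per the rest of the thread the result is crash-consistent even without freezing the guest):

    rbd snap create rbd/vm-disk@backup-2018-12-18
    rbd snap ls rbd/vm-disk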

Re: [ceph-users] CephFS file contains garbage zero padding after an unclean cluster shutdown

2018-11-25 Thread Hector Martin
On 26/11/2018 11.05, Yan, Zheng wrote: > On Mon, Nov 26, 2018 at 4:30 AM Hector Martin wrote: >> >> On 26/11/2018 00.19, Paul Emmerich wrote: >>> No, wait. Which system did kernel panic? Your CephFS client running rsync? >>> In this case this would be expect

Re: [ceph-users] CephFS file contains garbage zero padding after an unclean cluster shutdown

2018-11-25 Thread Hector Martin
ose pages are flushed?

[ceph-users] CephFS file contains garbage zero padding after an unclean cluster shutdown

2018-11-23 Thread Hector Martin
hings go down at once?)

Re: [ceph-users] Effects of restoring a cluster's mon from an older backup

2018-11-12 Thread Hector Martin
some DR tests when I set this up, to prove to myself that it all works out :-)

[ceph-users] Effects of restoring a cluster's mon from an older backup

2018-11-08 Thread Hector Martin
://docs.ceph.com/docs/mimic/rados/troubleshooting/troubleshooting-mon/#recovery-using-osds Would this be preferable to just restoring the mon from a backup? What about the MDS map?

Re: [ceph-users] Unexplainable high memory usage OSD with BlueStore

2018-11-08 Thread Hector Martin
the cache, OSDs will creep up in memory usage up to some threshold, and I'm not sure what determines what that baseline usage is or whether it can be controlled.
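
If the release in use has it, the knob that bounds that growth is osd_memory_target (a sketch, assuming Mimic or later with the centralized config database; the 4 GiB value is only an example):

    ceph config set osd osd_memory_target 4294967296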

Re: [ceph-users] Filestore to Bluestore migration question

2018-11-07 Thread Hector Martin
ceph ceph  6 Oct 28 16:12 ready
-rw--- 1 ceph ceph 10 Oct 28 16:12 type
-rw--- 1 ceph ceph  3 Oct 28 16:12 whoami

(lockbox.keyring is for encryption, which you do not use)

Re: [ceph-users] Filestore to Bluestore migration question

2018-11-07 Thread Hector Martin
because there is a backup at the end of the device, but wipefs *should* know about that as far as I know.
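
For completeness, a destructive sketch of wiping such a disk so nothing is left behind (the device name is a placeholder — double-check it, this erases the disk including the backup GPT at the end):

    ceph-volume lvm zap /dev/sdh --destroy
    sgdisk --zap-all /dev/sdh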

Re: [ceph-users] Filestore to Bluestore migration question

2018-11-07 Thread Hector Martin
ileStore remnant tries to mount phantom partitions.

Re: [ceph-users] Filestore to Bluestore migration question

2018-11-06 Thread Hector Martin "marcan"
zap these 10 osds and start over although at this point I am afraid even zapping may not be a simple task > On Tue, Nov 6, 2018 at 3:44 PM, Hector Martin wrote: >> On 11/7/18 5:27 AM, Hayashida, Mami wrote: >> > 1. Stopped osd.60-69: no

Re: [ceph-users] Filestore to Bluestore migration question

2018-11-06 Thread Hector Martin
re of by the ceph-volume activation.

Re: [ceph-users] Filestore to Bluestore migration question

2018-11-06 Thread Hector Martin
, it's safe to move or delete all those OSD directories for BlueStore OSDs and try activating them cleanly again, which hopefully will do the right thing. In the end this all might fix your device ownership woes too, making the udev rule unnecessary. If it all works out, try a reboot and

Re: [ceph-users] Filestore to Bluestore migration question

2018-11-06 Thread Hector Martin
and "mount | grep osd" instead and see if ceph-60 through ceph-69 show up. -- Hector Martin (hec...@marcansoft.com) Public Key: https://mrcn.st/pub ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] Filestore to Bluestore migration question

2018-11-06 Thread Hector Martin
8:112  0  3.7T  0 disk
> └─hdd60-data60  252:1  0  3.7T  0 lvm
>
> and "ceph osd tree" shows
> 60  hdd  3.63689  osd.60  up  1.0  1.0

That looks correct as far as the weight goes, but I'm really confused as to why you have a 10GB "bl

Re: [ceph-users] Filestore to Bluestore migration question

2018-11-05 Thread Hector Martin
links to block devices. I'm not sure what happened there.

Re: [ceph-users] Filestore to Bluestore migration question

2018-11-05 Thread Hector Martin
SUBSYSTEM=="block", ENV{DEVTYPE}=="disk", \ ENV{DM_LV_NAME}=="db*", ENV{DM_VG_NAME}=="ssd0", \ OWNER="ceph", GROUP="ceph", MODE="660" Reboot after that and see if the OSDs come up without further action. -- Hector Martin (hec

Re: [ceph-users] Filestore to Bluestore migration question

2018-11-05 Thread Hector Martin
On 11/6/18 3:21 AM, Alfredo Deza wrote: > On Mon, Nov 5, 2018 at 11:51 AM Hector Martin wrote: >> >> Those units don't get triggered out of nowhere, there has to be a >> partition table with magic GUIDs or a fstab or something to cause them >> to be triggered. The bett

Re: [ceph-users] Filestore to Bluestore migration question

2018-11-05 Thread Hector Martin
hat references any of the old partitions that don't exist (/dev/sdh1 etc) should be removed. The disks are now full-disk LVM PVs and should have no partitions.

Re: [ceph-users] Filestore to Bluestore migration question

2018-11-05 Thread Hector Martin
> ├─ssd0-db61  252:1  0  40G  0 lvm
> ├─ssd0-db62  252:2  0  40G  0 lvm
> ├─ssd0-db63  252:3  0  40G  0 lvm
> ├─ssd0-db64  252:4  0  40G  0 lvm
> ├─ssd0-db65  252:5  0  40G  0 lvm

Re: [ceph-users] Filestore to Bluestore migration question

2018-11-05 Thread Hector Martin
On 11/6/18 1:08 AM, Hector Martin wrote: > On 11/6/18 12:42 AM, Hayashida, Mami wrote: >> Additional info -- I know that /var/lib/ceph/osd/ceph-{60..69} are not >> mounted at this point (i.e.  mount | grep ceph-60, and 61-69, returns >> nothing.).  They don't show up wh

Re: [ceph-users] Filestore to Bluestore migration question

2018-11-05 Thread Hector Martin
emd is still trying to mount the old OSDs, which used disk partitions. Look in /etc/fstab and in /etc/systemd/system for any references to those filesystems and get rid of them. /dev/sdh1 and company no longer exist, and nothing should reference them.
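
A quick way to hunt for those stale references (a sketch; adjust the device names to match your old partitions):

    grep sdh /etc/fstab
    grep -rl sdh /etc/systemd/system/
    systemctl list-units -t mount | grep ceph
    systemctl daemon-reload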

Re: [ceph-users] Filestore to Bluestore migration question

2018-11-04 Thread Hector Martin
just try to start the OSDs again? Maybe check the overall system log with journalctl for hints.

Re: [ceph-users] Filestore to Bluestore migration question

2018-10-31 Thread Hector Martin
... ceph-volume lvm activate --all I think it might be possible to just let ceph-volume create the PV/VG/LV for the data disks and only manually create the DB LVs, but it shouldn't hurt to do it on your own and just give ready-made LVs to ceph-volume for everything.
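
Spelled out, that workflow looks roughly like this for one OSD (a sketch; the VG/LV names follow the ones used earlier in the thread, the data device is a placeholder):

    lvcreate -L 40G -n db60 ssd0
    ceph-volume lvm create --bluestore --data /dev/sdh --block.db ssd0/db60
    # or, for already-prepared OSDs:
    ceph-volume lvm activate --all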

Re: [ceph-users] CephFS dropping data with rsync?

2018-06-15 Thread Hector Martin
On 2018-06-16 13:04, Hector Martin wrote: > I'm at a loss as to what happened here. Okay, I just realized CephFS has a default 1TB file size... that explains what triggered the problem. I just bumped it to 10TB. What that doesn't explain is why rsync didn't complain about anything. Maybe w
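
For reference, raising that limit is a one-liner (the filesystem name is a placeholder; the value is in bytes, here 10 TiB):

    ceph fs set cephfs max_file_size 10995116277760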

[ceph-users] CephFS dropping data with rsync?

2018-06-15 Thread Hector Martin
t've happened here? If this happens again / is reproducible I'll try to see if I can do some more debugging...

[ceph-users] pg_num docs conflict with Hammer PG count warning

2015-08-06 Thread Hector Martin
is currently very overprovisioned for space, so we're probably not going to be adding OSDs for quite a while, but we'll be adding pools.

Re: [ceph-users] pg_num docs conflict with Hammer PG count warning

2015-08-06 Thread Hector Martin
as you add pools. We are following the hardware recommendations for RAM: 1GB per 1TB of storage, so 16GB for each OSD box (4GB per OSD daemon, each OSD being one 4TB drive).

Re: [ceph-users] erasure code : number of chunks for a small cluster ?

2015-02-06 Thread Hector Martin
On 06/02/15 21:07, Udo Lembke wrote: On 06.02.2015 09:06, Hector Martin wrote: On 02/02/15 03:38, Udo Lembke wrote: With 3 hosts only you can't survive a full node failure, because for that you need host = k + m. Sure you can. k=2, m=1 with the failure domain set to host will survive
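
The k=2, m=1, host-failure-domain layout being discussed maps to an erasure-code profile like this (a sketch; profile and pool names and the PG count are placeholders — current releases use crush-failure-domain, while releases of that era called the parameter ruleset-failure-domain):

    ceph osd erasure-code-profile set ec-2-1 k=2 m=1 crush-failure-domain=host
    ceph osd pool create ecpool 64 64 erasure ec-2-1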