Re: [ceph-users] Issues with Nautilus 14.2.6 ceph-volume lvm batch --bluestore ?

2020-01-20 Thread Janne Johansson
Den mån 20 jan. 2020 kl 09:03 skrev Dave Hall : > Hello, > Since upgrading to Nautilus (+ Debian 10 Backports), when I issue > 'ceph-volume lvm batch --bluestore ' it fails with > > bluestore(/var/lib/ceph/osd/ceph-0/) _read_fsid unparsable uuid > > I previously had Luminous + Debian 9
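
For reference, the usual first step when batch refuses drives that carried OSDs before is to wipe the leftover metadata; this is a sketch only, and the device names are placeholders rather than the poster's actual drives:

    # wipe leftover LVM/partition metadata from a previous OSD life
    ceph-volume lvm zap /dev/sdb --destroy
    ceph-volume lvm zap /dev/sdc --destroy

    # then let batch build the bluestore OSDs in one go
    ceph-volume lvm batch --bluestore /dev/sdb /dev/sdc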

Re: [ceph-users] block db sizing and calculation

2020-01-14 Thread Janne Johansson
(sorry for empty mail just before) > i'm planning to split the block db to a separate flash device which i >> also would like to use as an OSD for erasure coding metadata for rbd >> devices. >> >> If i want to use 14x 14TB HDDs per Node >> >>

Re: [ceph-users] block db sizing and calculation

2020-01-14 Thread Janne Johansson
Den mån 13 jan. 2020 kl 08:09 skrev Stefan Priebe - Profihost AG < s.pri...@profihost.ag>: > Hello, > > i'm planning to split the block db to a separate flash device which i > also would like to use as an OSD for erasure coding metadata for rbd > devices. > > If i want to use 14x 14TB HDDs per

Re: [ceph-users] Looking for experience

2020-01-09 Thread Janne Johansson
> > > I'm currently trying to workout a concept for a ceph cluster which can > be used as a target for backups which satisfies the following requirements: > > - approx. write speed of 40.000 IOP/s and 2500 Mbyte/s > You might need to have a large (at least non-1) number of writers to get to that

Re: [ceph-users] Shall host weight auto reduce on hdd failure?

2019-12-04 Thread Janne Johansson
Den tors 5 dec. 2019 kl 00:28 skrev Milan Kupcevic < milan_kupce...@harvard.edu>: > > > There is plenty of space to take more than a few failed nodes. But the > question was about what is going on inside a node with a few failed > drives. Current Ceph behavior keeps increasing number of placement

Re: [ceph-users] SSDs behind Hardware Raid

2019-12-04 Thread Janne Johansson
Den ons 4 dec. 2019 kl 09:57 skrev Marc Roos : > > But I guess that in 'ceph osd tree' the ssd's were then also displayed > as hdd? > Probably, and the difference in perf would be the different defaults hdd gets vs ssd OSDs with regards to bluestore caches. -- May the most significant bit of
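
For reference, re-tagging OSDs whose device class was auto-detected wrongly behind a RAID controller takes two commands per OSD; osd.12 below is a placeholder id:

    ceph osd tree                                 # check the CLASS column
    ceph osd crush rm-device-class osd.12         # drop the auto-detected "hdd"
    ceph osd crush set-device-class ssd osd.12    # re-tag it as ssd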

Re: [ceph-users] Shall host weight auto reduce on hdd failure?

2019-12-04 Thread Janne Johansson
Den ons 4 dec. 2019 kl 01:37 skrev Milan Kupcevic < milan_kupce...@harvard.edu>: > This cluster can handle this case at this moment as it has got plenty of > free space. I wonder how is this going to play out when we get to 90% of > usage on the whole cluster. A single backplane failure in a node

Re: [ceph-users] Impact of a small DB size with Bluestore

2019-11-26 Thread Janne Johansson
It's mentioned here among other places https://books.google.se/books?id=vuiLDwAAQBAJ=PA79=PA79=rocksdb+sizes+3+30+300+g=bl=TlH4GR0E8P=ACfU3U0QOJQZ05POZL9DQFBVwTapML81Ew=en=X=2ahUKEwiPscq57YfmAhVkwosKHY1bB1YQ6AEwAnoECAoQAQ#v=onepage=rocksdb%20sizes%203%2030%20300%20g=false The 4% was a quick

Re: [ceph-users] Migrating from block to lvm

2019-11-15 Thread Janne Johansson
Den fre 15 nov. 2019 kl 19:40 skrev Mike Cave : > So would you recommend doing an entire node at the same time or per-osd? > You should be able to do it per-OSD (or per-disk in case you run more than one OSD per disk), to minimize data movement over the network, letting other OSDs on the same
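
A rough sketch of the per-OSD approach, reusing the same OSD id so that only the PGs of the rebuilt OSD need to backfill; the id 7 and /dev/sdd are placeholders and the details vary per release:

    ceph osd set noout                          # keep the cluster from rebalancing
    systemctl stop ceph-osd@7
    ceph osd destroy 7 --yes-i-really-mean-it
    ceph-volume lvm zap /dev/sdd --destroy
    ceph-volume lvm create --bluestore --data /dev/sdd --osd-id 7
    ceph osd unset noout                        # the recreated OSD now backfills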

Re: [ceph-users] Strange CEPH_ARGS problems

2019-11-15 Thread Janne Johansson
Is the flip between the client name "rz" and "user" also a mistype? It's hard to tell whether it is intentional, since you mix the two throughout. Den fre 15 nov. 2019 kl 10:57 skrev Rainer Krienke : > I found a typo in my post: > > Of course I tried > > export CEPH_ARGS="-n client.rz
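
For reference, a consistent setup would look like this; the keyring path is an assumption for illustration only:

    export CEPH_ARGS="-n client.rz --keyring=/etc/ceph/ceph.client.rz.keyring"
    ceph -s    # now runs as client.rz without repeating the flags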

Re: [ceph-users] Zombie OSD filesystems rise from the grave during bluestore conversion

2019-11-05 Thread Janne Johansson
Den tis 5 nov. 2019 kl 19:10 skrev J David : > On Tue, Nov 5, 2019 at 3:18 AM Paul Emmerich > wrote: > > could be a new feature, I've only realized this exists/works since > Nautilus. > > You seem to be a relatively old version since you still have ceph-disk > installed > > The next approach may

Re: [ceph-users] Bluestore runs out of space and dies

2019-10-31 Thread Janne Johansson
Den tors 31 okt. 2019 kl 15:07 skrev George Shuklin < george.shuk...@gmail.com>: > Thank you everyone, I got it. There is no way to fix out-of-space > bluestore without expanding it. > > Therefore, in production we would stick with 99%FREE size for LV, as it > gives operators 'last chance' to

Re: [ceph-users] Ceph pg in inactive state

2019-10-31 Thread Janne Johansson
Den tors 31 okt. 2019 kl 04:22 skrev soumya tr : > Thanks 潘东元 for the response. > > The creation of a new pool works, and all the PGs corresponding to that > pool have active+clean state. > > When I initially set ceph 3 node cluster using juju charms (replication > count per object was set to 3),

Re: [ceph-users] Can't create erasure coded pools with k+m greater than hosts?

2019-10-24 Thread Janne Johansson
(Slightly abbreviated) Den tors 24 okt. 2019 kl 09:24 skrev Frank Schilder : > What I learned are the following: > > 1) Avoid this work-around too few hosts for EC rule at all cost. > > 2) Do not use EC 2+1. It does not offer anything interesting for > production. Use 4+2 (or 8+2, 8+3 if you
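
For reference, a 4+2 profile as suggested above would be created along these lines; the profile name, pool name and pg count are placeholders:

    ceph osd erasure-code-profile set ec42 k=4 m=2 crush-failure-domain=host
    ceph osd pool create ecpool 128 128 erasure ec42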

Re: [ceph-users] cluster network down

2019-09-30 Thread Janne Johansson
> > I don't remember where I read it, but it was told that the cluster is > migrating its complete traffic over to the public network when the cluster > networks goes down. So this seems not to be the case? > Be careful with generalizations like "when a network acts up, it will be completely down

Re: [ceph-users] ceph-volume lvm create leaves half-built OSDs lying around

2019-09-11 Thread Janne Johansson
Den ons 11 sep. 2019 kl 12:18 skrev Matthew Vernon : > We keep finding part-made OSDs (they appear not attached to any host, > and down and out; but still counting towards the number of OSDs); we > never saw this with ceph-disk. On investigation, this is because > ceph-volume lvm create makes the

Re: [ceph-users] WAL/DB size

2019-08-15 Thread Janne Johansson
Den tors 15 aug. 2019 kl 00:16 skrev Anthony D'Atri : > Good points in both posts, but I think there’s still some unclarity. > ... > We’ve seen good explanations on the list of why only specific DB sizes, > say 30GB, are actually used _for the DB_. > If the WAL goes along with the DB,

Re: [ceph-users] strange backfill delay after outing one node

2019-08-14 Thread Janne Johansson
Den ons 14 aug. 2019 kl 09:49 skrev Simon Oosthoek : > Hi all, > > Yesterday I marked out all the osds on one node in our new cluster to > reconfigure them with WAL/DB on their NVMe devices, but it is taking > ages to rebalance. > > > ceph tell 'osd.*' injectargs '--osd-max-backfills 16' > >
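
For reference, the knobs usually raised together when backfill is too slow; the values are illustrative, not a recommendation from the thread, and the sleep option assumes Luminous or later:

    ceph tell 'osd.*' injectargs '--osd-max-backfills 4 --osd-recovery-max-active 4'
    ceph tell 'osd.*' injectargs '--osd-recovery-sleep-hdd 0'   # remove the recovery throttle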

Re: [ceph-users] High memory usage OSD with BlueStore

2019-08-01 Thread Janne Johansson
Den tors 1 aug. 2019 kl 11:31 skrev dannyyang(杨耿丹) : > H all: > > we have a cephfs env,ceph version is 12.2.10,server in arm,but fuse clients > are x86, > osd disk size is 8T,some osd use 12GB memory,is that normal? > > For bluestore, there are certain tuneables you can use to limit memory a
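
For reference, the BlueStore memory tuneables normally go into ceph.conf; the values are illustrative and osd_memory_target assumes a Luminous 12.2.9+ or Mimic build:

    [osd]
    # automatic cache sizing against a per-OSD memory budget
    osd_memory_target = 4294967296            # ~4 GiB per OSD
    # static cache knob on builds without the autotuner
    bluestore_cache_size_hdd = 1073741824     # 1 GiB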

Re: [ceph-users] Urgent Help Needed (regarding rbd cache)

2019-08-01 Thread Janne Johansson
Den tors 1 aug. 2019 kl 07:31 skrev Muhammad Junaid : > Your email has cleared many things to me. Let me repeat my understanding. > Every Critical data (Like Oracle/Any Other DB) writes will be done with > sync, fsync flags, meaning they will be only confirmed to DB/APP after it > is actually

Re: [ceph-users] Urgent Help Needed (regarding rbd cache)

2019-07-31 Thread Janne Johansson
Den ons 31 juli 2019 kl 06:55 skrev Muhammad Junaid : > The question is about RBD Cache in write-back mode using KVM/libvirt. If > we enable this, it uses local KVM Host's RAM as cache for VM's write > requests. And KVM Host immediately responds to VM's OS that data has been > written to Disk

Re: [ceph-users] Problems understanding 'ceph-features' output

2019-07-30 Thread Janne Johansson
Den tis 30 juli 2019 kl 10:33 skrev Massimo Sgaravatto < massimo.sgarava...@gmail.com>: > The documentation that I have seen says that the minimum requirements for > clients to use upmap are: > - CentOs 7.5 or kernel 4.5 > - Luminous version > E.g. right now I am interested about

Re: [ceph-users] How to add 100 new OSDs...

2019-07-25 Thread Janne Johansson
Den tors 25 juli 2019 kl 10:47 skrev 展荣臻(信泰) : > > 1、Adding osds in same one failure domain is to ensure only one PG in pg up > set (ceph pg dump shows)to remap. > 2、Setting "osd_pool_default_min_size=1" is to ensure objects to read/write > uninterruptedly while pg remap. > Is this wrong? > How

Re: [ceph-users] How to add 100 new OSDs...

2019-07-25 Thread Janne Johansson
Den tors 25 juli 2019 kl 04:36 skrev zhanrzh...@teamsun.com.cn < zhanrzh...@teamsun.com.cn>: > I think it should to set "osd_pool_default_min_size=1" before you add osd , > and the osd that you add at a time should in same Failure domain. > That sounds like weird or even bad advice? What is

Re: [ceph-users] Anybody using 4x (size=4) replication?

2019-07-25 Thread Janne Johansson
Den ons 24 juli 2019 kl 21:48 skrev Wido den Hollander : > Right now I'm just trying to find a clever solution to this. It's a 2k > OSD cluster and the likelihood of an host or OSD crashing is reasonable > while you are performing maintenance on a different host. > > All kinds of things have

Re: [ceph-users] Future of Filestore?

2019-07-19 Thread Janne Johansson
Den fre 19 juli 2019 kl 12:43 skrev Marc Roos : > > Maybe a bit of topic, just curious what speeds did you get previously? > Depending on how you test your native drive of 5400rpm, the performance > could be similar. 4k random read of my 7200rpm/5400 rpm results in > ~60iops at 260kB/s. > I also

Re: [ceph-users] What if etcd is lost

2019-07-16 Thread Janne Johansson
Den tis 16 juli 2019 kl 18:15 skrev Oscar Segarra : > Hi Paul, > That is the initial question, is it possible to recover my ceph cluster > (docker based) if I loose all information stored in the etcd... > I don't know if anyone has a clear answer to these questions.. > 1.- I bootstrap a complete

Re: [ceph-users] What if etcd is lost

2019-07-16 Thread Janne Johansson
> -e KV_TYPE=etcd \ > -e KV_IP=192.168.0.20 \ > ceph/daemon osd > > Thanks a lot for your help, > > Óscar > > > > > El mar., 16 jul. 2019 17:34, Janne Johansson > escribió: > >> Den mån 15 juli 2019 kl 23:05 skrev Oscar Segarra < >> oscar.sega

Re: [ceph-users] What if etcd is lost

2019-07-16 Thread Janne Johansson
Den mån 15 juli 2019 kl 23:05 skrev Oscar Segarra : > Hi Frank, > Thanks a lot for your quick response. > Yes, the use case that concerns me is the following: > 1.- I bootstrap a complete cluster mons, osds, mgr, mds, nfs, etc using > etcd as a key store > as a key store ... for what? Are you

Re: [ceph-users] pools limit

2019-07-16 Thread Janne Johansson
Den tis 16 juli 2019 kl 16:16 skrev M Ranga Swami Reddy < swamire...@gmail.com>: > Hello - I have created 10 nodes ceph cluster with 14.x version. Can you > please confirm below: > Q1 - Can I create 100+ pool (or more) on the cluster? (the reason is - > creating a pool per project). Any

Re: [ceph-users] OSD's won't start - thread abort

2019-07-03 Thread Janne Johansson
Den ons 3 juli 2019 kl 20:51 skrev Austin Workman : > > But a very strange number shows up in the active sections of the pg's > that's the same number roughly as 2147483648. This seems very odd, > and maybe the value got lodged somewhere it doesn't belong which is causing > an issue. > >

Re: [ceph-users] slow requests due to scrubbing of very small pg

2019-07-03 Thread Janne Johansson
Den ons 3 juli 2019 kl 09:01 skrev Luk : > Hello, > > I have strange problem with scrubbing. > > When scrubbing starts on PG which belong to default.rgw.buckets.index > pool, I can see that this OSD is very busy (see attachment), and starts > showing many > slow request, after the

Re: [ceph-users] How does monitor know OSD is dead?

2019-07-03 Thread Janne Johansson
Den ons 3 juli 2019 kl 05:41 skrev Bryan Henderson : > I may need to modify the above, though, now that I know how Ceph works, > because I've seen storage server products that use Ceph inside. However, > I'll > bet the people who buy those are not aware that it's designed never to go > down >

Re: [ceph-users] Erasure Coding - FPGA / Hardware Acceleration

2019-06-14 Thread Janne Johansson
Den fre 14 juni 2019 kl 15:47 skrev Sean Redmond : > Hi James, > Thanks for your comments. > I think the CPU burn is more of a concern to soft iron here as they are > using low power ARM64 CPU's to keep the power draw low compared to using > Intel CPU's where like you say the problem maybe less

Re: [ceph-users] Erasure Coding - FPGA / Hardware Acceleration

2019-06-14 Thread Janne Johansson
Den fre 14 juni 2019 kl 13:58 skrev Sean Redmond : > Hi Ceph-Uers, > I noticed that Soft Iron now have hardware acceleration for Erasure > Coding[1], this is interesting as the CPU overhead can be a problem in > addition to the extra disk I/O required for EC pools. > Does anyone know if any other

Re: [ceph-users] OSD caching on EC-pools (heavy cross OSD communication on cached reads)

2019-06-10 Thread Janne Johansson
Den sön 9 juni 2019 kl 18:29 skrev : > make sense - makes the cases for ec pools smaller though. > > Sunday, 9 June 2019, 17.48 +0200 from paul.emmer...@croit.io < > paul.emmer...@croit.io>: > > Caching is handled in BlueStore itself, erasure coding happens on a higher > layer. > > > In your

Re: [ceph-users] RGW metadata pool migration

2019-05-23 Thread Janne Johansson
Den ons 22 maj 2019 kl 17:43 skrev Nikhil Mitra (nikmitra) < nikmi...@cisco.com>: > Hi All, > > What are the metadata pools in an RGW deployment that need to sit on the > fastest medium to better the client experience from an access standpoint ? > > Also is there an easy way to migrate these

Re: [ceph-users] Is there a Ceph-mon data size partition max limit?

2019-05-10 Thread Janne Johansson
Den fre 10 maj 2019 kl 14:48 skrev Poncea, Ovidiu < ovidiu.pon...@windriver.com>: > Oh... joy :) Do you know if, after replay, ceph-mon data will decrease or > do we need to do some manual cleanup? Hopefully we don't keep it in there > forever. > You get the storage back as soon as the situation

Re: [ceph-users] maximum rebuild speed for erasure coding pool

2019-05-10 Thread Janne Johansson
Den tors 9 maj 2019 kl 17:46 skrev Feng Zhang : > Thanks, guys. > > I forgot the IOPS. So since I have 100disks, the total > IOPS=100X100=10K. For the 4+2 erasure, one disk fail, then it needs to > read 5 and write 1 objects.Then the whole 100 disks can do 10K/6 ~ 2K > rebuilding actions per

Re: [ceph-users] maximum rebuild speed for erasure coding pool

2019-05-09 Thread Janne Johansson
Den tors 9 maj 2019 kl 16:17 skrev Marc Roos : > > > Fancy fast WAL/DB/Journals probably help a lot here, since they do > affect the "iops" > > you experience from your spin-drive OSDs. > > What difference can be expected if you have a 100 iops hdd and you start > using > wal/db/journals on

Re: [ceph-users] maximum rebuild speed for erasure coding pool

2019-05-09 Thread Janne Johansson
Den tors 9 maj 2019 kl 15:46 skrev Feng Zhang : > > For erasure pool, suppose I have 10 nodes, each has 10 6TB drives, so > in total 100 drives. I make a 4+2 erasure pool, failure domain is > host/node. Then if one drive failed, (assume the 6TB is fully used), > what the maximum speed the

Re: [ceph-users] Is there a Ceph-mon data size partition max limit?

2019-05-09 Thread Janne Johansson
Den tors 9 maj 2019 kl 11:52 skrev Poncea, Ovidiu < ovidiu.pon...@windriver.com>: > Hi folks, > > What is the commanded size for the ceph-mon data partitions? Is there a > maximum limit to it? If not is there a way to limit it's growth (or celan > it up)? To my knowledge ceph-mon doesn't use a

Re: [ceph-users] rbd ssd pool for (windows) vms

2019-05-06 Thread Janne Johansson
Den mån 6 maj 2019 kl 10:03 skrev Marc Roos : > > Yes but those 'changes' can be relayed via the kernel rbd driver not? > Besides I don't think you can move a rbd block device being used to a > different pool anyway. > > No, but you can move the whole pool, which takes all RBD images with it. >

Re: [ceph-users] Restricting access to RadosGW/S3 buckets

2019-05-03 Thread Janne Johansson
Den tors 2 maj 2019 kl 23:41 skrev Vladimir Brik < vladimir.b...@icecube.wisc.edu>: > Hello > I am trying to figure out a way to restrict access to S3 buckets. Is it > possible to create a RadosGW user that can only access specific bucket(s)? > You can have a user with very small bucket/bytes
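
For reference, the small-quota approach looks roughly like this; the uid and limits are placeholders:

    radosgw-admin user create --uid=backup1 --display-name="Backup user" --max-buckets=1
    radosgw-admin quota set --quota-scope=user --uid=backup1 \
        --max-objects=1000000 --max-size=1099511627776   # 1 TiB, in bytes
    radosgw-admin quota enable --quota-scope=user --uid=backup1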

Re: [ceph-users] rbd ssd pool for (windows) vms

2019-05-03 Thread Janne Johansson
Den ons 1 maj 2019 kl 23:00 skrev Marc Roos : > Do you need to tell the vm's that they are on a ssd rbd pool? Or does > ceph and the libvirt drivers do this automatically for you? > When testing a nutanix acropolis virtual install, I had to 'cheat' it by > adding this > > To make the installer

Re: [ceph-users] clock skew

2019-04-25 Thread Janne Johansson
Den tors 25 apr. 2019 kl 13:00 skrev huang jun : > mj 于2019年4月25日周四 下午6:34写道: > > > > Hi all, > > > > On our three-node cluster, we have setup chrony for time sync, and even > > though chrony reports that it is synced to ntp time, at the same time > > ceph occasionally reports time skews that

Re: [ceph-users] getting pg inconsistent periodly

2019-04-24 Thread Janne Johansson
Den ons 24 apr. 2019 kl 08:46 skrev Zhenshi Zhou : > Hi, > > I'm running a cluster for a period of time. I find the cluster usually > run into unhealthy state recently. > > With 'ceph health detail', one or two pg are inconsistent. What's > more, pg in wrong state each day are not placed on the

Re: [ceph-users] Are there any statistics available on how most production ceph clusters are being used?

2019-04-19 Thread Janne Johansson
Den fre 19 apr. 2019 kl 12:10 skrev Marc Roos : > > [...]since nobody here is interested in a better rgw client for end > users. I am wondering if the rgw is even being used like this, and what > most production environments look like. > > "Like this" ? People use tons of scriptable and built-in

Re: [ceph-users] rgw windows/mac clients shitty, develop a new one?

2019-04-18 Thread Janne Johansson
https://www.reddit.com/r/netsec/comments/8t4xrl/filezilla_malware/ not saying it definitely is, or isn't malware-ridden, but it sure was shady at that time. I would suggest not pointing people to it. Den tors 18 apr. 2019 kl 16:41 skrev Brian : : > Hi Marc > > Filezilla has decent S3 support

Re: [ceph-users] showing active config settings

2019-04-10 Thread Janne Johansson
Den ons 10 apr. 2019 kl 13:37 skrev Eugen Block : > > If you don't specify which daemon to talk to, it tells you what the > > defaults would be for a random daemon started just now using the same > > config as you have in /etc/ceph/ceph.conf. > > I tried that, too, but the result is not correct:
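
For reference, asking a specific running daemon (instead of the client-side defaults) looks like this; osd.0 is a placeholder and the second form assumes Mimic or later:

    # on the host running the daemon, via its admin socket
    ceph daemon osd.0 config show | grep osd_recovery_max_active

    # via the central config database (Mimic and later)
    ceph config show osd.0 osd_recovery_max_active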

Re: [ceph-users] showing active config settings

2019-04-10 Thread Janne Johansson
Den ons 10 apr. 2019 kl 13:31 skrev Eugen Block : > > While --show-config still shows > > host1:~ # ceph --show-config | grep osd_recovery_max_active > osd_recovery_max_active = 3 > > > It seems as if --show-config is not really up-to-date anymore? > Although I can execute it, the option doesn't

Re: [ceph-users] cluster is not stable

2019-03-15 Thread Janne Johansson
Den tors 14 mars 2019 kl 17:00 skrev Zhenshi Zhou : > I think I've found the root cause which make the monmap contains no > feature. As I moved the servers from one place to another, I modified > the monmap once. If this was the empty cluster that you refused to redo from scratch, then I feel it

Re: [ceph-users] ceph migration

2019-02-25 Thread Janne Johansson
Den mån 25 feb. 2019 kl 13:40 skrev Eugen Block : > I just moved a (virtual lab) cluster to a different network, it worked > like a charm. > In an offline method - you need to: > > - set osd noout, ensure there are no OSDs up > - Change the MONs IP, See the bottom of [1] "CHANGING A MONITOR’S IP >
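
A rough outline of the "messy way" from [1], with placeholder monitor names and addresses; follow the documented procedure step by step rather than this sketch:

    ceph mon getmap -o /tmp/monmap                     # while the cluster is still reachable
    monmaptool --print /tmp/monmap
    monmaptool --rm mon1 /tmp/monmap
    monmaptool --add mon1 192.168.10.11:6789 /tmp/monmap
    systemctl stop ceph-mon@mon1
    ceph-mon -i mon1 --inject-monmap /tmp/monmap       # repeat for the remaining MONs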

Re: [ceph-users] ceph migration

2019-02-25 Thread Janne Johansson
Den mån 25 feb. 2019 kl 12:33 skrev Zhenshi Zhou : > I deployed a new cluster(mimic). Now I have to move all servers > in this cluster to another place, with new IP. > I'm not sure if the cluster will run well or not after I modify config > files, include /etc/hosts and /etc/ceph/ceph.conf. No,

Re: [ceph-users] Ceph cluster stability

2019-02-22 Thread Janne Johansson
Den fre 22 feb. 2019 kl 12:35 skrev M Ranga Swami Reddy < swamire...@gmail.com>: > No seen the CPU limitation because we are using the 4 cores per osd daemon. > But still using "ms_crc_data = true and ms_crc_header = true". Will > disable these and try the performance. > I am a bit sceptical to

Re: [ceph-users] How to trim default.rgw.log pool?

2019-02-14 Thread Janne Johansson
While we're at it, a way to know what in the default.rgw...non-ec pool one can remove. We have tons of old zero-size objects there which are probably useless and just take up (meta)space. Den tors 14 feb. 2019 kl 09:26 skrev Charles Alva : > Hi All, > > Is there a way to trim Ceph

Re: [ceph-users] Multicast communication compuverde

2019-02-06 Thread Janne Johansson
For EC coded stuff,at 10+4 with 13 others needing data apart from the primary, they are specifically NOT getting the same data, they are getting either 1/10th of the pieces, or one of the 4 different checksums, so it would be nasty to send full data to all OSDs expecting a 14th of the data. Den

Re: [ceph-users] Multicast communication compuverde

2019-02-06 Thread Janne Johansson
Multicast traffic from storage has a point in things like the old Windows provisioning software Ghost where you could netboot a room full of computers, have them listen to a mcast stream of the same data/image and all apply it at the same time, and perhaps re-sync potentially missing stuff at the

Re: [ceph-users] ceph block - volume with RAID#0

2019-01-31 Thread Janne Johansson
Den fre 1 feb. 2019 kl 06:30 skrev M Ranga Swami Reddy : > Here user requirement is - less write and more reads...so not much > worried on performance . > So why go for raid0 at all? It is the least secure way to store data. -- May the most significant bit of your life be positive.

Re: [ceph-users] ceph block - volume with RAID#0

2019-01-30 Thread Janne Johansson
Den ons 30 jan. 2019 kl 14:47 skrev M Ranga Swami Reddy < swamire...@gmail.com>: > Hello - Can I use the ceph block volume with RAID#0? Are there any > issues with this? > Hard to tell if you mean raid0 over a block volume or a block volume over raid0. Still, it is seldom a good idea to stack

Re: [ceph-users] Best practice for increasing number of pg and pgp

2019-01-30 Thread Janne Johansson
Den ons 30 jan. 2019 kl 05:24 skrev Linh Vu : > > We use https://github.com/cernceph/ceph-scripts ceph-gentle-split script to > slowly increase by 16 pgs at a time until we hit the target. > > Somebody recommends that this adjustment should be done in multiple stages, > e.g. increase 1024 pg

Re: [ceph-users] Modify ceph.mon network required

2019-01-25 Thread Janne Johansson
Den fre 25 jan. 2019 kl 09:52 skrev cmonty14 <74cmo...@gmail.com>: > > Hi, > I have identified a major issue with my cluster setup consisting of 3 nodes: > all monitors are connected to cluster network. > > Question: > How can I modify the network configuration of mon? > > It's not working to

Re: [ceph-users] quick questions about a 5-node homelab setup

2019-01-22 Thread Janne Johansson
Den tis 22 jan. 2019 kl 00:50 skrev Brian Topping : > > I've scrounged up 5 old Atom Supermicro nodes and would like to run them > > 365/7 for limited production as RBD with Bluestore (ideally latest 13.2.4 > > Mimic), triple copy redundancy. Underlying OS is a Debian 9 64 bit, minimal > >

Re: [ceph-users] quick questions about a 5-node homelab setup

2019-01-21 Thread Janne Johansson
Den fre 18 jan. 2019 kl 12:42 skrev Robert Sander : > > Assuming BlueStore is too fat for my crappy nodes, do I need to go to > > FileStore? If yes, then with xfs as the file system? Journal on the SSD as > > a directory, then? > > Journal for FileStore is also a block device. It can be a file

Re: [ceph-users] Problem with CephFS - No space left on device

2019-01-08 Thread Janne Johansson
Den tis 8 jan. 2019 kl 16:05 skrev Yoann Moulin : > The best thing you can do here is added two disks to pf-us1-dfs3. After that, get a fourth host with 4 OSDs on it and add to the cluster. If you have 3 replicas (which is good!), then any downtime will mean the cluster is kept in a degraded

Re: [ceph-users] Balancer=on with crush-compat mode

2019-01-06 Thread Janne Johansson
Den sön 6 jan. 2019 kl 13:22 skrev Marc Roos : > > >If I understand the balancer correct, it balances PGs not data. > >This worked perfectly fine in your case. > > > >I prefer a PG count of ~100 per OSD, you are at 30. Maybe it would > >help to bump the PGs. > > > I am not sure if I should

Re: [ceph-users] list admin issues

2018-12-26 Thread Janne Johansson
Den lör 22 dec. 2018 kl 19:18 skrev Brian : : > Sorry to drag this one up again. Not as sorry to drag it up as you > Just got the unsubscribed due to excessive bounces thing. And me. > 'Your membership in the mailing list ceph-users has been disabled due > to excessive bounces The last bounce

Re: [ceph-users] Bluestore nvme DB/WAL size

2018-12-21 Thread Janne Johansson
Den tors 20 dec. 2018 kl 22:45 skrev Vladimir Brik : > Hello > I am considering using logical volumes of an NVMe drive as DB or WAL > devices for OSDs on spinning disks. > The documentation recommends against DB devices smaller than 4% of slow > disk size. Our servers have 16x 10TB HDDs and a
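
For reference, carving per-OSD DB logical volumes out of the NVMe looks roughly like this; names and the 60G size are placeholders, not a sizing recommendation:

    vgcreate nvme-db /dev/nvme0n1
    lvcreate -L 60G -n db-sdb nvme-db
    ceph-volume lvm create --bluestore --data /dev/sdb --block.db nvme-db/db-sdb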

Re: [ceph-users] Scheduling deep-scrub operations

2018-12-14 Thread Janne Johansson
Den fre 14 dec. 2018 kl 12:25 skrev Caspar Smit : > We have operating hours from 4 pm until 7 am each weekday and 24 hour days in > the weekend. > I was wondering if it's possible to allow deep-scrubbing from 7 am until 15 > pm only on weekdays and prevent any deep-scrubbing in the weekend. >
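
One crude way to get a weekday-only window is to toggle the nodeep-scrub flag from cron on an admin node, since osd_scrub_begin_hour/osd_scrub_end_hour only describe a daily window; this is a sketch, not the thread's conclusion:

    # crontab on a node with a client.admin keyring
    0 7  * * 1-5   ceph osd unset nodeep-scrub   # allow deep-scrub from 07:00, Mon-Fri
    0 15 * * 1-5   ceph osd set nodeep-scrub     # block it again from 15:00 and all weekend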

Re: [ceph-users] yet another deep-scrub performance topic

2018-12-11 Thread Janne Johansson
Den tis 11 dec. 2018 kl 12:54 skrev Caspar Smit : > > On a Luminous 12.2.7 cluster these are the defaults: > ceph daemon osd.x config show thank you very much. -- May the most significant bit of your life be positive. ___ ceph-users mailing list

Re: [ceph-users] yet another deep-scrub performance topic

2018-12-11 Thread Janne Johansson
Den tis 11 dec. 2018 kl 12:26 skrev Caspar Smit : > > Furthermore, presuming you are running Jewel or Luminous you can change some > settings in ceph.conf to mitigate the deep-scrub impact: > > osd scrub max interval = 4838400 > osd scrub min interval = 2419200 > osd scrub interval randomize

Re: [ceph-users] High average apply latency Firefly

2018-12-04 Thread Janne Johansson
Den tis 4 dec. 2018 kl 11:20 skrev Klimenko, Roman : > > Hi everyone! > > On the old prod cluster > - baremetal, 5 nodes (24 cpu, 256G RAM) > - ceph 0.80.9 filestore > - 105 osd, size 114TB (each osd 1.1T, SAS Seagate ST1200MM0018) , raw used 60% > - 15 journals (each journal 0.4TB, Toshiba

Re: [ceph-users] all vms can not start up when boot all the ceph hosts.

2018-12-04 Thread Janne Johansson
Den tis 4 dec. 2018 kl 10:37 skrev linghucongsong : > Thank you for reply! > But it is just in case suddenly power off for all the hosts! > So the best way for this it is to have the snapshot on the import vms or > have to mirror the > images to other ceph cluster? Best way is probably to do

Re: [ceph-users] all vms can not start up when boot all the ceph hosts.

2018-12-04 Thread Janne Johansson
Den tis 4 dec. 2018 kl 09:49 skrev linghucongsong : > HI all! > > I have a ceph test envirment use ceph with openstack. There are some vms > run on the openstack. It is just a test envirment. > my ceph version is 12.2.4. Last day I reboot all the ceph hosts before > this I do not shutdown the vms

Re: [ceph-users] Disable intra-host replication?

2018-11-26 Thread Janne Johansson
Den mån 26 nov. 2018 kl 12:11 skrev Marco Gaiarin : > Mandi! Janne Johansson > In chel di` si favelave... > > > The default crush rules with replication=3 would only place PGs on > > separate hosts, > > so in that case it would go into degraded mode if a node g

Re: [ceph-users] Sizing for bluestore db and wal

2018-11-26 Thread Janne Johansson
Den mån 26 nov. 2018 kl 10:10 skrev Felix Stolte : > > Hi folks, > > i upgraded our ceph cluster from jewel to luminous and want to migrate > from filestore to bluestore. Currently we use one SSD as journal for > thre 8TB Sata Drives with a journal partition size of 40GB. If my > understanding of

Re: [ceph-users] Degraded objects afte: ceph osd in $osd

2018-11-26 Thread Janne Johansson
Den mån 26 nov. 2018 kl 09:39 skrev Stefan Kooman : > > It is a slight mistake in reporting it in the same way as an error, > > even if it looks to the > > cluster just as if it was in error and needs fixing. This gives the > > new ceph admins a > > sense of urgency or danger whereas it should be

Re: [ceph-users] Degraded objects afte: ceph osd in $osd

2018-11-26 Thread Janne Johansson
Den sön 25 nov. 2018 kl 22:10 skrev Stefan Kooman : > > Hi List, > > Another interesting and unexpected thing we observed during cluster > expansion is the following. After we added extra disks to the cluster, > while "norebalance" flag was set, we put the new OSDs "IN". As soon as > we did that

Re: [ceph-users] Disable intra-host replication?

2018-11-23 Thread Janne Johansson
Den fre 23 nov. 2018 kl 15:19 skrev Marco Gaiarin : > > > Previous (partial) node failures and my current experiments on adding a > node lead me to the fact that, when rebalancing are needed, ceph > rebalance also on intra-node: eg, if an OSD of a node die, data are > rebalanced on all OSD, even

Re: [ceph-users] New OSD with weight 0, rebalance still happen...

2018-11-23 Thread Janne Johansson
Den fre 23 nov. 2018 kl 11:08 skrev Marco Gaiarin : > Reading ceph docs lead to me that 'ceph osd reweight' and 'ceph osd crush > reweight' was roughly the same, the first is effectively 'temporary' > and expressed in percentage (0-1), while the second is 'permanent' and > expressed, normally, as
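
For reference, the two commands and their units, with a placeholder OSD:

    ceph osd reweight 12 0.8                 # temporary override, range 0..1
    ceph osd crush reweight osd.12 5.458     # permanent CRUSH weight, usually the disk size in TiB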

Re: [ceph-users] I can't find the configuration of user connection log in RADOSGW

2018-11-12 Thread Janne Johansson
Den mån 12 nov. 2018 kl 06:19 skrev 대무무 : > > Hello. > I installed ceph framework in 6 servers and I want to manage the user access > log. So I configured ceph.conf in the server which installing the rgw. > > ceph.conf > [client.rgw.~~~] > ... > rgw enable usage log = True > > However, I
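
For reference, once rgw_enable_usage_log is set and the rgw daemon restarted, the data is read back with radosgw-admin; the uid and dates are placeholders. Note that the usage log is per-user/per-bucket accounting, not a full request access log:

    radosgw-admin usage show --uid=someuser --start-date=2018-11-01 --end-date=2018-11-12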

Re: [ceph-users] ceph 12.2.9 release

2018-11-08 Thread Janne Johansson
Den ons 7 nov. 2018 kl 18:43 skrev David Turner : > > My big question is that we've had a few of these releases this year that are > bugged and shouldn't be upgraded to... They don't have any release notes or > announcement and the only time this comes out is when users finally ask about > it

Re: [ceph-users] list admin issues

2018-11-06 Thread Janne Johansson
Den lör 6 okt. 2018 kl 15:06 skrev Elias Abacioglu : > I'm bumping this old thread cause it's getting annoying. My membership get > disabled twice a month. > Between my two Gmail accounts I'm in more than 25 mailing lists and I see > this behavior only here. Why is only ceph-users only affected?

Re: [ceph-users] EC K + M Size

2018-11-03 Thread Janne Johansson
Den lör 3 nov. 2018 kl 09:10 skrev Ashley Merrick : > > Hello, > > Tried to do some reading online but was unable to find much. > > I can imagine a higher K + M size with EC requires more CPU to re-compile the > shards into the required object. > > But is there any benefit or negative going with

Re: [ceph-users] Priority for backfilling misplaced and degraded objects

2018-11-01 Thread Janne Johansson
I think that all the misplaced PGs that are in the queue that get writes _while_ waiting for backfill will get the "degraded" status, meaning that before they were just on the wrong place, now they are on the wrong place, AND the newly made PG they should backfill into will get an old dump made

Re: [ceph-users] Misplaced/Degraded objects priority

2018-10-24 Thread Janne Johansson
Den ons 24 okt. 2018 kl 13:09 skrev Florent B : > On a Luminous cluster having some misplaced and degraded objects after > outage : > > health: HEALTH_WARN > 22100/2496241 objects misplaced (0.885%) > Degraded data redundancy: 964/2496241 objects degraded > (0.039%), 3 p >

Re: [ceph-users] RGW stale buckets

2018-10-23 Thread Janne Johansson
When you run rgw it creates a ton of pools, so one of the other pools were holding the indexes of what buckets there are, and the actual data is what got stored in default.rgw.data (or whatever name it had), so that cleanup was not complete and this is what causes your issues, I'd say. How to

Re: [ceph-users] What is rgw.none

2018-10-22 Thread Janne Johansson
Den mån 6 aug. 2018 kl 12:58 skrev Tomasz Płaza : > Hi all, > > I have a bucket with a vary big num_objects in rgw.none: > > { > "bucket": "dyna", > > "usage": { > "rgw.none": { > > "num_objects": 18446744073709551615 > } > > What is rgw.none and is this big number OK?

Re: [ceph-users] ceph df space usage confusion - balancing needed?

2018-10-20 Thread Janne Johansson
terms > of raw storage, is about 50 % used. > > But in terms of storage shown for that pool, it's almost 63 % %USED. > So I guess this can purely be from bad balancing, correct? > > Cheers, > Oliver > > Am 20.10.18 um 19:49 schrieb Janne Johansson: > > Do mi

Re: [ceph-users] ceph df space usage confusion - balancing needed?

2018-10-20 Thread Janne Johansson
Do mind that drives may have more than one pool on them, so RAW space is what it says, how much free space there is. Then the avail and %USED on per-pool stats will take replication into account, it can tell how much data you may write into that particular pool, given that pools replication or EC

Re: [ceph-users] Does anyone use interactive CLI mode?

2018-10-11 Thread Janne Johansson
Den ons 10 okt. 2018 kl 16:20 skrev John Spray : > So the question is: does anyone actually use this feature? It's not > particularly expensive to maintain, but it might be nice to have one > less path through the code if this is entirely unused. It can go as far as I am concerned too. Better

Re: [ceph-users] list admin issues

2018-10-06 Thread Janne Johansson
Den lör 6 okt. 2018 kl 15:06 skrev Elias Abacioglu : > > Hi, > > I'm bumping this old thread cause it's getting annoying. My membership get > disabled twice a month. > Between my two Gmail accounts I'm in more than 25 mailing lists and I see > this behavior only here. Why is only ceph-users only

Re: [ceph-users] hardware heterogeneous in same pool

2018-10-04 Thread Janne Johansson
Den tors 4 okt. 2018 kl 00:09 skrev Bruno Carvalho : > Hi Cephers, I would like to know how you are growing the cluster. > Using dissimilar hardware in the same pool or creating a pool for each > different hardware group. > What problem would I have many problems using different hardware (CPU, >

Re: [ceph-users] cephfs issue with moving files between data pools gives Input/output error

2018-10-02 Thread Janne Johansson
Den mån 1 okt. 2018 kl 22:08 skrev John Spray : > > > totally new for me, also not what I would expect of a mv on a fs. I know > > this is normal to expect coping between pools, also from the s3cmd > > client. But I think more people will not expect this behaviour. Can't > > the move be

Re: [ceph-users] Mimic upgrade failure

2018-09-10 Thread Janne Johansson
Den mån 10 sep. 2018 kl 08:10 skrev Kevin Hrpcek : > Update for the list archive. > > I went ahead and finished the mimic upgrade with the osds in a fluctuating > state of up and down. The cluster did start to normalize a lot easier after > everything was on mimic since the random mass OSD

Re: [ceph-users] advice with erasure coding

2018-09-07 Thread Janne Johansson
Den fre 7 sep. 2018 kl 13:44 skrev Maged Mokhtar : > > Good day Cephers, > > I want to get some guidance on erasure coding, the docs do state the > different plugins and settings but to really understand them all and their > use cases is not easy: > > -Are the majority of implementations using

Re: [ceph-users] Luminous RGW errors at start

2018-09-03 Thread Janne Johansson
Did you change the default pg_num or pgp_num so the pools that did show up made it go past the mon_max_pg_per_osd ? Den fre 31 aug. 2018 kl 17:20 skrev Robert Stanford : > > I installed a new Luminous cluster. Everything is fine so far. Then I > tried to start RGW and got this error: > >

Re: [ceph-users] Network cluster / addr

2018-08-21 Thread Janne Johansson
Den tis 21 aug. 2018 kl 09:31 skrev Nino Bosteels : > > * Does ceph interpret multiple values for this in the ceph.conf (I > wouldn’t say so out of my tests)? > > * Shouldn’t public network be your internet facing range and cluster > network the private range? > "Public" doesn't necessarily mean
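
For reference, the two settings in ceph.conf; the ranges are placeholders and neither network needs to be internet-facing:

    [global]
    public network  = 10.0.1.0/24    # clients, MONs, MDS and RGW traffic
    cluster network = 10.0.2.0/24    # OSD-to-OSD replication and recovery traffic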

Re: [ceph-users] limited disk slots - should I ran OS on SD card ?

2018-08-15 Thread Janne Johansson
Den ons 15 aug. 2018 kl 10:04 skrev Wido den Hollander : > > This is the case for filesystem journals (xfs, ext4, almost all modern > > filesystems). Been there, done that, had two storage systems failing due > > to SD wear > > > > I've been running OS on the SuperMicro 64 and 128GB SATA-DOMs

Re: [ceph-users] Least impact when adding PG's

2018-08-14 Thread Janne Johansson
Den mån 13 aug. 2018 kl 23:30 skrev : > > > Am 7. August 2018 18:08:05 MESZ schrieb John Petrini < > jpetr...@coredial.com>: > >Hi All, > > Hi John, > > > > >Any advice? > > > > I am Not sure but what i would do is to increase the PG Step by Step and > always with a value of "Power of two" i.e.
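
For reference, one such step with a placeholder pool name; pgp_num follows pg_num, and data movement should settle before the next step:

    ceph osd pool set rbd pg_num 2048
    ceph osd pool set rbd pgp_num 2048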
