Re: [ceph-users] mgr dashboard differs from ceph status

2018-05-07 Thread Janne Johansson
2018-05-04 8:21 GMT+02:00 Tracy Reed : > > services: > mon: 3 daemons, quorum ceph01,ceph03,ceph07 > mgr: ceph01(active), standbys: ceph-ceph07, ceph03 > osd: 78 osds: 78 up, 78 in > Don't know if it matters, but the naming seems different even though I guess

Re: [ceph-users] The mystery of sync modules

2018-04-27 Thread Janne Johansson
2018-04-27 17:33 GMT+02:00 Sean Purdy : > > Mimic has a new feature, a cloud sync module for radosgw to sync objects > to some other S3-compatible destination. > > This would be a lovely thing to have here, and ties in nicely with object > versioning and DR. But I am put

Re: [ceph-users] What is the meaning of size and min_size for erasure-coded pools?

2018-05-08 Thread Janne Johansson
2018-05-08 1:46 GMT+02:00 Maciej Puzio : > Paul, many thanks for your reply. > Thinking about it, I can't decide if I'd prefer to operate the storage > server without redundancy, or have it automatically force a downtime, > subjecting me to a rage of my users and my boss. >

Re: [ceph-users] multi site with cephfs

2018-05-21 Thread Janne Johansson
Den mån 21 maj 2018 kl 18:28 skrev Up Safe : > I don't believe I have this kind of behavior. > AFAIK, files are created or modified by only 1 client at a time. > Make sure that this is the case then, it's _very_ easy to start out with something along the lines of "right now I

Re: [ceph-users] Ceph replication factor of 2

2018-05-24 Thread Janne Johansson
Den tors 24 maj 2018 kl 00:20 skrev Jack : > Hi, > > I have to say, this is a common yet worthless argument > If I have 3000 OSD, using 2 or 3 replica will not change much : the > probability of losing 2 devices is still "high" > On the other hand, if I have a small cluster,

Re: [ceph-users] Ceph replication factor of 2

2018-05-25 Thread Janne Johansson
Den fre 25 maj 2018 kl 00:20 skrev Jack : > On 05/24/2018 11:40 PM, Stefan Kooman wrote: > >> What are your thoughts, would you run 2x replication factor in > >> Production and in what scenarios? > Me neither, mostly because I have yet to read a technical point of view, >

Re: [ceph-users] Why the change from ceph-disk to ceph-volume and lvm? (and just not stick with direct disk access)

2018-06-08 Thread Janne Johansson
Den fre 8 juni 2018 kl 12:35 skrev Marc Roos : > > I am getting the impression that not everyone understands the subject > that has been raised here. > Or they do and they do not agree with your vision of how things should be done. That is a distinct possibility one has to consider when using

Re: [ceph-users] performance exporting RBD over NFS

2018-06-18 Thread Janne Johansson
Den mån 18 juni 2018 kl 14:55 skrev Marc Boisis : > Hi, > > I want to export rbd over nfs in a 10Gb network. Server and Client are > DELL R620 with 10Gb nics. > > NFS client write bandwith on the rbd export is only 233MB/s. > > > My conclusion: > - rbd write performance is good >

Re: [ceph-users] Crush maps : split the root in two parts on an OSD node with same disks ?

2018-06-12 Thread Janne Johansson
Den tis 12 juni 2018 kl 15:06 skrev Hervé Ballans < herve.ball...@ias.u-psud.fr>: > Hi all, > > I have a cluster with 6 OSD nodes, each has 20 disks, all of the 120 > disks are strictly identical (model and size). > (The cluster is also composed of 3 MON servers on 3 other machines) > > For

Re: [ceph-users] Radosgw

2018-05-28 Thread Janne Johansson
Den mån 28 maj 2018 kl 15:28 skrev Marc-Antoine Desrochers < marc-antoine.desroch...@sogetel.com>: > Hi, > > > > I'm new in a business and I took on the ceph project. > > I'm still a newbie on that subject and I try to understand what the > previous guy was trying to do. > > > > Is there any reason

Re: [ceph-users] Dashboard runs on all manager instances?

2018-01-09 Thread Janne Johansson
2018-01-09 19:34 GMT+01:00 Tim Bishop : > Hi, > > I've recently upgraded from Jewel to Luminous and I'm therefore new to > using the Dashboard. I noted this section in the documentation: > > http://docs.ceph.com/docs/master/mgr/dashboard/#load-balancer > > "Please

Re: [ceph-users] Reduced data availability: 4 pgs inactive, 4 pgs incomplete

2018-01-05 Thread Janne Johansson
2018-01-05 6:56 GMT+01:00 Brent Kennedy : > We have upgraded from Hammer to Jewel and then Luminous 12.2.2 as of > today. During the hammer upgrade to Jewel we lost two host servers and let > the cluster rebalance/recover, it ran out of space and stalled. We then > added

Re: [ceph-users] Incomplete pgs and no data movement ( cluster appears readonly )

2018-01-10 Thread Janne Johansson
2018-01-10 8:51 GMT+01:00 Brent Kennedy : > As per a previous thread, my pgs are set too high. I tried adjusting the > “mon max pg per osd” up higher and higher, which did clear the > error(restarted monitors and managers each time), but it seems that data > simply wont move

Re: [ceph-users] issue adding OSDs

2018-01-12 Thread Janne Johansson
Running "ceph mon versions" and "ceph osd versions" and so on as you do the upgrades would have helped I guess. 2018-01-11 17:28 GMT+01:00 Luis Periquito : > this was a bit weird, but is now working... Writing for future > reference if someone faces the same issue. > > this

Re: [ceph-users] Help rebalancing OSD usage, Luminus 1.2.2

2018-01-31 Thread Janne Johansson
2018-01-30 17:24 GMT+01:00 Bryan Banister : > Hi all, > > > > We are still very new to running a Ceph cluster and have run a RGW cluster > for a while now (6-ish mo), it mainly holds large DB backups (Write once, > read once, delete after N days). The system is now

Re: [ceph-users] rgw s3 clients android windows macos

2018-02-01 Thread Janne Johansson
Perhaps you should look at fuse-s3 clients? On win, mountainduck (from same people as cyberduck) and Cloudberry will in paid versions allow a mount to look like any SMB share using S3 as a backend. 2018-02-01 12:27 GMT+01:00 Marc Roos : > > I was just wondering if

Re: [ceph-users] Debugging fstrim issues

2018-01-29 Thread Janne Johansson
2018-01-29 12:29 GMT+01:00 Nathan Harper : > Hi, > I don't know if this is strictly a Ceph issue, but hoping someone will be > able to shed some light. We have an Openstack environment (Ocata) backed > onto a Jewel cluster. > We recently ran into some issues with full

Re: [ceph-users] Help rebalancing OSD usage, Luminus 1.2.2

2018-01-31 Thread Janne Johansson
2018-01-31 15:58 GMT+01:00 Bryan Banister : > > > > Given that this will move data around (I think), should we increase the > pg_num and pgp_num first and then see how it looks? > > > I guess adding pgs and pgps will move stuff around too, but if the PGCALC formula

Re: [ceph-users] degraded PGs when adding OSDs

2018-02-09 Thread Janne Johansson
2018-02-08 23:38 GMT+01:00 Simon Ironside : > Hi Everyone, > I recently added an OSD to an active+clean Jewel (10.2.3) cluster and was > surprised to see a peak of 23% objects degraded. Surely this should be at > or near zero and the objects should show as misplaced? >

Re: [ceph-users] Questions about pg num setting

2018-01-03 Thread Janne Johansson
In some common cases (when you have lots of objects per pg) ceph will warn about it. 2018-01-03 11:10 GMT+01:00 Marc Roos : > > > Is there a disadvantage to just always start pg_num and pgp_num with > something low like 8, and then later increase it when necessary? > >
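
A minimal sketch of growing a pool later, assuming a pre-Nautilus cluster where pgp_num has to be raised by hand after pg_num (the pool name "rbd" and the numbers are placeholders):

  # split the PGs first...
  ceph osd pool set rbd pg_num 16
  # ...then raise pgp_num to match so the new PGs actually get placed
  ceph osd pool set rbd pgp_num 16

Each pgp_num bump causes backfill, so doing it in small steps and waiting for HEALTH_OK in between keeps the impact down.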

Re: [ceph-users] limited disk slots - should I ran OS on SD card ?

2018-08-15 Thread Janne Johansson
Den ons 15 aug. 2018 kl 10:04 skrev Wido den Hollander : > > This is the case for filesystem journals (xfs, ext4, almost all modern > > filesystems). Been there, done that, had two storage systems failing due > > to SD wear > > > > I've been running OS on the SuperMicro 64 and 128GB SATA-DOMs

Re: [ceph-users] Stale PG data loss

2018-08-13 Thread Janne Johansson
"Don't run with replication 1 ever". Even if this is a test, it tests something for which a resilient cluster is specifically designed to avoid. As for enumerating what data is missing, it would depend on if the pool(s) had cephfs, rbd images or rgw data in them. When this kind of data loss

Re: [ceph-users] understanding pool capacity and usage

2018-08-10 Thread Janne Johansson
Den fre 27 juli 2018 kl 12:24 skrev Anton Aleksandrov : > Hello, > > Might sounds strange, but I could not find answer in google or docs, might > be called somehow else. > > I dont understand pool capacity policy and how to set/define it. I have > created simple cluster for CephFS on 4 servers,

Re: [ceph-users] Applicability and migration path

2018-08-10 Thread Janne Johansson
Den fre 10 aug. 2018 kl 04:33 skrev Matthew Pounsett : > > First, in my tests and reading I haven't encountered anything that > suggests I should expect problems from using a small number of large file > servers in a cluster. But I recognize that this isn't the preferred > configuration, and I'm

Re: [ceph-users] questions about rbd used percentage

2018-08-10 Thread Janne Johansson
You can halve the time by running "rbd du" once, keeping the output, and running the grep over that output instead. Den tors 2 aug. 2018 kl 12:53 skrev : > Hi! > > I want to monitor rbd image size to enable enlarging the size when > use percentage is above 80%. > > I find a way with `rbd du`: > > total=$(rbd du
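
A minimal sketch of that idea, assuming the pool is named "rbd" (a placeholder) and the plain "rbd du" output is what gets grepped:

  out=$(rbd du -p rbd)
  # reuse the saved output for every check instead of re-running rbd du
  echo "$out" | grep myimage
  echo "$out" | grep TOTAL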

Re: [ceph-users] Secure way to wipe a Ceph cluster

2018-08-10 Thread Janne Johansson
Den fre 27 juli 2018 kl 21:20 skrev Patrick Donnelly : > > > as part of deprovisioning customers, we regularly have the task of > > wiping their Ceph clusters. Is there a certifiable, GDPR compliant way > > to do so without physically shredding the disks? > > This should work and should be as

Re: [ceph-users] Least impact when adding PG's

2018-08-14 Thread Janne Johansson
Den mån 13 aug. 2018 kl 23:30 skrev : > > > Am 7. August 2018 18:08:05 MESZ schrieb John Petrini < > jpetr...@coredial.com>: > >Hi All, > > Hi John, > > > > >Any advice? > > > > I am Not sure but what i would do is to increase the PG Step by Step and > always with a value of "Power of two" i.e.

Re: [ceph-users] Network cluster / addr

2018-08-21 Thread Janne Johansson
Den tis 21 aug. 2018 kl 09:31 skrev Nino Bosteels : > > * Does ceph interpret multiple values for this in the ceph.conf (I > wouldn’t say so out of my tests)? > > * Shouldn’t public network be your internet facing range and cluster > network the private range? > "Public" doesn't necessarily mean

Re: [ceph-users] Mimic upgrade failure

2018-09-10 Thread Janne Johansson
Den mån 10 sep. 2018 kl 08:10 skrev Kevin Hrpcek : > Update for the list archive. > > I went ahead and finished the mimic upgrade with the osds in a fluctuating > state of up and down. The cluster did start to normalize a lot easier after > everything was on mimic since the random mass OSD

Re: [ceph-users] advice with erasure coding

2018-09-07 Thread Janne Johansson
Den fre 7 sep. 2018 kl 13:44 skrev Maged Mokhtar : > > Good day Cephers, > > I want to get some guidance on erasure coding, the docs do state the > different plugins and settings but to really understand them all and their > use cases is not easy: > > -Are the majority of implementations using

Re: [ceph-users] CHOOSING THE NUMBER OF PLACEMENT GROUPS

2018-03-09 Thread Janne Johansson
2018-03-09 10:27 GMT+01:00 Will Zhao : > Hi all: > > I have a tiny question. I have read the documents, and it > recommend approximately 100 placement groups for normal usage. > Per OSD. Approximately 100 PGs per OSD, when all used pools are summed up. For things like
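
A rough worked example of that rule of thumb, with made-up numbers: 10 OSDs at ~100 PGs per OSD and replicated size 3 gives 10 * 100 / 3 = ~333, rounded to a power of two, so 256 (or 512) PGs in total, to be split across all pools in proportion to how much data each pool will hold; one big data pool gets most of them, small metadata pools only a handful.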

Re: [ceph-users] Radosgw switch from replicated to erasure

2018-04-20 Thread Janne Johansson
2018-04-20 6:06 GMT+02:00 Marc Roos : > > I want to start using the radowsgw a bit. For now I am fine with the 3 > replicated setup, in the near future when I add a host. I would like to > switch to ec, is there something I should do now to make this switch > more

Re: [ceph-users] What do you use to benchmark your rgw?

2018-03-28 Thread Janne Johansson
s3cmd and cli version of cyberduck to test it end-to-end using parallelism if possible. Getting some 100MB/s at most, from 500km distance over https against 5*radosgw behind HAProxy. 2018-03-28 11:17 GMT+02:00 Matthew Vernon : > Hi, > > What are people here using to
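
A minimal end-to-end sketch with s3cmd, assuming s3cmd is already configured against the radosgw endpoint and that the bucket name "bench" is a placeholder:

  dd if=/dev/zero of=test-1G.bin bs=1M count=1024
  s3cmd mb s3://bench
  time s3cmd put test-1G.bin s3://bench/
  time s3cmd get s3://bench/test-1G.bin test-copy.bin

Running several such transfers in parallel (separate shells, or xargs -P) says more about what the gateways can sustain than a single stream does.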

Re: [ceph-users] Replicated pool with an even size - has min_size to be bigger than half the size?

2018-03-29 Thread Janne Johansson
2018-03-29 11:50 GMT+02:00 David Rabel <ra...@b1-systems.de>: > On 29.03.2018 11:43, Janne Johansson wrote: > > 2018-03-29 11:39 GMT+02:00 David Rabel <ra...@b1-systems.de>: > > > >> For example a replicated pool with size 4: Do i always have to set the &

Re: [ceph-users] Replicated pool with an even size - has min_size to be bigger than half the size?

2018-03-29 Thread Janne Johansson
2018-03-29 11:39 GMT+02:00 David Rabel : > Hi there. > > Are there possibilities to prevent osd-split-brain in a replicated pool > with an even size? Or do you always have to make min_size big enough to > cover this? > > For example a replicated pool with size 4: Do i always

Re: [ceph-users] What do you use to benchmark your rgw?

2018-03-28 Thread Janne Johansson
2018-03-28 16:21 GMT+02:00 David Byte : > I use cosbench (the last rc works well enough). I can get multiple GB/s > from my 6 node cluster with 2 RGWs. > > > To add info to this, it's not unexpectedly low for us, we know the S3+https layer added latencies, and it is EC pools on

Re: [ceph-users] Cluster is empty but it still use 1Gb of data

2018-03-02 Thread Janne Johansson
2018-03-02 11:21 GMT+01:00 Max Cuttins : > Hi everybody, > > i deleted everything from the cluster after some test with RBD. > Now I see that there something still in use: > > data: > pools: 0 pools, 0 pgs > objects: 0 objects, 0 bytes > usage: *9510 MB

Re: [ceph-users] What is rgw.none

2018-10-22 Thread Janne Johansson
Den mån 6 aug. 2018 kl 12:58 skrev Tomasz Płaza : > Hi all, > > I have a bucket with a very big num_objects in rgw.none: > > { > "bucket": "dyna", > > "usage": { > "rgw.none": { > > "num_objects": 18446744073709551615 > } > > What is rgw.none and is this big number OK?

Re: [ceph-users] RGW stale buckets

2018-10-23 Thread Janne Johansson
When you run rgw it creates a ton of pools, so one of the other pools were holding the indexes of what buckets there are, and the actual data is what got stored in default.rgw.data (or whatever name it had), so that cleanup was not complete and this is what causes your issues, I'd say. How to

Re: [ceph-users] Misplaced/Degraded objects priority

2018-10-24 Thread Janne Johansson
Den ons 24 okt. 2018 kl 13:09 skrev Florent B : > On a Luminous cluster having some misplaced and degraded objects after > outage : > > health: HEALTH_WARN > 22100/2496241 objects misplaced (0.885%) > Degraded data redundancy: 964/2496241 objects degraded > (0.039%), 3 p >

Re: [ceph-users] Priority for backfilling misplaced and degraded objects

2018-11-01 Thread Janne Johansson
I think that all the misplaced PGs in the queue that get writes _while_ waiting for backfill will get the "degraded" status, meaning that before they were just in the wrong place; now they are in the wrong place, AND the newly made PG they should backfill into will get an old dump made

Re: [ceph-users] EC K + M Size

2018-11-03 Thread Janne Johansson
Den lör 3 nov. 2018 kl 09:10 skrev Ashley Merrick : > > Hello, > > Tried to do some reading online but was unable to find much. > > I can imagine a higher K + M size with EC requires more CPU to re-compile the > shards into the required object. > > But is there any benefit or negative going with

Re: [ceph-users] ceph df space usage confusion - balancing needed?

2018-10-20 Thread Janne Johansson
terms > of raw storage, is about 50 % used. > > But in terms of storage shown for that pool, it's almost 63 % %USED. > So I guess this can purely be from bad balancing, correct? > > Cheers, > Oliver > > Am 20.10.18 um 19:49 schrieb Janne Johansson: > > Do mi

Re: [ceph-users] ceph df space usage confusion - balancing needed?

2018-10-20 Thread Janne Johansson
Do mind that drives may have more than one pool on them, so RAW space is what it says, how much free space there is. Then the avail and %USED on per-pool stats will take replication into account, it can tell how much data you may write into that particular pool, given that pools replication or EC
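
A small worked example of reading it, with made-up numbers: if "ceph df" shows 30 TiB of raw space free and a pool is replicated size 3, that pool's MAX AVAIL will be roughly 30/3 = 10 TiB (a 4+2 EC pool would show about 30 * 4/6 = 20 TiB instead), and %USED is computed against that per-pool figure, not against the raw total.

  ceph df detail      # GLOBAL raw used/avail, plus per-pool USED, %USED, MAX AVAIL
  ceph osd df tree    # per-OSD fill levels, handy for spotting bad balancing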

Re: [ceph-users] list admin issues

2018-11-06 Thread Janne Johansson
Den lör 6 okt. 2018 kl 15:06 skrev Elias Abacioglu : > I'm bumping this old thread cause it's getting annoying. My membership get > disabled twice a month. > Between my two Gmail accounts I'm in more than 25 mailing lists and I see > this behavior only here. Why is only ceph-users only affected?

Re: [ceph-users] ceph 12.2.9 release

2018-11-08 Thread Janne Johansson
Den ons 7 nov. 2018 kl 18:43 skrev David Turner : > > My big question is that we've had a few of these releases this year that are > bugged and shouldn't be upgraded to... They don't have any release notes or > announcement and the only time this comes out is when users finally ask about > it

Re: [ceph-users] I can't find the configuration of user connection log in RADOSGW

2018-11-12 Thread Janne Johansson
Den mån 12 nov. 2018 kl 06:19 skrev 대무무 : > > Hello. > I installed ceph framework in 6 servers and I want to manage the user access > log. So I configured ceph.conf in the server which installing the rgw. > > ceph.conf > [client.rgw.~~~] > ... > rgw enable usage log = True > > However, I

Re: [ceph-users] cephfs issue with moving files between data pools gives Input/output error

2018-10-02 Thread Janne Johansson
Den mån 1 okt. 2018 kl 22:08 skrev John Spray : > > > totally new for me, also not what I would expect of a mv on a fs. I know > > this is normal to expect coping between pools, also from the s3cmd > > client. But I think more people will not expect this behaviour. Can't > > the move be

Re: [ceph-users] Does anyone use interactive CLI mode?

2018-10-11 Thread Janne Johansson
Den ons 10 okt. 2018 kl 16:20 skrev John Spray : > So the question is: does anyone actually use this feature? It's not > particularly expensive to maintain, but it might be nice to have one > less path through the code if this is entirely unused. It can go as far as I am concerned too. Better

Re: [ceph-users] hardware heterogeneous in same pool

2018-10-04 Thread Janne Johansson
Den tors 4 okt. 2018 kl 00:09 skrev Bruno Carvalho : > Hi Cephers, I would like to know how you are growing the cluster. > Using dissimilar hardware in the same pool or creating a pool for each > different hardware group. > What problem would I have many problems using different hardware (CPU, >

Re: [ceph-users] list admin issues

2018-10-06 Thread Janne Johansson
Den lör 6 okt. 2018 kl 15:06 skrev Elias Abacioglu : > > Hi, > > I'm bumping this old thread cause it's getting annoying. My membership get > disabled twice a month. > Between my two Gmail accounts I'm in more than 25 mailing lists and I see > this behavior only here. Why is only ceph-users only

Re: [ceph-users] Luminous RGW errors at start

2018-09-03 Thread Janne Johansson
Did you change the default pg_num or pgp_num so the pools that did show up made it go past the mon_max_pg_per_osd ? Den fre 31 aug. 2018 kl 17:20 skrev Robert Stanford : > > I installed a new Luminous cluster. Everything is fine so far. Then I > tried to start RGW and got this error: > >
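
A quick way to check that theory, assuming a Luminous cluster (names are placeholders):

  ceph --show-config | grep mon_max_pg_per_osd   # 200 by default on Luminous, if memory serves
  ceph osd df                                    # the PGS column shows what each OSD already carries

If the PGs the new RGW pools would add push the per-OSD count past that limit, their creation gets refused until the default pg_num is lowered or the limit is raised.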

Re: [ceph-users] Balancer=on with crush-compat mode

2019-01-06 Thread Janne Johansson
Den sön 6 jan. 2019 kl 13:22 skrev Marc Roos : > > >If I understand the balancer correct, it balances PGs not data. > >This worked perfectly fine in your case. > > > >I prefer a PG count of ~100 per OSD, you are at 30. Maybe it would > >help to bump the PGs. > > > I am not sure if I should

Re: [ceph-users] quick questions about a 5-node homelab setup

2019-01-21 Thread Janne Johansson
Den fre 18 jan. 2019 kl 12:42 skrev Robert Sander : > > Assuming BlueStore is too fat for my crappy nodes, do I need to go to > > FileStore? If yes, then with xfs as the file system? Journal on the SSD as > > a directory, then? > > Journal for FileStore is also a block device. It can be a file

Re: [ceph-users] quick questions about a 5-node homelab setup

2019-01-22 Thread Janne Johansson
Den tis 22 jan. 2019 kl 00:50 skrev Brian Topping : > > I've scrounged up 5 old Atom Supermicro nodes and would like to run them > > 365/7 for limited production as RBD with Bluestore (ideally latest 13.2.4 > > Mimic), triple copy redundancy. Underlying OS is a Debian 9 64 bit, minimal > >

Re: [ceph-users] Bluestore nvme DB/WAL size

2018-12-21 Thread Janne Johansson
Den tors 20 dec. 2018 kl 22:45 skrev Vladimir Brik : > Hello > I am considering using logical volumes of an NVMe drive as DB or WAL > devices for OSDs on spinning disks. > The documentation recommends against DB devices smaller than 4% of slow > disk size. Our servers have 16x 10TB HDDs and a
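
A quick back-of-the-envelope for that guideline (the NVMe capacity is an assumption, not from the thread): 4% of a 10 TB HDD is ~400 GB of DB space per OSD, so 16 such OSDs would want ~6.4 TB of NVMe per host; a single 1.6 TB NVMe shared by all 16 only gives ~100 GB per OSD, well under the recommendation, which is presumably the trade-off being weighed here.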

Re: [ceph-users] list admin issues

2018-12-26 Thread Janne Johansson
Den lör 22 dec. 2018 kl 19:18 skrev Brian : : > Sorry to drag this one up again. Not as sorry to drag it up as you > Just got the unsubscribed due to excessive bounces thing. And me. > 'Your membership in the mailing list ceph-users has been disabled due > to excessive bounces The last bounce

Re: [ceph-users] yet another deep-scrub performance topic

2018-12-11 Thread Janne Johansson
Den tis 11 dec. 2018 kl 12:26 skrev Caspar Smit : > > Furthermore, presuming you are running Jewel or Luminous you can change some > settings in ceph.conf to mitigate the deep-scrub impact: > > osd scrub max interval = 4838400 > osd scrub min interval = 2419200 > osd scrub interval randomize
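
For reference, a hedged example of what such a mitigation block can look like in ceph.conf - the intervals are the ones quoted above, the rest are common companions and not a recommendation:

  [osd]
  osd scrub min interval = 2419200            # 28 days
  osd scrub max interval = 4838400            # 56 days
  osd scrub interval randomize ratio = 1.0    # spread scrub start times out
  osd scrub sleep = 0.1                       # pause between scrub chunks
  osd max scrubs = 1                          # concurrent scrubs per OSD
  osd scrub begin hour = 22
  osd scrub end hour = 6

Most of these can also be injected at runtime with "ceph tell osd.* injectargs ..." to try the effect before putting them in the config file.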

Re: [ceph-users] yet another deep-scrub performance topic

2018-12-11 Thread Janne Johansson
Den tis 11 dec. 2018 kl 12:54 skrev Caspar Smit : > > On a Luminous 12.2.7 cluster these are the defaults: > ceph daemon osd.x config show thank you very much. -- May the most significant bit of your life be positive. ___ ceph-users mailing list

Re: [ceph-users] Scheduling deep-scrub operations

2018-12-14 Thread Janne Johansson
Den fre 14 dec. 2018 kl 12:25 skrev Caspar Smit : > We have operating hours from 4 pm until 7 am each weekday and 24 hour days in > the weekend. > I was wondering if it's possible to allow deep-scrubbing from 7 am until 15 > pm only on weekdays and prevent any deep-scrubbing in the weekend. >
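
A possible sketch, with the caveat that the begin/end hour settings alone cannot express "weekdays only" on Luminous, so the weekend is handled by toggling a cluster flag from cron (the hours match the question, the flag choice is an assumption):

  # ceph.conf, [osd] section: only start scrubs between 07:00 and 15:00
  osd scrub begin hour = 7
  osd scrub end hour = 15

  # cron on an admin node: block deep-scrubs from Friday 15:00 to Monday 07:00
  0 15 * * 5   ceph osd set nodeep-scrub
  0 7  * * 1   ceph osd unset nodeep-scrub

Setting nodeep-scrub raises a HEALTH_WARN while it is active, which is expected and harmless in this scheme.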

Re: [ceph-users] all vms can not start up when boot all the ceph hosts.

2018-12-04 Thread Janne Johansson
Den tis 4 dec. 2018 kl 09:49 skrev linghucongsong : > Hi all! > > I have a ceph test environment using ceph with openstack. There are some vms > running on the openstack. It is just a test environment. > my ceph version is 12.2.4. Last day I rebooted all the ceph hosts; before > this I did not shut down the vms

Re: [ceph-users] all vms can not start up when boot all the ceph hosts.

2018-12-04 Thread Janne Johansson
Den tis 4 dec. 2018 kl 10:37 skrev linghucongsong : > Thank you for reply! > But it is just in case suddenly power off for all the hosts! > So the best way for this it is to have the snapshot on the import vms or > have to mirror the > images to other ceph cluster? Best way is probably to do

Re: [ceph-users] High average apply latency Firefly

2018-12-04 Thread Janne Johansson
Den tis 4 dec. 2018 kl 11:20 skrev Klimenko, Roman : > > Hi everyone! > > On the old prod cluster > - baremetal, 5 nodes (24 cpu, 256G RAM) > - ceph 0.80.9 filestore > - 105 osd, size 114TB (each osd 1.1T, SAS Seagate ST1200MM0018) , raw used 60% > - 15 journals (eash journal 0.4TB, Toshiba

Re: [ceph-users] New OSD with weight 0, rebalance still happen...

2018-11-23 Thread Janne Johansson
Den fre 23 nov. 2018 kl 11:08 skrev Marco Gaiarin : > Reading ceph docs lead to me that 'ceph osd reweight' and 'ceph osd crush > reweight' was roughly the same, the first is effectively 'temporary' > and expressed in percentage (0-1), while the second is 'permanent' and > expressed, normally, as
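
A short illustration of the two commands (OSD id and values are placeholders):

  # temporary override in the 0-1 range; it is reset if the OSD is marked out and later in again
  ceph osd reweight 12 0.85

  # permanent CRUSH weight, conventionally the disk size in TiB
  ceph osd crush reweight osd.12 3.64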

Re: [ceph-users] Disable intra-host replication?

2018-11-23 Thread Janne Johansson
Den fre 23 nov. 2018 kl 15:19 skrev Marco Gaiarin : > > > Previous (partial) node failures and my current experiments on adding a > node lead me to the fact that, when rebalancing are needed, ceph > rebalance also on intra-node: eg, if an OSD of a node die, data are > rebalanced on all OSD, even

Re: [ceph-users] Disable intra-host replication?

2018-11-26 Thread Janne Johansson
Den mån 26 nov. 2018 kl 12:11 skrev Marco Gaiarin : > Mandi! Janne Johansson > In chel di` si favelave... > > > The default crush rules with replication=3 would only place PGs on > > separate hosts, > > so in that case it would go into degraded mode if a node g

Re: [ceph-users] Degraded objects afte: ceph osd in $osd

2018-11-26 Thread Janne Johansson
Den sön 25 nov. 2018 kl 22:10 skrev Stefan Kooman : > > Hi List, > > Another interesting and unexpected thing we observed during cluster > expansion is the following. After we added extra disks to the cluster, > while "norebalance" flag was set, we put the new OSDs "IN". As soon as > we did that

Re: [ceph-users] Sizing for bluestore db and wal

2018-11-26 Thread Janne Johansson
Den mån 26 nov. 2018 kl 10:10 skrev Felix Stolte : > > Hi folks, > > i upgraded our ceph cluster from jewel to luminous and want to migrate > from filestore to bluestore. Currently we use one SSD as journal for > thre 8TB Sata Drives with a journal partition size of 40GB. If my > understanding of

Re: [ceph-users] Degraded objects afte: ceph osd in $osd

2018-11-26 Thread Janne Johansson
Den mån 26 nov. 2018 kl 09:39 skrev Stefan Kooman : > > It is a slight mistake in reporting it in the same way as an error, > > even if it looks to the > > cluster just as if it was in error and needs fixing. This gives the > > new ceph admins a > > sense of urgency or danger whereas it should be

Re: [ceph-users] Problem with CephFS - No space left on device

2019-01-08 Thread Janne Johansson
Den tis 8 jan. 2019 kl 16:05 skrev Yoann Moulin : > The best thing you can do here is added two disks to pf-us1-dfs3. After that, get a fourth host with 4 OSDs on it and add to the cluster. If you have 3 replicas (which is good!), then any downtime will mean the cluster is kept in a degraded

Re: [ceph-users] Ceph cluster stability

2019-02-22 Thread Janne Johansson
Den fre 22 feb. 2019 kl 12:35 skrev M Ranga Swami Reddy < swamire...@gmail.com>: > No seen the CPU limitation because we are using the 4 cores per osd daemon. > But still using "ms_crc_data = true and ms_crc_header = true". Will > disable these and try the performance. > I am a bit sceptical to

Re: [ceph-users] ceph migration

2019-02-25 Thread Janne Johansson
Den mån 25 feb. 2019 kl 13:40 skrev Eugen Block : > I just moved a (virtual lab) cluster to a different network, it worked > like a charm. > In an offline method - you need to: > > - set osd noout, ensure there are no OSDs up > - Change the MONs IP, See the bottom of [1] "CHANGING A MONITOR’S IP >
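
For reference, a rough sketch of the "messy" offline procedure from the bottom of that document (mon id and addresses are placeholders; keep backups of the mon stores first):

  # with all mons stopped, on one of them:
  ceph-mon -i mon01 --extract-monmap /tmp/monmap
  monmaptool --print /tmp/monmap
  monmaptool --rm mon01 /tmp/monmap
  monmaptool --add mon01 192.168.50.11:6789 /tmp/monmap
  # repeat the rm/add pair for every mon, then inject the edited map into each mon's store:
  ceph-mon -i mon01 --inject-monmap /tmp/monmap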

Re: [ceph-users] ceph migration

2019-02-25 Thread Janne Johansson
Den mån 25 feb. 2019 kl 12:33 skrev Zhenshi Zhou : > I deployed a new cluster(mimic). Now I have to move all servers > in this cluster to another place, with new IP. > I'm not sure if the cluster will run well or not after I modify config > files, include /etc/hosts and /etc/ceph/ceph.conf. No,

Re: [ceph-users] cluster is not stable

2019-03-15 Thread Janne Johansson
Den tors 14 mars 2019 kl 17:00 skrev Zhenshi Zhou : > I think I've found the root cause which make the monmap contains no > feature. As I moved the servers from one place to another, I modified > the monmap once. If this was the empty cluster that you refused to redo from scratch, then I feel it

Re: [ceph-users] How to trim default.rgw.log pool?

2019-02-14 Thread Janne Johansson
While we're at it, a way to know what in the default.rgw...non-ec pool one can remove. We have tons of old zero-size objects there which are probably useless and just take up (meta)space. Den tors 14 feb. 2019 kl 09:26 skrev Charles Alva : > Hi All, > > Is there a way to trim Ceph

Re: [ceph-users] Multicast communication compuverde

2019-02-06 Thread Janne Johansson
Multicast traffic from storage has a point in things like the old Windows provisioning software Ghost, where you could netboot a room full of computers, have them listen to a mcast stream of the same data/image and all apply it at the same time, and perhaps re-sync potentially missing stuff at the

Re: [ceph-users] Multicast communication compuverde

2019-02-06 Thread Janne Johansson
For EC-coded stuff, at 10+4 with 13 others needing data apart from the primary, they are specifically NOT getting the same data; they are getting either 1/10th of the pieces or one of the 4 different checksums, so it would be nasty to send full data to all OSDs expecting a 14th of the data. Den

Re: [ceph-users] Best practice for increasing number of pg and pgp

2019-01-30 Thread Janne Johansson
Den ons 30 jan. 2019 kl 05:24 skrev Linh Vu : > > We use https://github.com/cernceph/ceph-scripts ceph-gentle-split script to > slowly increase by 16 pgs at a time until we hit the target. > > Somebody recommends that this adjustment should be done in multiple stages, > e.g. increase 1024 pg

Re: [ceph-users] Modify ceph.mon network required

2019-01-25 Thread Janne Johansson
Den fre 25 jan. 2019 kl 09:52 skrev cmonty14 <74cmo...@gmail.com>: > > Hi, > I have identified a major issue with my cluster setup consisting of 3 nodes: > all monitors are connected to cluster network. > > Question: > How can I modify the network configuration of mon? > > It's not working to

Re: [ceph-users] ceph block - volume with RAID#0

2019-01-31 Thread Janne Johansson
Den fre 1 feb. 2019 kl 06:30 skrev M Ranga Swami Reddy : > Here user requirement is - less write and more reads...so not much > worried on performance . > So why go for raid0 at all? It is the least secure way to store data. -- May the most significant bit of your life be positive.

Re: [ceph-users] ceph block - volume with RAID#0

2019-01-30 Thread Janne Johansson
Den ons 30 jan. 2019 kl 14:47 skrev M Ranga Swami Reddy < swamire...@gmail.com>: > Hello - Can I use the ceph block volume with RAID#0? Are there any > issues with this? > Hard to tell if you mean raid0 over a block volume or a block volume over raid0. Still, it is seldom a good idea to stack

Re: [ceph-users] showing active config settings

2019-04-10 Thread Janne Johansson
Den ons 10 apr. 2019 kl 13:31 skrev Eugen Block : > > While --show-config still shows > > host1:~ # ceph --show-config | grep osd_recovery_max_active > osd_recovery_max_active = 3 > > > It seems as if --show-config is not really up-to-date anymore? > Although I can execute it, the option doesn't

Re: [ceph-users] showing active config settings

2019-04-10 Thread Janne Johansson
Den ons 10 apr. 2019 kl 13:37 skrev Eugen Block : > > If you don't specify which daemon to talk to, it tells you what the > > defaults would be for a random daemon started just now using the same > > config as you have in /etc/ceph/ceph.conf. > > I tried that, too, but the result is not correct:
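
A small sketch of asking a specific daemon for its running value instead of relying on --show-config (osd.0 is a placeholder; run it on the host where that daemon lives):

  ceph daemon osd.0 config get osd_recovery_max_active
  ceph daemon osd.0 config show | grep osd_recovery_max_active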

Re: [ceph-users] rgw windows/mac clients shitty, develop a new one?

2019-04-18 Thread Janne Johansson
https://www.reddit.com/r/netsec/comments/8t4xrl/filezilla_malware/ not saying it definitely is, or isn't malware-ridden, but it sure was shady at that time. I would suggest not pointing people to it. Den tors 18 apr. 2019 kl 16:41 skrev Brian : : > Hi Marc > > Filezilla has decent S3 support

Re: [ceph-users] Erasure Coding - FPGA / Hardware Acceleration

2019-06-14 Thread Janne Johansson
Den fre 14 juni 2019 kl 15:47 skrev Sean Redmond : > Hi James, > Thanks for your comments. > I think the CPU burn is more of a concern to soft iron here as they are > using low power ARM64 CPU's to keep the power draw low compared to using > Intel CPU's where like you say the problem maybe less

Re: [ceph-users] Erasure Coding - FPGA / Hardware Acceleration

2019-06-14 Thread Janne Johansson
Den fre 14 juni 2019 kl 13:58 skrev Sean Redmond : > Hi Ceph-Uers, > I noticed that Soft Iron now have hardware acceleration for Erasure > Coding[1], this is interesting as the CPU overhead can be a problem in > addition to the extra disk I/O required for EC pools. > Does anyone know if any other

Re: [ceph-users] OSD caching on EC-pools (heavy cross OSD communication on cached reads)

2019-06-10 Thread Janne Johansson
Den sön 9 juni 2019 kl 18:29 skrev : > make sense - makes the cases for ec pools smaller though. > > Sunday, 9 June 2019, 17.48 +0200 from paul.emmer...@croit.io < > paul.emmer...@croit.io>: > > Caching is handled in BlueStore itself, erasure coding happens on a higher > layer. > > > In your

Re: [ceph-users] maximum rebuild speed for erasure coding pool

2019-05-09 Thread Janne Johansson
Den tors 9 maj 2019 kl 15:46 skrev Feng Zhang : > > For erasure pool, suppose I have 10 nodes, each has 10 6TB drives, so > in total 100 drives. I make a 4+2 erasure pool, failure domain is > host/node. Then if one drive failed, (assume the 6TB is fully used), > what the maximum speed the

Re: [ceph-users] maximum rebuild speed for erasure coding pool

2019-05-09 Thread Janne Johansson
Den tors 9 maj 2019 kl 16:17 skrev Marc Roos : > > > Fancy fast WAL/DB/Journals probably help a lot here, since they do > affect the "iops" > > you experience from your spin-drive OSDs. > > What difference can be expected if you have a 100 iops hdd and you start > using > wal/db/journals on

Re: [ceph-users] Is there a Ceph-mon data size partition max limit?

2019-05-09 Thread Janne Johansson
Den tors 9 maj 2019 kl 11:52 skrev Poncea, Ovidiu < ovidiu.pon...@windriver.com>: > Hi folks, > > What is the commanded size for the ceph-mon data partitions? Is there a > maximum limit to it? If not is there a way to limit it's growth (or celan > it up)? To my knowledge ceph-mon doesn't use a

Re: [ceph-users] RGW metadata pool migration

2019-05-23 Thread Janne Johansson
Den ons 22 maj 2019 kl 17:43 skrev Nikhil Mitra (nikmitra) < nikmi...@cisco.com>: > Hi All, > > What are the metadata pools in an RGW deployment that need to sit on the > fastest medium to better the client experience from an access standpoint ? > > Also is there an easy way to migrate these

Re: [ceph-users] Is there a Ceph-mon data size partition max limit?

2019-05-10 Thread Janne Johansson
Den fre 10 maj 2019 kl 14:48 skrev Poncea, Ovidiu < ovidiu.pon...@windriver.com>: > Oh... joy :) Do you know if, after replay, ceph-mon data will decrease or > do we need to do some manual cleanup? Hopefully we don't keep it in there > forever. > You get the storage back as soon as the situation

Re: [ceph-users] maximum rebuild speed for erasure coding pool

2019-05-10 Thread Janne Johansson
Den tors 9 maj 2019 kl 17:46 skrev Feng Zhang : > Thanks, guys. > > I forgot the IOPS. So since I have 100disks, the total > IOPS=100X100=10K. For the 4+2 erasure, one disk fail, then it needs to > read 5 and write 1 objects.Then the whole 100 disks can do 10K/6 ~ 2K > rebuilding actions per

Re: [ceph-users] Are there any statistics available on how most production ceph clusters are being used?

2019-04-19 Thread Janne Johansson
Den fre 19 apr. 2019 kl 12:10 skrev Marc Roos : > > [...]since nobody here is interested in a better rgw client for end > users. I am wondering if the rgw is even being used like this, and what > most production environments look like. > > "Like this" ? People use tons of scriptable and built-in

Re: [ceph-users] rbd ssd pool for (windows) vms

2019-05-06 Thread Janne Johansson
Den mån 6 maj 2019 kl 10:03 skrev Marc Roos : > > Yes but those 'changes' can be relayed via the kernel rbd driver not? > Besides I don't think you can move a rbd block device being used to a > different pool anyway. > > No, but you can move the whole pool, which takes all RBD images with it. >

Re: [ceph-users] Restricting access to RadosGW/S3 buckets

2019-05-03 Thread Janne Johansson
Den tors 2 maj 2019 kl 23:41 skrev Vladimir Brik < vladimir.b...@icecube.wisc.edu>: > Hello > I am trying to figure out a way to restrict access to S3 buckets. Is it > possible to create a RadosGW user that can only access specific bucket(s)? > You can have a user with very small bucket/bytes
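
A hedged sketch of that direction with radosgw-admin - the uid and limits are placeholders, and granting access to the one specific bucket (via bucket policy or ACL) is not shown here:

  radosgw-admin user create --uid=restricted --display-name="Restricted user"
  radosgw-admin quota set --quota-scope=user --uid=restricted --max-objects=10 --max-size=1024
  radosgw-admin quota enable --quota-scope=user --uid=restricted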

Re: [ceph-users] rbd ssd pool for (windows) vms

2019-05-03 Thread Janne Johansson
Den ons 1 maj 2019 kl 23:00 skrev Marc Roos : > Do you need to tell the vm's that they are on a ssd rbd pool? Or does > ceph and the libvirt drivers do this automatically for you? > When testing a nutanix acropolis virtual install, I had to 'cheat' it by > adding this > > To make the installer

Re: [ceph-users] How does monitor know OSD is dead?

2019-07-03 Thread Janne Johansson
Den ons 3 juli 2019 kl 05:41 skrev Bryan Henderson : > I may need to modify the above, though, now that I know how Ceph works, > because I've seen storage server products that use Ceph inside. However, > I'll > bet the people who buy those are not aware that it's designed never to go > down >

Re: [ceph-users] slow requests due to scrubbing of very small pg

2019-07-03 Thread Janne Johansson
Den ons 3 juli 2019 kl 09:01 skrev Luk : > Hello, > > I have strange problem with scrubbing. > > When scrubbing starts on PG which belong to default.rgw.buckets.index > pool, I can see that this OSD is very busy (see attachment), and starts > showing many > slow request, after the

Re: [ceph-users] OSD's won't start - thread abort

2019-07-03 Thread Janne Johansson
Den ons 3 juli 2019 kl 20:51 skrev Austin Workman : > > But a very strange number shows up in the active sections of the pg's > that's the same number roughly as 2147483648. This seems very odd, > and maybe the value got lodged somewhere it doesn't belong which is causing > an issue. > >
