Re: [ceph-users] RGW how to delete orphans

2019-08-13 Thread Andrei Mikhailovsky
Hello. I was hoping to follow up on this email and find out whether Florian managed to get to the bottom of this. I have a case where I believe my RGW bucket is using too much space. For me, the ceph df command shows over 16TB of usage, whereas the bucket stats show a total of about 6TB. So, it seems that
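
For reference, a minimal sketch of the orphan scan that Jewel/Luminous-era radosgw-admin provides (the pool name and job id below are placeholders, not taken from the thread):

  radosgw-admin orphans find --pool=.rgw.buckets --job-id=orphan-scan-1
  radosgw-admin orphans list-jobs
  # review the objects reported as leaked before removing anything, then
  # clean up the scan's own bookkeeping:
  radosgw-admin orphans finish --job-id=orphan-scan-1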

Re: [ceph-users] troubleshooting space usage

2019-07-04 Thread Andrei Mikhailovsky
Thanks for trying to help, Igor. > From: "Igor Fedotov" > To: "Andrei Mikhailovsky" > Cc: "ceph-users" > Sent: Thursday, 4 July, 2019 12:52:16 > Subject: Re: [ceph-users] troubleshooting space usage > Yep, this looks fine.. > hmm... sorry,

Re: [ceph-users] troubleshooting space usage

2019-07-03 Thread Andrei Mikhailovsky
, 2019 13:49:02 > Subject: Re: [ceph-users] troubleshooting space usage > Looks fine - comparing bluestore_allocated vs. bluestore_stored shows a little > difference. So that's not the allocation overhead. > What about comparing object counts reported by ceph and radosgw tools? >
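
A minimal sketch of the comparison being discussed, assuming osd.0 as an example (the OSD id and pool names are placeholders):

  # allocation vs. stored bytes for one OSD, from its admin socket
  ceph daemon osd.0 perf dump | grep -E '"bluestore_(allocated|stored)"'
  # object counts and usage as rados sees them vs. what the gateway accounts for
  ceph df detail
  rados df
  radosgw-admin bucket stats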

Re: [ceph-users] troubleshooting space usage

2019-07-03 Thread Andrei Mikhailovsky
tion overhead. Looks > like > some orphaned objects in the pool. Could you please compare and share the > amounts of objects in the pool reported by "ceph (or rados) df detail" and > radosgw tools? > Thanks, > Igor > On 7/3/2019 12:56 PM, Andrei Mikhailovsky wrote:

Re: [ceph-users] troubleshooting space usage

2019-07-03 Thread Andrei Mikhailovsky
re performance counter dumps (ceph > daemon > osd.N perf dump) and " > " reports from a couple of your OSDs. > Thanks, > Igor > On 7/2/2019 11:43 AM, Andrei Mikhailovsky wrote: >> Bump! >>> From: "Andrei Mikhailovsky" [ mailto:and...@arh

Re: [ceph-users] troubleshooting space usage

2019-07-02 Thread Andrei Mikhailovsky
Bump! > From: "Andrei Mikhailovsky" > To: "ceph-users" > Sent: Friday, 28 June, 2019 14:54:53 > Subject: [ceph-users] troubleshooting space usage > Hi > Could someone please explain / show how to troubleshoot the space usage in > Ceph > and how to

[ceph-users] troubleshooting space usage

2019-06-28 Thread Andrei Mikhailovsky
Hi. Could someone please explain / show how to troubleshoot the space usage in Ceph and how to reclaim the unused space? I have a small cluster with 40 osds, replica of 2, mainly used as a backend for CloudStack as well as the S3 gateway. The used space doesn't make any sense to me,

Re: [ceph-users] performance in a small cluster

2019-05-29 Thread Andrei Mikhailovsky
It would be interesting to learn which types of improvements and BIOS changes helped you. Thanks > From: "Martin Verges" > To: "Robert Sander" > Cc: "ceph-users" > Sent: Wednesday, 29 May, 2019 10:19:09 > Subject: Re: [ceph-users] performance in a small cluster > Hello Robert, >> We

Re: [ceph-users] Luminous with osd flapping, slow requests when deep scrubbing

2018-10-16 Thread Andrei Mikhailovsky
Hi Christian, - Original Message - > From: "Christian Balzer" > To: "ceph-users" > Cc: "Andrei Mikhailovsky" > Sent: Tuesday, 16 October, 2018 08:51:36 > Subject: Re: [ceph-users] Luminous with osd flapping, slow requests when deep > sc

[ceph-users] Luminous with osd flapping, slow requests when deep scrubbing

2018-10-15 Thread Andrei Mikhailovsky
Hello, I am currently running Luminous 12.2.8 on Ubuntu with the 4.15.0-36-generic kernel from the official Ubuntu repo. The cluster has 4 mon + osd servers. Each osd server has a total of 9 spinning osds and 1 ssd for the hdd and ssd pools. The hdds are backed by the S3710 ssds for journaling
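
The knobs usually suggested for softening deep-scrub impact on spinning osds, as a hedged sketch (values are illustrative, not from this thread):

  # ceph.conf, [osd] section
  osd max scrubs = 1
  osd scrub during recovery = false
  osd scrub sleep = 0.1
  # or injected at runtime across all OSDs
  ceph tell osd.* injectargs '--osd_scrub_sleep 0.1 --osd_max_scrubs 1'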

Re: [ceph-users] bluestore osd journal move

2018-09-24 Thread Andrei Mikhailovsky
r us, there's no guarantee that they will work for you. > Read it very carefully and recheck every step before executing it. > > Regards, > Eugen > > [1] > http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-February/024913.html > [2] > http://heiterbiswolkig.blogs.nd

[ceph-users] bluestore osd journal move

2018-09-24 Thread Andrei Mikhailovsky
Hello everyone, I am wondering if it is possible to move the ssd journal for the bluestore osd? I would like to move it from one ssd drive to another. Thanks
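
For a BlueStore OSD the "journal" is the block.db/block.wal partition; newer Luminous point releases ship ceph-bluestore-tool bluefs-bdev-migrate for moving it. A hedged sketch, with the OSD id and target device as placeholders:

  systemctl stop ceph-osd@12
  ceph-bluestore-tool bluefs-bdev-migrate --path /var/lib/ceph/osd/ceph-12 \
      --devs-source /var/lib/ceph/osd/ceph-12/block.db --dev-target /dev/nvme0n1p4
  # confirm the block.db symlink now points at the new partition, then
  systemctl start ceph-osd@12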

[ceph-users] how to swap osds between servers

2018-09-03 Thread Andrei Mikhailovsky
Hello everyone, I am in the process of adding an additional osd server to my small ceph cluster as well as migrating from filestore to bluestore. Here is my setup at the moment: Ceph - 12.2.5 , running on Ubuntu 16.04 with latest updates 3 x osd servers with 10x3TB SAS drives, 2 x Intel

[ceph-users] checking rbd volumes modification times

2018-07-16 Thread Andrei Mikhailovsky
Dear cephers, Could someone tell me how to check the modification times of rbd volumes in a ceph pool? I am currently in the process of trimming our ceph pool and would like to start with volumes which have not been modified for a long time. How do I get that information? Cheers Andrei
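
rbd itself does not report a modification time in this era, but the backing RADOS objects carry an mtime that rados stat can show. A hedged sketch with placeholder pool, image and prefix names:

  rbd info mypool/myimage | grep block_name_prefix    # e.g. rbd_data.1234abcd
  rados -p mypool ls | grep '^rbd_data.1234abcd' | head -20 | \
      while read obj; do rados -p mypool stat "$obj"; done   # prints each object's mtime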

Re: [ceph-users] Luminous Bluestore performance, bcache

2018-06-29 Thread Andrei Mikhailovsky
Thanks Richard, That sounds impressive, especially the around 30% hit ratio. That would be ideal for me, but we were only getting single digit results during my trials. I think around 5% was the figure if I remember correctly. However, most of our vms were created a bit chaotically (not using

Re: [ceph-users] fixing unrepairable inconsistent PG

2018-06-28 Thread Andrei Mikhailovsky
. All other pools look okay so far. I am wondering what could have got horribly wrong with the above pool? Cheers Andrei - Original Message - > From: "Brad Hubbard" > To: "Andrei Mikhailovsky" > Cc: "ceph-users" > Sent: Thursday, 28 June

Re: [ceph-users] Luminous Bluestore performance, bcache

2018-06-28 Thread Andrei Mikhailovsky
Hi Richard, It is an interesting test for me too, as I am planning to migrate to Bluestore storage and was considering repurposing the ssd disks that we currently use for journals. I was wondering whether you are using Filestore or Bluestore for the osds? Also, when you perform your

Re: [ceph-users] fixing unrepairable inconsistent PG

2018-06-27 Thread Andrei Mikhailovsky
; : "", "snapid" : -2, "max" : 0 }, "truncate_size" : 0, "version" : "120985'632942", "expected_object_size" : 0, "omap_digest" : "0x",

Re: [ceph-users] fixing unrepairable inconsistent PG

2018-06-27 Thread Andrei Mikhailovsky
"num_shallow_scrub_errors": 0, "num_deep_scrub_errors": 0, "num_objects_recovered": 207, "num_bytes_recovered": 0, "num_keys_recovered": 9482826,

Re: [ceph-users] fixing unrepairable inconsistent PG

2018-06-26 Thread Andrei Mikhailovsky
r you > can query any other pg that has osd.21 as its primary? > > On Mon, Jun 25, 2018 at 8:04 PM, Andrei Mikhailovsky > wrote: >> Hi Brad, >> >> here is the output: >> >> -- >> >> root@arh-ibstorage1-ib:/home/andrei# ceph

Re: [ceph-users] fixing unrepairable inconsistent PG

2018-06-25 Thread Andrei Mikhailovsky
2.168.168.201:0/3046734987 wait complete. 2018-06-25 10:59:12.112764 7fe244b28700 1 -- 192.168.168.201:0/3046734987 >> 192.168.168.201:0/3046734987 conn(0x7fe240167220 :-1 s=STATE_NONE pgs=0 cs=0 l=0).mark_down 2018-06-25 10:59:12.112770 7fe244b28700 2 -- 192.168.168.201:0/3046734987 >>

Re: [ceph-users] fixing unrepairable inconsistent PG

2018-06-20 Thread Andrei Mikhailovsky
, 2018 00:02:07 > Subject: Re: [ceph-users] fixing unrepairable inconsistent PG > Can you post the output of a pg query? > > On Tue, Jun 19, 2018 at 11:44 PM, Andrei Mikhailovsky > wrote: >> A quick update on my issue. I have noticed that while I was trying to move >> the pr

Re: [ceph-users] fixing unrepairable inconsistent PG

2018-06-19 Thread Andrei Mikhailovsky
A quick update on my issue. I have noticed that while I was trying to move the problem object between osds, the file attributes got lost on one of the osds, which I guess is why the error messages showed the missing-attribute bit. I then copied the attributes metadata to the problematic object and

[ceph-users] fixing unrepairable inconsistent PG

2018-06-19 Thread Andrei Mikhailovsky
Hello everyone I am having trouble repairing one inconsistent and stubborn PG. I get the following error in ceph.log: 2018-06-19 11:00:00.000225 mon.arh-ibstorage1-ib mon.0 192.168.168.201:6789/0 675 : cluster [ERR] overall HEALTH_ERR noout flag(s) set; 4 scrub errors; Possible data
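
The usual first steps for an inconsistent PG on Jewel/Luminous, as a hedged sketch (the pg id below is a placeholder):

  ceph health detail                                      # identifies the inconsistent PG
  rados list-inconsistent-obj 18.2 --format=json-pretty   # shows which shard or attribute mismatches
  ceph pg repair 18.2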

Re: [ceph-users] decreasing number of PGs

2017-10-03 Thread Andrei Mikhailovsky
threshold for the warning, but it is still a problem you should > address > in your cluster. > On Mon, Oct 2, 2017 at 4:02 PM Jack < [ mailto:c...@jack.fr.eu.org | > c...@jack.fr.eu.org ] > wrote: >> You cannot; >> On 02/10/2017 21:43, Andrei Mikhailovsky wrote: >

[ceph-users] decreasing number of PGs

2017-10-02 Thread Andrei Mikhailovsky
Hello everyone, what is the safest way to decrease the number of PGs in the cluster? Currently, I have too many per osd. Thanks

[ceph-users] Migration to ceph BlueStore

2017-08-02 Thread Andrei Mikhailovsky
Hello everyone, with the release of Kraken, I was thinking of migrating our existing ceph cluster to BlueStore and using the existing journal ssd disks in a cache tier. The cluster that I have is pretty small, 3 servers with 10 osds each + 2 Intel 3710 SSDs for journals. Each server is also a

Re: [ceph-users] Ceph on XenServer

2017-02-24 Thread Andrei Mikhailovsky
Hi Max, I've played around with ceph on xenserver about 2-3 years ago. I made it work, but it was all hackish and a lot of manual work. It didn't play well with the cloud orchestrator and I gave up hoping that either Citrix or Ceph team would make it work. Currently, I would not recommend

[ceph-users] temp workaround for the unstable Jewel cluster

2017-02-16 Thread Andrei Mikhailovsky
Hello fellow cephers, I have been struggling with the stability of my Jewel cluster and from what I can see I am not the only person. My setup is: 3 osd+mon servers, 30 osds, half a dozen client host servers for rbd access, 40gbit/s infiniband link, all ceph servers are running on Ubuntu

Re: [ceph-users] ceph-mon memory issue jewel 10.2.5 kernel 4.4

2017-02-09 Thread Andrei Mikhailovsky
Hi Jim, I've got a few questions for you as it looks like we have a similar cluster for our ceph infrastructure. A quick overview of what we have. We are also running a small cluster of 3 storage nodes (30 osds in total) and 5 clients over 40gig/s infiniband link (ipoib). Ever since

Re: [ceph-users] ceph-mon memory issue jewel 10.2.5 kernel 4.4

2017-02-08 Thread Andrei Mikhailovsky
+1 Ever since upgrading to 10.2.x I have been seeing a lot of issues with our ceph cluster. I have been seeing osds going down and osd servers running out of memory and killing all ceph-osd processes. Again, 10.2.5 on a 4.4.x kernel. It seems that with every release there are more and more problems with

Re: [ceph-users] renaming ceph server names

2016-12-02 Thread Andrei Mikhailovsky
*BUMP* > From: "andrei" > To: "ceph-users" > Sent: Tuesday, 29 November, 2016 12:46:05 > Subject: [ceph-users] renaming ceph server names > Hello. > As a part of the infrastructure change we are planning to rename the servers > running ceph-osd,

[ceph-users] renaming ceph server names

2016-11-29 Thread Andrei Mikhailovsky
Hello. As a part of the infrastructure change we are planning to rename the servers running the ceph-osd, ceph-mon and radosgw services. The IP addresses will stay the same; it's only the server names which will need to change. Could someone outline the steps required to perform these changes?

Re: [ceph-users] Big problems encoutered during upgrade from hammer 0.94.5 to jewel 10.2.3

2016-11-13 Thread Andrei Mikhailovsky
Hi Vincent, when I did the upgrade, I upgraded all clients and servers at the same time. No issues during the upgrade at all. No downtime. However, when I set the tunables to optimal I lost all IO to the clients, which happened gradually; over a few hours the iowait went from low

Re: [ceph-users] radosgw - http status 400 while creating a bucket

2016-11-12 Thread Andrei Mikhailovsky
Hi Orit, your workaround instructions have helped to solve the bucket creation problems that I had. Again, many thanks for your help. Andrei - Original Message - > From: "Orit Wasserman" <owass...@redhat.com> > To: "Andrei Mikhailovsky" <and

Re: [ceph-users] radosgw - http status 400 while creating a bucket

2016-11-12 Thread Andrei Mikhailovsky
h>, "ceph-users" > <ceph-users@lists.ceph.com> > Sent: Saturday, 12 November, 2016 13:22:02 > Subject: Re: [ceph-users] radosgw - http status 400 while creating a bucket > On Fri, Nov 11, 2016 at 11:27 PM, Andrei Mikhailovsky <and...@arhont.com> > wrote: &

Re: [ceph-users] radosgw - http status 400 while creating a bucket

2016-11-11 Thread Andrei Mikhailovsky
: >rados rm .rgw.root > 5. radosgw-admin realm create --rgw-realm=myrealm > 6. radosgw-admin zonegroup set --rgw-zonegroup=default --default < > default-zg.json > 7. radosgw-admin zone set --rgw-zone=default --default < default-zone.json > 8. radosgw-admin period upd

Re: [ceph-users] radosgw - http status 400 while creating a bucket

2016-11-10 Thread Andrei Mikhailovsky
> Subject: Re: [ceph-users] radosgw - http status 400 while creating a bucket > Your RGW doesn't think it's the master, and cannot connect to the > master, thus the create fails. > > Daniel > > On 11/08/2016 06:36 PM, Andrei Mikhailovsky wrote: >> Hello

Re: [ceph-users] radosgw - http status 400 while creating a bucket

2016-11-10 Thread Andrei Mikhailovsky
- > From: "Orit Wasserman" <owass...@redhat.com> > To: "Andrei Mikhailovsky" <and...@arhont.com> > Cc: "Yoann Moulin" <yoann.mou...@epfl.ch>, "ceph-users" > <ceph-users@lists.ceph.com> > Sent: Thursday, 10 November, 201

Re: [ceph-users] radosgw - http status 400 while creating a bucket

2016-11-10 Thread Andrei Mikhailovsky
- Original Message - > From: "Orit Wasserman" <owass...@redhat.com> > To: "Andrei Mikhailovsky" <and...@arhont.com> > Cc: "Yoann Moulin" <yoann.mou...@epfl.ch>, "ceph-users" > <ceph-users@lists.ceph.com> > Sent:

Re: [ceph-users] radosgw - http status 400 while creating a bucket

2016-11-10 Thread Andrei Mikhailovsky
"default", "domain_root": ".rgw", "control_pool": ".rgw.control", "gc_pool": ".rgw.gc", "log_pool": ".log", "intent_log_pool": ".intent-log", "usage_log_pool&

Re: [ceph-users] radosgw - http status 400 while creating a bucket

2016-11-10 Thread Andrei Mikhailovsky
both services before running the script. I will run it again to make sure. Andrei - Original Message - > From: "Orit Wasserman" <owass...@redhat.com> > To: "Andrei Mikhailovsky" <and...@arhont.com> > Cc: "Yoann Moulin" <yoann.mou..

Re: [ceph-users] radosgw - http status 400 while creating a bucket

2016-11-10 Thread Andrei Mikhailovsky
> "endpoints": [], >>> "hostnames": [], >>> "hostnames_s3website": [], >>> "master_zone": "", >>> "zones": [ >>> { >>> "id": &quo

Re: [ceph-users] radosgw - http status 400 while creating a bucket

2016-11-10 Thread Andrei Mikhailovsky
***bump*** this is pretty broken and urgent. thanks - Original Message - > From: "Andrei Mikhailovsky" <and...@arhont.com> > To: "Yoann Moulin" <yoann.mou...@epfl.ch> > Cc: "ceph-users" <ceph-users@lists.ceph.com> > Sent: Wed

Re: [ceph-users] radosgw - http status 400 while creating a bucket

2016-11-09 Thread Andrei Mikhailovsky
"log_data": "false", >> "bucket_index_max_shards": 0, >> "read_only": "false" >> } >> ], >> "placement_targets": [ >> { >> "name":

Re: [ceph-users] radosgw - http status 400 while creating a bucket

2016-11-09 Thread Andrei Mikhailovsky
} ], "default_placement": "default-placement", "realm_id": "" } The strange thing as you can see, following the "radosgw-admin period update --commit" command, the master_zone and the realm_id values reset to blank. What coul

Re: [ceph-users] radosgw - http status 400 while creating a bucket

2016-11-09 Thread Andrei Mikhailovsky
- Original Message - > From: "Yehuda Sadeh-Weinraub" <yeh...@redhat.com> > To: "Andrei Mikhailovsky" <and...@arhont.com> > Cc: "ceph-users" <ceph-users@lists.ceph.com> > Sent: Wednesday, 9 November, 2016 01:13:48 > Subject: Re:

Re: [ceph-users] radosgw - http status 400 while creating a bucket

2016-11-09 Thread Andrei Mikhailovsky
"name": "default-placement", "tags": [] } ], "default_placement": "default-placement", "realm_id": "5b41b1b2-0f92-463d-b582-07552f83e66c" } As you can see, the master_zone is now set to default. Howev

Re: [ceph-users] radosgw - http status 400 while creating a bucket

2016-11-08 Thread Andrei Mikhailovsky
: ".usage", "user_keys_pool": ".users", "user_email_pool": ".users.email", "user_swift_pool": ".users.swift", "user_uid_pool": ".users.uid", "system_key": { "access_key&q

[ceph-users] radosgw - http status 400 while creating a bucket

2016-11-08 Thread Andrei Mikhailovsky
Hello I am having issues with creating buckets in radosgw. It started with an upgrade to version 10.2.x. When I create a bucket I get the following error on the client side: boto.exception.S3ResponseError: S3ResponseError: 400 Bad Request

Re: [ceph-users] unable to start radosgw after upgrade from 10.2.2 to 10.2.3

2016-10-07 Thread Andrei Mikhailovsky
t;, "user_uid_pool": ".users.uid", "system_key": { "access_key": "", "secret_key": "" }, "placement_pools": [ { "key": "default-placement",

Re: [ceph-users] unable to start radosgw after upgrade from 10.2.2 to 10.2.3

2016-10-06 Thread Andrei Mikhailovsky
mally - is it possible that this is preventing the config migration > alluded to in that thread? I'm reluctant to do anything to the > still-working 0.94.9 gateway until I can get the 10.2.3 gateways working! > > Graham > > On 10/05/2016 04:23 PM, Andrei Mikhailovsky wrote: >> Hel

[ceph-users] unable to start radosgw after upgrade from 10.2.2 to 10.2.3

2016-10-05 Thread Andrei Mikhailovsky
Hello everyone, I've just updated my ceph to version 10.2.3 from 10.2.2 and I am no longer able to start the radosgw service. When executing I get the following error: 2016-10-05 22:14:10.735883 7f1852d26a00 0 ceph version 10.2.3 (ecc23778eb545d8dd55e2e4735b53cc93f92e65b), process radosgw,

[ceph-users] Jewel - frequent ceph-osd crashes

2016-08-30 Thread Andrei Mikhailovsky
Hello I've got a small cluster of 3 osd servers and 30 osds between them running Jewel 10.2.2 on Ubuntu 16.04 LTS with stock kernel version 4.4.0-34-generic. I am experiencing rather frequent osd crashes, which tend to happen a few times a month on random osds. The latest one gave me the

Re: [ceph-users] Advice on migrating from legacy tunables to Jewel tunables.

2016-08-09 Thread Andrei Mikhailovsky
Gregory, I've been given a tip by one of the ceph user list members on tuning values, data migration and cluster IO. I have had issues twice already where my vms simply lose IO and crash while the cluster is being optimised for the new tunables. The recommendations were to upgrade the

Re: [ceph-users] change of dns names and IP addresses of cluster members

2016-07-22 Thread Andrei Mikhailovsky
> Subject: Re: [ceph-users] change of dns names and IP addresses of cluster > members > On 16-07-22 13:33, Andrei Mikhailovsky wrote: >> Hello >> We are planning to make changes to our IT infrastructure and as a result the >> fqdn and IPs of the ceph cluster will change

[ceph-users] change of dns names and IP addresses of cluster members

2016-07-22 Thread Andrei Mikhailovsky
Hello We are planning to make changes to our IT infrastructure and as a result the fqdn and IPs of the ceph cluster will change. Could someone suggest the best way of dealing with this to make sure we have a minimal ceph downtime? Many thanks Andrei

Re: [ceph-users] Error EPERM when running ceph tell command

2016-07-11 Thread Andrei Mikhailovsky
Hello again Any thoughts on this issue? Cheers Andrei > From: "Andrei Mikhailovsky" <and...@arhont.com> > To: "ceph-users" <ceph-users@lists.ceph.com> > Sent: Wednesday, 22 June, 2016 18:02:28 > Subject: [ceph-users] Error EPERM when running c

[ceph-users] Error EPERM when running ceph tell command

2016-06-22 Thread Andrei Mikhailovsky
Hi I am trying to run an osd level benchmark but get the following error: # ceph tell osd.3 bench Error EPERM: problem getting command descriptions from osd.3 I am running Jewel 10.2.2 on Ubuntu 16.04 servers. Has the syntax changed or do I have an issue? Cheers Andrei

Re: [ceph-users] cluster down during backfilling, Jewel tunables and client IO optimisations

2016-06-22 Thread Andrei Mikhailovsky
el tunables and > client IO optimisations > On 22/06/16 17:54, Andrei Mikhailovsky wrote: >> Hi Daniel, >> >> Many thanks for your useful tests and your results. >> >> How much IO wait do you have on your client vms? Has it significantly >> increased

Re: [ceph-users] cluster down during backfilling, Jewel tunables and client IO optimisations

2016-06-22 Thread Andrei Mikhailovsky
Hi Daniel, Many thanks for your useful tests and your results. How much IO wait do you have on your client vms? Has it significantly increased or not? Many thanks Andrei - Original Message - > From: "Daniel Swarbrick" > To: "ceph-users"

Re: [ceph-users] cluster down during backfilling, Jewel tunables and client IO optimisations

2016-06-20 Thread Andrei Mikhailovsky
had cascading failures of VMs. >> > However, after performing hard shutdowns on the VMs and restarting them, >> > they seemed to be OK. >> > At this stage, I have a strong suspicion that it is the introduction of >> > "require_feature_tunables5 = 1" in th

Re: [ceph-users] cluster down during backfilling, Jewel tunables and client IO optimisations

2016-06-20 Thread Andrei Mikhailovsky
tarted to increase over the course of about an hour. At the end, there was 100% iowait on all vms. If this was the case, wouldn't I see iowait jumping to 100% pretty quickly? Also, I wasn't able to start any of my vms until I rebooted one of my osd / mon servers following the successful PG rebuild.

Re: [ceph-users] cluster down during backfilling, Jewel tunables and client IO optimisations

2016-06-20 Thread Andrei Mikhailovsky

[ceph-users] cluster down during backfilling, Jewel tunables and client IO optimisations

2016-06-18 Thread Andrei Mikhailovsky
Hello ceph users, I've recently upgraded my ceph cluster from Hammer to Jewel (10.2.1 and then 10.2.2). The cluster was running okay after the upgrade. I've decided to use the optimal tunables for Jewel as the ceph status was complaining about the straw version and my cluster settings were

Re: [ceph-users] Jewel ubuntu release is half cooked

2016-05-27 Thread Andrei Mikhailovsky
;ceph", MODE="660" So, it looks as all /dev/sd** (including partition numbers) which has the model attribute INTEL SSDSC2BA20 and changes the ownership. You might want to adjust your model number for the ssd journals. Andrei > From: "Ernst Pijper" <ernst.pij

Re: [ceph-users] using jemalloc in trusty

2016-05-25 Thread Andrei Mikhailovsky
Interesting. I switched to jemalloc about a month ago while running Hammer. After installing the library and using /etc/ld.so.preload I am seeing that all ceph-osd processes are indeed using the library. I've upgraded to Jewel a few days ago and see the same picture: # time lsof |grep
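
A minimal sketch of that setup on Ubuntu, assuming the stock libjemalloc1 package path (restart the OSDs after changing the preload):

  apt-get install libjemalloc1
  echo /usr/lib/x86_64-linux-gnu/libjemalloc.so.1 > /etc/ld.so.preload
  lsof -p "$(pidof ceph-osd | tr ' ' ',')" | grep jemalloc   # verify the OSDs loaded it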

Re: [ceph-users] Jewel ubuntu release is half cooked

2016-05-24 Thread Andrei Mikhailovsky
Hi Anthony, > >> 2. Inefficient chown documentation - The documentation states that one should >> "chown -R ceph:ceph /var/lib/ceph" if one is looking to have ceph-osd run as >> user ceph and not as root. Now, this command would run a chown process one >> osd >> at a time. I am considering my
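
One commonly suggested way to parallelise that chown across OSD data directories, as a hedged sketch (stop the OSDs first and adjust the paths to your layout):

  for d in /var/lib/ceph/osd/ceph-*; do
      chown -R ceph:ceph "$d" &      # one chown per OSD, running concurrently
  done
  wait
  chown -R ceph:ceph /var/lib/ceph/mon /var/lib/ceph/bootstrap-*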

[ceph-users] Jewel ubuntu release is half cooked

2016-05-23 Thread Andrei Mikhailovsky
Hello I've recently updated my Hammer ceph cluster running on Ubuntu 14.04 LTS servers and noticed a few issues during the upgrade. Just wanted to share my experience. I've installed the latest Jewel release. In my opinion, some of the issues I came across relate to poor upgrade

[ceph-users] existing ceph cluster - clean start

2016-05-03 Thread Andrei Mikhailovsky
Hello, I am planning to make some changes to our ceph cluster and would like to ask the community of the best route to take. Our existing cluster is made of 3 osd servers (two of which are also mon servers) and the total of 3 mon servers. The cluster is currently running on Ubuntu 14.04.x

[ceph-users] Optimal OS configuration for running ceph

2016-04-29 Thread Andrei Mikhailovsky
Hello everyone, Please excuse me if this topic has been covered already. I've not managed to find a guide, checklist or even a set of notes on optimising OS level settings/configuration/services for running ceph. One of the main reasons for asking is I've recently had to troubleshoot a bunch

Re: [ceph-users] Hammer broke after adding 3rd osd server

2016-04-29 Thread Andrei Mikhailovsky
t morning. > > If anyone has an idea what else I could try, please let me know. > > Andrei > > - Original Message - >> From: "Wido den Hollander" <w...@42on.com> >> To: "andrei" <and...@arhont.com> >> Cc: &quo

Re: [ceph-users] Hammer broke after adding 3rd osd server

2016-04-28 Thread Andrei Mikhailovsky
- > From: "Wido den Hollander" <w...@42on.com> > To: "andrei" <and...@arhont.com> > Cc: "ceph-users" <ceph-users@lists.ceph.com> > Sent: Tuesday, 26 April, 2016 22:18:37 > Subject: Re: [ceph-users] Hammer broke after adding 3rd osd s

Re: [ceph-users] Hammer broke after adding 3rd osd server

2016-04-26 Thread Andrei Mikhailovsky
so OK."? By clients do you mean the host servers? Many thanks Andrei - Original Message - > From: "Wido den Hollander" <w...@42on.com> > To: "ceph-users" <ceph-users@lists.ceph.com>, "Andrei Mikhailovsky" > <and...@arhont.com>

[ceph-users] Hammer broke after adding 3rd osd server

2016-04-26 Thread Andrei Mikhailovsky
Hello everyone, I've recently performed a hardware upgrade on our small two osd server ceph cluster, which seems to have broken the ceph cluster. We are using ceph for cloudstack rbd images for vms. All of our servers are Ubuntu 14.04 LTS with the latest updates and kernel 4.4.6 from the ubuntu repo.

[ceph-users] Ceph cluster upgrade - adding ceph osd server

2016-04-15 Thread Andrei Mikhailovsky
Hi all, Was wondering what is the best way to add a new osd server to a small ceph cluster? I am interested in minimising performance degradation as the cluster is live and actively used. At the moment I've got the following setup: 2 osd servers (9 osds each) Journals on Intel 520/530

Re: [ceph-users] rebalance near full osd

2016-04-12 Thread Andrei Mikhailovsky
I've done the ceph osd reweight-by-utilization and it seems to have solved the issue. However, not sure if this will be the long term solution. Thanks for your help Andrei - Original Message - > From: "Shinobu Kinjo" <shinobu...@gmail.com> > To: "Andrei Mikha
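
For reference, a hedged sketch of the command sequence (the threshold of 115% of mean utilisation is illustrative; newer releases also offer a dry-run variant):

  ceph osd test-reweight-by-utilization 115   # dry run, where available
  ceph osd reweight-by-utilization 115
  ceph osd df tree                            # check the resulting distribution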

Re: [ceph-users] rebalance near full osd

2016-04-07 Thread Andrei Mikhailovsky
for pointing it out. Cheers Andrei - Original Message - > From: "Christian Balzer" <ch...@gol.com> > To: "ceph-users" <ceph-users@lists.ceph.com> > Cc: "Andrei Mikhailovsky" <and...@arhont.com> > Sent: Wednesday, 6 April, 2

[ceph-users] rebalance near full osd

2016-04-05 Thread Andrei Mikhailovsky
Hi I've just had a warning (from ceph -s) that one of the osds is near full. Having investigated the warning, I've located that osd.6 is 86% full. The data distribution is nowhere near equal across my osds, as you can see from the df command output below: /dev/sdj1 2.8T 2.4T 413G 86%

Re: [ceph-users] Unable to upload files with special characters like +

2016-02-02 Thread Andrei Mikhailovsky
Hi Eric, I remember having very similar issue when I was setting up radosgw. It turned out to be the issue on the proxy server side and not the radosgw. After trying a different proxy server the problem has been solved. Perhaps you have the same issue. Andrei > From: "Eric Magutu"

Re: [ceph-users] Intel S3710 400GB and Samsung PM863 480GB fio results

2015-12-26 Thread Andrei Mikhailovsky
"Tyler Bishop" <tyler.bis...@beyondhosting.net> > To: "Lionel Bouton" <lionel+c...@bouton.name> > Cc: "Andrei Mikhailovsky" <and...@arhont.com>, "ceph-users" > <ceph-users@lists.ceph.com> > Sent: Tuesday, 22 December, 2015 16:

Re: [ceph-users] Intel S3710 400GB and Samsung PM863 480GB fio results

2015-12-22 Thread Andrei Mikhailovsky
Hello guys, Was wondering if anyone has done testing on Samsung PM863 120 GB version to see how it performs? IMHO the 480GB version seems like a waste for the journal as you only need to have a small disk size to fit 3-4 osd journals. Unless you get a far greater durability. I am planning to
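
Journal candidates on this list are usually tested with a single-job O_DSYNC write, roughly as below (a hedged sketch; it writes to the raw device, so only run it on a disk you can wipe):

  fio --name=journal-test --filename=/dev/sdX --direct=1 --sync=1 \
      --rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based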

[ceph-users] release of the next Infernalis

2015-12-22 Thread Andrei Mikhailovsky
Hello guys, I was planning to upgrade our ceph cluster over the holiday period and was wondering when are you planning to release the next point release of the Infernalis? Should I wait for it or just roll out 9.2.0 for the time being? thanks Andrei

Re: [ceph-users] ceph and upgrading OS version

2015-10-22 Thread Andrei Mikhailovsky
Any thoughts anyone? Is it safe to perform an OS version upgrade on the osd and mon servers? Thanks Andrei - Original Message - From: "Andrei Mikhailovsky" <and...@arhont.com> To: ceph-us...@ceph.com Sent: Tuesday, 20 October, 2015 8:05:19 PM Subject: [

Re: [ceph-users] ceph and upgrading OS version

2015-10-22 Thread Andrei Mikhailovsky
and follow the procedure on the second osd server. Do you think this could work? Performance wise, i do not have a great IO demand, in particular over a weekend. Thanks Andrei - Original Message - From: "Luis Periquito" <periqu...@gmail.com> To: "Andrei Mikhailovsk

[ceph-users] [urgent] KVM issues after upgrade to 0.94.4

2015-10-21 Thread Andrei Mikhailovsky
Hello guys, I've upgraded to the latest Hammer release and I've just noticed a massive issue after the upgrade ((( I am using ceph for virtual machine rbd storage over cloudstack. I am having issues with starting virtual routers. The libvirt error message is: cat r-1407-VM.log 2015-10-21

[ceph-users] ceph and upgrading OS version

2015-10-20 Thread Andrei Mikhailovsky
Hello everyone I am planning to upgrade my ceph servers from Ubuntu 12.04 to 14.04 and I am wondering if you have a recommended process of upgrading the OS version without causing any issues to the ceph cluster? Many thanks Andrei

Re: [ceph-users] v0.94.4 Hammer released

2015-10-20 Thread Andrei Mikhailovsky
Same here, the upgrade went well. So far so good. - Original Message - From: "Francois Lafont" To: "ceph-users" Sent: Tuesday, 20 October, 2015 9:14:43 PM Subject: Re: [ceph-users] v0.94.4 Hammer released Hi, On 20/10/2015 20:11,

[ceph-users] too many kworker processes after upgrade to 0.94.3

2015-10-20 Thread Andrei Mikhailovsky
Hello I've recently upgraded my ceph cluster from 0.94.1 to 0.94.3 and noticed that after about a day I started getting emails from our network/host monitoring system. The notifications were that there were too many processes on the osd servers. I've not seen this before and I am running

Re: [ceph-users] Ceph and EnhanceIO cache

2015-06-26 Thread Andrei Mikhailovsky
Hi Nick, I've played with Flashcache and EnhanceIO, but in the end I decided not to use them in production. The reason was that using both increased the number of slow requests that I had on the cluster, and I also noticed a somewhat higher level of iowait on the vms. At that time, I

Re: [ceph-users] latest Hammer for Ubuntu precise

2015-06-22 Thread Andrei Mikhailovsky
Thanks Mate, I was under the same impression. Could someone at Inktank please help us with this problem? Is this intentional or has it simply been an error? Thanks Andrei -- Andrei Mikhailovsky Director Arhont Information Security Web: http://www.arhont.com http://www.wi-foo.com

[ceph-users] latest Hammer for Ubuntu precise

2015-06-21 Thread Andrei Mikhailovsky
Hi, I seem to be missing the latest Hammer release 0.94.2 in the repo for Ubuntu precise. I can see the packages for trusty, but precise still shows 0.94.1. Is this an omission, or did you stop supporting precise? Or has something odd happened with my precise servers? Cheers Andrei

Re: [ceph-users] rbd performance issue - can't find bottleneck

2015-06-19 Thread Andrei Mikhailovsky
Hi guys, I also use a combination of intel 520 and 530 for my journals and have noticed that the latency and the speed of 520s is better than 530s. Could someone please confirm that doing the following at start up will stop the dsync on the relevant drives? # echo temporary write through
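
The sysfs knob being referred to is cache_type; a hedged sketch (the SCSI address is a placeholder, and the setting does not persist across reboots):

  # "temporary" changes only the kernel's idea of the cache mode and does not
  # send a MODE SELECT command to the drive itself
  echo "temporary write through" > /sys/class/scsi_disk/2:0:0:0/cache_type
  cat /sys/class/scsi_disk/2:0:0:0/cache_type   # verify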

Re: [ceph-users] rbd performance issue - can't find bottleneck

2015-06-19 Thread Andrei Mikhailovsky
sense to get a small battery protected raid card in front of the 520s and 530s to protect against these types of scenarios? Cheers - Original Message - From: Mark Nelson mnel...@redhat.com To: Andrei Mikhailovsky and...@arhont.com Cc: ceph-users@lists.ceph.com Sent: Friday, 19 June, 2015

Re: [ceph-users] How to estimate whether putting a journal on SSD will help with performance?

2015-05-01 Thread Andrei Mikhailovsky
Piotr, You may also investigate if the cache tier made of a couple of ssds could help you. Not sure how the data is used in your company, but if you have a bunch of hot data that moves around from one vm to another it might greatly speed up the rsync. On the other hand, if a lot of rsync data

Re: [ceph-users] Possible improvements for a slow write speed (excluding independent SSD journals)

2015-04-26 Thread Andrei Mikhailovsky
Anthony, I doubt the manufacturer reported 315MB/s for a 4K block size - at 4K blocks that would be roughly 80,000 write IOPS, well beyond what these drives sustain. Most likely they used 1M or 4M as the block size to achieve the 300MB/s+ speeds. Andrei - Original Message - From: Alexandre DERUMIER aderum...@odiso.com To: Anthony Levesque aleves...@gtcomm.net Cc:

Re: [ceph-users] Possible improvements for a slow write speed (excluding independent SSD journals)

2015-04-21 Thread Andrei Mikhailovsky
Hi I have been testing the Samsung 840 Pro (128gb) for quite sometime and I can also confirm that this drive is unsuitable for osd journal. The performance and latency that I get from these drives (according to ceph osd perf) are between 10 - 15 times slower compared to the Intel 520. The

Re: [ceph-users] deep scrubbing causes osd down

2015-04-12 Thread Andrei Mikhailovsky
will need to revert to the default settings, as the cluster is not functional in its current state. Andrei - Original Message - From: LOPEZ Jean-Charles jelo...@redhat.com To: Andrei Mikhailovsky and...@arhont.com Cc: LOPEZ Jean-Charles jelo...@redhat.com, ceph-users@lists.ceph.com

Re: [ceph-users] deep scrubbing causes osd down

2015-04-12 Thread Andrei Mikhailovsky
not want to have more than 1 or 2 scrub/deep-scrubs running at the same time on my cluster. How do I implement this? Thanks Andrei - Original Message - From: Andrei Mikhailovsky and...@arhont.com To: LOPEZ Jean-Charles jelo...@redhat.com Cc: ceph-users@lists.ceph.com Sent: Sunday, 12
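
Scrub concurrency is bounded per OSD rather than cluster-wide; a hedged sketch of the relevant knobs (values illustrative):

  ceph tell osd.* injectargs '--osd_max_scrubs 1'                               # max concurrent scrubs per OSD
  ceph tell osd.* injectargs '--osd_scrub_begin_hour 22 --osd_scrub_end_hour 6' # quiet-hours window, Hammer and later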
