Re: [ceph-users] Running ceph in docker

2016-07-05 Thread Josef Johansson
erber, wrote: > > > >>> Josef Johansson wrote on Thursday, 30 June > 2016 at > 15:23: > > Hi, > > > > Hi, > > > You could actually manage every OSD, mon and MDS through Docker > swarm, > > since it's all just software it makes sense to dep

Re: [ceph-users] Mounting Ceph RBD image to XenServer 7 as SR

2016-06-30 Thread Josef Johansson
Also, is it possible to recompile the rbd kernel module in XenServer? I am under the impression that it's open source as well. Regards, Josef On Fri, 1 Jul 2016, 04:52 Mike Jacobacci, wrote: > Thanks Somnath and Christian, > > Yes, it looks like the latest version of XenServer still runs on an

Re: [ceph-users] Running ceph in docker

2016-06-30 Thread Josef Johansson
Hi, You could actually manage every OSD, mon and MDS through Docker swarm; since it's all just software, it makes sense to deploy it through Docker, where you add the disk that is needed. Mons do not need permanent storage either. Not that a restart of the docker instance would remove the but rathe

Re: [ceph-users] cluster down during backfilling, Jewel tunables and client IO optimisations

2016-06-20 Thread Josef Johansson
Hi, People ran into this when there were some changes in tunables that caused 70-100% movement. The solution was to find out which values had changed and increment them in the smallest steps possible. I've found that with major rearrangement in Ceph the VMs do not necessarily survive (last tim

Re: [ceph-users] ceph cookbook failed: Where to report that https://git.ceph.com/release.asc is down?

2016-06-18 Thread Josef Johansson
your help. > > > On Sat, Jun 18, 2016 at 11:33 AM, Josef Johansson > wrote: > >> Hi, >> >> Shouldn't https://github.com/ceph/ceph/blob/master/keys/release.asc be >> up to date? >> >> Regards, >> Josef >> >> On Sat, 18 Jun 201

Re: [ceph-users] ceph cookbook failed: Where to report that https://git.ceph.com/release.asc is down?

2016-06-18 Thread Josef Johansson
Hi, Shouldn't https://github.com/ceph/ceph/blob/master/keys/release.asc be up to date? Regards, Josef On Sat, 18 Jun 2016, 17:29 Soonthorn Ativanichayaphong, < soont...@getfelix.com> wrote: > Hello, > > Our Chef cookbook fails due to https://ceph.com/git/?p=ceph.git;a=bl > connection timed out. D

Re: [ceph-users] OSPF to the host

2016-06-08 Thread Josef Johansson
Hi, Regarding single points of failure on the daemon on the host, I was thinking about doing a cluster setup with e.g. VyOS on KVM machines on the host, and they handle all the OSPF stuff as well. I have not done any performance benchmarks, but it should at least be possible. Maybe even possib

[ceph-users] MONs fall out of quorum

2016-05-24 Thread Josef Johansson
Hi, I’m diagnosing a problem where monitors fall out of quorum now and then. It seems that when two monitors do a new election, one answer is not received until 5 minutes later. I checked ntpd on the servers, and all of them are spot on, no sync problems. This is happening a couple of times ever

Re: [ceph-users] Diagnosing slow requests

2016-05-24 Thread Josef Johansson
Hi, > On 24 May 2016, at 09:16, Christian Balzer wrote: > > > Hello, > > On Tue, 24 May 2016 07:03:25 +0000 Josef Johansson wrote: > >> Hi, >> >> You need to monitor latency instead of peak points. As Ceph is writing to >> two other nodes if you

Re: [ceph-users] Diagnosing slow requests

2016-05-24 Thread Josef Johansson
Hi, You need to monitor latency instead of peak points. As Ceph is writing to two other nodes if you have 3 replicas, that is 4x extra latency compared to one round trip from the client to the first OSD. So smaller and more numerous IOs mean more pain in latency. And the worst thing is that there is nothin

Re: [ceph-users] ceph cache tier clean rate too low

2016-04-19 Thread Josef Johansson
Hi, response in line On 20 Apr 2016 7:45 a.m., "Christian Balzer" wrote: > > > Hello, > > On Wed, 20 Apr 2016 03:42:00 + Stephen Lord wrote: > > > > > OK, you asked ;-) > > > > I certainly did. ^o^ > > > This is all via RBD, I am running a single filesystem on top of 8 RBD > > devices in an

Re: [ceph-users] User Interface

2016-03-11 Thread Josef Johansson
Proxmox handles the block storage at least, I know that ownCloud handles object storage through rgw nowadays :) Regards, Josef > On 02 Mar 2016, at 20:51, Michał Chybowski > wrote: > > Unfortunately, VSM can manage only pools / clusters created by itself. > Regards > Michał Chybowski > Tik

[ceph-users] Problems with starting services on Debian Jessie/Infernalis

2016-03-07 Thread Josef Johansson
Hi, We’re setting up a new cluster, but we’re having trouble restarting the monitor services. The problem is the difference between the ceph.service and ceph-mon@osd11 service in our case. root@osd11:/etc/init.d# /bin/systemctl status ceph.service ● ceph.service - LSB: Start Ceph distributed

Re: [ceph-users] Ceph mirrors wanted!

2016-02-29 Thread Josef Johansson
Hm, I should be a bit more updated now. At least for {debian,rpm}-{hammer,infernalis,testing} /Josef > On 29 Feb 2016, at 19:19, Wido den Hollander wrote: > > >> On 29 February 2016 at 18:22, Austin Johnson >> wrote: >> >> >> All, >> >> I agree that rsync is down on download.ceph.com. I

Re: [ceph-users] Ceph mirrors wanted!

2016-02-29 Thread Josef Johansson
Got rpm-infernalis in now, and I’m updating debian-infernalis as well. /Josef > On 29 Feb 2016, at 15:44, Josef Johansson wrote: > > Syncing now. >> On 29 Feb 2016, at 15:38, Josef Johansson > <mailto:jose...@gmail.com>> wrote: >> >> I’ll check if I c

Re: [ceph-users] Ceph mirrors wanted!

2016-02-29 Thread Josef Johansson
Syncing now. > On 29 Feb 2016, at 15:38, Josef Johansson wrote: > > I’ll check if I can mirror it through HTTP. >> On 29 Feb 2016, at 15:31, Josef Johansson > <mailto:jose...@gmail.com>> wrote: >> >> Then we’re all in the same boat. >> >>>

Re: [ceph-users] Ceph mirrors wanted!

2016-02-29 Thread Josef Johansson
I’ll check if I can mirror it through HTTP. > On 29 Feb 2016, at 15:31, Josef Johansson wrote: > > Then we’re all in the same boat. > >> On 29 Feb 2016, at 15:30, Florent B > <mailto:flor...@coppint.com>> wrote: >> >> Hi and thank you. But for me, yo

Re: [ceph-users] Ceph mirrors wanted!

2016-02-29 Thread Josef Johansson
Then we’re all in the same boat. > On 29 Feb 2016, at 15:30, Florent B wrote: > > Hi and thank you. But for me, you are out of sync as eu.ceph.com. Can't find > Infernalis 9.2.1 on your mirror :( > > On 02/29/2016 03:21 PM, Josef Johansson wrote: >> Y

Re: [ceph-users] Ceph mirrors wanted!

2016-02-29 Thread Josef Johansson
You could sync from me instead @ se.ceph.com As a start. Regards /Josef > On 29 Feb 2016, at 15:19, Florent B wrote: > > I would like to inform you that I have difficulties to set-up a mirror. > > rsync on download.ceph.com is down > > # rsync download.ceph.com:: > rsyn

Re: [ceph-users] v0.94.6 Hammer released

2016-02-29 Thread Josef Johansson
Maybe the reverse is possible, where we as a community lend out computing resources that the central build system could use. > On 29 Feb 2016, at 14:38, Josef Johansson wrote: > > Hi, > > There is also https://github.com/jordansissel/fpm/wiki > <https://github.com/

Re: [ceph-users] v0.94.6 Hammer released

2016-02-29 Thread Josef Johansson
Hi, There is also https://github.com/jordansissel/fpm/wiki I find it quite useful for building deb/rpm. What would be useful for the community per se would be if you made a Dockerfile for each type of combination, i.e. Ubuntu trusty / 10.0.3 and so fo

Re: [ceph-users] Dedumplication feature

2016-02-28 Thread Josef Johansson
I assume the author meant deduplication? :-) Cheers, Josef > On 28 Feb 2016, at 02:08, Lindsay Mathieson > wrote: > > On 28/02/2016 10:23 AM, Shinobu Kinjo wrote: >> Does the Ceph have ${subject}? > > Well ceph 0.67 was codename "Dumpling", and we are well past that, so yes I > guess ceph has

Re: [ceph-users] Tips for faster openstack instance boot

2016-02-09 Thread Josef Johansson
The biggest question here is if the OS is using systemctl or not. Cl7 boots extremely quickly but our cl6 instances take up to 90 seconds if the cluster has work to do. I know there's a lot to do in the init as well with boot profiling etc that could help. /Josef On Tue, 9 Feb 2016 17:11 Vickey Sing

Re: [ceph-users] Ceph mirrors wanted!

2016-02-07 Thread Josef Johansson
> > Wido > > > On 6 February 2016 at 8:22, Josef Johansson wrote: > > > > > > Hi Wido, > > > > We're planning on hosting here in Sweden. > > > > I can let you know when we're ready. > > > > Regards > > > > >

Re: [ceph-users] Ceph mirrors wanted!

2016-02-05 Thread Josef Johansson
Hi Wido, We're planning on hosting here in Sweden. I can let you know when we're ready. Regards Josef On Sat, 30 Jan 2016 15:15 Wido den Hollander wrote: > Hi, > > My PR was merged with a script to mirror Ceph properly: > https://github.com/ceph/ceph/tree/master/mirroring > > Currently ther

Re: [ceph-users] Ceph Tech Talk - High-Performance Production Databases on Ceph

2016-02-03 Thread Josef Johansson
I was fascinated as well. This is how it should be done ☺ We are in the middle of ordering and I saw the notice that they use single socket systems for the OSDs due to latency issues. I have only seen dual socket systems on the OSD setups here. Is this something you should do with new SSD clusters

Re: [ceph-users] very high OSD RAM usage values

2016-01-08 Thread Josef Johansson
Maybe changing the number of concurrent backfills could limit the memory usage. On 9 Jan 2016 05:52, "Josef Johansson" wrote: > Hi, > > I would say this is normal. 1GB of ram per 1TB is what we designed the > cluster for, I would believe that an EC-pool demands a lot mo

Re: [ceph-users] very high OSD RAM usage values

2016-01-08 Thread Josef Johansson
Hi, I would say this is normal. 1GB of RAM per 1TB is what we designed the cluster for; I would believe that an EC pool demands a lot more. Buy more RAM and start everything; 32GB of RAM is quite little. When the cluster is operating OK you'll see that extra RAM getting used as file cache, which makes
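
The 1GB-of-RAM-per-1TB rule of thumb in this message can be turned into a quick sizing check. What follows is only a minimal sketch: the function name, the `gb_per_tb` parameter and the twelve-disk host are assumptions made for illustration, not anything from the thread, and the result is a baseline rather than a tuned value (the message itself expects EC pools to need more).

```python
def baseline_osd_ram_gb(osd_sizes_tb, gb_per_tb=1.0):
    """Return a baseline RAM figure in GB for one OSD host.

    Applies the rule of thumb quoted in the message: roughly 1 GB of RAM
    per 1 TB of OSD storage. Treat the result as a floor, not a target.
    """
    return sum(osd_sizes_tb) * gb_per_tb


# Hypothetical host with twelve 4 TB OSDs (illustrative numbers only).
host = [4.0] * 12
print(baseline_osd_ram_gb(host))
```

Comparing the printed figure with the installed RAM gives a rough idea of how much headroom is left for the page cache the message mentions.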

Re: [ceph-users] KVM problems when rebalance occurs

2016-01-07 Thread Josef Johansson
0 >> """ >> >> I already made a benchmark on our staging setup with the new config and >> fio, but >> did not really get different results than before. >> >> For us it is hardly possible to reproduce the 'stalling' problems on the >>

Re: [ceph-users] KVM problems when rebalance occurs

2016-01-06 Thread Josef Johansson
Hi, Also make sure that you optimize the debug log config. There's a lot on the ML on how to set them all to low values (0/0). Not sure how it's in infernalis but it did a lot in previous versions. Regards, Josef On 6 Jan 2016 18:16, "Robert LeBlanc" wrote: > -BEGIN PGP SIGNED MESSAGE-

Re: [ceph-users] Help! OSD host failure - recovery without rebuilding OSDs

2015-12-28 Thread Josef Johansson
Did you manage to work this out? On 25 Dec 2015 9:33 am, "Josef Johansson" wrote: > Hi > > Someone here will probably lay out a detailed answer but to get you > started, > > All the details for the osd are in the xfs partitions, mirror a new USB > key and change i

Re: [ceph-users] Help! OSD host failure - recovery without rebuilding OSDs

2015-12-25 Thread Josef Johansson
Hi Someone here will probably lay out a detailed answer but to get you started, All the details for the osd are in the xfs partitions, mirror a new USB key and change ip etc and you should be able to recover. If the journal is linked to a /dev/sdx, make sure it's in the same spot as it was befor

Re: [ceph-users] OSD:s failing out after upgrade to 9.2.0 on Ubuntu 14.04

2015-12-12 Thread Josef Johansson
Thanks for sharing your solution as well! Happy holidays /Josef On 12 Dec 2015 12:56 pm, "Claes Sahlström" wrote: > Just to share with the rest of the list, my problems have been solved now. > > > > I got this information from Sergey Malinin who had the same problem: > > 1. Stop OSD daemons on a

Re: [ceph-users] OSD:s failing out after upgrade to 9.2.0 on Ubuntu 14.04

2015-11-16 Thread Josef Johansson
And if you look through the archives Sage did release a version of Infernalis that fixed if you didn’t do it that way as well. > On 16 Nov 2015, at 22:15, David Clarke wrote: > > On 17/11/15 09:46, Claes Sahlström wrote: >> Did some more logging and for some reason it seems like I do have some

Re: [ceph-users] OSD:s failing out after upgrade to 9.2.0 on Ubuntu 14.04

2015-11-16 Thread Josef Johansson
Hi, That piece of code is keeping your OSD from booting. Well you could run the below to check the version as well. Might do that with the mon as well just to be sure. # /usr/bin/ceph-osd --version ceph version 0.94.2 (5fb85614ca8f354284c713a2f9c610860720bbf3) Regards, /Josef > On 16 Nov 2015,
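
When checking versions across many OSD and mon hosts, the output line shown in this message can be compared programmatically. This is a small sketch, assuming the output keeps the `ceph version X.Y.Z (hash)` shape quoted above; the function name is made up for illustration.

```python
import re


def parse_ceph_version(output):
    """Extract (major, minor, patch) from `ceph-osd --version` style output."""
    match = re.search(r"ceph version (\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        raise ValueError("unrecognised version string: %r" % output)
    return tuple(int(part) for part in match.groups())


# The line quoted in the message above.
line = "ceph version 0.94.2 (5fb85614ca8f354284c713a2f9c610860720bbf3)"
print(parse_ceph_version(line))
```

Parsing into a tuple makes it easy to spot a daemon that is still running an older binary, since tuples compare element by element.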

Re: [ceph-users] OSD:s failing out after upgrade to 9.2.0 on Ubuntu 14.04

2015-11-15 Thread Josef Johansson
cc the list as well > On 15 Nov 2015, at 23:41, Josef Johansson wrote: > > Hi, > > So it’s just frozen at that point? > > You should definitely increase the logging and restart the osd. I believe it’s > debug osd 20 and debug mon 20. > > A quick google b

Re: [ceph-users] OSD:s failing out after upgrade to 9.2.0 on Ubuntu 14.04

2015-11-15 Thread Josef Johansson
Hi, Could you catch any segmentation faults in /var/log/ceph/ceph-osd.11.log ? Regards, Josef > On 15 Nov 2015, at 23:06, Claes Sahlström wrote: > > Sorry to almost double post, I noticed that it seems like one mon is down, > but they do actually seem to be ok, the 11 that are in falls out an

Re: [ceph-users] Potential OSD deadlock?

2015-10-05 Thread Josef Johansson
074.0024.0032.67 > 0.035.006.000.00 5.00 3.00 > sde 0.00 0.000.000.00 0.00 0.00 0.00 > 0.000.000.000.00 0.00 0.00 > sdj 0.00 0.000.00 787.50 0.00 9418.0023.92

Re: [ceph-users] Potential OSD deadlock?

2015-10-04 Thread Josef Johansson
0.00 0.000.000.00 0.00 0.00 0.00 > 0.000.000.000.00 0.00 0.00 > sdj 0.00 0.000.00 787.50 0.00 9418.0023.92 > 0.070.100.000.10 0.09 6.70 > sdk 0.00 0.000.00

Re: [ceph-users] Potential OSD deadlock?

2015-10-03 Thread Josef Johansson
Hi, I don't know what brand those 4TB spindles are, but I know that mine are very bad at doing writes at the same time as reads, especially small reads and writes. This has an absurdly bad effect when doing maintenance on ceph. That being said we see a lot of difference between dumpling and hammer in per

Re: [ceph-users] help! failed to start ceph-mon daemon

2015-09-20 Thread Josef Johansson
Hi, No, not myself, did you manage to compile with the --without-lttng flag? On 21 Sep 2015 02:51, "Zhen Wang" wrote: > BTW, did you successfully build the deb package?⊙▽⊙ > > > Sent from NetEase Mail Master <http://u.163.com/signature> > > > On 2015-09-20 17:47 , Josef Johanss

[ceph-users] Fwd: Re: help! failed to start ceph-mon daemon

2015-09-20 Thread Josef Johansson
Posting to the ML as well. -- Forwarded message -- From: "Josef Johansson" Date: 20 Sep 2015 11:47 Subject: Re: [ceph-users] help! failed to start ceph-mon daemon To: "wikison" Cc: Hi, I would assume the deb knows more about the startup system, and the inst

Re: [ceph-users] Check networking first?

2015-08-01 Thread Josef Johansson
Hi, I did a "big-ping" test to verify the network after the last major network problem. If anyone wants to take a peek I could share. Cheers Josef On Sat, 1 Aug 2015 02:19, Ben Hines wrote: > I encountered a similar problem. Incoming firewall ports were blocked > on one host. So the other OSDs kept mar

Re: [ceph-users] Discuss: New default recovery config settings

2015-05-29 Thread Josef Johansson
Hi, We did it the other way around instead, defining a period where the load is lighter and turning backfill/recovery off/on. Then you want the backfill values to be what is default right now. Also, someone said (I think it was Greg?) that if you have problems with backfill, your cluster backing st

Re: [ceph-users] How to improve latencies and per-VM performance and latencies

2015-05-20 Thread Josef Johansson
Hi, Just to add, there’s also a collectd plugin at https://github.com/rochaporto/collectd-ceph . Things to check when you have slow read performance are: *) how much fragmentation is there on those XFS partitions? With some workloads you get high values pr

Re: [ceph-users] Find out the location of OSD Journal

2015-05-14 Thread Josef Johansson
I tend to use something along the lines for osd in $(grep osd /etc/mtab | cut -d ' ' -f 2); do echo "$(echo $osd | cut -d '-' -f 2): $(readlink -f $(readlink $osd/journal))";done | sort -k 2 Cheers, Josef > On 08 May 2015, at 02:47, Robert LeBlanc wrote: > > You may also be able to use `ceph
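
For anyone who prefers a script to the one-liner, the same idea can be sketched in Python: pick mount points from mtab lines that mention `osd`, take the text after the last `-` as the OSD id, resolve the `journal` symlink, and sort by journal path. The function names are made up for illustration, and using the text after the last hyphen as the id is an approximation of the `cut -d '-' -f 2` step that assumes paths shaped like `/var/lib/ceph/osd/ceph-3`.

```python
import os


def osd_mount_points(mtab_text):
    """Return the mount points of mtab lines that mention 'osd'."""
    points = []
    for line in mtab_text.splitlines():
        fields = line.split(" ")
        # Second space-separated field is the mount point, as in `cut -f 2`.
        if "osd" in line and len(fields) >= 2:
            points.append(fields[1])
    return points


def journal_locations(mount_points):
    """Return (osd_id, resolved_journal_path) pairs sorted by journal path."""
    pairs = []
    for mount in mount_points:
        osd_id = mount.rsplit("-", 1)[-1]
        # realpath follows the journal symlink chain, like `readlink -f`.
        journal = os.path.realpath(os.path.join(mount, "journal"))
        pairs.append((osd_id, journal))
    return sorted(pairs, key=lambda pair: pair[1])


# Usage on an OSD host (reads the same file as the one-liner):
#   with open("/etc/mtab") as handle:
#       for osd_id, journal in journal_locations(osd_mount_points(handle.read())):
#           print("%s: %s" % (osd_id, journal))
```

Sorting by the resolved path groups OSDs that share a journal device, which is the point of the `sort -k 2` in the original command.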

[ceph-users] defragment xfs-backed OSD

2015-04-26 Thread Josef Johansson
Hi, I’m seeing high fragmentation on my OSDs, is it safe to perform xfs_fsr defragmentation? Any guidelines in using it? I would assume doing it in off hours and using a tmp-file for saving the last position for the defrag. Thanks! /Josef ___ ceph-us

[ceph-users] IOWait on SATA-backed with SSD-journals

2015-04-25 Thread Josef Johansson
Hi, With inspiration from all the other performance threads going on here, I started to investigate on my own as well. I’m seeing a lot of iowait on the OSD, and the journal utilised at 2-7%, with about 8-30MB/s (mostly around 8MB/s write). This is a dumpling cluster. The goal here is to increase

Re: [ceph-users] replace dead SSD journal

2015-04-18 Thread Josef Johansson
y, we are going to check those dead SSDs on a pc/laptop or so, just >>> to confirm they are really dead - but this is the way they die, not wear >>> out, but simply show different space instead of real one - these were 3 >>> months old only when they died... >>> >>>

Re: [ceph-users] replace dead SSD journal

2015-04-18 Thread Josef Johansson
If the same chassis/chip/backplane is behind both drives and maybe other drives in the chassis have troubles, it may be a defect there as well. On 18 Apr 2015 09:42, "Steffen W Sørensen" wrote: > > > On 17/04/2015, at 21.07, Andrija Panic wrote: > > > > nahSamsun 850 PRO 128GB - dead after 3mon

Re: [ceph-users] replace dead SSD journal

2015-04-17 Thread Josef Johansson
- 2 of these died... > wearing level is 96%, so only 4% wasted... (yes I know these are not > enterprise,etc... ) > > On 17 April 2015 at 21:01, Josef Johansson wrote: > >> tough luck, hope everything comes up ok afterwards. What models on the >> SSD? >> >> /Josef >

Re: [ceph-users] replace dead SSD journal

2015-04-17 Thread Josef Johansson
OSDs. > > On 17 April 2015 at 19:01, Josef Johansson wrote: > >> Hi, >> >> Did 6 other OSDs go down when re-adding? >> >> /Josef >> >> On 17 Apr 2015, at 18:49, Andrija Panic wrote: >> >> 12 osds down - I expect less wo

Re: [ceph-users] replace dead SSD journal

2015-04-17 Thread Josef Johansson
Hi, Did 6 other OSDs go down when re-adding? /Josef > On 17 Apr 2015, at 18:49, Andrija Panic wrote: > > 12 osds down - I expect less work with removing and adding osd? > > On Apr 17, 2015 6:35 PM, "Krzysztof Nowicki" > wrote: > Why not just wipe out the

Re: [ceph-users] metadata management in case of ceph object storage and ceph block storage

2015-04-16 Thread Josef Johansson
Hi, Maybe others had your mail going into junk as well, but that is why at least I did not see it. To your question, which I’m not sure I understand completely. In Ceph you have three distinct types of services, Mon, Monitors MDS, Metadata Servers OSD, Object Storage Devices And some other c

Re: [ceph-users] low power single disk nodes

2015-04-10 Thread Josef Johansson
Hi, You have these guys as well, http://www.seagate.com/gb/en/products/enterprise-servers-storage/nearline-storage/kinetic-hdd/ I talked to them during WHD, and they said that it's not fit for ceph if you pack 70 of them in one chassis because of the noise level. I would assume that 1U with a lot o

Re: [ceph-users] Finding out how much data is in the journal

2015-03-23 Thread Josef Johansson
> On 23 Mar 2015, at 03:58, Haomai Wang wrote: > > On Mon, Mar 23, 2015 at 2:53 AM, Josef Johansson <mailto:jose...@gmail.com>> wrote: >> Hi all! >> >> Trying to figure out how much my journals are used, using SSDs as journals >> and SATA-drives a

[ceph-users] Finding out how much data is in the journal

2015-03-22 Thread Josef Johansson
Hi all! Trying to figure out how much my journals are used, using SSDs as journals and SATA-drives as storage, I dive into perf dump. But I can’t figure out why journal_queue_bytes is at constant 0. The only thing that differs is dirtied in WBThrottle. Maybe I’ve disabled that when setting the i
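
To watch the journal counters over time rather than reading one dump by eye, the JSON from perf dump can be filtered. This is a hedged sketch: it assumes the dump is a JSON object of sections that each map counter names to values, and it simply collects every counter whose name starts with `journal`. The function name is made up, and the sample data in the usage comment is illustrative, not real cluster output.

```python
import json


def journal_counters(perf_dump_json):
    """Collect counters whose name starts with 'journal' from a perf dump."""
    dump = json.loads(perf_dump_json)
    found = {}
    for section, counters in dump.items():
        if not isinstance(counters, dict):
            continue  # skip anything that is not a section of counters
        for name, value in counters.items():
            if name.startswith("journal"):
                found["%s.%s" % (section, name)] = value
    return found


# Usage with made-up sample data (not real cluster output):
#   sample = '{"filestore": {"journal_queue_bytes": 0, "committing": 1}}'
#   print(journal_counters(sample))
```

Sampling this every few seconds during a write-heavy period would show whether journal_queue_bytes ever leaves 0 or whether the queue simply drains faster than the sampling interval.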

Re: [ceph-users] Uneven CPU usage on OSD nodes

2015-03-21 Thread Josef Johansson
I'm neither a dev nor a well informed Cepher. But I've seen posts that the pg count may be set too high, see https://www.mail-archive.com/ceph-users@lists.ceph.com/msg16205.html Also, we use 128GB+ in production on the OSD servers with 10 osd per server because it boosts the read cache, so you may w

Re: [ceph-users] SSD Hardware recommendation

2015-03-20 Thread Josef Johansson
> On 19 Mar 2015, at 08:17, Christian Balzer wrote: > > On Wed, 18 Mar 2015 08:59:14 +0100 Josef Johansson wrote: > >> Hi, >> >>> On 18 Mar 2015, at 05:29, Christian Balzer wrote: >>> >>> >>> Hello, >>> >>>

Re: [ceph-users] World hosting days 2015

2015-03-18 Thread Josef Johansson
nktank and the community) representation last > years. > > see you! > -- > Pawel > > On Tue, Mar 17, 2015 at 6:38 PM, Josef Johansson <mailto:jose...@gmail.com>> wrote: > Hi, > > I was wondering if any Cephers were going to WHD this year? > > Cheers, >

Re: [ceph-users] SSD Hardware recommendation

2015-03-18 Thread Josef Johansson
> I'm going to use intel s3610 ssd for my production cluster, can't comment > about samsung drive. > > > I'll try to post benchmark results in coming weeks. > > > - Original message - > From: "Josef Johansson" > To: "ceph-users" >

Re: [ceph-users] SSD Hardware recommendation

2015-03-18 Thread Josef Johansson
Hi, > On 18 Mar 2015, at 05:29, Christian Balzer wrote: > > > Hello, > > On Wed, 18 Mar 2015 03:52:22 +0100 Josef Johansson wrote: > >> Hi, >> >> I’m planning a Ceph SSD cluster, I know that we won’t get the full >> performance from the SSD i

[ceph-users] SSD Hardware recommendation

2015-03-17 Thread Josef Johansson
Hi, I’m planning a Ceph SSD cluster, I know that we won’t get the full performance from the SSD in this case, but SATA won’t cut it as backend storage and SAS is the same price as SSD now. The backend network will be a 10GbE active/passive, but will be used mainly for MySQL, so we’re aiming fo

[ceph-users] World hosting days 2015

2015-03-17 Thread Josef Johansson
Hi, I was wondering if any Cephers were going to WHD this year? Cheers, Josef ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] OSD turned itself off

2015-02-16 Thread Josef Johansson
And yeah, it’s the same EIO 5 error. So ok, the errors don’t show anything useful about the OSD crash. > On 16 Feb 2015, at 21:58, Josef Johansson wrote: > > Well, I knew it had all the correct information since earlier so gave it a > shot :) > > Anyway, I think it

Re: [ceph-users] OSD turned itself off

2015-02-16 Thread Josef Johansson
ote: > > Woah, major thread necromancy! :) > > On Feb 13, 2015, at 3:03 PM, Josef Johansson <mailto:jo...@oderland.se>> wrote: >> >> Hi, >> >> I skimmed the logs again, as we’ve had more of this kinda errors, >> >> I saw a lot of lossy connectio

Re: [ceph-users] OSD turned itself off

2015-02-13 Thread Josef Johansson
2-10 20:20:36.673969 7f6d5b954700 0 -- 10.168.7.23:6819/10217 submit_message osd_op_reply(11088 rbd_data.10b8c82eb141f2.4459 [stat,write 749568~8192] ondisk = 0) v4 remote, 10.168.7.55:0/1005630, failed lossy con, dropping message 0x138db200 Could this have led to the data being errone

Re: [ceph-users] RBD and HA KVM anybody?

2014-12-15 Thread Josef Johansson
Hi, > On 16 Dec 2014, at 05:00, Christian Balzer wrote: > > > Hello, > > On Mon, 15 Dec 2014 09:23:23 +0100 Josef Johansson wrote: > >> Hi Christian, >> >> We’re using Proxmox that has support for HA, they do it per-vm. >> We’re doing it

Re: [ceph-users] RBD and HA KVM anybody?

2014-12-15 Thread Josef Johansson
Hi Christian, We’re using Proxmox that has support for HA, they do it per-vm. We’re doing it manually right now though, because we like it :). When I looked at it I couldn’t see a way of just allowing a set of hosts in the HA (i.e. not the storage nodes), but that’s probably easy to solve. Che

[ceph-users] Showing package loss in ceph main log

2014-09-12 Thread Josef Johansson
Hi, I've stumbled upon this a couple of times, where Ceph just stops responding, but still works. The cause has been packet loss on the network layer, but Ceph doesn't say anything. Is there a debug flag for showing retransmission of packets, or some way to see that packets are lost? Regards, J

Re: [ceph-users] Huge issues with slow requests

2014-09-06 Thread Josef Johansson
On 07 Sep 2014, at 04:47, Christian Balzer wrote: > On Sat, 6 Sep 2014 19:47:13 +0200 Josef Johansson wrote: > >> >> On 06 Sep 2014, at 19:37, Josef Johansson wrote: >> >>> Hi, >>> >>> Unfortunately the journal tuning did not do much. That

Re: [ceph-users] Huge issues with slow requests

2014-09-06 Thread Josef Johansson
On 06 Sep 2014, at 19:37, Josef Johansson wrote: > Hi, > > Unfortunately the journal tuning did not do much. That’s odd, because I don’t > see much utilisation on OSDs themselves. Now this points to a network issue > between the OSDs, right? > To answer my own question. Resta

Re: [ceph-users] Huge issues with slow requests

2014-09-06 Thread Josef Johansson
Hi, Unfortunately the journal tuning did not do much. That’s odd, because I don’t see much utilisation on OSDs themselves. Now this points to a network issue between the OSDs, right? On 06 Sep 2014, at 18:17, Josef Johansson wrote: > Hi, > > On 06 Sep 2014, at 17:59, Christian Balz

Re: [ceph-users] Huge issues with slow requests

2014-09-06 Thread Josef Johansson
Hi, On 06 Sep 2014, at 17:59, Christian Balzer wrote: > > Hello, > > On Sat, 6 Sep 2014 17:41:02 +0200 Josef Johansson wrote: > >> Hi, >> >> On 06 Sep 2014, at 17:27, Christian Balzer wrote: >> >>> >>> Hello, >>> >>

Re: [ceph-users] Huge issues with slow requests

2014-09-06 Thread Josef Johansson
Hi, On 06 Sep 2014, at 18:05, Christian Balzer wrote: > > Hello, > > On Sat, 6 Sep 2014 17:52:59 +0200 Josef Johansson wrote: > >> Hi, >> >> Just realised that it could also be with a popularity bug as well and >> lots a small traffic. And seeing

Re: [ceph-users] Huge issues with slow requests

2014-09-06 Thread Josef Johansson
Johansson wrote: > Hi, > > On 06 Sep 2014, at 17:27, Christian Balzer wrote: > >> >> Hello, >> >> On Sat, 6 Sep 2014 17:10:11 +0200 Josef Johansson wrote: >> >>> We manage to go through the restore, but the performance degradation is >>> s

Re: [ceph-users] Huge issues with slow requests

2014-09-06 Thread Josef Johansson
Hi, On 06 Sep 2014, at 17:27, Christian Balzer wrote: > > Hello, > > On Sat, 6 Sep 2014 17:10:11 +0200 Josef Johansson wrote: > >> We manage to go through the restore, but the performance degradation is >> still there. >> > Manifesting itself how? >

Re: [ceph-users] Huge issues with slow requests

2014-09-06 Thread Josef Johansson
still there afterwards? i.e. if I set back the weight would it move back all the PGs? Regards, Josef On 06 Sep 2014, at 15:52, Josef Johansson wrote: > FWIW I did restart the OSDs until I saw a server that had an impact. Until that > server stopped having an impact, I didn’t get lower in the

Re: [ceph-users] Huge issues with slow requests

2014-09-06 Thread Josef Johansson
it gets to replicating the same PGs that were causing troubles the first time. On 06 Sep 2014, at 15:04, Josef Johansson wrote: > Actually, it only worked with restarting for a period of time to get the > recovering process going. Can’t get past the 21k object mark. > > I’m

Re: [ceph-users] Huge issues with slow requests

2014-09-06 Thread Josef Johansson
, at 14:33, Josef Johansson wrote: > Hi, > > On 06 Sep 2014, at 13:53, Christian Balzer wrote: > >> >> Hello, >> >> On Sat, 6 Sep 2014 13:37:25 +0200 Josef Johansson wrote: >> >>> Also putting this on the list. >>> >>

Re: [ceph-users] Huge issues with slow requests

2014-09-06 Thread Josef Johansson
Hi, On 06 Sep 2014, at 13:53, Christian Balzer wrote: > > Hello, > > On Sat, 6 Sep 2014 13:37:25 +0200 Josef Johansson wrote: > >> Also putting this on the list. >> >> On 06 Sep 2014, at 13:36, Josef Johansson wrote: >> >>> Hi, >>>

Re: [ceph-users] Huge issues with slow requests

2014-09-06 Thread Josef Johansson
Also putting this on the list. On 06 Sep 2014, at 13:36, Josef Johansson wrote: > Hi, > > Same issues again, but I think we found the drive that causes the problems. > > But this is causing problems as it’s trying to do a recover to that osd at > the moment. > >

[ceph-users] Good way to monitor detailed latency/throughput

2014-09-05 Thread Josef Johansson
Hi, How do you guys monitor the cluster to find disks that behave badly, or VMs that impact the Ceph cluster? I'm looking for something where I could get a good bird's-eye view of latency/throughput, that uses something easy like SNMP. Regards, Josef Joha

Re: [ceph-users] Using Ramdisk wi

2014-07-30 Thread Josef Johansson
Hi, Just chippin in, As RAM is pretty cheap right now, it could be an idea to fill all the memory slots in the OSDs, bigger chance that the data you've requested is actually in ram already then. You should go with DC S3700 400GB for the journals at least.. Cheers, Josef On 30/07/14 17:12, Chris

[ceph-users] Recommendation to safely avoid problems with osd-failure

2014-07-28 Thread Josef Johansson
.e. if two OSD hosts dies around the same time I suspect that the clients would suffer greatly. Currently the osd has the following settings osd max backfills = 1 osd recovery max active = 1 Is there any general guidance or recommendation for unexpected outages? Cheers, Josef

Re: [ceph-users] Poor performance on all SSD cluster

2014-06-25 Thread Josef Johansson
Hi, On 25/06/14 00:27, Mark Kirkwood wrote: > On 24/06/14 23:39, Mark Nelson wrote: >> On 06/24/2014 03:45 AM, Mark Kirkwood wrote: >>> On 24/06/14 18:15, Robert van Leeuwen wrote: > All of which means that Mysql performance (looking at you binlog) may > still suffer due to lots of small b

Re: [ceph-users] OSD turned itself off

2014-06-13 Thread Josef Johansson
Thanks for the quick response. Cheers, Josef Gregory Farnum wrote on 2014-06-14 02:36: On Fri, Jun 13, 2014 at 5:25 PM, Josef Johansson wrote: Hi Greg, Thanks for the clarification. I believe the OSD was in the middle of a deep scrub (sorry for not mentioning this straight away), so then it

Re: [ceph-users] OSD turned itself off

2014-06-13 Thread Josef Johansson
com | http://ceph.com On Fri, Jun 13, 2014 at 5:16 PM, Josef Johansson wrote: Hey, Just examining what happened to an OSD that was just turned off. Data has been moved away from it, so I'm hesitating to turn it back on. Got the below in the logs, any clues to what the assert talks about? Cheers, Jo

[ceph-users] OSD turned itself off

2014-06-13 Thread Josef Johansson
Hey, Just examining what happened to an OSD that was just turned off. Data has been moved away from it, so I'm hesitating to turn it back on. Got the below in the logs, any clues to what the assert talks about? Cheers, Josef -1 os/FileStore.cc: In function 'virtual int FileStore::read(coll_t,

Re: [ceph-users] Slow IOPS on RBD compared to journal and backing devices

2014-06-13 Thread Josef Johansson
atency and whatnot... Christian On Wed, 14 May 2014 21:33:06 +0900 Christian Balzer wrote: Hello! On Wed, 14 May 2014 11:29:47 +0200 Josef Johansson wrote: Hi Christian, I missed this thread, haven't been reading the list that well the last weeks. You already know my setup, since we disc

Re: [ceph-users] Slow IOPS on RBD compared to journal and backing devices

2014-06-13 Thread Josef Johansson
with doing it all virtual? If it's just memory and the same machine, we should see the pure ceph performance right? Anyone done this? Cheers, Josef Stefan Priebe - Profihost AG wrote on 2014-05-15 09:58: On 15.05.2014 09:56, Josef Johansson wrote: On 15/05/14 09:11, Stefan Priebe - Pr

Re: [ceph-users] Ceph networks, to bond or not to bond?

2014-06-07 Thread Josef Johansson
Hi, Late to the party, but just to be sure, does the switch support mc-lag or mlag by any chance? There could be updates integrating this. Cheers, Josef Sven Budde wrote on 2014-06-06 13:06: Hi all, thanks for the replies and heads up for the different bonding options. I'll toy around with th

Re: [ceph-users] Slow IOPS on RBD compared to journal and backing devices

2014-05-15 Thread Josef Johansson
On 15/05/14 09:11, Stefan Priebe - Profihost AG wrote: > On 15.05.2014 00:26, Josef Johansson wrote: >> Hi, >> >> So, apparently tmpfs does not support non-root xattr due to a possible >> DoS-vector. There's configuration set for enabling it as far as

Re: [ceph-users] Slow IOPS on RBD compared to journal and backing devices

2014-05-14 Thread Josef Johansson
ceph/osd/ceph-50: (22) Invalid argument Cheers, Josef Christian Balzer wrote on 2014-05-14 14:33: Hello! On Wed, 14 May 2014 11:29:47 +0200 Josef Johansson wrote: Hi Christian, I missed this thread, haven't been reading the list that well the last weeks. You already know my setup, since we d

Re: [ceph-users] Slow IOPS on RBD compared to journalandbackingdevices

2014-05-14 Thread Josef Johansson
> /Field Storage Support Engineer/** > > Despegar.com - IT Team > > > > > > > > > >> --- Original message --- >> *Subject:* Re: [ceph-users] Slow IOPS on RBD compared to >> journalandbackingdevices >> *From:* Josef Johansson >> *Par

Re: [ceph-users] Slow IOPS on RBD compared to journal andbackingdevices

2014-05-14 Thread Josef Johansson
Anders* >> /Field Storage Support Engineer/** >> >> Despegar.com - IT Team >> >> >> >> >> >> >> >> >> >> >> --- Original message --- >> *Subject:* Re: [ceph-users] Slow IOPS on RBD compared to journal >

Re: [ceph-users] Slow IOPS on RBD compared to journal and backing devices

2014-05-14 Thread Josef Johansson
Hi Christian, I missed this thread, haven't been reading the list that well the last weeks. You already know my setup, since we discussed it in an earlier thread. I don't have a fast backing store, but I see the slow IOPS when doing randwrite inside the VM, with rbd cache. Still running dumpling

Re: [ceph-users] OSD full - All RBD Volumes stopped responding

2014-04-11 Thread Josef Johansson
On 11/04/14 09:07, Wido den Hollander wrote: > >> On 11 April 2014 at 8:50, Josef Johansson wrote: >> >> >> Hi, >> >> On 11/04/14 07:29, Wido den Hollander wrote: >>>> On 11 April 2014 at 7:13, Greg Poirier wrote: >>>> >

Re: [ceph-users] OSD full - All RBD Volumes stopped responding

2014-04-10 Thread Josef Johansson
Hi, On 11/04/14 07:29, Wido den Hollander wrote: > >> On 11 April 2014 at 7:13, Greg Poirier wrote: >> >> >> One thing to note >> All of our kvm VMs have to be rebooted. This is something I wasn't >> expecting. Tried waiting for them to recover on their own, but that's not >> happening. Reb

Re: [ceph-users] How to detect journal problems

2014-04-09 Thread Josef Johansson
ead out over different objects. Cheers, Josef > Christian > >> -Greg >> Software Engineer #42 @ http://inktank.com | http://ceph.com >> >> >> On Wed, Apr 9, 2014 at 3:06 AM, Christian Balzer wrote: >>> On Tue, 8 Apr 2014 09:35:19 -0700 Gregory Farnum wrote: >>
