Re: [ceph-users] radosgw and keystone version 3 domains

2015-09-18 Thread Shinobu Kinjo
What's the error message you saw when you tried? Shinobu - Original Message - From: "Abhishek L" To: "Robert Duncan" Cc: ceph-us...@ceph.com Sent: Friday, September 18, 2015 12:29:20 PM Subject: Re: [ceph-users] radosgw and keystone version 3 domains On Fri, Sep 18, 2015 at 4:38 AM, Robert

Re: [ceph-users] Software Raid 1 for system disks on storage nodes (not for OSD disks)

2015-09-18 Thread Martin Palma
Thanks all for the suggestions. Our storage nodes have plenty of RAM and their only purpose is to host the OSD daemons, so we will not create a swap partition on provisioning. For the OS disk we will then use a software raid 1 to handle eventual disk failures. For provisioning the hosts we use

[ceph-users] Clarification of Cache settings

2015-09-18 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 Based on some discussion where limiting promotions to only 10% increased the performance of the cache tier (sorry, I can't find that discussion at the moment to reference), I've been reading through http://ceph.com/docs/master/rados/operations/

Re: [ceph-users] Software Raid 1 for system disks on storage nodes (not for OSD disks)

2015-09-18 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 If you decide to use swap, be warned that significant parts of the OSD code can be swapped out even without memory pressure. This has caused OSD processes to take 5 minutes to shut down in my experience. I would recommend tuning swappiness in this ca
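As a rough illustration of the swappiness tuning mentioned above, a sysctl fragment along these lines keeps the kernel from swapping out idle OSD pages (file path and exact value are illustrative, not from the original post):

```ini
# /etc/sysctl.d/99-ceph.conf (illustrative path)
# Keep OSD process pages resident unless there is real memory pressure
vm.swappiness = 1
```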

Re: [ceph-users] Software Raid 1 for system disks on storage nodes (not for OSD disks)

2015-09-18 Thread 张冬卯
Yes, a raid1 system disk is necessary, from my perspective. And a swap partition is still needed even if memory is large. Martin Palma <mar...@palma.bz> wrote on 18 Sep 2015 at 23:07:

Re: [ceph-users] ceph osd won't boot, resource shortage?

2015-09-18 Thread Shinobu Kinjo
Sorry for that. That's my fault. Disclaimer: this is what I always do for advanced investigation; it is NOT a common solution, and other experts have different approaches. So, to find out what's exactly going on at the I/O layer: 1. Install fio 2. Change the followin
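The message's actual steps are truncated above; for reference, a minimal fio job file for this kind of raw-I/O probe might look like the following (all paths and values are illustrative):

```ini
# random 4k direct writes via libaio; filename, size and depth are examples
[osd-disk-probe]
filename=/tmp/fio.test
size=256M
rw=randwrite
bs=4k
iodepth=32
ioengine=libaio
direct=1
runtime=30
time_based=1
```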

Re: [ceph-users] Lot of blocked operations

2015-09-18 Thread Olivier Bonvalet
Hi, I think I found the problem: a journal that is way too large. I caught this in the logs of an OSD with blocked queries : OSD.15 : 2015-09-19 00:41:12.717062 7fb8a3c44700 1 journal check_for_full at 3548528640 : JOURNAL FULL 3548528640 >= 1376255 (max_size 4294967296 start 3549904896) 2015-09-19 00
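If an oversized journal is indeed the culprit, its size is bounded in ceph.conf; a fragment like this (value purely illustrative) caps it well below the 4 GiB max_size shown in the log:

```ini
[osd]
# journal size in MB; a smaller journal limits how much data can queue
# ahead of the filestore (1 GiB here, only as an example)
osd journal size = 1024
```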

Re: [ceph-users] Hammer reduce recovery impact

2015-09-18 Thread Quentin Hartman
I just applied the following settings to my cluster and it resulted in much better behavior in the hosted VMs: osd_backfill_scan_min = 2 osd_backfill_scan_max = 16 osd_recovery_max_active = 1 osd_max_backfills = 1 osd_recovery_threads = 1 osd_recovery_op_priority = 1 On my "canary" VM iowait drop
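For context, the settings listed above map onto a ceph.conf fragment like this (same values as in the message; the [osd] section placement is assumed):

```ini
[osd]
osd_backfill_scan_min = 2
osd_backfill_scan_max = 16
osd_recovery_max_active = 1
osd_max_backfills = 1
osd_recovery_threads = 1
osd_recovery_op_priority = 1
```

They can also be applied at runtime with `ceph tell osd.* injectargs`, though injected values revert when the OSDs restart.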

Re: [ceph-users] multi-datacenter crush map

2015-09-18 Thread Gregory Farnum
On Fri, Sep 18, 2015 at 4:57 AM, Wouter De Borger wrote: > Hi all, > > I have found on the mailing list that it should be possible to have a multi > datacenter setup, if latency is low enough. > > I would like to set this up, so that each datacenter has at least two > replicas and each PG has a re

Re: [ceph-users] Delete pool with cache tier

2015-09-18 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 Created request http://tracker.ceph.com/issues/13163 - Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Fri, Sep 18, 2015 at 12:06 PM, John Spray wrote: > On Fri, Sep 18, 2015 at 7:04 PM, Robert

Re: [ceph-users] Using cephfs with hadoop

2015-09-18 Thread Gregory Farnum
On Thu, Sep 17, 2015 at 7:48 PM, Fulin Sun wrote: > Hi, guys > > I am wondering if I am able to deploy ceph and hadoop into different cluster > nodes and I can > > still use cephfs as the backend for hadoop access. > > For example, ceph in cluster 1 and hadoop in cluster 2, while cluster 1 and > c

Re: [ceph-users] Software Raid 1 for system disks on storage nodes (not for OSD disks)

2015-09-18 Thread Jan Schermer
Hi, > On 18 Sep 2015, at 17:06, Martin Palma wrote: > > Hi, > > Is it a good idea to use a software raid for the system disk (Operating > System) on a Ceph storage node? I mean only for the OS not for the OSD disks. > Yes, absolutely. Or even a hardware RAID if that's what you use elsewhere.

Re: [ceph-users] Delete pool with cache tier

2015-09-18 Thread John Spray
On Fri, Sep 18, 2015 at 7:04 PM, Robert LeBlanc wrote: > -BEGIN PGP SIGNED MESSAGE- > Hash: SHA256 > > Is there a way to delete a pool with a cache tier without first > evicting the cache tier and removing it (ceph 0.94.3)? > > Something like: > > ceph osd pool delete --delete-cache-tier

[ceph-users] Delete pool with cache tier

2015-09-18 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 Is there a way to delete a pool with a cache tier without first evicting the cache tier and removing it (ceph 0.94.3)? Something like: ceph osd pool delete --delete-cache-tier --yes-i-really-really-mean-it Evicting the cache tier has taken over
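As far as I know no such combined flag existed in 0.94.x; the conventional sequence is to flush/evict, detach the tier, then delete both pools. A sketch, with hypothetical pool names `base` and `hot`:

```shell
# Flush and evict all objects from the cache pool (this is the slow part)
rados -p hot cache-flush-evict-all
# Detach the tier from the base pool
ceph osd tier remove-overlay base
ceph osd tier remove base hot
# Now both pools can be deleted
ceph osd pool delete hot hot --yes-i-really-really-mean-it
ceph osd pool delete base base --yes-i-really-really-mean-it
```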

[ceph-users] lttng duplicate registration problem when using librados2 and libradosstriper

2015-09-18 Thread Paul Mansfield
Hello, thanks for your attention. I have started using rados striper library, calling the functions from a C program. As soon as I add libradosstriper to the linking process, I get this error when the program runs, even though I am not calling any functions from the rados striper library (I comme

Re: [ceph-users] which SSD / experiences with Samsung 843T vs. Intel s3700

2015-09-18 Thread Quentin Hartman
No, they are dead dead dead. Can't get anything off of them. If you look back further on this thread I think the most noteworthy part of this whole experience is just how far off my write estimates were. The ones that have not died have somewhere between 24 and 32 TB written to them after 9 months

Re: [ceph-users] debian repositories path change?

2015-09-18 Thread Ken Dreyer
On Fri, Sep 18, 2015 at 9:28 AM, Sage Weil wrote: > On Fri, 18 Sep 2015, Alfredo Deza wrote: >> The new locations are in: >> >> >> http://packages.ceph.com/ >> >> For debian this would be: >> >> http://packages.ceph.com/debian-{release} > > Make that download.ceph.com .. the packages url was tempo

Re: [ceph-users] Software Raid 1 for system disks on storage nodes (not for OSD disks)

2015-09-18 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 Depends on how easy it is to rebuild an OS from scratch. If you have something like Puppet or Chef that configures a node completely for you, it may not be too much of a pain to forgo the RAID. We run our OSD nodes from a single SATADOM and use Puppet

Re: [ceph-users] debian repositories path change?

2015-09-18 Thread Sage Weil
On Fri, 18 Sep 2015, Alfredo Deza wrote: > The new locations are in: > > > http://packages.ceph.com/ > > For debian this would be: > > http://packages.ceph.com/debian-{release} Make that download.ceph.com .. the packages url was temporary while we got the new site ready and will go away short
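Putting Sage's correction together with the original question, a wheezy sources.list entry would now look like this (release name `giant` chosen only to match the earlier example):

```
deb http://download.ceph.com/debian-giant/ wheezy main
```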

[ceph-users] Software Raid 1 for system disks on storage nodes (not for OSD disks)

2015-09-18 Thread Martin Palma
Hi, Is it a good idea to use a software raid for the system disk (Operating System) on a Ceph storage node? I mean only for the OS, not for the OSD disks. And what about a swap partition? Is that needed? Best, Martin

Re: [ceph-users] missing SRPMs - for librados2 and libradosstriper1?

2015-09-18 Thread Alfredo Deza
On Fri, Sep 18, 2015 at 10:02 AM, Paul Mansfield wrote: > > p.s. this page: >http://docs.ceph.com/docs/giant/install/get-packages/ > > is entirely wrong and the links to >http://ceph.com/packages/ceph-extras/rpm > > are all useless; http://ceph.com/packages seems to have gone away, I > als

Re: [ceph-users] debian repositories path change?

2015-09-18 Thread Alfredo Deza
The new locations are in: http://packages.ceph.com/ For debian this would be: http://packages.ceph.com/debian-{release} Note that ceph-extras is no longer available: the current repos should provide everything/anything that is needed to properly install ceph. Otherwise, please let us know. O

Re: [ceph-users] debian repositories path change?

2015-09-18 Thread Brian Kroth
Hmm, apparently I haven't gotten that far in my email backlog yet. That's good to know too. Thanks, Brian. Olivier Bonvalet wrote on 2015-09-18 16:02: Hi, not sure if it's related, but there is recent changes because of a security issue : http://ceph.com/releases/important-security-notice-regarding-

Re: [ceph-users] missing SRPMs - for librados2 and libradosstriper1?

2015-09-18 Thread Paul Mansfield
p.s. this page: http://docs.ceph.com/docs/giant/install/get-packages/ is entirely wrong and the links to http://ceph.com/packages/ceph-extras/rpm are all useless; http://ceph.com/packages seems to have gone away, I also tried https. ;-(

Re: [ceph-users] debian repositories path change?

2015-09-18 Thread Olivier Bonvalet
Hi, not sure if it's related, but there are recent changes because of a security issue : http://ceph.com/releases/important-security-notice-regarding-signing-key-and-binary-downloads-of-ceph/ On Friday 18 September 2015 at 08:45 -0500, Brian Kroth wrote: > Hi all, we've had the following i

[ceph-users] missing SRPMs - for librados2 and libradosstriper1?

2015-09-18 Thread Paul Mansfield
I was looking to download the SRPMs associated with the packages in http://download.ceph.com/rpm-hammer/rhel6/x86_64/ or http://download.ceph.com/rpm-hammer/rhel7/x86_64/ but there's only a subset; the things I am really looking for are librados2 and libradosstriper1 source rpms. Please can

[ceph-users] debian repositories path change?

2015-09-18 Thread Brian Kroth
Hi all, we've had the following in our /etc/apt/sources.list.d/ceph.list for a while based on some previous docs, # ceph upstream stable (currently giant) release packages for wheezy: deb http://ceph.com/debian/ wheezy main # ceph extras: deb http://ceph.com/packages/ceph-extras/debian wheezy m

Re: [ceph-users] ceph osd won't boot, resource shortage?

2015-09-18 Thread Peter Sabaini
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 18.09.15 14:47, Shinobu Kinjo wrote: > I do not think that it's best practice to increase that number > at the moment. It's kind of lack of consideration. > > We might need to do that as a result. > > But what we should do, first, is to check curr

Re: [ceph-users] ceph osd won't boot, resource shortage?

2015-09-18 Thread Shinobu Kinjo
I do not think it's best practice to increase that number right away; that would be a bit thoughtless. We might need to do it in the end. But what we should do first is check the current actual number of AIO contexts using: watch -dc cat /proc/sys/fs/aio-nr then increase it, if it's necessa
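A minimal sketch of that check on any Linux box (the 65536 default ceiling is from the kernel docs, not from this thread):

```shell
# Current number of allocated AIO contexts vs. the system-wide ceiling
cat /proc/sys/fs/aio-nr      # contexts in use right now
cat /proc/sys/fs/aio-max-nr  # limit (default 65536)
# Only if aio-nr approaches the ceiling while OSDs start:
#   sysctl -w fs.aio-max-nr=1048576
```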

Re: [ceph-users] Lot of blocked operations

2015-09-18 Thread Olivier Bonvalet
On Friday 18 September 2015 at 14:14 +0200, Paweł Sadowski wrote: > It might be worth checking how many threads you have in your system > (ps > -eL | wc -l). By default there is a limit of 32k (sysctl -q > kernel.pid_max). There is/was a bug in fork() > (https://lkml.org/lkml/2015/2/3/345) repo
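The check Paweł describes can be scripted as follows (reading /proc directly, which is equivalent to `sysctl -q kernel.pid_max`):

```shell
# Compare the system-wide thread count against the kernel's pid/thread ceiling
threads=$(ps -eL | wc -l)
limit=$(cat /proc/sys/kernel/pid_max)
echo "threads=$threads limit=$limit"
```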

Re: [ceph-users] Lot of blocked operations

2015-09-18 Thread Paweł Sadowski
On 09/18/2015 12:17 PM, Olivier Bonvalet wrote: > On Friday 18 September 2015 at 12:04 +0200, Jan Schermer wrote: >>> On 18 Sep 2015, at 11:28, Christian Balzer wrote: >>> >>> On Fri, 18 Sep 2015 11:07:49 +0200 Olivier Bonvalet wrote: >>> On Friday 18 September 2015 at 10:59 +0200, Jan S

[ceph-users] multi-datacenter crush map

2015-09-18 Thread Wouter De Borger
Hi all, I have found on the mailing list that it should be possible to have a multi datacenter setup, if latency is low enough. I would like to set this up, so that each datacenter has at least two replicas and each PG has a replication level of 3. In this
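As a sketch of what such a rule could look like in crushmap syntax of that era (bucket names `default`/`datacenter` are illustrative). Note that "at least two replicas in each of two datacenters" implies a pool size of 4; with size 3 this rule yields a 2+1 split across the datacenters:

```
rule replicated_two_dc {
    ruleset 1
    type replicated
    min_size 3
    max_size 4
    step take default
    step choose firstn 2 type datacenter
    step chooseleaf firstn 2 type host
    step emit
}
```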

Re: [ceph-users] Lot of blocked operations

2015-09-18 Thread Olivier Bonvalet
On Friday 18 September 2015 at 12:04 +0200, Jan Schermer wrote: > > On 18 Sep 2015, at 11:28, Christian Balzer wrote: > > > > On Fri, 18 Sep 2015 11:07:49 +0200 Olivier Bonvalet wrote: > > > > > On Friday 18 September 2015 at 10:59 +0200, Jan Schermer wrote > > > : > > > > In that case it

Re: [ceph-users] Lot of blocked operations

2015-09-18 Thread Jan Schermer
> On 18 Sep 2015, at 11:28, Christian Balzer wrote: > > On Fri, 18 Sep 2015 11:07:49 +0200 Olivier Bonvalet wrote: > >> On Friday 18 September 2015 at 10:59 +0200, Jan Schermer wrote: >>> In that case it can either be slow monitors (slow network, slow >>> disks(!!!) or a CPU or memory prob

Re: [ceph-users] help! Ceph Manual Depolyment

2015-09-18 Thread Max A. Krasilnikov
Hello! On Thu, Sep 17, 2015 at 11:59:47PM +0800, wikison wrote: > Is there any detailed manual deployment document? I downloaded the source and > built ceph, then installed ceph on 7 computers. I used three as monitors and > four as OSD. I followed the official document on ceph.com. Bu

Re: [ceph-users] C example of using libradosstriper?

2015-09-18 Thread Paul Mansfield
Hello, sorry for the delay in replying. I have found your example code very useful. My problem now is that I am using LTTNG to trace my program, and it seems that libradosstriper also uses LTTNG; both try to initialise it, and the program exits. I don't really want to rip out all my trace and debug

Re: [ceph-users] Lot of blocked operations

2015-09-18 Thread Christian Balzer
On Fri, 18 Sep 2015 11:07:49 +0200 Olivier Bonvalet wrote: > On Friday 18 September 2015 at 10:59 +0200, Jan Schermer wrote: > > In that case it can either be slow monitors (slow network, slow > > disks(!!!) or a CPU or memory problem). > > But it still can also be on the OSD side in the form

Re: [ceph-users] Lot of blocked operations

2015-09-18 Thread Christian Balzer
Hello, On Fri, 18 Sep 2015 10:35:37 +0200 Olivier Bonvalet wrote: > On Friday 18 September 2015 at 17:04 +0900, Christian Balzer wrote: > > Hello, > > > > On Fri, 18 Sep 2015 09:37:24 +0200 Olivier Bonvalet wrote: > > > > > Hi, > > > > > > sorry for missing informations. I was to avoid pu

Re: [ceph-users] Lot of blocked operations

2015-09-18 Thread Olivier Bonvalet
On Friday 18 September 2015 at 10:59 +0200, Jan Schermer wrote: > In that case it can either be slow monitors (slow network, slow > disks(!!!) or a CPU or memory problem). > But it still can also be on the OSD side in the form of either CPU > usage or memory pressure - in my case there were lo

Re: [ceph-users] Lot of blocked operations

2015-09-18 Thread Jan Schermer
In that case it can be either slow monitors (slow network, slow disks(!!!)) or a CPU or memory problem. But it can still also be on the OSD side, in the form of either CPU usage or memory pressure - in my case there was a lot of memory used for pagecache (so for all intents and purposes consider

[ceph-users] Strange rbd hung with non-standard crush location

2015-09-18 Thread Max A. Krasilnikov
Hello! I have a 3-node ceph cluster under ubuntu 14.04.3 with hammer 0.94.2 from the ubuntu-cloud repository. My config and crush map are attached below. After adding a volume with cinder, any of my openstack instances hangs after a short period of time with an "[sda]: abort" message in the VM's kernel log. When

Re: [ceph-users] Lot of blocked operations

2015-09-18 Thread Olivier Bonvalet
mmm good point. I don't see CPU or IO problem on mons, but in logs, I have this : 2015-09-18 01:55:16.921027 7fb951175700 0 log [INF] : pgmap v86359128: 6632 pgs: 77 inactive, 1 remapped, 10 active+remapped+wait_backfill, 25 peering, 5 active+remapped, 6 active+remapped+backfilling, 6499 active+

Re: [ceph-users] Lot of blocked operations

2015-09-18 Thread Olivier Bonvalet
On Friday 18 September 2015 at 17:04 +0900, Christian Balzer wrote: > Hello, > > On Fri, 18 Sep 2015 09:37:24 +0200 Olivier Bonvalet wrote: > > > Hi, > > > > sorry for missing informations. I was to avoid putting too much > > inappropriate infos ;) > > > Nah, everything helps, there are kno

Re: [ceph-users] erasure pool, ruleset-root

2015-09-18 Thread Loic Dachary
On 18/09/2015 09:00, Loic Dachary wrote: > Hi Tom, > > Could you please share command you're using and their output ? A dump of the > crush rules would also be useful to figure out why it did not work as > expected. > s/command/the commands/ > Cheers > > On 18/09/2015 01:01, Deneau, Tom wr

Re: [ceph-users] help! Ceph Manual Depolyment

2015-09-18 Thread Henrik Korkuc
On 15-09-17 18:59, wikison wrote: Is there any detailed manual deployment document? I downloaded the source and built ceph, then installed ceph on 7 computers. I used three as monitors and four as OSD. I followed the official document on ceph.com. But it didn't work and it seemed to be out-da

Re: [ceph-users] Lot of blocked operations

2015-09-18 Thread Christian Balzer
Hello, On Fri, 18 Sep 2015 09:37:24 +0200 Olivier Bonvalet wrote: > Hi, > > sorry for missing informations. I was to avoid putting too much > inappropriate infos ;) > Nah, everything helps, there are known problems with some versions, kernels, file systems, etc. Speaking of which, what FS are

Re: [ceph-users] Lot of blocked operations

2015-09-18 Thread Jan Schermer
Could this be caused by monitors? In my case lagging monitors can also cause slow requests (because of slow peering). Not sure if that's expected or not, but it of course doesn't show on the OSDs as any kind of bottleneck when you try to investigate... Jan > On 18 Sep 2015, at 09:37, Olivier B

Re: [ceph-users] which SSD / experiences with Samsung 843T vs. Intel s3700

2015-09-18 Thread Jan Schermer
"850 PRO" is a workstation drive. You shouldn't put it in the server... But it should not just die either way, so don't tell them you use it for Ceph next time. Do the drives work when replugged? Can you get anything from SMART? Jan > On 18 Sep 2015, at 02:57, James (Fei) Liu-SSI > wrote: >

Re: [ceph-users] Lot of blocked operations

2015-09-18 Thread Olivier Bonvalet
Hi, sorry for the missing information; I was trying to avoid including too much irrelevant detail ;) On Friday 18 September 2015 at 12:30 +0900, Christian Balzer wrote: > Hello, > > On Fri, 18 Sep 2015 02:43:49 +0200 Olivier Bonvalet wrote: > > The items below help, but be as specific as possible,

[ceph-users] ESXI 5.5 Update 3 and LIO

2015-09-18 Thread Nick Fisk
Hi All, Just browsing through the release notes of the latest ESXi update, and I can see this: "During transient error conditions, I/O to a device might repeatedly fail and not failover to an alternate working path." During transient error conditions like BUS BUSY, QFULL, HOST ABORTS, HOST RETRY a

Re: [ceph-users] Lot of blocked operations

2015-09-18 Thread Olivier Bonvalet
But yes, I will try to increase OSD verbosity. On Thursday 17 September 2015 at 20:28 -0700, GuangYang wrote: > Which version are you using? > > My guess is that the request (op) is waiting for lock (might be > ondisk_read_lock of the object, but a debug_osd=20 should be helpful > to tell what hap

Re: [ceph-users] Lot of blocked operations

2015-09-18 Thread Olivier Bonvalet
I use Ceph 0.80.10. I see that IO wait is near 0 with iostat and htop (in detailed mode), and rechecked with the Zabbix supervisor. On Thursday 17 September 2015 at 20:28 -0700, GuangYang wrote: > Which version are you using? > > My guess is that the request (op) is waiting for lock (might be > ondisk_

Re: [ceph-users] erasure pool, ruleset-root

2015-09-18 Thread Loic Dachary
Hi Tom, Could you please share the commands you're using and their output? A dump of the crush rules would also be useful to figure out why it did not work as expected. Cheers On 18/09/2015 01:01, Deneau, Tom wrote: > I see that I can create a crush rule that only selects osds > from a certain node