Re: [ceph-users] Error with ceph to cloudstack integration.

2017-03-05 Thread Wido den Hollander
> Op 6 maart 2017 om 6:26 schreef frank : > > > Hi, > > We have set up a Ceph server and a CloudStack server. All the osds are up > with ceph status currently OK. > > > > [root@admin-ceph ~]# ceph status > cluster ebac75fc-e631-4c9f-a310-880cbcdd1d25 >

Re: [ceph-users] How to hide internal ip on ceph mount

2017-03-01 Thread Wido den Hollander
> Op 1 maart 2017 om 16:57 schreef Sage Weil <s...@newdream.net>: > > > On Wed, 1 Mar 2017, Wido den Hollander wrote: > > > Op 1 maart 2017 om 15:40 schreef Xiaoxi Chen <superdebu...@gmail.com>: > > > > > > > > > Well , I thi

Re: [ceph-users] How to hide internal ip on ceph mount

2017-03-01 Thread Wido den Hollander
> Op 1 maart 2017 om 15:40 schreef Xiaoxi Chen : > > > Well , I think the argument here is not all about security gain, it just > NOT a user friendly way to let "df" show out 7 IPs of monitors... Much > better if they seeing something like "mycephfs.mydomain.com". >

Re: [ceph-users] RADOS as a simple object storage

2017-02-28 Thread Wido den Hollander
> Op 27 februari 2017 om 15:59 schreef Jan Kasprzak : > > > Hello, > > Gregory Farnum wrote: > : On Mon, Feb 20, 2017 at 11:57 AM, Jan Kasprzak wrote: > : > Gregory Farnum wrote: > : > : On Mon, Feb 20, 2017 at 6:46 AM, Jan Kasprzak

Re: [ceph-users] Can Cloudstack really be HA when using CEPH?

2017-02-25 Thread Wido den Hollander
thout having to update CloudStack's configuration. Wido > > On Feb 25, 2017 6:56 AM, "Wido den Hollander" <w...@42on.com> wrote: > > > > Op 24 februari 2017 om 19:48 schreef Adam Carheden < > adam.carhe...@gmail.com>: > > > > > > From th

Re: [ceph-users] Can Cloudstack really be HA when using CEPH?

2017-02-25 Thread Wido den Hollander
> Op 24 februari 2017 om 19:48 schreef Adam Carheden : > > > From the docs for each project: > > "When a primary storage outage occurs the hypervisor immediately stops > all VMs stored on that storage >

Re: [ceph-users] ceph-disk and mkfs.xfs are hanging on SAS SSD

2017-02-24 Thread Wido den Hollander
> Op 24 februari 2017 om 9:12 schreef Rajesh Kumar : > > > Hi, > > I am using Ceph Jewel on Ubuntu 16.04 Xenial, with SAS SSD and > driver=megaraid_sas > > > "/usr/bin/python /usr/sbin/ceph-disk prepare --osd-uuid --fs-type xfs > /dev/sda3" is hanging. This command is

Re: [ceph-users] PG stuck peering after host reboot

2017-02-24 Thread Wido den Hollander
arted is 0 and empty is 1. The other OSDs are reporting > last_epoch_started 16806 and empty 0. > > I noticed that too and was wondering why it never completed recovery and > joined > > > If you stop osd.307 and maybe mark it as out, does that help? > > No, I see the s

Re: [ceph-users] mgr active s01 reboot

2017-02-22 Thread Wido den Hollander
y. Check that first if the local user is allowed to write to that file. > Must we first umount the filesystem? No, not required. Wido > > Regards, Arnoud. > > From: Wido den Hollander [w...@42on.com] > Sent: Wednesday, February 22, 2017 2

Re: [ceph-users] PG stuck peering after host reboot

2017-02-22 Thread Wido den Hollander
is 1. The other OSDs are reporting last_epoch_started 16806 and empty 0. My EC PG knowledge is not sufficient here to exactly tell you what is going on, but that's the only thing I noticed so far. If you stop osd.307 and maybe mark it as out, does that help? Wido > ___

Re: [ceph-users] PG stuck peering after host reboot

2017-02-22 Thread Wido den Hollander
> Op 21 februari 2017 om 15:35 schreef george.vasilaka...@stfc.ac.uk: > > > I have noticed something odd with the ceph-objectstore-tool command: > > It always reports PG X not found even on healthly OSDs/PGs. The 'list' op > works on both and unhealthy PGs. > Are you sure you are supplying
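
A minimal sketch of how ceph-objectstore-tool is usually pointed at a PG, assuming the default ceph-disk paths; the OSD id comes from this thread and the PG id is a placeholder. The daemon has to be stopped first, and the pgid has to match exactly what --op list-pgs prints (including the pool number), which is one common reason for the "PG not found" message:

    systemctl stop ceph-osd@307
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-307 \
        --journal-path /var/lib/ceph/osd/ceph-307/journal --op list-pgs
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-307 \
        --journal-path /var/lib/ceph/osd/ceph-307/journal --pgid 1.323 --op info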

Re: [ceph-users] PG stuck peering after host reboot

2017-02-21 Thread Wido den Hollander
> Op 20 februari 2017 om 17:52 schreef george.vasilaka...@stfc.ac.uk: > > > Hi Wido, > > Just to make sure I have everything straight, > > > If the PG still doesn't recover do the same on osd.307 as I think that > > 'ceph pg X query' still hangs? > > > The info from ceph-objectstore-tool

Re: [ceph-users] Experience with 5k RPM/archive HDDs

2017-02-19 Thread Wido den Hollander
ture, but for > > now, I would strongly recommend against SMR. > > > > Go for normal SATA drives with only slightly higher price/capacity ratios. > > > > - mike > > > >> On 2/3/17 2:46 PM, Stillwell, Bryan J wrote: > >> On 2/3/17, 3:23 AM,

Re: [ceph-users] Disable debug logging: best practice or not?

2017-02-17 Thread Wido den Hollander
> Op 17 februari 2017 om 17:44 schreef Kostis Fardelas : > > > Hi, > I keep reading recommendations about disabling debug logging in Ceph > in order to improve performance. There are two things that are unclear > to me though: > > a. what do we lose if we decrease default

Re: [ceph-users] PG stuck peering after host reboot

2017-02-17 Thread Wido den Hollander
hat osd.307 is on the same host as osd.595. > > We’ll have a look on osd.595 like you suggested. > If the PG still doesn't recover do the same on osd.307 as I think that 'ceph pg X query' still hangs? The info from ceph-objectstore-tool might shed some more light on this PG. Wido &g

Re: [ceph-users] PG stuck peering after host reboot

2017-02-16 Thread Wido den Hollander
> Op 16 februari 2017 om 14:55 schreef george.vasilaka...@stfc.ac.uk: > > > Hi folks, > > I have just made a tracker for this issue: > http://tracker.ceph.com/issues/18960 > I used ceph-post-file to upload some logs from the primary OSD for the > troubled PG. > > Any help would be

Re: [ceph-users] KVM/QEMU rbd read latency

2017-02-16 Thread Wido den Hollander
> Op 16 februari 2017 om 21:38 schreef Steve Taylor > : > > > You might try running fio directly on the host using the rbd ioengine (direct > librbd) and see how that compares. The major difference between that and the > krbd test will be the page cache
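
A hedged sketch of the librbd-direct test suggested above, assuming fio was built with the rbd engine (fio --enghelp lists the engines compiled in); the pool, image and client names are placeholders:

    rbd create rbd/fio-test --size 10240
    fio --name=rbd-randread --ioengine=rbd --clientname=admin --pool=rbd \
        --rbdname=fio-test --rw=randread --bs=4k --iodepth=32 \
        --runtime=60 --time_based

Because this goes straight through librbd it bypasses the host page cache, which is exactly the comparison being suggested against the krbd numbers.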

Re: [ceph-users] kraken-bluestore 11.2.0 memory leak issue

2017-02-16 Thread Wido den Hollander
> Op 16 februari 2017 om 7:19 schreef Muthusamy Muthiah > : > > > Thanks Ilya Letkowski for the information, we will change this value > accordingly. > What I understand from yesterday's performance meeting is that this seems like a bug. Lowering this buffer

Re: [ceph-users] bcache vs flashcache vs cache tiering

2017-02-14 Thread Wido den Hollander
> Op 14 februari 2017 om 11:14 schreef Nick Fisk : > > > > -Original Message- > > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > > Dongsheng Yang > > Sent: 14 February 2017 09:01 > > To: Sage Weil > > Cc:

Re: [ceph-users] SMR disks go 100% busy after ~15 minutes

2017-02-13 Thread Wido den Hollander
olve many > problems as the XFS journal is rewritten often and SMR disks don't like > rewrites. > I think that is one reason why btrfs works smoother with those disks. > > Hope this helps > > Bernhard > > Wido den Hollander <w...@42on.com> schrieb am Mo., 13

Re: [ceph-users] 1 PG stuck unclean (active+remapped) after OSD replacement

2017-02-13 Thread Wido den Hollander
> Op 13 februari 2017 om 16:03 schreef Eugen Block : > > > Hi experts, > > I have a strange situation right now. We are re-organizing our 4 node > Hammer cluster from LVM-based OSDs to HDDs. When we did this on the > first node last week, everything went smoothly, I removed

Re: [ceph-users] SMR disks go 100% busy after ~15 minutes

2017-02-13 Thread Wido den Hollander
r, wasn't aware that SMR disks have that. SMR shouldn't be used in Ceph without proper support in BlueStore or an SMR-aware XFS. Wido > > On 02/13/17 15:49, Wido den Hollander wrote: > > Hi, > > > > I have an odd case with SMR disks in a Ceph clust

[ceph-users] SMR disks go 100% busy after ~15 minutes

2017-02-13 Thread Wido den Hollander
Hi, I have an odd case with SMR disks in a Ceph cluster. Before I continue, yes, I am fully aware of SMR and Ceph not playing along well, but there is something happening which I'm not able to fully explain. On a 2x replica cluster with 8TB Seagate SMR disks I can write with about 30MB/sec to
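
A sketch of how this write pattern can be reproduced and observed; the pool name is a placeholder and the 900-second runtime is simply long enough to reach the ~15 minute mark described in the subject:

    rados bench -p smr-test 900 write -b 4194304 -t 16 --no-cleanup
    iostat -x 5    # on the OSD hosts, watch for the SMR drives pinned at 100% util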

Re: [ceph-users] OSDs cannot match up with fast OSD map changes (epochs) during recovery

2017-02-13 Thread Wido den Hollander
> Op 13 februari 2017 om 12:57 schreef Muthusamy Muthiah > : > > > Hi All, > > We also have same issue on one of our platforms which was upgraded from > 11.0.2 to 11.2.0 . The issue occurs on one node alone where CPU hits 100% > and OSDs of that node marked down.

Re: [ceph-users] - permission denied on journal after reboot

2017-02-13 Thread Wido den Hollander
> Op 13 februari 2017 om 12:06 schreef Piotr Dzionek : > > > Hi, > > I am running ceph Jewel 10.2.5 with separate journals - ssd disks. It > runs pretty smooth, however I stumble upon an issue after system reboot. > Journal disks become owned by root and ceph failed
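
Two approaches commonly used for this on Jewel, with the device and partition number as placeholders; the type code is the GUID that the ceph-disk udev rules match for journal partitions, so setting it makes the ceph:ceph ownership persist across reboots:

    chown ceph:ceph /dev/sdb2    # quick, non-persistent workaround
    sgdisk --typecode=2:45b0969e-9b03-4f30-b4c6-b4b80ceff106 /dev/sdb
    partprobe /dev/sdb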

Re: [ceph-users] CephFS root squash?

2017-02-10 Thread Wido den Hollander
> Op 10 februari 2017 om 9:02 schreef Robert Sander > : > > > On 09.02.2017 20:11, Jim Kilborn wrote: > > > I am trying to figure out how to allow my users to have sudo on their > > workstation, but not have that root access to the ceph kernel mounted > >

Re: [ceph-users] Radosgw scaling recommendation?

2017-02-09 Thread Wido den Hollander
> Op 9 februari 2017 om 19:34 schreef Mark Nelson : > > > I'm not really an RGW expert, but I'd suggest increasing the > "rgw_thread_pool_size" option to something much higher than the default > 100 threads if you haven't already. RGW requires at least 1 thread per >
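
A sketch of where that option is usually set, with the RGW instance name as a placeholder; the thread pool is sized when the daemon starts, so a restart is the safe way to apply it:

    # in ceph.conf on the RGW host:
    #   [client.rgw.gateway1]
    #   rgw_thread_pool_size = 512
    systemctl restart ceph-radosgw@rgw.gateway1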

Re: [ceph-users] Speeding Up "rbd ls -l " output

2017-02-09 Thread Wido den Hollander
l-refresh 01b375db-d3f5-33c1-9389-8bf226c887e8 > Pool 01b375db-d3f5-33c1-9389-8bf226c887e8 refreshed > > > real 0m22.504s > user 0m0.012s > sys 0m0.004s > > Thanks > Özhan > > > On Thu, Feb 9, 2017 at 11:30 AM, Wido den Hollander <w...@42on.com> wrote: >

Re: [ceph-users] Speeding Up "rbd ls -l " output

2017-02-09 Thread Wido den Hollander
> Op 9 februari 2017 om 9:13 schreef Özhan Rüzgar Karaman > : > > > Hi; > I am using the Hammer 0.94.9 release on my Ceph storage; today I noticed that > listing an rbd pool takes too much time compared to the old days. If I have more > rbd images in the pool it takes much more time.

Re: [ceph-users] would people mind a slow osd restart during luminous upgrade?

2017-02-08 Thread Wido den Hollander
> Op 9 februari 2017 om 4:09 schreef Sage Weil : > > > Hello, ceph operators... > > Several times in the past we've had to do some ondisk format conversion > during upgrade which meant that the first time the ceph-osd daemon started > after upgrade it had to spend a few

Re: [ceph-users] ceph df : negative numbers

2017-02-06 Thread Wido den Hollander
> Op 6 februari 2017 om 11:10 schreef Florent B : > > > # ceph -v > ceph version 10.2.5 (c461ee19ecbc0c5c330aca20f7392c9a00730367) > > (official Ceph packages for Jessie) > > > Yes I recently adjusted pg_num, but all objects were correctly rebalanced. > > Then a

Re: [ceph-users] CephFS read IO caching, where it is happining?

2017-02-03 Thread Wido den Hollander
is. > " > > On Thu, Feb 2, 2017 at 9:30 PM, Shinobu Kinjo <ski...@redhat.com> wrote: > > > You may want to add this in your FIO recipe. > > > > * exec_prerun=echo 3 > /proc/sys/vm/drop_caches > > > > Regards, > > > > On F

Re: [ceph-users] Experience with 5k RPM/archive HDDs

2017-02-03 Thread Wido den Hollander
63 or PM863 or an Intel DC series. All pools by default should go to those OSDs. Only the RGW buckets data pool should go to the big SMR drives. However, again, expect very, very low performance of those disks. Wido > Cheers, > Maxime > > On 03/02/17 09:40, "ceph-users on be

Re: [ceph-users] RGW authentication fail with AWS S3 v4

2017-02-03 Thread Wido den Hollander
> Op 3 februari 2017 om 9:52 schreef Khang Nguyễn Nhật > : > > > Hi all, > I'm using Ceph Object Gateway with S3 API (ceph-radosgw-10.2.5-0.el7.x86_64 > on CentOS Linux release 7.3.1611) and I use generate_presigned_url method > of boto3 to create rgw url. This

Re: [ceph-users] Experience with 5k RPM/archive HDDs

2017-02-03 Thread Wido den Hollander
> Op 3 februari 2017 om 8:39 schreef Christian Balzer : > > > > Hello, > > On Fri, 3 Feb 2017 10:30:28 +0300 Irek Fasikhov wrote: > > > Hi, Maxime. > > > > Linux SMR is only starting with version 4.9 kernel. > > > What Irek said. > > Also, SMR in general is probably a bad

Re: [ceph-users] CephFS read IO caching, where it is happining?

2017-02-02 Thread Wido den Hollander
> Op 2 februari 2017 om 15:35 schreef Ahmed Khuraidah : > > > Hi all, > > I am still confused about my CephFS sandbox. > > When I am performing simple FIO test into single file with size of 3G I > have too many IOps: > > cephnode:~ # fio payloadrandread64k3G > test:

Re: [ceph-users] [Ceph-mirrors] rsync service download.ceph.com partially broken

2017-01-31 Thread Wido den Hollander
> Op 31 januari 2017 om 13:46 schreef Björn Lässig : > > > Hi cephers, > > since some time i get errors while rsyncing from the ceph download server: > > download.ceph.com: > > rsync: send_files failed to open "/debian-jewel/db/lockfile" (in ceph): > Permission
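
The "(in ceph)" part of the rsync error gives the module name, so a hedged workaround while the server-side permissions are broken is to exclude the unreadable db/ directory; the local destination path is a placeholder:

    rsync -rtlv --exclude 'db/' \
        rsync://download.ceph.com/ceph/debian-jewel/ /srv/mirror/ceph/debian-jewel/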

Re: [ceph-users] mon.mon01 store is getting too big! 18119 MB >= 15360 MB -- 94% avail

2017-01-31 Thread Wido den Hollander
> Op 31 januari 2017 om 10:22 schreef Martin Palma : > > > Hi all, > > our cluster is currently performing a big expansion and is in recovery > mode (we doubled in size and osd# from 600 TB to 1.2 PB). > Yes, that is to be expected. When not all PGs are active+clean the MONs

Re: [ceph-users] ceph rados gw, select objects by metadata

2017-01-30 Thread Wido den Hollander
> Op 30 januari 2017 om 10:29 schreef Johann Schwarzmeier > : > > > Hello, > I’m quite new to ceph and radosgw. With the python API, I found calls > for writing objects via the boto API. It’s also possible to add metadata > to our objects. But now I have a question:

Re: [ceph-users] systemd and ceph-mon autostart on Ubuntu 16.04

2017-01-25 Thread Wido den Hollander
> Op 25 januari 2017 om 20:25 schreef Patrick Donnelly <pdonn...@redhat.com>: > > > On Wed, Jan 25, 2017 at 2:19 PM, Wido den Hollander <w...@42on.com> wrote: > > Hi, > > > > I thought this issue was resolved a while ago, but while testing Kraken >

Re: [ceph-users] Replacing an mds server

2017-01-24 Thread Wido den Hollander
> Op 24 januari 2017 om 22:08 schreef Goncalo Borges > : > > > Hi Jorge > Indeed my advice is to configure your high memory mds as a standby mds. Once > you restart the service in the low memory mds, the standby one should take > over without downtime and the

Re: [ceph-users] Ceph counters decrementing after changing pg_num

2017-01-20 Thread Wido den Hollander
> Op 20 januari 2017 om 17:17 schreef Kai Storbeck : > > > Hello ceph users, > > My graphs of several counters in our Ceph cluster are showing abnormal > behaviour after changing the pg_num and pgp_num respectively. What counters exactly? Like pg information? It could be that

Re: [ceph-users] rgw static website docs 404

2017-01-20 Thread Wido den Hollander
e the dev didn't want to write docs, he/she forgot or just didn't get to it yet. It would be very much appreciated if you would send a PR with the updated documentation :) Wido > -Ben > > On Thu, Jan 19, 2017 at 1:56 AM, Wido den Hollander <w...@42on.com> wrote: > &g

Re: [ceph-users] Change Partition Schema on OSD Possible?

2017-01-14 Thread Wido den Hollander
> Op 14 januari 2017 om 11:05 schreef Hauke Homburg : > > > Hello, > > In our Ceph cluster the OSD HDDs are configured with GPT > partitions that use only 50% of each disk for data. Can we change this schema to get more data storage? > How do you mean? > Our HDDs are 5TB so I hope to

Re: [ceph-users] All SSD cluster performance

2017-01-14 Thread Wido den Hollander
Roy <somnath@sandisk.com> wrote: > > > > > > Also, there are lot of discussion about SSDs not suitable for Ceph write > > > workload (with filestore) in community as those are not good for > > > odirect/odsync kind of writes. Hope your SSDs are tolerant

Re: [ceph-users] All SSD cluster performance

2017-01-13 Thread Wido den Hollander
> Op 13 januari 2017 om 20:33 schreef Mohammed Naser <mna...@vexxhost.com>: > > > > > On Jan 13, 2017, at 1:34 PM, Wido den Hollander <w...@42on.com> wrote: > > > >> > >> Op 13 januari 2017 om 18:50 schreef Mohammed Naser <mna...@vexx

Re: [ceph-users] rgw leaking data, orphan search loop

2017-01-13 Thread Wido den Hollander
> Op 24 december 2016 om 13:47 schreef Wido den Hollander <w...@42on.com>: > > > > > Op 23 december 2016 om 16:05 schreef Wido den Hollander <w...@42on.com>: > > > > > > > > > Op 22 december 2016 om 19:00 schreef Orit Wasse

Re: [ceph-users] All SSD cluster performance

2017-01-13 Thread Wido den Hollander
> Op 13 januari 2017 om 18:50 schreef Mohammed Naser <mna...@vexxhost.com>: > > > > > On Jan 13, 2017, at 12:41 PM, Wido den Hollander <w...@42on.com> wrote: > > > > > >> Op 13 januari 2017 om 18:39 schreef Mohammed Naser <mna...@vexxhos

Re: [ceph-users] All SSD cluster performance

2017-01-13 Thread Wido den Hollander
> Op 13 januari 2017 om 18:39 schreef Mohammed Naser <mna...@vexxhost.com>: > > > > > On Jan 13, 2017, at 12:37 PM, Wido den Hollander <w...@42on.com> wrote: > > > > > >> Op 13 januari 2017 om 18:18 schreef Mohammed Naser <mna...@vexx

Re: [ceph-users] All SSD cluster performance

2017-01-13 Thread Wido den Hollander
> Op 13 januari 2017 om 18:18 schreef Mohammed Naser : > > > Hi everyone, > > We have a deployment with 90 OSDs at the moment which is all SSD that’s not > hitting quite the performance that it should be in my opinion, a `rados > bench` run gives something along these

Re: [ceph-users] Write back cache removal

2017-01-12 Thread Wido den Hollander
> Op 10 januari 2017 om 22:05 schreef Nick Fisk <n...@fisk.me.uk>: > > > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Stuart Harland > Sent: 10 January 2017 11:58 > To: Wido den Hollander <w...@42on.com> > Cc: ceph new <ceph

Re: [ceph-users] bluestore upgrade 11.0.2 to 11.1.1 failed

2017-01-11 Thread Wido den Hollander
> Op 11 januari 2017 om 12:24 schreef Jayaram R : > > > Hello, > > > > We from Nokia are validating bluestore on a 3-node cluster with EC 2+1 > > > > While upgrading our cluster from Kraken 11.0.2 to 11.1.1 with bluestore, > the cluster affected more than half of

Re: [ceph-users] Write back cache removal

2017-01-10 Thread Wido den Hollander
> Op 10 januari 2017 om 9:52 schreef Nick Fisk <n...@fisk.me.uk>: > > > > -Original Message- > > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > > Wido den Hollander > > Sent: 10 January 2017 07:54 > > To: ceph ne

Re: [ceph-users] Write back cache removal

2017-01-09 Thread Wido den Hollander
> Op 9 januari 2017 om 13:02 schreef Stuart Harland > : > > > Hi, > > We’ve been operating a ceph storage system storing files using librados > (using a replicated pool on rust disks). We implemented a cache over the top > of this with SSDs, however we now

Re: [ceph-users] RGW pool usage is higher that total bucket size

2017-01-05 Thread Wido den Hollander
> Op 5 januari 2017 om 10:08 schreef Luis Periquito : > > > Hi, > > I have a cluster with RGW in which one bucket is really big, so every > so often we delete stuff from it. > > That bucket is now taking 3.3T after we deleted just over 1T from it. > That was done last
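
Deleted RGW objects are reclaimed asynchronously by the garbage collector, so pool usage can stay well above the bucket size for a while after large deletes. A sketch of checking the GC backlog with Jewel-era radosgw-admin:

    radosgw-admin gc list --include-all | head
    radosgw-admin gc process    # trigger a GC pass instead of waiting for the schedule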

Re: [ceph-users] Ceph all-possible configuration options

2017-01-03 Thread Wido den Hollander
> Op 3 januari 2017 om 13:05 schreef Rajib Hossen > : > > > Hello, > I am exploring ceph and installed a mini cluster with 1 mon and 3 osd nodes (3 > osd daemons per node). For that I wrote a ceph.conf file with only the needed > configuration options (see below) > >

Re: [ceph-users] Migrate cephfs metadata to SSD in running cluster

2017-01-03 Thread Wido den Hollander
l might be good to do since it's not that much data thus recovery will go quickly, but don't expect a CephFS performance improvement. Wido > Mike > > On 1/2/17 11:50 AM, Wido den Hollander wrote: > > > >> Op 2 januari 2017 om 10:33 schreef Shinobu Kinjo <ski...@redha

Re: [ceph-users] Cluster pause - possible consequences

2017-01-02 Thread Wido den Hollander
> Op 2 januari 2017 om 15:43 schreef Matteo Dacrema : > > > Increasing pg_num will lead to several slow requests and cluster freeze, but > because of the PG creation operation, from what I’ve seen until now. > During the creation period all the requests are frozen, and the

Re: [ceph-users] Migrate cephfs metadata to SSD in running cluster

2017-01-02 Thread Wido den Hollander
> Op 2 januari 2017 om 10:33 schreef Shinobu Kinjo : > > > I've never done migration of cephfs_metadata from spindle disks to > ssds. But logically you could achieve this through 2 phases. > > #1 Configure CRUSH rule including spindle disks and ssds > #2 Configure CRUSH

Re: [ceph-users] linux kernel version for clients

2016-12-31 Thread Wido den Hollander
> Op 31 december 2016 om 6:56 schreef Manuel Sopena Ballesteros > : > > > Hi, > > I have several questions regarding kernel running on client machines: > > > * Why is kernel 3.10 considered an old kernel to run ceph clients? > Development in the Ceph world

Re: [ceph-users] Unbalanced OSD's

2016-12-30 Thread Wido den Hollander
> Op 30 december 2016 om 11:06 schreef Kees Meijs : > > > Hi Asley, > > We experience (using Hammer) a similar issue. Not that I have a perfect > solution to share, but I felt like mentioning a "me too". ;-) > > On a side note: we configured correct weight per drive as well. >
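
One commonly used way to inspect and gently correct such an imbalance; reweighting moves data, so it is normally done in small steps at quiet times. The 110 threshold only touches OSDs that are more than 10% above the average utilisation:

    ceph osd df
    ceph osd reweight-by-utilization 110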

Re: [ceph-users] How to know if an object is stored in clients?

2016-12-29 Thread Wido den Hollander
> Op 28 december 2016 om 12:58 schreef Jaemyoun Lee : > > > Hello, > > I executed the RADOS tool to store an object as follows: > ``` > user@ClientA:~$ rados put -p=rbd objectA a.txt > ``` > > I wonder how the client knows a completion of storing the object in some >
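
A quick client-side check that the object actually landed in the pool, using the pool and object name from the example above:

    rados -p rbd stat objectA
    rados -p rbd ls | grep objectA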

Re: [ceph-users] Java librados issue

2016-12-27 Thread Wido den Hollander
1b472f6bb599e8ac0c22cf7d0d2e1949a;hb=HEAD#l90 it doesn't read a config file. Wido > Thanks, > Bogdan > > > On Tue, Dec 27, 2016 at 3:11 PM, Wido den Hollander <w...@42on.com> wrote: > > > > > > Op 26 december 2016 om 19:24 schreef Bogdan SOLGA <

Re: [ceph-users] Java librados issue

2016-12-27 Thread Wido den Hollander
> Op 26 december 2016 om 19:24 schreef Bogdan SOLGA : > > > Hello, everyone! > > I'm trying to integrate the Java port of librados > into our app, using this >

Re: [ceph-users] Atomic Operations?

2016-12-24 Thread Wido den Hollander
> Op 23 december 2016 om 21:14 schreef Kent Borg : > > > Hello, a newbie here! > > Doing some playing with Python and librados, and it is mostly easy to > use, but I am confused about atomic operations. The documentation isn't > clear to me, and Google isn't giving me

Re: [ceph-users] rgw leaking data, orphan search loop

2016-12-24 Thread Wido den Hollander
> Op 23 december 2016 om 16:05 schreef Wido den Hollander <w...@42on.com>: > > > > > Op 22 december 2016 om 19:00 schreef Orit Wasserman <owass...@redhat.com>: > > > > > > HI Maruis, > > > > On Thu, Dec 22, 2016 at 12:00 PM

Re: [ceph-users] BlueStore with v11.1.0 Kraken

2016-12-24 Thread Wido den Hollander
> Op 23 december 2016 om 14:34 schreef Eugen Leitl <eu...@leitl.org>: > > > > Hi Wido, > > thanks for your comments. > > On Fri, Dec 23, 2016 at 02:00:44PM +0100, Wido den Hollander wrote: > > > > My original layout was using 2x single Xeon nod

Re: [ceph-users] rgw leaking data, orphan search loop

2016-12-23 Thread Wido den Hollander
> Op 22 december 2016 om 19:00 schreef Orit Wasserman : > > > Hi Marius, > > On Thu, Dec 22, 2016 at 12:00 PM, Marius Vaitiekunas > wrote: > > On Thu, Dec 22, 2016 at 11:58 AM, Marius Vaitiekunas > > wrote: > >> >

Re: [ceph-users] BlueStore with v11.1.0 Kraken

2016-12-23 Thread Wido den Hollander
> Op 22 december 2016 om 14:36 schreef Eugen Leitl : > > > Hi guys, > > I'm building a first test cluster for my homelab, and would like to start > using BlueStore since data loss is not critical. However, there is > obviously no official documentation for basic best usage

Re: [ceph-users] What is pauserd and pausewr status?

2016-12-23 Thread Wido den Hollander
> Op 23 december 2016 om 10:31 schreef Stéphane Klein > <cont...@stephane-klein.info>: > > > 2016-12-22 18:09 GMT+01:00 Wido den Hollander <w...@42on.com>: > > > > > > Op 22 december 2016 om 17:55 schreef Stéphane Klein < > > cont...@ste

Re: [ceph-users] What is pauserd and pausewr status?

2016-12-22 Thread Wido den Hollander
> Op 22 december 2016 om 17:55 schreef Stéphane Klein > : > > > Hi, > > I have this status: > > bash-4.2# ceph status > cluster 7ecb6ebd-2e7a-44c3-bf0d-ff8d193e03ac > health HEALTH_WARN > pauserd,pausewr,sortbitwise,require_jewel_osds flag(s)
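
pauserd and pausewr mean that reads respectively writes are paused cluster-wide; they are normally set and cleared together through the pause flag, while sortbitwise and require_jewel_osds are expected flags on a Jewel cluster. A sketch of clearing them:

    ceph osd unset pause    # clears both the pauserd and pausewr flags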

Re: [ceph-users] cannot commit period: period does not have a master zone of a master zonegroup

2016-12-22 Thread Wido den Hollander
> Op 20 december 2016 om 18:06 schreef Orit Wasserman <owass...@redhat.com>: > > > On Tue, Dec 20, 2016 at 5:39 PM, Wido den Hollander <w...@42on.com> wrote: > > > >> Op 15 december 2016 om 17:10 schreef Orit Wasserman <owass...@redhat.com>: > >

Re: [ceph-users] When Zero isn't 0 (Crush weight mysteries)

2016-12-21 Thread Wido den Hollander
> Op 21 december 2016 om 2:39 schreef Christian Balzer : > > > > Hello, > > I just (manually) added 1 OSD each to my 2 cache-tier nodes. > The plan was/is to actually do the data-migration at the least busy day > in Japan, New Years (the actual holiday is January 2nd this

Re: [ceph-users] Unwanted automatic restart of daemons during an upgrade since 10.2.5 (on Trusty)

2016-12-20 Thread Wido den Hollander
> Op 20 december 2016 om 17:13 schreef Francois Lafont > <francois.lafont.1...@gmail.com>: > > > On 12/20/2016 10:02 AM, Wido den Hollander wrote: > > > I think it is commit 0cdf3bc875447c87fdc0fed29831554277a3774b: > >

Re: [ceph-users] tracker.ceph.com

2016-12-20 Thread Wido den Hollander
> Op 20 december 2016 om 17:31 schreef Nathan Cutler : > > > > Looks like it was trying to send mail over IPv6 and failing. > > > > I switched back to postfix, disabled IPv6, and show a message was > > recently queued for delivery to you. Please confirm you got it. > > Got

Re: [ceph-users] Upgrading from Hammer

2016-12-20 Thread Wido den Hollander
how to make sure we really didn't? > If you didn't touch them nothing happened. You can download the CRUSHMap and check the tunables set on top after decompiling it. See: http://docs.ceph.com/docs/master/rados/operations/crush-map/#tunables Wido > Regards, > Kees > > On 20-12-16 10
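
A sketch of the check described above; the temporary paths are placeholders:

    ceph osd crush show-tunables
    ceph osd getcrushmap -o /tmp/crushmap.bin
    crushtool -d /tmp/crushmap.bin -o /tmp/crushmap.txt
    grep ^tunable /tmp/crushmap.txt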

Re: [ceph-users] Upgrading from Hammer

2016-12-20 Thread Wido den Hollander
t. A Hammer/Jewel client can talk to a Hammer/Jewel cluster. One thing, don't change any CRUSH tunables if the cluster runs Jewel and the client is still on Hammer. The librados/librbd version is what matters. If you upgrade the cluster to Jewel and leave the client on Hammer it works. Wido &g

Re: [ceph-users] How exactly does rgw work?

2016-12-20 Thread Wido den Hollander
> Op 20 december 2016 om 3:24 schreef Gerald Spencer : > > > Hello all, > > We're currently waiting on a delivery of equipment for a small 50TB proof > of concept cluster, and I've been lurking/learning a ton from you. Thanks > for how active everyone is. > >

Re: [ceph-users] Unwanted automatic restart of daemons during an upgrade since 10.2.5 (on Trusty)

2016-12-20 Thread Wido den Hollander
> Op 20 december 2016 om 0:52 schreef Francois Lafont > : > > > Hi, > > On 12/19/2016 09:58 PM, Ken Dreyer wrote: > > > I looked into this again on a Trusty VM today. I set up a single > > mon+osd cluster on v10.2.3, with the following: > > > > # status

Re: [ceph-users] CephFS metdata inconsistent PG Repair Problem

2016-12-19 Thread Wido den Hollander
> Op 19 december 2016 om 18:14 schreef Sean Redmond : > > > Hi Ceph-Users, > > I have been running into a few issues with CephFS metadata pool corruption > over the last few weeks. For background please see > tracker.ceph.com/issues/17177 > > # ceph -v > ceph version

Re: [ceph-users] ceph and rsync

2016-12-16 Thread Wido den Hollander
> Op 16 december 2016 om 9:26 schreef Alessandro Brega > : > > > Hi guys, > > I'm running a ceph cluster using 0.94.9-1trusty release on XFS for RBD > only. I'd like to replace some SSDs because they are close to their TBW. > > I know I can simply shutdown the

Re: [ceph-users] Monitors stores not trimming after upgrade from Dumpling to Hammer

2016-12-15 Thread Wido den Hollander
> Op 7 november 2016 om 13:17 schreef Wido den Hollander <w...@42on.com>: > > > > > Op 4 november 2016 om 2:05 schreef Joao Eduardo Luis <j...@suse.de>: > > > > > > On 11/03/2016 06:18 PM, w...@42on.com wrote: > > > > > >>

[ceph-users] cannot commit period: period does not have a master zone of a master zonegroup

2016-12-15 Thread Wido den Hollander
Hi, On a Ceph cluster running Jewel 10.2.5 I'm running into a problem. I want to change the number of shards:

# radosgw-admin zonegroup-map get > zonegroup.json
# nano zonegroup.json
# radosgw-admin zonegroup-map set --infile zonegroup.json
# radosgw-admin period update --commit

Now, the error
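
This error often comes up when no zone and zonegroup are flagged as master, for example after an upgrade to Jewel. A hedged sketch of the commonly cited fix for a plain single-site setup, assuming the zone and zonegroup are both called "default":

    radosgw-admin zonegroup modify --rgw-zonegroup=default --master --default
    radosgw-admin zone modify --rgw-zone=default --rgw-zonegroup=default --master --default
    radosgw-admin period update --commit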

Re: [ceph-users] Upgrading from Hammer

2016-12-13 Thread Wido den Hollander
> Op 13 december 2016 om 9:05 schreef Kees Meijs : > > > Hi guys, > > In the past few months, I've read some posts about upgrading from > Hammer. Maybe I've missed something, but I didn't really read something > on QEMU/KVM behaviour in this context. > > At the moment, we're

Re: [ceph-users] Crush rule check

2016-12-12 Thread Wido den Hollander
ize = 4. Wido > thanks, > Adrian > > > > -Original Message- > > From: Wido den Hollander [mailto:w...@42on.com] > > Sent: Monday, 12 December 2016 7:07 PM > > To: ceph-users@lists.ceph.com; Adrian Saul > > Subject: Re: [ceph-users] Crush rule chec

Re: [ceph-users] Crush rule check

2016-12-12 Thread Wido den Hollander
> Op 10 december 2016 om 12:45 schreef Adrian Saul > : > > > > Hi Ceph-users, > I just want to double check a new crush ruleset I am creating - the intent > here is that over 2 DCs, it will select one DC, and place two copies on > separate hosts in that DC.
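
A sketch of how such a rule can be sanity-checked offline before any data moves; the rule id is a placeholder and --num-rep 4 matches the intended two copies per DC across two DCs:

    ceph osd getcrushmap -o /tmp/crushmap.bin
    crushtool -i /tmp/crushmap.bin --test --rule 1 --num-rep 4 --show-mappings | head -20
    crushtool -i /tmp/crushmap.bin --test --rule 1 --num-rep 4 --show-bad-mappings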

Re: [ceph-users] 2x replication: A BIG warning

2016-12-11 Thread Wido den Hollander
> Op 9 december 2016 om 22:31 schreef Oliver Humpage <oli...@watershed.co.uk>: > > > > > On 7 Dec 2016, at 15:01, Wido den Hollander <w...@42on.com> wrote: > > > > I would always run with min_size = 2 and manually switch to min_size = 1 if > >

Re: [ceph-users] [EXTERNAL] Re: 2x replication: A BIG warning

2016-12-07 Thread Wido den Hollander
to 2 I've seen many data loss situations and that is why I started this thread in the first place. min_size is just an additional protection mechanism against data loss. Wido > I guess it’s just where you want to put that needle on the spectrum of > availability vs integrity. > > On 12/

Re: [ceph-users] [EXTERNAL] Re: 2x replication: A BIG warning

2016-12-07 Thread Wido den Hollander
is a manual operation. Without doing anything the PG will be marked as down+incomplete Wido > > Mit freundlichen Grüßen / best regards, > Kevin Olbrich. > > 2016-12-07 21:10 GMT+01:00 Wido den Hollander <w...@42on.com>: > > > > > >

Re: [ceph-users] [EXTERNAL] Re: 2x replication: A BIG warning

2016-12-07 Thread Wido den Hollander
You managed to get #2 back, but it doesn't have the changes which #3 had. The result is corrupted data. Does this make sense? Wido > On 12/7/16, 9:11 AM, "ceph-users on behalf of LOIC DEVULDER" > <ceph-users-boun...@lists.ceph.com on behalf of loic.devul...@mpsa.com> wrote:

Re: [ceph-users] CephFS recovery from missing metadata objects questions

2016-12-07 Thread Wido den Hollander
> Op 7 december 2016 om 20:54 schreef John Spray <jsp...@redhat.com>: > > > On Wed, Dec 7, 2016 at 7:47 PM, Wido den Hollander <w...@42on.com> wrote: > > > >> Op 7 december 2016 om 16:53 schreef John Spray <jsp...@redhat.com>: > >> > >

Re: [ceph-users] CephFS recovery from missing metadata objects questions

2016-12-07 Thread Wido den Hollander
> Op 7 december 2016 om 16:53 schreef John Spray <jsp...@redhat.com>: > > > On Wed, Dec 7, 2016 at 3:46 PM, Wido den Hollander <w...@42on.com> wrote: > > > >> Op 7 december 2016 om 16:38 schreef John Spray <jsp...@redhat.com>: > >> > >

Re: [ceph-users] CephFS recovery from missing metadata objects questions

2016-12-07 Thread Wido den Hollander
> Op 7 december 2016 om 16:38 schreef John Spray <jsp...@redhat.com>: > > > On Wed, Dec 7, 2016 at 3:28 PM, Wido den Hollander <w...@42on.com> wrote: > > (I think John knows the answer, but sending to ceph-users for archival > > purposes) > > >

[ceph-users] CephFS recovery from missing metadata objects questions

2016-12-07 Thread Wido den Hollander
(I think John knows the answer, but sending to ceph-users for archival purposes) Hi John, A Ceph cluster lost a PG with CephFS metadata in there and it is currently doing a CephFS disaster recovery as described here: http://docs.ceph.com/docs/master/cephfs/disaster-recovery/ This data pool

Re: [ceph-users] 2x replication: A BIG warning

2016-12-07 Thread Wido den Hollander
> Op 7 december 2016 om 15:54 schreef LOIC DEVULDER : > > > Hi Wido, > > > As a Ceph consultant I get numerous calls throughout the year to help people > > with getting their broken Ceph clusters back online. > > > > The causes of downtime vary vastly, but one of the

Re: [ceph-users] 2x replication: A BIG warning

2016-12-07 Thread Wido den Hollander
e data you do not want to lose. Wido > Thanks! > > Regards, > Kees > > On 07-12-16 09:08, Wido den Hollander wrote: > > As a Ceph consultant I get numerous calls throughout the year to help > > people with getting their broken Ceph clusters back online. > >

Re: [ceph-users] 2x replication: A BIG warning

2016-12-07 Thread Wido den Hollander
2, > provided we can get the needed performance. > > > > On Wed, Dec 7, 2016 at 9:08 AM, Wido den Hollander <w...@42on.com> wrote: > > Hi, > > > > As a Ceph consultant I get numerous calls throughout the year to help > > people with getting their

[ceph-users] 2x replication: A BIG warning

2016-12-07 Thread Wido den Hollander
Hi, As a Ceph consultant I get numerous calls throughout the year to help people with getting their broken Ceph clusters back online. The causes of downtime vary vastly, but one of the biggest causes is that people use replication 2x. size = 2, min_size = 1. In 2016 the amount of cases I have
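
For reference, a sketch of moving an existing pool to the values recommended here; the pool name is a placeholder, and going from 2 to 3 replicas triggers backfill and consumes extra raw space:

    ceph osd pool set mypool size 3
    ceph osd pool set mypool min_size 2
    ceph osd pool get mypool size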

Re: [ceph-users] CEPH mirror down again

2016-11-25 Thread Wido den Hollander
> Op 26 november 2016 om 5:13 schreef "Andrus, Brian Contractor" > : > > > Hmm. Apparently download.ceph.com = us-west.ceph.com > And there is no repomd.xml on us-east.ceph.com > You could check http://us-east.ceph.com/timestamp to see how far behind it is on
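
A quick way to see how stale a mirror is, using the timestamp URL mentioned above (the exact format of the file may vary):

    curl -s http://us-east.ceph.com/timestamp
    date -u    # compare against the current time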

Re: [ceph-users] Missing heartbeats, OSD spending time reconnecting - possible bug?

2016-11-11 Thread Wido den Hollander
> Op 11 november 2016 om 14:23 schreef Trygve Vea > : > > > Hi, > > We recently experienced a problem with a single OSD. This occurred twice. > > The problem manifested itself thus: > > - 8 placement groups stuck peering, all of which had the problematic OSD
