[ceph-users] corrupted rbd filesystems since jewel

2017-05-03 Thread Stefan Priebe - Profihost AG
Hello, since we've upgraded from hammer to jewel 10.2.7 and enabled exclusive-lock,object-map,fast-diff we've had problems with corrupted VM filesystems. Sometimes the VMs just crash with FS errors and a restart can solve the problem. Sometimes the whole VM is not even bootable and we need

Re: [ceph-users] Ceph newbie thoughts and questions

2017-05-03 Thread David Turner
The clients will need to be able to contact the mons and the osds. NEVER use 2 mons. Mons are a quorum and work best with odd numbers (1, 3, 5, etc). 1 mon is better than 2 mons. It is better to remove the raid and put the individual disks as OSDs. Ceph handles the redundancy through replica
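
The quorum arithmetic behind that advice can be sketched as follows (my illustration, not part of David's mail): a monitor quorum needs a strict majority, so an even monitor count never tolerates more failures than the odd count below it.

```python
# Hypothetical helpers illustrating why "NEVER use 2 mons": a Paxos
# quorum requires a strict majority of monitors, so with 2 mons the
# loss of either one breaks quorum, while 3 mons survive one failure.
def quorum_size(mons: int) -> int:
    """Smallest strict majority of `mons` monitors."""
    return mons // 2 + 1

def tolerated_failures(mons: int) -> int:
    """Monitors that can fail while a quorum can still form."""
    return mons - quorum_size(mons)

for n in (1, 2, 3, 4, 5):
    print(f"{n} mons: quorum={quorum_size(n)}, "
          f"tolerated failures={tolerated_failures(n)}")
```

Note that 2 mons tolerate zero failures, exactly like 1 mon, while adding a second machine doubles the chance that one of them is down.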

Re: [ceph-users] RGW 10.2.5->10.2.7 authentication fail?

2017-05-03 Thread Łukasz Jagiełło
Hi Radek, I can confirm, v10.2.7 without 2 commits you mentioned earlier works as expected. Best, On Wed, May 3, 2017 at 2:59 AM, Radoslaw Zarzynski wrote: > Hello Łukasz, > > Thanks for your testing and sorry for my mistake. It looks that two commits > need to be

Re: [ceph-users] Intel power tuning - 30% throughput performance increase

2017-05-03 Thread Haomai Wang
refer to https://github.com/ceph/ceph/pull/5013 On Thu, May 4, 2017 at 7:56 AM, Brad Hubbard wrote: > +ceph-devel to get input on whether we want/need to check the value of > /dev/cpu_dma_latency (platform dependent) at startup and issue a > warning, or whether documenting

Re: [ceph-users] Intel power tuning - 30% throughput performance increase

2017-05-03 Thread Brad Hubbard
+ceph-devel to get input on whether we want/need to check the value of /dev/cpu_dma_latency (platform dependent) at startup and issue a warning, or whether documenting this would suffice? Any doc contribution would be welcomed. On Wed, May 3, 2017 at 7:18 PM, Blair Bethwaite
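
A minimal sketch of what such a startup check might look like (hypothetical code, not from Ceph): /dev/cpu_dma_latency exposes the current PM QoS DMA latency tolerance as a native-endian 32-bit integer in microseconds.

```python
# Hypothetical startup check for the PM QoS DMA latency target.
import os
import struct

def decode_dma_latency(raw: bytes) -> int:
    """Decode the 4-byte native-endian value read from /dev/cpu_dma_latency."""
    return struct.unpack("i", raw[:4])[0]

def warn_if_deep_cstates(threshold_us: int = 5):
    """Read the PM QoS latency target; warn if deep C-states are permitted."""
    path = "/dev/cpu_dma_latency"
    if not os.path.exists(path):  # non-Linux or restricted environment
        return None
    with open(path, "rb") as f:
        latency_us = decode_dma_latency(f.read(4))
    if latency_us > threshold_us:
        print(f"warning: cpu_dma_latency is {latency_us} us; "
              f"deep C-states may add I/O latency")
    return latency_us
```

Merely reading the file changes nothing; to pin the latency a process must write the desired value and keep the file descriptor open, which is what tuned's pmqos plugin does.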

Re: [ceph-users] kernel BUG at fs/ceph/inode.c:1197

2017-05-03 Thread Brad Hubbard
+ceph-devel On Thu, May 4, 2017 at 12:51 AM, James Poole wrote: > Hello, > > We currently have a ceph cluster supporting an Openshift cluster using > cephfs and dynamic rbd provisioning. The client nodes appear to be > triggering a kernel bug and are rebooting

[ceph-users] Ceph newbie thoughts and questions

2017-05-03 Thread Marcus Pedersén
Hello everybody! I am a newbie on ceph and I really like it and want to try it out. I have a couple of thoughts and questions after reading documentation and need some help to see that I am on the right path. Today I have two file servers in production that I want to start my ceph fs on and

Re: [ceph-users] RBD behavior for reads to a volume with no data written

2017-05-03 Thread Prashant Murthy
Thanks for the detailed explanation, Jason. It makes sense that such operations would end up being a few metadata lookups only (and the metadata lookups will hit the disk only if they are not cached in-memory). Prashant On Tue, May 2, 2017 at 11:29 AM, Jason Dillaman wrote:

Re: [ceph-users] Changing replica size of a running pool

2017-05-03 Thread David Turner
Those are both things that people have done and both work. Neither is optimal, but both options work fine. The best option is definitely to just get a third node now, as you aren't going to be getting additional space from it later. Your usable space between a 2 node size 2 cluster and a
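
David's point about usable space can be checked with simple arithmetic (my helper, not from the thread): a replicated pool's usable capacity is the raw capacity divided by the replica count, assuming CRUSH places every replica on a distinct node.

```python
# Back-of-envelope usable capacity of a replicated pool.
def usable_tb(nodes: int, tb_per_node: float, size: int) -> float:
    """Usable TB for `nodes` equal nodes and a pool replica count `size`."""
    if nodes < size:
        raise ValueError("need at least `size` nodes for host-level replication")
    return nodes * tb_per_node / size

print(usable_tb(2, 10, 2))  # 2 nodes of 10 TB, size=2 -> 10.0 TB usable
print(usable_tb(3, 10, 3))  # 3 nodes of 10 TB, size=3 -> 10.0 TB usable
```

So moving from 2 nodes at size 2 to 3 nodes at size 3 buys extra redundancy, not extra usable space, which is why the third node is best bought up front.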

[ceph-users] Changing replica size of a running pool

2017-05-03 Thread Maximiliano Venesio
Guys, hi. I have a Jewel cluster composed of two storage servers, which are configured in the crush map as different buckets to store data. I have to configure two new pools on this cluster with the certainty that I'll have to add more servers in the short term. Taking into account that the

Re: [ceph-users] Help! how to create multiple zonegroups in single realm?

2017-05-03 Thread yiming xie
./bin/radosgw-admin -c ./run/c2/ceph.conf period update --commit --rgw-zonegroup=default --rgw-zone=default 2017-05-03 05:34:23.886966 7f9e6e0036c0 0 failed reading zonegroup info: ret -2 (2) No such file or directory couldn't init storage provider I will file this issue. Thanks for your

[ceph-users] kernel BUG at fs/ceph/inode.c:1197

2017-05-03 Thread James Poole
Hello, We currently have a ceph cluster supporting an Openshift cluster using cephfs and dynamic rbd provisioning. The client nodes appear to be triggering a kernel bug and are rebooting unexpectedly with the same message each time. Clients are running CentOS 7: KERNEL:

[ceph-users] Spurious 'incorrect nilfs2 checksum' breaking ceph OSD

2017-05-03 Thread Matthew Vernon
Hi, This has bitten us a couple of times now (such that we're considering re-building util-linux with the nilfs2 code commented out), so I'm wondering if anyone else has seen it [and noting the failure mode in case anyone else is confused in future] We see this with our setup of rotating media

Re: [ceph-users] Help! how to create multiple zonegroups in single realm?

2017-05-03 Thread Orit Wasserman
On Wed, May 3, 2017 at 12:13 PM, Orit Wasserman wrote: > > > On Wed, May 3, 2017 at 12:05 PM, yiming xie wrote: > >> Cluster c2 have not *zone:us-1* >> >> ./bin/radosgw-admin -c ./run/c2/ceph.conf period update --commit >> --rgw-zonegroup=us

Re: [ceph-users] cephfs metadata damage and scrub error

2017-05-03 Thread James Eckersall
Hi David, Thanks for the reply, it's appreciated. We're going to upgrade the cluster to Kraken and see if that fixes the metadata issue. J On 2 May 2017 at 17:00, David Zafman wrote: > > James, > > You have an omap corruption. It is likely caused by a bug which has >

[ceph-users] CDM tonight @ 9p EDT

2017-05-03 Thread Patrick McGarry
Hey cephers, Just a reminder that the monthly Ceph Developer call is tonight at 9p EDT as we're on an APAC-friendly month. http://wiki.ceph.com/Planning Please add any ongoing work to the wiki so that we can discuss. Thanks!

Re: [ceph-users] Increase PG or reweight OSDs?

2017-05-03 Thread Luis Periquito
TL;DR: add the OSDs and then split the PGs They are different commands for different situations... changing the weight is to have a bigger number of nodes/devices. Depending on the size of cluster, the size of the devices, how busy it is and by how much you're growing it will have some different
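
The usual rule of thumb behind "add the OSDs and then split the PGs" can be sketched like this (the ~100 PGs per OSD target is the common pgcalc guideline, an assumption on my part rather than a figure from Luis's mail). Since pg_num can only ever be increased, you size it for the cluster after the new OSDs are in:

```python
# Rule-of-thumb PG count: ~100 PGs per OSD, divided by the replica
# count, rounded up to the next power of two.
def target_pg_num(osds: int, size: int, pgs_per_osd: int = 100) -> int:
    """Suggested pg_num for `osds` OSDs and replica count `size`."""
    raw = osds * pgs_per_osd / size
    p = 1
    while p < raw:
        p *= 2
    return p

print(target_pg_num(15, 3))  # 15 OSDs, size 3 -> 512
```

Computing this once for the post-expansion OSD count avoids splitting PGs twice.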

Re: [ceph-users] Increase PG or reweight OSDs?

2017-05-03 Thread M Ranga Swami Reddy
+ Ceph-devel On Tue, May 2, 2017 at 11:46 AM, M Ranga Swami Reddy wrote: > Hello, > I have added 5 new Ceph OSD nodes to my ceph cluster. Here, I wanted > to increase PG/PGP numbers of pools based new OSDs count. Same time > need to increase the newly added OSDs weight from

Re: [ceph-users] RGW 10.2.5->10.2.7 authentication fail?

2017-05-03 Thread Radoslaw Zarzynski
Hello Łukasz, Thanks for your testing and sorry for my mistake. It looks that two commits need to be reverted to get the previous behaviour: The already mentioned one: https://github.com/ceph/ceph/commit/c9445faf7fac2ccb8a05b53152c0ca16d7f4c6d0 Its dependency:

Re: [ceph-users] Intel power tuning - 30% throughput performance increase

2017-05-03 Thread Blair Bethwaite
On 3 May 2017 at 19:07, Dan van der Ster wrote: > Whether cpu_dma_latency should be 0 or 1, I'm not sure yet. I assume > your 30% boost was when going from throughput-performance to > dma_latency=0, right? I'm trying to understand what is the incremental > improvement from 1

Re: [ceph-users] Help! how to create multiple zonegroups in single realm?

2017-05-03 Thread Orit Wasserman
On Wed, May 3, 2017 at 12:05 PM, yiming xie wrote: > Cluster c2 does not have *zone:us-1* > > ./bin/radosgw-admin -c ./run/c2/ceph.conf period update --commit > --rgw-zonegroup=us --rgw-zone=us-1 > try --rgw-zonegroup=default --rgw-zone=default. Could you open a tracker

Re: [ceph-users] Intel power tuning - 30% throughput performance increase

2017-05-03 Thread Nick Fisk
> -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Blair Bethwaite > Sent: 03 May 2017 09:53 > To: Dan van der Ster > Cc: ceph-users@lists.ceph.com > Subject: Re: [ceph-users] Intel power tuning - 30% throughput

Re: [ceph-users] Intel power tuning - 30% throughput performance increase

2017-05-03 Thread Dan van der Ster
On Wed, May 3, 2017 at 10:52 AM, Blair Bethwaite wrote: > On 3 May 2017 at 18:38, Dan van der Ster wrote: >> Seems to work for me, or? > > Yeah now that I read the code more I see it is opening and > manipulating /dev/cpu_dma_latency in response to

Re: [ceph-users] Help! how to create multiple zonegroups in single realm?

2017-05-03 Thread yiming xie
Cluster c2 does not have zone:us-1 ./bin/radosgw-admin -c ./run/c2/ceph.conf period update --commit --rgw-zonegroup=us --rgw-zone=us-1 2017-05-03 05:01:30.219721 7efcff2606c0 1 Cannot find zone id= (name=us-1), switching to local zonegroup configuration 2017-05-03 05:01:30.222956 7efcff2606c0 -1

Re: [ceph-users] Help! how to create multiple zonegroups in single realm?

2017-05-03 Thread Orit Wasserman
On Wed, May 3, 2017 at 11:51 AM, yiming xie wrote: > I run > ./bin/radosgw-admin -c ./run/c2/ceph.conf period update --commit > > try adding --rgw-zonegroup=us1 --rgw-zone=us-1 > the error: > 2017-05-03 04:46:10.298103 7fdb2e4226c0 1 Cannot find zone >

Re: [ceph-users] Intel power tuning - 30% throughput performance increase

2017-05-03 Thread Blair Bethwaite
On 3 May 2017 at 18:38, Dan van der Ster wrote: > Seems to work for me, or? Yeah now that I read the code more I see it is opening and manipulating /dev/cpu_dma_latency in response to that option, so the TODO comment seems to be outdated. I verified tuned latency-performance

Re: [ceph-users] Help! how to create multiple zonegroups in single realm?

2017-05-03 Thread yiming xie
I run ./bin/radosgw-admin -c ./run/c2/ceph.conf period update --commit the error: 2017-05-03 04:46:10.298103 7fdb2e4226c0 1 Cannot find zone id=0cae32e6-82d5-489f-adf5-99e92c70f86f (name=us-2), switching to local zonegroup configuration 2017-05-03 04:46:10.300145 7fdb2e4226c0 -1 Cannot find

Re: [ceph-users] Help! how to create multiple zonegroups in single realm?

2017-05-03 Thread Orit Wasserman
On Wed, May 3, 2017 at 11:36 AM, yiming xie wrote: > Hi Orit: > Thanks for your reply. > > When I recreate the secondary zone group, there is still an error! > > radosgw-admin realm pull --url=http://localhost:8001 > --access-key=$SYSTEM_ACCESS_KEY >

Re: [ceph-users] Intel power tuning - 30% throughput performance increase

2017-05-03 Thread Dan van der Ster
On Wed, May 3, 2017 at 10:32 AM, Blair Bethwaite wrote: > On 3 May 2017 at 18:15, Dan van der Ster wrote: >> It looks like el7's tuned natively supports the pmqos interface in >> plugins/plugin_cpu.py. > > Ahha, you are right, but I'm sure I tested

Re: [ceph-users] Help! how to create multiple zonegroups in single realm?

2017-05-03 Thread yiming xie
Hi Orit: Thanks for your reply. When I recreate the secondary zone group, there is still an error! radosgw-admin realm pull --url=http://localhost:8001 --access-key=$SYSTEM_ACCESS_KEY --secret=$SYSTEM_SECRET_KEY --default radosgw-admin period pull

Re: [ceph-users] Intel power tuning - 30% throughput performance increase

2017-05-03 Thread Blair Bethwaite
On 3 May 2017 at 18:15, Dan van der Ster wrote: > It looks like el7's tuned natively supports the pmqos interface in > plugins/plugin_cpu.py. Ahha, you are right, but I'm sure I tested tuned and it did not help. Thanks for pointing out this script, I had not noticed it

Re: [ceph-users] Intel power tuning - 30% throughput performance increase

2017-05-03 Thread Blair Bethwaite
Hi Dan, On 3 May 2017 at 17:43, Dan van der Ster wrote: > We use cpu_dma_latency=1, because it was in the latency-performance profile. > And indeed by setting cpu_dma_latency=0 on one of our OSD servers, > powertop now shows the package as 100% in turbo mode. I tried both 0

Re: [ceph-users] Intel power tuning - 30% throughput performance increase

2017-05-03 Thread Dan van der Ster
On Wed, May 3, 2017 at 9:13 AM, Blair Bethwaite wrote: > We did the latter using the pmqos_static.py, which was previously part of > the RHEL6 tuned latency-performance profile, but seems to have been dropped > in RHEL7 (don't yet know why), It looks like el7's tuned

Re: [ceph-users] Help! how to create multiple zonegroups in single realm?

2017-05-03 Thread Orit Wasserman
Hi, On Wed, May 3, 2017 at 11:00 AM, yiming xie wrote: > Hi orit: > I tried to create multiple zonegroups in a single realm, but failed. Please tell > me the correct way to create multiple zonegroups. > Thanks a lot!! > > 1.create the first zone group on the c1 cluster >

Re: [ceph-users] Sharing SSD journals and SSD drive choice

2017-05-03 Thread Willem Jan Withagen
On 02-05-17 23:53, David Turner wrote: > I was only interjecting on the comment "So that is 5 MBps. Which is real > easy to obtain" and commenting on what the sustained writes into a > cluster of 2,000 OSDs would require to actually sustain that 5 MBps on > each SSD journal. Reading your calculation
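
The arithmetic being debated can be reconstructed roughly as follows (the concrete numbers are assumptions for illustration, not figures David confirmed): every client byte is written once per replica, and each replica write passes through one of the shared journal SSDs.

```python
# Rough model of sustained journal-SSD write load in a replicated cluster.
def journal_mbps(cluster_ingest_mbps: float, replicas: int,
                 osds: int, osds_per_journal_ssd: int) -> float:
    """Average sustained write rate each journal SSD sees, in MBps."""
    journal_ssds = osds / osds_per_journal_ssd
    return cluster_ingest_mbps * replicas / journal_ssds

# With 2,000 OSDs, 4 OSDs per journal SSD (500 SSDs) and 3x replication,
# sustaining ~5 MBps per journal implies roughly 833 MBps of client ingest.
print(journal_mbps(833.0, 3, 2000, 4))
```

Inverting the formula shows why a seemingly small per-journal figure corresponds to a very large aggregate client write rate at that scale.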

[ceph-users] Help! how to create multiple zonegroups in single realm?

2017-05-03 Thread yiming xie
Hi orit: I tried to create multiple zonegroups in a single realm, but failed. Please tell me the correct way to create multiple zonegroups. Thanks a lot!! >> 1.create the first zone group on the c1 cluster >> ./bin/radosgw-admin -c ./run/c1/ceph.conf realm create --rgw-realm=earth >> --default >>

Re: [ceph-users] Intel power tuning - 30% throughput performance increase

2017-05-03 Thread Luis Periquito
One of the things I've noticed in the latest (3+ years) batch of CPUs is that they increasingly ignore the CPU frequency scaling drivers and do what they want. More than that, interfaces like /proc/cpuinfo report completely incorrect values. I keep checking the real frequencies using applications like "i7z", and it

[ceph-users] Intel power tuning - 30% throughput performance increase

2017-05-03 Thread Blair Bethwaite
Hi all, We recently noticed that despite having BIOS power profiles set to performance on our RHEL7 Dell R720 Ceph OSD nodes, that CPU frequencies never seemed to be getting into the top of the range, and in fact spent a lot of time in low C-states despite that BIOS option supposedly disabling

[ceph-users] Help! create the secondary zone group failed!

2017-05-03 Thread yiming xie
>I tried to create two zone groups, but I have a problem. >I do not know where the mistake in this process is. > > 1.create the first zone group on the c1 cluster > ./bin/radosgw-admin -c ./run/c1/ceph.conf realm create --rgw-realm=earth > --default > ./bin/radosgw-admin -c