Re: [ceph-users] Giant or Firefly for production

2014-12-05 Thread Antonio Messina
On Fri, Dec 5, 2014 at 2:24 AM, Anthony Alba ascanio.al...@gmail.com wrote: Hi Cephers, Have any of you decided to put Giant into production instead of Firefly? This is very interesting to me too: we are going to deploy a large ceph cluster on Ubuntu 14.04 LTS, and so far what I have found

[ceph-users] weird 'ceph-deploy disk list nodename' command output, Invalid partition data

2014-12-05 Thread 张帆
hi all, When I run the command 'ceph-deploy disk list nodename', there are some warning messages indicating a partition table error, but the ceph cluster is working normally. What is the problem? Should I run the sgdisk command to repair the partition table? Below are the warning messages:
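A read-only sgdisk check is one way to see what it is complaining about before rewriting anything; this is only a sketch, and /dev/sdb is a placeholder for the OSD disk in question:

    sgdisk --print /dev/sdb     # print the GPT partition table
    sgdisk --verify /dev/sdb    # report GPT problems without modifying the disk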

Re: [ceph-users] Giant osd problems - loss of IO

2014-12-05 Thread Andrei Mikhailovsky
Jake, very useful indeed. It looks like I had a similar problem regarding the heartbeat and, as you have mentioned, I've not seen such issues on Firefly. However, I've not seen any osd crashes. Could you please let me know where you got the sysctl.conf tunings from? Was it recommended
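The actual tunings referenced above are not quoted in this preview; what usually gets passed around for OSD heartbeat trouble are network buffer settings, so the sysctl.conf fragment below is only an illustrative guess, not the poster's configuration:

    # /etc/sysctl.conf - illustrative values only
    net.core.rmem_max = 16777216
    net.core.wmem_max = 16777216
    net.ipv4.tcp_rmem = 4096 87380 16777216
    net.ipv4.tcp_wmem = 4096 65536 16777216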

[ceph-users] AWS SDK and MultiPart Problem

2014-12-05 Thread Georgios Dimitrakakis
Hi all! I am using AWS SDK JS v.2.0.29 to perform a multipart upload into Radosgw with ceph version 0.80.7 (6c0127fcb58008793d3c8b62d925bc91963672a3) and I am getting a 403 error. I believe that the ID which is sent with all requests and has been urlencoded by the aws-sdk-js doesn't match

Re: [ceph-users] AWS SDK and MultiPart Problem

2014-12-05 Thread Georgios Dimitrakakis
For example, if I try to perform the same multipart upload with an older version, ceph version 0.72.2 (a913ded2ff138aefb8cb84d347d72164099cfd60), I can see the upload ID in the apache log as: PUT /test/.dat?partNumber=25&uploadId=I3yihBFZmHx9CCqtcDjr8d-RhgfX8NW HTTP/1.1 200 - -

[ceph-users] Chinese translation of Ceph Documentation

2014-12-05 Thread Drunkard Zhang
Hi, I have migrated my Chinese translation from PDF to the Ceph official doc build system. Just replace doc/ in the ceph repository with this repo and the build should work. The official doc build guide should work for this too. Old PDF: https://github.com/drunkard/docs_zh New:

Re: [ceph-users] AWS SDK and MultiPart Problem

2014-12-05 Thread Georgios Dimitrakakis
It would be nice to see where and how uploadId is being calculated... Thanks, George For example, if I try to perform the same multipart upload with an older version, ceph version 0.72.2 (a913ded2ff138aefb8cb84d347d72164099cfd60), I can see the upload ID in the apache log as: PUT

[ceph-users] Erasure Encoding Chunks

2014-12-05 Thread Nick Fisk
Hi All, Does anybody have any input on the best ratio and total number of data + coding chunks to choose? For example I could create a pool with 7 data chunks and 3 coding chunks and get an efficiency of 70%, or I could create a pool with 17 data chunks and 3 coding chunks and
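The efficiency figures follow from k/(k+m): 7/(7+3) = 70% and 17/(17+3) = 85%. As a rough sketch, the two layouts being compared could be created like this (profile and pool names are placeholders; plugin and PG counts left to whatever suits the cluster):

    ceph osd erasure-code-profile set ec-7-3 k=7 m=3
    ceph osd erasure-code-profile set ec-17-3 k=17 m=3
    ceph osd pool create ecpool 1024 1024 erasure ec-7-3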

Re: [ceph-users] Giant or Firefly for production

2014-12-05 Thread Nick Fisk
This is probably due to the kernel RBD client not being recent enough. Have you tried upgrading your kernel to a newer version? 3.16 should contain all the relevant features required by Giant.

Re: [ceph-users] Giant or Firefly for production

2014-12-05 Thread David Moreau Simard
What are the kernel versions involved? We have Ubuntu precise clients talking to an Ubuntu trusty cluster without issues - with tunables optimal. 0.88 (Giant) and 0.89 have been working well for us as far as the client and Openstack are concerned. This link provides some insight as to the possible

Re: [ceph-users] Virtual machines using RBD remount read-only on OSD slow requests

2014-12-05 Thread Paulo Almeida
Hi, I recently e-mailed ceph-users about a problem with virtual machine RBD disks remounting read-only because of OSD slow requests[1]. I just wanted to report that although I'm still seeing OSDs from one particular machine going down sometimes (probably some hardware problem on that node), the

Re: [ceph-users] Giant or Firefly for production

2014-12-05 Thread Antonio Messina
On Fri, Dec 5, 2014 at 4:25 PM, David Moreau Simard dmsim...@iweb.com wrote: What are the kernel versions involved? We have Ubuntu precise clients talking to an Ubuntu trusty cluster without issues - with tunables optimal. 0.88 (Giant) and 0.89 have been working well for us as far as the client

Re: [ceph-users] Giant or Firefly for production

2014-12-05 Thread Antonio Messina
On Fri, Dec 5, 2014 at 4:25 PM, Nick Fisk n...@fisk.me.uk wrote: This is probably due to the kernel RBD client not being recent enough. Have you tried upgrading your kernel to a newer version? 3.16 should contain all the relevant features required by Giant. I would rather tune the tunables, as
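Adjusting the tunables avoids a kernel upgrade, although switching profiles can trigger data movement. A minimal sketch, assuming the standard profile names:

    ceph osd crush show-tunables      # see what the cluster currently requires
    ceph osd crush tunables firefly   # or 'bobtail' / 'legacy' for older kernel clients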

Re: [ceph-users] Giant or Firefly for production

2014-12-05 Thread James Devine
http://kernel.ubuntu.com/~kernel-ppa/mainline/ I'm running 3.17 on my trusty clients without issue. On Fri, Dec 5, 2014 at 9:37 AM, Antonio Messina antonio.mess...@s3it.uzh.ch wrote: On Fri, Dec 5, 2014 at 4:25 PM, Nick Fisk n...@fisk.me.uk wrote: This is probably due to the kernel RBD

Re: [ceph-users] Giant or Firefly for production

2014-12-05 Thread Nick Fisk
Ok sorry, I thought you had a need for some of the features in Giant; using tunables is probably easier in that case. However, if you do want to upgrade there are debs available: http://kernel.ubuntu.com/~kernel-ppa/mainline/ and I believe 3.16 should be available in the 14.04.2 release, which

Re: [ceph-users] Giant or Firefly for production

2014-12-05 Thread Antonio Messina
Thank you James and Nick, On Fri, Dec 5, 2014 at 4:46 PM, Nick Fisk n...@fisk.me.uk wrote: Ok sorry, I thought you had a need for some of the features in Giant, using tunables is probably easier in that case. I'm not sure :) I never played with the tunables before (still running a testbed

Re: [ceph-users] Giant or Firefly for production

2014-12-05 Thread Sage Weil
On Fri, 5 Dec 2014, Antonio Messina wrote: On Fri, Dec 5, 2014 at 2:24 AM, Anthony Alba ascanio.al...@gmail.com wrote: Hi Cephers, Have any of you decided to put Giant into production instead of Firefly? This is very interesting to me too: we are going to deploy a large ceph cluster

Re: [ceph-users] Poor RBD performance as LIO iSCSI target

2014-12-05 Thread David Moreau Simard
I've flushed everything - data, pools, configs - and reconfigured the whole thing. I was particularly careful with cache tiering configurations (almost leaving defaults when possible) and it's not locking anymore. It looks like the cache tiering configuration I had was causing the problem? I
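For reference, the basic cache tiering wiring (mostly defaults, as described above) looks roughly like this; the pool names are placeholders, not the configuration that was causing the locking:

    ceph osd tier add rbd cachepool                  # attach cachepool in front of the rbd pool
    ceph osd tier cache-mode cachepool writeback
    ceph osd tier set-overlay rbd cachepool          # route client IO through the cache
    ceph osd pool set cachepool hit_set_type bloom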

Re: [ceph-users] Giant or Firefly for production

2014-12-05 Thread Antonio Messina
On Fri, Dec 5, 2014 at 4:59 PM, Sage Weil s...@newdream.net wrote: On Fri, 5 Dec 2014, Antonio Messina wrote: On Fri, Dec 5, 2014 at 2:24 AM, Anthony Alba ascanio.al...@gmail.com wrote: Hi Cephers, Have any of you decided to put Giant into production instead of Firefly? This is very

Re: [ceph-users] Giant or Firefly for production

2014-12-05 Thread Antonio Messina
Hi all, just an update: after setting chooseleaf_vary_r to 0 _and_ removing a pool with erasure coding, I was able to run rbd map. Thank you all for the help. .a. On Fri, Dec 5, 2014 at 5:07 PM, Antonio Messina antonio.mess...@s3it.uzh.ch wrote: On Fri, Dec 5, 2014 at 4:59 PM, Sage Weil
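For anyone hitting the same thing, that tunable is normally flipped by editing the decompiled crush map; a sketch, with placeholder file names:

    ceph osd getcrushmap -o crush.bin
    crushtool -d crush.bin -o crush.txt
    # in crush.txt, set: tunable chooseleaf_vary_r 0
    crushtool -c crush.txt -o crush.new
    ceph osd setcrushmap -i crush.new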

Re: [ceph-users] Erasure Encoding Chunks

2014-12-05 Thread Loic Dachary
On 05/12/2014 16:21, Nick Fisk wrote: Hi All, Does anybody have any input on the best ratio and total number of data + coding chunks to choose? For example I could create a pool with 7 data chunks and 3 coding chunks and get an efficiency of 70%, or I could create

Re: [ceph-users] Giant or Firefly for production

2014-12-05 Thread Sage Weil
On Fri, 5 Dec 2014, Antonio Messina wrote: On Fri, Dec 5, 2014 at 4:59 PM, Sage Weil s...@newdream.net wrote: On Fri, 5 Dec 2014, Antonio Messina wrote: On Fri, Dec 5, 2014 at 2:24 AM, Anthony Alba ascanio.al...@gmail.com wrote: Hi Cephers, Have any of you decided to put Giant

Re: [ceph-users] Giant or Firefly for production

2014-12-05 Thread Antonio Messina
On Fri, Dec 5, 2014 at 5:24 PM, Sage Weil s...@newdream.net wrote: The v2 rule means you have a crush rule for erasure coding. Do you have an EC pool in your cluster? Yes indeed. I didn't know an EC pool was incompatible with the current kernel; I only tested it with rados bench and VMs, I guess.
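For anyone else checking whether an older kernel client will trip over this, the erasure rules and pools are visible from the monitors; a quick sketch:

    ceph osd crush rule dump        # erasure rules show up alongside the replicated ones
    ceph osd dump | grep erasure    # pools created from an erasure-code profile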

Re: [ceph-users] Virtual machines using RBD remount read-only on OSD slow requests

2014-12-05 Thread Haomai Wang
I hope you can provide more runtime info, like logs. On Fri, Dec 5, 2014 at 11:32 PM, Paulo Almeida palme...@igc.gulbenkian.pt wrote: Hi, I recently e-mailed ceph-users about a problem with virtual machine RBD disks remounting read-only because of OSD slow requests[1]. I just wanted to

Re: [ceph-users] Erasure Encoding Chunks

2014-12-05 Thread Loic Dachary
On 05/12/2014 17:41, Nick Fisk wrote: Hi Loic, Thanks for your response. The idea is for this cluster to be our VM replica storage in our secondary site. Initially we are planning to have a 40-disk EC pool sitting behind a cache pool of around 1 TB post-replica size. This storage

Re: [ceph-users] AWS SDK and MultiPart Problem

2014-12-05 Thread Yehuda Sadeh
It looks like a bug. Can you open an issue on tracker.ceph.com, describing what you see? Thanks, Yehuda On Fri, Dec 5, 2014 at 7:17 AM, Georgios Dimitrakakis gior...@acmac.uoc.gr wrote: It would be nice to see where and how uploadId is being calculated... Thanks, George For example

[ceph-users] experimental features

2014-12-05 Thread Sage Weil
A while back we merged Haomai's experimental OSD backend KeyValueStore. We named the config option 'keyvaluestore_dev', hoping to make it clear to users that it was still under development, not fully tested, and not yet ready for production. In retrospect, I don't think '_dev' was
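For context, the backend in question is selected per OSD in ceph.conf; a sketch of what the opt-in looks like (the exact value may differ by release, and it is explicitly not production-ready):

    [osd]
    osd objectstore = keyvaluestore-dev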

Re: [ceph-users] experimental features

2014-12-05 Thread David Champion
* On 05 Dec 2014, Sage Weil wrote: adding that fall into this category. Having them in the tree is great because it streamlines QA and testing, but I want to make sure that users are not able to enable the features without being aware of the risks. A few possible suggestions: -

Re: [ceph-users] experimental features

2014-12-05 Thread Mark Nelson
On 12/05/2014 11:39 AM, Gregory Farnum wrote: On Fri, Dec 5, 2014 at 9:36 AM, Sage Weil sw...@redhat.com wrote: A while back we merged Haomai's experimental OSD backend KeyValueStore. We named the config option 'keyvaluestore_dev', hoping to make it clear to users that it was still under

Re: [ceph-users] experimental features

2014-12-05 Thread Mark Nelson
On 12/05/2014 11:47 AM, David Champion wrote: * On 05 Dec 2014, Sage Weil wrote: adding that fall into this category. Having them in the tree is great because it streamlines QA and testing, but I want to make sure that users are not able to enable the features without being aware of the risks.

[ceph-users] Radosgw with SSL enabled

2014-12-05 Thread lakshmi k s
Hello - I have a rados gateway setup working with http, but when I enable SSL on the gateway node, I am having trouble making successful swift requests over https. root@hrados:~# swift -V 1.0 -A https://hrados1.ex.com/auth/v1.0 -U s3User:swiftUser -K 8fJfd6YW2poqhvBI+uUYJZE1uscnmrDncRXrkjHR
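If the gateway certificate is self-signed, one common first check is whether the client is simply rejecting it; the swift CLI can be told to skip verification (secret key replaced with a placeholder):

    swift --insecure -V 1.0 -A https://hrados1.ex.com/auth/v1.0 \
          -U s3User:swiftUser -K <swift_secret_key> list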

Re: [ceph-users] experimental features

2014-12-05 Thread Gregory Farnum
On Fri, Dec 5, 2014 at 9:36 AM, Sage Weil sw...@redhat.com wrote: A while back we merged Haomai's experimental OSD backend KeyValueStore. We named the config option 'keyvaluestore_dev', hoping to make it clear to users that it was still under development, not fully tested, and not yet ready

Re: [ceph-users] OSD trashed by simple reboot (Debian Jessie, systemd?)

2014-12-05 Thread Gregory Farnum
On Thu, Dec 4, 2014 at 7:03 PM, Christian Balzer ch...@gol.com wrote: Hello, This morning I decided to reboot a storage node (Debian Jessie, thus 3.16 kernel and Ceph 0.80.7, HDD OSDs with SSD journals) after applying some changes. It came back up one OSD short, the last log lines before

Re: [ceph-users] experimental features

2014-12-05 Thread Robert LeBlanc
I prefer the third option (enumeration). I don't see a point where we would enable experimental features on our production clusters, but it would be nice to have the same bits and procedures between our dev/beta and production clusters. On Fri, Dec 5, 2014 at 10:36 AM, Sage Weil sw...@redhat.com

Re: [ceph-users] experimental features

2014-12-05 Thread Nigel Williams
On Sat, Dec 6, 2014 at 4:36 AM, Sage Weil sw...@redhat.com wrote: - enumerate experimental options we want to enable ... This has the property that no config change is necessary when the feature drops its experimental status. It keeps the risky options in one place too, so they're easier to spot.

Re: [ceph-users] Old OSDs on new host, treated as new?

2014-12-05 Thread Udo Lembke
Hi, perhaps a stupid question, but why did you change the hostname? I haven't tried it, but I guess if you boot the node with a new hostname, the old hostname is still in the crush map, but without any OSDs - because they are on the new host. I don't know (I guess not) if the degradation level also stays at 5% if
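A quick way to see where the OSDs ended up after such a move is the crush tree; both the old and the new host bucket should be visible there:

    ceph osd tree    # shows host buckets and which OSDs hang under each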

Re: [ceph-users] cephfs survey results

2014-12-05 Thread Lorieri
Hi, if I have a situation where each node in a cluster writes its own files in cephfs, is it safe to use multiple MDSes? I mean, is the problem with using multiple MDSes related to nodes writing the same files? thanks, -lorieri On Tue, Nov 4, 2014 at 9:47 PM, Shain Miley smi...@npr.org wrote: +1 for

Re: [ceph-users] OSD trashed by simple reboot (Debian Jessie, systemd?)

2014-12-05 Thread Christian Balzer
On Fri, 5 Dec 2014 11:23:19 -0800 Gregory Farnum wrote: On Thu, Dec 4, 2014 at 7:03 PM, Christian Balzer ch...@gol.com wrote: Hello, This morning I decided to reboot a storage node (Debian Jessie, thus 3.16 kernel and Ceph 0.80.7, HDD OSDs with SSD journals) after applying some