Re: [ceph-users] Ceph expansion/deploy via ansible

2019-04-17 Thread Francois Lafont
Hi, +1 for ceph-ansible too. ;) -- François (flaf)

[ceph-users] radosgw in Nautilus: message "client_io->complete_request() returned Broken pipe"

2019-04-17 Thread Francois Lafont
Hi @all, I have a Nautilus Ceph cluster UP with radosgw in a zonegroup. I'm using the Beast web frontend (the default in Nautilus). All seems to work fine, but in the radosgw log I have this message: Apr 17 14:02:56 rgw-m-1 ceph-m-rgw.rgw-m-1.rgw0[888]: 2019-04-17 14:02:56.410

Re: [ceph-users] Try to log the IP in the header X-Forwarded-For with radosgw behind haproxy

2019-04-17 Thread Francois Lafont
Hi Matt, On 4/17/19 1:08 AM, Matt Benjamin wrote: Why is using an explicit unix socket problematic for you? For what it does, that decision has always seemed sensible. In fact, I don't understand why the "ops" logs are handled differently from the logs of the radosgw process itself.

Re: [ceph-users] Try to log the IP in the header X-Forwarded-For with radosgw behind haproxy

2019-04-16 Thread Francois Lafont
Hi @all, On 4/9/19 12:43 PM, Francois Lafont wrote: I have tried this config: - rgw enable ops log  = true rgw ops log socket path = /tmp/opslog rgw log http headers    = http_x_forwarded_for - and I have logs in the socket /tmp/opslog like this: - {"bucket":"
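
For reference, the full ceph.conf snippet implied by this truncated preview would look roughly like the following (a sketch: the socket path and header name come from the preview, the section name is an assumption), and the JSON entries can be read from the unix socket with a tool such as socat:

[client.rgw.rgw-m-1]
rgw enable ops log      = true
rgw ops log socket path = /tmp/opslog
rgw log http headers    = http_x_forwarded_for

~# socat - UNIX-CONNECT:/tmp/opslog   # stream the ops log entries to stdout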

Re: [ceph-users] Try to log the IP in the header X-Forwarded-For with radosgw behind haproxy

2019-04-09 Thread Francois Lafont
On 4/9/19 12:43 PM, Francois Lafont wrote: 2. In my Docker container context, is it possible to put the logs above in the file "/var/log/syslog" of my host? In other words, is it possible to make sure these logs go to stdout of the "radosgw" daemon? In brief, is it poss

Re: [ceph-users] Try to log the IP in the header X-Forwarded-For with radosgw behind haproxy

2019-04-09 Thread Francois Lafont
Hi, On 4/9/19 5:02 AM, Pavan Rallabhandi wrote: Refer "rgw log http headers" under http://docs.ceph.com/docs/nautilus/radosgw/config-ref/ Or even better in the code https://github.com/ceph/ceph/pull/7639 Ok, thx for your help Pavan. I have made progress but I still have some problems.

[ceph-users] Try to log the IP in the header X-Forwarded-For with radosgw behind haproxy

2019-04-08 Thread Francois Lafont
Hi @all, I'm using the Ceph rados gateway installed via ceph-ansible with the Nautilus version. The radosgw instances are behind an haproxy which adds these headers (checked via tcpdump): X-Forwarded-Proto: http X-Forwarded-For: 10.111.222.55, where 10.111.222.55 is the IP address of the client. The
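
A minimal sketch of the kind of haproxy frontend that adds these headers (bind address and backend name are assumptions, not from the original message):

frontend rgw_http
    bind *:80
    option forwardfor                             # adds X-Forwarded-For with the client IP
    http-request set-header X-Forwarded-Proto http
    default_backend rgw_backend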

Re: [ceph-users] ceph-mon memory issue jewel 10.2.5 kernel 4.4

2017-10-21 Thread Francois Lafont
Hi @all, On 02/08/2017 08:45 PM, Jim Kilborn wrote: > I have had two ceph monitor nodes generate swap space alerts this week. > Looking at the memory, I see ceph-mon using a lot of memory and most of the > swap space. My ceph nodes have 128GB mem, with 2GB swap (I know the > memory/swap ratio

Re: [ceph-users] Unwanted automatic restart of daemons during an upgrade since 10.2.5 (on Trusty)

2016-12-19 Thread Francois Lafont
Hi, On 12/19/2016 09:58 PM, Ken Dreyer wrote: > I looked into this again on a Trusty VM today. I set up a single > mon+osd cluster on v10.2.3, with the following: > > # status ceph-osd id=0 > ceph-osd (ceph/0) start/running, process 1301 > > #ceph daemon osd.0 version >

Re: [ceph-users] Unwanted automatic restart of daemons during an upgrade since 10.2.5 (on Trusty)

2016-12-13 Thread Francois Lafont
On 12/13/2016 12:42 PM, Francois Lafont wrote: > But, _by_ _principle_, in the specific case of ceph (I know it's not the > usual case of packages which provide daemons), I think it would be more > safe and practical that the ceph packages don't manage the restart of > daemons. And

[ceph-users] Unwanted automatic restart of daemons during an upgrade since 10.2.5 (on Trusty)

2016-12-13 Thread Francois Lafont
Hi @all, I have a little remark concerning at least the Trusty ceph packages (maybe it concerns other distributions too, I don't know). I'm pretty sure that before the 10.2.5 version the restart of the daemons wasn't managed during package upgrades, and with the 10.2.5 version it now is. I

Re: [ceph-users] 10.2.4 Jewel released

2016-12-09 Thread Francois Lafont
On 12/09/2016 06:39 PM, Alex Evonosky wrote: > Sounds great. May I ask what procedure you did to upgrade? Of course. ;) It's here: https://shaman.ceph.com/repos/ceph/wip-msgr-jewel-fix2/ (I think this link was pointed to by Greg Farnum or Sage Weil in a previous message). Personally I use

Re: [ceph-users] 10.2.4 Jewel released

2016-12-09 Thread Francois Lafont
Hi, Just for information: since the upgrade of my whole cluster (osd, mon and mds) to version 10.2.4-1-g5d3c76c (5d3c76c1c6e991649f0beedb80e6823606176d9e) ~30 hours ago, I have had no problem (my cluster is a small one with 5 nodes, 4 osds per node and 3 monitors, and I just use cephfs).

Re: [ceph-users] 10.2.4 Jewel released

2016-12-08 Thread Francois Lafont
On 12/08/2016 11:24 AM, Ruben Kerkhof wrote: > I've been running this on one of my servers now for half an hour, and > it fixes the issue. It's the same for me. ;) ~$ ceph -v ceph version 10.2.4-1-g5d3c76c (5d3c76c1c6e991649f0beedb80e6823606176d9e) Thanks for the help. Bye.

Re: [ceph-users] 10.2.4 Jewel released -- IMPORTANT

2016-12-07 Thread Francois Lafont
On 12/08/2016 12:38 AM, Gregory Farnum wrote: > Yep! Ok, thanks for the confirmations Greg. Bye.

Re: [ceph-users] 10.2.4 Jewel released -- IMPORTANT

2016-12-07 Thread Francois Lafont
On 12/08/2016 12:06 AM, Sage Weil wrote: > Please hold off on upgrading to this release. It triggers a bug in > SimpleMessenger that causes threads for broken connections to spin, eating > CPU. > > We're making sure we understand the root cause and preparing a fix. Waiting for the fix and

Re: [ceph-users] 10.2.4 Jewel released

2016-12-07 Thread Francois Lafont
On 12/07/2016 11:33 PM, Ruben Kerkhof wrote: > Thanks, I'll check how long it takes for this to happen on my cluster. > > I did just pause scrub and deep-scrub. Are there scrubs running on > your cluster now by any chance? Yes, but normally none should be running right now because I have: osd scrub begin hour =
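
The truncated settings refer to the scrub time window; as a sketch (the hour values are examples, not from the original message):

[osd]
osd scrub begin hour = 0    # only start scheduled scrubs between midnight...
osd scrub end hour   = 6    # ...and 06:00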

Re: [ceph-users] 10.2.4 Jewel released

2016-12-07 Thread Francois Lafont
On 12/07/2016 11:16 PM, Steve Taylor wrote: > I'm seeing the same behavior with very similar perf top output. One server > with 32 OSDs has a load average approaching 800. No excessive memory usage > and no iowait at all. Exactly! And another interesting piece of information (maybe): I have ceph-osd

Re: [ceph-users] 10.2.4 Jewel released

2016-12-07 Thread Francois Lafont
Hi, On 12/07/2016 01:21 PM, Abhishek L wrote: > This point release fixes several important bugs in RBD mirroring, RGW > multi-site, CephFS, and RADOS. > > We recommend that all v10.2.x users upgrade. Also note the following when > upgrading from hammer Well... little warning: after upgrade

[ceph-users] Keep previous versions of ceph in the APT repository

2016-11-29 Thread Francois Lafont
Hi @all, Ceph team, could it be possible to keep the previous versions of the ceph* packages in the APT repository? Indeed, for instance for Ubuntu Trusty, currently we have: ~$ curl -s http://download.ceph.com/debian-jewel/dists/trusty/main/binary-amd64/Packages | grep -A 1 '^Package:
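
The practical point behind the request: with only the latest version published, installing or holding an older release is impossible. If the older packages were kept, a version-pinned install would look roughly like this (the version string is an example only):

~# apt-get install ceph=10.2.3-1trusty ceph-osd=10.2.3-1trusty ceph-mon=10.2.3-1trusty
~# apt-mark hold ceph ceph-osd ceph-mon    # prevent an unplanned upgrade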

Re: [ceph-users] ceph-fuse "Transport endpoint is not connected" on Jewel 10.2.2

2016-08-30 Thread Francois Lafont
Hi, On 08/29/2016 08:30 PM, Gregory Farnum wrote: > Ha, yep, that's one of the bugs Giancolo found: > > ceph version 10.2.1 (3a66dd4f30852819c1bdaa8ec23c795d4ad77269) > 1: (()+0x299152) [0x7f91398dc152] > 2: (()+0x10330) [0x7f9138bbb330] > 3: (Client::get_root_ino()+0x10) [0x7f91397df6c0] >

Re: [ceph-users] ceph-fuse "Transport endpoint is not connected" on Jewel 10.2.2

2016-08-27 Thread Francois Lafont
On 08/27/2016 12:01 PM, Francois Lafont wrote: > I had exactly the same error in my production ceph client node with > Jewel 10.2.1 in my case. I have forgotten to say that the ceph cluster was perfectly HEALTH_OK before, during and after the error in the client side. R

Re: [ceph-users] ceph-fuse "Transport endpoint is not connected" on Jewel 10.2.2

2016-08-27 Thread Francois Lafont
Hi, I had exactly the same error in my production ceph client node with Jewel 10.2.1 in my case. In the client node : - Ubuntu 14.04 - kernel 3.13.0-92-generic - ceph 10.2.1 (3a66dd4f30852819c1bdaa8ec23c795d4ad77269) - cephfs via _ceph-fuse_ In the cluster node : - Ubuntu 14.04 - kernel

Re: [ceph-users] ceph-fuse, fio largely better after migration Infernalis to Jewel, is my bench relevant?

2016-06-06 Thread Francois Lafont
On 06/06/2016 18:41, Gregory Farnum wrote: > We had several metadata caching improvements in ceph-fuse recently which I > think went in after Infernalis. That could explain it. Ok, in this case, it could be good news. ;) I had doubts concerning my fio bench. I know that benchmarks can be tricky

[ceph-users] ceph-fuse, fio largely better after migration Infernalis to Jewel, is my bench relevant?

2016-06-06 Thread Francois Lafont
Hi, I have a little Ceph cluster in production with 5 cluster nodes and 2 client nodes. The clients are using cephfs via ceph-fuse. Recently, I have upgraded my cluster from Infernalis to Jewel (servers _and_ clients). When the cluster was on the Infernalis version, the fio command below gave me

[ceph-users] A radosgw keyring with the minimal rights, which pools have I to create?

2016-06-04 Thread Francois Lafont
Hi, In a from-scratch Jewel cluster, I'm looking for the exact list of pools I have to create and the minimal rights that I can set for the keyring used by the radosgw instance. This is for the default zone. I intend to just use the S3 API of the radosgw. a) I have read the doc here

Re: [ceph-users] jewel upgrade and sortbitwise

2016-06-03 Thread Francois Lafont
Hi, On 03/06/2016 16:29, Samuel Just wrote: > Sorry, I should have been more clear. The bug actually is due to a > difference in an on disk encoding from hammer. An infernalis cluster would > never had had such encodings and is fine. Ah ok, fine. ;) Thanks for the answer. Bye. -- François

Re: [ceph-users] jewel upgrade and sortbitwise

2016-06-03 Thread Francois Lafont
Hi, On 03/06/2016 05:39, Samuel Just wrote: > Due to http://tracker.ceph.com/issues/16113, it would be best to avoid > setting the sortbitwise flag on jewel clusters upgraded from previous > versions until we get a point release out with a fix. > > The symptom is that setting the sortbitwise

Re: [ceph-users] Infernalis => Jewel: ceph-fuse regression concerning the automatic mount at boot?

2016-06-03 Thread Francois Lafont
Hi, On 02/06/2016 04:44, Francois Lafont wrote: > ~# grep ceph /etc/fstab > id=cephfs,keyring=/etc/ceph/ceph.client.cephfs.keyring,client_mountpoint=/ > /mnt/ fuse.ceph noatime,nonempty,defaults,_netdev 0 0 [...] > And I have rebooted. After the reboot, big surprise with this:

Re: [ceph-users] Infernalis => Jewel: ceph-fuse regression concerning the automatic mount at boot?

2016-06-01 Thread Francois Lafont
Now, I have an explanation and it's _very_ strange, absolutely not related to a problem of Unix rights. For the record, my client node is an up-to-date Ubuntu Trusty and I use ceph-fuse. Here is my fstab line: ~# grep ceph /etc/fstab

[ceph-users] Infernalis => Jewel: ceph-fuse regression concerning the automatic mount at boot?

2016-06-01 Thread Francois Lafont
Hi, I have a Jewel Ceph cluster in OK state and a "ceph-fuse" Ubuntu Trusty client with ceph Infernalis. The cephfs is mounted automatically and without problems at boot via ceph-fuse and this line in /etc/fstab : ~# grep ceph /etc/fstab

Re: [ceph-users] Meaning of the "host" parameter in the section [client.radosgw.{instance-name}] in ceph.conf?

2016-05-28 Thread Francois Lafont
Hi, On 26/05/2016 23:46, Francois Lafont wrote: > a) My first question is perfectly summarized in the title. ;) > Indeed, here is a typical section [client.radosgw.{instance-name}] in > the ceph.conf of a radosgw serve

[ceph-users] Meaning of the "host" parameter in the section [client.radosgw.{instance-name}] in ceph.conf?

2016-05-26 Thread Francois Lafont
Hi, a) My first question is perfectly summarized in the title. ;) Indeed, here is a typical section [client.radosgw.{instance-name}] in the ceph.conf of a radosgw server "rgw-01": -- # The instance-name is "gateway" here.
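
For context, a typical section of that kind would look roughly like the following (all values are assumptions for illustration; the thread is precisely about what the "host" key means here):

[client.radosgw.gateway]
host          = rgw-01
keyring       = /etc/ceph/ceph.client.radosgw.gateway.keyring
rgw frontends = civetweb port=80
log file      = /var/log/ceph/client.radosgw.gateway.log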

Re: [ceph-users] ZFS or BTRFS for performance?

2016-03-20 Thread Francois Lafont
Hello, On 20/03/2016 04:47, Christian Balzer wrote: > That's not protection, that's an "uh-oh, something is wrong, you better > check it out" notification, after which you get to spend a lot of time > figuring out which is the good replica In fact, I have never been confronted with this case so

Re: [ceph-users] Change Unix rights of /var/lib/ceph/{osd, mon}/$cluster-$id/ directories on Infernalis?

2016-03-14 Thread Francois Lafont
Hi David, On 14/03/2016 18:33, David Casier wrote: > "usermod -aG ceph snmp" is better ;) After thinking about it, the solution of adding "snmp" to the "ceph" group seems better to me too... _if_ the "ceph" group never has the "w" right on /var/lib/ceph/ (which seems to be the case). So thanks for comfort

[ceph-users] Change Unix rights of /var/lib/ceph/{osd, mon}/$cluster-$id/ directories on Infernalis?

2016-03-10 Thread Francois Lafont
Hi, I have a ceph cluster on Infernalis and I'm using an snmp agent to retrieve data and generate generic graphs for each cluster node. Currently, I can see in the syslog of each node this kind of line (every 5 minutes): Mar 11 03:15:26 ceph01 snmpd[16824]: Cannot statfs

Re: [ceph-users] Cache tier operation clarifications

2016-03-04 Thread Francois Lafont
Hello, On 04/03/2016 09:17, Christian Balzer wrote: > Unlike the subject may suggest, I'm mostly going to try and explain how > things work with cache tiers, as far as I understand them. > Something of a reference to point to. [...] I'm currently unqualified concerning cache tiering but I'm

Re: [ceph-users] Cannot mount cephfs after some disaster recovery

2016-03-01 Thread Francois Lafont
On 01/03/2016 18:14, John Spray wrote: >> And what is the meaning of the first and the second number below? >> >> mdsmap e21038: 1/1/0 up {0=HK-IDC1-10-1-72-160=up:active} >>^ ^ > > Your whitespace got lost here I think, but I guess you're talking > about the 1/1 part.

Re: [ceph-users] Upgrade to INFERNALIS

2016-03-01 Thread Francois Lafont
Hi, On 02/03/2016 00:12, Garg, Pankaj wrote: > I have upgraded my cluster from 0.94.4 as recommended to the just released > Infernalis (9.2.1) Update directly (skipped 9.2.0). > I installed the packaged on each system, manually (.deb files that I built). > > After that I followed the steps : >

Re: [ceph-users] Cannot mount cephfs after some disaster recovery

2016-03-01 Thread Francois Lafont
Hi, On 01/03/2016 10:32, John Spray wrote: > As Zheng has said, that last number is the "max_mds" setting. And what is the meaning of the first and the second number below? mdsmap e21038: 1/1/0 up {0=HK-IDC1-10-1-72-160=up:active} ^ ^ -- François Lafont

Re: [ceph-users] Infernalis, cephfs: difference between df and du

2016-01-20 Thread Francois Lafont
Hi, On 19/01/2016 07:24, Adam Tygart wrote: > It appears that with --apparent-size, du adds the "size" of the > directories to the total as well. On most filesystems this is the > block size, or the amount of metadata space the directory is using. On > CephFS, this size is fabricated to be the

Re: [ceph-users] Infernalis, cephfs: difference between df and du

2016-01-20 Thread Francois Lafont
On 21/01/2016 03:40, Francois Lafont wrote: > Ah ok, interesting. I have tested and I have noticed however that the size > of a directory is not updated immediately. For instance, if I change > the size of a regular file in a directory (of cephfs), the size of the > directory doesn't change

Re: [ceph-users] Infernalis upgrade breaks when journal on separate partition

2016-01-18 Thread Francois Lafont
Hi, I have not well followed this thread, so sorry in advance if I'm a little out of topic. Personally I'm using this udev rule and it works well (servers are Ubuntu Trusty): ~# cat /etc/udev/rules.d/90-ceph.rules ENV{ID_PART_ENTRY_SCHEME}=="gpt",
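
The truncated rule above most likely chowns GPT journal partitions to the ceph user. A sketch of the usual one-line rule (the partition type GUID is the standard Ceph journal GUID; owner/group/mode values are the commonly used ones and are assumptions here):

~# cat /etc/udev/rules.d/90-ceph.rules
ENV{ID_PART_ENTRY_SCHEME}=="gpt", ENV{ID_PART_ENTRY_TYPE}=="45b0969e-9b03-4f30-b4c6-b4b80ceff106", OWNER="ceph", GROUP="ceph", MODE="660"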

Re: [ceph-users] Infernalis, cephfs: difference between df and du

2016-01-18 Thread Francois Lafont
On 19/01/2016 05:19, Francois Lafont wrote: > However, I still have a question. Since my previous message, supplementary > data have been put in the cephfs and the values have changed, as you can see: > > ~# du -sh /mnt/cephfs/ > 1.2G /mnt/cephfs/ > > ~# d

Re: [ceph-users] Infernalis, cephfs: difference between df and du

2016-01-18 Thread Francois Lafont
Hi, On 18/01/2016 05:00, Adam Tygart wrote: > As I understand it: I think you understand well. ;) > 4.2G is used by ceph (all replication, metadata, et al) it is a sum of > all the space "used" on the osds. I confirm that. > 958M is the actual space the data in cephfs is using (without

Re: [ceph-users] Infernalis, cephfs: difference between df and du

2016-01-17 Thread Francois Lafont
On 18/01/2016 04:19, Francois Lafont wrote: > ~# du -sh /mnt/cephfs > 958M /mnt/cephfs > > ~# df -h /mnt/cephfs/ > Filesystem Size Used Avail Use% Mounted on > ceph-fuse55T 4.2G 55T 1% /mnt/cephfs Even with the option --appar

[ceph-users] Infernalis, cephfs: difference between df and du

2016-01-17 Thread Francois Lafont
Hello, Can someone explain to me the difference between the df and du commands concerning the data used in my cephfs? And which is the correct value, 958M or 4.2G? ~# du -sh /mnt/cephfs 958M /mnt/cephfs ~# df -h /mnt/cephfs/ Filesystem Size Used Avail Use% Mounted on

[ceph-users] cephfs (ceph-fuse) and file-layout: "operation not supported" in a client Ubuntu Trusty

2016-01-08 Thread Francois Lafont
Hi @all, I'm using ceph Infernalis (9.2.0) in the client and cluster side. I have a Ubuntu Trusty client where cephfs is mounted via ceph-fuse and I would like to put a sub-directory of cephfs in a specific pool (a ssd pool). In the cluster, I have: ~# ceph auth get client.cephfs exported

Re: [ceph-users] cephfs (ceph-fuse) and file-layout: "operation not supported" in a client Ubuntu Trusty

2016-01-08 Thread Francois Lafont
Hi, Some news... On 08/01/2016 12:42, Francois Lafont wrote: > ~# mkdir /mnt/cephfs/ssd > > ~# setfattr -n ceph.dir.layout.pool -v poolssd /mnt/cephfs/ssd/ > setfattr: /mnt/cephfs/ssd/: Operation not supported > > ~# getfattr -n ceph.dir.layout /mnt/cephfs/ > /mnt/cep
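
For context, the usual sequence to place a cephfs sub-directory in a dedicated pool is roughly the following (pool name and pg count are examples; on newer releases the second command is `ceph fs add_data_pool <fsname> <pool>`; whether this alone resolves the "Operation not supported" error depends on the client version):

~# ceph osd pool create poolssd 64
~# ceph mds add_data_pool poolssd
~# setfattr -n ceph.dir.layout.pool -v poolssd /mnt/cephfs/ssd

Note also that the client key needs osd caps allowing rw on the new pool, otherwise writes to that directory will fail.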

Re: [ceph-users] cephfs, low performances

2016-01-02 Thread Francois Lafont
olc, I think you haven't posted in the ceph-users list. On 31/12/2015 15:39, olc wrote: > Same model _and_ same firmware (`smartctl -i /dev/sdX | grep Firmware`)? As > far as I've been told, this can make huge differences. Good idea indeed. I have checked, the versions are the same. Finally,

Re: [ceph-users] cephfs, low performances

2016-01-02 Thread Francois Lafont
Hi, On 31/12/2015 15:30, Robert LeBlanc wrote: > Because Ceph is not perfectly distributed there will be more PGs/objects in > one drive than others. That drive will become a bottleneck for the entire > cluster. The current IO scheduler poses some challenges in this regard. > I've implemented a

Re: [ceph-users] In production - Change osd config

2016-01-02 Thread Francois Lafont
Hi, On 03/01/2016 02:16, Sam Huracan wrote: > I try restart all osd but not efficient. > Is there anyway to apply this change transparently to client? You can use this command (it's an example): # In a cluster node where the admin account is available. ceph tell 'osd.*' injectargs
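
A complete invocation of the truncated command above would look like this (the injected options are only an example; some settings cannot be changed at runtime and still require a daemon restart):

~# ceph tell 'osd.*' injectargs '--osd_max_backfills 1 --osd_recovery_max_active 1'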

Re: [ceph-users] cephfs, low performances

2015-12-29 Thread Francois Lafont
Hi, On 28/12/2015 09:04, Yan, Zheng wrote: >> Ok, so in a client node, I have mounted cephfs (via ceph-fuse) and a rados >> block device formatted in XFS. If I have well understood, cephfs uses sync >> IO (not async IO) and, with ceph-fuse, cephfs can't make O_DIRECT IO. So, I >> have tested

Re: [ceph-users] cephfs, low performances

2015-12-27 Thread Francois Lafont
Hi, Sorry for my late answer. On 23/12/2015 03:49, Yan, Zheng wrote: >>> fio tests AIO performance in this case. cephfs does not handle AIO >>> properly, AIO is actually SYNC IO. that's why cephfs is so slow in >>> this case. >> >> Ah ok, thanks for this very interesting information. >> >> So,

Re: [ceph-users] cephfs, low performances

2015-12-22 Thread Francois Lafont
Hello, On 21/12/2015 04:47, Yan, Zheng wrote: > fio tests AIO performance in this case. cephfs does not handle AIO > properly, AIO is actually SYNC IO. that's why cephfs is so slow in > this case. Ah ok, thanks for this very interesting information. So, in fact, the question I ask myself is:

Re: [ceph-users] cephfs, low performances

2015-12-20 Thread Francois Lafont
On 20/12/2015 21:06, Francois Lafont wrote: > Ok. Please, can you give us your configuration? > How many nodes, osds, ceph version, disks (SSD or not, HBA/controller), RAM, > CPU, network (1Gb/10Gb) etc.? And I add this: with cephfs-fuse, did you have some specific conf in the cli

Re: [ceph-users] cephfs, low performances

2015-12-20 Thread Francois Lafont
On 20/12/2015 22:51, Don Waterloo wrote: > All nodes have 10Gbps to each other Even the link client node <---> cluster nodes? > OSD: > $ ceph osd tree > ID WEIGHT TYPE NAMEUP/DOWN REWEIGHT PRIMARY-AFFINITY > -1 5.48996 root default > -2 0.8 host nubo-1 > 0 0.8

Re: [ceph-users] cephfs, low performances

2015-12-20 Thread Francois Lafont
Hello, On 18/12/2015 23:26, Don Waterloo wrote: > rbd -p mypool create speed-test-image --size 1000 > rbd -p mypool bench-write speed-test-image > > I get > > bench-write io_size 4096 io_threads 16 bytes 1073741824 pattern seq > SEC OPS OPS/SEC BYTES/SEC > 1 79053

Re: [ceph-users] cephfs, low performances

2015-12-20 Thread Francois Lafont
Hi, On 20/12/2015 19:47, Don Waterloo wrote: > I did a bit more work on this. > > On cephfs-fuse, I get ~700 iops. > On cephfs kernel, I get ~120 iops. > These were both on 4.3 kernel > > So i backed up to 3.16 kernel on the client. And observed the same results. > > So ~20K iops w/ rbd,

Re: [ceph-users] cephfs, low performances

2015-12-18 Thread Francois Lafont
Hi Christian, On 18/12/2015 04:16, Christian Balzer wrote: >> It seems to me very bad. > Indeed. > Firstly let me state that I don't use CephFS and have no clues how this > influences things and can/should be tuned. Ok, no problem. Anyway, thanks for your answer. ;) > That being said, the

[ceph-users] cephfs, low performances

2015-12-17 Thread Francois Lafont
Hi, I have a ceph cluster, currently unused, and I have (to my mind) very low performance. I'm not an expert in benchmarks; here is an example of a quick bench: --- # fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1
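
The fio command is cut off in this preview; a typical 4k random-write invocation of that shape would be (everything after --gtod_reduce=1 is an assumption, not from the original message):

# fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 \
      --name=test --filename=/mnt/cephfs/fio.test --bs=4k --iodepth=64 \
      --size=512M --readwrite=randwrite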

Re: [ceph-users] about PG_Number

2015-11-13 Thread Francois Lafont
Hi, On 13/11/2015 09:13, Vickie ch wrote: > If you have a large amount of OSDs but less pg number. You will find your > data write unevenly. > Some OSD have no chance to write data. > On the other side, pg number too large but OSD number too small that have a > chance to cause data lost. Data

Re: [ceph-users] v9.2.0 Infernalis released

2015-11-09 Thread Francois Lafont
Oops, sorry Dan, I meant to send my message to the list. Sorry. > On Mon, Nov 9, 2015 at 11:55 AM, Francois Lafont >> >> 1. Ok, so the ranks of my monitors are 0, 1, 2 but their IDs are 1, 2, 3 >> (IDs chosen by ceph itself because the hosts are called ceph01, ceph02 and &

Re: [ceph-users] v9.2.0 Infernalis released

2015-11-08 Thread Francois Lafont
On 09/11/2015 06:28, Francois Lafont wrote: > I have just upgraded a cluster to 9.2.0 from 9.1.0. > All seems to be well except I have this little error > message : > > ~# ceph tell mon.* version --format plain > mon.1: ceph version 9.2.0 (17df5d2948d929e997b9d320b228caffc8314

Re: [ceph-users] v9.2.0 Infernalis released

2015-11-08 Thread Francois Lafont
Hi, I have just upgraded a cluster to 9.2.0 from 9.1.0. All seems to be well except I have this little error message : ~# ceph tell mon.* version --format plain mon.1: ceph version 9.2.0 (17df5d2948d929e997b9d320b228caffc8314e58) mon.2: ceph version 9.2.0

Re: [ceph-users] v0.94.4 Hammer released

2015-10-20 Thread Francois Lafont
Hi, On 20/10/2015 20:11, Stefan Eriksson wrote: > A change like this below, where we have to change ownership, was not added to a > point release for hammer, right? Right. ;) I have upgraded my ceph cluster from 0.94.3 to 0.94.4 today without any problem. The daemons used in 0.94.3 and currently

Re: [ceph-users] v9.1.0 Infernalis release candidate released

2015-10-14 Thread Francois Lafont
Hi and thanks to all for this good news, ;) On 13/10/2015 23:01, Sage Weil wrote: >#. Fix the data ownership during the upgrade. This is the preferred > option, > but is more work. The process for each host would be to: > > #. Upgrade the ceph package. This creates the ceph
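
In practice the quoted "fix the data ownership" option boils down to, per host, something like the sketch below (assuming the default paths; `stop/start ceph-all` applies to upstart on Trusty, `systemctl ... ceph.target` to systemd hosts):

~# stop ceph-all                                   # or: systemctl stop ceph.target
~# chown -R ceph:ceph /var/lib/ceph /var/log/ceph
~# start ceph-all                                  # or: systemctl start ceph.target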

Re: [ceph-users] v9.1.0 Infernalis release candidate released

2015-10-14 Thread Francois Lafont
Sorry, another remark. On 13/10/2015 23:01, Sage Weil wrote: > The v9.1.0 packages are pushed to the development release repositories:: > > http://download.ceph.com/rpm-testing > http://download.ceph.com/debian-testing I don't see the 9.1.0 available for Ubuntu Trusty :

Re: [ceph-users] CephFS file to rados object mapping

2015-10-14 Thread Francois Lafont
Hi, On 14/10/2015 06:45, Gregory Farnum wrote: >> Ok, however during my tests I had been careful to replace the correct >> file with a bad file of *exactly* the same size (the content of the >> file was just a little string and I changed it to a string with >> exactly the same size). I had

Re: [ceph-users] CephFS file to rados object mapping

2015-10-09 Thread Francois Lafont
Hi, Thanks for your answer Greg. On 09/10/2015 04:11, Gregory Farnum wrote: > The size of the on-disk file didn't match the OSD's record of the > object size, so it rejected it. This works for that kind of gross > change, but it won't catch stuff like a partial overwrite or loss of > data

Re: [ceph-users] v0.94.2 Hammer released

2015-06-11 Thread Francois Lafont
Hi, On 11/06/2015 19:34, Sage Weil wrote: Bug #11442 introduced a change that made rgw objects that start with underscore incompatible with previous versions. The fix to that bug reverts to the previous behavior. In order to be able to access objects that start with an underscore and were

Re: [ceph-users] Cephfs: one ceph account per directory?

2015-06-08 Thread Francois Lafont
Hi, Gregory Farnum wrote: 1. Can you confirm to me that currently it's impossible to restrict the read and write access of a ceph account to a specific directory of a cephfs? It's sadly impossible to restrict access to the filesystem hierarchy at this time, yes. By making use of the file

Re: [ceph-users] Complete freeze of a cephfs client (unavoidable hard reboot)

2015-06-08 Thread Francois Lafont
Hi, On 27/05/2015 22:34, Gregory Farnum wrote: Sorry for the delay; I've been traveling. No problem, me too, I'm not really fast at answering. ;) Ok, I see. According to the online documentation, the way to close a cephfs client session is: ceph daemon mds.$id session ls # to get
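
For completeness, the pair of admin-socket commands being referred to (the session id is a placeholder taken from the `session ls` output; availability of `session evict` depends on the MDS version):

~# ceph daemon mds.$id session ls                  # list client sessions and their ids
~# ceph daemon mds.$id session evict <session_id>  # close the offending client session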

Re: [ceph-users] Mount options nodcache and nofsc

2015-05-21 Thread Francois Lafont
Hi, Yan, Zheng wrote: fsc means fs-cache. It's a kernel facility by which a network filesystem can cache data locally, trading disk space to gain performance improvements for access to slow networks and media. cephfs does not use fs-cache by default. So enabling this option can improve

Re: [ceph-users] How to backup hundreds or thousands of TB

2015-05-17 Thread Francois Lafont
Hi, Wido den Hollander wrote: Aren't snapshots something that should protect you against removal? IF snapshots work properly in CephFS you could create a snapshot every hour. Are you talking about the .snap/ directory in a cephfs directory? If yes, does it work well? Because, with Hammer, if

Re: [ceph-users] Complete freeze of a cephfs client (unavoidable hard reboot)

2015-05-17 Thread Francois Lafont
Hi, Sorry for my late answer. Gregory Farnum wrote: 1. Is this kind of freeze normal? Can I avoid these freezes with a more recent version of the kernel in the client? Yes, it's normal. Although you should have been able to do a lazy and/or force umount. :) Ah, I haven't tried it. Maybe

Re: [ceph-users] Complete freeze of a cephfs client (unavoidable hard reboot)

2015-05-17 Thread Francois Lafont
John Spray wrote: Greg's response is pretty comprehensive, but for completeness I'll add that the specific case of shutdown blocking is http://tracker.ceph.com/issues/9477 Yes indeed, during the freeze, INFO: task sync:3132 blocked for more than 120 seconds... was exactly the message I have

[ceph-users] Complete freeze of a cephfs client (unavoidable hard reboot)

2015-05-14 Thread Francois Lafont
Hi, I had a problem with a cephfs freeze in a client. Impossible to re-enable the mountpoint. A simple ls /mnt command was totally blocked (and of course it was impossible to umount/remount etc.) and I had to reboot the host. But even a normal reboot didn't work, the host didn't stop. I had to do a hard reboot

Re: [ceph-users] Find out the location of OSD Journal

2015-05-07 Thread Francois Lafont
Hi, Patrik Plank wrote: i cant remember on which drive I install which OSD journal :-|| Is there any command to show this? It's probably not the answer you hope for, but why not use a simple: ls -l /var/lib/ceph/osd/ceph-$id/journal ? -- François Lafont
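
On a typical setup that journal path is a symlink, so the underlying device can be resolved directly; a sketch of the usual checks (osd id is an example):

~# readlink -f /var/lib/ceph/osd/ceph-0/journal   # resolves to the journal partition
~# ceph-disk list                                 # also maps journal partitions to OSDs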

Re: [ceph-users] Some more numbers - CPU/Memory suggestions for OSDs and Monitors

2015-04-22 Thread Francois Lafont
Hi, Christian Balzer wrote: thanks for the feedback regarding the network questions. Currently I try to solve the question of how much memory, cores and GHz for OSD nodes and Monitors. My research so far: OSD nodes: 2 GB RAM, 2 GHz, 1 Core (?) per OSD RAM is enough, but more helps

Re: [ceph-users] Some more numbers - CPU/Memory suggestions for OSDs and Monitors

2015-04-22 Thread Francois Lafont
Mark Nelson wrote: I'm not sure who came up with the 1GB for each 1TB of OSD daemons rule, but frankly I don't think it scales well at the extremes. You can't get by with 256MB of ram for OSDs backed by 256GB SSDs, nor do you need 6GB of ram per OSD for 6TB spinning disks. 2-4GB of RAM

[ceph-users] Cephfs: proportion of data between data pool and metadata pool

2015-04-22 Thread Francois Lafont
Hi, When I want to have an estimation of the pg_num of a new pool, I use this very useful page: http://ceph.com/pgcalc/. In the table, I must give the %data of a pool. For instance, for a rados gateway only use case, I can see that, by default, the page gives: - .rgw.buckets = 96.90% of data -

[ceph-users] Radosgw and mds hardware configuration

2015-04-22 Thread Francois Lafont
Hi Cephers, :) I would like to know if there are some rules to estimate (approximately) the need of CPU and RAM for: 1. a radosgw server (for instance with Hammer and civetweb). 2. a mds server If I am not mistaken, for these 2 types of server there is no particular storage requirement. For a

Re: [ceph-users] decrease pg number

2015-04-22 Thread Francois Lafont
Hi, Pavel V. Kaygorodov wrote: I have updated my cluster to Hammer and got a warning too many PGs per OSD (2240 max 300). I know that there is no way to decrease the number of placement groups, so I want to re-create my pools with a smaller pg number, move all my data to them, delete the old pools and
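
One approach commonly suggested for this at the time is sketched below (pool names and pg count are examples); note that rados cppool copies objects only, does not preserve snapshots, and clients should stop writing during the copy:

~# ceph osd pool create newpool 256
~# rados cppool oldpool newpool
~# ceph osd pool delete oldpool oldpool --yes-i-really-really-mean-it
~# ceph osd pool rename newpool oldpool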

Re: [ceph-users] What is a dirty object

2015-04-20 Thread Francois Lafont
Hi, John Spray wrote: As far as I can see, this is only meaningful for cache pools, and an object is dirty in the sense of having been created or modified since its last flush. For a non-cache-tier pool, everything is logically dirty since it is never flushed. I hadn't noticed

Re: [ceph-users] Questions about an example of ceph infrastructure

2015-04-19 Thread Francois Lafont
Hi, Christian Balzer wrote: For starters, make that 5 MONs. It won't really help you with your problem of keeping a quorum when losing a DC, but being able to lose more than 1 monitor will come in handy. Note that MONs don't really need to be dedicated nodes, if you know what you're

[ceph-users] Questions about an example of ceph infrastructure

2015-04-18 Thread Francois Lafont
Hi, We are thinking about a ceph infrastructure and I have questions. Here is the planned (but not yet implemented) infrastructure (please be careful to read the schema with a monospace font ;)): +-+ | users | |(browser)|

[ceph-users] What is a dirty object

2015-04-18 Thread Francois Lafont
Hi, With my testing cluster (Hammer on Ubuntu 14.04), I have this:
--
~# ceph df detail
GLOBAL:
SIZE   AVAIL  RAW USED  %RAW USED  OBJECTS
4073G  3897G  176G      4.33       23506
POOLS:
NAME

Re: [ceph-users] Upgrade from Firefly to Hammer

2015-04-14 Thread Francois Lafont
Hi, Garg, Pankaj wrote: I have a small cluster of 7 machines. Can I just individually upgrade each of them (using apt-get upgrade) from Firefly to Hammer release, or is there more to it than that? Not exactly: it's the "individually" part which is not correct. ;) You should indeed apt-get upgrade on

Re: [ceph-users] norecover and nobackfill

2015-04-14 Thread Francois Lafont
Robert LeBlanc wrote: Hmm... I've been deleting the OSD (ceph osd rm X; ceph osd crush rm osd.X) along with removing the auth key. This has caused data movement. Maybe, but if the noout flag is set, removing an OSD from the cluster doesn't trigger any data movement at all (I have tested with

Re: [ceph-users] Purpose of the s3gw.fcgi script?

2015-04-13 Thread Francois Lafont
Hi, Yehuda Sadeh-Weinraub wrote: You're not missing anything. The script was only needed when we used the process manager of the fastcgi module, but it has been a very long time since we stopped using it. Just to be sure: if I understand correctly, these parts of the documentation: 1.

Re: [ceph-users] Radosgw: upgrade Firefly to Hammer, impossible to create bucket

2015-04-13 Thread Francois Lafont
Hi, Yehuda Sadeh-Weinraub wrote: The 405 in this case usually means that rgw failed to translate the http hostname header into a bucket name. Do you have 'rgw dns name' set correctly? Ah, I have found it, and indeed it concerned 'rgw dns name', as Karan thought too. ;) But it's a little
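
The setting in question, shown with the hostname used in this thread (a sketch; the section name is an assumption):

[client.radosgw.gateway]
rgw dns name = ostore.athome.priv   # must match the domain used for bucket-style hostnames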

Re: [ceph-users] norecover and nobackfill

2015-04-13 Thread Francois Lafont
Hi, Robert LeBlanc wrote: What I'm trying to achieve is minimal data movement when I have to service a node to replace a failed drive. [...] I will perhaps say something stupid, but it seems to me that this is the goal of the noout flag, isn't it? 1. ceph osd set noout 2. an old OSD disk failed,
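
The procedure sketched in the preview, completed (the steps after the failure are assumptions following the usual disk-replacement routine):

~# ceph osd set noout        # before the maintenance, so the OSD is not marked out
# ... replace the failed disk, recreate or restart the OSD ...
~# ceph osd unset noout      # once the OSD is back up and in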

Re: [ceph-users] How to dispatch monitors in a multi-site cluster (ie in 2 datacenters)

2015-04-13 Thread Francois Lafont
Joao Eduardo wrote: To be more precise, it's the lowest IP:PORT combination: 10.0.1.2:6789 = rank 0 10.0.1.2:6790 = rank 1 10.0.1.3:6789 = rank 3 and so on. Ok, so if there are 2 possible quorums, the quorum with the lowest IP:PORT will be chosen. But what happens if, in the 2 possible

Re: [ceph-users] Radosgw: upgrade Firefly to Hammer, impossible to create bucket

2015-04-13 Thread Francois Lafont
Karan Singh wrote: Things you can check * Is RGW node able to resolve bucket-2.ostore.athome.priv , try ping bucket-2.ostore.athome.priv Yes, my DNS configuration is ok. In fact, I test s3cmd directly on my radosgw (its hostname is ceph-radosgw1 but its fqdn is ostore.athome.priv):

[ceph-users] Radosgw: upgrade Firefly to Hammer, impossible to create bucket

2015-04-12 Thread Francois Lafont
Hi, On a testing cluster, I have a radosgw on Firefly and the other nodes, OSDs and monitors, are on Hammer. The nodes are installed with puppet in personal VMs, so I can reproduce the problem. Generally, I use s3cmd to check the radosgw. While the radosgw is on Firefly, I can create buckets, no

Re: [ceph-users] [a bit off-topic] Power usage estimation of hardware for Ceph

2015-04-12 Thread Francois Lafont
Christian Balzer wrote: Simply put, a RAID1 of SSDs will require you to get twice as many SSDs as otherwise needed. And most people don't want to spend that money. In addition to that, DC level SSDs tend to be very reliable and your cluster will have to be able to withstand losses like this

Re: [ceph-users] How to dispatch monitors in a multi-site cluster (ie in 2 datacenters)

2015-04-12 Thread Francois Lafont
Gregory Farnum wrote: If: (clearer with a schema in mind ;)) 1. mon.1 and mon.2 can talk together (in dc1) and can talk with mon.5 (via the VPN) but can't talk with mon.3 and mon.4 (in dc2) 2. mon.3 and mon.4 can talk together (in dc2) and can talk with mon.5 (via

Re: [ceph-users] [a bit off-topic] Power usage estimation of hardware for Ceph

2015-04-12 Thread Francois Lafont
Hi, Christian Balzer wrote: I'm not sure I understand correctly: the model that I indicated in the link above (page 2, model SSG-6027R-OSD040H in the table) already has hotswap bays in the back, for OS drives. Yes, but that model is pre-configured: 2x 2.5" 400GB SSDs, 10x 3.5" 4TB SATA3 HDDs

Re: [ceph-users] [a bit off-topic] Power usage estimation of hardware for Ceph

2015-04-12 Thread Francois Lafont
Chris Kitzmiller wrote: Just as a single data point I can speak to my own nodes. I'm using SM 847A [1] chassis. They're 4U, 36 x 3.5 hot swap bays with 2 internal 2.5 bays. So: 30 x 7200 RPM SATA 6 x SSD Journals 2 x SSD OS / Mon 2 x E5-2620 2.0GHz With the
