[ceph-users] RGW: Multi Part upload and resulting objects

2014-06-04 Thread Sylvain Munaut
Hi, During a multipart upload you can't upload parts smaller than 5M, and radosgw also slices objects into chunks of 4M. Having those two differ is a bit unfortunate, because if you slice your files at the minimum part size you end up with a main file of 4M and a shadow file of 1M for each

Re: [ceph-users] RGW: Multi Part upload and resulting objects

2014-06-05 Thread Sylvain Munaut
Hello, Huh. We took the 5MB limit from S3, but it definitely is unfortunate in combination with our 4MB chunking. You can change the default slice size using a config option, though. I believe you want to change rgw_obj_stripe_size (default: 4 << 20). There might be some other considerations
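For reference, a minimal ceph.conf sketch of that change (the section name and the exact 5 MB value are illustrative, not from the original message):

    [client.radosgw.gateway]
    ; match the RGW stripe size to the 5 MB S3 minimum part size
    ; (the default is 4 << 20 = 4194304 bytes)
    rgw obj stripe size = 5242880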

[ceph-users] civetweb frontend issue with HEAD.

2014-06-05 Thread Sylvain Munaut
Hi, I was running some tests on the new civetweb frontend, hoping to get rid of the lighttpd we have in front of it, and found an issue. If you execute a HEAD on something that returns an error, the _body_ of the error will be sent, which is incorrect for a HEAD. In a keep-alive scenario this
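A quick way to reproduce this kind of bug (the hostname is hypothetical; curl -I issues a HEAD request):

    $ curl -v -I http://radosgw.example.com/no-such-bucket/no-such-key
    # A conforming HEAD response ends after the header block; any error
    # body that follows it is the misbehaviour described above.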

[ceph-users] civetweb frontend issue with spurious response

2014-06-16 Thread Sylvain Munaut
Hi, We're currently testing the civetweb frontend and there is an intermittent issue where the response will have an extraneous 'ERROR 500' mixed in. This is a tcpdump from such an occurrence. - GET / HTTP/1.1 User-Agent: curl/7.30.0 Host: s3.svc:81 Accept: */* HTTP/1.1 200 OK

Re: [ceph-users] civetweb frontend issue with spurious response

2014-06-16 Thread Sylvain Munaut
Hi, The second response is a response to an expected follow-up request on the same connection (using keep-alive). If you're seeing it as part of the response to the first request then it's either an issue with the client not handling keep-alive connections correctly, or an issue with the

Re: [ceph-users] Problem with RadosGW and special characters

2014-06-26 Thread Sylvain Munaut
Hi, Based on the debug log, radosgw is definitely the software that's incorrectly parsing the URL. For example: 2014-06-25 17:30:37.383134 7f7c6cfa9700 20 REQUEST_URI=/ubuntu/pool/main/a/adduser/adduser_3.113+nmu3ubuntu3_all.deb 2014-06-25 17:30:37.383199 7f7c6cfa9700 10

Re: [ceph-users] Problem with RadosGW and special characters

2014-06-26 Thread Sylvain Munaut
URL decoding the path is not the correct behavior. Yes it is ... If you remove that, then every file with other special chars will be broken because %-encoding will not be applied. Per RFC 3986, the path must be split into its components first (split on '/'), then URL-decoded component by component.
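A minimal sketch of that split-then-decode order (Python, not from the original thread):

    from urllib.parse import unquote

    def decode_path(raw_path):
        # Split on '/' BEFORE percent-decoding, so an encoded %2F inside
        # a component is not mistaken for a path separator.
        return [unquote(segment) for segment in raw_path.split('/')]

    # The %2F survives as a literal '/' inside a single component:
    print(decode_path('/bucket/dir%2Ffile%20name'))
    # -> ['', 'bucket', 'dir/file name']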

[ceph-users] Replacing an OSD

2014-07-01 Thread Sylvain Munaut
Hi, As an exercise, I killed an OSD today, just killed the process and removed its data directory. To recreate it, I created an empty data dir, then ran ceph-osd -c /etc/ceph/ceph.conf -i 3 --monmap /tmp/monmap --mkfs (I tried with and without giving the monmap). I then restored the keyring

Re: [ceph-users] Replacing an OSD

2014-07-01 Thread Sylvain Munaut
Hi, And then I start the process, and it starts fine. http://pastebin.com/TPzNth6P I even see one active tcp connection to a mon from that process. But the OSD never comes up or does anything ... I suppose there are error messages in logs somewhere regarding the fact that monitors and

Re: [ceph-users] Replacing an OSD

2014-07-02 Thread Sylvain Munaut
Hi, Does OSD 3 show up when you run ceph pg dump ? If so I would look in the logs of an OSD which is participating in the same PG. It appears at the end but not in any PG; it's now been marked out and everything was redistributed. osdstat kbused kbavail kb hb in hb out 0 15602352

Re: [ceph-users] Replacing an OSD

2014-07-02 Thread Sylvain Munaut
Ah, I finally found something that looks like an error message : 2014-07-02 11:07:57.817269 7f0692e3a700 7 mon.a@0(leader).osd e1147 preprocess_boot from osd.3 10.192.2.70:6807/9702 clashes with existing osd: different fsid (ours: e44c914a-23e9-4756-9713-166de401dec6 ; theirs:

Re: [ceph-users] Replacing an OSD

2014-07-02 Thread Sylvain Munaut
Just for future reference, you actually do need to remove the OSD even if you're going to re-add it like 10 sec later ... $ ceph osd rm 3 removed osd.3 $ ceph osd create 3 Then it works fine. No need to remove it from the CRUSH map or remove the auth key (you can re-use both), but you need to remove/add
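Pulling the whole thread together, the working replacement sequence is roughly (OSD id 3 as in the messages above; the data dir path is hypothetical and the restart command depends on your init system):

    $ ceph osd rm 3                    # remove the dead OSD from the osdmap first
    removed osd.3
    $ ceph osd create                  # hands back the lowest free id, i.e. 3 here
    3
    $ mkdir /var/lib/ceph/osd/ceph-3   # empty data dir (path hypothetical)
    $ ceph-osd -c /etc/ceph/ceph.conf -i 3 --mkfs
    # then restore the keyring and start osd.3 again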

Re: [ceph-users] Replacing an OSD

2014-07-02 Thread Sylvain Munaut
Hi Loic, By restoring the fsid file from the backup, presumably. I did not think of that when you showed the ceph-osd mkfs line, but it makes sense. This is not the ceph fsid. Yeah, I thought about that and I saw fsid and ceph_fsid, but I wasn't sure that just replacing the file would be

[ceph-users] Issues upgrading from 0.72.x (emperor) to 0.81.x (firefly)

2014-07-02 Thread Sylvain Munaut
Hi, I'm having a couple of issues during this update. On the test cluster it went fine, but when running it on production I hit a few problems. (I guess there is some subtle difference I missed; I updated the test one back when emperor came out). For reference, I'm on ubuntu precise, I use

Re: [ceph-users] Replacing an OSD

2014-07-02 Thread Sylvain Munaut
Hi, Did you also recreate the journal?! It was a journal file and got re-created automatically. Cheers, Sylvain

Re: [ceph-users] Issues upgrading from 0.72.x (emperor) to 0.81.x (firefly)

2014-07-02 Thread Sylvain Munaut
Hi, I can't help you with packaging issues, but I can tell you that the rbdmap executable got moved to a different package at some point; I believe the official packages handle it properly. I'll see tonight when doing the other nodes. Maybe it's a result of using dist-upgrade rather than

[ceph-users] emperor - firefly : Significant increase in RAM usage

2014-07-04 Thread Sylvain Munaut
Hi, Yesterday I finally updated our cluster from emperor to firefly (latest stable commit) and what's fairly apparent is much higher RAM usage on the OSDs: http://i.imgur.com/qw9iKSV.png Has anyone noticed the same ? I mean, a sudden 25% increase in idle RAM usage is hard to ignore ... Those OSDs are

Re: [ceph-users] emperor - firefly : Significant increase in RAM usage

2014-07-07 Thread Sylvain Munaut
Hi, Yesterday I finally updated our cluster from emperor to firefly (latest stable commit) and what's fairly apparent is much higher RAM usage on the OSDs: http://i.imgur.com/qw9iKSV.png Has anyone noticed the same ? I mean, a sudden 25% increase in idle RAM usage is hard to ignore ... So no one

Re: [ceph-users] emperor - firefly : Significant increase in RAM usage

2014-07-07 Thread Sylvain Munaut
Hi, We actually saw a decrease in memory usage after upgrading to Firefly, though we did reboot the nodes after the upgrade while we had the maintenance window. This is with 216 OSDs total (32-40 per node): http://i.imgur.com/BC7RuXJ.png Interesting. Is that cluster for RBD or RGW ? My

Re: [ceph-users] nginx (tengine) and radosgw

2014-07-07 Thread Sylvain Munaut
Hi, if anyone else is looking to run radosgw without having to run apache, I would recommend you look into tengine :) Just as a side note, you can now try the civetweb backend of rgw so you don't need fastcgi at all. We started running it that way and so far it's working pretty well.

[ceph-users] Release notes for firefly not very clear wrt the tunables

2014-07-07 Thread Sylvain Munaut
Hi, In the release notes for firefly, more precisely in the upgrading-from-emperor section, you can find these two notes : * The default CRUSH rules and layouts are now using the latest and greatest tunables and defaults. Clusters using the old values will now present with a health WARN

Re: [ceph-users] Release notes for firefly not very clear wrt the tunables

2014-07-07 Thread Sylvain Munaut
Hi Sage, Thanks for pointing this out. Is this clearer? Yes. Although it would probably be useful to say that using 'ceph osd crush tunables bobtail' will be enough to get rid of the warning and will not break compatibility too much (3.15 isn't that common, there is not even any longterm

Re: [ceph-users] v0.80.4 Firefly released

2014-07-16 Thread Sylvain Munaut
On Wed, Jul 16, 2014 at 10:50 AM, James Harper ja...@ejbdigital.com.au wrote: Can you offer some comments on what the impact is likely to be to the data in an affected cluster? Should all data now be treated with suspicion and restored back to before the firefly upgrade? Yes, I'd definitely

[ceph-users] OSD port usage

2014-01-21 Thread Sylvain Munaut
Hi, I noticed in the documentation that the OSD should use 3 ports per OSD daemon running, and so when I set up the cluster I originally opened enough ports to accommodate this (with a small margin so that restarts could proceed even if ports aren't released immediately). However today I just
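If the goal is firewall rules, the range the daemons bind to is itself configurable; a sketch (the values are illustrative, not a recommendation):

    [osd]
    ; constrain the messenger bind range so firewall rules can target
    ; a known port window
    ms bind port min = 6800
    ms bind port max = 7100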

[ceph-users] One specific OSD process using much more CPU than all the others

2014-01-21 Thread Sylvain Munaut
Hi, I have a cluster that contains 16 OSDs spread over 4 physical machines. Each machine runs 4 OSD processes. Among those, one is periodically using 100% of the CPU. If you aggregate the total CPU time of the process over long periods, you can clearly see it uses roughly 6x more CPU than any
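One way to see where that one daemon burns its time (perf from linux-tools; the pgrep pattern and OSD id are hypothetical):

    $ pid=$(pgrep -f 'ceph-osd -i 2' | head -n1)
    $ sudo perf top -p "$pid"
    # later messages in this thread point the finger at an old libleveldb1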

Re: [ceph-users] recreate bucket error

2014-01-22 Thread Sylvain Munaut
Hi, On Sat, Dec 7, 2013 at 6:34 PM, Yehuda Sadeh yeh...@inktank.com wrote: Sounds like disabling the cache triggers some bug. I'll open a relevant ticket. Any news on this ? I have the same issue, but the cache only masks the problem. If you restart radosgw, you'll get it again (once for

Re: [ceph-users] One specific OSD process using much more CPU than all the others

2014-01-23 Thread Sylvain Munaut
Hi, because debian wheezy libleveldb1 is also quite old http://packages.debian.org/wheezy/libleveldb1 libleveldb1 (0+20120530.gitdd0d562-1) Yes, that version is buggy and was causing the issue. I took the source deb from debian sid and rebuilt it for precise in my case:
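The exact commands were cut off above, but a rebuild of the sid package on precise would look roughly like this (a sketch only, assuming a sid deb-src entry is enabled):

    $ apt-get source leveldb          # pull the newer source package
    $ cd leveldb-*/
    $ dpkg-buildpackage -us -uc       # build unsigned binary packages
    $ sudo dpkg -i ../libleveldb1_*.deb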

Re: [ceph-users] One specific OSD process using much more CPU than all the others

2014-01-23 Thread Sylvain Munaut
On Thu, Jan 23, 2014 at 6:27 PM, Alexandre DERUMIER aderum...@odiso.com wrote: Thanks. Does It need to rebuild the whole ceph packages with libleveldb-dev ? Or can I simply backport libleveldb1 and use ceph packages from intank repository ? I had to rebuild ceph because the old one is a

Re: [ceph-users] OSD port usage

2014-01-24 Thread Sylvain Munaut
Hi, At some point (if the cluster gets big enough), could this degrade the network performance? Would it make sense to have a separate network for this? So in addition to public and storage we would have a heartbeat network, and we could pin it to a specific network link. I think the whole

[ceph-users] mon IO usage

2013-05-21 Thread Sylvain Munaut
Hi, I've just added some monitoring of the IO usage of the mon (trying to track down that growing-mon issue), and I'm kind of surprised by the amount of IO generated by the monitor process. I get a continuous 4 MB/s / 75 iops, with big spikes at each compaction every 3 min or so. Is there a
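For reference, per-process IO like that can be watched with pidstat from sysstat (assuming a single ceph-mon on the host):

    $ pidstat -d -p $(pidof ceph-mon) 1
    # -d shows per-process disk IO (kB read/written per second), every second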

Re: [ceph-users] mon IO usage

2013-05-21 Thread Sylvain Munaut
Hi, So, AFAICT, the bulk of the write would be writing out the pgmap to disk every second or so. It should be writing out the full map only every N commits... see 'paxos stash full interval', which defaults to 25. But doesn't it also write it in full when there is a new pgmap ? I have a
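For reference, the knob in question lives in the [mon] section; 25 is the default quoted above:

    [mon]
    ; stash the full map only every N commits, writing incrementals otherwise
    paxos stash full interval = 25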

Re: [ceph-users] mon IO usage

2013-05-21 Thread Sylvain Munaut
Hi, Hmm. Can you generate a log with 'debug mon = 20', 'debug paxos = 20', 'debug ms = 1' for a few minutes over which you see a high data rate and send it my way? It sounds like there is something wrong with the stash_full logic. Mm, actually I may have been fooled by the instrumentation
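Those debug levels can be raised on a live monitor without a restart, roughly (mon.a as in this thread):

    $ ceph tell mon.a injectargs '--debug-mon 20 --debug-paxos 20 --debug-ms 1'
    # remember to dial them back down once the log is captured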

[ceph-users] Backfill/recovery very slow

2013-07-04 Thread Sylvain Munaut
Hi, I'm doing some tests on a cluster, or at least part of it. I have several CRUSH rules distributing data over different types of OSDs. The only part of interest today is composed of 768 PGs distributed over 4 servers and 8 OSDs (2 OSDs per server). This pool is almost empty, there is only like 5-10 GB of

Re: [ceph-users] Backfill/recovery very slow

2013-07-04 Thread Sylvain Munaut
Hi, What's the average object size? It looks like you've got 27 PGs which are actively doing recovery and they're each doing about 3 recoveries per second. That's about the right PG count given the small number of OSDs in the pool (based on the tunable recovery values), so it's just the speed
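The recovery throttles being alluded to are, in sketch form (the values are illustrative, not a recommendation):

    [osd]
    ; how many concurrent backfills and recovery ops each OSD will run
    osd max backfills = 10
    osd recovery max active = 15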

[ceph-users] Including pool_id in the crush hash ? FLAG_HASHPSPOOL ?

2013-07-11 Thread Sylvain Munaut
Hi, I'd like the pool_id to be included in the hash used for the PG, to try and improve the data distribution. (I have 10 pools). I see that there is a flag named FLAG_HASHPSPOOL. Is it possible to enable it on an existing pool ? Cheers, Sylvain

Re: [ceph-users] Num of PGs

2013-07-15 Thread Sylvain Munaut
Hi, I'm curious what the official recommendation would be when you have multiple pools. In total we have 21 pools and that leads to around 12000 PGs for only 24 OSDs. The 'data' and 'metadata' pools are actually unused, and then we have 9 pools of rgw metadata ( .rgw, .rgw.control,

Re: [ceph-users] Including pool_id in the crush hash ? FLAG_HASHPSPOOL ?

2013-07-16 Thread Sylvain Munaut
Hi, And when creating a new pool (in cuttlefish) is there a way to specify it ? (I know it's the default now, but that wasn't the case for the last cuttlefish release AFAIK). It's a monitor config option you can set: osd_pool_default_flag_hashpspool That config option doesn't exist yet in

Re: [ceph-users] Including pool_id in the crush hash ? FLAG_HASHPSPOOL ?

2013-07-16 Thread Sylvain Munaut
Hi, There is however an 'osd_pool_default_flags' option, so setting it to '1' (which is the value of FLAG_HASHPSPOOL) could work I guess. Seems to work if I put it in the config, but not with injectargs, which is a bit annoying. Cheers, Sylvain
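For reference, that workaround in ceph.conf form (bit 0 of the flags field is FLAG_HASHPSPOOL):

    [global]
    ; newly created pools get the per-pool hash seed
    osd pool default flags = 1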

Re: [ceph-users] large memory leak on scrubbing

2013-08-16 Thread Sylvain Munaut
Hi, Is this a known bug in this version? Yes. (Do you know some workaround to fix this?) Upgrade. Cheers, Sylvain

Re: [ceph-users] Ceph + Xen - RBD io hang

2013-08-28 Thread Sylvain Munaut
Hi, I use Ceph 0.61.8 and Xen 4.2.2 (Debian) in production, and can't use kernel 3.10.* on dom0, which hangs very soon. But it's visible in the kernel logs of the dom0, not the domU. Weird. I'm using 3.10.0 without issue here. What's the issue you're hitting ? Cheers, Sylvain

[ceph-users] Spurious MON re-elections

2015-04-01 Thread Sylvain Munaut
are available at http://ge.tt/2hMgZTD2 Any explanation of what's happening and how to prevent it ? I can post more info on request. I'm also available on IRC ( nick 'tnt' ) for live debug if needed :p Cheers, Sylvain Munaut

Re: [ceph-users] Spurious MON re-elections

2015-04-03 Thread Sylvain Munaut
Hi, And indeed there's nothing in the log for mon.a between 17:49:32.77602 and 17:50:10.929258, which seems not great. I'd look and see if something is happening with your disks, maybe? Mmm, indeed. I had checked all the disks with SMART and the RAID controller wasn't reporting any as

[ceph-users] Increased writes to OSD after Giant - Hammer upgrade

2015-07-08 Thread Sylvain Munaut
Hi, I performed an upgrade of our ceph cluster from Giant to the latest Hammer (hammer branch in git). Although it seemed to work fine at first, looking at the graphs this morning I've noticed much higher write activity on the drives, mostly the ones storing RGW buckets (although that

[ceph-users] ERROR: rgw_obj_remove(): cls_cxx_remove returned -2 on OSDs since Hammer upgrade

2015-07-09 Thread Sylvain Munaut
Hi, Since I upgraded to Hammer last weekend, I see errors such as 7eff5322d700 0 cls cls/rgw/cls_rgw.cc:1947: ERROR: rgw_obj_remove(): cls_cxx_remove returned -2 in the logs. What's going on ? Can this be related to the unexplained write activity I see on my OSDs ? Cheers, Sylvain

Re: [ceph-users] ERROR: rgw_obj_remove(): cls_cxx_remove returned -2 on OSDs since Hammer upgrade

2015-07-10 Thread Sylvain Munaut
Hi, Some of our users have experienced this as well: https://github.com/deis/deis/issues/3969 One of our other users suggested performing a deep scrub of all PGs - the suspicion is that this is caused by a corrupt file on the filesystem. That somehow appeared right when I upgraded to hammer
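A brute-force way to deep scrub every PG, as suggested (a sketch only; on a big cluster you would want to pace this):

    $ for pg in $(ceph pg dump 2>/dev/null | awk '$1 ~ /^[0-9]+\.[0-9a-f]+$/ {print $1}'); do
          ceph pg deep-scrub "$pg"
      done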