Re: [ceph-users] Striping

2014-06-12 Thread David
Hi, Depends on what you mean by a "user". You can set up pools with different replication / erasure coding etc: http://ceph.com/docs/master/rados/operations/pools/ Kind Regards, David Majchrzak On 12 Jun 2014, at 10:22, wrote: > Hi All, > > > I have a ceph cluster. If a use
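
For anyone landing on this thread from the archive, per-pool data protection is set at pool creation time; a minimal sketch (pool names and PG counts here are placeholders, not from the thread):

    # replicated pool with three copies
    ceph osd pool create rbd-replicated 128 128 replicated
    ceph osd pool set rbd-replicated size 3

    # erasure-coded pool using the default profile
    ceph osd pool create archive-ec 128 128 erasure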

[ceph-users] Backfilling, latency and priority

2014-06-12 Thread David
priority? 2. I’m running with the defaults for these settings. Does anyone else have any experience changing those? Kind Regards, David Majchrzak ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] Backfilling, latency and priority

2014-06-12 Thread David
ave enough IO to handle both recovery and client IOs? What’s the easiest/best way to add more IOs to a current cluster if you don’t want to scale? Add more RAM to OSD servers or add an SSD-backed r/w cache tier? Kind Regards, David Majchrzak On 12 Jun 2014, at 14:42, Mark Nelson wrote: > On 06/12/2

Re: [ceph-users] spiky io wait within VMs running on rbd

2014-06-12 Thread David
Hi Simon, Did you check iostat on the OSDs to check their utilization? What does your ceph -w say - perhaps you’re maxing your cluster’s IOPS? Also, are you running any monitoring of your VMs’ iostats? We’ve often found some culprits overusing IO. Kind Regards, David Majchrzak On 12 Jun 2014, at

[ceph-users] Taking down one OSD node (10 OSDs) for maintenance - best practice?

2014-06-13 Thread David
Hi, We’re going to take down one OSD node for maintenance (add cpu + ram) which might take 10-20 minutes. What’s the best practice here in a production cluster running dumpling 0.67.7-1~bpo70+1? Kind Regards, David Majchrzak ___ ceph-users mailing

Re: [ceph-users] Taking down one OSD node (10 OSDs) for maintenance - best practice?

2014-06-13 Thread David
? Kind Regards, David Majchrzak On 13 Jun 2014, at 11:13, Wido den Hollander wrote: > On 06/13/2014 10:56 AM, David wrote: >> Hi, >> >> We’re going to take down one OSD node for maintenance (add cpu + ram) which >> might take 10-20 minutes. >> What’s the best prac

Re: [ceph-users] Taking down one OSD node (10 OSDs) for maintenance - best practice?

2014-06-13 Thread David
Alright, thanks! :) Kind Regards, David Majchrzak On 13 Jun 2014, at 11:21, Wido den Hollander wrote: > On 06/13/2014 11:18 AM, David wrote: >> Thanks Wido, >> >> So during noout data will be degraded but not resynced, which won’t >> interrupt operations ( runni

Re: [ceph-users] Taking down one OSD node (10 OSDs) for maintenance - best practice?

2014-06-19 Thread David
during the recovery process, nothing too disruptive for our workload (since we mostly have high workload during daytime and did this during the night). Kind Regards, David Majchrzak On 19 Jun 2014, at 19:58, Gregory Farnum wrote: > No, you definitely don't need to shut down the whole cluster.
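
For reference, the noout workflow discussed across this thread boils down to roughly the following (my summary of the thread, assuming the standard CLI):

    ceph osd set noout      # stop the cluster marking the node's OSDs out while it is down
    # shut the OSD node down, do the maintenance, boot it back up
    ceph osd unset noout    # re-enable normal out-marking once the OSDs have rejoined
    ceph -s                 # wait for the PGs to return to active+clean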

Re: [ceph-users] How to enable the writeback qemu rbd cache

2014-07-08 Thread David
Do you set cache=writeback in your VM’s qemu conf for that disk? // david On 8 Jul 2014, at 14:34, lijian wrote: > Hello, > > I want to enable the qemu rbd writeback cache, the following are the settings > in /etc/ceph/ceph.conf > [client] > rbd_cache = true > rbd_cache_wri
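
For context, enabling writeback caching usually involves both librbd settings on the hypervisor and cache=writeback on the guest disk; a sketch (the extra ceph.conf key and the libvirt snippet are illustrative, not quoted from the thread):

    # /etc/ceph/ceph.conf on the hypervisor
    [client]
    rbd cache = true
    rbd cache writethrough until flush = true

    # libvirt disk definition for the guest
    <driver name='qemu' type='raw' cache='writeback'/>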

[ceph-users] Possible to schedule deep scrub to nights?

2014-07-18 Thread David
Are there any known workarounds to schedule deep scrubs to run nightly? Latency does go up a little bit when they run, so I’d rather they didn’t affect our daily activities. Kind Regards, David ___ ceph-users mailing list ceph-users@lists.ceph.com

Re: [ceph-users] Possible to schedule deep scrub to nights?

2014-07-20 Thread David
scrub)? Kind Regards, David On 18 Jul 2014, at 20:04, Gregory Farnum wrote: > There's nothing built in to the system but I think some people have > had success with scripts that set nobackfill during the day, and then > trigger them regularly at night. Try searching the list archiv
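
A sketch of the cron-based approach Greg describes, using the deep-scrub flag (the flag name and the times are mine, not the archived script):

    # /etc/cron.d/ceph-scrub-window (hypothetical file)
    # block deep scrubs during business hours, allow them overnight
    0 7  * * *  root  ceph osd set nodeep-scrub
    0 22 * * *  root  ceph osd unset nodeep-scrub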

Re: [ceph-users] Using Crucial MX100 for journals or cache pool

2014-08-01 Thread David
going to run it in production I’d go with the Intel one. Kind Regards, David On 1 Aug 2014, at 10:38, Andrei Mikhailovsky wrote: > Hello guys, > > Was wondering if anyone has tried using the Crucial MX100 ssds either for osd > journals or cache pool? It seems like a good cost effective al

[ceph-users] Huge issues with slow requests

2014-09-04 Thread David
b8e9b3d1b58ba.5c00 [stat,write 2457600~16384] 3.47dbbb97 e13901) v4 currently waiting for subops from [12,29] Kind Regards, David ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] Huge issues with slow requests

2014-09-04 Thread David
to double check SMART and also see if we can offload some of the cluster in any way. If you have any other advice that’d be appreciated :) Thanks for your help! Kind Regards, David On 5 Sep 2014, at 07:30, Martin B Nielsen wrote: > Just echoing what Christian said. > > Also, iirc the &qu

Re: [ceph-users] Huge issues with slow requests

2014-09-05 Thread David
/benchmarking. Kind Regards, David On 5 Sep 2014, at 09:05, Christian Balzer wrote: > > Hello, > > On Fri, 5 Sep 2014 08:26:47 +0200 David wrote: > >> Hi, >> >> Sorry for the lack of information yesterday, this was "solved" after >> some 30 minutes

Re: [ceph-users] Introducing "Learning Ceph" : The First ever Book on Ceph

2015-02-13 Thread David
Thanks, just bought a paperback copy :) Always great to have as a reference, even if Ceph is still evolving quickly. Cheers! On 13 Feb 2015, at 09:43, Karan Singh wrote: > Here is the new link for sample book : > https://www.dropbox.com/s/2zcxawtv4q29fm9/Learning_Ceph_Sample.pdf?dl=0 > > >

[ceph-users] Shutting down a cluster fully and powering it back up

2015-02-28 Thread David
Hypervisors) 5. Run ceph osd unset noout Kind Regards, David ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
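
The snippet only preserves the last step; a commonly used power-down/power-up order looks roughly like this (my summary, not the full list from the original mail):

    # stop client I/O (VMs / hypervisors) first, then:
    ceph osd set noout
    # shut down OSD nodes, then monitor nodes, then power off
    # power-up: monitors first, then OSDs, wait for peering, then:
    ceph osd unset noout
    # finally start the hypervisors / clients again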

[ceph-users] Segfault in libtcmalloc.so.4.2.2

2016-05-13 Thread David
http://tracker.ceph.com/issues/15628 Don’t know how many others are affected by it. We stop and start the OSD to bring it up again but it’s quite annoying. I’m guessing this affects Jewel as well? Kind Regards, David Majchrzak ___ ceph-users mailing list ceph

Re: [ceph-users] Segfault in libtcmalloc.so.4.2.2

2016-05-13 Thread David
lding > latest tcmalloc and try to see if this is happening there. > Ceph is not packaging tcmalloc it is using the tcmalloc available with distro. > > Thanks & Regards > Somnath > > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of David > Se

[ceph-users] openSuse Leap 42.1, slow krbd, max_sectors_kb = 127

2016-05-23 Thread David
Hi All I'm doing some testing with OpenSUSE Leap 42.1, it ships with kernel 4.1.12 but I've also tested with 4.1.24 When I map an image with the kernel RBD client, max_sectors_kb = 127. I'm unable to increase: # echo 4096 > /sys/block/rbd0/queue/max_sectors_kb -bash: echo: write error: Invalid a

[ceph-users] CephFS: slow writes over NFS when fs is mounted with kernel driver but fast with Fuse

2016-05-30 Thread David
Hi All I'm having an issue with slow writes over NFS (v3) when cephfs is mounted with the kernel driver. Writing a single 4K file from the NFS client is taking 3 - 4 seconds, however a 4K write (with sync) into the same folder on the server is fast as you would expect. When mounted with ceph-fuse,

Re: [ceph-users] CephFS: slow writes over NFS when fs is mounted with kernel driver but fast with Fuse

2016-06-03 Thread David
ginal server. Noted on sync vs async, I plan on sticking with sync. On Fri, Jun 3, 2016 at 5:03 AM, Yan, Zheng wrote: > On Mon, May 30, 2016 at 10:29 PM, David wrote: > > Hi All > > > > I'm having an issue with slow writes over NFS (v3) when cephfs is mounted > >

Re: [ceph-users] CephFS in the wild

2016-06-03 Thread David
I'm hoping to implement cephfs in production at some point this year so I'd be interested to hear your progress on this. Have you considered SSD for your metadata pool? You wouldn't need loads of capacity although even with reliable SSD I'd probably still do x3 replication for metadata. I've been

Re: [ceph-users] CephFS in the wild

2016-06-06 Thread David
On Mon, Jun 6, 2016 at 7:06 AM, Christian Balzer wrote: > > Hello, > > On Fri, 3 Jun 2016 15:43:11 +0100 David wrote: > > > I'm hoping to implement cephfs in production at some point this year so > > I'd be interested to hear your progress on this. >

Re: [ceph-users] which CentOS 7 kernel is compatible with jewel?

2016-06-13 Thread David
"rbd ls" does work with 4.6 (just tested with 4.6.1-1.el7.elrepo.x86_64). That's against a 10.2.0 cluster with ceph-common-10.2.0-0 What's the error you're getting? Are you using default rbd pool or specifying pool with '-p'? I'd recommend checking your ceph-common package. Thanks, On Fri, Jun 1

Re: [ceph-users] ceph benchmark

2016-06-16 Thread David
I'm probably misunderstanding the question but if you're getting 3GB/s from your dd, you're already caching. Can you provide some more detail on what you're trying to achieve. On 16 Jun 2016 21:53, "Patrick McGarry" wrote: > Moving this over to ceph-user where it’ll get the eyeballs you need. > >

Re: [ceph-users] Performance Testing

2016-06-17 Thread David
On 17 Jun 2016 3:33 p.m., "Carlos M. Perez" wrote: > > Hi, > > > > I found the following on testing performance - http://tracker.ceph.com/projects/ceph/wiki/Benchmark_Ceph_Cluster_Performance and have a few questions: > > > > - By testing the block device Do the performance tests take t

Re: [ceph-users] cluster ceph -s error

2016-06-18 Thread David
Is this a test cluster that has never been healthy or a working cluster which has just gone unhealthy? Have you changed anything? Are all hosts, drives, network links working? More detail please. Any/all of the following would help: ceph health detail ceph osd stat ceph osd tree Your ceph.conf Yo

Re: [ceph-users] Should I use different pool?

2016-06-27 Thread David
Yes you should definitely create different pools for different HDD types. Another decision you need to make is whether you want dedicated nodes for SSD or want to mix them in the same node. You need to ensure you have sufficient CPU and fat enough network links to get the most out of your SSD's. Y

Re: [ceph-users] OSD Cache

2016-06-28 Thread David
Hi, Please clarify what you mean by "osd cache". Raid controller cache or Ceph's cache tiering feature? On Tue, Jun 28, 2016 at 10:21 AM, Mohd Zainal Abidin Rabani < zai...@nocser.net> wrote: > Hi, > > > > We have using osd on production. SSD as journal. We have test io and show > good result. W

Re: [ceph-users] How many nodes/OSD can fail

2016-06-28 Thread David
Hi, This is probably the min_size on your cephfs data and/or metadata pool. I believe the default is 2, if you have less than 2 replicas available I/O will stop. See: http://docs.ceph.com/docs/master/rados/operations/pools/#set-the-number-of-object-replicas On Tue, Jun 28, 2016 at 10:23 AM, willi
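
For reference, checking and adjusting the replica settings mentioned above (the pool names are the usual CephFS defaults and may differ):

    ceph osd pool get cephfs_data min_size
    ceph osd pool set cephfs_data min_size 1     # allow I/O with a single replica left (riskier)
    ceph osd pool set cephfs_metadata size 3     # number of replicas to keep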

Re: [ceph-users] CEPH Replication

2016-07-01 Thread David
It will work but be aware 2x replication is not a good idea if your data is important. The exception would be if the OSD's are DC class SSD's that you monitor closely. On Fri, Jul 1, 2016 at 1:09 PM, Ashley Merrick wrote: > Hello, > > Perfect, I want to keep on separate node's, so wanted to make

Re: [ceph-users] 40Gb fileserver/NIC suggestions

2016-07-13 Thread David
Aside from the 10GbE vs 40GbE question, if you're planning to export an RBD image over smb/nfs I think you are going to struggle to reach anywhere near 1GB/s in a single threaded read. This is because even with readahead cranked right up you're still only going to be hitting a handful of disks at a ti

[ceph-users] CephFS | Recursive stats not displaying with GNU ls

2016-07-18 Thread David
Hi all Recursive statistics on directories are no longer showing on an ls -l output but getfattr is accurate: # ls -l total 0 drwxr-xr-x 1 root root 3 Jul 18 12:42 dir1 drwxr-xr-x 1 root root 0 Jul 18 12:42 dir2 ]# getfattr -d -m ceph.dir.* dir1 # file: dir1 ceph.dir.entries="3" ceph.dir.files="
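
If it is the kernel client hiding the recursive sizes, the rbytes mount option and the ceph.dir.* xattrs are the two usual ways to get at them; a sketch (monitor address and paths are placeholders):

    # show the recursive size as st_size for directories (kernel client)
    mount -t ceph mon1:6789:/ /mnt/cephfs -o name=admin,secretfile=/etc/ceph/admin.secret,rbytes

    # query the recursive stats directly
    getfattr -n ceph.dir.rbytes dir1
    getfattr -n ceph.dir.rentries dir1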

Re: [ceph-users] CephFS | Recursive stats not displaying with GNU ls

2016-07-18 Thread David
wrote: > Hi, > > Is this disabled because its not a stable feature or just user preference? > > Thanks > > On Mon, Jul 18, 2016 at 2:37 PM, Yan, Zheng wrote: > >> On Mon, Jul 18, 2016 at 9:00 PM, David wrote: >> > Hi all >> > >> > Recursive st

Re: [ceph-users] mon_osd_nearfull_ratio (unchangeable) ?

2016-07-26 Thread David
Try: ceph pg set_nearfull_ratio 0.9 On 26 Jul 2016 08:16, "Goncalo Borges" wrote: > Hello... > > I do not think that these settings are working properly in jewel. Maybe > someone else can confirm. > > So, to summarize: > > 1./ I've restarted mon and osd services (systemctl restart ceph.target)

Re: [ceph-users] 2TB useable - small business - help appreciated

2016-07-30 Thread David
Hi Richard, It would be useful to know what you're currently using for storage as that would help in recommending a strategy. My guess is an all CephFS set up might be best for your use case. I haven't tested this myself but I'd mount CephFS on the OSD nodes with the Fuse client and export over NF
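
A sketch of the setup being suggested (the export options, in particular the fsid needed when re-exporting a FUSE mount over kernel NFS, are my assumptions):

    # mount CephFS with the FUSE client on each node doing the export
    ceph-fuse -m mon1:6789 /mnt/cephfs

    # /etc/exports - kernel NFS re-export of the FUSE mount
    /mnt/cephfs  192.168.0.0/24(rw,sync,no_subtree_check,fsid=101)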

Re: [ceph-users] How to configure OSD heart beat to happen on public network

2016-07-31 Thread David
The purpose of the cluster network is to isolate the heartbeat (and recovery) traffic. I imagine that is why you are struggling to get the heartbeat traffic on the public network. On 27 Jul 2016 8:32 p.m., "Venkata Manojawa Paritala" wrote: > Hi, > > I have configured the below 2 networks in Cep

[ceph-users] Giant to Jewel poor read performance with Rados bench

2016-08-06 Thread David
ol_default_pgp_num = 4096 osd_crush_chooseleaf_type = 1 max_open_files = 131072 mon_clock_drift_allowed = .15 mon_clock_drift_warn_backoff = 30 mon_osd_down_out_interval = 300 mon_osd_report_timeout = 300 mon_osd_full_ratio = .95 mon_osd_nearfull_ratio = .80 osd_backfill_full_

Re: [ceph-users] Giant to Jewel poor read performance with Rados bench

2016-08-07 Thread David
0.355126 Max latency(s): 2.17366 Min latency(s): 0.00641849 I appreciate this may not be a Ceph config issue but any tips on tracking down this issue would be much appreciated. On Sat, Aug 6, 2016 at 9:38 PM, David wrote: > Hi All > > I've just installed Jewel 10.2.

Re: [ceph-users] Recover Data from Deleted RBD Volume

2016-08-08 Thread David
That will be down to the pool the rbd was in, the crush rule for that pool will dictate which osd's store objects. In a standard config that rbd will likely have objects on every osd in your cluster. On 8 Aug 2016 9:51 a.m., "Georgios Dimitrakakis" wrote: > Hi, >> >> >> On 08.08.2016 09:58, Geor

Re: [ceph-users] Recover Data from Deleted RBD Volume

2016-08-08 Thread David
I don't think there's a way of getting the prefix from the cluster at this point. If the deleted image was a similar size to the example you've given, you will likely have had objects on every OSD. If this data is absolutely critical you need to stop your cluster immediately or make copies of all
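
For anyone following along, this is the prefix being referred to; for an image that still exists it can be read with rbd info, which is exactly the information lost once the image has been deleted (pool, image and prefix below are placeholders):

    rbd info rbd/myimage | grep block_name_prefix
    #   block_name_prefix: rbd_data.1a2b3c4d5e6f
    # the data objects are then named rbd_data.1a2b3c4d5e6f.<object number>
    rados -p rbd ls | grep rbd_data.1a2b3c4d5e6f | head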

Re: [ceph-users] Giant to Jewel poor read performance with Rados bench

2016-08-09 Thread David
k Nelson wrote: > Hi David, > > We haven't done any direct giant to jewel comparisons, but I wouldn't > expect a drop that big, even for cached tests. How long are you running > the test for, and how large are the IOs? Did you upgrade anything else at > the same tim

Re: [ceph-users] Recover Data from Deleted RBD Volume

2016-08-09 Thread David
On Mon, Aug 8, 2016 at 9:39 PM, Georgios Dimitrakakis wrote: > Dear David (and all), > > the data are considered very critical therefore all this attempt to > recover them. > > Although the cluster hasn't been fully stopped all users actions have. I > mean services are

[ceph-users] CephFS: cached inodes with active-standby

2016-08-15 Thread David
Hi All When I compare a 'ceph daemon mds.*id* perf dump mds' on my active MDS with my standby-replay MDS, the inodes count on the standby is a lot less than the active. I would expect to see a very similar number of inodes or have I misunderstood this feature? My understanding was the replay daem

Re: [ceph-users] Single-node Ceph & Systemd shutdown

2016-08-20 Thread David
It sounds like the Ceph services are being stopped before it gets to the unmounts. It probably can't unmount the rbd cleanly so shutdown hangs. Btw mounting with the kernel client on an OSD node isn't recommended. On 20 Aug 2016 6:35 p.m., "Marcus" wrote: > For a home server project I've set up a

Re: [ceph-users] Problems getting nfs-ganesha with cephfs backend to work.

2017-07-18 Thread David
You mentioned the Kernel client works but the Fuse mount would be a better test in relation to the Ganesha FSAL. The following config didn't give me the error you describe in 1) but I'm mounting on the client with NFSv4, not sure about 2), is that dm-nfs? EXPORT { Export_ID = 1; Path = "/
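
The EXPORT block is cut off in the archive; a minimal Ceph-FSAL export of the kind being described might look like this (IDs, paths and options are illustrative, not the original config):

    EXPORT {
        Export_ID = 1;
        Path = "/";
        Pseudo = "/cephfs";
        Access_Type = RW;
        Squash = No_Root_Squash;
        Protocols = 4;
        Transports = TCP;
        FSAL {
            Name = CEPH;
        }
    }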

Re: [ceph-users] 答复: How's cephfs going?

2017-07-19 Thread David
On Wed, Jul 19, 2017 at 4:47 AM, 许雪寒 wrote: > Is there anyone else willing to share some usage information of cephfs? > I look after 2 Cephfs deployments, both Jewel, been in production since Jewel went stable so just over a year I think. We've had a really positive experience, I've not experien

Re: [ceph-users] 答复: How's cephfs going?

2017-07-19 Thread David
On Tue, Jul 18, 2017 at 6:54 AM, Blair Bethwaite wrote: > We are a data-intensive university, with an increasingly large fleet > of scientific instruments capturing various types of data (mostly > imaging of one kind or another). That data typically needs to be > stored, protected, managed, share

Re: [ceph-users] 答复: How's cephfs going?

2017-07-20 Thread David
On Wed, Jul 19, 2017 at 7:09 PM, Gregory Farnum wrote: > > > On Wed, Jul 19, 2017 at 10:25 AM David wrote: > >> On Tue, Jul 18, 2017 at 6:54 AM, Blair Bethwaite < >> blair.bethwa...@gmail.com> wrote: >> >>> We are a data-intensive university, with

Re: [ceph-users] Writing data to pools other than filesystem

2017-07-20 Thread David
I think the multiple namespace feature would be more appropriate for your use case. So that would be multiple file systems within the same pools rather than multiple pools in a single filesystem. With that said, that might be overkill for your requirement. You might be able to achieve what you nee

Re: [ceph-users] Ceph MDS Q Size troubleshooting

2017-07-20 Thread David
Hi James On Tue, Jul 18, 2017 at 8:07 AM, James Wilkins wrote: > Hello list, > > I'm looking for some more information relating to CephFS and the 'Q' size, > specifically how to diagnose what contributes towards it rising up > > Ceph Version: 11.2.0.0 > OS: CentOS 7 > Kernel (Ceph Servers): 3.10

Re: [ceph-users] Writing data to pools other than filesystem

2017-07-20 Thread David
have other pools > "To prevent clients from writing or reading data to pools other than those > in use for CephFS, set an OSD authentication capability that restricts > access to the CephFS data pool(s)." > > THX > > > > 20. Juli 2017 14:00, "David"

Re: [ceph-users] oVirt/RHEV and Ceph

2017-07-25 Thread David
My understanding was Cinder is needed to create/delete/manage etc. on volumes but I/O to the volumes is direct from the hypervisors. In theory you could lose your Cinder service and VMs would stay up. On 25 Jul 2017 4:18 a.m., "Brady Deetz" wrote: Thanks for pointing to some documentation. I'd s

Re: [ceph-users] Bad IO performance CephFS vs. NFS for block size 4k/128k

2017-09-04 Thread David
On Mon, Sep 4, 2017 at 4:27 PM, wrote: > Hello! > > I'm validating IO performance of CephFS vs. NFS. > > Therefore I have mounted the relevant filesystems on the same client. > Then I start fio with the following parameters: > action = randwrite randrw > blocksize = 4k 128k 8m > rwmixread = 7

[ceph-users] debian-hammer wheezy Packages file incomplete?

2017-09-12 Thread David
https://download.ceph.com/debian-hammer/pool/main/c/ceph/ Is it a known issue or rather a "feature" =D Kind Regards, David Majchrzak ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] debian-hammer wheezy Packages file incomplete?

2017-09-13 Thread David
in the archive like debian-dumpling and debian-firefly? > On 13 Sep 2017, at 03:09, David wrote: > > Hi! > > Noticed tonight during maintenance that the hammer repo for debian wheezy > only has 2 packages listed in the Packages file. > Thought perhaps it's being moved to a

[ceph-users] Jewel -> Luminous upgrade, package install stopped all daemons

2017-09-13 Thread David
Hi All I did a Jewel -> Luminous upgrade on my dev cluster and it went very smoothly. I've attempted to upgrade on a small production cluster but I've hit a snag. After installing the ceph 12.2.0 packages with "yum install ceph" on the first node and accepting all the dependencies, I found that

Re: [ceph-users] Jewel -> Luminous upgrade, package install stopped all daemons

2017-09-15 Thread David
Hi David I like your thinking! Thanks for the suggestion. I've got a maintenance window later to finish the update so will give it a try. On Thu, Sep 14, 2017 at 6:24 PM, David Turner wrote: > This isn't a great solution, but something you could try. If you stop all > of

Re: [ceph-users] Jewel -> Luminous upgrade, package install stopped all daemons

2017-09-15 Thread David
Happy to report I got everything up to Luminous, used your tip to keep the OSDs running, David, thanks again for that. I'd say this is a potential gotcha for people collocating MONs. It appears that if you're running selinux, even in permissive mode, upgrading the ceph-selinux package

[ceph-users] CephFS Luminous | MDS frequent "replicating dir" message in log

2017-09-25 Thread David
rnel clients: 3.10.0-514.2.2.el7.x86_64 Thanks, David ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] Updating ceps client - what will happen to services like NFS on clients

2017-09-25 Thread David
Hi Götz If you did a rolling upgrade, RBD clients shouldn't have experienced interrupted IO and therefore IO to NFS exports shouldn't have been affected. However, in the past when using kernel NFS over kernel RBD, I did have some lockups when OSDs went down in the cluster so that's something to wat

Re: [ceph-users] nfs-ganesha / cephfs issues

2017-10-01 Thread David
Cephfs does have repair tools but I wouldn't jump the gun, your metadata pool is probably fine. Unless you're getting health errors or seeing errors in your MDS log? Are you exporting a fuse or kernel mount with Ganesha (i.e using the vfs FSAL) or using the Ceph FSAL? Have you tried any tests dire

Re: [ceph-users] Ceph monitoring

2017-10-02 Thread David
If you take Ceph out of your search string you should find loads of tutorials on setting up the popular collectd/influxdb/grafana stack. Once you've got that in place, the Ceph bit should be fairly easy. There's Ceph collectd plugins out there or you could write your own. On Mon, Oct 2, 2017 at

Re: [ceph-users] Calamari ( what a nightmare !!! )

2017-12-11 Thread David
le new cluster without needing to know how to operate it ;) https://croit.io/ The latter isn't open sourced yet as far as I know. Kind Regards, David > On 12 Dec 2017, at 02:18, DHD.KOHA wrote: > > Hello list, > > Newbie here, > > After managing t

Re: [ceph-users] Linux Meltdown (KPTI) fix and how it affects performance?

2018-01-05 Thread David
Hi! nopti or pti=off in the kernel options should disable some of the KPTI. I haven't tried it yet though, so give it a whirl. https://en.wikipedia.org/wiki/Kernel_page-table_isolation Kind Regards, David Majchrzak > 5 ja
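
A sketch of how that kernel option would be applied (GRUB paths vary by distro, and disabling KPTI removes the Meltdown mitigation, so treat this as a test-only change):

    # /etc/default/grub
    GRUB_CMDLINE_LINUX="... nopti"

    # regenerate the config and reboot, e.g. on EL7:
    grub2-mkconfig -o /boot/grub2/grub.cfg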

[ceph-users] Migrating filestore to bluestore using ceph-volume

2018-01-26 Thread David
ost int2 1 ssd 0.43159 osd.1up 1.0 1.0 4 ssd 0.43660 osd.4up 1.0 1.0 -4 0.86819 host int3 2 ssd 0.43159 osd.2up 1.0 1.0 5 ssd 0.43660 osd.5up 1.0 1.0 What's the best course of
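
The usual per-OSD replacement cycle for a filestore-to-bluestore migration looks roughly like this (device path and OSD id are placeholders; this assumes a recent Luminous ceph-volume):

    ceph osd out 1
    # either wait for the data to rebalance off osd.1, or rely on the remaining replicas
    systemctl stop ceph-osd@1
    ceph osd destroy 1 --yes-i-really-mean-it
    ceph-volume lvm zap /dev/sdb --destroy
    ceph-volume lvm create --bluestore --data /dev/sdb --osd-id 1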

[ceph-users] CephFS: caps went stale, renewing

2016-09-02 Thread David
health is OK and the MDS server seems pretty happy although I am occasionally seeing some "closing stale session" lines in the ceph-mds log but I think that's a separate issue. Thanks, David ___ ceph-users mailing list ceph-users@l

Re: [ceph-users] CephFS: caps went stale, renewing

2016-09-03 Thread David
g the stale caps errors. Sorry for the noise. Thanks, On Sat, Sep 3, 2016 at 4:14 AM, Yan, Zheng wrote: > On Sat, Sep 3, 2016 at 1:35 AM, Gregory Farnum wrote: > > On Fri, Sep 2, 2016 at 2:58 AM, David wrote: > >> Hi All > >> > >> Kernel client: 4.6.4-1.e

Re: [ceph-users] Raw data size used seems incorrect (version Jewel, 10.2.2)

2016-09-07 Thread David
Could be related to this? http://tracker.ceph.com/issues/13844 On Wed, Sep 7, 2016 at 7:40 AM, james wrote: > Hi, > > Not sure if anyone can help clarify or provide any suggestion on how to > troubleshoot this > > We have a ceph cluster recently build up with ceph version Jewel, 10.2.2. > Based

Re: [ceph-users] NFS gateway

2016-09-07 Thread David
I have clients accessing CephFS over nfs (kernel nfs). I was seeing slow writes with sync exports. I haven't had a chance to investigate and in the meantime I'm exporting with async (not recommended, but acceptable in my environment). I've been meaning to test out Ganesha for a while now @Sean, h

Re: [ceph-users] Cannot start the Ceph daemons using upstart after upgrading to Jewel 10.2.2

2016-09-08 Thread David
Afaik, the daemons are managed by systemd now on most distros e.g: systemctl start ceph-osd@0.service On Thu, Sep 8, 2016 at 3:36 PM, Simion Marius Rad wrote: > Hello, > > Today I upgraded an Infernalis 9.2.1 cluster to Jewel 10.2.2. > All went well until I wanted to restart the daemons using

Re: [ceph-users] I/O freeze while a single node is down.

2016-09-13 Thread David
What froze? Kernel RBD? Librbd? CephFS? Ceph version? On Tue, Sep 13, 2016 at 11:24 AM, Daznis wrote: > Hello, > > > I have encountered a strange I/O freeze while rebooting one OSD node > for maintenance purpose. It was one of the 3 Nodes in the entire > cluster. Before this rebooting or shutti

Re: [ceph-users] Full OSD halting a cluster - isn't this violating the "no single point of failure" promise?

2016-09-19 Thread David
Ceph is pretty awesome but I'm not sure it can be expected to keep I/O going if there is no available capacity. Granted, the OSDs aren't always balanced evenly but generally if you've got one drive hitting full ratio, you've probably got a lot more not far behind. Although probably not recommended,

[ceph-users] Jewel Docs | error on mount.ceph page

2016-09-20 Thread David
Sorry I don't know the correct way to report this. Potential error on this page: on http://docs.ceph.com/docs/jewel/man/8/mount.ceph/ Currently: rsize int (bytes), max readahead, multiple of 1024, Default: 524288 (512*1024) Should it be something like the following? rsize int (bytes), max rea
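
For comparison, both options as they can be passed to the kernel client; rasize is the readahead knob, which appears to be what the documented description actually refers to (values below are only examples):

    # rsize = maximum read request size, rasize = readahead size (bytes)
    mount -t ceph mon1:6789:/ /mnt/cephfs \
        -o name=admin,secretfile=/etc/ceph/admin.secret,rsize=16777216,rasize=67108864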

Re: [ceph-users] CephFS metadata pool size

2016-09-26 Thread David
Ryan, a team at eBay recently did some metadata testing, have a search on this list. Pretty sure they found there wasn't a huge benefit in putting the metadata pool on solid state. As Christian says, it's all about RAM and CPU. You want to get as many inodes into cache as possible. On 26 Sep 2016 2:09 a

Re: [ceph-users] Ceph Very Small Cluster

2016-09-29 Thread David
Ranjan, If you unmount the file system on both nodes and then gracefully stop the Ceph services (or even yank the network cable for that node), what state is your cluster in? Are you able to do a basic rados bench write and read? How are you mounting CephFS, through the Kernel or Fuse client? Hav

Re: [ceph-users] New OSD Nodes, pgs haven't changed state

2016-10-10 Thread David
Can you provide a 'ceph health detail' On 9 Oct 2016 3:56 p.m., "Mike Jacobacci" wrote: Hi, Yesterday morning I added two more OSD nodes and changed the crushmap from disk to node. It looked to me like everything went ok besides some disks missing that I can re-add later, but the cluster status

[ceph-users] crush map has straw_calc_version=0

2018-06-24 Thread David
new pool on the same OSDs (not sure that's in Mimic yet though?) Kind Regards, David Majchrzak > Moving to straw_calc_version 1 and then adjusting a straw bucket (by adding, > removing, or reweighting an item, or by using the reweight-all command) can > trigger a small to moderate

Re: [ceph-users] Sharing SSD journals and SSD drive choice

2017-04-25 Thread David
On 19 Apr 2017 18:01, "Adam Carheden" wrote: Does anyone know if XFS uses a single thread to write to its journal? You probably know this but just to avoid any confusion, the journal in this context isn't the metadata journaling in XFS, it's a separate journal written to by the OSD daemons I

Re: [ceph-users] CEPH backup strategy and best practices

2017-06-04 Thread David
> On 4 June 2017, at 23:23, Roger Brown wrote: > > I'm a n00b myself, but I'll go on record with my understanding. > > On Sun, Jun 4, 2017 at 3:03 PM Benoit GEORGELIN - yulPa > mailto:benoit.george...@yulpa.io>> wrote: > Hi ceph users, > > Ceph has very good documentation about technical usa

[ceph-users] CephFS | flapping OSD locked up NFS

2017-06-19 Thread David
with a NFS server exporting a local file system. Also, NFS performance hasn't been great with small reads/writes, particularly writes with the default sync export option, I've had to export with async for the time-being. I haven't had a chance to troubleshoot this in any depth yet

Re: [ceph-users] CephFS | flapping OSD locked up NFS

2017-06-20 Thread David
ite so there is an expectation people will want to use it. Thanks, David On 19 Jun 2017 3:56 p.m., "John Petrini" wrote: > Hi David, > > While I have no personal experience with this; from what I've been told, > if you're going to export cephfs over

[ceph-users] Again - state of Ceph NVMe and SSDs

2016-01-16 Thread David
RAM on it? :D Kind Regards, David Majchrzak___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] Again - state of Ceph NVMe and SSDs

2016-01-17 Thread David
Thanks Wido, those are good pointers indeed :) So we just have to make sure the backend storage (SSD/NVMe journals) won’t be saturated (or the controllers) and then go with as many RBDs per VM as possible. Kind Regards, David Majchrzak On 16 Jan 2016, at 22:26, Wido den Hollander wrote: > On 01

Re: [ceph-users] Again - state of Ceph NVMe and SSDs

2016-01-17 Thread David
That is indeed great news! :) Thanks for the heads up. Kind Regards, David Majchrzak On 17 Jan 2016, at 21:34, Tyler Bishop wrote: > The changes you are looking for are coming from Sandisk in the ceph "Jewel" > release coming up. > > Based on benchmarks and testi

[ceph-users] Ceph and NFS

2016-01-18 Thread david
Hello All. Does anyone provide Ceph rbd/rgw/cephfs through NFS? I have a requirement for a Ceph cluster which needs to provide NFS service. ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-cep

Re: [ceph-users] Ceph and NFS

2016-01-18 Thread david
Hi, Is CephFS stable enough to deploy in production environments? And have you compared the performance of nfs-ganesha against a standard kernel-based NFSd, both backed by CephFS? > On Jan 18, 2016, at 20:34, Burkhard Linke > wrote: > Hi, > > On 18.01.2016

Re: [ceph-users] Ceph and NFS

2016-01-18 Thread david
Hi, Thanks for your answer. Is CephFS stable enough to deploy in production environments? And have you compared the performance of nfs-ganesha against a standard kernel-based NFSd, both backed by CephFS? ___ ceph-users mailing list ceph-users@l

Re: [ceph-users] Upgrading Ceph

2016-02-01 Thread david
Hi Vlad, I just upgraded my Ceph cluster from firefly to hammer and everything went fine. Please do it according to the manuals on www.ceph.com and restart the monitors first, then the OSDs. I restarted OSDs one by one, which means restarting one OSD, waiting for it to run normally, and then r
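
A rough outline of the rolling restart being described (init commands depend on the distro and release; the firefly/hammer era typically used sysvinit or upstart rather than systemd):

    # on each monitor node, after upgrading the packages:
    service ceph restart mon.<mon-id>
    # then on each OSD node, one OSD at a time, waiting for HEALTH_OK in between:
    service ceph restart osd.<n>
    ceph -s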

Re: [ceph-users] Ceph and Google Summer of Code

2016-02-29 Thread David
Great idea! +1 David Majchrzak > On 29 Feb 2016, at 22:53, Wido den Hollander wrote: > > A long wanted feature is mail storage in RADOS: > http://tracker.ceph.com/issues/12430 > > Would that be a good idea? I'd be more than happy to mentor this one. > > I will

Re: [ceph-users] DSS 7000 for large scale object storage

2016-03-21 Thread David
Sounds like you’ll have a field day waiting for rebuilds in case of a node failure or an upgrade of the crush map ;) David > On 21 Mar 2016, at 09:55, Bastian Rosner wrote: > > Hi, > > any chance that somebody here already got hands on Dell DSS 7000 machines? > > 4U chass

Re: [ceph-users] DSS 7000 for large scale object storage

2016-03-21 Thread David
would take approx. 16-17 hours. Usually it takes 2x or 3x longer in a normal case, or if your controllers or network are limited. // david > On 21 Mar 2016, at 13:13, Bastian Rosner wrote: > > Yes, rebuild in case of a whole chassis failure is indeed an issue. That > depend

[ceph-users] What's the best practice for Erasure Coding

2019-07-07 Thread David
or clay) should I adopt, and how should I choose the combination of (k,m) (e.g. (k=3,m=2), (k=6,m=3))? Can anyone share some experience? Thanks for any help. Regards, David ___ ceph-users mailing list ceph-users@lists.ceph.com http://li
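
For anyone searching the archive later, profiles and pools are created roughly like this (the profile name, k/m values and PG counts are only examples):

    ceph osd erasure-code-profile set myprofile k=6 m=3 crush-failure-domain=host
    ceph osd erasure-code-profile get myprofile
    ceph osd pool create ecpool 128 128 erasure myprofile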

[ceph-users] Multi-MDS setup, one MDS stuck in resolve, 3 stuck in standby, can't make another MDS come live

2014-06-05 Thread David Jericho
Hi all, I did a bit of an experiment with multi-mds on firefly, and it worked fine until one of the MDS crashed when rebalancing. It's not the end of the world, and I could just start fresh with the cluster, but I'm keen to see if this can be fixed as running multi-mds is something I would like

Re: [ceph-users] Multi-MDS setup, one MDS stuck in resolve, 3 stuck in standby, can't make another MDS come live

2014-06-05 Thread David Jericho
> -Original Message- > From: Yan, Zheng [mailto:uker...@gmail.com] > looks like you removed mds.0 from the failed list. I don't think there is a > command to add mds the failed back. maybe you can use 'ceph mds setmap > ...' . From memory, I probably did, misunderstanding how it worked.

[ceph-users] Swift API Authentication Failure

2014-06-06 Thread David Curtiss
e output of "radosgw-admin user info --uid=hive_cache": http://pastebin.com/vwwbyd4c And here's my curl invocation: http://pastebin.com/EfQ8nw8a Any ideas on what might be wrong? Thanks, David ___ ceph-users mailing list ceph-users@l

Re: [ceph-users] PG Selection Criteria for Deep-Scrub

2014-06-11 Thread David Zafman
eater than the osd_deep_scrub_interval, there won't be a deep scrub until osd_scrub_max_interval has elapsed. Please check the 3 interval config values. Verify that your PGs are active+clean just to be sure. David On May 20, 2014, at 5:21 PM, Mike Dawson wrote: > Today I noticed that
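
The three interval settings being referred to, as they would appear in ceph.conf (the values shown are the usual defaults of that era, to the best of my knowledge):

    [osd]
    osd scrub min interval = 86400       # 1 day
    osd scrub max interval = 604800      # 1 week
    osd deep scrub interval = 604800     # 1 week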

Re: [ceph-users] Swift API Authentication Failure

2014-06-11 Thread David Curtiss
ls hive_cache:swift2 hive_cache:swift So everything looked good, as far as I can tell, but I still can't authenticate with the first subuser. (But at least the second one still works.) - David On Wed, Jun 11, 2014 at 5:38 PM, Yehuda Sadeh wrote: > (resending also to list) > Right. So Bas

Re: [ceph-users] What exactly is the kernel rbd on osd issue?

2014-06-12 Thread David Zafman
the point of view of the host kernel, this won’t happen. David Zafman Senior Developer http://www.inktank.com http://www.redhat.com On Jun 12, 2014, at 6:33 PM, lists+c...@deksai.com wrote: > I remember reading somewhere that the kernel ceph clients (rbd/fs) could > not run on the same h
