Re: [ceph-users] OSD respawning -- FAILED assert(clone_size.count(clone))

2015-09-02 Thread Chris Taylor
I removed the latest OSD that was respawning (osd.23) and now I am having the same problem with osd.30. It looks like they both have pg 3.f9 in common. I tried "ceph pg repair 3.f9" but the OSD is still respawning. Does anyone have any ideas? Thanks, Chris ceph-osd-03:ceph-osd.30.log -29> 20
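For reference, one way to confirm which placement groups two OSDs have in common and to re-trigger a scrub/repair is sketched below. It assumes standard Hammer-era CLI commands (ceph pg ls-by-osd, ceph pg map, ceph pg repair); the OSD ids and pg 3.f9 are taken from the report above.

  # PGs hosted by each suspect OSD, then the intersection
  ceph pg ls-by-osd osd.23 | awk '/^[0-9]/ {print $1}' | sort > /tmp/osd23.pgs
  ceph pg ls-by-osd osd.30 | awk '/^[0-9]/ {print $1}' | sort > /tmp/osd30.pgs
  comm -12 /tmp/osd23.pgs /tmp/osd30.pgs

  # For the suspect PG, check its current mapping and re-issue a scrub/repair
  ceph pg map 3.f9
  ceph pg deep-scrub 3.f9
  ceph pg repair 3.f9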

[ceph-users] ESXi/LIO/RBD repeatable problem, hang when cloning VM

2015-09-02 Thread Alex Gorbachev
We have experienced a repeatable issue when performing the following: Ceph backend with no issues, we can repeat any time at will in lab and production. Cloning an ESXi VM to another VM on the same datastore on which the original VM resides. Practically instantly, the LIO machine becomes unrespon

Re: [ceph-users] libvirt rbd issue

2015-09-02 Thread Rafael Lopez
Hi Jan, Thanks for the advice, you hit the nail on the head. I checked the limits and watched the number of fds, and as it reached the soft limit (1024) that's when the transfer came to a grinding halt and the VM started locking up. After your reply I also did some more googling and found another old th

Re: [ceph-users] osds on 2 nodes vs. on one node

2015-09-02 Thread Christian Balzer
Hello, On Wed, 2 Sep 2015 22:38:12 + Deneau, Tom wrote: > In a small cluster I have 2 OSD nodes with identical hardware, each with > 6 osds. > > * Configuration 1: I shut down the osds on one node so I am using 6 > OSDS on a single node > Shut down how? Just a "service blah stop" or actual

[ceph-users] osds on 2 nodes vs. on one node

2015-09-02 Thread Deneau, Tom
In a small cluster I have 2 OSD nodes with identical hardware, each with 6 osds. * Configuration 1: I shut down the osds on one node so I am using 6 OSDS on a single node * Configuration 2: I shut down 3 osds on each node so now I have 6 total OSDS but 3 on each node. I measure read performa

Re: [ceph-users] Ceph SSD CPU Frequency Benchmarks

2015-09-02 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 Changing to the acpi_idle driver dropped the performance by about 50%. That was an unexpected result. I'm having issues with powertop and the userspace governor, it always shows 100% idle. I downloaded the latest version with the same result. Still

Re: [ceph-users] ceph-deploy: too many argument: --setgroup 10

2015-09-02 Thread Travis Rhoden
Hi Noah, What is the ownership on /var/lib/ceph ? ceph-deploy should only be trying to use --setgroup if /var/lib/ceph is owned by non-root. On a fresh install of Hammer, this should be root:root. The --setgroup flag was added to ceph-deploy in 1.5.26. - Travis On Wed, Sep 2, 2015 at 1:59 PM
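As a quick check of that suggestion, the ownership can be verified directly on the node; a minimal sketch (the chown reset is only appropriate on releases where the daemons still run as root):

  # ceph-deploy only passes --setgroup when /var/lib/ceph is not owned by root
  ls -ld /var/lib/ceph

  # On a fresh Hammer install this should be root:root; resetting it is one way out
  chown root:root /var/lib/ceph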

[ceph-users] rebalancing taking very long time

2015-09-02 Thread Bob Ababurko
When I lose a disk OR replace an OSD in my POC ceph cluster, it takes a very long time to rebalance. I should note that my cluster is slightly unique in that I am using cephfs (shouldn't matter?) and it currently contains about 310 million objects. The last time I replaced a disk/OSD was 2.5 days a
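For context, backfill speed on Hammer is throttled by a handful of OSD options; the sketch below shows how they are commonly inspected and raised temporarily (the values are only illustrative, and raising them trades client latency for recovery speed):

  # Inspect the current throttles via the admin socket on an OSD host
  ceph daemon osd.0 config show | egrep 'osd_max_backfills|osd_recovery_max_active'

  # Temporarily raise them cluster-wide; revert once the rebalance finishes
  ceph tell osd.* injectargs '--osd-max-backfills 4 --osd-recovery-max-active 8'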

Re: [ceph-users] Ceph read / write : Terrible performance

2015-09-02 Thread Vickey Singh
Thank you Mark, please see my response below. On Wed, Sep 2, 2015 at 5:23 PM, Mark Nelson wrote: > On 09/02/2015 08:51 AM, Vickey Singh wrote: > >> Hello Ceph Experts >> >> I have a strange problem: when I am reading from or writing to a Ceph pool, >> it is not performing consistently. Please notice the Cur MB/s

Re: [ceph-users] testing a crush rule against an out osd

2015-09-02 Thread Dan van der Ster
On Wed, Sep 2, 2015 at 7:23 PM, Sage Weil wrote: > On Wed, 2 Sep 2015, Dan van der Ster wrote: >> On Wed, Sep 2, 2015 at 4:23 PM, Sage Weil wrote: >> > On Wed, 2 Sep 2015, Dan van der Ster wrote: >> >> On Wed, Sep 2, 2015 at 4:11 PM, Sage Weil wrote: >> >> > On Wed, 2 Sep 2015, Dan van der Ster

Re: [ceph-users] Corruption of file systems on RBD images

2015-09-02 Thread Lionel Bouton
On 02/09/2015 18:16, Mathieu GAUTHIER-LAFAYE wrote: > Hi Lionel, > > - Original Message - >> From: "Lionel Bouton" >> To: "Mathieu GAUTHIER-LAFAYE" , >> ceph-us...@ceph.com >> Sent: Wednesday, 2 September, 2015 4:40:26 PM >> Subject: Re: [ceph-users] Corruption of file systems on RBD i

Re: [ceph-users] testing a crush rule against an out osd

2015-09-02 Thread Sage Weil
On Wed, 2 Sep 2015, Dan van der Ster wrote: > On Wed, Sep 2, 2015 at 4:23 PM, Sage Weil wrote: > > On Wed, 2 Sep 2015, Dan van der Ster wrote: > >> On Wed, Sep 2, 2015 at 4:11 PM, Sage Weil wrote: > >> > On Wed, 2 Sep 2015, Dan van der Ster wrote: > >> >> ... > >> >> Normally I use crushtool --te

[ceph-users] Ask Sage Anything!

2015-09-02 Thread Patrick McGarry
Hey cephers, While I'm sure that most of you probably get your Ceph-related questions answered here on the mailing lists, Sage is doing an "Ask me anything" on Reddit in about an hour: https://www.reddit.com/r/IAmA/comments/3jdnnd/i_am_sage_weil_lead_architect_and_cocreator_of/ You can ask him q

Re: [ceph-users] Corruption of file systems on RBD images

2015-09-02 Thread Mathieu GAUTHIER-LAFAYE
Hi Lionel, - Original Message - > From: "Lionel Bouton" > To: "Mathieu GAUTHIER-LAFAYE" , > ceph-us...@ceph.com > Sent: Wednesday, 2 September, 2015 4:40:26 PM > Subject: Re: [ceph-users] Corruption of file systems on RBD images > > Hi Mathieu, > > On 02/09/2015 14:10, Mathieu GAUTHIER

[ceph-users] Strange logging behaviour for ceph

2015-09-02 Thread J-P Methot
Hi, We're using Ceph Hammer 0.94.1 on CentOS 7. On the monitor, when we set log_to_syslog = true, Ceph starts sending logs to stdout. I thought at first it might be rsyslog that is wrongly configured, but I did not find a rule that could explain this behavior. Can anybody else replicate this? If
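One way to see which logging options the running monitor has actually picked up is the admin socket; a minimal sketch, assuming the mon name matches the short hostname:

  ceph daemon mon.$(hostname -s) config show | egrep 'log_to_syslog|err_to_syslog|log_to_stderr|log_file'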

Re: [ceph-users] cephfs read-only setting doesn't work?

2015-09-02 Thread Erming Pei
On 9/2/15, 9:31 AM, Gregory Farnum wrote: [ Re-adding the list. ] On Wed, Sep 2, 2015 at 4:29 PM, Erming Pei wrote: Hi Gregory, Thanks very much for the confirmation and explanation. And I presume you have an MDS cap in there as well? Is there a difference between setting this cap and w

Re: [ceph-users] Ceph SSD CPU Frequency Benchmarks

2015-09-02 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 Thanks for the responses. I forgot to include the fio test for completeness: 8 job QD=8 [ext4-test] runtime=150 name=ext4-test readwrite=randrw size=15G blocksize=4k ioengine=sync iodepth=8 numjobs=8 thread group_reporting time_based direct=1 1 j
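For reference, the flattened parameters quoted above roughly correspond to a job file like the following reconstruction (an approximation, saved e.g. as ext4-test.fio and run with "fio ext4-test.fio"):

  [ext4-test]
  name=ext4-test
  runtime=150
  readwrite=randrw
  size=15G
  blocksize=4k
  ioengine=sync
  iodepth=8
  numjobs=8
  thread
  group_reporting
  time_based
  direct=1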

[ceph-users] Ceph new mon deploy v9.0.3-1355

2015-09-02 Thread German Anders
Hi cephers, trying to deploy a new ceph cluster with the master release (v9.0.3), and when trying to create the initial mons an error appears saying that "admin_socket: exception getting command descriptions: [Errno 2] No such file or directory", here is the log: ... [ceph_deploy.mon][INFO ] distro
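That error usually means ceph-deploy could not reach the monitor's admin socket, so a first check is whether the mon daemon actually started and created its socket; a sketch, assuming the socket follows the usual ceph-mon.<host>.asok naming:

  # On the new monitor host
  ls -l /var/run/ceph/
  ceph --admin-daemon /var/run/ceph/ceph-mon.$(hostname -s).asok mon_status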

Re: [ceph-users] testing a crush rule against an out osd

2015-09-02 Thread Sage Weil
On Wed, 2 Sep 2015, Dan van der Ster wrote: > On Wed, Sep 2, 2015 at 4:23 PM, Sage Weil wrote: > > On Wed, 2 Sep 2015, Dan van der Ster wrote: > >> On Wed, Sep 2, 2015 at 4:11 PM, Sage Weil wrote: > >> > On Wed, 2 Sep 2015, Dan van der Ster wrote: > >> >> ... > >> >> Normally I use crushtool --te

Re: [ceph-users] cephfs read-only setting doesn't work?

2015-09-02 Thread Gregory Farnum
[ Re-adding the list. ] On Wed, Sep 2, 2015 at 4:29 PM, Erming Pei wrote: > Hi Gregory, > >Thanks very much for the confirmation and explanation. > >>And I presume you have an MDS cap in there as well? > Is there a difference between setting this cap and not setting it? Well, I don't think yo

Re: [ceph-users] Is Ceph appropriate for small installations?

2015-09-02 Thread Marcin Przyczyna
On 09/02/2015 02:31 PM, Janusz Borkowski wrote: > Hi! > > Do you have replication factor 2? yes.

Re: [ceph-users] testing a crush rule against an out osd

2015-09-02 Thread Dan van der Ster
On Wed, Sep 2, 2015 at 4:23 PM, Sage Weil wrote: > On Wed, 2 Sep 2015, Dan van der Ster wrote: >> On Wed, Sep 2, 2015 at 4:11 PM, Sage Weil wrote: >> > On Wed, 2 Sep 2015, Dan van der Ster wrote: >> >> ... >> >> Normally I use crushtool --test --show-mappings to test rules, but >> >> AFAICT it do

Re: [ceph-users] Corruption of file systems on RBD images

2015-09-02 Thread Lionel Bouton
Hi Mathieu, On 02/09/2015 14:10, Mathieu GAUTHIER-LAFAYE wrote: > Hi All, > > We regularly have trouble with virtual machines using RBD storage. When > we restart some virtual machines, they start to do some filesystem checks. > Sometimes the check can rescue them, sometimes the virtual machine d

Re: [ceph-users] Ceph read / write : Terrible performance

2015-09-02 Thread Mark Nelson
On 09/02/2015 08:51 AM, Vickey Singh wrote: Hello Ceph Experts I have a strange problem: when I am reading from or writing to a Ceph pool, it is not performing consistently. Please notice the Cur MB/s, which keeps going up and down --- Ceph Hammer 0.94.2 -- CentOS 6, 2.6 -- Ceph cluster is healthy You might find t

Re: [ceph-users] testing a crush rule against an out osd

2015-09-02 Thread Sage Weil
On Wed, 2 Sep 2015, Dan van der Ster wrote: > On Wed, Sep 2, 2015 at 4:11 PM, Sage Weil wrote: > > On Wed, 2 Sep 2015, Dan van der Ster wrote: > >> ... > >> Normally I use crushtool --test --show-mappings to test rules, but > >> AFAICT it doesn't let you simulate an out osd, i.e. with reweight = 0

Re: [ceph-users] testing a crush rule against an out osd

2015-09-02 Thread Dan van der Ster
On Wed, Sep 2, 2015 at 4:11 PM, Sage Weil wrote: > On Wed, 2 Sep 2015, Dan van der Ster wrote: >> ... >> Normally I use crushtool --test --show-mappings to test rules, but >> AFAICT it doesn't let you simulate an out osd, i.e. with reweight = 0. >> Any ideas how to test this situation without uplo

Re: [ceph-users] testing a crush rule against an out osd

2015-09-02 Thread Sage Weil
On Wed, 2 Sep 2015, Dan van der Ster wrote: > Hi all, > > We just ran into a small problem where some PGs wouldn't backfill > after an OSD was marked out. Here's the relevant crush rule; being a > non-trivial example I'd like to test different permutations of the > crush map (e.g. increasing choos

Re: [ceph-users] [sepia] debian jessie repository ?

2015-09-02 Thread Alfredo Deza
As of yesterday we are ready to start providing Debian Jessie packages. They will be present by default for the upcoming Ceph release (Infernalis). For other releases (e.g. Firefly, Hammer, Giant), it means there will be Jessie packages only for new versions. Let me know if you
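Once the Jessie suite is actually published (an assumption worth checking against download.ceph.com first), adding the repo would typically look like this sketch:

  wget -q -O- 'https://download.ceph.com/keys/release.asc' | apt-key add -
  echo deb http://download.ceph.com/debian-hammer/ jessie main > /etc/apt/sources.list.d/ceph.list
  apt-get update && apt-get install ceph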

[ceph-users] testing a crush rule against an out osd

2015-09-02 Thread Dan van der Ster
Hi all, We just ran into a small problem where some PGs wouldn't backfill after an OSD was marked out. Here's the relevant crush rule; being a non-trivial example I'd like to test different permutations of the crush map (e.g. increasing choose_total_tries): rule critical { ruleset 4
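For readers following along, crushtool --test accepts per-device weight overrides, which is one way to approximate an "out" (reweight 0) OSD offline; a minimal sketch, with the rule number, replica count and device id purely illustrative:

  # Fetch and decompile the live crush map, tweak it, recompile
  ceph osd getcrushmap -o crushmap.bin
  crushtool -d crushmap.bin -o crushmap.txt     # e.g. raise choose_total_tries here
  crushtool -c crushmap.txt -o crushmap.new

  # Test the rule; --weight <devno> 0 simulates that OSD being out
  crushtool -i crushmap.new --test --show-mappings --rule 4 --num-rep 3 --weight 12 0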

[ceph-users] Ceph read / write : Terrible performance

2015-09-02 Thread Vickey Singh
Hello Ceph Experts I have a strange problem: when I am reading from or writing to a Ceph pool, it is not performing consistently. Please notice the Cur MB/s, which keeps going up and down --- Ceph Hammer 0.94.2 -- CentOS 6, 2.6 -- Ceph cluster is healthy One interesting thing is that whenever I start the rados bench comman
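For reference, a typical rados bench run of the sort being discussed looks like the sketch below (pool name and duration are illustrative):

  # Write test, keeping the objects so a read test can follow
  rados bench -p testpool 60 write --no-cleanup
  rados bench -p testpool 60 seq
  # Remove the benchmark objects afterwards
  rados -p testpool cleanup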

Re: [ceph-users] CephFS with cache tiering - reading files are filled with 0s

2015-09-02 Thread Arthur Liu
Hi John and Zheng, Thanks for the quick replies! I'm using kernel 4.2. I'll test out that fix. Arthur On Wed, Sep 2, 2015 at 10:29 PM, Yan, Zheng wrote: > probably caused by http://tracker.ceph.com/issues/12551 > > On Wed, Sep 2, 2015 at 7:57 PM, Arthur Liu wrote: > > Hi, > > > > I am experie

Re: [ceph-users] CephFS with cache tiering - reading files are filled with 0s

2015-09-02 Thread Yan, Zheng
probably caused by http://tracker.ceph.com/issues/12551 On Wed, Sep 2, 2015 at 7:57 PM, Arthur Liu wrote: > Hi, > > I am experiencing an issue with CephFS with cache tiering where the kernel > clients are reading files filled entirely with 0s. > > The setup: > ceph 0.94.3 > create cephfs_metadata

Re: [ceph-users] Is Ceph appropriate for small installations?

2015-09-02 Thread Janusz Borkowski
Hi! Do you have replication factor 2? To test recovery, e.g. kill one OSD process and observe when ceph notices it and starts moving data. Reformat the OSD partition, remove the killed OSD from the cluster, then add a new OSD using the freshly formatted partition. When you again have 3 OSDs, observe w
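A sketch of that kill/remove/re-add cycle with standard commands (the OSD id is illustrative):

  # Simulate a failure: stop one OSD and watch the cluster react
  service ceph stop osd.3        # or simply kill the ceph-osd process
  ceph -w                        # watch it get marked down/out and recovery begin

  # Remove the dead OSD before re-adding a freshly formatted one
  ceph osd out 3
  ceph osd crush remove osd.3
  ceph auth del osd.3
  ceph osd rm 3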

Re: [ceph-users] Appending to an open file - O_APPEND flag

2015-09-02 Thread Janusz Borkowski
Hi! Thanks for the explanation. The behaviour (overwriting) was puzzling and suggested serious filesystem corruption. Now that we have identified the scenario, we can try workarounds. Regards, J. On 02.09.2015 11:50, Yan, Zheng wrote: >> On Sep 2, 2015, at 17:11, Gregory Farnum wrote: >> >> Whoops, f

[ceph-users] Corruption of file systems on RBD images

2015-09-02 Thread Mathieu GAUTHIER-LAFAYE
Hi All, We regularly have trouble with virtual machines using RBD storage. When we restart some virtual machines, they start to do some filesystem checks. Sometimes the check can rescue them, sometimes the virtual machine dies (Linux or Windows). We moved from Firefly to Hammer last month. I

[ceph-users] CephFS with cache tiering - reading files are filled with 0s

2015-09-02 Thread Arthur Liu
Hi, I am experiencing an issue with CephFS with cache tiering where the kernel clients are reading files filled entirely with 0s. The setup: ceph 0.94.3 create cephfs_metadata replicated pool create cephfs_data replicated pool cephfs was created on the above two pools, populated with files, then:
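Since the message is cut off before the cache-tier commands, here is a hedged reconstruction of the kind of setup being described (pool names and PG counts are illustrative):

  # Base pools and filesystem
  ceph osd pool create cephfs_metadata 128
  ceph osd pool create cephfs_data 128
  ceph fs new cephfs cephfs_metadata cephfs_data

  # Cache tier in front of the data pool
  ceph osd pool create cephfs_cache 128
  ceph osd tier add cephfs_data cephfs_cache
  ceph osd tier cache-mode cephfs_cache writeback
  ceph osd tier set-overlay cephfs_data cephfs_cache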

Re: [ceph-users] Troubleshooting rgw bucket list

2015-09-02 Thread Sam Wouters
Thanks! Playing around with max_keys in bucket listing retrieval sometimes gives me results and sometimes not; this gives me a way to list the content until the bug is fixed. Is it possible somehow to copy the objects to a new bucket (with versioning disabled) and rename the current one? I don't think the

[ceph-users] Jessie repo for ceph hammer?

2015-09-02 Thread Rottmann Jonas
Hi, when can we expect a Jessie repo for Ceph Hammer to be available? Thanks! Mit freundlichen Grüßen/Kind regards Jonas Rottmann Systems Engineer FIS-ASP Application Service Providing und IT-Outsourcing GmbH Röthleiner Weg 4 D-97506 Grafenrheinfeld Phone: +49 (9723) 9188-568

Re: [ceph-users] Is Ceph appropriate for small installations?

2015-09-02 Thread Marcin Przyczyna
On 08/31/2015 09:39 AM, Wido den Hollander wrote: > True, but your performance is greatly impacted during recovery. So a > three node cluster might work well when the skies are clear and the sun > is shining, but it has a hard time dealing with a complete node failure. The question of "how tiny a

Re: [ceph-users] Appending to an open file - O_APPEND flag

2015-09-02 Thread Yan, Zheng
> On Sep 2, 2015, at 17:11, Gregory Farnum wrote: > > Whoops, forgot to add Zheng. > > On Wed, Sep 2, 2015 at 10:11 AM, Gregory Farnum wrote: >> On Wed, Sep 2, 2015 at 10:00 AM, Janusz Borkowski >> wrote: >>> Hi! >>> >>> I mount cephfs using kernel client (3.10.0-229.11.1.el7.x86_64). >>> >

Re: [ceph-users] cephfs read-only setting doesn't work?

2015-09-02 Thread Yan, Zheng
> On Sep 2, 2015, at 16:44, Gregory Farnum wrote: > > On Tue, Sep 1, 2015 at 9:20 PM, Erming Pei wrote: >> Hi, >> >> I tried to set up a read-only permission for a client but it looks always >> writable. >> >> I did the following: >> >> ==Server end== >> >> [client.cephfs_data_ro] >>

Re: [ceph-users] how to improve ceph cluster capacity usage

2015-09-02 Thread huang jun
After searching the source code, I found the ceph_psim tool, which can simulate object distribution, but it seems a little simplistic. 2015-09-01 22:58 GMT+08:00 huang jun : > hi all, > > Recently, I did some experiments on OSD data distribution; > we set up a cluster with 72 OSDs, all 2TB SATA disks, > and c
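For anyone repeating this kind of experiment, per-OSD utilisation and PG counts can also be inspected on the live cluster; a minimal sketch, assuming ceph osd df is available (Hammer onwards):

  ceph osd df tree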

Re: [ceph-users] Ceph SSD CPU Frequency Benchmarks

2015-09-02 Thread Nick Fisk
I think this may be related to what I had to do; it rings a bell at least. http://unix.stackexchange.com/questions/153693/cant-use-userspace-cpufreq-governor-and-set-cpu-frequency The P-state driver doesn't support userspace, so you need to disable it and make Linux use the old acpi driver instead
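A sketch of the workaround being described, assuming GRUB and the cpupower tool are in use:

  # Boot with the legacy acpi-cpufreq driver instead of intel_pstate:
  # add intel_pstate=disable to GRUB_CMDLINE_LINUX in /etc/default/grub,
  # regenerate the grub config and reboot.

  # The userspace governor and fixed frequencies then become selectable:
  cpupower frequency-set -g userspace
  cpupower frequency-set -f 2.0GHz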

Re: [ceph-users] Appending to an open file - O_APPEND flag

2015-09-02 Thread Gregory Farnum
On Wed, Sep 2, 2015 at 10:00 AM, Janusz Borkowski wrote: > Hi! > > I mount cephfs using kernel client (3.10.0-229.11.1.el7.x86_64). > > The effect is the same when doing "echo >>" from another machine and from a > machine keeping the file open. > > The file is opened with open( .., > O_WRONLY|O_LA

Re: [ceph-users] Appending to an open file - O_APPEND flag

2015-09-02 Thread Gregory Farnum
Whoops, forgot to add Zheng. On Wed, Sep 2, 2015 at 10:11 AM, Gregory Farnum wrote: > On Wed, Sep 2, 2015 at 10:00 AM, Janusz Borkowski > wrote: >> Hi! >> >> I mount cephfs using kernel client (3.10.0-229.11.1.el7.x86_64). >> >> The effect is the same when doing "echo >>" from another machine an

Re: [ceph-users] Appending to an open file - O_APPEND flag

2015-09-02 Thread Janusz Borkowski
Hi! I mount cephfs using kernel client (3.10.0-229.11.1.el7.x86_64). The effect is the same when doing "echo >>" from another machine and from a machine keeping the file open. The file is opened with open( .., O_WRONLY|O_LARGEFILE|O_APPEND|O_BINARY|O_CREAT) Shell ">>" is implemented as (from
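One way to confirm which flags the shell actually passes for ">>" is to trace the open call; a quick sketch, with the path purely illustrative:

  # Expect O_WRONLY|O_CREAT|O_APPEND in the traced open()
  strace -f -e trace=open,openat sh -c 'echo test >> /mnt/cephfs/somefile'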

Re: [ceph-users] cephfs read-only setting doesn't work?

2015-09-02 Thread Gregory Farnum
On Tue, Sep 1, 2015 at 9:20 PM, Erming Pei wrote: > Hi, > > I tried to set up a read-only permission for a client but it always looks > writable. > > I did the following: > > ==Server end== > > [client.cephfs_data_ro] > key = AQxx== > caps mon = "allow r" > caps
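For comparison, a commonly attempted shape for a read-only CephFS client key is sketched below; the pool name is illustrative, and per the discussion the crucial part is that the OSD cap must be read-only too, since clients write file data directly to the OSDs (exact cap semantics should be verified on the release in use):

  ceph auth get-or-create client.cephfs_data_ro \
      mon 'allow r' \
      mds 'allow' \
      osd 'allow r pool=cephfs_data'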

Re: [ceph-users] libvirt rbd issue

2015-09-02 Thread Jan Schermer
1) Take a look at the number of file descriptors the QEMU process is using; I think you are over the limits: pid=<pid of qemu process>; cat /proc/$pid/limits; echo /proc/$pid/fd/* | wc -w 2) Jumbo frames may be the cause: are they enabled on the rest of the network? In any case, get rid of NetworkM
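The checks described above, written out as a short sketch (the domain-name match is a placeholder):

  # Find the qemu process and compare its open-fd count against its limit
  pid=$(pgrep -f 'qemu.*<domain-name>' | head -1)
  grep 'open files' /proc/$pid/limits
  ls /proc/$pid/fd | wc -l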

Re: [ceph-users] Ceph SSD CPU Frequency Benchmarks

2015-09-02 Thread Jan Schermer
What "idle" driver are you using? /dev/cpu_dma_latency might not be sufficient if the OS uses certain idle instructions, IMO mwait is still issued and its latency might not be 1 on Atoms. What is in /sys/devices/system/cpu/cpu0/cpuidle/state*/latency on Atoms? Btw disabling all power management i

Re: [ceph-users] Ceph Performance Questions with rbd images access by qemu-kvm

2015-09-02 Thread Jan Schermer
I somehow missed the original question, but if you run a database on CEPH you will be limited not by throughput but by latency. Even if you run OSDs with ramdisk, the latency will still be 1-2ms at best (depending strictly on OSD CPU and memory speed) and that limits the number of database trans

Re: [ceph-users] SSD test results with Plextor M6 Pro, HyperX Fury, Kingston V300, ADATA SP90

2015-09-02 Thread Jan Schermer
Hi, comments below > On 01 Sep 2015, at 18:08, Jelle de Jong wrote: > > Hi Jan, > > I am building two new clusters for testing. I have been reading your messages > on the mailing list for a while now and want to thank you for your support. > > I can redo all the numbers, but is your question to run