[ceph-users] weird state whilst upgrading to jewel

2016-10-10 Thread Luis Periquito
I was upgrading a really old cluster from Infernalis (9.2.1) to Jewel (10.2.3) and ran into some weird but interesting issues. This cluster started its life with Bobtail -> Dumpling -> Emperor -> Firefly -> Giant -> Hammer -> Infernalis and now Jewel. When I upgraded the first MON (out of 3)

Re: [ceph-users] 6 Node cluster with 24 SSD per node: Hardwareplanning/ agreement

2016-10-10 Thread Matteo Dacrema
Hi, I’m planning a similar cluster. Because it’s a new project I’ll start with a 2-node cluster, each node with: 2x E5-2640v4 (40 threads total @ 3.40GHz with turbo), 24x 1.92 TB Samsung SM863, 128GB RAM, 3x LSI 3008 in IT mode / HBA for OSDs (1 per 8 OSDs/SSDs), 2x SSD for OS, 2x 40Gbit/s NIC

Re: [ceph-users] Ceph consultants?

2016-10-10 Thread Sean Redmond
Hi, In the end this was tracked back to a switch MTU problem; once that was fixed, any version of ceph-deploy osd prepare/create worked as expected. Thanks On Mon, Oct 10, 2016 at 11:02 AM, Eugen Block wrote: > Did the prepare command succeed? I don't see any output referring to

Re: [ceph-users] unable to start radosgw after upgrade from 10.2.2 to 10.2.3

2016-10-10 Thread Orit Wasserman
On Fri, Oct 7, 2016 at 9:37 PM, Graham Allan wrote: > Dear Orit, > > On 10/07/2016 04:21 AM, Orit Wasserman wrote: >> >> Hi, >> >> On Wed, Oct 5, 2016 at 11:23 PM, Andrei Mikhailovsky >> wrote: >>> >>> Hello everyone, >>> >>> I've just updated my ceph to version

[ceph-users] building ceph from source (exorbitant space requirements)

2016-10-10 Thread Steffen Weißgerber
Hi, I also use the ceph client on Gentoo and like to build from source inside a RAM-based filesystem. Since ceph release 9.x I'm wondering about the exorbitant space requirements when building the ceph components. Up to hammer, 3GB were sufficient to complete the compile.

Re: [ceph-users] unable to start radosgw after upgrade from 10.2.2 to 10.2.3

2016-10-10 Thread Orit Wasserman
Hi Graham, Is there a chance you have an old radosgw-admin (hammer) still running? You may have encountered http://tracker.ceph.com/issues/17371: if a hammer radosgw-admin runs against a jewel radosgw, it corrupts the configuration. We are working on a fix for that. Orit On Fri, Oct 7, 2016 at 9:37 PM, Graham
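For reference, a minimal check to confirm which radosgw-admin build is actually in use on each node (the package query assumes an RPM-based install; adjust for your distribution):
  radosgw-admin --version    # should report the 10.2.x (jewel) release on jewel nodes
  rpm -q ceph-radosgw        # package-level check of the installed gateway version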

[ceph-users] Crash after importing PG using objecttool

2016-10-10 Thread John Holder
Hello, we had an issue with OSDs and RAID cards. I have recovered all but 1 PG by allowing ceph to recover on its own. But I have 1 PG which wasn't replicated, so I exported it before I took the OSD totally out. I have tried to import it using the objectstore tool, but no matter where I
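For reference, a minimal sketch of the export/import cycle with ceph-objectstore-tool; the OSD ids, paths and the PG id below are placeholders, and the OSD daemon must be stopped while the tool runs against its data directory:
  # export the PG from the failed OSD (OSD stopped)
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-12 --journal-path /var/lib/ceph/osd/ceph-12/journal --pgid 1.2f --op export --file /root/pg.1.2f.export
  # import it into a surviving OSD (also stopped), then start that OSD again
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-7 --journal-path /var/lib/ceph/osd/ceph-7/journal --op import --file /root/pg.1.2f.export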

Re: [ceph-users] too many PGs per OSD (326 > max 300) warning when ALL PGs are 256

2016-10-10 Thread David Turner
You have 11 pools with 256 pgs, 1 pool with 128 and 1 pool with 64... that's 3,008 pgs in your entire cluster. Multiply that number by your replica size and divide by how many OSDs you have in your cluster and you'll see what your average PGs per osd is. Based on the replica size you shared,
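As a worked example with illustrative numbers (the OSD count is assumed, not taken from this cluster): (11 x 256) + 128 + 64 = 3008 PGs; with replica size 3 that is 3008 x 3 = 9024 PG copies, and spread over, say, 28 OSDs that gives 9024 / 28 ≈ 322 PGs per OSD, which is how a warning like "326 > max 300" can appear even though no single pool looks oversized.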

Re: [ceph-users] CephFS: No space left on device

2016-10-10 Thread Gregory Farnum
On Mon, Oct 10, 2016 at 9:06 AM, Davie De Smet wrote: > Hi, > > > > I don’t want to hijack this topic but the behavior described below is the > same as what I am seeing: > > > > > > [root@osd5-freu ~]# ceph daemonperf /var/run/ceph/ceph-mds.osd5-freu.asok > | > >

[ceph-users] too many PGs per OSD (326 > max 300) warning when ALL PGs are 256

2016-10-10 Thread Andrus, Brian Contractor
Ok, this is an odd one to me... I have several pools, ALL of them are set with pg_num and pgp_num = 256. Yet, the warning about too many PGs per OSD is showing up. Here are my pools: pool 0 'rbd' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 256 pgp_num 256
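For reference, two generic commands to see every pool's PG count and the resulting health warning in context (not taken from this thread):
  ceph osd dump | grep pg_num    # prints pg_num/pgp_num for every pool
  ceph -s                        # shows the 'too many PGs per OSD' health warning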

Re: [ceph-users] 6 Node cluster with 24 SSD per node: Hardwareplanning/ agreement

2016-10-10 Thread Wido den Hollander
> On 10 October 2016 at 14:56, Matteo Dacrema wrote: > > > Hi, > > I’m planning a similar cluster. > Because it’s a new project I’ll start with only a 2-node cluster: > 2 nodes in a Ceph cluster is way too small in my opinion. I suggest that you take a lot more

Re: [ceph-users] too many PGs per OSD (326 > max 300) warning when ALL PGs are 256

2016-10-10 Thread Andrus, Brian Contractor
David, Thanks for the info. I am getting an understanding of how this works. Now, I used the ceph-deploy tool to create the rgw pools. It seems, then, that the tool isn't the best at creating the pools necessary for an rgw gateway, as it made all of them the default sizes for pg_num/pgp_num. Perhaps,

[ceph-users] Status of Calamari > 1.3 and friends (diamond...)

2016-10-10 Thread Richard Chan
Hi list, the status of calamari and friends post 1.3 seems a bit confusing to me. What are you folks using for monitoring in the Jewel era? (Could someone explain the big picture of the state of 1.4?) Here's what I have gathered: 1. romana (was calamari-clients) seems dead: no commits in a

Re: [ceph-users] too many PGs per OSD (326 > max 300) warning when ALL PGs are 256

2016-10-10 Thread David Turner
The defaults it uses can be controlled in your ceph.conf file. The ceph-deploy tool is a generic ceph deployment tool which does not have presets for rados gateway deployments or other specific deployments. When creating pools you can specify the number of PGs in them with the tool so that it
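As a minimal sketch (the values and pool name below are placeholders, not recommendations): the defaults come from ceph.conf settings such as
  [global]
  osd pool default pg num = 64
  osd pool default pgp num = 64
and a pool can also be created with an explicit PG count, e.g.
  ceph osd pool create <pool-name> 128 128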

Re: [ceph-users] RBD-Mirror - Journal location

2016-10-10 Thread Jason Dillaman
Yes, the "journal_data" objects can be stored in a separate pool from the image. The rbd CLI allows you to use the "--journal-pool" argument when creating, copying, cloning, or importing and image with journaling enabled. You can also specify the journal data pool when dynamically enabling the

Re: [ceph-users] 6 Node cluster with 24 SSD per node: Hardwareplanning/ agreement

2016-10-10 Thread Christian Balzer
Hello, On Mon, 10 Oct 2016 14:56:40 +0200 Matteo Dacrema wrote: > Hi, > > I’m planning a similar cluster. > Because it’s a new project I’ll start with only a 2-node cluster: > As Wido said, that's a very dense and risky proposition for a first-time cluster. Never mind the lack of

Re: [ceph-users] RBD-Mirror - Journal location

2016-10-10 Thread Christian Balzer
Hello, On Tue, 11 Oct 2016 01:07:16 + Cory Hawkless wrote: > Thanks Jason, works perfectly. > > Do you know if ceph blocks the client IO until the journal has acknowledged > its write? I.e. can I store my journal on slower disks or will that have a > negative impact on performance? >

Re: [ceph-users] RBD-Mirror - Journal location

2016-10-10 Thread Cory Hawkless
Thanks Jason, works perfectly. Do you know if ceph blocks the client IO until the journal has acknowledged its write? I.e. can I store my journal on slower disks, or will that have a negative impact on performance? Is there perhaps a hole in the documentation here? I've not been able to find

Re: [ceph-users] unable to start radosgw after upgrade from 10.2.2 to 10.2.3

2016-10-10 Thread Graham Allan
Hi Orit, That could well be related - as mentioned, we do have a hammer radosgw still running, and I have also run radosgw-admin on that system while trying to understand what changed between the two releases! So reading that bug report, it sounds like having the hammer radosgw itself

Re: [ceph-users] CephFS: No space left on device

2016-10-10 Thread Davie De Smet
Hi, I don’t want to hijack this topic but the behavior described below is the same as what I am seeing: [root@osd5-freu ~]# ceph daemonperf /var/run/ceph/ceph-mds.osd5-freu.asok |

Re: [ceph-users] librgw init failed (-5) when starting nfs-ganesha

2016-10-10 Thread Brad Hubbard
On Sun, Oct 9, 2016 at 9:58 PM, yiming xie wrote: > Thanks for your reply. I don’t know which configuration or step causes rados > initialization to fail. > /usr/lib64/ > librgw.so.2.0.0 > librados.so.2.0.0 > > /etc/ceh/ceph.conf: > [global] > mon_host = 192.168.77.61 What

Re: [ceph-users] librgw init failed (-5) when starting nfs-ganesha

2016-10-10 Thread yiming xie
ceph -v ceph version 10.2.3 (ecc23778eb545d8dd55e2e4735b53cc93f92e65b) nfs-ganesha: 2.3 stable 1. install ceph following docs.ceph.com on node1 2. install librgw2-devel.x86_64 on node2 3. install nfs-ganesha on node2: cmake -DUSE_FSAL_RGW=ON ../src/; make; make install 4. vi
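For what it's worth, a rough sketch of the kind of EXPORT/RGW blocks the RGW FSAL expects in ganesha.conf; the user, keys, paths and instance name below are placeholders and should be checked against the sample configs shipped with nfs-ganesha 2.3:
  EXPORT {
      Export_ID = 1;
      Path = "/";
      Pseudo = "/rgw";
      Access_Type = RW;
      FSAL {
          Name = RGW;
          User_Id = "testuser";
          Access_Key_Id = "<access-key>";
          Secret_Access_Key = "<secret-key>";
      }
  }
  RGW {
      ceph_conf = "/etc/ceph/ceph.conf";
      name = "client.rgw.node1";
      cluster = "ceph";
  }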

Re: [ceph-users] librgw init failed (-5) when starting nfs-ganesha

2016-10-10 Thread Brad Hubbard
On Mon, Oct 10, 2016 at 4:37 PM, yiming xie wrote: > ceph -v > ceph version 10.2.3 (ecc23778eb545d8dd55e2e4735b53cc93f92e65b) > nfs-ganesha: 2.3 stable > > 1. install ceph following docs.ceph.com on node1 > 2. install librgw2-devel.x86_64 on node2 > 3. install nfs-ganesha on

Re: [ceph-users] rsync kernel client cepfs mkstemp no space left on device

2016-10-10 Thread Hauke Homburg
On 07.10.2016 at 17:37, Gregory Farnum wrote: > On Fri, Oct 7, 2016 at 7:15 AM, Hauke Homburg wrote: >> Hello, >> >> I have a Ceph cluster with 5 servers and 40 OSDs. Currently this cluster >> has 85GB of free space, and the rsync dir has lots of pictures and a data >>

Re: [ceph-users] New OSD Nodes, pgs haven't changed state

2016-10-10 Thread Goncalo Borges
Hi Mike... I was hoping that someone with a bit more experience would answer you, since I have never had a similar situation, so I'll try to step in and help. Peering means that the OSDs are agreeing on the state of objects in the PGs they share. The peering process can take some time and
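A few generic commands that usually help narrow down where peering is stuck (the PG id below is a placeholder):
  ceph health detail             # lists the affected PGs and their current states
  ceph pg dump_stuck inactive    # PGs that have not finished peering
  ceph pg 1.2f query             # per-PG view, including which OSDs it is waiting on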

Re: [ceph-users] Ceph consultants?

2016-10-10 Thread Eugen Block
Did the prepare command succeed? I don't see any output referring to 'ceph-deploy osd prepare'. If this command also fails maybe there's a hint there, and the activate command failure is only a consequence of that? Quoting Alan Johnson: I did have some similar issues and

Re: [ceph-users] Crash in ceph_read_iter->__free_pages due to null page

2016-10-10 Thread Nikolay Borisov
On 10/10/2016 12:22 PM, Ilya Dryomov wrote: > On Fri, Oct 7, 2016 at 1:40 PM, Nikolay Borisov wrote: >> Hello, >> >> I've encountered yet another cephfs crash: >> >> [990188.822271] BUG: unable to handle kernel NULL pointer dereference at >> 001c >> [990188.822790]

Re: [ceph-users] New OSD Nodes, pgs haven't changed state

2016-10-10 Thread David
Can you provide a 'ceph health detail'? On 9 Oct 2016 3:56 p.m., "Mike Jacobacci" wrote: Hi, Yesterday morning I added two more OSD nodes and changed the crushmap from disk to node. It looked to me like everything went OK besides some disks missing that I can re-add later, but

Re: [ceph-users] Crash in ceph_read_iter->__free_pages due to null page

2016-10-10 Thread Ilya Dryomov
On Fri, Oct 7, 2016 at 1:40 PM, Nikolay Borisov wrote: > Hello, > > I've encountered yet another cephfs crash: > > [990188.822271] BUG: unable to handle kernel NULL pointer dereference at > 001c > [990188.822790] IP: [] __free_pages+0x5/0x30 > [990188.823090] PGD

Re: [ceph-users] rsync kernel client cepfs mkstemp no space left on device

2016-10-10 Thread John Spray
On Mon, Oct 10, 2016 at 9:05 AM, Hauke Homburg wrote: > On 07.10.2016 at 17:37, Gregory Farnum wrote: >> On Fri, Oct 7, 2016 at 7:15 AM, Hauke Homburg >> wrote: >>> Hello, >>> >>> I have a Ceph cluster with 5 servers and 40 OSDs. Currently on this

Re: [ceph-users] librgw init failed (-5) when starting nfs-ganesha

2016-10-10 Thread yiming xie
Following your advice, I installed ganesha on node1, but there is the same problem: librgw init failed (-5) node1: ps -ef | grep ceph ceph 1099 1 0 17:34 ? 00:00:00 /usr/bin/radosgw -f --cluster ceph --name client.rgw.node1 --setuser ceph --setgroup ceph [cep@node1 ~]$ sudo rados

[ceph-users] Does calamari 1.4.8 still use romana 1.3, carbon-cache, cthulhu-manager?

2016-10-10 Thread Richard Chan
Hi, with the move to calamari-server 1.4.8 some questions: 1. Are we still using the webapp calamari-clients/romana 1.3? Does the version number skew matter? 2. Previously there were carbon-cache.py and cthulhu-manager in supervisor. Now there is calamari-lite. Are the previous two superseded by