Re: [ceph-users] Power outages!!! help!

2017-09-13 Thread hjcho616
Rooney, just tried hooking osd.0 back up.  osd.0 seems to be better, as I was able to run a ceph-objectstore-tool export, so I decided to try hooking it up.  Looks like the journal is not happy.  Is there any way to get this running?  Or do I need to start getting the data out using ceph-objectstore-tool?
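For anyone following this thread, a hedged sketch of the export/import path with ceph-objectstore-tool; the data/journal paths, OSD ids, and PG id below are placeholders, and both OSD daemons must be stopped while the tool runs:

```
# OSD must be stopped; paths and pgid are placeholders for illustration
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 \
    --journal-path /var/lib/ceph/osd/ceph-0/journal \
    --op export --pgid 1.2a --file /backup/pg1.2a.export

# later, import the PG into a healthy (also stopped) OSD
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-5 \
    --journal-path /var/lib/ceph/osd/ceph-5/journal \
    --op import --file /backup/pg1.2a.export
```

This only sketches the mechanics; whether export is the right move depends on the cluster state discussed elsewhere in the thread.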

[ceph-users] access ceph filesystem at storage level and not via ethernet

2017-09-13 Thread James Okken
Thanks Ronny! Exactly the info I need, and kind of what I thought the answer would be as I was typing and thinking more clearly about what I was asking. I was just hoping CEPH would work like this, since the OpenStack Fuel tools deploy CEPH storage nodes easily. I agree I would not be using CEPH for

Re: [ceph-users] Clarification on sequence of recovery and client ops after OSDs rejoin cluster (also, slow requests)

2017-09-13 Thread Brad Hubbard
On Wed, Sep 13, 2017 at 8:40 PM, Florian Haas wrote: > Hi everyone, > > > disclaimer upfront: this was seen in the wild on Hammer, and on 0.94.7 > no less. Reproducing this on 0.94.10 is a pending process, and we'll > update here with findings, but my goal with this post is

Re: [ceph-users] Clarification on sequence of recovery and client ops after OSDs rejoin cluster (also, slow requests)

2017-09-13 Thread Josh Durgin
On 09/13/2017 03:40 AM, Florian Haas wrote: So we have a client that is talking to OSD 30. OSD 30 was never down; OSD 17 was. OSD 30 is also the preferred primary for this PG (via primary affinity). The OSD now says that - it does itself have a copy of the object, - so does OSD 94, - but that

[ceph-users] Jewel -> Luminous upgrade, package install stopped all daemons

2017-09-13 Thread David
Hi All I did a Jewel -> Luminous upgrade on my dev cluster and it went very smoothly. I've attempted to upgrade on a small production cluster but I've hit a snag. After installing the ceph 12.2.0 packages with "yum install ceph" on the first node and accepting all the dependencies, I found that

Re: [ceph-users] Anyone else having digest issues with Apple Mail?

2017-09-13 Thread Tomasz Kusmierz
Nope, no problem here. > On 13 Sep 2017, at 22:05, Anthony D'Atri wrote: > > For a couple of weeks now digests have been appearing to me off and on with a > few sets of MIME headers and maybe 1-2 messages. When I look at the raw text > the whole digest is in there. > >

[ceph-users] Anyone else having digest issues with Apple Mail?

2017-09-13 Thread Anthony D'Atri
For a couple of weeks now digests have been appearing to me off and on with a few sets of MIME headers and maybe 1-2 messages. When I look at the raw text the whole digest is in there. Screencap below. Anyone else experiencing this?

Re: [ceph-users] Usage not balanced over OSDs

2017-09-13 Thread Jack
How many PGs? How many pools (and how much data? please post rados df) On 13/09/2017 22:30, Sinan Polat wrote: > Hi, > > > > I have 52 OSD's in my cluster, all with the same disk size and same weight. > > > > When I perform a: > > ceph osd df > > > > The disk with the least available

[ceph-users] Usage not balanced over OSDs

2017-09-13 Thread Sinan Polat
Hi, I have 52 OSD's in my cluster, all with the same disk size and same weight. When I perform a: ceph osd df The disk with the least available space: 863G The disk with the most available space: 1055G I expect the available space or the usage on the disks to be the same, since
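The spread described above can be read straight out of `ceph osd df`. A minimal sketch of checking it with awk; the sample output below is made up to mirror the figures in the post (the real command prints more columns):

```shell
# Hypothetical excerpt of "ceph osd df" output: ID WEIGHT USE AVAIL;
# the numbers are illustrative, not from a real cluster.
sample="0 1.0 593 863
1 1.0 401 1055
2 1.0 488 968"

# Print the least and most available space across the OSDs.
echo "$sample" | awk '
{ if (min == "" || $4 + 0 < min + 0) min = $4
  if ($4 + 0 > max + 0) max = $4 }
END { printf "min avail: %sG, max avail: %sG\n", min, max }'
```

A spread like this is usually a PG-count/CRUSH distribution question, which is why the reply asks for PG and pool counts.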

Re: [ceph-users] What's 'failsafe full'

2017-09-13 Thread Sinan Polat
Hi, according to: http://lists.ceph.com/pipermail/ceph-users-ceph.com/2013-July/003140.html you can set it on the OSDs; you may (or may not) want to change "osd failsafe full ratio" and "osd failsafe nearfull ratio". From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On behalf of dE
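For reference, a sketch of how those two options could look in ceph.conf; the ratios shown are purely illustrative, not recommendations (the defaults in that era were reportedly 0.97 and 0.90):

```ini
[osd]
; hard stop for I/O when an OSD reaches this usage ratio (default 0.97)
osd failsafe full ratio = 0.98
; warning threshold below the hard stop (default 0.90)
osd failsafe nearfull ratio = 0.92
```

Raising the failsafe ratios only buys headroom; the usual advice in these threads is to add capacity or rebalance rather than creep toward 100%.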

Re: [ceph-users] Bluestore "separate" WAL and DB (and WAL/DB size?)

2017-09-13 Thread Mark Nelson
Hi Richard, Regarding recovery speed, have you looked through any of Neha's results on recovery sleep testing earlier this summer? https://www.spinics.net/lists/ceph-devel/msg37665.html She tested bluestore and filestore under a couple of different scenarios. The gist of it is that time to

Re: [ceph-users] Ceph Mentors for next Outreachy Round

2017-09-13 Thread Ali Maredia
Cephers, Last week I talked to quite a few of you and was encouraged by the initial interest in mentoring an intern for an Outreachy Project. At the same time many of you had questions about the schedule. To clarify, here is the full schedule: 1. Sept. 20, 2017 Deadline to send Leo and

Re: [ceph-users] after reboot node appear outside the root root tree

2017-09-13 Thread Maxime Guyot
Hi, This is a common problem when doing custom CRUSHmap, the default behavior is to update the OSD node to location in the CRUSHmap on start. did you keep to the defaults there? If that is the problem, you can either: 1) Disable the update on start option: "osd crush update on start = false"
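Option 1 above can be sketched as a ceph.conf fragment (the spelling with spaces and with underscores are equivalent):

```ini
[osd]
; keep OSDs at their custom CRUSH location instead of moving them
; back under the default host bucket when the daemon starts
osd crush update on start = false
```

The usual alternative is a custom location hook ("osd crush location hook = /path/to/script") so OSDs report the right location themselves on start.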

Re: [ceph-users] after reboot node appear outside the root root tree

2017-09-13 Thread Luis Periquito
What's your "osd crush update on start" option? further information can be found http://docs.ceph.com/docs/master/rados/operations/crush-map/ On Wed, Sep 13, 2017 at 4:38 PM, German Anders wrote: > Hi cephers, > > I'm having an issue with a newly created cluster 12.2.0 >

[ceph-users] What's 'failsafe full'

2017-09-13 Thread dE
Hello everyone, Just started with Ceph here. I was reading the documentation here -- http://docs.ceph.com/docs/master/rados/operations/health-checks/#osd-out-of-order-full And just started to wonder what's failsafe_full... I know it's some kind of ratio, but how do I change it? I didn't

Re: [ceph-users] luminous ceph-osd crash

2017-09-13 Thread Marcin Dulak
Hi, It looks like at sdb size around 1.1 GBytes ceph (ceph version 12.2.0 (32ce2a3ae5239ee33d6150705cdb24d43bab910c) luminous (rc)) is not crashing anymore. Please don't increase the minimum disk size requirements unnecessarily - it makes it more demanding to test new ceph features and

Re: [ceph-users] access ceph filesystem at storage level and not via ethernet

2017-09-13 Thread Ronny Aasen
On 13.09.2017 19:03, James Okken wrote: Hi, Novice question here: The way I understand CEPH is that it distributes data in OSDs in a cluster. The reads and writes come across the ethernet as RBD requests and the actual data IO then also goes across the ethernet. I have a CEPH environment

Re: [ceph-users] after reboot node appear outside the root root tree

2017-09-13 Thread German Anders
Thanks a lot Maxime, I did the osd_crush_update_on_start = false in ceph.conf and pushed it to all the nodes, and then I created a map file: # begin crush map tunable choose_local_tries 0 tunable choose_local_fallback_tries 0 tunable choose_total_tries 50 tunable chooseleaf_descend_once 1 tunable

[ceph-users] access ceph filesystem at storage level and not via ethernet

2017-09-13 Thread James Okken
Hi, Novice question here: The way I understand CEPH is that it distributes data in OSDs in a cluster. The reads and writes come across the ethernet as RBD requests and the actual data IO then also goes across the ethernet. I have a CEPH environment being setup on a fiber channel disk array

Re: [ceph-users] [Luminous] rgw not deleting object

2017-09-13 Thread Jack
Thanks for the tip For the record, I fixed it using radosgw-admin bucket check --bucket= --check-objects --fix On 10/09/2017 11:44, Andreas Calminder wrote: > Hi, > I had a similar problem on jewel, where I was unable to properly delete > objects eventhough radosgw-admin returned rc 0 after
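For the record, a hedged sketch of the fix described above; this needs a live RGW deployment, and the bucket name is a placeholder:

```
# run once without --fix first to see what the check would change
radosgw-admin bucket check --bucket=<bucket-name> --check-objects

# then rebuild the bucket index and clean up the stale entries
radosgw-admin bucket check --bucket=<bucket-name> --check-objects --fix
```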

Re: [ceph-users] after reboot node appear outside the root root tree

2017-09-13 Thread German Anders
*# ceph health detail* HEALTH_OK *# ceph osd stat* 48 osds: 48 up, 48 in *# ceph pg stat* 3200 pgs: 3200 active+clean; 5336 MB data, 79455 MB used, 53572 GB / 53650 GB avail *German* 2017-09-13 13:24 GMT-03:00 dE : > On 09/13/2017 09:08 PM, German Anders wrote: > > Hi

Re: [ceph-users] after reboot node appear outside the root root tree

2017-09-13 Thread dE
On 09/13/2017 09:08 PM, German Anders wrote: Hi cephers, I'm having an issue with a newly created cluster, 12.2.0 (32ce2a3ae5239ee33d6150705cdb24d43bab910c) luminous (rc). Basically when I reboot one of the nodes, and when it comes back, it comes up outside of the root in the tree:

[ceph-users] after reboot node appear outside the root root tree

2017-09-13 Thread German Anders
Hi cephers, I'm having an issue with a newly created cluster, 12.2.0 (32ce2a3ae5239ee33d6150705cdb24d43bab910c) luminous (rc). Basically when I reboot one of the nodes, and when it comes back, it comes up outside of the root in the tree: root@cpm01:~# ceph osd tree ID CLASS WEIGHT TYPE NAME

Re: [ceph-users] debian-hammer wheezy Packages file incomplete?

2017-09-13 Thread David
Case closed, found the answer in the mailing list archive. http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-March/016706.html Weird though that we installed it through the repo in June 2017. Why not put them in the

[ceph-users] Collectd issues

2017-09-13 Thread Marc Roos
Am I the only one having these JSON issues with collectd, or did I do something wrong in configuration/upgrade? Sep 13 15:44:15 c01 collectd: ceph plugin: ds Bluestore.kvFlushLat.avgtime was not properly initialized. Sep 13 15:44:15 c01 collectd: ceph plugin: JSON handler failed with status -1.

Re: [ceph-users] Rgw install manual install luminous

2017-09-13 Thread Marc Roos
Yes, this command cannot find the keyring: service ceph-radosgw@gw1 start But this can: radosgw -c /etc/ceph/ceph.conf -n client.radosgw.gw1 -f I think I did not populate the /var/lib/ceph/radosgw/ceph-gw1/ folder correctly. Maybe the init script is checking for a 'done' file or so. I manually added

[ceph-users] Clarification on sequence of recovery and client ops after OSDs rejoin cluster (also, slow requests)

2017-09-13 Thread Florian Haas
Hi everyone, disclaimer upfront: this was seen in the wild on Hammer, and on 0.94.7 no less. Reproducing this on 0.94.10 is a pending process, and we'll update here with findings, but my goal with this post is really to establish whether the behavior as seen is expected, and if so, what the

Re: [ceph-users] moving mons across networks

2017-09-13 Thread Wido den Hollander
> On 13 September 2017 at 10:38, Dan van der Ster wrote: > > > Hi Blair, > > You can add/remove mons on the fly -- connected clients will learn > about all of the mons as the monmap changes and there won't be any > downtime as long as the quorum is maintained. > > There

[ceph-users] inconsistent pg but repair does nothing reporting head data_digest != data_digest from auth oi / hopefully data seems ok

2017-09-13 Thread Laurent GUERBY
Hi, ceph pg repair is currently not fixing three "inconsistent" objects on one of our PGs on a replica-3 pool.  The 3 replica data objects are identical (we checked them on disk on the 3 OSDs); the error says "head data_digest != data_digest from auth oi", see below. The data in question are used
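When all replicas match but the stored object-info digest does not, a workaround sometimes suggested on this list is to rewrite the object so the object info is regenerated; a hedged sketch, with pool, object, and PG id as placeholders (on releases that have it, "rados list-inconsistent-obj <pgid>" identifies the object):

```
# read the object out, write the same bytes back so the
# object-info digest is recomputed, then re-run the repair
rados -p <pool> get <object> /tmp/obj
rados -p <pool> put <object> /tmp/obj
ceph pg repair <pgid>
```

This is a sketch of a workaround, not an official repair procedure; take a backup of the object first and verify the replicas really are identical, as the poster did.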

Re: [ceph-users] Luminous BlueStore EC performance

2017-09-13 Thread Blair Bethwaite
Thanks for sharing Mohamad. What size of IOs are these? The tail latency breakdown is probably a major factor of importance here too, but I guess you don't have that. Why EC21, I assume that isn't a config anyone uses in production...? But I suppose it does facilitate a comparison between

Re: [ceph-users] Power outages!!! help!

2017-09-13 Thread Ronny Aasen
On 13 Sep 2017 07:04, hjcho616 wrote: Ronny, Ran a bunch of ceph pg repair pg# and got the scrub errors down to 10... well, it was 9; trying to fix one made it 10... waiting for it to fix (I did that noout trick as I only have two copies). 8 of those scrub errors look like they would need data

Re: [ceph-users] Clarification on sequence of recovery and client ops after OSDs rejoin cluster (also, slow requests)

2017-09-13 Thread Christian Theune
Hi, (thanks to Florian, who’s helping us get this sorted out) > On Sep 13, 2017, at 12:40 PM, Florian Haas wrote: > > Hi everyone, > > > disclaimer upfront: this was seen in the wild on Hammer, and on 0.94.7 > no less. Reproducing this on 0.94.10 is a pending process,

[ceph-users] Ceph OSD crash starting up

2017-09-13 Thread Gonzalo Aguilar Delgado
Hi, I've recently updated the crush map to 1 and did all the relocation of the PGs. At the end I found that one of the OSDs is not starting. This is what it shows: 2017-09-13 10:37:34.287248 7f49cbe12700 -1 *** Caught signal (Aborted) ** in thread 7f49cbe12700 thread_name:filestore_sync ceph version

Re: [ceph-users] moving mons across networks

2017-09-13 Thread Dan van der Ster
On Wed, Sep 13, 2017 at 11:04 AM, Dan van der Ster wrote: > On Wed, Sep 13, 2017 at 10:54 AM, Wido den Hollander wrote: >> >>> On 13 September 2017 at 10:38, Dan van der Ster >>> wrote: >>> >>> >>> Hi Blair, >>> >>> You can add/remove

Re: [ceph-users] moving mons across networks

2017-09-13 Thread Dan van der Ster
On Wed, Sep 13, 2017 at 10:54 AM, Wido den Hollander wrote: > >> On 13 September 2017 at 10:38, Dan van der Ster wrote: >> >> >> Hi Blair, >> >> You can add/remove mons on the fly -- connected clients will learn >> about all of the mons as the monmap changes

Re: [ceph-users] moving mons across networks

2017-09-13 Thread Dan van der Ster
Hi Blair, You can add/remove mons on the fly -- connected clients will learn about all of the mons as the monmap changes and there won't be any downtime as long as the quorum is maintained. There is one catch when it comes to OpenStack, however. Unfortunately, OpenStack persists the mon IP

Re: [ceph-users] [Solved] Oeps: lost cluster with: ceph osd require-osd-release luminous

2017-09-13 Thread Jan-Willem Michels
On 9/12/17 9:13 PM, Josh Durgin wrote: Could you post your crushmap? PGs mapping to no OSDs is a symptom of something wrong there. You can stop the osds from changing position at startup with 'osd crush update on start = false': Yes I had found that. Thanks. Seems be be by design,