Re: [ceph-users] ceph monitor keep crash

2019-06-11 Thread Joao Eduardo Luis
On 06/04/2019 07:01 PM, Jianyu Li wrote: > Hello, > > I have a ceph cluster running over 2 years and the monitor began crash > since yesterday. I had some flapping OSDs up and down occasionally, > sometimes I need to rebuild the OSD. I found 3 OSDs are down yesterday, > they may cause this issue o

Re: [ceph-users] [Ceph-community] Monitors not in quorum (1 of 3 live)

2019-06-08 Thread Joao Eduardo Luis
(adding ceph-users instead) On 19/06/07 12:53pm, Lluis Arasanz i Nonell - Adam wrote: > Hi all, > > I know I have a very old ceph version, but I need some help. > Also, understand that English is not my native language, so please, take it > in mind if something is not really well explained. > >

Re: [ceph-users] IRC channels now require registered and identified users

2019-02-19 Thread Joao Eduardo Luis
. I noticed that yesterday as well, but since we disabled the chan mode that (I thought) led to that, I don't think that was the cause. Mike, did we ever get word from Patrick on how this was set up? Any idea what we should do here? -Joao > > On Tue, Dec 18, 2018 at 6:50 AM Joao Eduard

Re: [ceph-users] IRC channels now require registered and identified users

2018-12-18 Thread Joao Eduardo Luis
On 12/18/2018 11:22 AM, Joao Eduardo Luis wrote: > On 12/18/2018 11:18 AM, Dan van der Ster wrote: >> Hi Joao, >> >> Has that broken the Slack connection? I can't tell if its broken or >> just quiet... last message on #ceph-devel was today at 1:13am. > > Ju

Re: [ceph-users] IRC channels now require registered and identified users

2018-12-18 Thread Joao Eduardo Luis
On 12/18/2018 11:18 AM, Dan van der Ster wrote: > Hi Joao, > > Has that broken the Slack connection? I can't tell if its broken or > just quiet... last message on #ceph-devel was today at 1:13am. Just quiet, it seems. Just tested it and the bridge is still working. -Joao

[ceph-users] IRC channels now require registered and identified users

2018-12-18 Thread Joao Eduardo Luis
All, Earlier this week our IRC channels were set to require users to be registered and identified before being allowed to join a channel. This looked like the most reasonable option to combat the onslaught of spam bots we've been getting in the last weeks/months. As of today, this is in effect f

Re: [ceph-users] [Ceph-community] Pool broke after increase pg_num

2018-11-08 Thread Joao Eduardo Luis
Hello Gesiel, Welcome to Ceph! In the future, you may want to address the ceph-users list (`ceph-users@lists.ceph.com`) for this sort of issues. On 11/08/2018 11:18 AM, Gesiel Galvão Bernardes wrote: > Hi everyone, > > I am a beginner in Ceph. I made a increase of pg_num in a pool, and > after 

Re: [ceph-users] add monitors - not working

2018-10-31 Thread Joao Eduardo Luis
On 10/31/2018 04:48 PM, Steven Vacaroaia wrote: > On the monitor that works I noticed this  > > mon.mon01@0(leader) e1 handle_probe ignoring fsid > d01a0b47-fef0-4ce8-9b8d-80be58861053 != 8e7922c9-8d3b-4a04-9a8a-e0b0934162df > > Where is that fsid ( 8e7922 ) coming from ? monmap. somehow one

Re: [ceph-users] rocksdb mon stores growing until restart

2018-08-30 Thread Joao Eduardo Luis
On 08/30/2018 09:28 AM, Dan van der Ster wrote: > Hi, > > Is anyone else seeing rocksdb mon stores slowly growing to >15GB, > eventually triggering the 'mon is using a lot of disk space' warning? > > Since upgrading to luminous, we've seen this happen at least twice. > Each time, we restart all t

Re: [ceph-users] prevent unnecessary MON leader re-election

2018-08-29 Thread Joao Eduardo Luis
On 08/29/2018 11:02 AM, William Lawton wrote: > > We have a 5 node Ceph cluster, status output copied below. During our > cluster resiliency tests we have noted that a MON leader election takes > place when we fail one member of the MON quorum, even though the failed > instance is not the current

Re: [ceph-users] New Ceph community manager: Mike Perez

2018-08-29 Thread Joao Eduardo Luis
On 08/29/2018 02:13 AM, Sage Weil wrote: > Hi everyone, > > Please help me welcome Mike Perez, the new Ceph community manager! Very happy to have you with us! Let us know if there's anything we can help you with, and don't hesitate to get in touch :) -Joao

Re: [ceph-users] Ceph Mimic on Debian 9 Stretch

2018-06-04 Thread Joao Eduardo Luis
On 06/04/2018 07:39 PM, Sage Weil wrote: > [1] > http://lists.ceph.com/private.cgi/ceph-maintainers-ceph.com/2018-April/000603.html > [2] > http://lists.ceph.com/private.cgi/ceph-maintainers-ceph.com/2018-April/000611.html Just a heads up, seems the ceph-maintainers archives are not public. -

Re: [ceph-users] ceph-mon fails to start on rasberry pi (raspbian 8.0)

2017-12-15 Thread Joao Eduardo Luis
On 12/15/2017 07:03 PM, Andrew Knapp wrote: Has anyone else tried this and had similar problems? Any advice on how to proceed or work around this issue? The daemon's log, somewhere in /var/log/ceph/ceph-mon..log, should have more info. Upload that somewhere and we'll take a look. -Joao

Re: [ceph-users] monitor crash issue

2017-11-28 Thread Joao Eduardo Luis
Hi Zhongyan, On 11/28/2017 02:25 PM, Zhongyan Gu wrote: Hi There, We hit a monitor crash bug in our production clusters during adding more nodes into one of clusters. Thanks for reporting this. Can you please share the log resulting from the crash? I'll be looking into this. -Joao

Re: [ceph-users] ceph-disk is now deprecated

2017-11-28 Thread Joao Eduardo Luis
On 11/28/2017 12:52 PM, Alfredo Deza wrote: On Tue, Nov 28, 2017 at 7:38 AM, Joao Eduardo Luis wrote: On 11/28/2017 11:54 AM, Alfredo Deza wrote: On Tue, Nov 28, 2017 at 3:12 AM, Wido den Hollander wrote: Op 27 november 2017 om 14:36 schreef Alfredo Deza : For the upcoming Luminous

Re: [ceph-users] ceph-disk is now deprecated

2017-11-28 Thread Joao Eduardo Luis
On 11/28/2017 11:54 AM, Alfredo Deza wrote: On Tue, Nov 28, 2017 at 3:12 AM, Wido den Hollander wrote: Op 27 november 2017 om 14:36 schreef Alfredo Deza : For the upcoming Luminous release (12.2.2), ceph-disk will be officially in 'deprecated' mode (bug fixes only). A large banner with depr

Re: [ceph-users] What goes in the monitor database?

2017-11-04 Thread Joao Eduardo Luis
On Sat, 2017-11-04 at 20:35 +, Bryan Henderson wrote: > Hi. Can anyone give me a rough idea of what the monitor database is > for? The monitor k/v store is where we'll keep maps and other relevant data. These maps keep the cluster state over time, and are critical for the system to properly

Re: [ceph-users] [MONITOR SEGFAULT] Luminous cluster stuck when adding monitor

2017-10-18 Thread Joao Eduardo Luis
cause adding a new one means losing the quorum and then being unable to remove the new one, because the quorum is lost with 2/4 nodes. (this is what actually happened about a week ago in our cluster) Best, Nico Joao Eduardo Luis writes: Hi Nico, I'm sorry I forgot about your issue. Crazy

Re: [ceph-users] [MONITOR SEGFAULT] Luminous cluster stuck when adding monitor

2017-10-18 Thread Joao Eduardo Luis
seems not easily be possible, because adding a new one means losing the quorum and then being unable to remove the new one, because the quorum is lost with 2/4 nodes. (this is what actually happened about a week ago in our cluster) Best, Nico Joao Eduardo Luis writes: Hi Nico, I'm sorry

Re: [ceph-users] [MONITOR SEGFAULT] Luminous cluster stuck when adding monitor

2017-10-18 Thread Joao Eduardo Luis
Hi Nico, I'm sorry I forgot about your issue. Crazy few weeks. I checked the log you initially sent to the list, but it only contains the log from one of the monitors, and it's from the one synchronizing. This monitor is not stuck however - synchronizing is progressing, albeit slowly. Can y

Re: [ceph-users] Unstable clock

2017-10-17 Thread Joao Eduardo Luis
On 10/17/2017 01:30 PM, Mohamad Gebai wrote: A concern was raised: are there more critical parts of Ceph where a clock jumping around might interfere with the behavior of the cluster? It would be good to know if there are any, and maybe prepare for them? cephx and monitor paxos leases come to m

Re: [ceph-users] [MONITOR SEGFAULT] Luminous cluster stuck when adding monitor

2017-10-08 Thread Joao Eduardo Luis
This looks a lot like a bug I fixed a week or so ago, but for which I currently don't recall the ticket off the top of my head. It was basically a crash each time a "ceph osd df" was called, if a mgr was not available after having set the luminous osd require flag. I will check the log in the morni

Re: [ceph-users] Luminous cluster stuck when adding monitor

2017-10-04 Thread Joao Eduardo Luis
On 10/04/2017 09:19 PM, Gregory Farnum wrote: Oh, hmm, you're right. I see synchronization starts but it seems to progress very slowly, and it certainly doesn't complete in that 2.5 minute logging window. I don't see any clear reason why it's so slow; it might be more clear if you could provide

Re: [ceph-users] Ceph Developers Monthly - October

2017-09-28 Thread Joao Eduardo Luis
On 09/28/2017 04:08 AM, Leonardo Vaz wrote: Hey Cephers, This is just a friendly reminder that the next Ceph Developer Monthly meeting is coming up: http://wiki.ceph.com/Planning If you have work that you're doing that is feature work, significant backports, or anything you would like to di

Re: [ceph-users] Ceph release cadence

2017-09-25 Thread Joao Eduardo Luis
I am happy with this branch of the thread! I'm guessing this would start post-Mimic though, if no one objects and if we want to target a March release? -Joao On 09/23/2017 02:58 AM, Sage Weil wrote: On Fri, 22 Sep 2017, Gregory Farnum wrote: On Fri, Sep 22, 2017 at 3:28 PM, Sage Weil wro

Re: [ceph-users] Ceph Developers Monthly - September

2017-09-12 Thread Joao Eduardo Luis
On 09/12/2017 04:59 PM, Leonardo Vaz wrote: Hey Cephers, In case you missed September's Ceph Developer Monthly, it is now up on our YouTube channel: https://youtu.be/xds1nsDoYqY Thanks Leonardo! Much appreciated ;) -Joao

Re: [ceph-users] Ceph release cadence

2017-09-06 Thread Joao Eduardo Luis
On 09/06/2017 04:23 PM, Sage Weil wrote: * Keep even/odd pattern, but force a 'train' model with a more regular cadence + predictable schedule - some features will miss the target and be delayed a year Personally, I think a predictable schedule is the way to go. Two major reasons come t

Re: [ceph-users] Ceph Developers Monthly - September

2017-09-06 Thread Joao Eduardo Luis
On 09/06/2017 06:06 AM, Leonardo Vaz wrote: Hey cephers, The Ceph Developer Monthly is confirmed for tonight, September 6 at 9pm Eastern Time (EDT), in an APAC-friendly time slot. As much as I would love to attend and discuss some topics (especially the RADOS replication stuff), this is an un

Re: [ceph-users] upgrade procedure to Luminous

2017-07-14 Thread Joao Eduardo Luis
On 07/14/2017 03:12 PM, Sage Weil wrote: On Fri, 14 Jul 2017, Joao Eduardo Luis wrote: On top of this all, I found during my tests that any OSD, running luminous prior to the luminous quorum, will need to be restarted before it can properly boot into the cluster. I'm guessing this is

Re: [ceph-users] upgrade procedure to Luminous

2017-07-14 Thread Joao Eduardo Luis
On 07/14/2017 03:12 PM, Sage Weil wrote: On Fri, 14 Jul 2017, Joao Eduardo Luis wrote: Dear all, The current upgrade procedure to jewel, as stated by the RC's release notes, You mean (jewel or kraken) -> luminous, I assume... Yeah. *sigh*

[ceph-users] upgrade procedure to Luminous

2017-07-14 Thread Joao Eduardo Luis
Dear all, The current upgrade procedure to jewel, as stated by the RC's release notes, can be boiled down to - upgrade all monitors first - upgrade osds only after we have a **full** quorum, comprised of all the monitors in the monmap, of luminous monitors (i.e., once we have the 'luminous'

Re: [ceph-users] ceph-mon leader election problem, should it be improved ?

2017-07-05 Thread Joao Eduardo Luis
opose a solution. Let us know what you come up with and if you want to discuss this a bit more ;) -Joao On Tue, Jul 4, 2017 at 9:25 PM, Joao Eduardo Luis wrote: On 07/04/2017 06:57 AM, Z Will wrote: Hi: I am testing ceph-mon brain split. I have read the code. If I understand it right,

Re: [ceph-users] ceph-mon leader election problem, should it be improved ?

2017-07-04 Thread Joao Eduardo Luis
On 07/04/2017 06:57 AM, Z Will wrote: Hi: I am testing ceph-mon brain split. I have read the code. If I understand it right, I know it won't be brain split. But I think there is still another problem. My ceph version is 0.94.10. And here is my test detail: 3 ceph-mons, their ranks are 0,

Re: [ceph-users] Question about PGMonitor::waiting_for_finished_proposal

2017-06-01 Thread Joao Eduardo Luis
On 06/01/2017 05:35 AM, 许雪寒 wrote: Hi, everyone. Recently, I’m reading the source code of Monitor. I found that, in PGMonitor::prepare_pg_stats() method, a callback C_Stats is put into PGMonitor::waiting_for_finished_proposal. I wonder, if a previous PGMap incremental is in PAXOS's propose

Re: [ceph-users] ceph-mon and existing zookeeper servers

2017-05-23 Thread Joao Eduardo Luis
On 05/23/2017 04:04 PM, Sean Purdy wrote: Hi, This is my first ceph installation. It seems to tick our boxes. Will be using it as an object store with radosgw. I notice that ceph-mon uses zookeeper behind the scenes. Is there a way to point ceph-mon at an existing zookeeper cluster, using a

Re: [ceph-users] Ceph built from source, can't start ceph-mon

2017-04-25 Thread Joao Eduardo Luis
On 04/25/2017 03:52 AM, Henry Ngo wrote: Anyone? On Sat, Apr 22, 2017 at 12:33 PM, Henry Ngo <henry@phazr.io> wrote: I followed the install doc however after deploying the monitor, the doc states to start the mon using Upstart. I learned through digging around that the Up

Re: [ceph-users] monitors at 100%; cluster out of service

2017-02-28 Thread Joao Eduardo Luis
On 02/28/2017 09:53 PM, WRIGHT, JON R (JON R) wrote: I currently have a situation where the monitors are running at 100% CPU, and can't run any commands because authentication times out after 300 seconds. I stopped the leader, and the resulting election picked a new leader, but that monitor show

Re: [ceph-users] would people mind a slow osd restart during luminous upgrade?

2017-02-09 Thread Joao Eduardo Luis
On 02/09/2017 04:19 AM, David Turner wrote: The only issue I can think of is if there isn't a version of the clients fully tested to work with a partially upgraded cluster or a documented incompatibility requiring downtime. We've had upgrades where we had to upgrade clients first and others that

Re: [ceph-users] ceph-mon memory issue jewel 10.2.5 kernel 4.4

2017-02-09 Thread Joao Eduardo Luis
Hi Jim, On 02/08/2017 07:45 PM, Jim Kilborn wrote: I have had two ceph monitor nodes generate swap space alerts this week. Looking at the memory, I see ceph-mon using a lot of memory and most of the swap space. My ceph nodes have 128GB mem, with 2GB swap (I know the memory/swap ratio is odd)

Re: [ceph-users] Split-brain in a multi-site cluster

2017-02-03 Thread Joao Eduardo Luis
On 02/02/2017 04:01 PM, Ilia Sokolinski wrote: Hi, We are testing a multi-site CEPH cluster using 0.94.5 release. There are 2 sites with 2 CEPH nodes in each site. Each node is running a monitor and a bunch of OSDs. The CRUSH rules are configured to require a copy of data in each site. The sites

Re: [ceph-users] 答复: Monitor repeatedly calling new election

2017-02-03 Thread Joao Eduardo Luis
On 02/03/2017 09:53 AM, 许雪寒 wrote: Thanks for your quick reply:-) I'm trying to send you more logs. Many of our online clusters have been running hammer version for a long time, it's a bit difficult for us to update those clusters since we are really afraid of encountering problems during u

Re: [ceph-users] Monitor repeatedly calling new election

2017-02-03 Thread Joao Eduardo Luis
On 02/03/2017 09:16 AM, 许雪寒 wrote: Hi, everyone. Recently, when I was doing some stress test, one of the monitors of my ceph cluster was marked down, and all the monitors repeatedly call new election and the I/O can be finished. There were three monitors in my cluster: rg3-ceph36, rg3-ceph40,

Re: [ceph-users] mon.mon01 store is getting too big! 18119 MB >= 15360 MB -- 94% avail

2017-01-31 Thread Joao Eduardo Luis
On 01/31/2017 07:12 PM, Shinobu Kinjo wrote: On Wed, Feb 1, 2017 at 1:51 AM, Joao Eduardo Luis wrote: On 01/31/2017 03:35 PM, David Turner wrote: If you do have a large enough drive on all of your mons (and always intend to do so) you can increase the mon store warning threshold in the

Re: [ceph-users] mon.mon01 store is getting too big! 18119 MB >= 15360 MB -- 94% avail

2017-01-31 Thread Joao Eduardo Luis
On 01/31/2017 03:35 PM, David Turner wrote: If you do have a large enough drive on all of your mons (and always intend to do so) you can increase the mon store warning threshold in the config file so that it no longer warns at 15360 MB. And if you so decide to go that route, please be aware tha

Re: [ceph-users] [Ceph-community] Consultation about ceph storage cluster architecture

2017-01-20 Thread Joao Eduardo Luis
Hi, This email is better suited for the 'ceph-users' list (CC'ed). You'll likely find more answers there. -Joao On 01/20/2017 04:33 PM, hen shmuel wrote: I'm new to Ceph and I want to build a Ceph storage cluster at my work site, to provide NAS services to our clients, as NFS to our linux serv

Re: [ceph-users] Question about user's key

2017-01-20 Thread Joao Eduardo Luis
On 01/20/2017 03:52 AM, Chen, Wei D wrote: Hi, I have read through some documents about authentication and user management about ceph, everything works fine with me, I can create a user and play with the keys and caps of that user. But I cannot find where those keys or capabilities stored, obv

Re: [ceph-users] CEPH mirror down again

2016-11-25 Thread Joao Eduardo Luis
On 11/26/2016 03:05 AM, Vy Nguyen Tan wrote: Hello, I want to install CEPH on new nodes but I can't reach CEPH repo, It seems the repo are broken. I am using CentOS 7.2 and ceph-deploy 1.5.36. Patrick sent an email to the list informing this would happen back on Nov 18th; quote: Due to Dre

Re: [ceph-users] Monitor troubles

2016-11-04 Thread Joao Eduardo Luis
On 11/04/2016 01:39 AM, Tracy Reed wrote: After a lot of messing about I have manually created a monmap and got the two new monitors working for a total of three. But to do that I had to delete the first monitor which for some reason was coming up with a bogus fsid after manipulated the monmap wh

Re: [ceph-users] Monitors stores not trimming after upgrade from Dumpling to Hammer

2016-11-03 Thread Joao Eduardo Luis
On 11/03/2016 06:18 PM, w...@42on.com wrote: Personally, I don't like this solution one bit, but I can't see any other way without a patched monitor, or maybe ceph_monstore_tool. If you are willing to wait till tomorrow, I'll be happy to kludge a sanitation feature onto ceph_monstore_tool th

Re: [ceph-users] Monitors stores not trimming after upgrade from Dumpling to Hammer

2016-11-03 Thread Joao Eduardo Luis
On 11/03/2016 05:52 PM, w...@42on.com wrote: Op 3 nov. 2016 om 16:44 heeft Joao Eduardo Luis het volgende geschreven: On 11/03/2016 01:24 PM, Wido den Hollander wrote: Op 3 november 2016 om 13:09 schreef Joao Eduardo Luis : On 11/03/2016 09:40 AM, Wido den Hollander wrote: root@mon3

Re: [ceph-users] Monitors stores not trimming after upgrade from Dumpling to Hammer

2016-11-03 Thread Joao Eduardo Luis
On 11/03/2016 01:24 PM, Wido den Hollander wrote: Op 3 november 2016 om 13:09 schreef Joao Eduardo Luis : On 11/03/2016 09:40 AM, Wido den Hollander wrote: root@mon3:/var/lib/ceph/mon# ceph-monstore-tool ceph-mon3 dump-keys|awk '{print $1}'|uniq -c 96 auth 1143 logm

Re: [ceph-users] Monitors stores not trimming after upgrade from Dumpling to Hammer

2016-11-03 Thread Joao Eduardo Luis
On 11/03/2016 12:09 PM, Joao Eduardo Luis wrote: On 11/03/2016 09:40 AM, Wido den Hollander wrote: root@mon3:/var/lib/ceph/mon# ceph-monstore-tool ceph-mon3 dump-keys|awk '{print $1}'|uniq -c 96 auth 1143 logm 3 mdsmap 1 mkfs 1 mon_sync 6 monitor

Re: [ceph-users] Monitors stores not trimming after upgrade from Dumpling to Hammer

2016-11-03 Thread Joao Eduardo Luis
On 11/03/2016 09:40 AM, Wido den Hollander wrote: root@mon3:/var/lib/ceph/mon# ceph-monstore-tool ceph-mon3 dump-keys|awk '{print $1}'|uniq -c 96 auth 1143 logm 3 mdsmap 1 mkfs 1 mon_sync 6 monitor 3 monmap 1158 osdmap 358364 paxos 656 pgmap 6

Re: [ceph-users] Monitors not reaching quorum

2016-07-26 Thread Joao Eduardo Luis
I see records with a timestamp that is up to 28 minutes behind the system clock! Also, while trying to set debug level, the monitors sometimes hung for several minutes, so there's obviously something wrong with them. On Mon, Jul 25, 2016

Re: [ceph-users] Monitors not reaching quorum

2016-07-26 Thread Joao Eduardo Luis
il -f, I see records with a timestamp that is up to 28 minutes behind the system clock! Also, while trying to set debug level, the monitors sometimes hung for several minutes, so there's obviously something wrong with them. On Mon, Jul 25, 2016 at 6:16 PM, Joao Eduardo Luis m

Re: [ceph-users] Monitors not reaching quorum

2016-07-25 Thread Joao Eduardo Luis
ake a look later tonight. -Joao On Mon, Jul 25, 2016 at 5:18 PM, Joao Eduardo Luis <j...@suse.de> wrote: On 07/25/2016 04:34 PM, Sergio A. de Carvalho Jr. wrote: Thanks, Joao. All monitors have the exact same mon map. I suspect you're right t

Re: [ceph-users] Monitors not reaching quorum

2016-07-25 Thread Joao Eduardo Luis
. If this is the case, check the size of your leveldb. If it is over 5 or 6GB in size, you may need to manually compact the store (mon compact on start = true, iirc). HTH -Joao On Mon, Jul 25, 2016 at 4:10 PM, Joao Eduardo Luis <j...@suse.de> wrote: On 07/25/2016 03:41 PM

Re: [ceph-users] Monitors not reaching quorum

2016-07-25 Thread Joao Eduardo Luis
On 07/25/2016 03:41 PM, Sergio A. de Carvalho Jr. wrote: In the logs, there 2 monitors are constantly reporting that they won the leader election: 60z0m02 (monitor 0): 2016-07-25 14:31:11.644335 7f8760af7700 0 log_channel(cluster) log [INF] : mon.60z0m02@0 won leader election with quorum 0,2,4

Re: [ceph-users] Monitors not reaching quorum

2016-07-25 Thread Joao Eduardo Luis
On 07/25/2016 03:45 PM, Joshua M. Boniface wrote: My understanding is that you need an odd number of monitors to reach quorum. This seems to match what you're seeing: with 3, there is a definite leader, but with 4, there isn't. Have you tried starting both the 4th and 5th simultaneously and le
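
A note on the quorum arithmetic raised in the quoted message above: monitors form a quorum when a strict majority of the monitors listed in the monmap are up and agree, so an even count is allowed; it just tolerates no more failures than the next smaller odd count. The sketch below only illustrates that arithmetic. The helper names `quorum_size` and `has_quorum` are hypothetical and are not part of Ceph.

```python
def quorum_size(monmap_size: int) -> int:
    """Smallest number of monitors that is a strict majority of the monmap."""
    return monmap_size // 2 + 1


def has_quorum(monmap_size: int, monitors_up: int) -> bool:
    """True if the monitors that are up can form a majority quorum."""
    return monitors_up >= quorum_size(monmap_size)


if __name__ == "__main__":
    # Columns: monitors in monmap, monitors needed, failures tolerated.
    for n in range(1, 6):
        needed = quorum_size(n)
        print(f"monmap={n} needed={needed} tolerated_failures={n - needed}")
```

Under this rule a monmap of 4 needs 3 monitors up, so 2 of 4 is not enough, while a monmap of 3 keeps quorum with one monitor stopped.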

Re: [ceph-users] Monitor question

2016-07-07 Thread Joao Eduardo Luis
quorum, or you need to add another monitor (call it C) so that you can stop A and still have the cluster working. -Joao 2016-07-07 17:34 GMT+02:00 Joao Eduardo Luis <j...@suse.de>: On 07/07/2016 04:31 PM, Fran Barrera wrote: Hello, Yes I've added two m

Re: [ceph-users] Monitor question

2016-07-07 Thread Joao Eduardo Luis
016-07-07 17:22 GMT+02:00 Joao Eduardo Luis <j...@suse.de>: On 07/07/2016 04:17 PM, Fran Barrera wrote: Hi all, I have a cluster setup AIO with only one monitor and now I've created another monitor in other server following this doc

Re: [ceph-users] Monitor question

2016-07-07 Thread Joao Eduardo Luis
On 07/07/2016 04:17 PM, Fran Barrera wrote: Hi all, I have a cluster setup AIO with only one monitor and now I've created another monitor in other server following this doc http://docs.ceph.com/docs/master/rados/operations/add-or-rm-mons/ but my problem is if I stop the AIO monitor, the cluster

Re: [ceph-users] Monitor not starting: Corruption: 12 missing files

2016-04-22 Thread Joao Eduardo Luis
On 20/04/16 14:22, daniel.balsi...@swisscom.com wrote: > > root@ceph2:~# /usr/bin/ceph-mon --cluster=ceph -i ceph2 -f > > Corruption: 12 missing files; e.g.: > /var/lib/ceph/mon/ceph-ceph2/store.db/811920.ldb > > 2016-04-20 13:16:49.019857 7f39a9cbe800 -1 error opening mon data > directory at '

Re: [ceph-users] mons die with mon/OSDMonitor.cc: 125: FAILED assert(version >= osdmap.epoch)...

2016-04-12 Thread Joao Eduardo Luis
for those in inactive/peering/unclean. Someone else will probably be able to chime in with more authority than me, but I would first try to restart the osds to which those stuck pgs are being mapped. -Joao Thanks, -- Eric On 4/12/16 1:14 PM, Joao Eduardo Luis wrote: On 04/12/2016 06:38 PM

Re: [ceph-users] mons die with mon/OSDMonitor.cc: 125: FAILED assert(version >= osdmap.epoch)...

2016-04-12 Thread Joao Eduardo Luis
m the other monitors. That'll make them happy. -Joao [1] http://docs.ceph.com/docs/master/rados/operations/add-or-rm-mons/#removing-monitors Thanks again, -- Eric On 4/12/16 11:18 AM, Joao Eduardo Luis wrote: On 04/12/2016 05:06 PM, Joao Eduardo Luis wrote: On 04/12/2016 04:27

Re: [ceph-users] mons die with mon/OSDMonitor.cc: 125: FAILED assert(version >= osdmap.epoch)...

2016-04-12 Thread Joao Eduardo Luis
On 04/12/2016 05:06 PM, Joao Eduardo Luis wrote: On 04/12/2016 04:27 PM, Eric Hall wrote: On 4/12/16 9:53 AM, Joao Eduardo Luis wrote: So this looks like the monitors didn't remove version 1, but this may just be a red herring. What matters, really, is the values in 'first_comm

Re: [ceph-users] mons die with mon/OSDMonitor.cc: 125: FAILED assert(version >= osdmap.epoch)...

2016-04-12 Thread Joao Eduardo Luis
On 04/12/2016 04:27 PM, Eric Hall wrote: On 4/12/16 9:53 AM, Joao Eduardo Luis wrote: So this looks like the monitors didn't remove version 1, but this may just be a red herring. What matters, really, is the values in 'first_committed' and 'last_committed'. If eithe

Re: [ceph-users] mons die with mon/OSDMonitor.cc: 125: FAILED assert(version >= osdmap.epoch)...

2016-04-12 Thread Joao Eduardo Luis
On 04/12/2016 03:33 PM, Eric Hall wrote: On 4/12/16 9:02 AM, Gregory Farnum wrote: On Tue, Apr 12, 2016 at 4:41 AM, Eric Hall wrote: On 4/12/16 12:01 AM, Gregory Farnum wrote: Exactly what values are you reading that's giving you those values? The "real" OSDMap epoch is going to be at least 3

Re: [ceph-users] [Ceph-community] Getting WARN in __kick_osd_requests doing stress testing

2016-02-12 Thread Joao Eduardo Luis
Hi Bart, This email belongs in ceph-users (CC'ed), or maybe ceph-devel. You're unlikely to get answers to this on ceph-community. -Joao On 09/17/2015 11:33 PM, bart.bar...@osnexus.com wrote: > I'm running in a 3-node cluster and doing osd/rbd creation and deletion, > and ran across this WARN >

Re: [ceph-users] Fwd: Question about monitor leader

2016-01-27 Thread Joao Eduardo Luis
On 01/27/2016 08:58 AM, Sándor Szombat wrote: > Hello, > > I'm testing ceph with minimal config on our servers (I'm using > ceph-deploy tool). We have 3 monitor nodes with 6 OSD. The worst > scenario when only 1 monitor and 2 osd up. Unfortunately in this case I > got just error when I run for ex

Re: [ceph-users] Ceph monitors 100% full filesystem, refusing start

2016-01-20 Thread Joao Eduardo Luis
On 01/20/2016 03:15 PM, Wido den Hollander wrote: > Hello, > > I have an issue with a (not in production!) Ceph cluster which I'm > trying to resolve. > > On Friday the network links between the racks failed and this caused all > monitors to lose connection. > > Their leveldb stores kept growin

Re: [ceph-users] Monitor rename / recreate issue -- probing state

2015-12-14 Thread Joao Eduardo Luis
On 12/14/2015 12:41 AM, deeepdish wrote: > Perhaps I’m not understanding something.. > > The “extra_probe_peers” ARE the other working monitors in quorum out of > the mon_host line in ceph.conf. > > In the example below 10.20.1.8 = b20s08; 10.20.10.251 = smon01s; > 10.20.10.252 = smon02s > > The

Re: [ceph-users] Monitor rename / recreate issue -- probing state

2015-12-13 Thread Joao Eduardo Luis
On 12/13/2015 12:26 PM, deeepdish wrote: >> >> This appears to be consistent with a wrongly populated 'mon_host' and >> 'mon_initial_members' in your ceph.conf. >> >> -Joao > > > Thanks Joao. I had a look but my other 3 monitors are working just > fine. To be clear, I’ve confirmed the same b

Re: [ceph-users] Monitor rename / recreate issue -- probing state

2015-12-10 Thread Joao Eduardo Luis
On 12/10/2015 04:00 AM, deeepdish wrote: > Hello, > > I encountered a strange issue when rebuilding monitors reusing same > hostnames, however different IPs. > > Steps to reproduce: > > - Build monitor using ceph-deploy create mon > - Remove monitor > via http://docs.ceph.com/docs/master/rados/

Re: [ceph-users] ceph-mon high cpu usage, and response slow

2015-11-30 Thread Joao Eduardo Luis
On 11/30/2015 09:51 AM, Yujian Peng wrote: > The mons in my production cluster(0.80.7) have a very high cpu usage 100%. > I added leveldb_compression = false to the ceph.conf to disable leveldb > compression and restarted all the mons with --compact. But the mons still > have a high cpu usages, and

Re: [ceph-users] Disaster recovery of monitor

2015-11-17 Thread Joao Eduardo Luis
On 11/17/2015 12:27 PM, Jose Tavares wrote: > My concern is about this log line > > 2015-11-17 10:11:16.143864 7f81e14aa700 0 > mon.osnode01@0(probing).data_health(0) update_stats avail 19% total 220 > GB, used 178 GB, avail 43194 MB > > I use to have 7TB of available space with 263G of con

Re: [ceph-users] Disaster recovery of monitor

2015-11-17 Thread Joao Eduardo Luis
On 11/17/2015 03:56 AM, Jose Tavares wrote: > The problem is that I think I don't have any good monitor anymore. > How do I know if the map I am trying is ok? > > I also saw in the logs that the primary mon was trying to contact a > removed mon at IP .112 .. So, I added .112 again ... and it didn'

Re: [ceph-users] [Ceph-community] Cephx vs. Kerberos

2015-10-19 Thread Joao Eduardo Luis
CC-ing ceph-users where this message belongs. On 10/16/2015 05:41 PM, Michael Joy wrote: > Hey Everyone, > > Is it possible to use Kerberos for authentication vs. the built in > Cephx? Does anyone know the process to get it working if it is possible? No, but it is on the wishlist for Jewel. Let

Re: [ceph-users] v9.1.0 Infernalis release candidate released

2015-10-13 Thread Joao Eduardo Luis
On 13/10/15 22:01, Sage Weil wrote: > * *RADOS*: > * The RADOS cache tier can now proxy write operations to the base > tier, allowing writes to be handled without forcing migration of > an object into the cache. > * The SHEC erasure coding support is no longer flagged as > experimen

Re: [ceph-users] [Ceph-community] Ceph MeetUp Berlin Sept 28

2015-09-08 Thread Joao Eduardo Luis
This may see more traction in ceph-users and ceph-devel. Most people don't usually subscribe to ceph-community. Cheers! -Joao On 09/08/2015 11:44 AM, Robert Sander wrote: > Hi, > > the next meetup in Berlin takes place on September 28 at 18:00 CEST. > > Please RSVP at http://www.meetup.com/

Re: [ceph-users] Ceph monitor ip address issue

2015-09-08 Thread Joao Eduardo Luis
On 09/08/2015 08:13 AM, Willi Fehler wrote: > Hi Chris, > > I tried to reconfigure my cluster but my MONs are still using the wrong > network. The new ceph.conf was pushed to all nodes and ceph was restarted. If your monitors are already deployed, you will need to move them to the new network man

Re: [ceph-users] Monitor segfault

2015-08-31 Thread Joao Eduardo Luis
On 08/31/2015 10:37 AM, Eino Tuominen wrote: > Hi Greg, > > Sure, should have gathered that myself... > > (gdb) bt > #0 0x7f071a05020b in raise () from /lib/x86_64-linux-gnu/libpthread.so.0 > #1 0x009a996d in reraise_fatal (signum=11) at > global/signal_handler.cc:59 > #2 handle_

Re: [ceph-users] Monitor segfault

2015-08-31 Thread Joao Eduardo Luis
On 08/31/2015 10:37 AM, Eino Tuominen wrote: > Hi Greg, > > Sure, should have gathered that myself... > > (gdb) bt > #0 0x7f071a05020b in raise () from /lib/x86_64-linux-gnu/libpthread.so.0 > #1 0x009a996d in reraise_fatal (signum=11) at > global/signal_handler.cc:59 > #2 handle_

Re: [ceph-users] [Ceph-community] Ceph containers Issue

2015-07-07 Thread Joao Eduardo Luis
CC'ing to ceph-users, where you're likely to get a proper response. Ceph-community is for community related matters. Cheers! -Joao On 07/07/2015 09:16 AM, Cristian Cristelotti wrote: > Hi all, > > I'm facing issue with a centralized Keystone and I can't create containers > with returning er

Re: [ceph-users] Very chatty MON logs: Is this "normal"?

2015-06-19 Thread Joao Eduardo Luis
On 06/19/2015 11:16 AM, Daniel Schneller wrote: > On 2015-06-18 09:53:54 +0000, Joao Eduardo Luis said: > >> Setting 'mon debug = 0/5' should be okay. Unless you see that setting >> '/5' impacts your performance and/or memory consumption, you should >&

Re: [ceph-users] Very chatty MON logs: Is this "normal"?

2015-06-18 Thread Joao Eduardo Luis
On 06/17/2015 08:30 PM, Somnath Roy wrote: > << However, I'd rather not set the level to 0/0, as that would disable all > logging from the MONs > > I don't think so. All the error scenarios and stack trace (in case of crash) > are supposed to be logged with log level 0. But, generally, we need t

Re: [ceph-users] Monitors not reaching quorum. (SELinux off, IPtables off, can see tcp traffic)

2015-06-02 Thread Joao Eduardo Luis
On 06/02/2015 01:42 AM, cameron.scr...@solnet.co.nz wrote: > I am trying to deploy a new ceph cluster and my monitors are not > reaching quorum. SELinux is off, firewalls are off, I can see traffic > between the nodes on port 6789 but when I use the admin socket to force > a re-election only the mo

Re: [ceph-users] Adding new CEPH monitor keep SYNCHRONIZING

2015-05-18 Thread Joao Eduardo Luis
On 05/18/2015 10:33 AM, Ali Hussein wrote: > The two old Monitors use Ceph version 0.87.1, while the newly added > Monitor uses 0.87.2 > P.S.: ntp is installed and working fine This is not related to clocks (or, at least, should not be). State 'synchronizing' means the monitor is getting its st

Re: [ceph-users] RFC: Deprecating ceph-tool commands

2015-05-09 Thread Joao Eduardo Luis
On 05/09/2015 09:57 AM, Loic Dachary wrote: > Hi, > > On 09/05/2015 01:55, Joao Eduardo Luis wrote: >> This approach gives a lifespan of roughly 3 releases (at current rate, >> roughly 1.5 years) before being completely dropped. This should give >> enough time to

Re: [ceph-users] RFC: Deprecating ceph-tool commands

2015-05-09 Thread Joao Eduardo Luis
On 09/05/15 01:28, Gregory Farnum wrote: On Fri, May 8, 2015 at 4:55 PM, Joao Eduardo Luis wrote: All, While working on #11545 (mon: have mon-specific commands under 'ceph mon ...') I crashed into a slightly tough brick wall. The purpose of #11545 is to move certain commands, suc

[ceph-users] RFC: Deprecating ceph-tool commands

2015-05-08 Thread Joao Eduardo Luis
All, While working on #11545 (mon: have mon-specific commands under 'ceph mon ...') I crashed into a slightly tough brick wall. The purpose of #11545 is to move certain commands, such as 'ceph scrub', 'ceph compact' and 'ceph sync force' to the 'mon' module of the ceph-tool. These commands have

Re: [ceph-users] The first infernalis dev release will be v9.0.0

2015-05-06 Thread Joao Eduardo Luis
On 05/05/2015 08:54 PM, Steffen W Sørensen wrote: > >> On 05/05/2015, at 18.52, Sage Weil wrote: >> >> On Tue, 5 May 2015, Tony Harris wrote: >>> So with this, will even numbers then be LTS? Since 9.0.0 is following >>> 0.94.x/Hammer, and every other release is normally LTS, I'm guessing 10.x.x,

Re: [ceph-users] The first infernalis dev release will be v9.0.0

2015-05-05 Thread Joao Eduardo Luis
On 05/04/2015 05:09 PM, Sage Weil wrote: > The first Ceph release back in Jan of 2008 was 0.1. That made sense at > the time. We haven't revised the versioning scheme since then, however, > and are now at 0.94.1 (first Hammer point release). To avoid reaching > 0.99 (and 0.100 or 1.00?) we ha

Re: [ceph-users] How to dispatch monitors in a multi-site cluster (ie in 2 datacenters)

2015-04-14 Thread Joao Eduardo Luis
On 04/14/2015 04:42 AM, Francois Lafont wrote: > Joao Eduardo wrote: > >> To be more precise, it's the lowest IP:PORT combination: >> >> 10.0.1.2:6789 = rank 0 >> 10.0.1.2:6790 = rank 1 >> 10.0.1.3:6789 = rank 2 >> >> and so on. > > Ok, so if there is 2 possible quorum, the quorum with the > lowe
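
The "lowest IP:PORT combination" ordering quoted in this message can be illustrated with a minimal sketch. It assumes ranks are handed out contiguously from 0 in ascending order of address and then port; `assign_ranks` is a hypothetical helper written for illustration and is not part of Ceph.

```python
import ipaddress


def assign_ranks(addrs):
    """Map each "IP:PORT" string to a rank, lowest IP first, then lowest port.

    Hypothetical illustration only: it assumes ranks are contiguous and
    start at 0, as the quoted example suggests.
    """
    def sort_key(addr):
        host, port = addr.rsplit(":", 1)
        # Compare addresses numerically rather than as strings,
        # so that 10.0.1.10 sorts after 10.0.1.2.
        return (ipaddress.ip_address(host), int(port))

    ordered = sorted(addrs, key=sort_key)
    return {addr: rank for rank, addr in enumerate(ordered)}


ranks = assign_ranks(["10.0.1.3:6789", "10.0.1.2:6790", "10.0.1.2:6789"])
for addr, rank in sorted(ranks.items(), key=lambda item: item[1]):
    print(f"{addr} = rank {rank}")
```

Under that assumption the three addresses from the message receive ranks 0, 1 and 2 regardless of the order in which they are listed.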

Re: [ceph-users] How to dispatch monitors in a multi-site cluster (ie in 2 datacenters)

2015-04-13 Thread Joao Eduardo Luis
On 04/13/2015 02:25 AM, Christian Balzer wrote: > On Sun, 12 Apr 2015 14:37:56 -0700 Gregory Farnum wrote: > >> On Sun, Apr 12, 2015 at 1:58 PM, Francois Lafont >> wrote: >>> Somnath Roy wrote: >>> Interesting scenario :-).. IMHO, I don't think cluster will be in healthy state here if t

Re: [ceph-users] "store is getting too big" on monitors

2015-03-23 Thread Joao Eduardo Luis
rning and errors in 'ceph health detail' that pertains to osds. -Joao -- Thanks & Regards K.Mohamed Pakkeer On Mon, Feb 16, 2015 at 8:14 PM, Joao Eduardo Luis mailto:j...@redhat.com>> wrote: On 02/16/2015 12:57 PM, Mohamed Pakkeer wrote: Hi ceph-expe

Re: [ceph-users] Issues with fresh 0.93 OSD adding to existing cluster

2015-03-12 Thread Joao Eduardo Luis
On 03/12/2015 05:16 AM, Malcolm Haak wrote: Sorry about all the unrelated grep issues.. So I've rebuilt and reinstalled and it's still broken. On the working node, even with the new packages, everything works. On the new broken node, I've added a mon and it works. But I still cannot start an O

Re: [ceph-users] "store is getting too big" on monitors

2015-02-16 Thread Joao Eduardo Luis
On 02/16/2015 12:57 PM, Mohamed Pakkeer wrote: Hi ceph-experts, We are getting "store is getting too big" on our test cluster. Cluster is running with giant release and configured as EC pool to test cephFS. cluster c2a97a2f-fdc7-4eb5-82ef-70c52f2eceb1 health HEALTH_WARN too few pgs

Re: [ceph-users] Is it possible to compile and use ceph with Raspberry Pi single-board computers?

2015-01-19 Thread Joao Eduardo Luis
ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com -- Joao Eduardo Luis Software Engineer | http://inktank.com | http://ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] mon problem after power failure

2015-01-09 Thread Joao Eduardo Luis
ware especially considering the leveldb corruption. -Joao -- Joao Eduardo Luis Software Engineer | http://ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
