Re: [ceph-users] Cluster is empty but it still uses 1Gb of data

2018-03-02 Thread Gonzalo Aguilar Delgado
Hi Max, no, that's not normal: 9GB for an empty cluster. Maybe you reserved some space or another service is taking it, but it seems way too much to me. On 02/03/18 at 12:09, Max Cuttins wrote: I don't care about getting back that space. I just want to know if it's
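
A quick way to see where that space actually sits, assuming a default deployment, is to compare raw and per-pool usage:

    ceph df
    ceph df detail

The GLOBAL section counts journals and filesystem overhead, which typically adds up to a few GB per OSD even with no objects stored.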

Re: [ceph-users] PG::peek_map_epoch assertion fail

2017-12-06 Thread Gonzalo Aguilar Delgado
Hi, since my email server went down because of the error, I have to reply this way. I added more logs:   int r = store->omap_get_values(coll, pgmeta_oid, keys, &values);   if (r == 0) {     assert(values.size() == 2); -- 0> 2017-12-03 13:39:29.497091 7f467ba0b8c0 -1 osd/PG.cc: In

[ceph-users] I cannot make the OSD work, the journal always breaks 100% of the time

2017-12-06 Thread Gonzalo Aguilar Delgado
Hi, another OSD went down, and it's pretty scary how easy it is to break the cluster. This time it's something related to the journal. /usr/bin/ceph-osd -f --cluster ceph --id 6 --setuser ceph --setgroup ceph starting osd.6 at :/0 osd_data /var/lib/ceph/osd/ceph-6
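
A sketch of what could be tried on a journal that refuses to replay, assuming the OSD data itself is intact; the osd id just mirrors the one above:

    # try to flush whatever is still in the journal
    ceph-osd -i 6 --flush-journal
    # only if the journal is truly unrecoverable, create a fresh one
    ceph-osd -i 6 --mkjournal

Recreating the journal without a successful flush can lose the most recent writes, so it is a last resort.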

Re: [ceph-users] Another OSD broken today. How can I recover it?

2017-12-05 Thread Gonzalo Aguilar Delgado
Hi, I created this: http://paste.debian.net/999172/ But the expiration date is too short, so I did this too: https://pastebin.com/QfrE71Dg. What I want to mention is that there's no known cause for what's happening. It's true that a time desync happens on reboot because of a few milliseconds of skew, but ntp

Re: [ceph-users] Another OSD broken today. How can I recover it?

2017-12-04 Thread Gonzalo Aguilar Delgado
pgs, 6 pools, 561 GB data, 141 kobjects     1124 GB used, 1514 GB / 2639 GB avail     20266198323167232/288980 objects degraded (7013010700798.405%) Best regards On 03/12/17 13:31, Gonzalo Aguilar Delgado wrote: > > Hi, > > Yes. Nice. Until all your OSD fails and you do

[ceph-users] PG::peek_map_epoch assertion fail

2017-12-03 Thread Gonzalo Aguilar Delgado
Hello, what can make this assertion fail?   int r = store->omap_get_values(coll, pgmeta_oid, keys, &values);   if (r == 0) {     assert(values.size() == 2); -- 0> 2017-12-03 13:39:29.497091 7f467ba0b8c0 -1 osd/PG.cc: In function 'static int PG::peek_map_epoch(ObjectStore*, spg_t,
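
The assert fires when the pgmeta object does not carry the two omap keys the code expects. A sketch of how to look at what is actually on disk, assuming a stopped filestore OSD (the osd path and pgid below are placeholders):

    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-6 \
        --journal-path /var/lib/ceph/osd/ceph-6/journal --op list-pgs
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-6 \
        --journal-path /var/lib/ceph/osd/ceph-6/journal --pgid 0.12 --op info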

Re: [ceph-users] Another OSD broken today. How can I recover it?

2017-12-03 Thread Gonzalo Aguilar Delgado
; to go fine automatically. Are you doing something that is not advised? > > > > > -Original Message----- > From: Gonzalo Aguilar Delgado [mailto:gagui...@aguilardelgado.com] > Sent: Saturday 25 November 2017 20:44 > To: 'ceph-users' > Subject: [ceph-users] Anot

[ceph-users] Another OSD broken today. How can I recover it?

2017-11-25 Thread Gonzalo Aguilar Delgado
Hello, I had another blackout with Ceph today. It seems that Ceph OSDs fall over from time to time and are unable to recover. I have 3 OSDs down now: 1 removed from the cluster and 2 down because I'm unable to recover them. We really need a recovery tool. It's not normal that an OSD breaks and
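
For reference, the usual sequence for permanently removing a dead OSD from the cluster (the id is a placeholder) is:

    ceph osd out 6
    ceph osd crush remove osd.6
    ceph auth del osd.6
    ceph osd rm 6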

Re: [ceph-users] Infinite degraded objects

2017-10-25 Thread Gonzalo Aguilar Delgado
seeing is fixed in there but I'd upgrade to 10.2.10 and then > open a tracker ticket if the problem still persists. > > On Thu, Oct 26, 2017 at 9:13 AM, Gonzalo Aguilar Delgado > <gagui...@aguilardelgado.com> wrote: >> Hello, >> >> I cannot tell what was the p

Re: [ceph-users] Infinite degraded objects

2017-10-25 Thread Gonzalo Aguilar Delgado
eek or so ago: > http://tracker.ceph.com/issues/21803 > > On Mon, Oct 23, 2017 at 5:10 AM, Gonzalo Aguilar Delgado > <gagui...@aguilardelgado.com> wrote: >> Hello, >> >> Since we upgraded ceph cluster we are facing a lot of problems. Most of them >> due to osd

[ceph-users] Infinite degraded objects

2017-10-22 Thread Gonzalo Aguilar Delgado
Hello, since we upgraded the Ceph cluster we are facing a lot of problems, most of them due to OSDs crashing. What can cause this? This morning I woke up to this message: root@red-compute:~# ceph -w     cluster 9028f4da-0d77-462b-be9b-dbdf7fa57771 health HEALTH_ERR     1 pgs are
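
To narrow down which PGs are behind a HEALTH_ERR like this, something along these lines usually helps:

    ceph health detail
    ceph pg dump_stuck unclean
    ceph pg <pgid> query     # pgid taken from the output above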

[ceph-users] Ceph OSD gets blocked and starts to make inconsistent PGs from time to time

2017-09-29 Thread Gonzalo Aguilar Delgado
Hi, I discovered that my cluster starts to make slow requests and all disk activity gets blocked. This happens once a day, and the Ceph OSD goes to 100% CPU. In the ceph health output I get something like: 2017-09-29 10:49:01.227257 [INF] pgmap v67494428: 764 pgs: 1
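
When an OSD sits at 100% CPU with slow requests, the admin socket shows what it is stuck on. A minimal sketch, assuming osd.0 is the misbehaving daemon and the command is run on its node:

    ceph daemon osd.0 dump_ops_in_flight
    ceph daemon osd.0 dump_historic_ops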

Re: [ceph-users] Ceph OSD crash starting up

2017-09-20 Thread Gonzalo Aguilar Delgado
of some of the commands. Googling for `ceph-users scrub errors inconsistent pgs` is a good place to start. On Tue, Sep 19, 2017 at 11:28 AM Gonzalo Aguilar Delgado <gagui...@aguilardelgado.com <mailto:gagui...@aguilardelgado.com>> wrote: Hi David, What I want is to add

Re: [ceph-users] Ceph OSD crash starting up

2017-09-19 Thread Gonzalo Aguilar Delgado
ack with its data or add it back in as a fresh osd. What is your `ceph status`? On Tue, Sep 19, 2017, 5:23 AM Gonzalo Aguilar Delgado <gagui...@aguilardelgado.com <mailto:gagui...@aguilardelgado.com>> wrote: Hi David, Thank you for the great explanation of the wei

Re: [ceph-users] Ceph OSD crash starting up

2017-09-19 Thread Gonzalo Aguilar Delgado
is health_ok without any missing objects, then there is nothing that you need off of OSD1 and ceph recovered from the lost disk successfully. On Thu, Sep 14, 2017 at 4:39 PM Gonzalo Aguilar Delgado <gagui...@aguilardelgado.com <mailto:gagui...@aguilardelgado.com>> wrote: H

Re: [ceph-users] Ceph OSD crash starting up

2017-09-14 Thread Gonzalo Aguilar Delgado
your crush map and `ceph osd df`? On Wed, Sep 13, 2017 at 6:39 AM Gonzalo Aguilar Delgado <gagui...@aguilardelgado.com <mailto:gagui...@aguilardelgado.com>> wrote: Hi, I recently updated the crush map to 1 and did all the relocation of the pgs. At the end I found that one of t

[ceph-users] Scrub failing all the time, new inconsistencies keep appearing

2017-09-14 Thread Gonzalo Aguilar Delgado
Hello, I've been using Ceph for a long time. A day ago I added the jewel requirement for OSDs and upgraded the crush map. Since then I've had all kinds of errors, maybe because disks are failing due to the rebalances, or because there's a problem I don't know about. I have some PGs active+clean+inconsistent, from
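
On jewel the inconsistencies can be listed per PG before deciding whether a repair is safe; a sketch, with the pgid as a placeholder:

    rados list-inconsistent-obj 2.11 --format=json-pretty
    ceph pg repair 2.11

Repair overwrites the copies it considers wrong, so checking the inconsistent objects first is worth the extra step.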

[ceph-users] Ceph OSD crash starting up

2017-09-13 Thread Gonzalo Aguilar Delgado
Hi, I recently updated the crush map to 1 and did all the relocation of the pgs. At the end I found that one of the OSDs is not starting. This is what it shows: 2017-09-13 10:37:34.287248 7f49cbe12700 -1 *** Caught signal (Aborted) ** in thread 7f49cbe12700 thread_name:filestore_sync ceph version

Re: [ceph-users] Ceph mount rbd

2017-07-14 Thread Gonzalo Aguilar Delgado
Hi, why would you want to maintain copies yourself? You replicate in Ceph and then again across different files inside Ceph? Let Ceph take care of the counting: create a pool with 3 or more copies and let Ceph take care of what's stored and where. Best regards, On 13/07/17 at 17:06,
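
A minimal sketch of what that looks like in practice; the pool name and PG count are just examples:

    ceph osd pool create rbd-data 128 128 replicated
    ceph osd pool set rbd-data size 3
    ceph osd pool set rbd-data min_size 2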

Re: [ceph-users] Ceph OSD not going up and joining the cluster. OSD does not go up. ceph version 10.1.2

2016-05-11 Thread Gonzalo Aguilar Delgado
again. I suppose it was in a stale situation. On Wed, 2016-05-11 at 09:37 +0200, Gonzalo Aguilar Delgado wrote: > Hello again,  > > I was looking at the patches sent to the repository and I found a > patch that made the OSD check for cluster health before starting > up.  > &

Re: [ceph-users] Ceph OSD not going up and joining the cluster. OSD does not go up. ceph version 10.1.2

2016-05-11 Thread Gonzalo Aguilar Delgado
Hello again, I was looking at the patches sent to the repository and I found a patch that made the OSD check for cluster health before starting up. Can this patch be the source of all my problems? Best regards, On Tue, May 10, 2016 at 6:07 PM, Gonzalo Aguilar Delgado < gaguilar.d

Re: [ceph-users] Ceph OSD not going up and joining the cluster. OSD does not go up. ceph version 10.1.2

2016-05-10 Thread Gonzalo Aguilar Delgado
eff735598c0 0 probe_block_device_fsid /dev/sdf2 is filestore, fd069e6a-9a62-4286-99cb-d8a523bd946a r On Tue, May 10, 2016 at 6:07 PM, Gonzalo Aguilar Delgado < gaguilar.delg...@gmail.com> wrote: > Hello, > > I just upgraded my cluster to the version 10.1.2 and it worked well for a > while unti

Re: [ceph-users] Ceph OSD not going up and joining the cluster. OSD does not go up. ceph version 10.1.2

2016-05-10 Thread Gonzalo Aguilar Delgado
"9028f4da-0d77-462b-be9b-dbdf7fa57771", "osd_fsid": "8dd085d4-0b50-4c80-a0ca-c5bc4ad972f7", "whoami": 3, "state": "preboot", "oldest_map": 1764, "newest_map": 2504, "num_pgs": 150 } 3 is up
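
That JSON looks like the output of the OSD admin socket status command, which can be pulled directly on the node running the daemon:

    ceph daemon osd.3 status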

[ceph-users] Ceph OSD not going up and joining the cluster. OSD does not go up. ceph version 10.1.2

2016-05-10 Thread Gonzalo Aguilar Delgado
Hello, I just upgraded my cluster to version 10.1.2 and it worked well for a while, until I saw that the systemd unit ceph-disk@dev-sdc1.service had failed and I reran it. From there the OSD stopped working. This is Ubuntu 16.04. I connected to IRC looking for help, where people pointed me
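
For reference, when the ceph-disk unit fails, activating the partition by hand often gives a more useful error than the unit itself; a sketch, assuming the data partition is /dev/sdc1:

    systemctl status ceph-disk@dev-sdc1.service
    ceph-disk activate /dev/sdc1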

Re: [ceph-users] Slow/Hung IOs

2015-01-06 Thread Gonzalo Aguilar Delgado
Hi, I just ran this test and found my system is no better, but I use commodity hardware. The only difference is latency; you should look at it. Total time run: 62.412381 Total writes made: 919 Write size: 4194304 Bandwidth (MB/sec): 58.899 Stddev Bandwidth:
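
For anyone who wants to compare, those numbers look like the output of a roughly 60-second rados bench write run; the pool name here is a placeholder:

    rados bench -p rbd 60 write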

[ceph-users] ceph-mon is taking too much memory. Is it a bug?

2014-04-30 Thread Gonzalo Aguilar Delgado
Hi, I've found my system with memory almost full. I see: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 2317 root 20 0 824860 647856 3532 S 0,7 5,3 29:46.51 ceph-mon I think it's too much, but what do you think? Best regards,
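
If the monitor is built with tcmalloc, part of that resident memory is often freeable heap, which is worth checking before assuming a leak; a sketch, with the mon id as a placeholder:

    ceph tell mon.a heap stats
    ceph tell mon.a heap release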

Re: [ceph-users] Ceph not replicating

2014-04-22 Thread Gonzalo Aguilar Delgado
it. Thank you a lot, Michael. Thanks, Michael J. Kidd Sr. Storage Consultant Inktank Professional Services On Sat, Apr 19, 2014 at 5:51 PM, Gonzalo Aguilar Delgado gagui...@aguilardelgado.com wrote: Hi Michael, It worked. I didn't realize this because the docs install two OSD nodes

[ceph-users] rbd - huge disk - slow ceph

2014-04-22 Thread Gonzalo Aguilar Delgado
Hi, I made my first mistake, a big one... I created an rbd disk of about 300 TB, yes 300 TB: rbd info test-disk -p high_value rbd image 'test-disk': size 300 TB in 78643200 objects order 22 (4096 kB objects) block_name_prefix: rb.0.18d7.2ae8944a format: 1 but even more.
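
In case it helps someone else, an image that was made far too big can be shrunk back or simply deleted; a sketch, with the size given in MB (newer releases also require --allow-shrink to shrink):

    rbd resize --pool high_value --size 102400 test-disk
    # or get rid of it entirely
    rbd rm high_value/test-disk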

Re: [ceph-users] Ceph not replicating

2014-04-22 Thread Gonzalo Aguilar Delgado
serving in production. Thanks, Michael J. Kidd Sr. Storage Consultant Inktank Professional Services On Sat, Apr 19, 2014 at 5:51 PM, Gonzalo Aguilar Delgado gagui...@aguilardelgado.com wrote: Hi Michael, It worked. I didn't realize this because the docs install two OSD nodes and say

[ceph-users] Journal partition on prepare

2014-04-20 Thread Gonzalo Aguilar Delgado
Hi, when I create an OSD with ceph-deploy using: ceph-deploy osd prepare --fs-type btrfs red-compute:sdb I see the system creates two partitions on the disk, one for data and one for the journal. This is fine, since I want to use an SSD disk for journals, but I want to follow the bcache path. So one SSD
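
For the record, ceph-deploy also accepts an explicit journal device in HOST:DATA:JOURNAL form, so the journal can be placed on the SSD; the device names here are just examples:

    ceph-deploy osd prepare --fs-type btrfs red-compute:sdb:/dev/sdd1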

[ceph-users] Ceph not replicating

2014-04-19 Thread Gonzalo Aguilar Delgado
Hi, I'm building a cluster where two nodes replicate objects between them. I found that shutting down just one of the nodes (the second one) makes everything incomplete. I cannot find out why, since the crushmap looks good to me. After shutting down one node the cluster

Re: [ceph-users] Ceph not replicating

2014-04-19 Thread Gonzalo Aguilar Delgado
to choose type osd in your rulesets. JC On Saturday, April 19, 2014, Gonzalo Aguilar Delgado gagui...@aguilardelgado.com wrote: Hi, I'm building a cluster where two nodes replicate objects inside. I found that shutting down just one of the nodes (the second one), makes everything incomplete

Re: [ceph-users] Ceph not replicating

2014-04-19 Thread Gonzalo Aguilar Delgado
: for testing efficiently with most options available, functionally speaking, deploying a cluster with 3 nodes and 3 OSDs each is my best practice. Or make 1 node with 3 OSDs, modifying your crushmap to choose type osd in your rulesets. JC On Saturday, April 19, 2014, Gonzalo Aguilar Delgado gagui
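
A sketch of that second suggestion: decompile the crushmap, change the chooseleaf step in the rule to pick OSDs instead of hosts, and load it back (file names are arbitrary):

    ceph osd getcrushmap -o crush.bin
    crushtool -d crush.bin -o crush.txt
    # in the rule, change:  step chooseleaf firstn 0 type host
    #                  to:  step chooseleaf firstn 0 type osd
    crushtool -c crush.txt -o crush.new
    ceph osd setcrushmap -i crush.new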