Hi Max,
No, that's not normal: 9 GB for an empty cluster. Maybe you reserved some
space, or some other service is taking the space. But it seems
way too much to me.
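For what it's worth, the place to check where those 9 GB go is:

ceph df detail
ceph osd df

Keep in mind that filestore preallocates each OSD journal (5 GB each by
default) and the filesystem itself has overhead, so a freshly created
cluster never reports zero raw usage; with two or three OSDs that alone
can explain most of the 9 GB.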
On 02/03/18 at 12:09, Max Cuttins wrote:
I don't care about getting that space back.
I just want to know if it's
Hi,
Since my email server went down because of the error, I have to reply
this way.
I added more logs:
int r = store->omap_get_values(coll, pgmeta_oid, keys, &values);
if (r == 0) {
  assert(values.size() == 2);
--
     0> 2017-12-03 13:39:29.497091 7f467ba0b8c0 -1 osd/PG.cc: In
Hi,
Another OSD went down, and it's pretty scary how easy it is to break the
cluster. This time it's something related to the journal.
/usr/bin/ceph-osd -f --cluster ceph --id 6 --setuser ceph --setgroup ceph
starting osd.6 at :/0 osd_data /var/lib/ceph/osd/ceph-6
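If the journal really is the broken piece, filestore OSDs can have it
flushed and recreated; this is a hedged sketch, only safe with the OSD
stopped, and --flush-journal may itself fail if the journal is corrupt:

ceph-osd -i 6 --flush-journal   # replay pending entries into the store
ceph-osd -i 6 --mkjournal       # then create a fresh, empty journal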
Hi,
I created this: http://paste.debian.net/999172/. But the expiration date
is too short, so I also did this: https://pastebin.com/QfrE71Dg.
What I want to mention is that there's no known cause for what's
happening. It's true that time desync happens on reboot because of a few
milliseconds of skew. But ntp
pgs, 6 pools, 561 GB data, 141 kobjects
1124 GB used, 1514 GB / 2639 GB avail
20266198323167232/288980 objects degraded (7013010700798.405%)
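(That degraded figure is obviously an overflowed statistic, not a real
object count.) On the time-skew point: a quick way to check whether ntp
has actually converged after a reboot, assuming the standard tooling:

ntpq -p              # the line starting with '*' is the selected peer
ceph health detail   # monitors report 'clock skew detected' here

The monitors only warn above mon_clock_drift_allowed (0.05 s by
default), so a few milliseconds of skew on their own should be harmless.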
Best regards
On 03/12/17 13:31, Gonzalo Aguilar Delgado wrote:
>
> Hi,
>
> Yes, nice. Until all your OSDs fail and you do
Hello,
What can make this assertion fail?
int r = store->omap_get_values(coll, pgmeta_oid, keys, &values);
if (r == 0) {
  assert(values.size() == 2);
--
     0> 2017-12-03 13:39:29.497091 7f467ba0b8c0 -1 osd/PG.cc: In
function 'static int PG::peek_map_epoch(ObjectStore*, spg_t,
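In the jewel-era source, the keys set passed in just above holds the
pgmeta object's info-version and epoch omap entries, so the assert fires
when one or both of them are missing from the store. With the OSD
stopped, ceph-objectstore-tool can inspect an object's omap; a hedged
sketch, where the data path, pgid and object name are placeholders:

ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-6 \
  --journal-path /var/lib/ceph/osd/ceph-6/journal \
  --pgid 0.5 '<pgmeta object>' list-omap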
; to go fine automatically. Are you doing something that is not advised?
>
>
>
>
> -----Original Message-----
> From: Gonzalo Aguilar Delgado [mailto:gagui...@aguilardelgado.com]
> Sent: zaterdag 25 november 2017 20:44
> To: 'ceph-users'
> Subject: [ceph-users] Anot
Hello,
I had another blackout with Ceph today. It seems that Ceph OSDs fall
from time to time and are unable to recover. I have 3 OSDs down
now: 1 removed from the cluster and 2 down because I'm unable to recover
them.
We really need a recovery tool. It's not normal that an OSD breaks and
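Part of one already exists, for what it's worth: ceph-objectstore-tool
can export a PG from a stopped, broken OSD and import it into a healthy
one. A hedged sketch, with placeholder paths and pgid:

ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-2 \
  --journal-path /var/lib/ceph/osd/ceph-2/journal \
  --op export --pgid 2.7 --file /tmp/pg2.7.export
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-5 \
  --journal-path /var/lib/ceph/osd/ceph-5/journal \
  --op import --file /tmp/pg2.7.export

Both OSDs have to be stopped while the tool runs.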
seeing it fixed in there, but I'd upgrade to 10.2.10 and then
> open a tracker ticket if the problem still persists.
>
> On Thu, Oct 26, 2017 at 9:13 AM, Gonzalo Aguilar Delgado
> <gagui...@aguilardelgado.com> wrote:
>> Hello,
>>
>> I cannot tell what was the p
eek or so ago:
> http://tracker.ceph.com/issues/21803
>
> On Mon, Oct 23, 2017 at 5:10 AM, Gonzalo Aguilar Delgado
> <gagui...@aguilardelgado.com> wrote:
>> Hello,
>>
>> Since we upgraded ceph cluster we are facing a lot of problems. Most of them
>> due to osd
Hello,
Since we upgraded the Ceph cluster we are facing a lot of problems, most
of them due to OSDs crashing. What can cause this?
This morning I woke up with this message:
root@red-compute:~# ceph -w
    cluster 9028f4da-0d77-462b-be9b-dbdf7fa57771
     health HEALTH_ERR
            1 pgs are
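When ceph -w stops at HEALTH_ERR like this, the per-PG detail is the
next thing to pull:

ceph health detail
ceph pg dump_stuck

ceph health detail names each problem PG, its state and its acting
OSDs, which shows quickly whether everything traces back to one disk.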
Hi,
I discovered that my cluster starts to make slow requests and all disk
activity gets blocked.
This happens once a day, and the Ceph OSD gets to 100% CPU. In the ceph
health output I get something like:
2017-09-29 10:49:01.227257 [INF] pgmap v67494428: 764 pgs: 1
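When one OSD pins a CPU with slow requests, its admin socket shows what
the blocked ops are waiting on; a sketch, assuming the busy daemon is
osd.6:

ceph daemon osd.6 dump_ops_in_flight   # currently blocked ops and their age
ceph daemon osd.6 dump_historic_ops    # recent slow ops with event timings

The event timeline in the output tells you whether ops stall on the
journal, on sub-ops to peer OSDs, or on the local disk.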
of some
of the commands. Googling for `ceph-users scrub errors inconsistent
pgs` is a good place to start.
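On jewel the scrub findings can be listed directly; a sketch, with a
placeholder pool name and pgid:

rados list-inconsistent-pg rbd
rados list-inconsistent-obj 2.7 --format=json-pretty

The second command reports, per object, which shard is bad and why
(read error, digest mismatch, and so on), which is worth knowing before
any repair.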
On Tue, Sep 19, 2017 at 11:28 AM Gonzalo Aguilar Delgado
<gagui...@aguilardelgado.com <mailto:gagui...@aguilardelgado.com>> wrote:
Hi David,
What I want is to add
ack with its data or add it back in as
a fresh osd. What is your `ceph status`?
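For the fresh-osd route, the usual jewel-era removal sequence, assuming
the dead one is osd.1 (adapt the id):

ceph osd out osd.1
systemctl stop ceph-osd@1
ceph osd crush remove osd.1
ceph auth del osd.1
ceph osd rm osd.1
# then prepare it again (e.g. with ceph-deploy) and let backfill run

One OSD at a time, waiting for recovery to finish in between.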
On Tue, Sep 19, 2017, 5:23 AM Gonzalo Aguilar Delgado
<gagui...@aguilardelgado.com <mailto:gagui...@aguilardelgado.com>> wrote:
Hi David,
Thank you for the great explanation of the wei
is health_ok
without any missing objects, then there is nothing that you need off
of OSD1 and ceph recovered from the lost disk successfully.
On Thu, Sep 14, 2017 at 4:39 PM Gonzalo Aguilar Delgado
<gagui...@aguilardelgado.com <mailto:gagui...@aguilardelgado.com>> wrote:
H
your crush map and `ceph osd df`?
On Wed, Sep 13, 2017 at 6:39 AM Gonzalo Aguilar Delgado
<gagui...@aguilardelgado.com <mailto:gagui...@aguilardelgado.com>> wrote:
Hi,
I recently updated the crush map to 1 and did all the relocation of the
PGs. At the end I found that one of t
Hello,
I've been using Ceph for a long time. A day ago I added the jewel
requirement for OSDs and upgraded the crush map.
Since then I've had all kinds of errors, maybe because disks are failing
due to rebalances, or because there's a problem I don't know about.
I have some pg active+clean+inconsistent, from
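For a pg stuck active+clean+inconsistent, the usual sequence is to
re-verify and then repair from the authoritative copy; a sketch with a
placeholder pgid:

ceph pg deep-scrub 2.7   # re-check the PG first
ceph pg repair 2.7       # rewrite the bad copy from the good one

One caveat worth checking first with rados list-inconsistent-obj: on
jewel, repair can favour the primary's copy, so make sure the primary
is not the damaged replica.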
Hi,
I recently updated the crush map to 1 and did all the relocation of the
PGs. At the end I found that one of the OSDs is not starting.
This is what it shows:
2017-09-13 10:37:34.287248 7f49cbe12700 -1 *** Caught signal (Aborted) **
in thread 7f49cbe12700 thread_name:filestore_sync
ceph version
Hi,
Why would you want to maintain copies yourself? You replicate on Ceph
and then again across different files inside Ceph? Let Ceph take care of
counting. Create a pool with 3 or more copies and let Ceph take care of
what's stored and where; a sketch follows below.
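A minimal sketch of that, with a placeholder pool name and PG count:

ceph osd pool create mypool 128 128
ceph osd pool set mypool size 3      # keep 3 copies of every object
ceph osd pool set mypool min_size 2  # keep serving I/O with 2 copies alive

With the default CRUSH rule the copies land on different hosts, and
Ceph re-replicates on its own when a disk dies.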
Best regards,
On 13/07/17 at 17:06,
again.
I suppose it was in a stale situation. On Wed, 2016-05-11 at 09:37 +0200,
Gonzalo Aguilar Delgado wrote:
> Hello again,
>
> I was looking at the patches sent on the repository and I found a
> patch that made the OSD to check for cluster health before starting
> up.
>
>
Hello again,
I was looking at the patches sent to the repository and I found a patch
that makes the OSD check for cluster health before starting up.
Can this patch be the source of all my problems?
Best regards,
On Tue, May 10, 2016 at 6:07 PM, Gonzalo Aguilar Delgado <
gaguilar.d
eff735598c0 0 probe_block_device_fsid
/dev/sdf2 is filestore, fd069e6a-9a62-4286-99cb-d8a523bd946a
r
On Tue, May 10, 2016 at 6:07 PM, Gonzalo Aguilar Delgado <
gaguilar.delg...@gmail.com> wrote:
> Hello,
>
> I just upgraded my cluster to version 10.1.2 and it worked well for a
> while unti
"9028f4da-0d77-462b-be9b-dbdf7fa57771",
"osd_fsid": "8dd085d4-0b50-4c80-a0ca-c5bc4ad972f7",
"whoami": 3,
"state": "preboot",
"oldest_map": 1764,
"newest_map": 2504,
"num_pgs": 150
}
3 is up
Hello,
I just upgraded my cluster to version 10.1.2 and it worked well for a
while until I saw that systemctl ceph-disk@dev-sdc1.service had failed
and I reran it.
From there the OSD stopped working.
This is ubuntu 16.04.
I connected to IRC looking for help, where people pointed me
Hi,
I just ran this test and found my system is no better. But I use
commodity hardware. The only difference is latency; you should look at
it.
Total time run: 62.412381
Total writes made: 919
Write size: 4194304
Bandwidth (MB/sec): 58.899
Stddev Bandwidth:
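For reference, output in that shape comes from the built-in benchmark;
to reproduce it, with the pool name as a placeholder:

rados bench -p rbd 60 write --no-cleanup   # 60 s of 4 MB writes
rados bench -p rbd 60 seq                  # sequential reads of the same data
rados -p rbd cleanup                       # drop the benchmark objects

Comparing write and seq numbers helps separate disk latency from
network latency.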
Hi,
I've found my system with its memory almost full. I see:
  PID USER      PR  NI    VIRT    RES   SHR S  %CPU %MEM     TIME+ COMMAND
 2317 root      20   0  824860 647856  3532 S   0.7  5.3  29:46.51 ceph-mon
I think it's too much. But what do you think?
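A way to see what the monitor is actually holding, assuming the mon id
is the hostname:

ceph daemon mon.red-compute perf dump   # internal counters, store sizes
ceph tell mon.red-compute compact       # compact the mon's backing store

Around 650 MB resident is high but not unheard of; an uncompacted
monitor store is a common cause.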
Best regards,
it.
Thank you a lot, Michael.
Thanks,
Michael J. Kidd
Sr. Storage Consultant
Inktank Professional Services
On Sat, Apr 19, 2014 at 5:51 PM, Gonzalo Aguilar Delgado
gagui...@aguilardelgado.com wrote:
Hi Michael,
It worked. I didn't realize this because the docs install two
OSD nodes
Hi,
I made my first mistake, a big one... I created an RBD disk of about
300 TB, yes, 300 TB:
rbd info test-disk -p high_value
rbd image 'test-disk':
        size 300 TB in 78643200 objects
        order 22 (4096 kB objects)
        block_name_prefix: rb.0.18d7.2ae8944a
        format: 1
but even more.
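Since the image is format 1 and presumably still empty, it can simply be
shrunk back; a hedged sketch (sizes are in MB for rbd of this vintage,
and newer versions demand --allow-shrink because everything past the new
end is destroyed):

rbd resize --size 307200 --allow-shrink test-disk -p high_value   # 300 GB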
serving in production.
Thanks,
Michael J. Kidd
Sr. Storage Consultant
Inktank Professional Services
On Sat, Apr 19, 2014 at 5:51 PM, Gonzalo Aguilar Delgado
gagui...@aguilardelgado.com wrote:
Hi Michael,
It worked. I didn't realize this because the docs install two
OSD nodes and say
Hi,
when I create an OSD with ceph-deploy with:
ceph-deploy osd prepare --fs-type btrfs red-compute:sdb
I see the system creates two partitions on the disk: one for data, one
for the journal. This is right, since I want to use an SSD disk for
journals, but I want to follow the bcache path. So one SSD
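For completeness, ceph-deploy already accepts a separate journal device
via the host:data:journal triplet; a sketch, with placeholder devices:

ceph-deploy osd prepare --fs-type btrfs red-compute:sdb:/dev/sdc1

With bcache the usual approach is instead to build the bcache device
first and hand /dev/bcache0 to ceph-deploy as the data disk.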
Hi,
I'm building a cluster where two nodes replicate objects. I
found that shutting down just one of the nodes (the second one) makes
everything incomplete.
I cannot find out why, since the crushmap looks good to me.
After shutting down one node:
cluster
to choose type
osd in your rulesets.
JC
On Saturday, April 19, 2014, Gonzalo Aguilar Delgado
gagui...@aguilardelgado.com wrote:
Hi,
I'm building a cluster where two nodes replicate objects. I
found that shutting down just one of the nodes (the second one)
makes everything incomplete
: for testing efficiently and with most options available,
functionally speaking, deploying a cluster with 3 nodes, 3 OSDs each,
is my best practice.
Or make 1 node with 3 OSDs, modifying your crushmap to choose type
osd in your rulesets; a sketch of that edit follows below.
JC
On Saturday, April 19, 2014, Gonzalo Aguilar Delgado
gagui
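A hedged sketch of that crushmap edit (decompile, switch the rule from
host to osd, recompile):

ceph osd getcrushmap -o crush.bin
crushtool -d crush.bin -o crush.txt
# in the rule, change:  step chooseleaf firstn 0 type host
#                  to:  step chooseleaf firstn 0 type osd
crushtool -c crush.txt -o crush.new
ceph osd setcrushmap -i crush.new

With type osd, replicas may land on the same node; fine for a test rig,
but it defeats redundancy on real hardware.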