Hello List,
the other day when I looked at our Ceph cluster it showed:
health HEALTH_ERR 135 pgs inconsistent; 1 pgs recovering;
recovery 76/4633296 objects degraded (0.002%); 169 scrub errors; clock
skew detected on mon.mon2-nb8
I did a
ceph pg dump | grep -i incons | cut -f 1 | while r
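i.e. the usual repair loop over every inconsistent pg, something like:

  ceph pg dump | grep -i incons | cut -f 1 | while read pg; do
      ceph pg repair $pg    # kick off a repair for each inconsistent pg
  done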
2014-06-06 9:18 GMT+02:00 Benedikt Fraunhofer:
Hello List,
> and it logs nothing in "ceph -w" when I issue
>
> ceph pg repair 2.c1
> instructing pg 2.c1 on osd.51 to repair
> ceph pg repair 2.68
> instructing pg 2.68 on osd.69 to repair
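To double-check whether a repair was at least scheduled, you can query the pg directly, e.g.:

  ceph pg 2.c1 query     # shows the pg's current and recent recovery state
  ceph health detail     # lists which pgs still carry scrub errors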
Rebooting the hosts
Hi Nick,
did you do anything fancy to get to ~90MB/s in the first place?
I'm stuck at ~30MB/s reading cold data. Single-threaded writes are
quite speedy, around 600MB/s.
radosgw for cold data is around 90MB/s, which is IMHO limited by
the speed of a single disk.
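For the record, this is roughly how I measure it (pool name is just an example; drop the OSD page caches first so the reads are actually cold):

  rados bench -p testpool 60 write --no-cleanup   # seed objects to read back
  rados bench -p testpool 60 seq -t 1             # single-threaded sequential reads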
Data already present on the
Hi Mika,
2014-10-20 11:16 GMT+02:00 Vickie CH:
> 2.Use dd command to create a 1.2T file.
>#dd if=/dev/zero of=/mnt/ceph-mount/test12T bs=1M count=12288000
I think you're off by one "zero":
12288000 / 1024 / 1024 ≈ 11.7
That means you're instructing it to create an ~11.7TB file on a 1.5T volume.
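With one zero less it comes out right:

  dd if=/dev/zero of=/mnt/ceph-mount/test12T bs=1M count=1228800   # 1228800 MiB ≈ 1.17 TiB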
Cheers
Hello everyone,
I can't download anything that's been uploaded as a multipart-upload.
I'm on 0.78 (f6c746c314d7b87b8419b6e584c94bfe4511dbd4)
on a non-ec-pool.
The upload is acknowledged as being OK:
2014-03-31 14:56:56.722727 7f4080ff9700 2 req 8:0.023285:s3:POST
/file1:complete_multipart:http
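For what it's worth, the objects go in as plain multipart PUTs; with an s3cmd-style client something like this triggers the multipart path (bucket name and chunk size are just examples):

  s3cmd put --multipart-chunk-size-mb=15 ./bigfile s3://test-bucket/bigfile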
Hi Yehuda,
> 2014-04-01 15:49 GMT+02:00 Yehuda Sadeh:
> It could be the gateway's fault, might be related to the new manifest
> that went in before 0.78. I'll need more logs though, can you
> reproduce with 'debug ms = 1', and 'debug rgw = 20', and provide a log
> for all the upload and for the
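Those two settings went into ceph.conf on the gateway box, roughly like this (the section name depends on what the radosgw instance is called):

  [client.radosgw.gateway]
      debug ms = 1
      debug rgw = 20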
Hi Yehuda,
I tried your patch and it feels fine,
except you might need some special handling for those already-corrupt uploads,
as trying to delete them sends radosgw into an endless loop with high CPU usage:
2014-04-02 11:03:15.045627 7fbf157d2700 0
RGWObjManifest::operator++(): result: ofs=3355443
2014-04-04 0:31 GMT+02:00 Yehuda Sadeh:
Hi Yehuda,
sorry for the delay. We ran into another problem and this took up all the time.
>> Are you running the version off the master branch, or did you just
>> cherry-pick the patch? I can't seem to reproduce the problem.
I just patched that line in a
Hello Yehuda,
2014-04-04 9:35 GMT+02:00 Benedikt Fraunhofer:
> I'm currently trying to find a faster box with enough free space to
> reproduce that and capture logs.
Here's the complete log with "debug rgw 20" and "debug ms 1" of a
failed large-ish multi
Hello List,
after a crash of one box, the journal vanished. Creating a new one
with --mkjournal results in the osd being unable to start.
Does anyone want to dissect this any further or should I just trash
the osd and recreate it?
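For reference, the invocation was along the lines of (the osd id is a placeholder):

  ceph-osd -i 42 --mkjournal   # recreate an empty journal for osd.42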
Thx in advance
Benedikt
2015-12-01 07:46:31.505255 7fadb7f1e9
Hello Cephers!
trying to repair an inconsistent PG results in the osd dying with an
assertion failure:
0> 2015-12-01 07:22:13.398006 7f76d6594700 -1 osd/SnapMapper.cc:
In function 'int SnapMapper::get_snaps(const hobject_t&
, SnapMapper::object_snaps*)' thread 7f76d6594700 time 2015-12-01
07
Hello Cephers,
lately, our ceph cluster started to show some weird behavior:
the osd boxes show a load of 5000-15000 before the osds get marked down.
Usually the box is fully usable; even "apt-get dist-upgrade" runs smoothly
and you can read and write to any disk. The only things you can't do are strace
Hi Jan,
2015-12-08 8:12 GMT+01:00 Jan Schermer:
> Journal doesn't just "vanish", though, so you should investigate further...
We tried putting journals as files to overcome the changes in ceph-deploy
where you can't have the journals unencrypted but only the disks themselves.
(and/or you can't have
= 4194304
> (I think it also sets this as well: kernel.threads-max = 4194304)
>
> I think you are running out of process IDs.
>
> Jan
>
>> On 08 Dec 2015, at 08:10, Benedikt Fraunhofer wrote:
>>
>> Hello Cephers,
>>
>> lately, our ceph-cluster started t
Hi Jan,
we had 65k for pid_max, which made
kernel.threads-max = 1030520
or
kernel.threads-max = 256832
(the kernel apparently derives the default from the amount of memory in the box)
currently we have:
root@ceph1-store209:~# sysctl -a | grep -e thread -e pid
kernel.cad_pid = 1
kernel.core_uses_pid = 0
kernel.ns_last_pid = 60298
k
s case -
don't get any data to work with.
Thx
Benedikt
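In case someone finds this thread later, here's a sketch of Jan's suggestion as a persistent sysctl drop-in (values are the ones from his mail, adjust to taste):

  # /etc/sysctl.d/90-ceph-pids.conf
  kernel.pid_max = 4194304
  kernel.threads-max = 4194304

then "sysctl --system" to apply it without a reboot.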
2015-12-08 8:44 GMT+01:00 Jan Schermer:
>
> Jan
>
>
>> On 08 Dec 2015, at 08:41, Benedikt Fraunhofer wrote:
>>
>> Hi Jan,
>>
>> we had 65k for pid_max, which made
>> kernel.threads-max
Hi Tom,
> We have been seeing this same behavior on a cluster that has been perfectly
> happy until we upgraded to the Ubuntu Vivid 3.19 kernel. We are in the
I can't recall when we gave 3.19 a shot, but now that you say it... the
cluster was happy for >9 months with 3.16.
Did you try 4.2 or do y
Hi Tom,
2015-12-08 10:34 GMT+01:00 Tom Christensen:
> We didn't go forward to 4.2 as it's a large production cluster, and we just
> needed the problem fixed. We'll probably test out 4.2 in the next couple
Unfortunately we don't have the luxury of a test cluster.
And to add to that, we couldn't s