Hi HP.
Mine was not really a fix; it was just a hack to get the OSD up long enough
to make sure I had a full backup. After that I rebuilt the cluster from
scratch and restored the data. Though the hack did stop the OSD from
crashing, it is probably a symptom of some internal problem, and may not be
> osd_pool_default_min_size = ??
>
> And what's the setting for the number of copies to create?
>
> osd_pool_default_size = ???
>
> Please give us the output of
>
> ceph osd pool ls detail
>
>
>
>
> --
> Mit freundlichen Gruessen / Best regards
>
> Oliver Dzombic
> IP-Interac
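For reference, those per-pool values can also be pulled programmatically; a rough sketch (not from this thread) that shells out to "ceph osd dump --format json" and assumes the Hammer-era field names "pool_name", "size" and "min_size":

import json
import subprocess

# Dump the osdmap as JSON and print the replica settings per pool.
# The field names used here are assumptions; adjust if your output differs.
raw = subprocess.check_output(["ceph", "osd", "dump", "--format", "json"])
osdmap = json.loads(raw.decode("utf-8"))
for pool in osdmap.get("pools", []):
    print("%-20s size=%s min_size=%s" % (
        pool.get("pool_name"), pool.get("size"), pool.get("min_size")))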
Suggestions appreciated,
Blade.
On Sat, Apr 30, 2016 at 9:31 AM, Blade Doyle <blade.do...@gmail.com> wrote:
Hi Ceph-Users,
Help with how to resolve these would be appreciated.
2016-04-30 09:25:58.399634 9b809350 0 log_channel(cluster) log [INF] : 4.97 deep-scrub starts
2016-04-30 09:26:00.041962 93009350 0 -- 192.168.2.52:6800/6640 >> 192.168.2.32:0/3983425916 pipe(0x27406000 sd=111 :6800 s=0 pgs=0
I went ahead and removed the assert and conditionalized the future use of
the obc variable on its being non-null, and linked that into a custom
ceph-osd binary for use on the most problematic node (8). That got the OSD
up and running again! I took the opportunity to use the standard "remove
an
b350 -1 *** Caught signal (Aborted) **
 in thread 9d0bb350
ceph version 0.94.6 (e832001feaf8c176593e0325c8298e3f16dfb403)
1: /usr/bin/ceph-osd() [0x69764c]
2: (__default_sa_restorer()+0) [0xb694ed10]
3: (gsignal()+0x38) [0xb694daa8]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Aborted
On Apr 20, 2016 at 12:37 AM, Blade Doyle <blade.do...@gmail.com> wrote:
I get a lot of OSD crashes with the following stack - suggestions please:
 0> 1969-12-31 16:04:55.455688 83ccf410 -1 osd/ReplicatedPG.cc: In function 'void ReplicatedPG::hit_set_trim(ReplicatedPG::RepGather*, unsigned int)' thread 83ccf410 time 295.324905
osd/ReplicatedPG.cc: 11011: FAILED assert(obc)
Help, my Ceph cluster is losing data slowly over time. I keep finding files
that are the same length as they should be, but all the content has been
lost & replaced by nulls.
Here is an example:
(from a backup I have the original file)
[root@blotter docker]# ls -lart
On Mon, Mar 14, 2016 at 3:48 PM, Christian Balzer <ch...@gol.com> wrote:
>
> Hello,
>
> On Mon, 14 Mar 2016 09:16:13 -0700 Blade Doyle wrote:
Hi Ceph Community,
I am trying to use "ceph -w" output to monitor my ceph cluster. The basic
setup is:
A Python script runs ceph -w and processes each line of output. It finds
the data it wants and reports it to InfluxDB. I view the data using
Grafana and Ceph Dashboard.
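For reference, a minimal sketch of that kind of wrapper (not the actual script from this thread; it assumes the python "requests" module, an InfluxDB 1.x-style HTTP line-protocol endpoint, a database named "ceph", and that only the periodic pgmap status lines are parsed):

import re
import subprocess
import requests

INFLUX_WRITE = "http://127.0.0.1:8086/write?db=ceph"   # assumed endpoint

# matches e.g. "... pgmap v12345: 320 pgs: 320 active+clean; ..."
PGMAP_RE = re.compile(r"pgmap v(\d+): (\d+) pgs")

def main():
    # run "ceph -w" and consume its output line by line
    proc = subprocess.Popen(["ceph", "-w"], stdout=subprocess.PIPE)
    for raw in iter(proc.stdout.readline, b""):
        line = raw.decode("utf-8", "replace")
        m = PGMAP_RE.search(line)
        if not m:
            continue
        # InfluxDB line protocol: measurement field=value[,field=value]
        point = "ceph_pgmap version=%si,pgs=%si" % (m.group(1), m.group(2))
        requests.post(INFLUX_WRITE, data=point)

if __name__ == "__main__":
    main()

Parsing the human-readable output is fragile across versions; "ceph -s --format json" (or the admin sockets mentioned further down) gives steadier field names to report from.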
For the most part
Greg, That's very useful info. I had not queried the admin sockets before
today, so I am learning new things!
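For reference, a small sketch of pulling counters out of an OSD admin socket (not from this thread; the socket path and the counter names here are assumptions - "ceph --admin-daemon <socket> perf schema" shows what a given daemon actually exposes):

import json
import subprocess

def perf_dump(sock="/var/run/ceph/ceph-osd.0.asok"):
    # "perf dump" returns all of the daemon's performance counters as JSON
    out = subprocess.check_output(
        ["ceph", "--admin-daemon", sock, "perf", "dump"])
    return json.loads(out.decode("utf-8"))

if __name__ == "__main__":
    stats = perf_dump()
    osd = stats.get("osd", {})          # OSD op counters live under "osd"
    for key in ("op", "op_in_bytes", "op_out_bytes"):
        print("%s = %s" % (key, osd.get(key)))

The same --admin-daemon call works against the mon and mds sockets as well.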
On the x86_64: mds, mon, and osd, and rbd + cephfs client
ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43)
On the arm7 nodes: mon, osd, and rbd + cephfs clients
ceph
After several months of use without needing any administration at all, I
think I finally found something to debug.
Attempting to "ls -l" within a directory on CephFS hangs - strace shows it's hanging on lstat():
open("/etc/group", O_RDONLY|O_CLOEXEC) = 4
fstat(4, {st_mode=S_IFREG|0644,