[ceph-users] Re : bad crc/signature errors

2017-10-05 Thread Olivier Bonvalet
On Thursday, 5 October 2017 at 11:47 +0200, Ilya Dryomov wrote:
> The stable pages bug manifests as multiple sporadic connection resets,
> because in that case CRCs computed by the kernel don't always match the
> data that gets sent out.  When the mismatch is detected on the OSD
> side, OSDs reset the connection and you'd see messages like
> 
>   libceph: osd1 1.2.3.4:6800 socket closed (con state OPEN)
>   libceph: osd2 1.2.3.4:6804 socket error on write
> 
> This is a different issue.  Josy, Adrian, Olivier, do you also see
> messages of the "libceph: read_partial_message ..." type or is it just
> "libceph: ... bad crc/signature" errors?
> 
> Thanks,
> 
> Ilya
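
A minimal sketch of the failure mode described above, for illustration only: the checksum is computed over a buffer, the buffer then changes before it is transmitted, and the receiver's check fails. The function names are hypothetical, and zlib.crc32 stands in for the CRC32C that Ceph actually uses; this is not kernel or Ceph code.

```python
import zlib


def send_with_crc(buf: bytearray, mutate_after_crc: bool) -> tuple[int, bytes]:
    """Simulate a sender that checksums a page and then transmits it."""
    crc = zlib.crc32(bytes(buf))   # checksum is computed first
    if mutate_after_crc:
        buf[0] ^= 0xFF             # page is modified while still "in flight"
    return crc, bytes(buf)         # bytes that actually go out on the wire


def receiver_accepts(crc: int, payload: bytes) -> bool:
    """Receiver recomputes the checksum and compares it with the sent one."""
    return zlib.crc32(payload) == crc


stable = receiver_accepts(*send_with_crc(bytearray(b"rbd block data"), False))
unstable = receiver_accepts(*send_with_crc(bytearray(b"rbd block data"), True))
```

With an unchanged buffer the check passes; when the buffer is modified after the checksum is taken, the receiver sees a mismatch and, in the real system, would reset the connection.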

I see "read_partial_message" messages too, for example:

Oct  5 09:00:47 lorunde kernel: [65575.969322] libceph: read_partial_message 88027c231500 data crc 181941039 != exp. 115232978
Oct  5 09:00:47 lorunde kernel: [65575.969953] libceph: osd122 10.0.0.31:6800 bad crc/signature
Oct  5 09:04:30 lorunde kernel: [65798.958344] libceph: read_partial_message 880254a25c00 data crc 443114996 != exp. 2014723213
Oct  5 09:04:30 lorunde kernel: [65798.959044] libceph: osd18 10.0.0.22:6802 bad crc/signature
Oct  5 09:14:28 lorunde kernel: [66396.788272] libceph: read_partial_message 880238636200 data crc 1797729588 != exp. 2550563968
Oct  5 09:14:28 lorunde kernel: [66396.788984] libceph: osd43 10.0.0.9:6804 bad crc/signature
Oct  5 10:09:36 lorunde kernel: [69704.211672] libceph: read_partial_message 8802712dff00 data crc 2241944833 != exp. 762990605
Oct  5 10:09:36 lorunde kernel: [69704.212422] libceph: osd103 10.0.0.28:6804 bad crc/signature
Oct  5 10:25:41 lorunde kernel: [70669.203596] libceph: read_partial_message 880257521400 data crc 3655331946 != exp. 2796991675
Oct  5 10:25:41 lorunde kernel: [70669.204462] libceph: osd16 10.0.0.21:6806 bad crc/signature
Oct  5 10:25:52 lorunde kernel: [70680.255943] libceph: read_partial_message 880245e3d600 data crc 3787567693 != exp. 725251636
Oct  5 10:25:52 lorunde kernel: [70680.257066] libceph: osd60 10.0.0.23:6800 bad crc/signature
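
To check whether errors like these cluster on particular OSDs or hosts, the log can be tallied with a short script. This is only a sketch: the regular expressions are inferred from the lines pasted above rather than from any libceph format specification, and the name summarize is made up for illustration. Whitespace is matched loosely so that lines wrapped by a mail client still parse.

```python
import re
from collections import Counter

# Patterns inferred from the pasted log lines (an assumption, not a spec).
BAD_CRC = re.compile(r"libceph:\s+(osd\d+)\s+(\S+:\d+)\s+bad\s+crc/signature")
PARTIAL = re.compile(
    r"libceph:\s+read_partial_message\s+([0-9a-f]+)"
    r"\s+data\s+crc\s+(\d+)\s+!=\s+exp\.\s+(\d+)"
)


def summarize(log_text):
    """Return (bad-crc count per OSD, list of (computed, expected) CRCs)."""
    per_osd = Counter(m.group(1) for m in BAD_CRC.finditer(log_text))
    mismatches = [
        (int(m.group(2)), int(m.group(3))) for m in PARTIAL.finditer(log_text)
    ]
    return per_osd, mismatches


# Sample taken from the log above, including mail-client line wraps.
SAMPLE = (
    "Oct  5 09:00:47 lorunde kernel: [65575.969322] libceph: read_partial_message \n"
    "88027c231500 data crc 181941039 != exp. 115232978\n"
    "Oct  5 09:00:47 lorunde kernel: [65575.969953] libceph: osd122 10.0.0.31:6800 \n"
    "bad crc/signature\n"
    "Oct  5 09:14:28 lorunde kernel: [66396.788984] libceph: osd43 10.0.0.9:6804 bad \n"
    "crc/signature\n"
)

per_osd, mismatches = summarize(SAMPLE)
```

In practice the input would be the kernel log of the client host; a count spread evenly over many OSDs points at the client side rather than at one OSD or disk.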


On the OSD side, taking osd122 as an example, I don't see any "reset" in
the OSD log.


Thanks,

Olivier
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Re : bad crc/signature errors

2017-10-05 Thread Olivier Bonvalet
I also see that, but on 4.9.52 and 4.13.3 kernels.

I also get some kernel panics, but I don't know whether they are related
(the RBD images are mapped on Xen hosts).

On Thursday, 5 October 2017 at 05:53 +, Adrian Saul wrote:
> We see the same messages and are likewise on a 4.4 KRBD version that
> is affected by this.
> 
> As far as I know, I have seen no impact from it so far.
> 
> 
> > -Original Message-
> > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Jason Dillaman
> > Sent: Thursday, 5 October 2017 5:45 AM
> > To: Gregory Farnum
> > Cc: ceph-users; Josy
> > Subject: Re: [ceph-users] bad crc/signature errors
> > 
> > Perhaps this is related to a known issue on some 4.4 and later kernels [1]
> > where the stable write flag was not preserved by the kernel?
> > 
> > [1] http://tracker.ceph.com/issues/19275
> > 
> > On Wed, Oct 4, 2017 at 2:36 PM, Gregory Farnum wrote:
> > > That message indicates that the checksums of messages between your
> > > kernel client and OSD are incorrect. It could be actual physical
> > > transmission errors, but if you don't see other issues then this isn't
> > > fatal; they can recover from it.
> > > 
> > > On Wed, Oct 4, 2017 at 8:52 AM Josy wrote:
> > > > 
> > > > Hi,
> > > > 
> > > > We have set up a cluster with 8 OSD servers (31 disks).
> > > > 
> > > > Ceph health is OK.
> > > > --
> > > > [root@las1-1-44 ~]# ceph -s
> > > >cluster:
> > > >  id: de296604-d85c-46ab-a3af-add3367f0e6d
> > > >  health: HEALTH_OK
> > > > 
> > > >services:
> > > >  mon: 3 daemons, quorum ceph-las-mon-a1,ceph-las-mon-a2,ceph-las-mon-a3
> > > >  mgr: ceph-las-mon-a1(active), standbys: ceph-las-mon-a2
> > > >  osd: 31 osds: 31 up, 31 in
> > > > 
> > > >data:
> > > >  pools:   4 pools, 510 pgs
> > > >  objects: 459k objects, 1800 GB
> > > >  usage:   5288 GB used, 24461 GB / 29749 GB avail
> > > >  pgs: 510 active+clean
> > > > 
> > > > 
> > > > We created a pool and mounted it as RBD on one of the client
> > > > servers.
> > > > While adding data to it, we see the errors below:
> > > > 
> > > > 
> > > > [939656.039750] libceph: osd20 10.255.0.9:6808 bad crc/signature
> > > > [939656.041079] libceph: osd16 10.255.0.8:6816 bad crc/signature
> > > > [939735.627456] libceph: osd11 10.255.0.7:6800 bad crc/signature
> > > > [939735.628293] libceph: osd30 10.255.0.11:6804 bad crc/signature
> > > > 
> > > > =
> > > > 
> > > > Can anyone explain what this is and whether I can fix it?
> > > > 
> > > > 
> > --
> > Jason