I've seen one misbehaving OSD stop all the IO in a cluster... I've had a situation where everything seemed fine with the OSD/node but the cluster was grinding to a halt. There was no iowait, the disk wasn't very busy, no recoveries were running, the OSD was up+in, no scrubs... Restart the OSD and everything recovers like magic...
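[A sketch of how you might hunt for such a "silently" slow OSD. It flags OSDs whose commit latency is far above the cluster median. The JSON shape assumed here is what `ceph osd perf -f json` emitted around this era; field names may differ on your release, so check your own output first. Thresholds are arbitrary starting points.]

```python
#!/usr/bin/env python
# Sketch: flag OSDs whose commit latency is a large multiple of the
# cluster median -- a starting point for finding a misbehaving OSD that
# is otherwise up+in with no iowait and no recoveries.
import json
import subprocess

def find_slow_osds(perf_json, factor=5.0, floor_ms=100):
    """Return [(osd_id, commit_latency_ms)] for outlier OSDs.

    An OSD is an outlier if its commit latency is both above floor_ms
    and at least `factor` times the cluster median.
    """
    stats = [(o["id"], o["perf_stats"]["commit_latency_ms"])
             for o in perf_json["osd_perf_infos"]]
    lats = sorted(l for _, l in stats)
    median = lats[len(lats) // 2]
    threshold = max(floor_ms, factor * median)
    return [(oid, l) for oid, l in stats if l >= threshold]

if __name__ == "__main__":
    # Assumes the ceph CLI is present and the admin keyring is readable.
    raw = subprocess.check_output(["ceph", "osd", "perf", "-f", "json"])
    for oid, lat in find_slow_osds(json.loads(raw)):
        print("osd.%d commit latency %d ms -- investigate/restart?" % (oid, lat))
```

Run it from cron and alert on any output; an OSD that trips this repeatedly is a restart candidate.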
On Thu, Aug 27, 2015 at 8:38 PM, Robert LeBlanc <rob...@leblancnet.us> wrote:
> +1
>
> :)
>
> ----------------
> Robert LeBlanc
> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
>
> On Thu, Aug 27, 2015 at 1:16 PM, Jan Schermer wrote:
> Well, there's no other way to get reliable performance and SLAs comparable
> to traditional storage when what you work with is commodity hardware in a
> mesh-y configuration.
> And we do like the idea of killing traditional storage, right? I think the
> 80s called already and want their SAN back...
>
> Jan
>
> > On 27 Aug 2015, at 21:01, Robert LeBlanc wrote:
> >
> > I know writing to min_size as sync and size - min_size as async has been
> > discussed before and would help here. From what I understand it required
> > a lot of code changes and goes against the strong consistency model of
> > Ceph. I'm not sure it will be implemented, although I do love this idea
> > as a way to fight tail latency.
> >
> > On Thu, Aug 27, 2015 at 12:48 PM, Jan Schermer wrote:
> >> Don't kick out the node, just deal with it gracefully and without
> >> interruption... if the IO reached the quorum number of OSDs then there's
> >> no need to block any more, just queue it. Reads can be mirrored or
> >> retried (much quicker, because making writes idempotent, ordered and
> >> async is pretty hard and expensive).
> >> If there's an easy way to detect an unreliable OSD that flaps -- great,
> >> let's have a warning in ceph health.
> >>
> >> Jan
> >>
> >>> On 27 Aug 2015, at 20:43, Robert LeBlanc wrote:
> >>>
> >>> This has been discussed a few times.
> >>> The consensus seems to be to make sure error rates of NICs or other
> >>> such metrics are included in your monitoring solution. It would also
> >>> be good to perform periodic network tests, like a full-size ping with
> >>> the no-fragment flag set between all nodes, and have your monitoring
> >>> solution report that as well.
> >>>
> >>> Although I would like to see such a feature in Ceph, the concern is
> >>> that such a feature can quickly get out of hand and that something
> >>> else that is really designed for it should do it. I can understand
> >>> where they are coming from in that regard, but having Ceph kick out a
> >>> misbehaving node quickly is appealing as well (there would have to be
> >>> a way to specify that only so many nodes could be kicked out).
> >>> ----------------
> >>> Robert LeBlanc
> >>>
> >>> On Thu, Aug 27, 2015 at 9:37 AM, Christoph Adomeit wrote:
> >>>> Hello Ceph Users,
> >>>>
> >>>> yesterday I had a defective GBIC in one node of my 10-node Ceph
> >>>> cluster.
> >>>>
> >>>> The GBIC was working somehow but had 50% packet loss. Some packets
> >>>> went through, some did not.
> >>>>
> >>>> What happened was that the whole cluster did not service requests in
> >>>> time; there were lots of timeouts and so on until the problem was
> >>>> isolated. Monitors and OSDs were asked for data but did not answer,
> >>>> or answered late.
> >>>>
> >>>> I am wondering: here we have a highly redundant network setup and a
> >>>> highly redundant piece of software, but a small network fault brings
> >>>> down the whole cluster.
> >>>>
> >>>> Is there anything that can be configured or changed in Ceph so that
> >>>> availability will become better in case of flapping networks?
> >>>>
> >>>> I understand it is not a Ceph problem but a network problem, but
> >>>> maybe something can be learned from such incidents?
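[The two checks suggested above can be scripted in a few lines. This is a sketch only: the hostnames and the 8972-byte payload (9000-byte jumbo MTU minus 28 bytes of IP+ICMP header) are assumptions to adjust for your cluster, and the sysfs counter paths are standard Linux but worth verifying on your kernel.]

```python
#!/usr/bin/env python
# Sketch: (1) full-size, don't-fragment pings between nodes, and
# (2) NIC error counters from sysfs -- the two checks a monitoring
# system should watch to catch a half-dead link like a flaky GBIC.
import re
import subprocess

NODES = ["ceph-node1", "ceph-node2"]    # hypothetical hostnames
PAYLOAD = 8972                          # 9000 MTU - 28 bytes IP+ICMP header

def parse_loss(ping_output):
    """Extract the packet-loss percentage from ping(8) summary output."""
    m = re.search(r"(\d+(?:\.\d+)?)% packet loss", ping_output)
    return float(m.group(1)) if m else 100.0  # no summary -> treat as dead

def nic_errors(iface):
    """Read rx/tx error and drop counters for one interface from sysfs."""
    base = "/sys/class/net/%s/statistics/" % iface
    return {c: int(open(base + c).read())
            for c in ("rx_errors", "tx_errors", "rx_dropped", "tx_dropped")}

if __name__ == "__main__":
    for node in NODES:
        # -M do sets the don't-fragment flag, so an MTU mismatch or a
        # link that mangles large frames shows up as loss here.
        out = subprocess.run(
            ["ping", "-M", "do", "-c", "5", "-s", str(PAYLOAD), node],
            capture_output=True, text=True).stdout
        loss = parse_loss(out)
        if loss > 0:
            print("WARNING: %s: %.0f%% loss on full-size no-frag ping"
                  % (node, loss))
```

Feeding the warnings (and deltas of the `nic_errors` counters) into your monitoring system would have flagged the 50%-loss GBIC long before the cluster stalled.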
> >>>> Thanks
> >>>> Christoph
> >>>> --
> >>>> Christoph Adomeit
> >>>> GATWORKS GmbH
> >>>> Reststrauch 191
> >>>> 41199 Moenchengladbach
> >>>> Sitz: Moenchengladbach
> >>>> Amtsgericht Moenchengladbach, HRB 6303
> >>>> Geschaeftsfuehrer: Christoph Adomeit, Hans Wilhelm Terstappen
> >>>>
> >>>> christoph.adom...@gatworks.de   Internetloesungen vom Feinsten
> >>>> Fon. +49 2166 9149-32   Fax. +49 2166 9149-10
> >>>>
> >>>> _______________________________________________
> >>>> ceph-users mailing list
> >>>> ceph-users@lists.ceph.com
> >>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
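[The min_size-sync / async-tail idea debated above can be illustrated with a toy latency model. This is a sketch of the latency math only, not how Ceph works (Ceph acks a write only after all replicas commit); the numbers are made up to mimic one replica behind a flaky 50%-loss link.]

```python
# Toy model: client-visible write latency under two ack policies.

def latency_wait_all(replica_latencies_ms):
    """Ack after every replica commits -- the slowest replica sets the pace."""
    return max(replica_latencies_ms)

def latency_quorum(replica_latencies_ms, min_size):
    """Ack once the min_size-th fastest replica commits; the rest finish
    asynchronously, hiding tail latency from the client."""
    return sorted(replica_latencies_ms)[min_size - 1]

# size=3, min_size=2; one replica sits behind a flaky link and takes 400 ms:
lats = [3, 5, 400]
print(latency_wait_all(lats))    # 400 ms -- today's behavior
print(latency_quorum(lats, 2))   # 5 ms  -- quorum-ack behavior
```

This is why the idea keeps coming up: with size=3 and min_size=2, one sick OSD inflates every write under wait-for-all, while a quorum ack would leave client latency untouched (at the cost of the consistency complications Robert mentions).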