Thanks to this, I'm adding regular bandwidth tests. Is there, or should there be, a best-practices doc on ceph.com?
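For anyone wanting to script such a recurring bandwidth test without standing up iperf everywhere, here is a rough sketch that times a raw TCP transfer between two hosts. The 1 MiB probe size, port handling, and function names are illustrative placeholders, not anything Ceph-specific:

```python
import socket
import threading
import time

PAYLOAD = b"x" * (1 << 20)  # 1 MiB probe payload

def serve_once(sock):
    """Accept one connection and drain everything the peer sends."""
    conn, _ = sock.accept()
    with conn:
        while conn.recv(65536):
            pass

def measure_throughput(host, port):
    """Time one probe transfer and return the rate in MB/s."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=10) as s:
        s.sendall(PAYLOAD)
        s.shutdown(socket.SHUT_WR)   # signal end of probe
        s.recv(1)                    # wait for the receiver to finish draining
    return len(PAYLOAD) / (time.perf_counter() - start) / 1e6

if __name__ == "__main__":
    # Loopback demo; in practice serve_once would run on the remote host.
    srv = socket.socket()
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    threading.Thread(target=serve_once, args=(srv,), daemon=True).start()
    print(f"{measure_throughput('127.0.0.1', srv.getsockname()[1]):.1f} MB/s")
```

Run the receiving half on each node from cron, graph the numbers, and anything persistently far below line rate is worth an iperf follow-up.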
On Sat, Aug 1, 2015 at 2:16 PM Josef Johansson <[email protected]> wrote:
> Hi,
>
> I did a "big-ping" test to verify the network after the last major network
> problem. If anyone wants to take a peek I could share.
>
> Cheers
>
> Josef
>
> On Sat, Aug 1, 2015 at 02:19, Ben Hines <[email protected]> wrote:
>
>> I encountered a similar problem. Incoming firewall ports were blocked
>> on one host, so the other OSDs kept marking that OSD as down. But it
>> could still talk out, so it kept saying "hey, I'm up, mark me up", and
>> then the other OSDs started trying to send it data again, causing
>> backed-up requests, ad infinitum. I had to figure out the connectivity
>> problem myself by looking in the OSD logs.
>>
>> After a while, the cluster should just say "no, you're not reachable,
>> stop putting yourself back into the cluster".
>>
>> -Ben
>>
>> On Fri, Jul 31, 2015 at 11:21 AM, Jan Schermer <[email protected]> wrote:
>> > I remember reading that ScaleIO (I think?) does something like this by
>> > regularly sending reports to a multicast group, so any node with issues
>> > (or just overload) is reweighted or avoided automatically on the
>> > client. The OSD map is the Ceph equivalent, I guess. It makes sense to
>> > gather metrics and prioritize better-performing OSDs over those with
>> > e.g. worse latencies, but it needs to update fast. But I believe that
>> > _network_ monitoring itself ought to be part of... a network monitoring
>> > system you should already have :-) and not of a storage system that
>> > just happens to use the network. I don't remember seeing anything but a
>> > simple ping/traceroute/DNS test in any SAN interface. If an OSD has
>> > issues it might be anything from a failing drive to a swapping OS, and
>> > a number like "commit latency" (= average response time from the
>> > clients' perspective) is maybe the ultimate metric of all for this
>> > purpose, irrespective of the root cause.
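The blocked-incoming-ports failure mode Ben describes is easy to probe for from a peer with plain TCP connects. A minimal sketch; 6800-7300 is the default `ms bind port min`/`ms bind port max` range OSDs pick from, while the narrowed default below and the host argument are placeholders for illustration:

```python
import socket

def port_open(host, port, timeout=1.0):
    """True if a TCP connect to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def check_osd_ports(host, ports=range(6800, 6804)):
    """Map each candidate OSD port on a peer to its reachability.

    Kept to a few ports here so a fully-filtered host doesn't make the
    scan crawl; widen towards 7300 as needed."""
    return {port: port_open(host, port) for port in ports}
```

Run from each OSD host against every other host; a node that can reach its peers but shows closed ports in the reverse direction is exactly the asymmetric firewall case described above.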
>> > A nice option would be to read data from all replicas at once - this
>> > would of course increase load and cause all sorts of issues if abused,
>> > but if you have an app that absolutely-always-without-fail-must-get-data-ASAP
>> > then you could enable this in the client (and I think that would be an
>> > easy option to add). This is actually used in some systems. The harder
>> > part is failing nicely when writing (like waiting only for the remote
>> > network buffers on 2 nodes to get the data, instead of waiting for a
>> > commit on all 3 replicas...)
>> >
>> > Jan
>> >
>> >> On 31 Jul 2015, at 19:45, Robert LeBlanc <[email protected]> wrote:
>> >>
>> >> Even just a ping at max MTU with the don't-fragment bit set could tell
>> >> a lot about connectivity issues and latency without a lot of traffic.
>> >> Using the Ceph messenger would be even better, to check firewall ports
>> >> too. I like the idea of incorporating simple network checks into Ceph.
>> >> The monitor can correlate failures and help determine from the CRUSH
>> >> map whether the problem is related to one host.
>> >> ----------------
>> >> Robert LeBlanc
>> >> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
>> >>
>> >> On Thu, Jul 30, 2015 at 11:27 PM, Stijn De Weirdt wrote:
>> >>> Wouldn't it be nice if Ceph did something like this in the background
>> >>> (some sort of network scrub)? Debugging the network like this is not
>> >>> that easy (you can't expect admins to install e.g. perfSONAR on all
>> >>> nodes and/or clients).
>> >>>
>> >>> Something like: every X minutes, each service X picks a service Y on
>> >>> another host (assuming X and Y will exchange some communication at
>> >>> some point, like an OSD with another OSD), sends 1MB of data, and
>> >>> makes the timing data available so we can monitor it and detect
>> >>> underperforming links over time.
>> >>>
>> >>> Ideally clients would also do this, but I'm not sure where they
>> >>> should report/store the data.
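The max-MTU ping is worth spelling out, because ping's `-s` argument is the ICMP payload size, not the frame size: with a 20-byte IPv4 header and an 8-byte ICMP header, an MTU of 9000 means a payload of 8972. A small helper, assuming Linux iputils ping (where `-M do` forbids fragmentation):

```python
def icmp_payload_for_mtu(mtu, ip_header=20, icmp_header=8):
    """Largest ICMP payload that fits in one unfragmented IPv4 packet."""
    return mtu - ip_header - icmp_header

def mtu_ping_command(host, mtu=9000):
    """Build an iputils ping that fails if the path can't carry this MTU."""
    return f"ping -c 3 -M do -s {icmp_payload_for_mtu(mtu)} {host}"
```

For example, `mtu_ping_command("10.0.0.2")` yields `ping -c 3 -M do -s 8972 10.0.0.2`; if that fails while a small ping succeeds, some hop in between has a smaller MTU than the endpoints believe.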
>> >>> Interpreting the data can be a bit tricky, but extreme outliers will
>> >>> be spotted easily, and the main issue with this sort of debugging is
>> >>> collecting the data.
>> >>>
>> >>> Simply reporting / keeping track of ongoing communications is already
>> >>> a big step forward, but then we need to know the size of the
>> >>> exchanged data to allow interpretation (and the timing should cover
>> >>> only the network part, not e.g. flushing data to disk in the case of
>> >>> an OSD). (And obviously sampling is enough; no need to record every
>> >>> bit sent.)
>> >>>
>> >>> stijn
>> >>>
>> >>> On 07/30/2015 08:04 PM, Mark Nelson wrote:
>> >>>>
>> >>>> Thanks for posting this! We see issues like this more often than
>> >>>> you'd think. It's really important too, because if you don't figure
>> >>>> it out, the natural inclination is to blame Ceph! :)
>> >>>>
>> >>>> Mark
>> >>>>
>> >>>> On 07/30/2015 12:50 PM, Quentin Hartman wrote:
>> >>>>>
>> >>>>> Just wanted to drop a note to the group that I had my cluster go
>> >>>>> sideways yesterday, and the root of the problem was networking
>> >>>>> again. Using iperf I discovered that one of my nodes was only
>> >>>>> moving data at 1.7 Mb/s. Moving that node to a different switch
>> >>>>> port with a different cable resolved the problem. It took a while
>> >>>>> to track down because none of the server-side error metrics for
>> >>>>> disk or network showed anything amiss, and I didn't think to test
>> >>>>> network performance (as suggested in another thread) until well
>> >>>>> into the process.
>> >>>>>
>> >>>>> Check networking first!
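The network-scrub proposal above (pick a peer, push a fixed-size blob, keep the timings) mostly needs a place to put the numbers and a rule for "extreme outlier". A sketch of that bookkeeping side; the class name, the peer labels, and the 5x-below-the-cluster-median threshold are all made up for illustration, not anything from Ceph:

```python
import random
import statistics
from collections import defaultdict

class NetworkScrub:
    """Record per-peer transfer rates and flag links that are extreme
    outliers versus the rest of the cluster."""

    def __init__(self, peers, outlier_factor=5.0):
        self.peers = list(peers)
        self.outlier_factor = outlier_factor
        self.samples = defaultdict(list)   # peer -> list of MB/s samples

    def pick_peer(self):
        """Each scrub round probes one randomly chosen peer."""
        return random.choice(self.peers)

    def record(self, peer, mb_per_s):
        self.samples[peer].append(mb_per_s)

    def underperforming(self):
        """Peers whose median rate is far below the cluster-wide median."""
        medians = {p: statistics.median(v)
                   for p, v in self.samples.items() if v}
        if len(medians) < 2:
            return []   # nothing to compare against yet
        overall = statistics.median(medians.values())
        return [p for p, m in medians.items()
                if m * self.outlier_factor < overall]
```

Feeding `record()` from timed transfers like the ones discussed above, a node moving 1.7 Mb/s while its peers run at line rate would surface immediately, without anyone having to think of running iperf first.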
>> >>>>>
>> >>>>> QH
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
