collectd is fairly lightweight, which is why it is popular as a collection agent.
I'm going to assume you are comfortable installing packages and
configuring software. I don't have the time to write copy pasta
instructions. I *strongly* recommend you read all of this & the links
before you begin, and make sure you understand what is required of you.
The key components are:
- https://collectd.org/
- http://graphite.wikidot.com/
You need to get collectd running on each of your TF2 hosts. Basically
apt-get install, but see the note below regarding collectd versions.
You'll then want to set up Graphite on another machine. You *could* run
it on your TF2 host but Carbon can get I/O hungry (it is tunable) and
that will create more problems for you so I strongly recommend running
Graphite on another machine.
Also, having Graphite on another machine (with the collectd collector,
below) makes it easy for you to have multiple TF2 hosts, or migrate TF2
hosts.
In my setup I have my Graphite host running on an Ubuntu VM at home with
6 external servers reporting to it.
Here's a picture of the overall setup:
https://dl.dropboxusercontent.com/u/8110989/2014/collectd-graphite.png
Back to setup... with collectd running on your TF2 host, and Graphite on
another host, how do you connect them?
https://collectd.org/wiki/index.php/Networking_introduction
Your Graphite host *also* needs to run collectd in order to act as a
collection server that your TF2 hosts' collectd can send data to. Edit
the config on both the TF2 host & the Graphite host: both sides need to
run the collectd network plugin, Graphite host as server, TF2 host as client.
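
As a rough sketch of what the two network plugin configs look like (not
copy pasta for your setup): the client side uses a Server directive
pointing at the collector, and the collector uses a Listen directive.
The hostname graphite.example.org, the 0.0.0.0 bind address and the
helper name network_stanza are placeholders I made up; 25826 is
collectd's default network port. This small Python script just prints
both stanzas:

```python
# Sketch: render collectd network-plugin stanzas for both roles.
# Hostname, bind address and function name are placeholders, not values
# taken from anyone's real setup.

def network_stanza(role, host, port=25826):
    """Return a collectd.conf fragment for the network plugin.

    role "client" -> Server directive (send metrics to host:port)
    role "server" -> Listen directive (accept metrics on host:port)
    """
    directive = {"client": "Server", "server": "Listen"}[role]
    return (
        "LoadPlugin network\n"
        "<Plugin network>\n"
        f'  {directive} "{host}" "{port}"\n'
        "</Plugin>\n"
    )


if __name__ == "__main__":
    print("# TF2 host (client):")
    print(network_stanza("client", "graphite.example.org"))
    print("# Graphite host (server):")
    print(network_stanza("server", "0.0.0.0"))
```

Check the Networking introduction link above before using any of this;
it also covers the security options, which this sketch leaves out.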
Your Graphite host's collectd also needs to run the write_graphite
plugin to write the network collected data to Graphite.
https://collectd.org/wiki/index.php/Plugin:Write_Graphite
<Plugin write_graphite>
  <Carbon>
    Host "localhost"
    Port "2003"
    EscapeCharacter "_"
  </Carbon>
</Plugin>
Note: if you Google collectd + graphite you may be confused by the many
blog posts that refer to custom written plugins, which were necessary
before collectd had its write_graphite plugin.
Note 2: since you're on Debian, be aware that the write_graphite plugin
was added in collectd 5.1. You may need to get it from backports or something.
For Graphite...
This is a reasonable overview but may be out of date:
http://graphite.wikidot.com/installation
Read ^ as an overview but maybe follow the current instructions here:
https://graphite.readthedocs.org/en/latest/install.html
You need to pay attention to the storage-schemas.conf but you can more
or less ignore other instructions about feeding data into Graphite. With
the collectd write_graphite plugin your data will automagically be fed
from collectd -> localhost:2003 which is Carbon (Graphite's collector).
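
If you want to check that Carbon is accepting data before wiring up
collectd, you can push one value by hand. A minimal sketch, assuming
Carbon's plaintext listener is on localhost:2003 as above; the metric
name test.manual.value and the function names are made up for
illustration:

```python
import socket
import time


def carbon_line(path, value, timestamp=None):
    """Format one metric as a Carbon plaintext line: path, value, timestamp."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def send_to_carbon(line, host="localhost", port=2003, timeout=5.0):
    """Open a TCP connection to Carbon and send one plaintext line."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(line.encode("ascii"))


if __name__ == "__main__":
    # "test.manual.value" is a hypothetical metric name.
    send_to_carbon(carbon_line("test.manual.value", 42))
```

If that works, the metric should turn up in the Graphite web UI once
Carbon has written it out. If the connection is refused, Carbon is
probably not listening where you think it is.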
Good luck :]
PS: I am happy to answer specific questions about the collectd/graphite
setup but if you ask general sysadmin stuff I probably won't respond.
On 8/04/2014 12:50 AM, pilger wrote:
I've noticed the yellow bars mainly on the Mem field. Don't know if that
might be related. Could it?
About collectd, it seems very nice and a lot easier to visualize but you
talked Greek to me up there. Would you point me to some tutorial or show
me some ropes on how to get it running so I can find the bottlenecks?
Does it use a lot of resources!?
_pilger
On 7 April 2014 11:35, Yun Huang Yong wrote:
Your concern about noisy VPS neighbours will show up as CPU steal -
htop shows this as yellow bars by default.
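
If you would rather see a number than a colour, steal time can be read
from /proc/stat on Linux. A rough sketch, assuming the usual field order
on the aggregate "cpu" line (user, nice, system, idle, iowait, irq,
softirq, steal); the function names and the 10 second sample window are
my own choices:

```python
import time


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into (steal, total) jiffies."""
    fields = line.split()
    if fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in fields[1:]]
    # Order: user nice system idle iowait irq softirq steal [guest guest_nice]
    steal = values[7] if len(values) > 7 else 0
    # Guest time is already counted in user/nice, so sum only the first 8.
    total = sum(values[:8])
    return steal, total


def steal_percent(before, after):
    """Percentage of CPU time stolen between two (steal, total) samples."""
    d_steal = after[0] - before[0]
    d_total = after[1] - before[1]
    return 100.0 * d_steal / d_total if d_total > 0 else 0.0


def read_cpu():
    """Read the first line of /proc/stat, which is the aggregate cpu line."""
    with open("/proc/stat") as f:
        return parse_cpu_line(f.readline())


if __name__ == "__main__":
    first = read_cpu()
    time.sleep(10)
    second = read_cpu()
    print(f"CPU steal over 10s: {steal_percent(first, second):.1f}%")
```

This is only a spot check; collectd's cpu plugin records the same
counter continuously, which is more useful for transient problems.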
Disk latency could also be an issue.
66 tick means each tick has a time budget of around 15ms (1000/66).
If disk latency exceeds 15ms you will get stuttering - I had this
happen on servers in the past.
e.g.
https://dl.dropboxusercontent.com/u/8110989/2013/np1-disk-latency.png
Stuttery server leading up to 08/03 (US style month/day, August last
year). The host migrated my server to another, less loaded machine, which
was great for a few weeks; then, as that machine also became more heavily
utilised (by other customers), it started to stutter again.
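
The tick budget arithmetic above, as a small Python sketch. The function
names and the sample latencies are made up for illustration; only the
1000/tickrate calculation comes from the text:

```python
def tick_budget_ms(tickrate):
    """Time available per server tick, in milliseconds."""
    return 1000.0 / tickrate


def exceeds_budget(latency_ms, tickrate=66):
    """True if a single stall of latency_ms is longer than one tick."""
    return latency_ms > tick_budget_ms(tickrate)


if __name__ == "__main__":
    budget = tick_budget_ms(66)
    print(f"66 tick budget: {budget:.2f} ms")
    # Hypothetical disk latency samples in milliseconds.
    for latency in (5, 15, 20, 40):
        verdict = "over budget" if exceeds_budget(latency) else "ok"
        print(f"{latency:>3} ms -> {verdict}")
```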
FWIW I use collectd to gather these metrics on each host, feeding
into a single collectd collector which then uses collectd's
write_graphite plugin to write all the data into graphite for
storage & graphing. collectd's default 10s polling is great for
picking up transient issues, and graphite_web makes the
visualisation easy.
On 7/04/2014 10:26 PM, pilger wrote:
Hey guys, thanks for the replies.
* The RAM seems all right when I look at it with htop;
* We tried CentOS but the network was behaving poorly with it so we
switched to Debian x64 and it became a lot better;
* net_splitpacket_maxrate was set to 50000 while the rates were from
30000 to 60000. I've now set the splitpacket to 100000 and the rates
to 50000 to 100000 as you guys suggested. Gotta wait a bit for the
server to get full so I can check if it worked;
Wouldn't htop or any other monitoring tool show something wrong even
though it is a VPS!?
But, anyway, as I mentioned before, the problem occurs with the server
practically empty. So I don't think it is related to the CPU being
overloaded... could I be wrong on this? Could my VPS neighbours be
leeching on my CPU even though it is supposedly reserved for my service?
Thanks!
_pilger
On 7 April 2014 02:10, John wrote:
It's not the RAM. It's packet loss from the server side - you won't
see it on net graph as it's only client side.
Packet loss should show in net_graph output either way. But, to be
safe, certainly run MTR tests.
I've had this happen to me lots of times. Been running servers
since the 1.5 days. Ditch your host and also ditch Debian BS.
Recent versions of Debian work well for game servers, so ditching it
would not be necessary.
You should confer with your host on the status of your hardware and
whether a performance limitation is involved, such as I/O delays.
You should also double-check server-side rates, including by making
sure that net_splitpacket_maxrate is set sufficiently high (such as
100000). These symptoms seem along the lines of what I would expect
from net_splitpacket_maxrate being low.
Ask any corporation or enterprise, all use CentOS.
CentOS is marketed to enterprise and works well for such
applications because of its older, stable, well-tested software
packages and extended RHEL support for those older packages. For
game servers, it is not ideal, since those older packages often lack
useful features and performance tweaks. Debian is usually a better
choice for game servers.
If you're interested in hosting DDoS protected servers, email me
- I can help you.
Be very careful with hosts that claim to offer DDoS protection.
There is an extremely limited number who do it right, and a very
large number who do not.
-John
_______________________________________________
To unsubscribe, edit your list preferences, or view the list archives, please
visit:
https://list.valvesoftware.com/cgi-bin/mailman/listinfo/hlds