Hello all,

Here's my experience with monitoring XWikis.

With i2geo.net and with my private XWiki, I use a zabbix server.
This monitoring tool (its web front-end is php-based) is quite easy to
configure for http monitoring and with a few more steps you get a mail
notification when, e.g., a connection times out.
Before that I had been using HypericHQ for a while, a java-based monitoring
tool, which was rather nice to work with, but a machine-name change broke
everything, so I looked for something a bit more modern.

At curriki.org, a site with lots of visitors, quite a few tools are used for
monitoring.
- First, for the safety and impartiality of an outside system, alertsite.com
is used. It is very effective at detecting breakages, including potential
problems on the internet backbones. We use monitoring from three locations.
- Second, because, indeed, the XWiki servers sometimes need a push, there used
to be a script that regularly checked a basic page and, if the check failed,
restarted the app-server automatically. For us, this is a bit unsafe because
we like to check things ourselves after a restart.
- Third, for a while, we have been running a "combined monitoring" which
offered a small graphical view synced with the logs of apache, the
app-server, thread-dumps, and mysql. This allowed us to catch "bad actions",
which sometimes happen when power users trigger queries so big that they lock
out other users (group-deletions were one such action).
- Finally, we also added a zabbix server which collects http monitoring data
as well as other "classical" values (disks, memory, apache-stats, …).
The rhythm at curriki is about a week: after a week, one of the cluster nodes
(there are two currently) needs a restart because some memory gets exhausted
and the GC starts to fail. We generally get alertsite errors then.
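
To illustrate the page check from the second point, here is a minimal sketch
in Python. It is not the script that ran at curriki; the function name, the
timeout and the `opener` parameter are assumptions made for the example. It
only reports the result and leaves the decision to restart to a person.

```python
import urllib.request
import urllib.error


def check_page(url, timeout=10.0, opener=urllib.request.urlopen):
    """Probe one page and return (ok, detail).

    `opener` is a hypothetical hook so the check can be tried without a
    network; by default the standard urllib opener is used.
    """
    try:
        with opener(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (4xx/5xx).
        return False, f"HTTP {exc.code}"
    except OSError as exc:
        # Covers URLError, refused connections and timeouts.
        return False, f"unreachable: {exc}"
    if status != 200:
        return False, f"HTTP {status}"
    return True, "HTTP 200"
```

Run from cron against a cheap wiki page (the URL is whatever basic page you
choose), a result of `(False, ...)` could be mailed to an administrator
instead of triggering a restart.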

The benefit of running a monitoring infrastructure such as zabbix is that you
can analyze the behavior of multiple variables and see if there is a way to
predict that things are going wrong. It remains a matter of gut feeling but
still gives you quite some confidence.
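
As a rough example of such a prediction, the sketch below fits a straight
line through memory samples and estimates how long it will take to reach a
limit. The function name, the units and the sample values are assumptions for
illustration only; real heap usage is noisy because of the GC, so this is no
more than a way to put a number on the gut feeling.

```python
def hours_until_limit(samples, limit):
    """Estimate hours from the last sample until `limit` is reached.

    `samples` is a list of (hours, used_mb) pairs, oldest first.
    Returns None when there are too few samples or no upward trend.
    """
    n = len(samples)
    if n < 2:
        return None
    mean_t = sum(t for t, _ in samples) / n
    mean_u = sum(u for _, u in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        return None
    # Ordinary least-squares slope and intercept.
    slope = sum((t - mean_t) * (u - mean_u) for t, u in samples) / var_t
    if slope <= 0:
        return None
    intercept = mean_u - slope * mean_t
    hit_time = (limit - intercept) / slope
    return max(0.0, hit_time - samples[-1][0])


# Made-up samples: memory grows by 100 MB per hour, limit at 500 MB.
estimate = hours_until_limit([(0, 100), (1, 200), (2, 300)], limit=500)
```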

It would be really nice if we could converge on a set of JMX analysis "items"
for zabbix so that we could analyze the xwiki-relevant information more
concretely (in particular the cache behaviors) and start tuning things so
that we run out of memory less often.
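
As a possible starting point, here is a small Python sketch that builds
zabbix JMX item keys of the form jmx["object name","attribute"], which are
collected through the zabbix Java gateway. Only standard JVM platform MBeans
are listed; the XWiki cache MBean names are not given here, so they are
deliberately left out and would have to be added once agreed upon. The helper
name and the list are assumptions for the example.

```python
def jmx_item_key(object_name, attribute):
    """Build a zabbix JMX item key for one MBean attribute."""
    return f'jmx["{object_name}","{attribute}"]'


# Standard JVM MBeans only; XWiki-specific ones are still to be defined.
JVM_ITEMS = [
    ("java.lang:type=Memory", "HeapMemoryUsage.used"),
    ("java.lang:type=Memory", "HeapMemoryUsage.max"),
    ("java.lang:type=Threading", "ThreadCount"),
]

item_keys = [jmx_item_key(name, attr) for name, attr in JVM_ITEMS]
```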

paul



On 31 oct. 2014, at 22:29, Jason Clemons <[email protected]> wrote:

> I'd also find any suggestions very helpful, I've had that happen a few times
> and outside of monitoring CPU and RAM, I've found logging to be difficult to 
> use and configure, and even when I get it configured it's not very helpful.
> 
> 
> 
>> On Oct 31, 2014, at 1:57 PM, Bryn Jeffries <[email protected]> 
>> wrote:
>> 
>> Having made my XWiki site available to other users, I was concerned to find 
>> that the site became unusable at one point with client connections 
>> eventually timing out. I had no way to diagnose the problem, but eventually 
>> I managed to make a (slow) SSH connection to the server and restarted 
>> Tomcat, and things seemed to settle back to normal.
>> 
>> The problem is I have no real sense of what happened and how to prevent it 
>> happening again. To that end, I'd appreciate any suggestions for monitoring 
>> the server and diagnosing poor performance. What do others typically use? I 
>> have an Apache2 server passing wiki page requests to Tomcat7 via an ajp 
>> connector, and a PostgreSQL database. My guess is that Tomcat is doing most 
>> of the work here so that's probably what I need to monitor the most.

