Re: [xwiki-users] Monitoring an Xwiki stack

2014-11-03 Thread Bryn Jeffries
(Sorry for top-posting - I'm stuck with Outlook Live :-(

My system's very small - approx 20 users and very little content at present. 
It's for a small collaborative research group which may grow to ~100 members. I 
expect quite a few largish attachments and have already configured filesystem 
attachments.

I'm using PostgreSQL for various reasons, so can't add all the suggested 
indexes (namely the string prefix ones that are specific to MySQL), but I'd be 
surprised to be hitting limits so soon.

The slowdown occurred soon after I opened the wiki to the other members, who 
probably did some exploring and perhaps triggered something painful.

___
users mailing list
users@xwiki.org
http://lists.xwiki.org/mailman/listinfo/users


Re: [xwiki-users] Monitoring an Xwiki stack

2014-11-03 Thread Paul Libbrecht
> My system's very small - approx 20 users and very little content at present.
> It's for a small collaborative research group which may grow to ~100 members.
> I expect quite a few largish attachments and have already configured
> filesystem attachments.

Sounds quite reasonable.
Limited memory?

> I'm using PostgreSQL for various reasons, so can't add all the suggested
> indexes (namely the string prefix ones that are specific to MySQL), but I'd be
> surprised to be hitting limits so soon.

The database difference might be something to explore.
I've seen rather often that things are more battle-tested for MySQL.
But I fully agree you have reasons to prefer PostgreSQL.

> The slowdown occurred soon after I opened the wiki to the other members, who
> probably did some exploring and perhaps triggered something painful.

You definitely need to investigate while it is happening:
- something such as show full processlist (that's a MySQL thing; on PostgreSQL 
the pg_stat_activity view plays a similar role) should be available to show 
whether an SQL query is taking long and whether others are slow (typically, big 
LIKE queries tend to block all writes)
- trigger a thread dump (kill -QUIT, or use JMX, see 
http://extensions.xwiki.org/xwiki/bin/view/Extension/JMX+Access#HExample5:gettingafullthread-dump):
 the threads generally carry the path of the HTTP requests
- check memory
- make sure you enable -verbose:gc as a JVM option to see when memory 
limits are reached
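
To make the thread-dump step a little more concrete, here is a minimal Python 
sketch that summarises a dump once it has been saved as text. It assumes the 
dump has the usual HotSpot layout, where each thread header is followed by a 
"java.lang.Thread.State: ..." line; the function name and the sample excerpt 
are made up for illustration and are not taken from a real XWiki server.

```python
import re
from collections import Counter

# Matches the state line HotSpot prints under each thread header, e.g.
#    java.lang.Thread.State: BLOCKED (on object monitor)
STATE_RE = re.compile(r"java\.lang\.Thread\.State:\s+([A-Z_]+)")


def count_thread_states(dump_text):
    """Return a Counter mapping thread state -> number of threads."""
    return Counter(STATE_RE.findall(dump_text))


# Hypothetical excerpt, only to show the expected input shape.
SAMPLE_DUMP = '''
"http-bio-8080-exec-1" daemon prio=10 tid=0x01 nid=0x1a runnable [0x02]
   java.lang.Thread.State: RUNNABLE
"http-bio-8080-exec-2" daemon prio=10 tid=0x03 nid=0x1b waiting for monitor entry [0x04]
   java.lang.Thread.State: BLOCKED (on object monitor)
"http-bio-8080-exec-3" daemon prio=10 tid=0x05 nid=0x1c waiting for monitor entry [0x06]
   java.lang.Thread.State: BLOCKED (on object monitor)
'''

# Print the states from most to least frequent.
for state, count in count_thread_states(SAMPLE_DUMP).most_common():
    print(state, count)
```

A dump dominated by BLOCKED or WAITING request threads would be a hint to look 
at what they are waiting on (a lock, the database) rather than at CPU.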

paul



Re: [xwiki-users] Monitoring an Xwiki stack

2014-11-03 Thread Bryn Jeffries
Says Paul Libbrecht:
>> My system's very small - approx 20 users and very little content at present.
>> It's for a small collaborative research group which may grow to ~100 members. I
>> expect quite a few largish attachments and have already configured filesystem
>> attachments.
>
> Sounds quite reasonable.
> Limited memory?

Well, I'm running everything off a single VM configured with 8GB RAM and 2 
vCPUs. If necessary I can use more VMs but I wanted to keep things simple until 
I knew what my actual performance needs were. I also have access to plenty of 
persistent storage.

I have Tomcat configured with CATALINA_OPTS=-server -Xms800m -Xmx800m 
-XX:MaxPermSize=196m

>> I'm using PostgreSQL for various reasons, so can't add all the suggested
>> indexes (namely the string prefix ones that are specific to MySQL), but I'd
>> be surprised to be hitting limits so soon.
>
> The database difference might be something to explore.
> I've seen rather often that things are more battle-tested for MySQL.
> But I fully agree you have reasons to prefer PostgreSQL.

Yes, and I'd be interested to look into optimising common searches once things 
have grown and I understand the query workload. Alternatives to MySQL string 
prefix indexes would be GIN or GiST indexes - the pg_trgm module looks promising.
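
For readers unfamiliar with pg_trgm, the idea is that a string is broken into 
three-character pieces, and an index over those pieces lets the database narrow 
down LIKE-style and similarity searches. The Python sketch below only 
illustrates that decomposition and the resulting similarity score; it is an 
approximation written for this thread (ASCII letters and digits only), not the 
module's actual implementation, and the function names are invented.

```python
import re


def trigrams(text):
    """Approximate the trigram set pg_trgm builds for a string."""
    result = set()
    # Lower-case, then treat every run of non-alphanumerics as a separator.
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = "  " + word + " "  # two leading spaces, one trailing space
        for i in range(len(padded) - 2):
            result.add(padded[i:i + 3])
    return result


def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams (0.0 to 1.0)."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


print(sorted(trigrams("cat")))
print(similarity("monitor", "monitoring"))
```

Because the decomposition does not depend on where in the string a match 
starts, this kind of index is not limited to prefix searches, which is what 
makes it a candidate replacement for the MySQL-specific prefix indexes.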




Re: [xwiki-users] Monitoring an Xwiki stack

2014-11-02 Thread vinc...@massol.net
Hi Bryn,

There is some information that I’ve put at 
http://platform.xwiki.org/xwiki/bin/view/AdminGuide/Monitoring on monitoring 
XWiki instances.

JavaMelody is quite great for that but it won’t send you alerts.

Make sure to check the tools at the end in the Others section too. There’s the 
xinit tool which is a monitoring/admin tool developed by XWiki SAS.

Hope it helps,
-Vincent

On 31 Oct 2014 at 21:57:35, Bryn Jeffries (bryn.jeffr...@sydney.edu.au) wrote:

> Having made my XWiki site available to other users, I was concerned to find
> that the site became unusable at one point with client connections eventually
> timing out. I had no way to diagnose the problem, but eventually I managed to
> make a (slow) SSH connection to the server and restarted Tomcat, and things
> seemed to settle back to normal.
>
> The problem is I have no real sense of what happened and how to prevent it
> happening again. To that end, I'd appreciate any suggestions for monitoring
> the server and diagnosing poor performance. What do others typically use? I
> have an Apache2 server passing wiki page requests to Tomcat7 via an ajp
> connector, and a PostgreSQL database. My guess is that Tomcat is doing most
> of the work here so that's probably what I need to monitor the most.



Re: [xwiki-users] Monitoring an Xwiki stack

2014-11-02 Thread Paul Libbrecht

Btw,

I am not sure you could say that XWiki installs are that pesky.
However, depending on the user base, they may really need quite some tuning.
For example, if your XWiki manipulates complex documents, the document cache may 
grow too big for the memory, a limit you only reach after some time (it could be 
a week or two). OutOfMemoryErrors then appear on a regular basis.

Another example has been the number of registered users. On curriki there are 
too many to store in a page of objects, so special treatment has been 
applied.
Yet another example would be the mass of attachments, e.g. if people use the 
wiki as a shared disk; there the filesystem attachments solution has helped 
quite a few (I think).

I think all wikis and CMSs that I know of are rather limited in their default 
goals (beyond XWiki, I have experience with Drupal and WordPress) and special 
treatment may be quickly available, in config, install, or custom development.

Monitoring tools can help you adjust this.

Bryn, maybe you want to indicate how big your system is?
Maybe it's just a matter of some too-eager clients (e.g. some ever-repeating 
JavaScript-based requests served to tens of users every second or so)?

paul


Re: [xwiki-users] Monitoring an Xwiki stack

2014-11-02 Thread vinc...@massol.net
BTW if you have set up some tools and have had success with them in monitoring 
XWiki instances, it would be great if you could add some doc about them at 
http://platform.xwiki.org/xwiki/bin/view/AdminGuide/Monitoring

Thanks!
-Vincent




Re: [xwiki-users] Monitoring an Xwiki stack

2014-11-02 Thread Bryn Jeffries
Vincent wrote:
> BTW if you have set up some tools and have had success
> with them in monitoring XWiki instances, it would be great if
> you could add some doc about them at
> http://platform.xwiki.org/xwiki/bin/view/AdminGuide/Monitoring

Thanks for the pointers. These sound like useful tools, although I think to 
cover the situation I encountered (web services locked up) I probably need to 
get more familiar with the logs, make sure they collect the right stuff, and 
have a command-line tool to filter them. But yes, I shall certainly contribute 
any experience I glean back into the wiki.


Re: [xwiki-users] Monitoring an Xwiki stack

2014-11-01 Thread Bryn Jeffries
Hi Paul,

Many thanks for your contribution. I'll certainly look into Zabbix, although I 
must confess to being aghast at what appears to be a large and complex tool for 
what I'd hoped was quite simple. I hadn't realised these servers were so 
temperamental. Before I lose myself in getting acquainted with a new 
sophisticated product, could you tell me whether Zabbix (or something else) 
will help me identify the following?
- When are users suffering timeouts (doesn't have to be real time, happy to 
check summary later)
- Where was the timeout occurring (network, Apache, Tomcat, Postgres)
- What was the cause of the timeout (too many connections, low memory, long 
Java operation, long query, etc)
- What specific item (Java program, DB query) was responsible
I wonder whether all this should be discoverable in the logs, with the right 
configuration.
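
As a rough illustration of the first question (when users were suffering), the 
Python sketch below counts server-error responses per hour in an Apache access 
log. It assumes the log uses the common or combined format and that a stuck 
back end shows up as 5xx responses from the Apache front end, which may not 
hold for every setup. The function name and the sample lines are invented for 
the example and are not real log data.

```python
import re
from collections import Counter

# Pulls the day, the hour and the status code out of a common/combined
# format line: ... [31/Oct/2014:21:57:35 +1100] "GET / HTTP/1.1" 503 299
LINE_RE = re.compile(
    r'\[(?P<day>\d{2}/\w{3}/\d{4}):(?P<hour>\d{2}):\d{2}:\d{2} [^\]]+\] '
    r'"[^"]*" (?P<status>\d{3}) '
)


def server_errors_per_hour(lines):
    """Count responses with a 5xx status, keyed by 'day hour'."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match and match.group("status").startswith("5"):
            counts[match.group("day") + " " + match.group("hour") + "h"] += 1
    return counts


# Hypothetical log lines, only to show the expected input shape.
SAMPLE_LINES = [
    '10.0.0.1 - - [31/Oct/2014:21:55:02 +1100] "GET /xwiki/bin/view/Main/ HTTP/1.1" 200 5120',
    '10.0.0.2 - - [31/Oct/2014:21:57:35 +1100] "GET /xwiki/bin/view/Main/ HTTP/1.1" 503 299',
    '10.0.0.3 - - [31/Oct/2014:21:58:10 +1100] "GET /xwiki/bin/view/Sandbox/ HTTP/1.1" 502 299',
    '10.0.0.2 - - [31/Oct/2014:22:01:44 +1100] "GET /xwiki/bin/view/Main/ HTTP/1.1" 200 5120',
]

for hour, count in sorted(server_errors_per_hour(SAMPLE_LINES).items()):
    print(hour, count)
```

This only answers "when"; finding where and why still needs the Tomcat and 
database side.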

I've seen a lot of mention of JMX for Tomcat monitoring, but I've shied away 
from it since I wanted to start simple, but perhaps there is no simple ... ;-(




Re: [xwiki-users] Monitoring an Xwiki stack

2014-11-01 Thread Paul Libbrecht
Bryn,

without JMX, straight and simple console output and thread-dump based… that's 
what the monitor-sample tool was doing, it has a potential to answer the 
back-end of that (e.g. DB-queries). In particular, it is the only one that can 
express the query being hanging and this is exactly what was needed for us.

JMX is a simple way to communicate to the internals of the java process. 
Standard things are typically delivered with JMX (e.g. the heap size).


On 1 nov. 2014, at 20:15, Bryn Jeffries bryn.jeffr...@sydney.edu.au wrote:
> Many thanks for your contribution. I'll certainly look into Zabbix, although
> I must confess to being aghast at what appears to be a large and complex tool
> for what I'd hoped was quite simple. I hadn't realised these servers were so
> temperamental. Before I lose myself in getting acquainted with a new
> sophisticated product, could you tell me whether Zabbix (or something else)
> will help me identify the following?
> - When are users suffering timeouts (doesn't have to be real time, happy to
> check summary later)

Yes, you'd get that with the Tomcat connection time or the Apache workers.

> - Where was the timeout occurring (network, Apache, Tomcat, Postgres)

That's a tick more delicate since one timeout creates others...

> - What was the cause of the timeout (too many connections, low memory, long
> Java operation, long query, etc)

You really can't disambiguate this so clearly.
But you can see the number of connections, the heap memory, or the number of 
active threads with JMX. That helps you in this direction, I think.

> - What specific item (Java program, DB query) was responsible
> I wonder whether all this should be discoverable in the logs, with the right
> configuration.

The DB query I could only get with the combined log monitor… but if you can get 
warned by Zabbix, then you can run a show full processlist in MySQL.
FWIW, the combined monitor is at 
https://github.com/xwiki-contrib/xwiki-clams-core/tree/master/tools/appservmonitoring
but it is likely to be very specific to our installation (e.g. it requires 
key-based ssh).

> I've seen a lot of mention of JMX for Tomcat monitoring, but I've shied away
> from it since I wanted to start simple, but perhaps there is no simple ... ;-(

Typically, this gets you measures for free… some of which can really be useful.

paul



Re: [xwiki-users] Monitoring an Xwiki stack

2014-10-31 Thread Jason Clemons
I'd also find any suggestions very helpful. I've had that happen a few times 
and, outside of monitoring CPU and RAM, I've found logging to be difficult to 
use and configure, and even when I get it configured it's not very helpful.





Re: [xwiki-users] Monitoring an Xwiki stack

2014-10-31 Thread Paul Libbrecht
Hello all,

Here's my experience with monitoring XWikis.

With i2geo.net and with my private XWiki, I use a Zabbix server.
This PHP-based monitoring tool is quite easy to configure for HTTP monitoring, 
and with a few more steps you get a mail notification when, e.g., a timeout 
occurs in connections.
I used HypericHQ for a while, a Java-based monitoring tool, which was 
rather nice to manipulate, but a machine name change broke everything, so I 
looked for something a tick more modern.

At curriki.org, a site with lots of visitors, quite a few tools are used for 
monitoring.
- First, for the safety and honesty of an outside view of the system, 
alertsite.com is used. It is very effective at detecting breakages, including 
potential ones in internet backbones. We monitor from three locations.
- Second, because, indeed, the XWiki servers sometimes need a push, there used 
to be a regularly run script that checked a basic page and, if that failed, 
auto-restarted the app server. For us, this is a bit unsafe because we like to 
control things after a restart.
- Third, for a while, we ran a combined monitor which offered a small graphical 
view synced with the logs of Apache, the app server, thread dumps, and MySQL. 
This allowed us to catch bad actions, which sometimes happen when power users 
perform actions that trigger overly big queries that lock out others (group 
deletions were such an action).
- Finally, we also added a Zabbix which collects HTTP monitoring as well as 
other classical values (disks, memory, Apache stats, …).
The rhythm at curriki is about a week… after a week, one of the two cluster 
nodes (there are two currently) needs a restart because some memory gets 
exhausted and the GC starts to fail. We generally get alertsite errors then.
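
For those who want to start with something as small as the page-checking 
script mentioned in the second point, a minimal Python sketch could look like 
the one below. It is not the script used at curriki; the function name, the 
default timeout, and the URL in the usage comment are assumptions for 
illustration. It only reports the result and deliberately does not restart 
anything.

```python
import time
import urllib.request


def check_page(url, timeout_seconds=10.0, opener=urllib.request.urlopen):
    """Fetch url once and return (ok, elapsed_seconds, detail)."""
    started = time.monotonic()
    try:
        with opener(url, timeout=timeout_seconds) as response:
            status = response.status
            response.read()  # make sure the whole page is really served
        ok, detail = 200 <= status < 300, "HTTP %d" % status
    except Exception as error:  # URLError, HTTPError, socket timeout, ...
        ok, detail = False, "%s: %s" % (type(error).__name__, error)
    return ok, time.monotonic() - started, detail


# Usage (hypothetical URL; pick a cheap page of your own wiki):
#   ok, elapsed, detail = check_page("http://localhost:8080/xwiki/bin/view/Main/")
#   print("OK" if ok else "FAILED", "%.2fs" % elapsed, detail)
```

Run from cron, the printed line gives a crude history of availability and 
response time; the opener parameter exists only so the function can be tried 
without a network.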

The interest of running a monitoring infrastructure such as Zabbix is that you 
can analyze the behavior of multiple variables and see if there is a way to 
predict when things are going wrong. It remains a gut-feeling story but still 
gives you quite some confidence.

It would be really nice if we could converge on a set of JMX analysis items 
for Zabbix so that we could analyze the XWiki-relevant information more 
concretely (in particular the cache behavior) and start adjusting so as to run 
out of memory less often.

paul


