Walter Underwood wrote:
Yes, that is possible, but we also monitor Apache, Tomcat, the JVM, and
OS through JMX and other live monitoring interfaces. Why invent a real-time
HTTP log analysis system when I can fetch /search/stats.jsp at any time?
"there are lies, damnd lies, and statistics"

The stats page, while useful, can be dangerous if you don't fully understand what you are looking at and its limitations.

Three examples spring to mind.

1. A development group was fetching 20k records at a time, every hour or so, off a production server. Because those requests were so infrequent, the average rows returned still sat at about 10. So at a casual glance it looked like every request was returning 10 rows, and the dev requests were hidden.

2. The average isn't broken down enough. Certain queries are more complex than others, and are usually executed less often than the simple ones. These skew the numbers and hide the bad performers.

3. Variation is the key to performance, not averages, IMHO. I would prefer responses delivered slower but with no variation (i.e. it always takes 2ms) to generally faster queries that bite every once in a while. Averages don't show you this.


OK, that's my rant about averages done. Now, of course, you could always implement variance and percentile performance figures inside Solr and show them on the stats page, but this can get complex quickly.
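The percentile point is easy to see with numbers. A minimal sketch (plain Python, not Solr code, with made-up latencies) of how an average hides the occasional query that "bites" while a high percentile exposes it:

```python
import math

def percentile(samples, pct):
    """pct-th percentile (0-100) of a list of latencies, nearest-rank method."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]

# Nine 2ms queries and one 500ms outlier: the median looks great,
# the average is misleading, and the 95th percentile shows the bite.
times_ms = [2] * 9 + [500]
print(sum(times_ms) / len(times_ms))  # average: 51.8
print(percentile(times_ms, 50))       # median: 2
print(percentile(times_ms, 95))       # 95th percentile: 500
```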
By "number of rows fetched", do you mean "number of documents matched"?
Yep, I just grabbed the log format of a similar search appliance we use. It would be better to add Solr/Lucene-specific things to it; I'm not tied to it.
The log you describe is pretty useful. Ultraseek has something similar
and that is the log most often used by admins. I'd recommend also
logging the start and rows part of the request so you can distinguish
between new queries and second page requests. If possible, make the
timestamp the same as the HTTP access log so you can correlate the
entries.
Yep, that's a good idea as well.
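Once start and rows are in the log, telling a new query from a second-page request is a one-liner downstream. A small sketch (the parameter names follow Solr's standard start/rows; the rest is illustrative):

```python
from urllib.parse import parse_qs

def is_new_query(query_string):
    """Treat start=0 (or absent) as a fresh query; anything else is paging."""
    params = parse_qs(query_string)
    start = params.get("start", ["0"])[0]
    return start == "0"

print(is_new_query("q=solr&rows=10"))           # True: no start param, new query
print(is_new_query("q=solr&start=20&rows=10"))  # False: a later page
```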

I tinkered a bit more with the idea last night and came up with https://issues.apache.org/jira/browse/SOLR-232, which gives a Tomcat container access to a place where we can put the data we need from the request.

Feedback is welcome, as I definitely don't have all the answers, and am a newbie when it comes to Solr ;-)

--Ian
wunder

On 5/9/07 9:43 PM, "Ian Holsman" <[EMAIL PROTECTED]> wrote:
Walter Underwood wrote:
This is for monitoring -- what happened in the last 30 seconds.
Log file analysis doesn't really do that.
I would respectfully disagree.
Log file analysis of each request can give you that, and a whole lot more.

You could either grab the stats via a regular cron job, or create a separate filter to parse them in real time.
That would also let you compute more sophisticated stats if you choose to.
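A cron-driven poller only sees cumulative counters, so the useful numbers come from diffing consecutive snapshots. A minimal sketch of that delta step (the counter names here are invented for illustration, not Solr's actual stats fields):

```python
def interval_stats(prev, curr, seconds):
    """Turn two cumulative counter snapshots into per-interval rates."""
    requests = curr["requests"] - prev["requests"]
    total_ms = curr["totalTime"] - prev["totalTime"]
    return {
        "qps": requests / seconds,
        "avg_ms": total_ms / requests if requests else 0.0,
    }

# Two snapshots taken 30 seconds apart (made-up numbers).
prev = {"requests": 1000, "totalTime": 5000}
curr = {"requests": 1600, "totalTime": 9200}
print(interval_stats(prev, curr, 30))  # {'qps': 20.0, 'avg_ms': 7.0}
```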

What I would like to know is (and excuse the newbieness of the question) how
to enable Solr to log a file with the following data:

- time spent (ms) in the request
- IP address of the incoming request
- what the request was (and which handler executed it)
- a status code signaling whether the request failed for some reason
- number of rows fetched
- number of rows actually returned

Is this possible? (I'm using Tomcat, if that changes the answer.)
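For what it's worth, emitting those fields as one tab-separated line per request is straightforward; this shape is only a sketch, not anything Solr actually produces:

```python
def format_request_log(time_ms, ip, handler, request, status, hits, rows_returned):
    """One tab-separated line per request; field order mirrors the list above."""
    fields = (time_ms, ip, handler, request, status, hits, rows_returned)
    return "\t".join(str(f) for f in fields)

line = format_request_log(12, "10.0.0.5", "/select", "q=solr&rows=10", 200, 5432, 10)
print(line)
```

A one-field-per-column format like this keeps the log trivially greppable and easy to load into any analysis tool later.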

regards
Ian


