[jira] [Commented] (SOLR-11934) Visit Solr logging, it's too noisy.

2018-08-25 Thread Shawn Heisey (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-11934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592747#comment-16592747
 ] 

Shawn Heisey commented on SOLR-11934:
-

Side note.  Probably needs its own issue.  Doesn't impact the main log that 
this issue is concerned about, it's separate.  I'm only commenting about it 
here because this issue is open and tangentially related, and I'm trying to get 
a sense as to whether or not I should create the new issue.

I think we should enable the Jetty request log in the Solr download.  The 
config is already there, it just needs to be uncommented.It even has config 
to delete old logs -- retention is set to 90 days, which might be too long.

> Visit Solr logging, it's too noisy.
> ---
>
> Key: SOLR-11934
> URL: https://issues.apache.org/jira/browse/SOLR-11934
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
>
> I think we have way too much INFO level logging. Or, perhaps more correctly, 
> Solr logging needs to be examined and messages logged at an appropriate level.
> We log every update at an INFO level for instance. But I think we log LIR at 
> INFO as well. As a sysadmin I don't care to have my logs polluted with a 
> message for every update, but if I'm trying to keep my system healthy I want 
> to see LIR messages and try to understand why.
> Plus, in large installations logging at INFO level is creating a _LOT_ of 
> files.
> What I want to discuss on this JIRA is
> 1> What kinds of messages do we want log at WARN, INFO, DEBUG, and TRACE 
> levels?
> 2> Who's the audience at each level? For a running system that's functioning, 
> sysops folks would really like WARN messages that mean something need 
> attention for instance. If I'm troubleshooting should I turn on INFO? DEBUG? 
> TRACE?
> So let's say we get some kind of agreement as to the above. Then I propose 
> three things
> 1> Someone (and probably me but all help gratefully accepted) needs to go 
> through our logging and assign appropriate levels. This will take quite a 
> while, I intend to work on it in small chunks.
> 2> Actually answer whether unnecessary objects are created when something 
> like log.info("whatever {}", someObjectOrMethodCall); is invoked. Is this 
> independent on the logging implementation used? The SLF4J and log4j seem a 
> bit contradictory.
> 3> Maybe regularize log, logger, LOG as variable names, but that's a nit.
> As a tactical approach, I suggest we tag each LoggerFactory.getLogger in 
> files we work on with //SOLR-(whatever number is assigned when I create 
> this). We can remove them all later, but since I expect to approach this 
> piecemeal it'd be nice to keep track of which files have been done already.
> Finally, I really really really don't want to do this all at once. There are 
> 5-6 thousand log messages. Even at 1,000 a week that's 6 weeks, even starting 
> now it would probably span the 7.3 release.
> This will probably be an umbrella issue so we can keep all the commits 
> straight and people can volunteer to "fix the files in core" as a separate 
> piece of work (hint).
> There are several existing JIRAs about logging in general, let's link them in 
> here as well.
> Let the discussion begin!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11934) Visit Solr logging, it's too noisy.

2018-08-25 Thread Shawn Heisey (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-11934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592488#comment-16592488
 ] 

Shawn Heisey commented on SOLR-11934:
-

I do like the notion of changing some of the classes in the config to WARN.  It 
can make some real progress in the short term with not a lot of effort.

Long term, I think we should try to adjust the overall choice of logging levels 
in the code so that the logging config needs fewer entries.  If we do leave 
some entries at WARN, they should be entries that are *commonly* needed for 
deeper troubleshooting, but don't provide much value for a system that's 
working as expected.  I'm not sure what classes those would be.

> Visit Solr logging, it's too noisy.
> ---
>
> Key: SOLR-11934
> URL: https://issues.apache.org/jira/browse/SOLR-11934
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
>
> I think we have way too much INFO level logging. Or, perhaps more correctly, 
> Solr logging needs to be examined and messages logged at an appropriate level.
> We log every update at an INFO level for instance. But I think we log LIR at 
> INFO as well. As a sysadmin I don't care to have my logs polluted with a 
> message for every update, but if I'm trying to keep my system healthy I want 
> to see LIR messages and try to understand why.
> Plus, in large installations logging at INFO level is creating a _LOT_ of 
> files.
> What I want to discuss on this JIRA is
> 1> What kinds of messages do we want log at WARN, INFO, DEBUG, and TRACE 
> levels?
> 2> Who's the audience at each level? For a running system that's functioning, 
> sysops folks would really like WARN messages that mean something need 
> attention for instance. If I'm troubleshooting should I turn on INFO? DEBUG? 
> TRACE?
> So let's say we get some kind of agreement as to the above. Then I propose 
> three things
> 1> Someone (and probably me but all help gratefully accepted) needs to go 
> through our logging and assign appropriate levels. This will take quite a 
> while, I intend to work on it in small chunks.
> 2> Actually answer whether unnecessary objects are created when something 
> like log.info("whatever {}", someObjectOrMethodCall); is invoked. Is this 
> independent on the logging implementation used? The SLF4J and log4j seem a 
> bit contradictory.
> 3> Maybe regularize log, logger, LOG as variable names, but that's a nit.
> As a tactical approach, I suggest we tag each LoggerFactory.getLogger in 
> files we work on with //SOLR-(whatever number is assigned when I create 
> this). We can remove them all later, but since I expect to approach this 
> piecemeal it'd be nice to keep track of which files have been done already.
> Finally, I really really really don't want to do this all at once. There are 
> 5-6 thousand log messages. Even at 1,000 a week that's 6 weeks, even starting 
> now it would probably span the 7.3 release.
> This will probably be an umbrella issue so we can keep all the commits 
> straight and people can volunteer to "fix the files in core" as a separate 
> piece of work (hint).
> There are several existing JIRAs about logging in general, let's link them in 
> here as well.
> Let the discussion begin!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11934) Visit Solr logging, it's too noisy.

2018-08-24 Thread David Smiley (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-11934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16591808#comment-16591808
 ] 

David Smiley commented on SOLR-11934:
-

+1 to "change the configs to set the level of these classes to WARN as a 
default"

> Visit Solr logging, it's too noisy.
> ---
>
> Key: SOLR-11934
> URL: https://issues.apache.org/jira/browse/SOLR-11934
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
>
> I think we have way too much INFO level logging. Or, perhaps more correctly, 
> Solr logging needs to be examined and messages logged at an appropriate level.
> We log every update at an INFO level for instance. But I think we log LIR at 
> INFO as well. As a sysadmin I don't care to have my logs polluted with a 
> message for every update, but if I'm trying to keep my system healthy I want 
> to see LIR messages and try to understand why.
> Plus, in large installations logging at INFO level is creating a _LOT_ of 
> files.
> What I want to discuss on this JIRA is
> 1> What kinds of messages do we want log at WARN, INFO, DEBUG, and TRACE 
> levels?
> 2> Who's the audience at each level? For a running system that's functioning, 
> sysops folks would really like WARN messages that mean something need 
> attention for instance. If I'm troubleshooting should I turn on INFO? DEBUG? 
> TRACE?
> So let's say we get some kind of agreement as to the above. Then I propose 
> three things
> 1> Someone (and probably me but all help gratefully accepted) needs to go 
> through our logging and assign appropriate levels. This will take quite a 
> while, I intend to work on it in small chunks.
> 2> Actually answer whether unnecessary objects are created when something 
> like log.info("whatever {}", someObjectOrMethodCall); is invoked. Is this 
> independent on the logging implementation used? The SLF4J and log4j seem a 
> bit contradictory.
> 3> Maybe regularize log, logger, LOG as variable names, but that's a nit.
> As a tactical approach, I suggest we tag each LoggerFactory.getLogger in 
> files we work on with //SOLR-(whatever number is assigned when I create 
> this). We can remove them all later, but since I expect to approach this 
> piecemeal it'd be nice to keep track of which files have been done already.
> Finally, I really really really don't want to do this all at once. There are 
> 5-6 thousand log messages. Even at 1,000 a week that's 6 weeks, even starting 
> now it would probably span the 7.3 release.
> This will probably be an umbrella issue so we can keep all the commits 
> straight and people can volunteer to "fix the files in core" as a separate 
> piece of work (hint).
> There are several existing JIRAs about logging in general, let's link them in 
> here as well.
> Let the discussion begin!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11934) Visit Solr logging, it's too noisy.

2018-08-22 Thread Erick Erickson (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-11934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16589690#comment-16589690
 ] 

Erick Erickson commented on SOLR-11934:
---

OK, circling back around to this, I'm finally breaking some time loose. I 
looked at a few more logs and here's the list (it includes what's above) of 
classes that produce, by far, the largest percentage of messages:

org.apache.solr.update.DirectUpdateHandler2;
 org.apache.solr.update.processor.LogUpdateProcessor;
 org.apache.solr.servlet.HttpSolrCall; (not sure how this one crept in here, 
may be peculiar to that client using HTTP connections)
 org.apache.solr.core.SolrCore;
 org.apache.solr.update.SolrIndexWriter;
 org.apache.solr.search.SolrIndexSearcher;

So here's what I'm thinking now that there's a controlled number of logging 
config files.

Rather than debate forever about when a message is a WARN or INFO or DEBUG or 
whatever, what about enabling/disabling specific class logging levels? Either
 * Move the frequent messages to DEBUG with a comment in the config about how 
to enable DEBUG level for these classes
 * change the configs to set the level of these classes to WARN as a default

I like the latter best for two reasons:
 * it doesn't make someone have to dig around for "how to I change the config 
for these classes?" The answer is "either comment out the specific class levels 
in log4j2.xml or change WARN to INFO".
 * it has the least amount of code changes. Well, none really just config 
changes.

 

As part of SOLR-11884 I'll be looking at pretty much all the log messages 
anyway and _may_ change some of the levels we log some messages at, but that's 
a tale for another day and we can argue about them there ;)

> Visit Solr logging, it's too noisy.
> ---
>
> Key: SOLR-11934
> URL: https://issues.apache.org/jira/browse/SOLR-11934
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
>
> I think we have way too much INFO level logging. Or, perhaps more correctly, 
> Solr logging needs to be examined and messages logged at an appropriate level.
> We log every update at an INFO level for instance. But I think we log LIR at 
> INFO as well. As a sysadmin I don't care to have my logs polluted with a 
> message for every update, but if I'm trying to keep my system healthy I want 
> to see LIR messages and try to understand why.
> Plus, in large installations logging at INFO level is creating a _LOT_ of 
> files.
> What I want to discuss on this JIRA is
> 1> What kinds of messages do we want log at WARN, INFO, DEBUG, and TRACE 
> levels?
> 2> Who's the audience at each level? For a running system that's functioning, 
> sysops folks would really like WARN messages that mean something need 
> attention for instance. If I'm troubleshooting should I turn on INFO? DEBUG? 
> TRACE?
> So let's say we get some kind of agreement as to the above. Then I propose 
> three things
> 1> Someone (and probably me but all help gratefully accepted) needs to go 
> through our logging and assign appropriate levels. This will take quite a 
> while, I intend to work on it in small chunks.
> 2> Actually answer whether unnecessary objects are created when something 
> like log.info("whatever {}", someObjectOrMethodCall); is invoked. Is this 
> independent on the logging implementation used? The SLF4J and log4j seem a 
> bit contradictory.
> 3> Maybe regularize log, logger, LOG as variable names, but that's a nit.
> As a tactical approach, I suggest we tag each LoggerFactory.getLogger in 
> files we work on with //SOLR-(whatever number is assigned when I create 
> this). We can remove them all later, but since I expect to approach this 
> piecemeal it'd be nice to keep track of which files have been done already.
> Finally, I really really really don't want to do this all at once. There are 
> 5-6 thousand log messages. Even at 1,000 a week that's 6 weeks, even starting 
> now it would probably span the 7.3 release.
> This will probably be an umbrella issue so we can keep all the commits 
> straight and people can volunteer to "fix the files in core" as a separate 
> piece of work (hint).
> There are several existing JIRAs about logging in general, let's link them in 
> here as well.
> Let the discussion begin!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11934) Visit Solr logging, it's too noisy.

2018-05-08 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16467930#comment-16467930
 ] 

Mark Miller commented on SOLR-11934:


I think if we turn this off by default and document how to easily turn it back 
on, we don't have a problem. We never have promised back compat with our 
logging. What's more scary in SOLR-12055 is that you may think things are 
working as you expect for what can be a very sensitive function when they are 
not. If you actually count on certain logging from an issue like this though, 
when you upgrade you have to understand what logging exists for that version 
and how your tools work with what you expect - but at least you can count on 
the logging you as you would expect to if you see it in testing.

> Visit Solr logging, it's too noisy.
> ---
>
> Key: SOLR-11934
> URL: https://issues.apache.org/jira/browse/SOLR-11934
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
>
> I think we have way too much INFO level logging. Or, perhaps more correctly, 
> Solr logging needs to be examined and messages logged at an appropriate level.
> We log every update at an INFO level for instance. But I think we log LIR at 
> INFO as well. As a sysadmin I don't care to have my logs polluted with a 
> message for every update, but if I'm trying to keep my system healthy I want 
> to see LIR messages and try to understand why.
> Plus, in large installations logging at INFO level is creating a _LOT_ of 
> files.
> What I want to discuss on this JIRA is
> 1> What kinds of messages do we want log at WARN, INFO, DEBUG, and TRACE 
> levels?
> 2> Who's the audience at each level? For a running system that's functioning, 
> sysops folks would really like WARN messages that mean something need 
> attention for instance. If I'm troubleshooting should I turn on INFO? DEBUG? 
> TRACE?
> So let's say we get some kind of agreement as to the above. Then I propose 
> three things
> 1> Someone (and probably me but all help gratefully accepted) needs to go 
> through our logging and assign appropriate levels. This will take quite a 
> while, I intend to work on it in small chunks.
> 2> Actually answer whether unnecessary objects are created when something 
> like log.info("whatever {}", someObjectOrMethodCall); is invoked. Is this 
> independent on the logging implementation used? The SLF4J and log4j seem a 
> bit contradictory.
> 3> Maybe regularize log, logger, LOG as variable names, but that's a nit.
> As a tactical approach, I suggest we tag each LoggerFactory.getLogger in 
> files we work on with //SOLR-(whatever number is assigned when I create 
> this). We can remove them all later, but since I expect to approach this 
> piecemeal it'd be nice to keep track of which files have been done already.
> Finally, I really really really don't want to do this all at once. There are 
> 5-6 thousand log messages. Even at 1,000 a week that's 6 weeks, even starting 
> now it would probably span the 7.3 release.
> This will probably be an umbrella issue so we can keep all the commits 
> straight and people can volunteer to "fix the files in core" as a separate 
> piece of work (hint).
> There are several existing JIRAs about logging in general, let's link them in 
> here as well.
> Let the discussion begin!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11934) Visit Solr logging, it's too noisy.

2018-05-08 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16467495#comment-16467495
 ] 

Erick Erickson commented on SOLR-11934:
---

[~markrmil...@gmail.com] Your comment on SOLR-12055 got me to wondering. Do we 
have to accommodate audit logging in this too? If so, I think it'd be a 
documentation issue, as you say in 12055 announce LOUDLY.

Plus, that informs the question about what level we log individual messages 
at...

> Visit Solr logging, it's too noisy.
> ---
>
> Key: SOLR-11934
> URL: https://issues.apache.org/jira/browse/SOLR-11934
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
>
> I think we have way too much INFO level logging. Or, perhaps more correctly, 
> Solr logging needs to be examined and messages logged at an appropriate level.
> We log every update at an INFO level for instance. But I think we log LIR at 
> INFO as well. As a sysadmin I don't care to have my logs polluted with a 
> message for every update, but if I'm trying to keep my system healthy I want 
> to see LIR messages and try to understand why.
> Plus, in large installations logging at INFO level is creating a _LOT_ of 
> files.
> What I want to discuss on this JIRA is
> 1> What kinds of messages do we want log at WARN, INFO, DEBUG, and TRACE 
> levels?
> 2> Who's the audience at each level? For a running system that's functioning, 
> sysops folks would really like WARN messages that mean something need 
> attention for instance. If I'm troubleshooting should I turn on INFO? DEBUG? 
> TRACE?
> So let's say we get some kind of agreement as to the above. Then I propose 
> three things
> 1> Someone (and probably me but all help gratefully accepted) needs to go 
> through our logging and assign appropriate levels. This will take quite a 
> while, I intend to work on it in small chunks.
> 2> Actually answer whether unnecessary objects are created when something 
> like log.info("whatever {}", someObjectOrMethodCall); is invoked. Is this 
> independent on the logging implementation used? The SLF4J and log4j seem a 
> bit contradictory.
> 3> Maybe regularize log, logger, LOG as variable names, but that's a nit.
> As a tactical approach, I suggest we tag each LoggerFactory.getLogger in 
> files we work on with //SOLR-(whatever number is assigned when I create 
> this). We can remove them all later, but since I expect to approach this 
> piecemeal it'd be nice to keep track of which files have been done already.
> Finally, I really really really don't want to do this all at once. There are 
> 5-6 thousand log messages. Even at 1,000 a week that's 6 weeks, even starting 
> now it would probably span the 7.3 release.
> This will probably be an umbrella issue so we can keep all the commits 
> straight and people can volunteer to "fix the files in core" as a separate 
> piece of work (hint).
> There are several existing JIRAs about logging in general, let's link them in 
> here as well.
> Let the discussion begin!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11934) Visit Solr logging, it's too noisy.

2018-05-06 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16465435#comment-16465435
 ] 

Mark Miller commented on SOLR-11934:


I'm pro reducing the amount of default index logging. It's very rarely been 
useful to me in debugging problems from a customer site. When you need that 
level of detail, it's usually when trying to trace things during a reproduction 
attempt or something. My experience is that all the logging makes actual 
debugging that you do on a productions systems logs much more difficult and I 
can't think of a case where I have used this info from production logs.

> Visit Solr logging, it's too noisy.
> ---
>
> Key: SOLR-11934
> URL: https://issues.apache.org/jira/browse/SOLR-11934
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
>
> I think we have way too much INFO level logging. Or, perhaps more correctly, 
> Solr logging needs to be examined and messages logged at an appropriate level.
> We log every update at an INFO level for instance. But I think we log LIR at 
> INFO as well. As a sysadmin I don't care to have my logs polluted with a 
> message for every update, but if I'm trying to keep my system healthy I want 
> to see LIR messages and try to understand why.
> Plus, in large installations logging at INFO level is creating a _LOT_ of 
> files.
> What I want to discuss on this JIRA is
> 1> What kinds of messages do we want log at WARN, INFO, DEBUG, and TRACE 
> levels?
> 2> Who's the audience at each level? For a running system that's functioning, 
> sysops folks would really like WARN messages that mean something need 
> attention for instance. If I'm troubleshooting should I turn on INFO? DEBUG? 
> TRACE?
> So let's say we get some kind of agreement as to the above. Then I propose 
> three things
> 1> Someone (and probably me but all help gratefully accepted) needs to go 
> through our logging and assign appropriate levels. This will take quite a 
> while, I intend to work on it in small chunks.
> 2> Actually answer whether unnecessary objects are created when something 
> like log.info("whatever {}", someObjectOrMethodCall); is invoked. Is this 
> independent on the logging implementation used? The SLF4J and log4j seem a 
> bit contradictory.
> 3> Maybe regularize log, logger, LOG as variable names, but that's a nit.
> As a tactical approach, I suggest we tag each LoggerFactory.getLogger in 
> files we work on with //SOLR-(whatever number is assigned when I create 
> this). We can remove them all later, but since I expect to approach this 
> piecemeal it'd be nice to keep track of which files have been done already.
> Finally, I really really really don't want to do this all at once. There are 
> 5-6 thousand log messages. Even at 1,000 a week that's 6 weeks, even starting 
> now it would probably span the 7.3 release.
> This will probably be an umbrella issue so we can keep all the commits 
> straight and people can volunteer to "fix the files in core" as a separate 
> piece of work (hint).
> There are several existing JIRAs about logging in general, let's link them in 
> here as well.
> Let the discussion begin!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11934) Visit Solr logging, it's too noisy.

2018-05-05 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16464824#comment-16464824
 ] 

Shawn Heisey commented on SOLR-11934:
-

[~erickerickson], I like your ideas.

The query logging is also a significant contributor, *especially* on sharded 
indexes.  SOLR-11453 got me thinking in the background about the possibility of 
sending query (technically, actually request, which includes query) logs to 
their own file as well.

Updates do result in particularly verbose and numerous logs.  I wonder if 
there's any value to extending the 'separate logfile' paradigm for the detailed 
information, with the summary info you suggested in the main log.

The only disadvantage I can think of with separate logfiles for primary logging 
events, and the main notion that makes me *slightly* resistant to making that 
config default, is that it is very difficult to reconstruct a narrative of 
events that are logged at the same millisecond in those separate files.  If 
they're all in the same file, it's usually very easy to determine the order of 
events.  If we do make separate request/update logs a default config, the 
documentation must explain how to turn those separate logs off.  That would be 
vital to troubleshooting issues that only show up under heavy load.


> Visit Solr logging, it's too noisy.
> ---
>
> Key: SOLR-11934
> URL: https://issues.apache.org/jira/browse/SOLR-11934
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
>
> I think we have way too much INFO level logging. Or, perhaps more correctly, 
> Solr logging needs to be examined and messages logged at an appropriate level.
> We log every update at an INFO level for instance. But I think we log LIR at 
> INFO as well. As a sysadmin I don't care to have my logs polluted with a 
> message for every update, but if I'm trying to keep my system healthy I want 
> to see LIR messages and try to understand why.
> Plus, in large installations logging at INFO level is creating a _LOT_ of 
> files.
> What I want to discuss on this JIRA is
> 1> What kinds of messages do we want log at WARN, INFO, DEBUG, and TRACE 
> levels?
> 2> Who's the audience at each level? For a running system that's functioning, 
> sysops folks would really like WARN messages that mean something need 
> attention for instance. If I'm troubleshooting should I turn on INFO? DEBUG? 
> TRACE?
> So let's say we get some kind of agreement as to the above. Then I propose 
> three things
> 1> Someone (and probably me but all help gratefully accepted) needs to go 
> through our logging and assign appropriate levels. This will take quite a 
> while, I intend to work on it in small chunks.
> 2> Actually answer whether unnecessary objects are created when something 
> like log.info("whatever {}", someObjectOrMethodCall); is invoked. Is this 
> independent on the logging implementation used? The SLF4J and log4j seem a 
> bit contradictory.
> 3> Maybe regularize log, logger, LOG as variable names, but that's a nit.
> As a tactical approach, I suggest we tag each LoggerFactory.getLogger in 
> files we work on with //SOLR-(whatever number is assigned when I create 
> this). We can remove them all later, but since I expect to approach this 
> piecemeal it'd be nice to keep track of which files have been done already.
> Finally, I really really really don't want to do this all at once. There are 
> 5-6 thousand log messages. Even at 1,000 a week that's 6 weeks, even starting 
> now it would probably span the 7.3 release.
> This will probably be an umbrella issue so we can keep all the commits 
> straight and people can volunteer to "fix the files in core" as a separate 
> piece of work (hint).
> There are several existing JIRAs about logging in general, let's link them in 
> here as well.
> Let the discussion begin!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11934) Visit Solr logging, it's too noisy.

2018-05-04 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16464575#comment-16464575
 ] 

Erick Erickson commented on SOLR-11934:
---

I looked at sample logs from a client and wrote a little program to bucket 
them. NOTE: this was an index-heavy type application. Here are the number of 
log messages originating in these files:

35 occurrences in 
org.apache.solr.common.cloud.ZkStateReader$StateWatcher;
48 occurrences in org.apache.solr.core.QuerySenderListener;
55 occurrences in org.apache.solr.servlet.HttpSolrCall;
 64388 occurrences in org.apache.solr.update.SolrIndexWriter;
 64413 occurrences in org.apache.solr.search.SolrIndexSearcher;
 64601 occurrences in org.apache.solr.core.SolrCore;
128795 occurrences in org.apache.solr.update.DirectUpdateHandler2;
   3345415 occurrences in 
org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor;

that got me to thinking about the _value_ of certain messages when 
troubleshooting. Is the value in the individual messages or in getting a sense 
of how things are behaving?

So WDYT about logging some summary information instead of individual messages 
for high-cardinality messages? Rather than logging every update at info level, 
would it be "useful enough" when troubleshooting to log a summary of updates' 
vital statistics every N seconds?

Here's what I'm thinking. Pretty often when troubleshooting, we want to 
correlate heavy request activity for Solr to something else (GC, slowdowns, 
recoveries, whatever). Using updates as an example, what if at info level, we 
logged something like

INFO  - 2018-05-04 03:35:09.206 over the last T seconds there were N update 
requests updating D documents taking min/avg/median/max milliseconds to 
complete with min/average/median/max documents per request.

With the new metrics capabilities, that information may be available already, 
along with lots of other information (percentiles and the like).

If seeing the underlying individual requests were important, one could turn on 
finer-grained logging for this class only. We could even supply commented-out 
ways to enable voluminous messages for anything we choose to write summary 
information out in the log4j2 conifgs.

I totally understand that for gnarly problems you want (and need) voluminous 
output. I totally understand that sometimes you can't predict that up front. I 
totally understand that in some cases this would introduce an extra round-trip, 
"you have to enable this option in your logging configuration, wait for it to 
happen again and call me".

I also totally understand that in large Solr operations, the log files are 
very, very hard to use because there's so much information being dumped by 
potentially hundreds of Solr instances.

I also totally understand that I've spend more time than I like to remember 
writing programs to parse log files to give me summary information like the 
above to know where to start looking for problems. Why not have Solr do it?

This example is just for updates, but queries could do something very similar 
along with whatever else fits this pattern. Queries certainly come to mind.

The question I'm really asking is if we're logging intelligently or not. 
Basically we're throwing everything that anyone _might_ be interested into a 
log file and then requiring the users (and consultants) to figure out how to 
make it useful. We know enough about how this information is used I think to do 
some of it up front, always with the option to drop back to the finer-grained 
information when necessary.

Anyway, what do people think?

> Visit Solr logging, it's too noisy.
> ---
>
> Key: SOLR-11934
> URL: https://issues.apache.org/jira/browse/SOLR-11934
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
>
> I think we have way too much INFO level logging. Or, perhaps more correctly, 
> Solr logging needs to be examined and messages logged at an appropriate level.
> We log every update at an INFO level for instance. But I think we log LIR at 
> INFO as well. As a sysadmin I don't care to have my logs polluted with a 
> message for every update, but if I'm trying to keep my system healthy I want 
> to see LIR messages and try to understand why.
> Plus, in large installations logging at INFO level is creating a _LOT_ of 
> files.
> What I want to discuss on this JIRA is
> 1> What kinds of messages do we want log at WARN, INFO, DEBUG, and TRACE 
> levels?
> 2> Who's the audience at each level? For a running system that's functioning, 
> sysops folks would really like WARN messages that mean something need 
> attention for instance. If 

[jira] [Commented] (SOLR-11934) Visit Solr logging, it's too noisy.

2018-03-27 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-11934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16415596#comment-16415596
 ] 

Jan Høydahl commented on SOLR-11934:


Thanks for the tip!

> Visit Solr logging, it's too noisy.
> ---
>
> Key: SOLR-11934
> URL: https://issues.apache.org/jira/browse/SOLR-11934
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
>
> I think we have way too much INFO level logging. Or, perhaps more correctly, 
> Solr logging needs to be examined and messages logged at an appropriate level.
> We log every update at an INFO level for instance. But I think we log LIR at 
> INFO as well. As a sysadmin I don't care to have my logs polluted with a 
> message for every update, but if I'm trying to keep my system healthy I want 
> to see LIR messages and try to understand why.
> Plus, in large installations logging at INFO level is creating a _LOT_ of 
> files.
> What I want to discuss on this JIRA is
> 1> What kinds of messages do we want log at WARN, INFO, DEBUG, and TRACE 
> levels?
> 2> Who's the audience at each level? For a running system that's functioning, 
> sysops folks would really like WARN messages that mean something need 
> attention for instance. If I'm troubleshooting should I turn on INFO? DEBUG? 
> TRACE?
> So let's say we get some kind of agreement as to the above. Then I propose 
> three things
> 1> Someone (and probably me but all help gratefully accepted) needs to go 
> through our logging and assign appropriate levels. This will take quite a 
> while, I intend to work on it in small chunks.
> 2> Actually answer whether unnecessary objects are created when something 
> like log.info("whatever {}", someObjectOrMethodCall); is invoked. Is this 
> independent on the logging implementation used? The SLF4J and log4j seem a 
> bit contradictory.
> 3> Maybe regularize log, logger, LOG as variable names, but that's a nit.
> As a tactical approach, I suggest we tag each LoggerFactory.getLogger in 
> files we work on with //SOLR-(whatever number is assigned when I create 
> this). We can remove them all later, but since I expect to approach this 
> piecemeal it'd be nice to keep track of which files have been done already.
> Finally, I really really really don't want to do this all at once. There are 
> 5-6 thousand log messages. Even at 1,000 a week that's 6 weeks, even starting 
> now it would probably span the 7.3 release.
> This will probably be an umbrella issue so we can keep all the commits 
> straight and people can volunteer to "fix the files in core" as a separate 
> piece of work (hint).
> There are several existing JIRAs about logging in general, let's link them in 
> here as well.
> Let the discussion begin!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11934) Visit Solr logging, it's too noisy.

2018-03-27 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16415556#comment-16415556
 ] 

Shawn Heisey commented on SOLR-11934:
-

bq. It would also be good if we could change request logging from being done in 
SolrCore to some other class SolrRequests so that it is easy to redirect normal 
query/update logs to a separate log file and keeping system-level logs in 
solr.log. Any thoughts?

Even though request logging is actually done in the SolrCore class, it's 
already possible to handle it differently in the log4j config.  The logger used 
for requests is org.apache.solr.core.SolrCore.Request:

{code:java}
  private static final Logger log = 
LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
  public static final Logger requestLog = 
LoggerFactory.getLogger(MethodHandles.lookup().lookupClass().getName() + 
".Request");
{code}

Side note: Is there actually any reason to make the request logger public like 
it is now?  I can't think of one.  It's already possible for users writing 
their own code to access ANY logger in the Solr ecosystem by name.


> Visit Solr logging, it's too noisy.
> ---
>
> Key: SOLR-11934
> URL: https://issues.apache.org/jira/browse/SOLR-11934
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
>
> I think we have way too much INFO level logging. Or, perhaps more correctly, 
> Solr logging needs to be examined and messages logged at an appropriate level.
> We log every update at an INFO level for instance. But I think we log LIR at 
> INFO as well. As a sysadmin I don't care to have my logs polluted with a 
> message for every update, but if I'm trying to keep my system healthy I want 
> to see LIR messages and try to understand why.
> Plus, in large installations logging at INFO level is creating a _LOT_ of 
> files.
> What I want to discuss on this JIRA is
> 1> What kinds of messages do we want log at WARN, INFO, DEBUG, and TRACE 
> levels?
> 2> Who's the audience at each level? For a running system that's functioning, 
> sysops folks would really like WARN messages that mean something need 
> attention for instance. If I'm troubleshooting should I turn on INFO? DEBUG? 
> TRACE?
> So let's say we get some kind of agreement as to the above. Then I propose 
> three things
> 1> Someone (and probably me but all help gratefully accepted) needs to go 
> through our logging and assign appropriate levels. This will take quite a 
> while, I intend to work on it in small chunks.
> 2> Actually answer whether unnecessary objects are created when something 
> like log.info("whatever {}", someObjectOrMethodCall); is invoked. Is this 
> independent on the logging implementation used? The SLF4J and log4j seem a 
> bit contradictory.
> 3> Maybe regularize log, logger, LOG as variable names, but that's a nit.
> As a tactical approach, I suggest we tag each LoggerFactory.getLogger in 
> files we work on with //SOLR-(whatever number is assigned when I create 
> this). We can remove them all later, but since I expect to approach this 
> piecemeal it'd be nice to keep track of which files have been done already.
> Finally, I really really really don't want to do this all at once. There are 
> 5-6 thousand log messages. Even at 1,000 a week that's 6 weeks, even starting 
> now it would probably span the 7.3 release.
> This will probably be an umbrella issue so we can keep all the commits 
> straight and people can volunteer to "fix the files in core" as a separate 
> piece of work (hint).
> There are several existing JIRAs about logging in general, let's link them in 
> here as well.
> Let the discussion begin!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11934) Visit Solr logging, it's too noisy.

2018-03-27 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-11934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16415339#comment-16415339
 ] 

Jan Høydahl commented on SOLR-11934:


Remember that the start scripts has support for {{-q}} option for setting log 
level to WARN on startup, and {{-v}} for setting it to DEBUG. You could also 
set a variable {{SOLR_LOG_LEVEL=ERROR}} if you require something else. I think 
the ref guide mentions this. But I agree that there are probably still many 
places where we log to the wrong level, so a global cleanup of this would be 
great.

It would also be good if we could change request logging from being done in 
{{SolrCore}} to some other class {{SolrRequests}} so that it is easy to 
redirect normal query/update logs to a separate log file and keeping 
system-level logs in solr.log. Any thoughts?

> Visit Solr logging, it's too noisy.
> ---
>
> Key: SOLR-11934
> URL: https://issues.apache.org/jira/browse/SOLR-11934
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
>
> I think we have way too much INFO level logging. Or, perhaps more correctly, 
> Solr logging needs to be examined and messages logged at an appropriate level.
> We log every update at an INFO level for instance. But I think we log LIR at 
> INFO as well. As a sysadmin I don't care to have my logs polluted with a 
> message for every update, but if I'm trying to keep my system healthy I want 
> to see LIR messages and try to understand why.
> Plus, in large installations logging at INFO level is creating a _LOT_ of 
> files.
> What I want to discuss on this JIRA is
> 1> What kinds of messages do we want log at WARN, INFO, DEBUG, and TRACE 
> levels?
> 2> Who's the audience at each level? For a running system that's functioning, 
> sysops folks would really like WARN messages that mean something need 
> attention for instance. If I'm troubleshooting should I turn on INFO? DEBUG? 
> TRACE?
> So let's say we get some kind of agreement as to the above. Then I propose 
> three things
> 1> Someone (and probably me but all help gratefully accepted) needs to go 
> through our logging and assign appropriate levels. This will take quite a 
> while, I intend to work on it in small chunks.
> 2> Actually answer whether unnecessary objects are created when something 
> like log.info("whatever {}", someObjectOrMethodCall); is invoked. Is this 
> independent on the logging implementation used? The SLF4J and log4j seem a 
> bit contradictory.
> 3> Maybe regularize log, logger, LOG as variable names, but that's a nit.
> As a tactical approach, I suggest we tag each LoggerFactory.getLogger in 
> files we work on with //SOLR-(whatever number is assigned when I create 
> this). We can remove them all later, but since I expect to approach this 
> piecemeal it'd be nice to keep track of which files have been done already.
> Finally, I really really really don't want to do this all at once. There are 
> 5-6 thousand log messages. Even at 1,000 a week that's 6 weeks, even starting 
> now it would probably span the 7.3 release.
> This will probably be an umbrella issue so we can keep all the commits 
> straight and people can volunteer to "fix the files in core" as a separate 
> piece of work (hint).
> There are several existing JIRAs about logging in general, let's link them in 
> here as well.
> Let the discussion begin!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11934) Visit Solr logging, it's too noisy.

2018-03-26 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16414979#comment-16414979
 ] 

Shawn Heisey commented on SOLR-11934:
-

I wrote a lengthy comment on SOLR-7887, with some details that might help with 
deciding what the default logfile rotation size should be.  See the comment on 
that issue for full gory details.  Erick reminded me about this issue as the 
correct place to discuss it.

A summary, part of that long comment:

The 4MB log covered times from 18:51:26.552 to 18:57:58.139, about six and a 
half minutes. The 32MB log covered times from 18:01:23.375 to 18:58:13.886, 
which is almost an hour.

This server, running version 4.7.2, has a pretty low query rate, but each user 
query is quite large in terms URL parameter size -- filters are added by the 
application to limit what the user can see according to their permission level. 
 The log is dominated by the ping requests from the load balancer and those 
very large user queries.  It is my opinion that those logs should NOT be moved 
out of the INFO level.

I don't have any newer versions with production usage, so I can't comment on 
what will happen when running a version where attempts have been made to reduce 
the noise level in the logs.

Based on what I found when I examined my logs, I think the default logfile 
rotation size needs to be fairly large.  The current 4MB is certainly too 
small.  Perhaps 64MB or 128MB?  We'll need to see if we can ask some users 
varying traffic levels to provide us some information about their logfiles, to 
try and come up with a number that will preserve a decent amount of history for 
a "typical" install, without consuming REALLY large amounts of disk space.


> Visit Solr logging, it's too noisy.
> ---
>
> Key: SOLR-11934
> URL: https://issues.apache.org/jira/browse/SOLR-11934
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
>
> I think we have way too much INFO level logging. Or, perhaps more correctly, 
> Solr logging needs to be examined and messages logged at an appropriate level.
> We log every update at an INFO level for instance. But I think we log LIR at 
> INFO as well. As a sysadmin I don't care to have my logs polluted with a 
> message for every update, but if I'm trying to keep my system healthy I want 
> to see LIR messages and try to understand why.
> Plus, in large installations logging at INFO level is creating a _LOT_ of 
> files.
> What I want to discuss on this JIRA is
> 1> What kinds of messages do we want log at WARN, INFO, DEBUG, and TRACE 
> levels?
> 2> Who's the audience at each level? For a running system that's functioning, 
> sysops folks would really like WARN messages that mean something need 
> attention for instance. If I'm troubleshooting should I turn on INFO? DEBUG? 
> TRACE?
> So let's say we get some kind of agreement as to the above. Then I propose 
> three things
> 1> Someone (and probably me but all help gratefully accepted) needs to go 
> through our logging and assign appropriate levels. This will take quite a 
> while, I intend to work on it in small chunks.
> 2> Actually answer whether unnecessary objects are created when something 
> like log.info("whatever {}", someObjectOrMethodCall); is invoked. Is this 
> independent on the logging implementation used? The SLF4J and log4j seem a 
> bit contradictory.
> 3> Maybe regularize log, logger, LOG as variable names, but that's a nit.
> As a tactical approach, I suggest we tag each LoggerFactory.getLogger in 
> files we work on with //SOLR-(whatever number is assigned when I create 
> this). We can remove them all later, but since I expect to approach this 
> piecemeal it'd be nice to keep track of which files have been done already.
> Finally, I really really really don't want to do this all at once. There are 
> 5-6 thousand log messages. Even at 1,000 a week that's 6 weeks, even starting 
> now it would probably span the 7.3 release.
> This will probably be an umbrella issue so we can keep all the commits 
> straight and people can volunteer to "fix the files in core" as a separate 
> piece of work (hint).
> There are several existing JIRAs about logging in general, let's link them in 
> here as well.
> Let the discussion begin!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11934) Visit Solr logging, it's too noisy.

2018-02-21 Thread Gus Heck (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371524#comment-16371524
 ] 

Gus Heck commented on SOLR-11934:
-

Forgot slf4j didn't have that level (though note it can be added in... sort 
of..: https://www.slf4j.org/faq.html#fatal). Without that, yeah error has to 
include the panic button. I tend to use log4j2 in most of my projects :), so of 
course I would not mind it here too, but yeah master only, and as much as I 
like it, such a change would probably be no minor thing and need clear a clear 
cost/benefit analysis. 

> Visit Solr logging, it's too noisy.
> ---
>
> Key: SOLR-11934
> URL: https://issues.apache.org/jira/browse/SOLR-11934
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
>
> I think we have way too much INFO level logging. Or, perhaps more correctly, 
> Solr logging needs to be examined and messages logged at an appropriate level.
> We log every update at an INFO level for instance. But I think we log LIR at 
> INFO as well. As a sysadmin I don't care to have my logs polluted with a 
> message for every update, but if I'm trying to keep my system healthy I want 
> to see LIR messages and try to understand why.
> Plus, in large installations logging at INFO level is creating a _LOT_ of 
> files.
> What I want to discuss on this JIRA is
> 1> What kinds of messages do we want log at WARN, INFO, DEBUG, and TRACE 
> levels?
> 2> Who's the audience at each level? For a running system that's functioning, 
> sysops folks would really like WARN messages that mean something need 
> attention for instance. If I'm troubleshooting should I turn on INFO? DEBUG? 
> TRACE?
> So let's say we get some kind of agreement as to the above. Then I propose 
> three things
> 1> Someone (and probably me but all help gratefully accepted) needs to go 
> through our logging and assign appropriate levels. This will take quite a 
> while, I intend to work on it in small chunks.
> 2> Actually answer whether unnecessary objects are created when something 
> like log.info("whatever {}", someObjectOrMethodCall); is invoked. Is this 
> independent on the logging implementation used? The SLF4J and log4j seem a 
> bit contradictory.
> 3> Maybe regularize log, logger, LOG as variable names, but that's a nit.
> As a tactical approach, I suggest we tag each LoggerFactory.getLogger in 
> files we work on with //SOLR-(whatever number is assigned when I create 
> this). We can remove them all later, but since I expect to approach this 
> piecemeal it'd be nice to keep track of which files have been done already.
> Finally, I really really really don't want to do this all at once. There are 
> 5-6 thousand log messages. Even at 1,000 a week that's 6 weeks, even starting 
> now it would probably span the 7.3 release.
> This will probably be an umbrella issue so we can keep all the commits 
> straight and people can volunteer to "fix the files in core" as a separate 
> piece of work (hint).
> There are several existing JIRAs about logging in general, let's link them in 
> here as well.
> Let the discussion begin!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11934) Visit Solr logging, it's too noisy.

2018-02-20 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371042#comment-16371042
 ] 

Shawn Heisey commented on SOLR-11934:
-

Migrating from slf4j to log4j2, if that idea has support, should only happen in 
master.  I think it's too drastic a change for a minor release.


> Visit Solr logging, it's too noisy.
> ---
>
> Key: SOLR-11934
> URL: https://issues.apache.org/jira/browse/SOLR-11934
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
>
> I think we have way too much INFO level logging. Or, perhaps more correctly, 
> Solr logging needs to be examined and messages logged at an appropriate level.
> We log every update at an INFO level for instance. But I think we log LIR at 
> INFO as well. As a sysadmin I don't care to have my logs polluted with a 
> message for every update, but if I'm trying to keep my system healthy I want 
> to see LIR messages and try to understand why.
> Plus, in large installations logging at INFO level is creating a _LOT_ of 
> files.
> What I want to discuss on this JIRA is
> 1> What kinds of messages do we want log at WARN, INFO, DEBUG, and TRACE 
> levels?
> 2> Who's the audience at each level? For a running system that's functioning, 
> sysops folks would really like WARN messages that mean something need 
> attention for instance. If I'm troubleshooting should I turn on INFO? DEBUG? 
> TRACE?
> So let's say we get some kind of agreement as to the above. Then I propose 
> three things
> 1> Someone (and probably me but all help gratefully accepted) needs to go 
> through our logging and assign appropriate levels. This will take quite a 
> while, I intend to work on it in small chunks.
> 2> Actually answer whether unnecessary objects are created when something 
> like log.info("whatever {}", someObjectOrMethodCall); is invoked. Is this 
> independent on the logging implementation used? The SLF4J and log4j seem a 
> bit contradictory.
> 3> Maybe regularize log, logger, LOG as variable names, but that's a nit.
> As a tactical approach, I suggest we tag each LoggerFactory.getLogger in 
> files we work on with //SOLR-(whatever number is assigned when I create 
> this). We can remove them all later, but since I expect to approach this 
> piecemeal it'd be nice to keep track of which files have been done already.
> Finally, I really really really don't want to do this all at once. There are 
> 5-6 thousand log messages. Even at 1,000 a week that's 6 weeks, even starting 
> now it would probably span the 7.3 release.
> This will probably be an umbrella issue so we can keep all the commits 
> straight and people can volunteer to "fix the files in core" as a separate 
> piece of work (hint).
> There are several existing JIRAs about logging in general, let's link them in 
> here as well.
> Let the discussion begin!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11934) Visit Solr logging, it's too noisy.

2018-02-20 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371040#comment-16371040
 ] 

Shawn Heisey commented on SOLR-11934:
-

Thought stream:

I think the default logfile rollover size of 4MB is WAY too small.  On a busy 
server, it can cycle through all ten logfiles in a hurry and what you're 
looking for may not even be available.  I increase this to 4GB on my installs, 
but this is perhaps far too large for a default.  I think it does need to be 
increased.  I wonder if 100MB is too large.  64MB?  Should we switch to 
time-based rollover instead?  Because the time-based logs have variable 
filenames, scripting becomes more difficult.  And on a busy server, one day of 
logging could be REALLY huge.

I'm wary of splitting the logging into separate files.  It sounds like a good 
idea to control verbosity, until you discover that you need to see the order of 
seven different log entries that all happened in the same millisecond, and 
because they're in multiple files, you have no idea what order they occurred.  
The solution to that would probably lead to duplication of logging entries in 
multiple files, so we're back to a situation where too much data is being 
logged.

Logging into a core/collection is an interesting idea, especially if there is a 
UI for querying it, and the problem of recursive logging storms can be 
prevented.  (probably by logging everything for that core/collection ONLY to a 
logfile)

Without doing some research, I'm neutral on the subject of intricate log4j2 
config details and how to use MDC more effectively.

Minor point: I disagree with the assertion [~gus_heck] made that ERROR can be 
looked at in the morning.  I would use that phrase to describe WARN.

I don't see any ability in slf4j to actually log at FATAL, and I've never seen 
any logs from Solr at that level.  The highest severity it seems to give the 
developer is ERROR.  I think there are probably some problems that should be 
logged at this level.  The log4j API has FATAL, but we're not using log4j in 
the code, only slf4j.

Perhaps as an Apache project, we should be eating our own dogfood, not using 
slf4j.  What do people think about the idea of a new issue that migrates slf4j 
to log4j2 and configures it programmatically from solr.xml?  The config could 
have a few very simple options (some for size-based rotation like we have now, 
some for time-based rotation, etc).  I wonder if there's possibly a way that we 
could tell it to configure from a log4j.xml file instead of those easy options, 
so expert users can still have complete control over the logging with all the 
capability that log4j normally has.


> Visit Solr logging, it's too noisy.
> ---
>
> Key: SOLR-11934
> URL: https://issues.apache.org/jira/browse/SOLR-11934
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
>
> I think we have way too much INFO level logging. Or, perhaps more correctly, 
> Solr logging needs to be examined and messages logged at an appropriate level.
> We log every update at an INFO level for instance. But I think we log LIR at 
> INFO as well. As a sysadmin I don't care to have my logs polluted with a 
> message for every update, but if I'm trying to keep my system healthy I want 
> to see LIR messages and try to understand why.
> Plus, in large installations logging at INFO level is creating a _LOT_ of 
> files.
> What I want to discuss on this JIRA is
> 1> What kinds of messages do we want log at WARN, INFO, DEBUG, and TRACE 
> levels?
> 2> Who's the audience at each level? For a running system that's functioning, 
> sysops folks would really like WARN messages that mean something need 
> attention for instance. If I'm troubleshooting should I turn on INFO? DEBUG? 
> TRACE?
> So let's say we get some kind of agreement as to the above. Then I propose 
> three things
> 1> Someone (and probably me but all help gratefully accepted) needs to go 
> through our logging and assign appropriate levels. This will take quite a 
> while, I intend to work on it in small chunks.
> 2> Actually answer whether unnecessary objects are created when something 
> like log.info("whatever {}", someObjectOrMethodCall); is invoked. Is this 
> independent on the logging implementation used? The SLF4J and log4j seem a 
> bit contradictory.
> 3> Maybe regularize log, logger, LOG as variable names, but that's a nit.
> As a tactical approach, I suggest we tag each LoggerFactory.getLogger in 
> files we work on with //SOLR-(whatever number is assigned when I create 
> this). We can remove them all later, but since I expect to approach this 
> piecemeal it'd be nice to keep track of which files have 

[jira] [Commented] (SOLR-11934) Visit Solr logging, it's too noisy.

2018-02-01 Thread Michael Braun (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349051#comment-16349051
 ] 

Michael Braun commented on SOLR-11934:
--

Regarding the %C in the pattern, is it not practical to use %c instead? (logger 
name as opposed to the class calling the logger)

> Visit Solr logging, it's too noisy.
> ---
>
> Key: SOLR-11934
> URL: https://issues.apache.org/jira/browse/SOLR-11934
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
>
> I think we have way too much INFO level logging. Or, perhaps more correctly, 
> Solr logging needs to be examined and messages logged at an appropriate level.
> We log every update at an INFO level for instance. But I think we log LIR at 
> INFO as well. As a sysadmin I don't care to have my logs polluted with a 
> message for every update, but if I'm trying to keep my system healthy I want 
> to see LIR messages and try to understand why.
> Plus, in large installations logging at INFO level is creating a _LOT_ of 
> files.
> What I want to discuss on this JIRA is
> 1> What kinds of messages do we want log at WARN, INFO, DEBUG, and TRACE 
> levels?
> 2> Who's the audience at each level? For a running system that's functioning, 
> sysops folks would really like WARN messages that mean something need 
> attention for instance. If I'm troubleshooting should I turn on INFO? DEBUG? 
> TRACE?
> So let's say we get some kind of agreement as to the above. Then I propose 
> three things
> 1> Someone (and probably me but all help gratefully accepted) needs to go 
> through our logging and assign appropriate levels. This will take quite a 
> while, I intend to work on it in small chunks.
> 2> Actually answer whether unnecessary objects are created when something 
> like log.info("whatever {}", someObjectOrMethodCall); is invoked. Is this 
> independent on the logging implementation used? The SLF4J and log4j seem a 
> bit contradictory.
> 3> Maybe regularize log, logger, LOG as variable names, but that's a nit.
> As a tactical approach, I suggest we tag each LoggerFactory.getLogger in 
> files we work on with //SOLR-(whatever number is assigned when I create 
> this). We can remove them all later, but since I expect to approach this 
> piecemeal it'd be nice to keep track of which files have been done already.
> Finally, I really really really don't want to do this all at once. There are 
> 5-6 thousand log messages. Even at 1,000 a week that's 6 weeks, even starting 
> now it would probably span the 7.3 release.
> This will probably be an umbrella issue so we can keep all the commits 
> straight and people can volunteer to "fix the files in core" as a separate 
> piece of work (hint).
> There are several existing JIRAs about logging in general, let's link them in 
> here as well.
> Let the discussion begin!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11934) Visit Solr logging, it's too noisy.

2018-02-01 Thread Gus Heck (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349031#comment-16349031
 ] 

Gus Heck commented on SOLR-11934:
-

Yeah, logs having already rolled off is the flip side of the problem of course. 
An idea rolling around in the back of my head for some time after Dave and I 
get Time Routed Aliases fully sorted out is a feature to turn on self-indexing 
of logs (into a TRA of course) to make our own logs searchable. One component 
of that idea might be to separate logging by collection or groups of 
collections. It would be interesting to have a collection parameter that 
defines the logging context for that collection, which then results in a 
sub-dir containing just the logging for operations on the collections with that 
context. Obviously that's non-trivial and other ticket etc. WRT searchable logs 
that would be needed to avoid an endless loop of indexing the logs of indexing 
the logs of indexing the logs etc...

I suspect %C could get better in Java 9, if the stack walking api can be 
leveraged by log4j2... but yeah probably that should be off by default. 
Benchmarking with/without would be interesting (when did it get added? maybe 
the benchmarks were done at at that time)

I'm also wondering if there might be a way to use Markers or MDC to categorize 
logging statements in ways orthogonal to levels. The idea would be to make it 
easy to filter out (or focus in on) overseer message processing stuff or 
whatnot... Of course much of the above has to wait for  SOLR-7887 ...

> Visit Solr logging, it's too noisy.
> ---
>
> Key: SOLR-11934
> URL: https://issues.apache.org/jira/browse/SOLR-11934
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
>
> I think we have way too much INFO level logging. Or, perhaps more correctly, 
> Solr logging needs to be examined and messages logged at an appropriate level.
> We log every update at an INFO level for instance. But I think we log LIR at 
> INFO as well. As a sysadmin I don't care to have my logs polluted with a 
> message for every update, but if I'm trying to keep my system healthy I want 
> to see LIR messages and try to understand why.
> Plus, in large installations logging at INFO level is creating a _LOT_ of 
> files.
> What I want to discuss on this JIRA is
> 1> What kinds of messages do we want log at WARN, INFO, DEBUG, and TRACE 
> levels?
> 2> Who's the audience at each level? For a running system that's functioning, 
> sysops folks would really like WARN messages that mean something need 
> attention for instance. If I'm troubleshooting should I turn on INFO? DEBUG? 
> TRACE?
> So let's say we get some kind of agreement as to the above. Then I propose 
> three things
> 1> Someone (and probably me but all help gratefully accepted) needs to go 
> through our logging and assign appropriate levels. This will take quite a 
> while, I intend to work on it in small chunks.
> 2> Actually answer whether unnecessary objects are created when something 
> like log.info("whatever {}", someObjectOrMethodCall); is invoked. Is this 
> independent on the logging implementation used? The SLF4J and log4j seem a 
> bit contradictory.
> 3> Maybe regularize log, logger, LOG as variable names, but that's a nit.
> As a tactical approach, I suggest we tag each LoggerFactory.getLogger in 
> files we work on with //SOLR-(whatever number is assigned when I create 
> this). We can remove them all later, but since I expect to approach this 
> piecemeal it'd be nice to keep track of which files have been done already.
> Finally, I really really really don't want to do this all at once. There are 
> 5-6 thousand log messages. Even at 1,000 a week that's 6 weeks, even starting 
> now it would probably span the 7.3 release.
> This will probably be an umbrella issue so we can keep all the commits 
> straight and people can volunteer to "fix the files in core" as a separate 
> piece of work (hint).
> There are several existing JIRAs about logging in general, let's link them in 
> here as well.
> Let the discussion begin!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11934) Visit Solr logging, it's too noisy.

2018-02-01 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348842#comment-16348842
 ] 

Erick Erickson commented on SOLR-11934:
---

bq:  if someone tells me when it went wrong, with a default Solr install I have 
something to dig into to find out what went wrong

Yeah, that's the tension, isn't it? When it does hit the fan you want 
_everything_, and it's awkward to say "Sorry, the info we need isn't in the 
logs, change your logging and next time it happens immediately zip up all the 
log files and call me..."

Unfortunately with the defaults I find myself saying the above all the time. 
Solr produces so much information that the logs roll over pretty quickly under 
any kind of load, especially indexing. So the info I want is often gone anyway.

Which leads to another discussion about the default configuration ;)

I suspect this going to morph into a wider discussion about "logging in 
general". It's grown ad-hoc so far, your comments about "is there a query log" 
is a question I've heard from clients and on the user's list more than a few 
times for instance.

I don't particularly care if the defaults are producing lots and lots and lots 
of info, At present, though, we don't rationally group log messages, they're 
mostly INFO. The important stuff is buried in a sea of messages. I can't grep 
for WARN and be confident I'll see messages that matter for instance. Of course 
"what matters" often is something arcane ;)...

If the consensus is to move a lot of our INFO messages to DEBUG and distribute 
OOB with DEBUG enabled, that's fine as well I have about 4 different 
versions of my Java program that I use to tease out different "stuff" from the 
log file, mostly just to ignore noise. Which noise I need to filter out changes 
with the problem unfortunately ;)

As an example of how logging has grown without any supervision... Did you know 
that the "%C" bit in the layout pattern generates an exception for each and 
every log line issued? Useful information to be sure when trying to analyze a 
problem but costly steady-state. Not only do we generate a stack trace, but 
also allocate memory, contribute to GC, etc. 

That's an example of how we've viewed logging from a 
development/troubleshooting PoV without considering the needs of the operations 
folks. Much of this discussion will be about thinking about this from an 
operations viewpoint too.



> Visit Solr logging, it's too noisy.
> ---
>
> Key: SOLR-11934
> URL: https://issues.apache.org/jira/browse/SOLR-11934
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
>
> I think we have way too much INFO level logging. Or, perhaps more correctly, 
> Solr logging needs to be examined and messages logged at an appropriate level.
> We log every update at an INFO level for instance. But I think we log LIR at 
> INFO as well. As a sysadmin I don't care to have my logs polluted with a 
> message for every update, but if I'm trying to keep my system healthy I want 
> to see LIR messages and try to understand why.
> Plus, in large installations logging at INFO level is creating a _LOT_ of 
> files.
> What I want to discuss on this JIRA is
> 1> What kinds of messages do we want log at WARN, INFO, DEBUG, and TRACE 
> levels?
> 2> Who's the audience at each level? For a running system that's functioning, 
> sysops folks would really like WARN messages that mean something need 
> attention for instance. If I'm troubleshooting should I turn on INFO? DEBUG? 
> TRACE?
> So let's say we get some kind of agreement as to the above. Then I propose 
> three things
> 1> Someone (and probably me but all help gratefully accepted) needs to go 
> through our logging and assign appropriate levels. This will take quite a 
> while, I intend to work on it in small chunks.
> 2> Actually answer whether unnecessary objects are created when something 
> like log.info("whatever {}", someObjectOrMethodCall); is invoked. Is this 
> independent on the logging implementation used? The SLF4J and log4j seem a 
> bit contradictory.
> 3> Maybe regularize log, logger, LOG as variable names, but that's a nit.
> As a tactical approach, I suggest we tag each LoggerFactory.getLogger in 
> files we work on with //SOLR-(whatever number is assigned when I create 
> this). We can remove them all later, but since I expect to approach this 
> piecemeal it'd be nice to keep track of which files have been done already.
> Finally, I really really really don't want to do this all at once. There are 
> 5-6 thousand log messages. Even at 1,000 a week that's 6 weeks, even starting 
> now it would probably span the 7.3 release.
> This will probably be an 

[jira] [Commented] (SOLR-11934) Visit Solr logging, it's too noisy.

2018-01-31 Thread Gus Heck (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348111#comment-16348111
 ] 

Gus Heck commented on SOLR-11934:
-

First off, let me say I LIKE the heavy logging. When the $#!7 hits the fan, if 
someone tells me when it went wrong, with a default Solr install I have 
something to dig into to find out what went wrong. Often the answer is 
something like: "because you had two mutually exclusive filters selected" or 
"the record was added but not committed yet" or "That's because the system 
paused for a stop the world GC for 40 minutes" or "That looks like a pagination 
bug in the UI, they added a filter but didn't reset the pagination" or "the 
term you're searching for is in a field that's not indexed" Many of these 
things rely on knowing *exactly* what the query or order of events was from 
*Solr's* perspective (particularly vs what the customer +_thought_+ they sent 
for a query).

The case of a "very large system" where logging is burdensome is IMHO an edge 
case. Perhaps some sort of ready made profiles for this edge case should exist 
but there are almost always a whole lot more little fish than big fish in the 
sea.

That said, yes, the level of logging should be appropriate.  Since you asked 
for opinions, here's my opinion on levels:
 * FATAL - worthy of immediate attention (beeper, SMS, whatever),  zoo keeper 
unreachable, system fails to come up etc.
 * ERROR - Something serious, should be looked at in the morning or at least 
some time soon, may be prelude to FATAL showing up, or high likelihood of 
customer bug report, very rarely something that can be ignored.
 * WARN - Something questionable, possible backlog ticket, possibly ignore/turn 
off in some cases.
 * INFO - that which is likely to be useful figure out what led up to the ERROR 
that woke you at 3am... Also that which will be helpful in trouble shooting 
customer bug reports (yesterday at we got this empty result when there were 
definitely documents that should have matchedturns out to be a pagination 
bug). This definitely includes queries, updates, admin commands, gc logs etc.
 * DEBUG - That which might help in trouble shooting or finding a bug, or 
trouble shooting ane odd behavior. Rarely by admins, usually by devs
 * TRACE - Stuff only ever turned on by devs missing tufts of hair trying to 
figure out things like "what order did those 2 threads run X and Y in"

Possibly a separate ticket but related: Breaking things out into separate files 
with potentially different rotation frequencies, verbosities etc. seems like a 
decent idea. I only rarely want to see admin/updates/queries together, and none 
of those alongside system logging... GC is already separated, which is 
excellent. IMHO Turning off logging of queries/updates/whatever up front is 
premature optimization and presents a usability issue. Splitting things into 
more focused files probably helps most folks (except those who have been 
enjoying the firehose with log analysis tools I suppose) but that's just my 
opinion. If things are broken up it should be possible to restore the firehose 
for back compatibility.

The lack of query/update logs out of the box is one of the things that 
irritates me about Elastic FWIW.

Obviously it would be a different ticket, but maybe some additional admin UI 
controls  to augment the huge tree of loggers would be good? I'm thinking broad 
presets for default/quiet/verbose for each type of log. 

 

> Visit Solr logging, it's too noisy.
> ---
>
> Key: SOLR-11934
> URL: https://issues.apache.org/jira/browse/SOLR-11934
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
>
> I think we have way too much INFO level logging. Or, perhaps more correctly, 
> Solr logging needs to be examined and messages logged at an appropriate level.
> We log every update at an INFO level for instance. But I think we log LIR at 
> INFO as well. As a sysadmin I don't care to have my logs polluted with a 
> message for every update, but if I'm trying to keep my system healthy I want 
> to see LIR messages and try to understand why.
> Plus, in large installations logging at INFO level is creating a _LOT_ of 
> files.
> What I want to discuss on this JIRA is
> 1> What kinds of messages do we want log at WARN, INFO, DEBUG, and TRACE 
> levels?
> 2> Who's the audience at each level? For a running system that's functioning, 
> sysops folks would really like WARN messages that mean something need 
> attention for instance. If I'm troubleshooting should I turn on INFO? DEBUG? 
> TRACE?
> So let's say we get some kind of agreement as to the above. Then I propose 
> three things
> 1> Someone (and