[jira] [Commented] (SOLR-11934) Visit Solr logging, it's too noisy.
[ https://issues.apache.org/jira/browse/SOLR-11934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592747#comment-16592747 ] Shawn Heisey commented on SOLR-11934: - Side note. Probably needs its own issue. Doesn't impact the main log that this issue is concerned about, it's separate. I'm only commenting about it here because this issue is open and tangentially related, and I'm trying to get a sense as to whether or not I should create the new issue. I think we should enable the Jetty request log in the Solr download. The config is already there, it just needs to be uncommented.It even has config to delete old logs -- retention is set to 90 days, which might be too long. > Visit Solr logging, it's too noisy. > --- > > Key: SOLR-11934 > URL: https://issues.apache.org/jira/browse/SOLR-11934 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Erick Erickson >Assignee: Erick Erickson >Priority: Major > > I think we have way too much INFO level logging. Or, perhaps more correctly, > Solr logging needs to be examined and messages logged at an appropriate level. > We log every update at an INFO level for instance. But I think we log LIR at > INFO as well. As a sysadmin I don't care to have my logs polluted with a > message for every update, but if I'm trying to keep my system healthy I want > to see LIR messages and try to understand why. > Plus, in large installations logging at INFO level is creating a _LOT_ of > files. > What I want to discuss on this JIRA is > 1> What kinds of messages do we want log at WARN, INFO, DEBUG, and TRACE > levels? > 2> Who's the audience at each level? For a running system that's functioning, > sysops folks would really like WARN messages that mean something need > attention for instance. If I'm troubleshooting should I turn on INFO? DEBUG? > TRACE? > So let's say we get some kind of agreement as to the above. Then I propose > three things > 1> Someone (and probably me but all help gratefully accepted) needs to go > through our logging and assign appropriate levels. This will take quite a > while, I intend to work on it in small chunks. > 2> Actually answer whether unnecessary objects are created when something > like log.info("whatever {}", someObjectOrMethodCall); is invoked. Is this > independent on the logging implementation used? The SLF4J and log4j seem a > bit contradictory. > 3> Maybe regularize log, logger, LOG as variable names, but that's a nit. > As a tactical approach, I suggest we tag each LoggerFactory.getLogger in > files we work on with //SOLR-(whatever number is assigned when I create > this). We can remove them all later, but since I expect to approach this > piecemeal it'd be nice to keep track of which files have been done already. > Finally, I really really really don't want to do this all at once. There are > 5-6 thousand log messages. Even at 1,000 a week that's 6 weeks, even starting > now it would probably span the 7.3 release. > This will probably be an umbrella issue so we can keep all the commits > straight and people can volunteer to "fix the files in core" as a separate > piece of work (hint). > There are several existing JIRAs about logging in general, let's link them in > here as well. > Let the discussion begin! -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-11934) Visit Solr logging, it's too noisy.
[ https://issues.apache.org/jira/browse/SOLR-11934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592488#comment-16592488 ] Shawn Heisey commented on SOLR-11934: - I do like the notion of changing some of the classes in the config to WARN. It can make some real progress in the short term with not a lot of effort. Long term, I think we should try to adjust the overall choice of logging levels in the code so that the logging config needs fewer entries. If we do leave some entries at WARN, they should be entries that are *commonly* needed for deeper troubleshooting, but don't provide much value for a system that's working as expected. I'm not sure what classes those would be. > Visit Solr logging, it's too noisy. > --- > > Key: SOLR-11934 > URL: https://issues.apache.org/jira/browse/SOLR-11934 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Erick Erickson >Assignee: Erick Erickson >Priority: Major > > I think we have way too much INFO level logging. Or, perhaps more correctly, > Solr logging needs to be examined and messages logged at an appropriate level. > We log every update at an INFO level for instance. But I think we log LIR at > INFO as well. As a sysadmin I don't care to have my logs polluted with a > message for every update, but if I'm trying to keep my system healthy I want > to see LIR messages and try to understand why. > Plus, in large installations logging at INFO level is creating a _LOT_ of > files. > What I want to discuss on this JIRA is > 1> What kinds of messages do we want log at WARN, INFO, DEBUG, and TRACE > levels? > 2> Who's the audience at each level? For a running system that's functioning, > sysops folks would really like WARN messages that mean something need > attention for instance. If I'm troubleshooting should I turn on INFO? DEBUG? > TRACE? > So let's say we get some kind of agreement as to the above. Then I propose > three things > 1> Someone (and probably me but all help gratefully accepted) needs to go > through our logging and assign appropriate levels. This will take quite a > while, I intend to work on it in small chunks. > 2> Actually answer whether unnecessary objects are created when something > like log.info("whatever {}", someObjectOrMethodCall); is invoked. Is this > independent on the logging implementation used? The SLF4J and log4j seem a > bit contradictory. > 3> Maybe regularize log, logger, LOG as variable names, but that's a nit. > As a tactical approach, I suggest we tag each LoggerFactory.getLogger in > files we work on with //SOLR-(whatever number is assigned when I create > this). We can remove them all later, but since I expect to approach this > piecemeal it'd be nice to keep track of which files have been done already. > Finally, I really really really don't want to do this all at once. There are > 5-6 thousand log messages. Even at 1,000 a week that's 6 weeks, even starting > now it would probably span the 7.3 release. > This will probably be an umbrella issue so we can keep all the commits > straight and people can volunteer to "fix the files in core" as a separate > piece of work (hint). > There are several existing JIRAs about logging in general, let's link them in > here as well. > Let the discussion begin! -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-11934) Visit Solr logging, it's too noisy.
[ https://issues.apache.org/jira/browse/SOLR-11934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16591808#comment-16591808 ] David Smiley commented on SOLR-11934: - +1 to "change the configs to set the level of these classes to WARN as a default" > Visit Solr logging, it's too noisy. > --- > > Key: SOLR-11934 > URL: https://issues.apache.org/jira/browse/SOLR-11934 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Erick Erickson >Assignee: Erick Erickson >Priority: Major > > I think we have way too much INFO level logging. Or, perhaps more correctly, > Solr logging needs to be examined and messages logged at an appropriate level. > We log every update at an INFO level for instance. But I think we log LIR at > INFO as well. As a sysadmin I don't care to have my logs polluted with a > message for every update, but if I'm trying to keep my system healthy I want > to see LIR messages and try to understand why. > Plus, in large installations logging at INFO level is creating a _LOT_ of > files. > What I want to discuss on this JIRA is > 1> What kinds of messages do we want log at WARN, INFO, DEBUG, and TRACE > levels? > 2> Who's the audience at each level? For a running system that's functioning, > sysops folks would really like WARN messages that mean something need > attention for instance. If I'm troubleshooting should I turn on INFO? DEBUG? > TRACE? > So let's say we get some kind of agreement as to the above. Then I propose > three things > 1> Someone (and probably me but all help gratefully accepted) needs to go > through our logging and assign appropriate levels. This will take quite a > while, I intend to work on it in small chunks. > 2> Actually answer whether unnecessary objects are created when something > like log.info("whatever {}", someObjectOrMethodCall); is invoked. Is this > independent on the logging implementation used? The SLF4J and log4j seem a > bit contradictory. > 3> Maybe regularize log, logger, LOG as variable names, but that's a nit. > As a tactical approach, I suggest we tag each LoggerFactory.getLogger in > files we work on with //SOLR-(whatever number is assigned when I create > this). We can remove them all later, but since I expect to approach this > piecemeal it'd be nice to keep track of which files have been done already. > Finally, I really really really don't want to do this all at once. There are > 5-6 thousand log messages. Even at 1,000 a week that's 6 weeks, even starting > now it would probably span the 7.3 release. > This will probably be an umbrella issue so we can keep all the commits > straight and people can volunteer to "fix the files in core" as a separate > piece of work (hint). > There are several existing JIRAs about logging in general, let's link them in > here as well. > Let the discussion begin! -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-11934) Visit Solr logging, it's too noisy.
[ https://issues.apache.org/jira/browse/SOLR-11934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16589690#comment-16589690 ] Erick Erickson commented on SOLR-11934: --- OK, circling back around to this, I'm finally breaking some time loose. I looked at a few more logs and here's the list (it includes what's above) of classes that produce, by far, the largest percentage of messages: org.apache.solr.update.DirectUpdateHandler2; org.apache.solr.update.processor.LogUpdateProcessor; org.apache.solr.servlet.HttpSolrCall; (not sure how this one crept in here, may be peculiar to that client using HTTP connections) org.apache.solr.core.SolrCore; org.apache.solr.update.SolrIndexWriter; org.apache.solr.search.SolrIndexSearcher; So here's what I'm thinking now that there's a controlled number of logging config files. Rather than debate forever about when a message is a WARN or INFO or DEBUG or whatever, what about enabling/disabling specific class logging levels? Either * Move the frequent messages to DEBUG with a comment in the config about how to enable DEBUG level for these classes * change the configs to set the level of these classes to WARN as a default I like the latter best for two reasons: * it doesn't make someone have to dig around for "how to I change the config for these classes?" The answer is "either comment out the specific class levels in log4j2.xml or change WARN to INFO". * it has the least amount of code changes. Well, none really just config changes. As part of SOLR-11884 I'll be looking at pretty much all the log messages anyway and _may_ change some of the levels we log some messages at, but that's a tale for another day and we can argue about them there ;) > Visit Solr logging, it's too noisy. > --- > > Key: SOLR-11934 > URL: https://issues.apache.org/jira/browse/SOLR-11934 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Erick Erickson >Assignee: Erick Erickson >Priority: Major > > I think we have way too much INFO level logging. Or, perhaps more correctly, > Solr logging needs to be examined and messages logged at an appropriate level. > We log every update at an INFO level for instance. But I think we log LIR at > INFO as well. As a sysadmin I don't care to have my logs polluted with a > message for every update, but if I'm trying to keep my system healthy I want > to see LIR messages and try to understand why. > Plus, in large installations logging at INFO level is creating a _LOT_ of > files. > What I want to discuss on this JIRA is > 1> What kinds of messages do we want log at WARN, INFO, DEBUG, and TRACE > levels? > 2> Who's the audience at each level? For a running system that's functioning, > sysops folks would really like WARN messages that mean something need > attention for instance. If I'm troubleshooting should I turn on INFO? DEBUG? > TRACE? > So let's say we get some kind of agreement as to the above. Then I propose > three things > 1> Someone (and probably me but all help gratefully accepted) needs to go > through our logging and assign appropriate levels. This will take quite a > while, I intend to work on it in small chunks. > 2> Actually answer whether unnecessary objects are created when something > like log.info("whatever {}", someObjectOrMethodCall); is invoked. Is this > independent on the logging implementation used? The SLF4J and log4j seem a > bit contradictory. > 3> Maybe regularize log, logger, LOG as variable names, but that's a nit. > As a tactical approach, I suggest we tag each LoggerFactory.getLogger in > files we work on with //SOLR-(whatever number is assigned when I create > this). We can remove them all later, but since I expect to approach this > piecemeal it'd be nice to keep track of which files have been done already. > Finally, I really really really don't want to do this all at once. There are > 5-6 thousand log messages. Even at 1,000 a week that's 6 weeks, even starting > now it would probably span the 7.3 release. > This will probably be an umbrella issue so we can keep all the commits > straight and people can volunteer to "fix the files in core" as a separate > piece of work (hint). > There are several existing JIRAs about logging in general, let's link them in > here as well. > Let the discussion begin! -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-11934) Visit Solr logging, it's too noisy.
[ https://issues.apache.org/jira/browse/SOLR-11934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16467930#comment-16467930 ] Mark Miller commented on SOLR-11934: I think if we turn this off by default and document how to easily turn it back on, we don't have a problem. We never have promised back compat with our logging. What's more scary in SOLR-12055 is that you may think things are working as you expect for what can be a very sensitive function when they are not. If you actually count on certain logging from an issue like this though, when you upgrade you have to understand what logging exists for that version and how your tools work with what you expect - but at least you can count on the logging you as you would expect to if you see it in testing. > Visit Solr logging, it's too noisy. > --- > > Key: SOLR-11934 > URL: https://issues.apache.org/jira/browse/SOLR-11934 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Erick Erickson >Assignee: Erick Erickson >Priority: Major > > I think we have way too much INFO level logging. Or, perhaps more correctly, > Solr logging needs to be examined and messages logged at an appropriate level. > We log every update at an INFO level for instance. But I think we log LIR at > INFO as well. As a sysadmin I don't care to have my logs polluted with a > message for every update, but if I'm trying to keep my system healthy I want > to see LIR messages and try to understand why. > Plus, in large installations logging at INFO level is creating a _LOT_ of > files. > What I want to discuss on this JIRA is > 1> What kinds of messages do we want log at WARN, INFO, DEBUG, and TRACE > levels? > 2> Who's the audience at each level? For a running system that's functioning, > sysops folks would really like WARN messages that mean something need > attention for instance. If I'm troubleshooting should I turn on INFO? DEBUG? > TRACE? > So let's say we get some kind of agreement as to the above. Then I propose > three things > 1> Someone (and probably me but all help gratefully accepted) needs to go > through our logging and assign appropriate levels. This will take quite a > while, I intend to work on it in small chunks. > 2> Actually answer whether unnecessary objects are created when something > like log.info("whatever {}", someObjectOrMethodCall); is invoked. Is this > independent on the logging implementation used? The SLF4J and log4j seem a > bit contradictory. > 3> Maybe regularize log, logger, LOG as variable names, but that's a nit. > As a tactical approach, I suggest we tag each LoggerFactory.getLogger in > files we work on with //SOLR-(whatever number is assigned when I create > this). We can remove them all later, but since I expect to approach this > piecemeal it'd be nice to keep track of which files have been done already. > Finally, I really really really don't want to do this all at once. There are > 5-6 thousand log messages. Even at 1,000 a week that's 6 weeks, even starting > now it would probably span the 7.3 release. > This will probably be an umbrella issue so we can keep all the commits > straight and people can volunteer to "fix the files in core" as a separate > piece of work (hint). > There are several existing JIRAs about logging in general, let's link them in > here as well. > Let the discussion begin! -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-11934) Visit Solr logging, it's too noisy.
[ https://issues.apache.org/jira/browse/SOLR-11934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16467495#comment-16467495 ] Erick Erickson commented on SOLR-11934: --- [~markrmil...@gmail.com] Your comment on SOLR-12055 got me to wondering. Do we have to accommodate audit logging in this too? If so, I think it'd be a documentation issue, as you say in 12055 announce LOUDLY. Plus, that informs the question about what level we log individual messages at... > Visit Solr logging, it's too noisy. > --- > > Key: SOLR-11934 > URL: https://issues.apache.org/jira/browse/SOLR-11934 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Erick Erickson >Assignee: Erick Erickson >Priority: Major > > I think we have way too much INFO level logging. Or, perhaps more correctly, > Solr logging needs to be examined and messages logged at an appropriate level. > We log every update at an INFO level for instance. But I think we log LIR at > INFO as well. As a sysadmin I don't care to have my logs polluted with a > message for every update, but if I'm trying to keep my system healthy I want > to see LIR messages and try to understand why. > Plus, in large installations logging at INFO level is creating a _LOT_ of > files. > What I want to discuss on this JIRA is > 1> What kinds of messages do we want log at WARN, INFO, DEBUG, and TRACE > levels? > 2> Who's the audience at each level? For a running system that's functioning, > sysops folks would really like WARN messages that mean something need > attention for instance. If I'm troubleshooting should I turn on INFO? DEBUG? > TRACE? > So let's say we get some kind of agreement as to the above. Then I propose > three things > 1> Someone (and probably me but all help gratefully accepted) needs to go > through our logging and assign appropriate levels. This will take quite a > while, I intend to work on it in small chunks. > 2> Actually answer whether unnecessary objects are created when something > like log.info("whatever {}", someObjectOrMethodCall); is invoked. Is this > independent on the logging implementation used? The SLF4J and log4j seem a > bit contradictory. > 3> Maybe regularize log, logger, LOG as variable names, but that's a nit. > As a tactical approach, I suggest we tag each LoggerFactory.getLogger in > files we work on with //SOLR-(whatever number is assigned when I create > this). We can remove them all later, but since I expect to approach this > piecemeal it'd be nice to keep track of which files have been done already. > Finally, I really really really don't want to do this all at once. There are > 5-6 thousand log messages. Even at 1,000 a week that's 6 weeks, even starting > now it would probably span the 7.3 release. > This will probably be an umbrella issue so we can keep all the commits > straight and people can volunteer to "fix the files in core" as a separate > piece of work (hint). > There are several existing JIRAs about logging in general, let's link them in > here as well. > Let the discussion begin! -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-11934) Visit Solr logging, it's too noisy.
[ https://issues.apache.org/jira/browse/SOLR-11934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16465435#comment-16465435 ] Mark Miller commented on SOLR-11934: I'm pro reducing the amount of default index logging. It's very rarely been useful to me in debugging problems from a customer site. When you need that level of detail, it's usually when trying to trace things during a reproduction attempt or something. My experience is that all the logging makes actual debugging that you do on a productions systems logs much more difficult and I can't think of a case where I have used this info from production logs. > Visit Solr logging, it's too noisy. > --- > > Key: SOLR-11934 > URL: https://issues.apache.org/jira/browse/SOLR-11934 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Erick Erickson >Assignee: Erick Erickson >Priority: Major > > I think we have way too much INFO level logging. Or, perhaps more correctly, > Solr logging needs to be examined and messages logged at an appropriate level. > We log every update at an INFO level for instance. But I think we log LIR at > INFO as well. As a sysadmin I don't care to have my logs polluted with a > message for every update, but if I'm trying to keep my system healthy I want > to see LIR messages and try to understand why. > Plus, in large installations logging at INFO level is creating a _LOT_ of > files. > What I want to discuss on this JIRA is > 1> What kinds of messages do we want log at WARN, INFO, DEBUG, and TRACE > levels? > 2> Who's the audience at each level? For a running system that's functioning, > sysops folks would really like WARN messages that mean something need > attention for instance. If I'm troubleshooting should I turn on INFO? DEBUG? > TRACE? > So let's say we get some kind of agreement as to the above. Then I propose > three things > 1> Someone (and probably me but all help gratefully accepted) needs to go > through our logging and assign appropriate levels. This will take quite a > while, I intend to work on it in small chunks. > 2> Actually answer whether unnecessary objects are created when something > like log.info("whatever {}", someObjectOrMethodCall); is invoked. Is this > independent on the logging implementation used? The SLF4J and log4j seem a > bit contradictory. > 3> Maybe regularize log, logger, LOG as variable names, but that's a nit. > As a tactical approach, I suggest we tag each LoggerFactory.getLogger in > files we work on with //SOLR-(whatever number is assigned when I create > this). We can remove them all later, but since I expect to approach this > piecemeal it'd be nice to keep track of which files have been done already. > Finally, I really really really don't want to do this all at once. There are > 5-6 thousand log messages. Even at 1,000 a week that's 6 weeks, even starting > now it would probably span the 7.3 release. > This will probably be an umbrella issue so we can keep all the commits > straight and people can volunteer to "fix the files in core" as a separate > piece of work (hint). > There are several existing JIRAs about logging in general, let's link them in > here as well. > Let the discussion begin! -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-11934) Visit Solr logging, it's too noisy.
[ https://issues.apache.org/jira/browse/SOLR-11934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16464824#comment-16464824 ] Shawn Heisey commented on SOLR-11934: - [~erickerickson], I like your ideas. The query logging is also a significant contributor, *especially* on sharded indexes. SOLR-11453 got me thinking in the background about the possibility of sending query (technically, actually request, which includes query) logs to their own file as well. Updates do result in particularly verbose and numerous logs. I wonder if there's any value to extending the 'separate logfile' paradigm for the detailed information, with the summary info you suggested in the main log. The only disadvantage I can think of with separate logfiles for primary logging events, and the main notion that makes me *slightly* resistant to making that config default, is that it is very difficult to reconstruct a narrative of events that are logged at the same millisecond in those separate files. If they're all in the same file, it's usually very easy to determine the order of events. If we do make separate request/update logs a default config, the documentation must explain how to turn those separate logs off. That would be vital to troubleshooting issues that only show up under heavy load. > Visit Solr logging, it's too noisy. > --- > > Key: SOLR-11934 > URL: https://issues.apache.org/jira/browse/SOLR-11934 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Erick Erickson >Assignee: Erick Erickson >Priority: Major > > I think we have way too much INFO level logging. Or, perhaps more correctly, > Solr logging needs to be examined and messages logged at an appropriate level. > We log every update at an INFO level for instance. But I think we log LIR at > INFO as well. As a sysadmin I don't care to have my logs polluted with a > message for every update, but if I'm trying to keep my system healthy I want > to see LIR messages and try to understand why. > Plus, in large installations logging at INFO level is creating a _LOT_ of > files. > What I want to discuss on this JIRA is > 1> What kinds of messages do we want log at WARN, INFO, DEBUG, and TRACE > levels? > 2> Who's the audience at each level? For a running system that's functioning, > sysops folks would really like WARN messages that mean something need > attention for instance. If I'm troubleshooting should I turn on INFO? DEBUG? > TRACE? > So let's say we get some kind of agreement as to the above. Then I propose > three things > 1> Someone (and probably me but all help gratefully accepted) needs to go > through our logging and assign appropriate levels. This will take quite a > while, I intend to work on it in small chunks. > 2> Actually answer whether unnecessary objects are created when something > like log.info("whatever {}", someObjectOrMethodCall); is invoked. Is this > independent on the logging implementation used? The SLF4J and log4j seem a > bit contradictory. > 3> Maybe regularize log, logger, LOG as variable names, but that's a nit. > As a tactical approach, I suggest we tag each LoggerFactory.getLogger in > files we work on with //SOLR-(whatever number is assigned when I create > this). We can remove them all later, but since I expect to approach this > piecemeal it'd be nice to keep track of which files have been done already. > Finally, I really really really don't want to do this all at once. There are > 5-6 thousand log messages. Even at 1,000 a week that's 6 weeks, even starting > now it would probably span the 7.3 release. > This will probably be an umbrella issue so we can keep all the commits > straight and people can volunteer to "fix the files in core" as a separate > piece of work (hint). > There are several existing JIRAs about logging in general, let's link them in > here as well. > Let the discussion begin! -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-11934) Visit Solr logging, it's too noisy.
[ https://issues.apache.org/jira/browse/SOLR-11934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16464575#comment-16464575 ] Erick Erickson commented on SOLR-11934: --- I looked at sample logs from a client and wrote a little program to bucket them. NOTE: this was an index-heavy type application. Here are the number of log messages originating in these files: 35 occurrences in org.apache.solr.common.cloud.ZkStateReader$StateWatcher; 48 occurrences in org.apache.solr.core.QuerySenderListener; 55 occurrences in org.apache.solr.servlet.HttpSolrCall; 64388 occurrences in org.apache.solr.update.SolrIndexWriter; 64413 occurrences in org.apache.solr.search.SolrIndexSearcher; 64601 occurrences in org.apache.solr.core.SolrCore; 128795 occurrences in org.apache.solr.update.DirectUpdateHandler2; 3345415 occurrences in org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor; that got me to thinking about the _value_ of certain messages when troubleshooting. Is the value in the individual messages or in getting a sense of how things are behaving? So WDYT about logging some summary information instead of individual messages for high-cardinality messages? Rather than logging every update at info level, would it be "useful enough" when troubleshooting to log a summary of updates' vital statistics every N seconds? Here's what I'm thinking. Pretty often when troubleshooting, we want to correlate heavy request activity for Solr to something else (GC, slowdowns, recoveries, whatever). Using updates as an example, what if at info level, we logged something like INFO - 2018-05-04 03:35:09.206 over the last T seconds there were N update requests updating D documents taking min/avg/median/max milliseconds to complete with min/average/median/max documents per request. With the new metrics capabilities, that information may be available already, along with lots of other information (percentiles and the like). If seeing the underlying individual requests were important, one could turn on finer-grained logging for this class only. We could even supply commented-out ways to enable voluminous messages for anything we choose to write summary information out in the log4j2 conifgs. I totally understand that for gnarly problems you want (and need) voluminous output. I totally understand that sometimes you can't predict that up front. I totally understand that in some cases this would introduce an extra round-trip, "you have to enable this option in your logging configuration, wait for it to happen again and call me". I also totally understand that in large Solr operations, the log files are very, very hard to use because there's so much information being dumped by potentially hundreds of Solr instances. I also totally understand that I've spend more time than I like to remember writing programs to parse log files to give me summary information like the above to know where to start looking for problems. Why not have Solr do it? This example is just for updates, but queries could do something very similar along with whatever else fits this pattern. Queries certainly come to mind. The question I'm really asking is if we're logging intelligently or not. Basically we're throwing everything that anyone _might_ be interested into a log file and then requiring the users (and consultants) to figure out how to make it useful. We know enough about how this information is used I think to do some of it up front, always with the option to drop back to the finer-grained information when necessary. Anyway, what do people think? > Visit Solr logging, it's too noisy. > --- > > Key: SOLR-11934 > URL: https://issues.apache.org/jira/browse/SOLR-11934 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Erick Erickson >Assignee: Erick Erickson >Priority: Major > > I think we have way too much INFO level logging. Or, perhaps more correctly, > Solr logging needs to be examined and messages logged at an appropriate level. > We log every update at an INFO level for instance. But I think we log LIR at > INFO as well. As a sysadmin I don't care to have my logs polluted with a > message for every update, but if I'm trying to keep my system healthy I want > to see LIR messages and try to understand why. > Plus, in large installations logging at INFO level is creating a _LOT_ of > files. > What I want to discuss on this JIRA is > 1> What kinds of messages do we want log at WARN, INFO, DEBUG, and TRACE > levels? > 2> Who's the audience at each level? For a running system that's functioning, > sysops folks would really like WARN messages that mean something need > attention for instance. If
[jira] [Commented] (SOLR-11934) Visit Solr logging, it's too noisy.
[ https://issues.apache.org/jira/browse/SOLR-11934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16415596#comment-16415596 ] Jan Høydahl commented on SOLR-11934: Thanks for the tip! > Visit Solr logging, it's too noisy. > --- > > Key: SOLR-11934 > URL: https://issues.apache.org/jira/browse/SOLR-11934 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Erick Erickson >Assignee: Erick Erickson >Priority: Major > > I think we have way too much INFO level logging. Or, perhaps more correctly, > Solr logging needs to be examined and messages logged at an appropriate level. > We log every update at an INFO level for instance. But I think we log LIR at > INFO as well. As a sysadmin I don't care to have my logs polluted with a > message for every update, but if I'm trying to keep my system healthy I want > to see LIR messages and try to understand why. > Plus, in large installations logging at INFO level is creating a _LOT_ of > files. > What I want to discuss on this JIRA is > 1> What kinds of messages do we want log at WARN, INFO, DEBUG, and TRACE > levels? > 2> Who's the audience at each level? For a running system that's functioning, > sysops folks would really like WARN messages that mean something need > attention for instance. If I'm troubleshooting should I turn on INFO? DEBUG? > TRACE? > So let's say we get some kind of agreement as to the above. Then I propose > three things > 1> Someone (and probably me but all help gratefully accepted) needs to go > through our logging and assign appropriate levels. This will take quite a > while, I intend to work on it in small chunks. > 2> Actually answer whether unnecessary objects are created when something > like log.info("whatever {}", someObjectOrMethodCall); is invoked. Is this > independent on the logging implementation used? The SLF4J and log4j seem a > bit contradictory. > 3> Maybe regularize log, logger, LOG as variable names, but that's a nit. > As a tactical approach, I suggest we tag each LoggerFactory.getLogger in > files we work on with //SOLR-(whatever number is assigned when I create > this). We can remove them all later, but since I expect to approach this > piecemeal it'd be nice to keep track of which files have been done already. > Finally, I really really really don't want to do this all at once. There are > 5-6 thousand log messages. Even at 1,000 a week that's 6 weeks, even starting > now it would probably span the 7.3 release. > This will probably be an umbrella issue so we can keep all the commits > straight and people can volunteer to "fix the files in core" as a separate > piece of work (hint). > There are several existing JIRAs about logging in general, let's link them in > here as well. > Let the discussion begin! -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-11934) Visit Solr logging, it's too noisy.
[ https://issues.apache.org/jira/browse/SOLR-11934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16415556#comment-16415556 ] Shawn Heisey commented on SOLR-11934: - bq. It would also be good if we could change request logging from being done in SolrCore to some other class SolrRequests so that it is easy to redirect normal query/update logs to a separate log file and keeping system-level logs in solr.log. Any thoughts? Even though request logging is actually done in the SolrCore class, it's already possible to handle it differently in the log4j config. The logger used for requests is org.apache.solr.core.SolrCore.Request: {code:java} private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass()); public static final Logger requestLog = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass().getName() + ".Request"); {code} Side note: Is there actually any reason to make the request logger public like it is now? I can't think of one. It's already possible for users writing their own code to access ANY logger in the Solr ecosystem by name. > Visit Solr logging, it's too noisy. > --- > > Key: SOLR-11934 > URL: https://issues.apache.org/jira/browse/SOLR-11934 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Erick Erickson >Assignee: Erick Erickson >Priority: Major > > I think we have way too much INFO level logging. Or, perhaps more correctly, > Solr logging needs to be examined and messages logged at an appropriate level. > We log every update at an INFO level for instance. But I think we log LIR at > INFO as well. As a sysadmin I don't care to have my logs polluted with a > message for every update, but if I'm trying to keep my system healthy I want > to see LIR messages and try to understand why. > Plus, in large installations logging at INFO level is creating a _LOT_ of > files. > What I want to discuss on this JIRA is > 1> What kinds of messages do we want log at WARN, INFO, DEBUG, and TRACE > levels? > 2> Who's the audience at each level? For a running system that's functioning, > sysops folks would really like WARN messages that mean something need > attention for instance. If I'm troubleshooting should I turn on INFO? DEBUG? > TRACE? > So let's say we get some kind of agreement as to the above. Then I propose > three things > 1> Someone (and probably me but all help gratefully accepted) needs to go > through our logging and assign appropriate levels. This will take quite a > while, I intend to work on it in small chunks. > 2> Actually answer whether unnecessary objects are created when something > like log.info("whatever {}", someObjectOrMethodCall); is invoked. Is this > independent on the logging implementation used? The SLF4J and log4j seem a > bit contradictory. > 3> Maybe regularize log, logger, LOG as variable names, but that's a nit. > As a tactical approach, I suggest we tag each LoggerFactory.getLogger in > files we work on with //SOLR-(whatever number is assigned when I create > this). We can remove them all later, but since I expect to approach this > piecemeal it'd be nice to keep track of which files have been done already. > Finally, I really really really don't want to do this all at once. There are > 5-6 thousand log messages. Even at 1,000 a week that's 6 weeks, even starting > now it would probably span the 7.3 release. > This will probably be an umbrella issue so we can keep all the commits > straight and people can volunteer to "fix the files in core" as a separate > piece of work (hint). > There are several existing JIRAs about logging in general, let's link them in > here as well. > Let the discussion begin! -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-11934) Visit Solr logging, it's too noisy.
[ https://issues.apache.org/jira/browse/SOLR-11934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16415339#comment-16415339 ] Jan Høydahl commented on SOLR-11934: Remember that the start scripts has support for {{-q}} option for setting log level to WARN on startup, and {{-v}} for setting it to DEBUG. You could also set a variable {{SOLR_LOG_LEVEL=ERROR}} if you require something else. I think the ref guide mentions this. But I agree that there are probably still many places where we log to the wrong level, so a global cleanup of this would be great. It would also be good if we could change request logging from being done in {{SolrCore}} to some other class {{SolrRequests}} so that it is easy to redirect normal query/update logs to a separate log file and keeping system-level logs in solr.log. Any thoughts? > Visit Solr logging, it's too noisy. > --- > > Key: SOLR-11934 > URL: https://issues.apache.org/jira/browse/SOLR-11934 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Erick Erickson >Assignee: Erick Erickson >Priority: Major > > I think we have way too much INFO level logging. Or, perhaps more correctly, > Solr logging needs to be examined and messages logged at an appropriate level. > We log every update at an INFO level for instance. But I think we log LIR at > INFO as well. As a sysadmin I don't care to have my logs polluted with a > message for every update, but if I'm trying to keep my system healthy I want > to see LIR messages and try to understand why. > Plus, in large installations logging at INFO level is creating a _LOT_ of > files. > What I want to discuss on this JIRA is > 1> What kinds of messages do we want log at WARN, INFO, DEBUG, and TRACE > levels? > 2> Who's the audience at each level? For a running system that's functioning, > sysops folks would really like WARN messages that mean something need > attention for instance. If I'm troubleshooting should I turn on INFO? DEBUG? > TRACE? > So let's say we get some kind of agreement as to the above. Then I propose > three things > 1> Someone (and probably me but all help gratefully accepted) needs to go > through our logging and assign appropriate levels. This will take quite a > while, I intend to work on it in small chunks. > 2> Actually answer whether unnecessary objects are created when something > like log.info("whatever {}", someObjectOrMethodCall); is invoked. Is this > independent on the logging implementation used? The SLF4J and log4j seem a > bit contradictory. > 3> Maybe regularize log, logger, LOG as variable names, but that's a nit. > As a tactical approach, I suggest we tag each LoggerFactory.getLogger in > files we work on with //SOLR-(whatever number is assigned when I create > this). We can remove them all later, but since I expect to approach this > piecemeal it'd be nice to keep track of which files have been done already. > Finally, I really really really don't want to do this all at once. There are > 5-6 thousand log messages. Even at 1,000 a week that's 6 weeks, even starting > now it would probably span the 7.3 release. > This will probably be an umbrella issue so we can keep all the commits > straight and people can volunteer to "fix the files in core" as a separate > piece of work (hint). > There are several existing JIRAs about logging in general, let's link them in > here as well. > Let the discussion begin! -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-11934) Visit Solr logging, it's too noisy.
[ https://issues.apache.org/jira/browse/SOLR-11934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16414979#comment-16414979 ] Shawn Heisey commented on SOLR-11934: - I wrote a lengthy comment on SOLR-7887, with some details that might help with deciding what the default logfile rotation size should be. See the comment on that issue for full gory details. Erick reminded me about this issue as the correct place to discuss it. A summary, part of that long comment: The 4MB log covered times from 18:51:26.552 to 18:57:58.139, about six and a half minutes. The 32MB log covered times from 18:01:23.375 to 18:58:13.886, which is almost an hour. This server, running version 4.7.2, has a pretty low query rate, but each user query is quite large in terms URL parameter size -- filters are added by the application to limit what the user can see according to their permission level. The log is dominated by the ping requests from the load balancer and those very large user queries. It is my opinion that those logs should NOT be moved out of the INFO level. I don't have any newer versions with production usage, so I can't comment on what will happen when running a version where attempts have been made to reduce the noise level in the logs. Based on what I found when I examined my logs, I think the default logfile rotation size needs to be fairly large. The current 4MB is certainly too small. Perhaps 64MB or 128MB? We'll need to see if we can ask some users varying traffic levels to provide us some information about their logfiles, to try and come up with a number that will preserve a decent amount of history for a "typical" install, without consuming REALLY large amounts of disk space. > Visit Solr logging, it's too noisy. > --- > > Key: SOLR-11934 > URL: https://issues.apache.org/jira/browse/SOLR-11934 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Erick Erickson >Assignee: Erick Erickson >Priority: Major > > I think we have way too much INFO level logging. Or, perhaps more correctly, > Solr logging needs to be examined and messages logged at an appropriate level. > We log every update at an INFO level for instance. But I think we log LIR at > INFO as well. As a sysadmin I don't care to have my logs polluted with a > message for every update, but if I'm trying to keep my system healthy I want > to see LIR messages and try to understand why. > Plus, in large installations logging at INFO level is creating a _LOT_ of > files. > What I want to discuss on this JIRA is > 1> What kinds of messages do we want log at WARN, INFO, DEBUG, and TRACE > levels? > 2> Who's the audience at each level? For a running system that's functioning, > sysops folks would really like WARN messages that mean something need > attention for instance. If I'm troubleshooting should I turn on INFO? DEBUG? > TRACE? > So let's say we get some kind of agreement as to the above. Then I propose > three things > 1> Someone (and probably me but all help gratefully accepted) needs to go > through our logging and assign appropriate levels. This will take quite a > while, I intend to work on it in small chunks. > 2> Actually answer whether unnecessary objects are created when something > like log.info("whatever {}", someObjectOrMethodCall); is invoked. Is this > independent on the logging implementation used? The SLF4J and log4j seem a > bit contradictory. > 3> Maybe regularize log, logger, LOG as variable names, but that's a nit. > As a tactical approach, I suggest we tag each LoggerFactory.getLogger in > files we work on with //SOLR-(whatever number is assigned when I create > this). We can remove them all later, but since I expect to approach this > piecemeal it'd be nice to keep track of which files have been done already. > Finally, I really really really don't want to do this all at once. There are > 5-6 thousand log messages. Even at 1,000 a week that's 6 weeks, even starting > now it would probably span the 7.3 release. > This will probably be an umbrella issue so we can keep all the commits > straight and people can volunteer to "fix the files in core" as a separate > piece of work (hint). > There are several existing JIRAs about logging in general, let's link them in > here as well. > Let the discussion begin! -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-11934) Visit Solr logging, it's too noisy.
[ https://issues.apache.org/jira/browse/SOLR-11934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371524#comment-16371524 ] Gus Heck commented on SOLR-11934: - Forgot slf4j didn't have that level (though note it can be added in... sort of..: https://www.slf4j.org/faq.html#fatal). Without that, yeah error has to include the panic button. I tend to use log4j2 in most of my projects :), so of course I would not mind it here too, but yeah master only, and as much as I like it, such a change would probably be no minor thing and need clear a clear cost/benefit analysis. > Visit Solr logging, it's too noisy. > --- > > Key: SOLR-11934 > URL: https://issues.apache.org/jira/browse/SOLR-11934 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Erick Erickson >Assignee: Erick Erickson >Priority: Major > > I think we have way too much INFO level logging. Or, perhaps more correctly, > Solr logging needs to be examined and messages logged at an appropriate level. > We log every update at an INFO level for instance. But I think we log LIR at > INFO as well. As a sysadmin I don't care to have my logs polluted with a > message for every update, but if I'm trying to keep my system healthy I want > to see LIR messages and try to understand why. > Plus, in large installations logging at INFO level is creating a _LOT_ of > files. > What I want to discuss on this JIRA is > 1> What kinds of messages do we want log at WARN, INFO, DEBUG, and TRACE > levels? > 2> Who's the audience at each level? For a running system that's functioning, > sysops folks would really like WARN messages that mean something need > attention for instance. If I'm troubleshooting should I turn on INFO? DEBUG? > TRACE? > So let's say we get some kind of agreement as to the above. Then I propose > three things > 1> Someone (and probably me but all help gratefully accepted) needs to go > through our logging and assign appropriate levels. This will take quite a > while, I intend to work on it in small chunks. > 2> Actually answer whether unnecessary objects are created when something > like log.info("whatever {}", someObjectOrMethodCall); is invoked. Is this > independent on the logging implementation used? The SLF4J and log4j seem a > bit contradictory. > 3> Maybe regularize log, logger, LOG as variable names, but that's a nit. > As a tactical approach, I suggest we tag each LoggerFactory.getLogger in > files we work on with //SOLR-(whatever number is assigned when I create > this). We can remove them all later, but since I expect to approach this > piecemeal it'd be nice to keep track of which files have been done already. > Finally, I really really really don't want to do this all at once. There are > 5-6 thousand log messages. Even at 1,000 a week that's 6 weeks, even starting > now it would probably span the 7.3 release. > This will probably be an umbrella issue so we can keep all the commits > straight and people can volunteer to "fix the files in core" as a separate > piece of work (hint). > There are several existing JIRAs about logging in general, let's link them in > here as well. > Let the discussion begin! -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-11934) Visit Solr logging, it's too noisy.
[ https://issues.apache.org/jira/browse/SOLR-11934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371042#comment-16371042 ] Shawn Heisey commented on SOLR-11934: - Migrating from slf4j to log4j2, if that idea has support, should only happen in master. I think it's too drastic a change for a minor release. > Visit Solr logging, it's too noisy. > --- > > Key: SOLR-11934 > URL: https://issues.apache.org/jira/browse/SOLR-11934 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Erick Erickson >Assignee: Erick Erickson >Priority: Major > > I think we have way too much INFO level logging. Or, perhaps more correctly, > Solr logging needs to be examined and messages logged at an appropriate level. > We log every update at an INFO level for instance. But I think we log LIR at > INFO as well. As a sysadmin I don't care to have my logs polluted with a > message for every update, but if I'm trying to keep my system healthy I want > to see LIR messages and try to understand why. > Plus, in large installations logging at INFO level is creating a _LOT_ of > files. > What I want to discuss on this JIRA is > 1> What kinds of messages do we want log at WARN, INFO, DEBUG, and TRACE > levels? > 2> Who's the audience at each level? For a running system that's functioning, > sysops folks would really like WARN messages that mean something need > attention for instance. If I'm troubleshooting should I turn on INFO? DEBUG? > TRACE? > So let's say we get some kind of agreement as to the above. Then I propose > three things > 1> Someone (and probably me but all help gratefully accepted) needs to go > through our logging and assign appropriate levels. This will take quite a > while, I intend to work on it in small chunks. > 2> Actually answer whether unnecessary objects are created when something > like log.info("whatever {}", someObjectOrMethodCall); is invoked. Is this > independent on the logging implementation used? The SLF4J and log4j seem a > bit contradictory. > 3> Maybe regularize log, logger, LOG as variable names, but that's a nit. > As a tactical approach, I suggest we tag each LoggerFactory.getLogger in > files we work on with //SOLR-(whatever number is assigned when I create > this). We can remove them all later, but since I expect to approach this > piecemeal it'd be nice to keep track of which files have been done already. > Finally, I really really really don't want to do this all at once. There are > 5-6 thousand log messages. Even at 1,000 a week that's 6 weeks, even starting > now it would probably span the 7.3 release. > This will probably be an umbrella issue so we can keep all the commits > straight and people can volunteer to "fix the files in core" as a separate > piece of work (hint). > There are several existing JIRAs about logging in general, let's link them in > here as well. > Let the discussion begin! -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-11934) Visit Solr logging, it's too noisy.
[ https://issues.apache.org/jira/browse/SOLR-11934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371040#comment-16371040 ] Shawn Heisey commented on SOLR-11934: - Thought stream: I think the default logfile rollover size of 4MB is WAY too small. On a busy server, it can cycle through all ten logfiles in a hurry and what you're looking for may not even be available. I increase this to 4GB on my installs, but this is perhaps far too large for a default. I think it does need to be increased. I wonder if 100MB is too large. 64MB? Should we switch to time-based rollover instead? Because the time-based logs have variable filenames, scripting becomes more difficult. And on a busy server, one day of logging could be REALLY huge. I'm wary of splitting the logging into separate files. It sounds like a good idea to control verbosity, until you discover that you need to see the order of seven different log entries that all happened in the same millisecond, and because they're in multiple files, you have no idea what order they occurred. The solution to that would probably lead to duplication of logging entries in multiple files, so we're back to a situation where too much data is being logged. Logging into a core/collection is an interesting idea, especially if there is a UI for querying it, and the problem of recursive logging storms can be prevented. (probably by logging everything for that core/collection ONLY to a logfile) Without doing some research, I'm neutral on the subject of intricate log4j2 config details and how to use MDC more effectively. Minor point: I disagree with the assertion [~gus_heck] made that ERROR can be looked at in the morning. I would use that phrase to describe WARN. I don't see any ability in slf4j to actually log at FATAL, and I've never seen any logs from Solr at that level. The highest severity it seems to give the developer is ERROR. I think there are probably some problems that should be logged at this level. The log4j API has FATAL, but we're not using log4j in the code, only slf4j. Perhaps as an Apache project, we should be eating our own dogfood, not using slf4j. What do people think about the idea of a new issue that migrates slf4j to log4j2 and configures it programmatically from solr.xml? The config could have a few very simple options (some for size-based rotation like we have now, some for time-based rotation, etc). I wonder if there's possibly a way that we could tell it to configure from a log4j.xml file instead of those easy options, so expert users can still have complete control over the logging with all the capability that log4j normally has. > Visit Solr logging, it's too noisy. > --- > > Key: SOLR-11934 > URL: https://issues.apache.org/jira/browse/SOLR-11934 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Erick Erickson >Assignee: Erick Erickson >Priority: Major > > I think we have way too much INFO level logging. Or, perhaps more correctly, > Solr logging needs to be examined and messages logged at an appropriate level. > We log every update at an INFO level for instance. But I think we log LIR at > INFO as well. As a sysadmin I don't care to have my logs polluted with a > message for every update, but if I'm trying to keep my system healthy I want > to see LIR messages and try to understand why. > Plus, in large installations logging at INFO level is creating a _LOT_ of > files. > What I want to discuss on this JIRA is > 1> What kinds of messages do we want log at WARN, INFO, DEBUG, and TRACE > levels? > 2> Who's the audience at each level? For a running system that's functioning, > sysops folks would really like WARN messages that mean something need > attention for instance. If I'm troubleshooting should I turn on INFO? DEBUG? > TRACE? > So let's say we get some kind of agreement as to the above. Then I propose > three things > 1> Someone (and probably me but all help gratefully accepted) needs to go > through our logging and assign appropriate levels. This will take quite a > while, I intend to work on it in small chunks. > 2> Actually answer whether unnecessary objects are created when something > like log.info("whatever {}", someObjectOrMethodCall); is invoked. Is this > independent on the logging implementation used? The SLF4J and log4j seem a > bit contradictory. > 3> Maybe regularize log, logger, LOG as variable names, but that's a nit. > As a tactical approach, I suggest we tag each LoggerFactory.getLogger in > files we work on with //SOLR-(whatever number is assigned when I create > this). We can remove them all later, but since I expect to approach this > piecemeal it'd be nice to keep track of which files have
[jira] [Commented] (SOLR-11934) Visit Solr logging, it's too noisy.
[ https://issues.apache.org/jira/browse/SOLR-11934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349051#comment-16349051 ] Michael Braun commented on SOLR-11934: -- Regarding the %C in the pattern, is it not practical to use %c instead? (logger name as opposed to the class calling the logger) > Visit Solr logging, it's too noisy. > --- > > Key: SOLR-11934 > URL: https://issues.apache.org/jira/browse/SOLR-11934 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Erick Erickson >Assignee: Erick Erickson >Priority: Major > > I think we have way too much INFO level logging. Or, perhaps more correctly, > Solr logging needs to be examined and messages logged at an appropriate level. > We log every update at an INFO level for instance. But I think we log LIR at > INFO as well. As a sysadmin I don't care to have my logs polluted with a > message for every update, but if I'm trying to keep my system healthy I want > to see LIR messages and try to understand why. > Plus, in large installations logging at INFO level is creating a _LOT_ of > files. > What I want to discuss on this JIRA is > 1> What kinds of messages do we want log at WARN, INFO, DEBUG, and TRACE > levels? > 2> Who's the audience at each level? For a running system that's functioning, > sysops folks would really like WARN messages that mean something need > attention for instance. If I'm troubleshooting should I turn on INFO? DEBUG? > TRACE? > So let's say we get some kind of agreement as to the above. Then I propose > three things > 1> Someone (and probably me but all help gratefully accepted) needs to go > through our logging and assign appropriate levels. This will take quite a > while, I intend to work on it in small chunks. > 2> Actually answer whether unnecessary objects are created when something > like log.info("whatever {}", someObjectOrMethodCall); is invoked. Is this > independent on the logging implementation used? The SLF4J and log4j seem a > bit contradictory. > 3> Maybe regularize log, logger, LOG as variable names, but that's a nit. > As a tactical approach, I suggest we tag each LoggerFactory.getLogger in > files we work on with //SOLR-(whatever number is assigned when I create > this). We can remove them all later, but since I expect to approach this > piecemeal it'd be nice to keep track of which files have been done already. > Finally, I really really really don't want to do this all at once. There are > 5-6 thousand log messages. Even at 1,000 a week that's 6 weeks, even starting > now it would probably span the 7.3 release. > This will probably be an umbrella issue so we can keep all the commits > straight and people can volunteer to "fix the files in core" as a separate > piece of work (hint). > There are several existing JIRAs about logging in general, let's link them in > here as well. > Let the discussion begin! -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-11934) Visit Solr logging, it's too noisy.
[ https://issues.apache.org/jira/browse/SOLR-11934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349031#comment-16349031 ] Gus Heck commented on SOLR-11934: - Yeah, logs having already rolled off is the flip side of the problem of course. An idea rolling around in the back of my head for some time after Dave and I get Time Routed Aliases fully sorted out is a feature to turn on self-indexing of logs (into a TRA of course) to make our own logs searchable. One component of that idea might be to separate logging by collection or groups of collections. It would be interesting to have a collection parameter that defines the logging context for that collection, which then results in a sub-dir containing just the logging for operations on the collections with that context. Obviously that's non-trivial and other ticket etc. WRT searchable logs that would be needed to avoid an endless loop of indexing the logs of indexing the logs of indexing the logs etc... I suspect %C could get better in Java 9, if the stack walking api can be leveraged by log4j2... but yeah probably that should be off by default. Benchmarking with/without would be interesting (when did it get added? maybe the benchmarks were done at at that time) I'm also wondering if there might be a way to use Markers or MDC to categorize logging statements in ways orthogonal to levels. The idea would be to make it easy to filter out (or focus in on) overseer message processing stuff or whatnot... Of course much of the above has to wait for SOLR-7887 ... > Visit Solr logging, it's too noisy. > --- > > Key: SOLR-11934 > URL: https://issues.apache.org/jira/browse/SOLR-11934 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Erick Erickson >Assignee: Erick Erickson >Priority: Major > > I think we have way too much INFO level logging. Or, perhaps more correctly, > Solr logging needs to be examined and messages logged at an appropriate level. > We log every update at an INFO level for instance. But I think we log LIR at > INFO as well. As a sysadmin I don't care to have my logs polluted with a > message for every update, but if I'm trying to keep my system healthy I want > to see LIR messages and try to understand why. > Plus, in large installations logging at INFO level is creating a _LOT_ of > files. > What I want to discuss on this JIRA is > 1> What kinds of messages do we want log at WARN, INFO, DEBUG, and TRACE > levels? > 2> Who's the audience at each level? For a running system that's functioning, > sysops folks would really like WARN messages that mean something need > attention for instance. If I'm troubleshooting should I turn on INFO? DEBUG? > TRACE? > So let's say we get some kind of agreement as to the above. Then I propose > three things > 1> Someone (and probably me but all help gratefully accepted) needs to go > through our logging and assign appropriate levels. This will take quite a > while, I intend to work on it in small chunks. > 2> Actually answer whether unnecessary objects are created when something > like log.info("whatever {}", someObjectOrMethodCall); is invoked. Is this > independent on the logging implementation used? The SLF4J and log4j seem a > bit contradictory. > 3> Maybe regularize log, logger, LOG as variable names, but that's a nit. > As a tactical approach, I suggest we tag each LoggerFactory.getLogger in > files we work on with //SOLR-(whatever number is assigned when I create > this). We can remove them all later, but since I expect to approach this > piecemeal it'd be nice to keep track of which files have been done already. > Finally, I really really really don't want to do this all at once. There are > 5-6 thousand log messages. Even at 1,000 a week that's 6 weeks, even starting > now it would probably span the 7.3 release. > This will probably be an umbrella issue so we can keep all the commits > straight and people can volunteer to "fix the files in core" as a separate > piece of work (hint). > There are several existing JIRAs about logging in general, let's link them in > here as well. > Let the discussion begin! -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-11934) Visit Solr logging, it's too noisy.
[ https://issues.apache.org/jira/browse/SOLR-11934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348842#comment-16348842 ] Erick Erickson commented on SOLR-11934: --- bq: if someone tells me when it went wrong, with a default Solr install I have something to dig into to find out what went wrong Yeah, that's the tension, isn't it? When it does hit the fan you want _everything_, and it's awkward to say "Sorry, the info we need isn't in the logs, change your logging and next time it happens immediately zip up all the log files and call me..." Unfortunately with the defaults I find myself saying the above all the time. Solr produces so much information that the logs roll over pretty quickly under any kind of load, especially indexing. So the info I want is often gone anyway. Which leads to another discussion about the default configuration ;) I suspect this going to morph into a wider discussion about "logging in general". It's grown ad-hoc so far, your comments about "is there a query log" is a question I've heard from clients and on the user's list more than a few times for instance. I don't particularly care if the defaults are producing lots and lots and lots of info, At present, though, we don't rationally group log messages, they're mostly INFO. The important stuff is buried in a sea of messages. I can't grep for WARN and be confident I'll see messages that matter for instance. Of course "what matters" often is something arcane ;)... If the consensus is to move a lot of our INFO messages to DEBUG and distribute OOB with DEBUG enabled, that's fine as well I have about 4 different versions of my Java program that I use to tease out different "stuff" from the log file, mostly just to ignore noise. Which noise I need to filter out changes with the problem unfortunately ;) As an example of how logging has grown without any supervision... Did you know that the "%C" bit in the layout pattern generates an exception for each and every log line issued? Useful information to be sure when trying to analyze a problem but costly steady-state. Not only do we generate a stack trace, but also allocate memory, contribute to GC, etc. That's an example of how we've viewed logging from a development/troubleshooting PoV without considering the needs of the operations folks. Much of this discussion will be about thinking about this from an operations viewpoint too. > Visit Solr logging, it's too noisy. > --- > > Key: SOLR-11934 > URL: https://issues.apache.org/jira/browse/SOLR-11934 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Erick Erickson >Assignee: Erick Erickson >Priority: Major > > I think we have way too much INFO level logging. Or, perhaps more correctly, > Solr logging needs to be examined and messages logged at an appropriate level. > We log every update at an INFO level for instance. But I think we log LIR at > INFO as well. As a sysadmin I don't care to have my logs polluted with a > message for every update, but if I'm trying to keep my system healthy I want > to see LIR messages and try to understand why. > Plus, in large installations logging at INFO level is creating a _LOT_ of > files. > What I want to discuss on this JIRA is > 1> What kinds of messages do we want log at WARN, INFO, DEBUG, and TRACE > levels? > 2> Who's the audience at each level? For a running system that's functioning, > sysops folks would really like WARN messages that mean something need > attention for instance. If I'm troubleshooting should I turn on INFO? DEBUG? > TRACE? > So let's say we get some kind of agreement as to the above. Then I propose > three things > 1> Someone (and probably me but all help gratefully accepted) needs to go > through our logging and assign appropriate levels. This will take quite a > while, I intend to work on it in small chunks. > 2> Actually answer whether unnecessary objects are created when something > like log.info("whatever {}", someObjectOrMethodCall); is invoked. Is this > independent on the logging implementation used? The SLF4J and log4j seem a > bit contradictory. > 3> Maybe regularize log, logger, LOG as variable names, but that's a nit. > As a tactical approach, I suggest we tag each LoggerFactory.getLogger in > files we work on with //SOLR-(whatever number is assigned when I create > this). We can remove them all later, but since I expect to approach this > piecemeal it'd be nice to keep track of which files have been done already. > Finally, I really really really don't want to do this all at once. There are > 5-6 thousand log messages. Even at 1,000 a week that's 6 weeks, even starting > now it would probably span the 7.3 release. > This will probably be an
[jira] [Commented] (SOLR-11934) Visit Solr logging, it's too noisy.
[ https://issues.apache.org/jira/browse/SOLR-11934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348111#comment-16348111 ] Gus Heck commented on SOLR-11934: - First off, let me say I LIKE the heavy logging. When the $#!7 hits the fan, if someone tells me when it went wrong, with a default Solr install I have something to dig into to find out what went wrong. Often the answer is something like: "because you had two mutually exclusive filters selected" or "the record was added but not committed yet" or "That's because the system paused for a stop the world GC for 40 minutes" or "That looks like a pagination bug in the UI, they added a filter but didn't reset the pagination" or "the term you're searching for is in a field that's not indexed" Many of these things rely on knowing *exactly* what the query or order of events was from *Solr's* perspective (particularly vs what the customer +_thought_+ they sent for a query). The case of a "very large system" where logging is burdensome is IMHO an edge case. Perhaps some sort of ready made profiles for this edge case should exist but there are almost always a whole lot more little fish than big fish in the sea. That said, yes, the level of logging should be appropriate. Since you asked for opinions, here's my opinion on levels: * FATAL - worthy of immediate attention (beeper, SMS, whatever), zoo keeper unreachable, system fails to come up etc. * ERROR - Something serious, should be looked at in the morning or at least some time soon, may be prelude to FATAL showing up, or high likelihood of customer bug report, very rarely something that can be ignored. * WARN - Something questionable, possible backlog ticket, possibly ignore/turn off in some cases. * INFO - that which is likely to be useful figure out what led up to the ERROR that woke you at 3am... Also that which will be helpful in trouble shooting customer bug reports (yesterday at we got this empty result when there were definitely documents that should have matchedturns out to be a pagination bug). This definitely includes queries, updates, admin commands, gc logs etc. * DEBUG - That which might help in trouble shooting or finding a bug, or trouble shooting ane odd behavior. Rarely by admins, usually by devs * TRACE - Stuff only ever turned on by devs missing tufts of hair trying to figure out things like "what order did those 2 threads run X and Y in" Possibly a separate ticket but related: Breaking things out into separate files with potentially different rotation frequencies, verbosities etc. seems like a decent idea. I only rarely want to see admin/updates/queries together, and none of those alongside system logging... GC is already separated, which is excellent. IMHO Turning off logging of queries/updates/whatever up front is premature optimization and presents a usability issue. Splitting things into more focused files probably helps most folks (except those who have been enjoying the firehose with log analysis tools I suppose) but that's just my opinion. If things are broken up it should be possible to restore the firehose for back compatibility. The lack of query/update logs out of the box is one of the things that irritates me about Elastic FWIW. Obviously it would be a different ticket, but maybe some additional admin UI controls to augment the huge tree of loggers would be good? I'm thinking broad presets for default/quiet/verbose for each type of log. > Visit Solr logging, it's too noisy. > --- > > Key: SOLR-11934 > URL: https://issues.apache.org/jira/browse/SOLR-11934 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Erick Erickson >Assignee: Erick Erickson >Priority: Major > > I think we have way too much INFO level logging. Or, perhaps more correctly, > Solr logging needs to be examined and messages logged at an appropriate level. > We log every update at an INFO level for instance. But I think we log LIR at > INFO as well. As a sysadmin I don't care to have my logs polluted with a > message for every update, but if I'm trying to keep my system healthy I want > to see LIR messages and try to understand why. > Plus, in large installations logging at INFO level is creating a _LOT_ of > files. > What I want to discuss on this JIRA is > 1> What kinds of messages do we want log at WARN, INFO, DEBUG, and TRACE > levels? > 2> Who's the audience at each level? For a running system that's functioning, > sysops folks would really like WARN messages that mean something need > attention for instance. If I'm troubleshooting should I turn on INFO? DEBUG? > TRACE? > So let's say we get some kind of agreement as to the above. Then I propose > three things > 1> Someone (and