[
https://issues.apache.org/jira/browse/CASSANDRA-4861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13492574#comment-13492574
]
Sylvain Lebresne commented on CASSANDRA-4861:
---------------------------------------------
bq. Ambivalent about whether getting trace-level log messages for free is worth
having to pass logger to each trace call
I agree actually. That started pissing me off always through the patch, not
sure why I kept it.
bq. So I would recommend using MessageFormatter (which is public) instead
That's a good idea. Attached v2 that does that (and stop logging thing to
trace).
> Consider separating tracing from log4j
> --------------------------------------
>
> Key: CASSANDRA-4861
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4861
> Project: Cassandra
> Issue Type: Improvement
> Affects Versions: 1.2.0 beta 1
> Reporter: Sylvain Lebresne
> Assignee: Sylvain Lebresne
> Fix For: 1.2.0
>
> Attachments: 4861.txt, 4861-v2.txt
>
>
> Currently, (as far as I understand) tracing is implemented as a log4j
> appender that "intercepts" all log messages and write them to a system table.
> I'm sorry to not have bring that up during the initial review (it's hard to
> follow every ticket) but before we release this I'd like to have a serious
> discussion on that choice because I'm not convinced (at all) that it's a good
> idea. Namely, I can see the following drawbacks:
> # the main one is that this *forces* every debug messages to be traced and
> conversely, every traced message to be logged at debug. But I strongly think
> that debug logging and query tracing are not the same thing. Don't get me
> wrong, there is clearly a large intersection between those two things (which
> is fine), but I do think that *identifying* them is a mistake. More
> concretely:
> ** Consider some of the messages we log at debug in CFS:
> {noformat}
> logger.debug("memtable is already frozen; another thread must be flushing
> it");
> logger.debug("forceFlush requested but everything is clean in {}",
> columnFamily);
> logger.debug("Checking for sstables overlapping {}", sstables);
> {noformat}
> Those messages are useful for debugging and have a place in the log at
> debug, but they are noise as far as query tracing is concerned (None have any
> concrete impact on query performance, they just describe what the code has
> done). Or take the following ones from CompactionManager:
> {noformat}
> logger.debug("Background compaction is still running for {}.{} ({}
> remaining). Skipping", new Object[] {cfs.table.name, cfs.columnFamily,
> count});
> logger.debug("Scheduling a background task check for {}.{} with {}", new
> Object[] {cfs.table.name, cfs.columnFamily,
> cfs.getCompactionStrategy().getClass().getSimpleName()});
> logger.debug("Checking {}.{}", cfs.table.name, cfs.columnFamily);
> logger.debug("Aborting compaction for dropped CF");
> logger.debug("No tasks available");
> logger.debug("Expected bloom filter size : " + expectedBloomFilterSize);
> logger.debug("Cache flushing was already in progress: skipping {}",
> writer.getCompactionInfo());
> {noformat}
> It is useful to have that in the debug log, but how is any of that useful
> to users in query tracing? (it may be useful to trace if a new compaction
> start or stop, because that does influence query performance, but those
> message do not). Also take the following message logged when a compaction is
> user
> interrupted:
> {noformat}
> if (t instanceof CompactionInterruptedException)
> {
> logger.info(t.getMessage());
> logger.debug("Full interruption stack trace:", t);
> }
> {noformat}
> I can buy that you may want the first log message in the query tracing,
> but the second one is definitively something that only make sense for debug
> logging but not for query tracing (and as a side note, the current
> implementation don't do something sensible as it traces "Full interruption
> stack trace:" but completely ignore the throwable).
> Lastly, and though that's arguably more a detail (but why would we settle
> for something good enough if we can do better) I believe that in some cases
> you want an event to be both logged at debug and traced but having different
> messages could make sense. For instance, in CFS we have
> {noformat}
> logger.debug("Snapshot for " + table + " keyspace data file " +
> ssTable.getFilename() + " created in " + snapshotDirectory);
> {noformat}
> I'm not convinced that snapshot should be part of query tracing given it
> doesn't really have an impact on queries, but even if we do trace it, we
> probably don't care about having one event for each snapshoted file (2
> events, one for the start of the snapshot, one for the end would be enough).
> As it stands, I think query tracing will have a lot of random noises,
> which will not only be annoying but I'm also sure will make users spend time
> worrying about events that have no impact whatsoever. And I've only looked at
> the debug message of 2 classes ...
> ** I also think there could be case where we would want to trace something,
> but not have it in the debug log. For instance, it makes sense in the query
> trace to know how long parsing the query took. But logging too much info per
> query like that will make the debug log unmanageable in many case. And if you
> say, let's log that at TRACE, you have to put the TracingAppender at trace
> and now you get all the junk (junk as far as query tracing is concerned) that
> trace logging have.
> # I find it rather unintuitive. Query tracing is enable per query and it
> writes its trace in a system table. How come changing some settings in the
> log4j config file can disable that feature for me? I agree it's not a big
> deal, but it does is some form of leaking an implementation detail.
> # It doesn't seem very future-proof. For instance (and that's only an
> example), I think it could make sense to later add a tracing level. I might
> want a very detailed tracing mode where I get very fine grained details like
> what sstable was hit, and how many seeks on that sstable we did and whatnot.
> But as said above, using the log4j TRACE level for that is not convenient
> because it logs lots of stuff that are completely unrelated to my query.
> And the only advantages of using log4j that I can see is that it that we
> don't have to go through all our debug statements to check which ones make
> sense to also add to query traces. But as lengthly explained above, that's
> not a real advantage as it end up generating trace that are less useful/user
> friendly as they could be.
> Now maybe there is killer advantages that I don't see, and the goal of this
> ticket is to discuss those. But if there isn't, I'm very much in favor of
> moving from log4j for this.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira