[jira] [Commented] (CASSANDRA-10805) Additional Compaction Logging
[ https://issues.apache.org/jira/browse/CASSANDRA-10805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15266282#comment-15266282 ] Marcus Eriksson commented on CASSANDRA-10805: - Nice, +1 > Additional Compaction Logging > - > > Key: CASSANDRA-10805 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10805 > Project: Cassandra > Issue Type: New Feature > Components: Compaction, Observability >Reporter: Carl Yeksigian >Assignee: Carl Yeksigian >Priority: Minor > Fix For: 3.x > > > Currently, viewing the results of past compactions requires parsing the log > and looking at the compaction history system table, which doesn't have > information about, for example, flushed sstables not previously compacted. > This is a proposal to extend the information captured for compaction. > Initially, this would be done through a JMX call, but if it proves to be > useful and not much overhead, it might be a feature that could be enabled for > the compaction strategy all the time. > Initial log information would include: > - The compaction strategy type controlling each column family > - The set of sstables included in each compaction strategy > - Information about flushes and compactions, including times and all involved > sstables > - Information about sstables, including generation, size, and tokens > - Any additional metadata the strategy wishes to add to a compaction or an > sstable, like the level of an sstable or the type of compaction being > performed -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10805) Additional Compaction Logging
[ https://issues.apache.org/jira/browse/CASSANDRA-10805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15262904#comment-15262904 ] Carl Yeksigian commented on CASSANDRA-10805: [~krummas] I reran the logall branch dtests, and they look much better now. > Additional Compaction Logging > - > > Key: CASSANDRA-10805 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10805 > Project: Cassandra > Issue Type: New Feature > Components: Compaction, Observability >Reporter: Carl Yeksigian >Assignee: Carl Yeksigian >Priority: Minor > Fix For: 3.x > > > Currently, viewing the results of past compactions requires parsing the log > and looking at the compaction history system table, which doesn't have > information about, for example, flushed sstables not previously compacted. > This is a proposal to extend the information captured for compaction. > Initially, this would be done through a JMX call, but if it proves to be > useful and not much overhead, it might be a feature that could be enabled for > the compaction strategy all the time. > Initial log information would include: > - The compaction strategy type controlling each column family > - The set of sstables included in each compaction strategy > - Information about flushes and compactions, including times and all involved > sstables > - Information about sstables, including generation, size, and tokens > - Any additional metadata the strategy wishes to add to a compaction or an > sstable, like the level of an sstable or the type of compaction being > performed -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10805) Additional Compaction Logging
[ https://issues.apache.org/jira/browse/CASSANDRA-10805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15242914#comment-15242914 ] Carl Yeksigian commented on CASSANDRA-10805: I pushed a new update that includes a fix, and then also pushed a test to make sure that activate the compaction logger for all column families. [utest|http://cassci.datastax.com/job/carlyeks-ticket-10805-logall-testall/] [dtest|http://cassci.datastax.com/job/carlyeks-ticket-10805-logall-dtest/] I need to dig into the dtest results to figure out whether they are being caused by the new logging. > Additional Compaction Logging > - > > Key: CASSANDRA-10805 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10805 > Project: Cassandra > Issue Type: New Feature > Components: Compaction, Observability >Reporter: Carl Yeksigian >Assignee: Carl Yeksigian >Priority: Minor > Fix For: 3.x > > > Currently, viewing the results of past compactions requires parsing the log > and looking at the compaction history system table, which doesn't have > information about, for example, flushed sstables not previously compacted. > This is a proposal to extend the information captured for compaction. > Initially, this would be done through a JMX call, but if it proves to be > useful and not much overhead, it might be a feature that could be enabled for > the compaction strategy all the time. > Initial log information would include: > - The compaction strategy type controlling each column family > - The set of sstables included in each compaction strategy > - Information about flushes and compactions, including times and all involved > sstables > - Information about sstables, including generation, size, and tokens > - Any additional metadata the strategy wishes to add to a compaction or an > sstable, like the level of an sstable or the type of compaction being > performed -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10805) Additional Compaction Logging
[ https://issues.apache.org/jira/browse/CASSANDRA-10805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15223834#comment-15223834 ] Marcus Eriksson commented on CASSANDRA-10805: - Wanted to run all tests with the logging default to on, but I get this on startup: {code} java.lang.NullPointerException at org.apache.cassandra.db.compaction.CompactionStrategyManager.getStrategyFolders(CompactionStrategyManager.java:674) at org.apache.cassandra.db.compaction.CompactionLogger.startStrategy(CompactionLogger.java:172) at org.apache.cassandra.db.compaction.CompactionLogger.lambda$compactionStrategyMap$1(CompactionLogger.java:126) at java.util.ArrayList.forEach(ArrayList.java:1249) at org.apache.cassandra.db.compaction.CompactionLogger.lambda$forEach$0(CompactionLogger.java:120) at java.util.Arrays$ArrayList.forEach(Arrays.java:3880) at org.apache.cassandra.db.compaction.CompactionLogger.forEach(CompactionLogger.java:120) at org.apache.cassandra.db.compaction.CompactionLogger.compactionStrategyMap(CompactionLogger.java:126) at org.apache.cassandra.db.compaction.CompactionLogger.startStrategies(CompactionLogger.java:213) at org.apache.cassandra.db.compaction.CompactionLogger.enable(CompactionLogger.java:221) at org.apache.cassandra.db.compaction.CompactionStrategyManager.startup(CompactionStrategyManager.java:150) at org.apache.cassandra.db.compaction.CompactionStrategyManager.reload(CompactionStrategyManager.java:246) at org.apache.cassandra.db.compaction.CompactionStrategyManager.(CompactionStrategyManager.java:86) at org.apache.cassandra.db.ColumnFamilyStore.(ColumnFamilyStore.java:408) at org.apache.cassandra.db.ColumnFamilyStore.(ColumnFamilyStore.java:367) at org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:577) at org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:554) at org.apache.cassandra.db.Keyspace.initCf(Keyspace.java:383) at org.apache.cassandra.db.Keyspace.(Keyspace.java:320) at org.apache.cassandra.db.Keyspace.open(Keyspace.java:130) at org.apache.cassandra.db.Keyspace.open(Keyspace.java:107) at org.apache.cassandra.db.SystemKeyspace.checkHealth(SystemKeyspace.java:889) at org.apache.cassandra.service.StartupChecks$8.execute(StartupChecks.java:297) at org.apache.cassandra.service.StartupChecks.verify(StartupChecks.java:106) at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:169) at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:551) at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:680) {code} other than that, code LGTM > Additional Compaction Logging > - > > Key: CASSANDRA-10805 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10805 > Project: Cassandra > Issue Type: New Feature > Components: Compaction, Observability >Reporter: Carl Yeksigian >Assignee: Carl Yeksigian >Priority: Minor > Fix For: 3.x > > > Currently, viewing the results of past compactions requires parsing the log > and looking at the compaction history system table, which doesn't have > information about, for example, flushed sstables not previously compacted. > This is a proposal to extend the information captured for compaction. > Initially, this would be done through a JMX call, but if it proves to be > useful and not much overhead, it might be a feature that could be enabled for > the compaction strategy all the time. > Initial log information would include: > - The compaction strategy type controlling each column family > - The set of sstables included in each compaction strategy > - Information about flushes and compactions, including times and all involved > sstables > - Information about sstables, including generation, size, and tokens > - Any additional metadata the strategy wishes to add to a compaction or an > sstable, like the level of an sstable or the type of compaction being > performed -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10805) Additional Compaction Logging
[ https://issues.apache.org/jira/browse/CASSANDRA-10805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15221872#comment-15221872 ] Carl Yeksigian commented on CASSANDRA-10805: I've pushed a new version which addresses the comments. I started a {{CompactionLogger.Writer}} interface; wanted to know if this made sense, or if we should change to having these be objects that could be serialized as JSON to be serialized differently for different interfaces. I've just kicked off new utests/dtests, so we'll see how it looks. > Additional Compaction Logging > - > > Key: CASSANDRA-10805 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10805 > Project: Cassandra > Issue Type: New Feature > Components: Compaction, Observability >Reporter: Carl Yeksigian >Assignee: Carl Yeksigian >Priority: Minor > Fix For: 3.x > > > Currently, viewing the results of past compactions requires parsing the log > and looking at the compaction history system table, which doesn't have > information about, for example, flushed sstables not previously compacted. > This is a proposal to extend the information captured for compaction. > Initially, this would be done through a JMX call, but if it proves to be > useful and not much overhead, it might be a feature that could be enabled for > the compaction strategy all the time. > Initial log information would include: > - The compaction strategy type controlling each column family > - The set of sstables included in each compaction strategy > - Information about flushes and compactions, including times and all involved > sstables > - Information about sstables, including generation, size, and tokens > - Any additional metadata the strategy wishes to add to a compaction or an > sstable, like the level of an sstable or the type of compaction being > performed -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10805) Additional Compaction Logging
[ https://issues.apache.org/jira/browse/CASSANDRA-10805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185160#comment-15185160 ] Carl Yeksigian commented on CASSANDRA-10805: [~spo...@gmail.com] It isn't possible to use logback as we do everywhere else here, as we need to know when the log file has rotated and add all of the current sstables. If we don't have that, the log file won't represent the state of the compaction strategy at the time we start. We'll end up replacing the custom logging logic with logback, but right now I'm just using something really simple to focus on the rest of the logging system. > Additional Compaction Logging > - > > Key: CASSANDRA-10805 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10805 > Project: Cassandra > Issue Type: New Feature > Components: Compaction, Observability >Reporter: Carl Yeksigian >Assignee: Carl Yeksigian >Priority: Minor > Fix For: 3.x > > > Currently, viewing the results of past compactions requires parsing the log > and looking at the compaction history system table, which doesn't have > information about, for example, flushed sstables not previously compacted. > This is a proposal to extend the information captured for compaction. > Initially, this would be done through a JMX call, but if it proves to be > useful and not much overhead, it might be a feature that could be enabled for > the compaction strategy all the time. > Initial log information would include: > - The compaction strategy type controlling each column family > - The set of sstables included in each compaction strategy > - Information about flushes and compactions, including times and all involved > sstables > - Information about sstables, including generation, size, and tokens > - Any additional metadata the strategy wishes to add to a compaction or an > sstable, like the level of an sstable or the type of compaction being > performed -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10805) Additional Compaction Logging
[ https://issues.apache.org/jira/browse/CASSANDRA-10805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15184916#comment-15184916 ] Stefan Podkowinski commented on CASSANDRA-10805: This looks really promising. I've played around with the branch and did some minor changes in a [PR|https://github.com/carlyeks/cassandra/pull/1/files]. However, I'm still not sure why you plan to implement your own file rolling logic. Getting files rolled by logback and archive them manually afterwards would work perfectly fine for me. > Additional Compaction Logging > - > > Key: CASSANDRA-10805 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10805 > Project: Cassandra > Issue Type: New Feature > Components: Compaction, Observability >Reporter: Carl Yeksigian >Assignee: Carl Yeksigian >Priority: Minor > Fix For: 3.x > > > Currently, viewing the results of past compactions requires parsing the log > and looking at the compaction history system table, which doesn't have > information about, for example, flushed sstables not previously compacted. > This is a proposal to extend the information captured for compaction. > Initially, this would be done through a JMX call, but if it proves to be > useful and not much overhead, it might be a feature that could be enabled for > the compaction strategy all the time. > Initial log information would include: > - The compaction strategy type controlling each column family > - The set of sstables included in each compaction strategy > - Information about flushes and compactions, including times and all involved > sstables > - Information about sstables, including generation, size, and tokens > - Any additional metadata the strategy wishes to add to a compaction or an > sstable, like the level of an sstable or the type of compaction being > performed -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10805) Additional Compaction Logging
[ https://issues.apache.org/jira/browse/CASSANDRA-10805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15182983#comment-15182983 ] Marcus Eriksson commented on CASSANDRA-10805: - it looks good, a few comments; * timestamps on all log entries * it could perhaps be useful to log if the compaction strategy handles repaired or unrepaired data and which data directory it is handling instead of the strategy id? Or perhaps log the mapping on startup so we can figure it out? and a nit - remove redundant "this." in CompactionLogger. And an idea - feel free to ignore, could we make the serialization 'pluggable' in CompactionLogger? Then we could for example have all nodes in a cluster write to a socket somewhere so that we don't have to ship log files to visualize? We could do this in a followup ticket when/if anyone needs it though > Additional Compaction Logging > - > > Key: CASSANDRA-10805 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10805 > Project: Cassandra > Issue Type: New Feature > Components: Compaction, Observability >Reporter: Carl Yeksigian >Assignee: Carl Yeksigian >Priority: Minor > > Currently, viewing the results of past compactions requires parsing the log > and looking at the compaction history system table, which doesn't have > information about, for example, flushed sstables not previously compacted. > This is a proposal to extend the information captured for compaction. > Initially, this would be done through a JMX call, but if it proves to be > useful and not much overhead, it might be a feature that could be enabled for > the compaction strategy all the time. > Initial log information would include: > - The compaction strategy type controlling each column family > - The set of sstables included in each compaction strategy > - Information about flushes and compactions, including times and all involved > sstables > - Information about sstables, including generation, size, and tokens > - Any additional metadata the strategy wishes to add to a compaction or an > sstable, like the level of an sstable or the type of compaction being > performed -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10805) Additional Compaction Logging
[ https://issues.apache.org/jira/browse/CASSANDRA-10805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15180621#comment-15180621 ] Carl Yeksigian commented on CASSANDRA-10805: I've pushed a new version of this branch which is updated for the CASSANDRA-6696 changes, and outputs JSON objects. I haven't looked at how to use just the parts of logback that we would want to yet, but if the current approach looks OK except for reimplementing the log rolling, I'll take a look at it early next week. > Additional Compaction Logging > - > > Key: CASSANDRA-10805 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10805 > Project: Cassandra > Issue Type: New Feature > Components: Compaction, Observability >Reporter: Carl Yeksigian >Assignee: Carl Yeksigian >Priority: Minor > > Currently, viewing the results of past compactions requires parsing the log > and looking at the compaction history system table, which doesn't have > information about, for example, flushed sstables not previously compacted. > This is a proposal to extend the information captured for compaction. > Initially, this would be done through a JMX call, but if it proves to be > useful and not much overhead, it might be a feature that could be enabled for > the compaction strategy all the time. > Initial log information would include: > - The compaction strategy type controlling each column family > - The set of sstables included in each compaction strategy > - Information about flushes and compactions, including times and all involved > sstables > - Information about sstables, including generation, size, and tokens > - Any additional metadata the strategy wishes to add to a compaction or an > sstable, like the level of an sstable or the type of compaction being > performed -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10805) Additional Compaction Logging
[ https://issues.apache.org/jira/browse/CASSANDRA-10805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15167682#comment-15167682 ] Carl Yeksigian commented on CASSANDRA-10805: I've been looking at logback and how we might be able to use logback directly. The biggest problem is that there isn't a notification for when the log file changes. We need to know that the logfile is changing so that we can log out the sstables that are already on disk so that each logfile is independent (and old ones can be deleted). We won't be able to use the logback loggers, or logback.xml, because we won't be able to use the loggers as they are currently defined, since additional information needs to be passed on creation of the logger to know what tables' files to log on a new file. We can still use the infrastructure of logback to be able to execute the mechanics of doing the log rotations so that we aren't responsible for it; it just won't be as seamless as updating the logback.xml files and having it reread it. > Additional Compaction Logging > - > > Key: CASSANDRA-10805 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10805 > Project: Cassandra > Issue Type: New Feature > Components: Compaction, Observability >Reporter: Carl Yeksigian >Assignee: Carl Yeksigian >Priority: Minor > > Currently, viewing the results of past compactions requires parsing the log > and looking at the compaction history system table, which doesn't have > information about, for example, flushed sstables not previously compacted. > This is a proposal to extend the information captured for compaction. > Initially, this would be done through a JMX call, but if it proves to be > useful and not much overhead, it might be a feature that could be enabled for > the compaction strategy all the time. > Initial log information would include: > - The compaction strategy type controlling each column family > - The set of sstables included in each compaction strategy > - Information about flushes and compactions, including times and all involved > sstables > - Information about sstables, including generation, size, and tokens > - Any additional metadata the strategy wishes to add to a compaction or an > sstable, like the level of an sstable or the type of compaction being > performed -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10805) Additional Compaction Logging
[ https://issues.apache.org/jira/browse/CASSANDRA-10805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15110859#comment-15110859 ] mark quinsland commented on CASSANDRA-10805: I had the idea for this JIRA a few months ago, but was too busy/distracted to do anything with it. Really glad to see that it's not only been added by others, but that it's also actively being addressed. Fantastic. We're doing some comparative studies of STCS and DTCS for a huge C* user and these enhancements will really provide actionable metrics for people desiring to tune their compaction procedures. > Additional Compaction Logging > - > > Key: CASSANDRA-10805 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10805 > Project: Cassandra > Issue Type: New Feature > Components: Compaction, Observability >Reporter: Carl Yeksigian >Assignee: Carl Yeksigian >Priority: Minor > > Currently, viewing the results of past compactions requires parsing the log > and looking at the compaction history system table, which doesn't have > information about, for example, flushed sstables not previously compacted. > This is a proposal to extend the information captured for compaction. > Initially, this would be done through a JMX call, but if it proves to be > useful and not much overhead, it might be a feature that could be enabled for > the compaction strategy all the time. > Initial log information would include: > - The compaction strategy type controlling each column family > - The set of sstables included in each compaction strategy > - Information about flushes and compactions, including times and all involved > sstables > - Information about sstables, including generation, size, and tokens > - Any additional metadata the strategy wishes to add to a compaction or an > sstable, like the level of an sstable or the type of compaction being > performed -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10805) Additional Compaction Logging
[ https://issues.apache.org/jira/browse/CASSANDRA-10805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15081762#comment-15081762 ] Carl Yeksigian commented on CASSANDRA-10805: * I was initially using logback, but changed because I was getting incomplete files (since I didn't know when a new file was created). Looking at the [logback docs|http://logback.qos.ch/manual/appenders.html#RollingFileAppender] it seems like I probably just need to implement these two classes to make sure the logs are complete * That will work well; I'll make sure the logger has the name of the table it is assigned to in order to capture just the output from one table * Good point; this could also simplify some of the multiple-line events > Additional Compaction Logging > - > > Key: CASSANDRA-10805 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10805 > Project: Cassandra > Issue Type: New Feature > Components: Compaction, Observability >Reporter: Carl Yeksigian >Assignee: Carl Yeksigian >Priority: Minor > > Currently, viewing the results of past compactions requires parsing the log > and looking at the compaction history system table, which doesn't have > information about, for example, flushed sstables not previously compacted. > This is a proposal to extend the information captured for compaction. > Initially, this would be done through a JMX call, but if it proves to be > useful and not much overhead, it might be a feature that could be enabled for > the compaction strategy all the time. > Initial log information would include: > - The compaction strategy type controlling each column family > - The set of sstables included in each compaction strategy > - Information about flushes and compactions, including times and all involved > sstables > - Information about sstables, including generation, size, and tokens > - Any additional metadata the strategy wishes to add to a compaction or an > sstable, like the level of an sstable or the type of compaction being > performed -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10805) Additional Compaction Logging
[ https://issues.apache.org/jira/browse/CASSANDRA-10805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15074953#comment-15074953 ] Marcus Eriksson commented on CASSANDRA-10805: - * Can we use logback to do the logging to file? It should be possible to create a special logger that goes to a separate file. Feels wrong to implement our own log rotation etc for this * To enable/disable, can we just change the log level of that logger? * Would be nice if the logging could be a bit more self-describing and human readable - JSON? > Additional Compaction Logging > - > > Key: CASSANDRA-10805 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10805 > Project: Cassandra > Issue Type: New Feature > Components: Compaction, Observability >Reporter: Carl Yeksigian >Assignee: Carl Yeksigian >Priority: Minor > > Currently, viewing the results of past compactions requires parsing the log > and looking at the compaction history system table, which doesn't have > information about, for example, flushed sstables not previously compacted. > This is a proposal to extend the information captured for compaction. > Initially, this would be done through a JMX call, but if it proves to be > useful and not much overhead, it might be a feature that could be enabled for > the compaction strategy all the time. > Initial log information would include: > - The compaction strategy type controlling each column family > - The set of sstables included in each compaction strategy > - Information about flushes and compactions, including times and all involved > sstables > - Information about sstables, including generation, size, and tokens > - Any additional metadata the strategy wishes to add to a compaction or an > sstable, like the level of an sstable or the type of compaction being > performed -- This message was sent by Atlassian JIRA (v6.3.4#6332)