[jira] [Commented] (CASSANDRA-15368) Failing to flush Memtable without terminating process results in permanent data loss
[ https://issues.apache.org/jira/browse/CASSANDRA-15368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16972639#comment-16972639 ]

Dimitar Dimitrov commented on CASSANDRA-15368:
--

Thanks for chasing this down, [~benedict]! I'm glad it turned out that, as initially suspected, you're pretty good at this stuff, and the issue was not lurking from before, but was more or less necessitated by the fix for CASSANDRA-15367. Then I guess it makes the most sense if you continue and take care of this.

> Failing to flush Memtable without terminating process results in permanent data loss
>
> Key: CASSANDRA-15368
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15368
> Project: Cassandra
> Issue Type: Bug
> Components: Local/Commit Log, Local/Memtable
> Reporter: Benedict Elliott Smith
> Priority: Normal
> Fix For: 4.0, 2.2.x, 3.0.x, 3.11.x
>
> {{Memtable}}s do not contain records that cover a precise contiguous range of {{ReplayPosition}}, since there are only weak ordering constraints when rolling over to a new {{Memtable}} - the last operations for the old {{Memtable}} may obtain their {{ReplayPosition}} after the first operations for the new {{Memtable}}.
> Unfortunately, we treat the {{Memtable}} range as contiguous, and invalidate the entire range on flush. Ordinarily we only invalidate records when all prior {{Memtable}}s have also successfully flushed. However, in the event of a flush that does not terminate the process (either because of disk failure policy, or because it is a software error), the later flush is able to invalidate the region of the commit log that includes records that should have been flushed in the prior {{Memtable}}.
> More problematically, this can also occur on restart without any associated flush failure, as we use commit log boundaries written to our flushed sstables to filter {{ReplayPosition}} on recovery, which is meant to replicate our {{Memtable}} flush behaviour above. However, we do not know that earlier flushes have completed, and they may complete successfully out-of-order. So any flush that completes before the process terminates, but began after another flush that _doesn't_ complete before the process terminates, has the potential to cause permanent data loss.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
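The loss mechanism described in the issue can be reduced to a toy model (all names here are invented for illustration; real commit log invalidation works on segments and sync markers and is far more involved): the old memtable's last write obtains a *higher* replay position than the new memtable's first write, so when the new memtable's flush invalidates its whole nominal range, it also discards the straggler record that the old, never-flushed memtable still needed.

```java
import java.util.TreeMap;

// Toy model of the bug: memtable ranges are treated as contiguous, so flushing
// the *later* memtable invalidates log records up to its upper bound --
// including a straggler record that belongs to the *earlier* memtable.
final class CommitLogModel {
    final TreeMap<Integer, String> records = new TreeMap<>(); // position -> payload

    // Models "invalidate the entire range on flush": drops every record whose
    // position is at or below the flushed memtable's upper bound.
    void invalidateThrough(int upperBound) {
        records.headMap(upperBound, true).clear();
    }
}
```

In this sketch, if the old memtable's write at position 12 is recorded after the new memtable's write at position 11, a successful flush of the new memtable through position 20 erases position 12 from the log even though the old memtable's flush failed, so replay after restart cannot recover it.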
[jira] [Commented] (CASSANDRA-15368) Failing to flush Memtable without terminating process results in permanent data loss
[ https://issues.apache.org/jira/browse/CASSANDRA-15368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16968613#comment-16968613 ]

Dimitar Dimitrov commented on CASSANDRA-15368:
--

Thanks for the super-quick reply, [~benedict]! I'll definitely check out your patch for CASSANDRA-15367. As for this problem, if me taking potentially (much) longer to fix it isn't a problem for you, I can surely take a stab. Also, here's the analysis that I mentioned in my previous reply - all comments appreciated.

h4. Defining the problem

Let's assume we have a single table (no indexes, no MVs) that's being continuously written to when a single flush for it is requested. We want to examine whether we can have the old memtable accepting a write with a higher CL position, and the new memtable accepting a write with a lower CL position - the latter also implies the old memtable rejecting that write.
* Below we'll be calling the write with the higher CL position *HW*, its assigned {{OpOrder.Group}} (and the action for assigning it) *HW group*, and its assigned CL position (and the action for assigning it) *HW position*.
* Similarly for the write with the lower CL position - *LW*, *LW group*, and *LW position*.

So to get the (un)desired ordering, we need the following specific results from 3 executions of {{Memtable.accepts(OpOrder.Group, CommitLogPosition)}}:
- {{oldMemtable.accepts()}} (called *HW accept?* below), which should return true
- {{oldMemtable.accepts()}} (called *LW accept?* below), which should return false
- {{newMemtable.accepts()}}, which should return true (not necessary for the analysis below)

h4. Some constraints

A. For each of the writes, the {{OpOrder.Group}} assignment happens-before the CL position allocation for the corresponding write, which happens-before the {{Memtable.accepts(OpOrder.Group, CommitLogPosition)}} call for the corresponding write.
* HW group --hb-> HW position --hb-> HW accept?
* LW group --hb-> LW position --hb-> LW accept?

B. The CL position allocations are totally (and numerically) ordered by happens-before, due to the way {{CommitLogSegment}}-s are advanced and the way their internal {{allocatePosition}} markers are CAS-ed.
* LW position --hb-> HW position

C. If {{writeBarrier.issue()}} in the {{Flush}} ctor happens-before HW group, then the final upper CL bound for the old memtable (called *UB* below) has been set, and is guaranteed to be less than HW position, but then HW accept? is guaranteed to return false (because it will see {{writeBarrier}} as not {{null}}, and HW position would be guaranteed to be more than UB) => contradiction
* If {{writeBarrier.issue()}} --hb-> HW group => UB --hb-> HW group => UB --hb-> HW position => contradiction
* Therefore HW group --hb-> {{writeBarrier.issue()}}
* Note that this was not true before the fix for CASSANDRA-8383.

D. If {{writeBarrier.issue()}} happens-before LW group, then UB has been set, and is guaranteed to be less than LW position, and therefore less than HW position. Also, {{writeBarrier.issue()}} would happen-before HW position, which would happen-before HW accept?. That means that HW accept? will see {{writeBarrier}} as not {{null}}, and UB as set and less than HW position, so it is guaranteed to return false => contradiction
* If {{writeBarrier.issue()}} --hb-> LW group => UB --hb-> LW position --hb-> HW position && {{writeBarrier.issue()}} --hb-> HW accept? => contradiction
* Therefore LW group --hb-> {{writeBarrier.issue()}}

E. As a corollary of C. and D., LW group and HW group should both be before the barrier issued by the flush, and therefore *the placements of LW and HW will both be determined by LW position, HW position, and UB*.

h4. The case work

In order for HW accept? to return true:
# ...it could be seeing {{writeBarrier}} as {{null}}, which means it has started before the {{writeBarrier}} is set in {{oldMemtable.setDiscarding}}.
## This implies that LW accept? has started after HW accept? has started - otherwise LW accept? would also have seen {{writeBarrier}} as {{null}} and returned true already => contradiction
## So LW accept? has started after HW accept? has started, and needs to return false because of LW position (see E. for why it cannot be due to LW group). This could happen only if UB has been set and is less than LW position. But as setting UB happens after {{oldMemtable.setDiscarding}}, and HW accept? had started before the {{writeBarrier}} is set in {{oldMemtable.setDiscarding}}, UB should be at least HW position, which is more than LW position => contradiction
#* If HW accept? start --hb-> writeBarrier set in {{oldMemtable.setDiscarding}} => HW position --hb-> writeBarrier set in {{oldMemtable.setDiscarding}} --hb-> UB => LW position --hb-> UB => contradiction
#* Therefore writeBarrier set in {{oldMemtable.setDiscarding}} --hb-> HW accept? start
# ...it could have been
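The {{accepts}} decision that the case work revolves around can be sketched as a minimal standalone model. All names and the int-based positions are illustrative assumptions, not Cassandra's real {{OpOrder}}/{{Memtable}} API, and the real code also consults the write's {{OpOrder}} group rather than only the position:

```java
// Simplified model of the accepts() decision from the analysis above:
// a memtable accepts a write if no write barrier has been issued yet,
// or if the write's CL position is at or below the final upper bound (UB).
final class MemtableModel {
    private Integer upperBound = null;   // UB: fixed when the flush discards the memtable
    private boolean barrierIssued = false;

    // Flush path (models setDiscarding): issue the barrier, then fix UB.
    void setDiscarding(int finalUpperBound) {
        barrierIssued = true;
        upperBound = finalUpperBound;
    }

    // Write path: accept if the write raced ahead of the flush, or fits under UB.
    boolean accepts(int commitLogPosition) {
        if (!barrierIssued)
            return true;                 // write started before the flush began
        return upperBound != null && commitLogPosition <= upperBound;
    }
}
```

Under this model, once {{setDiscarding}} has run, any write whose position exceeds UB is rejected by the old memtable and must land in the new one, which is the shape of the contradiction arguments in constraints C and D.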
[jira] [Comment Edited] (CASSANDRA-15368) Failing to flush Memtable without terminating process results in permanent data loss
[ https://issues.apache.org/jira/browse/CASSANDRA-15368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16968261#comment-16968261 ]

Dimitar Dimitrov edited comment on CASSANDRA-15368 at 11/6/19 11:50 AM:
--

[~benedict], I assume this is something that you're planning to take up yourself, but let me know if you can use a volunteer in any way. Also, can you please help me understand some of the details around the pre-conditions for this problem? I'm probably missing something, but I still can't understand:
* how _*the last operations for the old Memtable may obtain their ReplayPosition after the first operations for the new Memtable*_ can hold true after CASSANDRA-8383.
* how _*Unfortunately, we treat the Memtable range as contiguous, and invalidate the entire range on flush*_ can hold true after CASSANDRA-11828 (with some interaction with CASSANDRA-9669).

I'm also wondering, is _*More problematically, this can also occur on restart without any associated flush failure, as we use commit log boundaries written to our flushed sstables to filter ReplayPosition on recovery*_ related to {{CommitLogReplayer#firstNotCovered(Collection>)}} and its caveats?

P.S. Specifically for the upper bound of the old memtable being above the lower bound of the new memtable, I've tried to explicitly write down the possible orderings, and I can't see how that could happen - I'll format and post my notes in a separate comment a bit later.
[jira] [Commented] (CASSANDRA-15368) Failing to flush Memtable without terminating process results in permanent data loss
[ https://issues.apache.org/jira/browse/CASSANDRA-15368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16968261#comment-16968261 ]

Dimitar Dimitrov commented on CASSANDRA-15368:
--

[~benedict], I assume this is something that you're planning to take up yourself, but let me know if you can use a volunteer in any way. Also, can you please help me understand some of the details around the pre-conditions for this problem? I'm probably missing something, but I still can't understand:
* how _*the last operations for the old Memtable may obtain their ReplayPosition after the first operations for the new Memtable*_ can hold true after CASSANDRA-8383.
* how _*Unfortunately, we treat the Memtable range as contiguous, and invalidate the entire range on flush*_ can hold true after CASSANDRA-11828 (with some interaction with CASSANDRA-9669).

I'm also wondering, is _*More problematically, this can also occur on restart without any associated flush failure, as we use commit log boundaries written to our flushed sstables to filter ReplayPosition on recovery*_ related to {{CommitLogReplayer#firstNotCovered(Collection>)}} and its caveats?

P.S. Specifically for the upper bound of the old memtable being above the lower bound of the new memtable, I've tried to explicitly write down the possible orderings, and I can't see how that could happen - I'll format and post my notes in a separate comment a bit later.
[jira] [Commented] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions
[ https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16734130#comment-16734130 ]

Dimitar Dimitrov commented on CASSANDRA-13692:
--

I have reworked the changes as discussed, and am currently testing. Initial unit test results are relatively good - 3.0 and 3.11 seem OK (no failures), and trunk seems to have a bunch of unrelated failures (e.g. {{SingleSSTableLCSTaskTest}} failing with an OOME, {{org.apache.cassandra.dht.tokenallocator}} tests failing with an NPE in {{DatabaseDescriptor.diagnosticEventsEnabled}}). 2.2 has some failures in {{ScrubTest}} and {{SSTableRewriterTest}} that I'd like to take a closer look at. I still don't have initial results for dtests, due to these being harder to land on a CI VM, and longer to run. I'll update when I have something to report though. Here are the draft changes, in case anyone is interested:
| [2.2|https://github.com/apache/cassandra/compare/cassandra-2.2...dimitarndimitrov:c13692-2.2] |
| [3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...dimitarndimitrov:c13692-3.0] |
| [3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...dimitarndimitrov:c13692-3.11] |
| [trunk|https://github.com/apache/cassandra/compare/trunk...dimitarndimitrov:c13692] |

> CompactionAwareWriter_getWriteDirectory throws incompatible exceptions
>
> Key: CASSANDRA-13692
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13692
> Project: Cassandra
> Issue Type: Bug
> Components: Local/Compaction
> Reporter: Hao Zhong
> Assignee: Dimitar Dimitrov
> Priority: Major
> Labels: lhf
> Attachments: c13692-2.2-dtest-results.PNG, c13692-2.2-testall-results.PNG, c13692-3.0-dtest-results-updated.PNG, c13692-3.0-dtest-results.PNG, c13692-3.0-testall-results.PNG, c13692-3.11-dtest-results-updated.PNG, c13692-3.11-dtest-results.PNG, c13692-3.11-testall-results.PNG, c13692-dtest-results-updated.PNG, c13692-dtest-results.PNG, c13692-testall-results-updated.PNG, c13692-testall-results.PNG
>
> The CompactionAwareWriter_getWriteDirectory throws RuntimeException:
> {code}
> public Directories.DataDirectory getWriteDirectory(Iterable<SSTableReader> sstables, long estimatedWriteSize)
> {
>     File directory = null;
>     for (SSTableReader sstable : sstables)
>     {
>         if (directory == null)
>             directory = sstable.descriptor.directory;
>         if (!directory.equals(sstable.descriptor.directory))
>         {
>             logger.trace("All sstables not from the same disk - putting results in {}", directory);
>             break;
>         }
>     }
>     Directories.DataDirectory d = getDirectories().getDataDirectoryForFile(directory);
>     if (d != null)
>     {
>         long availableSpace = d.getAvailableSpace();
>         if (availableSpace < estimatedWriteSize)
>             throw new RuntimeException(String.format("Not enough space to write %s to %s (%s available)",
>                                                      FBUtilities.prettyPrintMemory(estimatedWriteSize),
>                                                      d.location,
>                                                      FBUtilities.prettyPrintMemory(availableSpace)));
>         logger.trace("putting compaction results in {}", directory);
>         return d;
>     }
>     d = getDirectories().getWriteableLocation(estimatedWriteSize);
>     if (d == null)
>         throw new RuntimeException(String.format("Not enough disk space to store %s",
>                                                  FBUtilities.prettyPrintMemory(estimatedWriteSize)));
>     return d;
> }
> {code}
> However, the thrown exception does not trigger the failure policy. CASSANDRA-11448 fixed a similar problem. The buggy code is:
> {code}
> protected Directories.DataDirectory getWriteDirectory(long writeSize)
> {
>     Directories.DataDirectory directory = getDirectories().getWriteableLocation(writeSize);
>     if (directory == null)
>         throw new RuntimeException("Insufficient disk space to write " + writeSize + " bytes");
>     return directory;
> }
> {code}
> The fixed code is:
> {code}
> protected Directories.DataDirectory getWriteDirectory(long writeSize)
> {
>     Directories.DataDirectory directory = getDirectories().getWriteableLocation(writeSize);
>     if (directory == null)
>         throw new FSWriteError(new IOException("Insufficient disk space to write " + writeSize + " bytes"), "");
>     return directory;
> }
> {code}
> The fixed code throws FSWE and triggers the failure policy.
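The difference between the two versions above matters because the disk failure policy machinery reacts to the throwable's type: an {{FSWriteError}} is recognized and routed to the policy, while a plain {{RuntimeException}} falls through. A minimal sketch of that type-dispatch pattern (the model class names below are invented for illustration; this is not Cassandra's actual error hierarchy or handler):

```java
import java.io.IOException;

// Illustrative only: a dedicated error type lets a central handler apply a
// disk failure policy, while a bare RuntimeException carrying the same
// message is not recognized and bypasses the policy entirely.
class FsWriteErrorModel extends Error {
    FsWriteErrorModel(IOException cause) { super(cause); }
}

final class FailurePolicyModel {
    // Returns true when the failure policy should fire for this throwable.
    static boolean triggersPolicy(Throwable t) {
        return t instanceof FsWriteErrorModel;   // type-based dispatch
    }
}
```

This is why wrapping the {{IOException}} in a filesystem-error type, as in the fixed code quoted above, changes behavior even though the error message is identical.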
[jira] [Commented] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions
[ https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16722172#comment-16722172 ]

Dimitar Dimitrov commented on CASSANDRA-13692:
--

First, apologies for making this look like a 19th-century trans-continental chess correspondence game... I think you're absolutely right - after [{{849a438690aa97a361227781108cc90355dcbcd9}}|https://github.com/apache/cassandra/commit/849a438690aa97a361227781108cc90355dcbcd9], we return solely candidates from some subset of {{Directories.dataDirectories}}, all of which are initialized in the {{clinit}}, so currently it doesn't look like there's a case in which the error handling could be hit. I agree with your suggestion (although I'm a tiny bit sad that the oh-so-laborious test would go as well). I'll dust off the ancient branches that I had for this and update here with the corresponding patches soon.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions
[ https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dimitar Dimitrov updated CASSANDRA-13692:
--
    Status: In Progress  (was: Awaiting Feedback)
[jira] [Comment Edited] (CASSANDRA-13938) Default repair is broken, crashes other nodes participating in repair (in trunk)
[ https://issues.apache.org/jira/browse/CASSANDRA-13938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542961#comment-16542961 ]

Dimitar Dimitrov edited comment on CASSANDRA-13938 at 9/11/18 5:52 AM:
--

{quote}The problem is that when {{CompressedInputStream#position()}} is called, the new position might be in the middle of a buffer. We need to remember that offset, and subtract that value when updating {{current}} in {{#reBuffer(boolean)}}. The reason why is that those offset bytes get double counted on the first call to {{#reBuffer()}} after {{#position()}}, as we add the {{buffer.position()}} to {{current}}. {{current}} already accounts for those offset bytes when {{#position()}} was called.
{quote}
[~jasobrown], isn't that equivalent (although a bit more complex) to just setting {{current}} to the last reached/read position in the stream when rebuffering? (i.e. {{current = streamOffset + buffer.position()}})

I might be missing something, but the role of {{currentBufferOffset}} seems to be solely to "align" {{current}} and {{streamOffset}} the first time after a new section is started. Then {{current += buffer.position() - currentBufferOffset}} expands to {{current = -current- + buffer.position() + streamOffset - -current-}}, which is the same as {{current = streamOffset + buffer.position()}}. After that first time, {{current}} naturally follows {{streamOffset}} without the need of any adjustment, but it seems more natural to express this as {{streamOffset + buffer.position()}} instead of the new expression or the old {{current + buffer.position()}}. To me, it's also a bit more intuitive and easier to understand (hopefully it's also right in addition to intuitive :)).

The equivalence above would hold true if {{current}} and {{streamOffset}} don't change their value in the meantime, but I think this is ensured by the well-ordered sequential fashion in which the decompressing and the offset bookkeeping functionality of {{CompressedInputStream}} happen in the thread running the corresponding {{StreamDeserializingTask}}.
* The aforementioned well-ordered sequential fashion seems to be POSITION followed by 0-N times REBUFFER + DECOMPRESS, where the first REBUFFER might not update {{current}} with the above calculation in case {{current}} is already too far ahead (i.e. the new section is not starting within the current buffer).
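The expansion argued above can be checked mechanically with a toy model (field and method names mirror the comment but are assumptions, not the real {{CompressedInputStream}}): at the first rebuffer after {{position()}} lands mid-buffer, {{currentBufferOffset == current - streamOffset}}, so the patch's update and the simpler expression yield the same value.

```java
// Toy check of the equivalence argued above: given
//   currentBufferOffset = current - streamOffset
// at the first rebuffer after position(), the patch's
//   current += buffer.position() - currentBufferOffset
// reduces to the simpler
//   current = streamOffset + buffer.position()
final class RebufferModel {
    static long patchUpdate(long current, long streamOffset, long bufferPosition) {
        long currentBufferOffset = current - streamOffset; // offset into the buffer
        return current + bufferPosition - currentBufferOffset;
    }

    static long simplerUpdate(long streamOffset, long bufferPosition) {
        return streamOffset + bufferPosition;
    }
}
```

For example, with a section starting at stream offset 1024, {{position()}} landing 10 bytes into the buffer, and 200 bytes then read, both updates put {{current}} at 1224.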
[jira] [Commented] (CASSANDRA-13938) Default repair is broken, crashes other nodes participating in repair (in trunk)
[ https://issues.apache.org/jira/browse/CASSANDRA-13938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542961#comment-16542961 ] Dimitar Dimitrov commented on CASSANDRA-13938: -- {quote}The problem is that when {{CompressedInputStream#position()}} is called, the new position might be in the middle of a buffer. We need to remember that offset, and subtract that value when updating {{current}} in {{#reBuffer(boolean)}}. The reason is that those offset bytes get double counted on the first call to {{#reBuffer()}} after {{#position()}} as we add the {{buffer.position()}} to {{current}}. {{current}} already accounts for those offset bytes when {{#position()}} was called. {quote} [~jasobrown], isn't that equivalent (although a bit more complex) to just setting {{current}} to the last reached/read position in the stream when rebuffering? (i.e. {{current = streamOffset + buffer.position()}}). I might be missing something, but the role of {{currentBufferOffset}} seems to be solely to "align" {{current}} and {{streamOffset}} the first time after a new section is started. Then {{current += buffer.position() - currentBufferOffset}} expands to {{current = current + buffer.position() + streamOffset - current}}, which is the same as {{current = streamOffset + buffer.position()}}. After that first time, {{current}} naturally follows {{streamOffset}} without the need for any adjustment, but it seems more natural to express this as {{streamOffset + buffer.position()}} instead of the new expression or the old {{current + buffer.position()}}. To me, it's also a bit more intuitive and easier to understand (hopefully it's also right in addition to intuitive :)). 
The equivalence above would hold true if {{current}} and {{streamOffset}} don't change their value in the meantime, but I think this is ensured by the well-ordered sequential fashion in which the decompressing and the offset bookkeeping functionality of {{CompressedInputStream}} happen in the thread running the corresponding {{StreamDeserializingTask}}. * The aforementioned well-ordered sequential fashion seems to be POSITION followed by 0-N times REBUFFER + DECOMPRESS, where the first REBUFFER might not update {{current}} with the above calculation in case {{current}} is already too far ahead (i.e. the new section is not starting within the current buffer). > Default repair is broken, crashes other nodes participating in repair (in > trunk) > > > Key: CASSANDRA-13938 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13938 > Project: Cassandra > Issue Type: Bug > Components: Repair >Reporter: Nate McCall >Assignee: Jason Brown >Priority: Critical > Fix For: 4.x > > Attachments: 13938.yaml, test.sh > > > Running through a simple scenario to test some of the new repair features, I > was not able to make a repair command work. Further, the exception seemed to > trigger a nasty failure state that basically shuts down the netty connections > for messaging *and* CQL on the nodes transferring back data to the node being > repaired. The following steps reproduce this issue consistently. 
> Cassandra stress profile (probably not necessary, but this one provides a > really simple schema and consistent data shape): > {noformat} > keyspace: standard_long > keyspace_definition: | > CREATE KEYSPACE standard_long WITH replication = {'class':'SimpleStrategy', > 'replication_factor':3}; > table: test_data > table_definition: | > CREATE TABLE test_data ( > key text, > ts bigint, > val text, > PRIMARY KEY (key, ts) > ) WITH COMPACT STORAGE AND > CLUSTERING ORDER BY (ts DESC) AND > bloom_filter_fp_chance=0.01 AND > caching={'keys':'ALL', 'rows_per_partition':'NONE'} AND > comment='' AND > dclocal_read_repair_chance=0.00 AND > gc_grace_seconds=864000 AND > read_repair_chance=0.00 AND > compaction={'class': 'SizeTieredCompactionStrategy'} AND > compression={'sstable_compression': 'LZ4Compressor'}; > columnspec: > - name: key > population: uniform(1..5000) # 50 million records available > - name: ts > cluster: gaussian(1..50) # Up to 50 inserts per record > - name: val > population: gaussian(128..1024) # varying size of value data > insert: > partitions: fixed(1) # only one insert per batch for individual partitions > select: fixed(1)/1 # each insert comes in one at a time > batchtype: UNLOGGED > queries: > single: > cql: select * from test_data where key = ? and ts = ? limit 1; > series: > cql: select key,ts,val from test_data where key = ? limit 10; > {noformat} > The commands to build and run: > {noformat} > ccm create 4_0_test -v git:trunk -n 3 -s > ccm stress user
[jira] [Commented] (CASSANDRA-14092) Max ttl of 20 years will overflow localDeletionTime
[ https://issues.apache.org/jira/browse/CASSANDRA-14092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16343378#comment-16343378 ] Dimitar Dimitrov commented on CASSANDRA-14092: -- Some review comments on the dtest and trunk changes: * On the dtest change: ## Shouldn't the dtest docstring [here|https://github.com/apache/cassandra-dtest/commit/83c73ef0a3cbe50232d3a9eea4fd26c877ea58db#diff-a8f4dac4af77196a8c7881abd067a5b9R345] say something related to the TTL problem? ## The start time [here|https://github.com/apache/cassandra-dtest/commit/83c73ef0a3cbe50232d3a9eea4fd26c877ea58db#diff-a8f4dac4af77196a8c7881abd067a5b9R348] seems redundant ## It may be good to extract the max TTL value in a variable - we may decide to keep a version of this test after we patch by just reducing that value, but before we fix it nicely * On the trunk change: ## Maybe it's my English, but [this wording|https://github.com/apache/cassandra/compare/trunk...pauloricardomg:trunk-14092-v2#diff-5414c0e96996be355c3aff1184ec859aR48] sounds a bit confusing to me, using "maximum supported date" and "limit date" for the same thing. Thoughts? If you're also hesitant, what do you think about "Rows that should expire after that date would still expire on that date."? ## You can quickly mention the relevant JIRA ticket [here|https://github.com/apache/cassandra/compare/trunk...pauloricardomg:trunk-14092-v2#diff-b7ca4b9c415e93b6cbfb31daf90cc598R185] ## Qualify the static access to {{Cell.sanitizeLocalDeletionTime}} [here|https://github.com/apache/cassandra/compare/trunk...pauloricardomg:trunk-14092-v2#diff-b7ca4b9c415e93b6cbfb31daf90cc598R53] ## Could you please add some comments/Javadoc for [{{Cell.sanitizeLocalDeletionTime}}|https://github.com/apache/cassandra/compare/trunk...pauloricardomg:trunk-14092-v2#diff-3e9f1fc67f99d27e92a3eb32201d8ca6R311]? 
I would assume that {{NO_TTL}} and {{NO_DELETION_TIME}} are needed to determine whether the cell is an expiring one, an expired one, or a tombstone, but I'm not too sure ## There are missing spaces between the boolean arguments of the delegation call for some of the unit tests (e.g. [here|https://github.com/apache/cassandra/compare/trunk...pauloricardomg:trunk-14092-v2#diff-0d8cf6ca6ed99c947903359c1beaf386R74]) > Max ttl of 20 years will overflow localDeletionTime > --- > > Key: CASSANDRA-14092 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14092 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Paulo Motta >Assignee: Paulo Motta >Priority: Blocker > Fix For: 2.1.20, 2.2.12, 3.0.16, 3.11.2 > > > CASSANDRA-4771 added a max value of 20 years for ttl to protect against [year > 2038 overflow bug|https://en.wikipedia.org/wiki/Year_2038_problem] for > {{localDeletionTime}}. > It turns out that next year the {{localDeletionTime}} will start overflowing > with the maximum ttl of 20 years ({{System.currentTimeMillis() + ttl(20 > years) > Integer.MAX_VALUE}}), so we should remove this limitation. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
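The overflow from the issue description is easy to reproduce in isolation: {{localDeletionTime}} is kept in seconds as a signed 32-bit int, so a 20-year TTL pushes it past {{Integer.MAX_VALUE}}. A minimal sketch (the "now" timestamp is an illustrative early-2018 value, not taken from Cassandra):

```java
public class TtlOverflow {
    public static void main(String[] args) {
        int nowInSeconds = 1_517_184_000;          // ~2018-01-29, seconds since epoch
        int maxTtlSeconds = 20 * 365 * 24 * 3600;  // 20-year TTL cap = 630,720,000 s

        long correct = (long) nowInSeconds + maxTtlSeconds;   // 2,147,904,000
        int localDeletionTime = nowInSeconds + maxTtlSeconds; // wraps in 32-bit math

        // The true deletion time no longer fits in an int...
        System.out.println(correct > Integer.MAX_VALUE); // prints true
        // ...so the stored value silently wraps to a negative number.
        System.out.println(localDeletionTime < 0);       // prints true
    }
}
```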
[jira] [Updated] (CASSANDRA-13801) CompactionManager sometimes wrongly determines that a background compaction is running for a particular table
[ https://issues.apache.org/jira/browse/CASSANDRA-13801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dimitar Dimitrov updated CASSANDRA-13801: - Status: Patch Available (was: Open) > CompactionManager sometimes wrongly determines that a background compaction > is running for a particular table > - > > Key: CASSANDRA-13801 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13801 > Project: Cassandra > Issue Type: Bug > Components: Compaction >Reporter: Dimitar Dimitrov >Assignee: Dimitar Dimitrov >Priority: Minor > Attachments: c13801-2.2-testall.png, c13801-3.0-testall.png, > c13801-3.11-testall.png, c13801-trunk-testall.png > > > Sometimes after writing different rows to a table, then doing a blocking > flush, if you alter the compaction strategy, then run background compaction > and wait for it to finish, {{CompactionManager}} may decide that there's an > ongoing compaction for that same table. > This may happen even though logs don't indicate that to be the case > (compaction may still be running for system_schema tables). -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-13801) CompactionManager sometimes wrongly determines that a background compaction is running for a particular table
[ https://issues.apache.org/jira/browse/CASSANDRA-13801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dimitar Dimitrov updated CASSANDRA-13801: - Attachment: c13801-2.2-testall.png c13801-3.0-testall.png c13801-3.11-testall.png c13801-trunk-testall.png > CompactionManager sometimes wrongly determines that a background compaction > is running for a particular table > - > > Key: CASSANDRA-13801 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13801 > Project: Cassandra > Issue Type: Bug > Components: Compaction >Reporter: Dimitar Dimitrov >Assignee: Dimitar Dimitrov >Priority: Minor > Attachments: c13801-2.2-testall.png, c13801-3.0-testall.png, > c13801-3.11-testall.png, c13801-trunk-testall.png > > > Sometimes after writing different rows to a table, then doing a blocking > flush, if you alter the compaction strategy, then run background compaction > and wait for it to finish, {{CompactionManager}} may decide that there's an > ongoing compaction for that same table. > This may happen even though logs don't indicate that to be the case > (compaction may still be running for system_schema tables). -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-13801) CompactionManager sometimes wrongly determines that a background compaction is running for a particular table
[ https://issues.apache.org/jira/browse/CASSANDRA-13801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16280752#comment-16280752 ] Dimitar Dimitrov edited comment on CASSANDRA-13801 at 12/6/17 7:33 PM: --- It turns out that the problem does not necessarily require altering the compaction strategy. It seems to be rooted in a potential problem with counting the CF compaction requests, that can eventually lead to a skipped background compaction. The wrong counting can happen if the counting multiset increment [here|https://github.com/apache/cassandra/blob/95b43b195e4074533100f863344c182a118a8b6c/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L197] gets delayed and happens after the corresponding counting multiset decrement already happened [here|https://github.com/apache/cassandra/blob/95b43b195e4074533100f863344c182a118a8b6c/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L284]. Here are the branches with the proposed changes, as well as a Byteman test that can be used to demonstrate the issue. testall results look good (3.0 and trunk each have 1 seemingly unrelated, flaky test failing). dtest results will be added soon. | [2.2|https://github.com/apache/cassandra/compare/cassandra-2.2...dimitarndimitrov:c13801-2.2] | [testall|^c13801-2.2-testall.png] | | [3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...dimitarndimitrov:c13801-3.0] | [testall|^c13801-3.0-testall.png] | | [3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...dimitarndimitrov:c13801-3.11] | [testall|^c13801-3.11-testall.png] | | [trunk|https://github.com/apache/cassandra/compare/trunk...dimitarndimitrov:c13801-trunk] | [testall|^c13801-trunk-testall.png] | was (Author: dimitarndimitrov): It turns out that the problem does not necessarily require altering the compaction strategy. 
It seems to be rooted in a potential problem with counting the CF compaction requests, that can eventually lead to a skipped background compaction. The wrong counting can happen if the counting multiset increment [here|https://github.com/apache/cassandra/blob/95b43b195e4074533100f863344c182a118a8b6c/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L197] gets delayed and happens after the corresponding counting multiset decrement already happened [here|https://github.com/apache/cassandra/blob/95b43b195e4074533100f863344c182a118a8b6c/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L284]. Here are the branches with the proposed changes, as well as a Byteman test that can be used to demonstrate the issue. testall results look good (3.0 and trunk each have 1 seemingly unrelated, flaky test failing). dtest results will be added soon. | [2.2|https://github.com/apache/cassandra/compare/cassandra-2.2...dimitarndimitrov:c13801-2.2] | [testall|^c13801-2.2-testall.png] | | [3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...dimitarndimitrov:c13801-3.0] | [testall|^c13801-3.0-testall.png] | | [3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...dimitarndimitrov:c13801-3.11] | [testall|^c13801-3.11-testall.png] | | [trunk|https://github.com/apache/cassandra/compare/trunk...dimitarndimitrov:c13801-trunk] | [testall|^c13801-2.2-testall.png] | > CompactionManager sometimes wrongly determines that a background compaction > is running for a particular table > - > > Key: CASSANDRA-13801 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13801 > Project: Cassandra > Issue Type: Bug > Components: Compaction >Reporter: Dimitar Dimitrov >Assignee: Dimitar Dimitrov >Priority: Minor > Attachments: c13801-2.2-testall.png, c13801-3.0-testall.png, > c13801-3.11-testall.png, c13801-trunk-testall.png > > > Sometimes after writing different rows to a table, then doing a blocking > flush, if you alter the compaction strategy, then 
run background compaction > and wait for it to finish, {{CompactionManager}} may decide that there's an > ongoing compaction for that same table. > This may happen even though logs don't indicate that to be the case > (compaction may still be running for system_schema tables). -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
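The mis-ordering described above can be sketched with a plain counting map standing in for {{CompactionManager}}'s counting multiset. This is an illustrative model, not the actual Cassandra code: the class, field, and method names are made up, and the interleaving is simulated sequentially instead of with real threads.

```java
import java.util.HashMap;
import java.util.Map;

public class CompactionCountRace {
    // Stand-in for the counting multiset of tables with running compactions.
    static final Map<String, Integer> compactingCF = new HashMap<>();

    static void markCompacting(String cf) {
        compactingCF.merge(cf, 1, Integer::sum);
    }

    static void markDone(String cf) {
        // Like Multiset.remove: decrementing an absent element is a no-op.
        compactingCF.computeIfPresent(cf, (k, v) -> v > 1 ? v - 1 : null);
    }

    public static void main(String[] args) {
        String cf = "ks.table";
        // Buggy interleaving: the task's completion (decrement) runs before
        // the submitting thread's delayed increment.
        markDone(cf);        // no-op: nothing registered yet
        markCompacting(cf);  // the delayed increment lands afterwards
        // The count is now stuck at 1 although no compaction is running, so a
        // later background compaction for this table would be skipped.
        System.out.println(compactingCF.get(cf)); // prints 1
    }
}
```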
[jira] [Commented] (CASSANDRA-13801) CompactionManager sometimes wrongly determines that a background compaction is running for a particular table
[ https://issues.apache.org/jira/browse/CASSANDRA-13801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16280752#comment-16280752 ] Dimitar Dimitrov commented on CASSANDRA-13801: -- It turns out that the problem does not necessarily require altering the compaction strategy. It seems to be rooted in a potential problem with counting the CF compaction requests, which can eventually lead to a skipped background compaction. The wrong counting can happen if the counting multiset increment [here|https://github.com/apache/cassandra/blob/95b43b195e4074533100f863344c182a118a8b6c/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L197] gets delayed and happens after the corresponding counting multiset decrement already happened [here|https://github.com/apache/cassandra/blob/95b43b195e4074533100f863344c182a118a8b6c/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L284]. Here are the branches with the proposed changes, as well as a Byteman test that can be used to demonstrate the issue. testall results look good (3.0 and trunk each have 1 seemingly unrelated, flaky test failing). dtest results will be added soon. 
| [2.2|https://github.com/apache/cassandra/compare/cassandra-2.2...dimitarndimitrov:c13801-2.2] | [testall|^c13801-2.2-testall.png] | | [3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...dimitarndimitrov:c13801-3.0] | [testall|^c13801-3.0-testall.png] | | [3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...dimitarndimitrov:c13801-3.11] | [testall|^c13801-3.11-testall.png] | | [trunk|https://github.com/apache/cassandra/compare/trunk...dimitarndimitrov:c13801-trunk] | [testall|^c13801-trunk-testall.png] | > CompactionManager sometimes wrongly determines that a background compaction > is running for a particular table > - > > Key: CASSANDRA-13801 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13801 > Project: Cassandra > Issue Type: Bug > Components: Compaction >Reporter: Dimitar Dimitrov >Assignee: Dimitar Dimitrov >Priority: Minor > > Sometimes after writing different rows to a table, then doing a blocking > flush, if you alter the compaction strategy, then run background compaction > and wait for it to finish, {{CompactionManager}} may decide that there's an > ongoing compaction for that same table. > This may happen even though logs don't indicate that to be the case > (compaction may still be running for system_schema tables). -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions
[ https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16166103#comment-16166103 ] Dimitar Dimitrov edited comment on CASSANDRA-13692 at 9/14/17 12:44 PM: Okay, it looks like the {{trunk}} dtest abort problems have been fixed - and here are the new test results that confirm that. Unfortunately the runs are still not green. Nevertheless the failures seen in the baseline and my run are almost identical, with 2 failures from cassci not showing up in my run, and 1 failure from my run not showing up in cassci. * To better compare the failures, sort the cassci results by test name, as the results from my run have also been sorted this way. | [2.2|https://github.com/apache/cassandra/compare/cassandra-2.2...dimitarndimitrov:c13692-2.2] | [testall|^c13692-2.2-testall-results.PNG] | [dtest|^c13692-2.2-dtest-results.PNG] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-2.2_dtest/lastCompletedBuild/testReport/]) | | [3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...dimitarndimitrov:c13692-3.0] | [testall|^c13692-3.0-testall-results.PNG] | [dtest|^c13692-3.0-dtest-results-updated.PNG] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.0_dtest/989/testReport/]) | | [3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...dimitarndimitrov:c13692-3.11] | [testall|^c13692-3.11-testall-results.PNG] ([testall-baseline|https://cassci.datastax.com/job/cassandra-3.11_testall/lastCompletedBuild/testReport/]) | [dtest|^c13692-3.11-dtest-results-updated.PNG] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.11_dtest/165/testReport/]) | | [trunk|https://github.com/apache/cassandra/compare/trunk...dimitarndimitrov:c13692] | [*testall*|^c13692-testall-results-updated.PNG] | [*dtest*|^c13692-dtest-results-updated.PNG] ([*dtest-baseline*|https://cassci.datastax.com/job/trunk_dtest/1654/testReport/]) | was (Author: dimitarndimitrov): Okay, it looks like the {{trunk}} 
dtest abort problems have been fixed - and here are the new test results that confirm that. Unfortunately the runs are still not green. Nevertheless the failures seen in the baseline and my run are almost identical, with 2 failures from cassci not showing up in my run, and 1 failure from my run not showing up in cassci. * To better compare the failures, sort the cassci results by test name, as the results from my run have also been sorted this way. | [2.2|https://github.com/apache/cassandra/compare/cassandra-2.2...dimitarndimitrov:c13692-2.2] | [testall|^c13692-2.2-testall-results.PNG] | [dtest|^c13692-2.2-dtest-results.PNG] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-2.2_dtest/lastCompletedBuild/testReport/]) | | [3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...dimitarndimitrov:c13692-3.0] | [testall|^c13692-3.0-testall-results.PNG] | [dtest|^c13692-3.0-dtest-results-updated.PNG] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.0_dtest/989/testReport/]) | | [3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...dimitarndimitrov:c13692-3.11] | [testall|^c13692-3.11-testall-results.PNG] ([testall-baseline|https://cassci.datastax.com/job/cassandra-3.11_testall/lastCompletedBuild/testReport/]) | [dtest|^c13692-3.11-dtest-results-updated.PNG] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.11_dtest/165/testReport/]) | | [trunk|https://github.com/apache/cassandra/compare/trunk...dimitarndimitrov:c13692] | [testall|^c13692-testall-results.PNG] | [*dtest*|^c13692-dtest-results-updated.PNG] ([*dtest-baseline*|https://cassci.datastax.com/job/trunk_dtest/1654/testReport/]) | > CompactionAwareWriter_getWriteDirectory throws incompatible exceptions > -- > > Key: CASSANDRA-13692 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13692 > Project: Cassandra > Issue Type: Bug > Components: Compaction >Reporter: Hao Zhong >Assignee: Dimitar Dimitrov > Labels: lhf > Attachments: c13692-2.2-dtest-results.PNG, > 
c13692-2.2-testall-results.PNG, c13692-3.0-dtest-results.PNG, > c13692-3.0-dtest-results-updated.PNG, c13692-3.0-testall-results.PNG, > c13692-3.11-dtest-results.PNG, c13692-3.11-dtest-results-updated.PNG, > c13692-3.11-testall-results.PNG, c13692-dtest-results.PNG, > c13692-dtest-results-updated.PNG, c13692-testall-results.PNG, > c13692-testall-results-updated.PNG > > > The CompactionAwareWriter_getWriteDirectory throws RuntimeException: > {code} > public Directories.DataDirectory getWriteDirectory(Iterable > sstables, long estimatedWriteSize) > { > File directory = null; > for (SSTableReader sstable : sstables) > { > if (directory ==
[jira] [Issue Comment Deleted] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions
[ https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dimitar Dimitrov updated CASSANDRA-13692: - Comment: was deleted (was: Adding screenshots from CI results.) > CompactionAwareWriter_getWriteDirectory throws incompatible exceptions > -- > > Key: CASSANDRA-13692 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13692 > Project: Cassandra > Issue Type: Bug > Components: Compaction >Reporter: Hao Zhong >Assignee: Dimitar Dimitrov > Labels: lhf > Attachments: c13692-2.2-dtest-results.PNG, > c13692-2.2-testall-results.PNG, c13692-3.0-dtest-results.PNG, > c13692-3.0-dtest-results-updated.PNG, c13692-3.0-testall-results.PNG, > c13692-3.11-dtest-results.PNG, c13692-3.11-dtest-results-updated.PNG, > c13692-3.11-testall-results.PNG, c13692-dtest-results.PNG, > c13692-dtest-results-updated.PNG, c13692-testall-results.PNG, > c13692-testall-results-updated.PNG > > > The CompactionAwareWriter_getWriteDirectory throws RuntimeException: > {code} > public Directories.DataDirectory getWriteDirectory(Iterable > sstables, long estimatedWriteSize) > { > File directory = null; > for (SSTableReader sstable : sstables) > { > if (directory == null) > directory = sstable.descriptor.directory; > if (!directory.equals(sstable.descriptor.directory)) > { > logger.trace("All sstables not from the same disk - putting > results in {}", directory); > break; > } > } > Directories.DataDirectory d = > getDirectories().getDataDirectoryForFile(directory); > if (d != null) > { > long availableSpace = d.getAvailableSpace(); > if (availableSpace < estimatedWriteSize) > throw new RuntimeException(String.format("Not enough space to > write %s to %s (%s available)", > > FBUtilities.prettyPrintMemory(estimatedWriteSize), > d.location, > > FBUtilities.prettyPrintMemory(availableSpace))); > logger.trace("putting compaction results in {}", directory); > return d; > } > d = getDirectories().getWriteableLocation(estimatedWriteSize); > if 
(d == null) > throw new RuntimeException(String.format("Not enough disk space > to store %s", > > FBUtilities.prettyPrintMemory(estimatedWriteSize))); > return d; > } > {code} > However, the thrown exception does not trigger the failure policy. > CASSANDRA-11448 fixed a similar problem. The buggy code is: > {code} > protected Directories.DataDirectory getWriteDirectory(long writeSize) > { > Directories.DataDirectory directory = > getDirectories().getWriteableLocation(writeSize); > if (directory == null) > throw new RuntimeException("Insufficient disk space to write " + > writeSize + " bytes"); > return directory; > } > {code} > The fixed code is: > {code} > protected Directories.DataDirectory getWriteDirectory(long writeSize) > { > Directories.DataDirectory directory = > getDirectories().getWriteableLocation(writeSize); > if (directory == null) > throw new FSWriteError(new IOException("Insufficient disk space > to write " + writeSize + " bytes"), ""); > return directory; > } > {code} > The fixed code throws FSWE and triggers the failure policy. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
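Why the exception type matters in the fix quoted above can be sketched generically: a handler that dispatches on a filesystem-error supertype (the way a disk failure policy would) never reacts to a plain {{RuntimeException}}. The class and method names below are illustrative stand-ins, not Cassandra's actual {{FSWriteError}} or failure-policy code.

```java
public class FailurePolicySketch {
    // Illustrative stand-in for a filesystem-specific error type.
    static class FsWriteError extends RuntimeException {
        FsWriteError(String msg) { super(msg); }
    }

    static String handle(Runnable task) {
        try {
            task.run();
        } catch (FsWriteError e) {
            // A real policy might stop the node or blacklist the disk here.
            return "disk failure policy triggered";
        } catch (RuntimeException e) {
            return "generic error, policy NOT triggered";
        }
        return "ok";
    }

    public static void main(String[] args) {
        // Buggy variant: signals disk exhaustion with a plain RuntimeException.
        System.out.println(handle(() -> { throw new RuntimeException("no space"); }));
        // Fixed variant: throws the filesystem-specific subtype instead.
        System.out.println(handle(() -> { throw new FsWriteError("no space"); }));
    }
}
```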
[jira] [Issue Comment Deleted] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions
[ https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dimitar Dimitrov updated CASSANDRA-13692: - Comment: was deleted (was: Adding updated screenshots from CI dtest results.) > CompactionAwareWriter_getWriteDirectory throws incompatible exceptions > -- > > Key: CASSANDRA-13692 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13692 > Project: Cassandra > Issue Type: Bug > Components: Compaction >Reporter: Hao Zhong >Assignee: Dimitar Dimitrov > Labels: lhf > Attachments: c13692-2.2-dtest-results.PNG, > c13692-2.2-testall-results.PNG, c13692-3.0-dtest-results.PNG, > c13692-3.0-dtest-results-updated.PNG, c13692-3.0-testall-results.PNG, > c13692-3.11-dtest-results.PNG, c13692-3.11-dtest-results-updated.PNG, > c13692-3.11-testall-results.PNG, c13692-dtest-results.PNG, > c13692-dtest-results-updated.PNG, c13692-testall-results.PNG, > c13692-testall-results-updated.PNG > > > The CompactionAwareWriter_getWriteDirectory throws RuntimeException: > {code} > public Directories.DataDirectory getWriteDirectory(Iterable > sstables, long estimatedWriteSize) > { > File directory = null; > for (SSTableReader sstable : sstables) > { > if (directory == null) > directory = sstable.descriptor.directory; > if (!directory.equals(sstable.descriptor.directory)) > { > logger.trace("All sstables not from the same disk - putting > results in {}", directory); > break; > } > } > Directories.DataDirectory d = > getDirectories().getDataDirectoryForFile(directory); > if (d != null) > { > long availableSpace = d.getAvailableSpace(); > if (availableSpace < estimatedWriteSize) > throw new RuntimeException(String.format("Not enough space to > write %s to %s (%s available)", > > FBUtilities.prettyPrintMemory(estimatedWriteSize), > d.location, > > FBUtilities.prettyPrintMemory(availableSpace))); > logger.trace("putting compaction results in {}", directory); > return d; > } > d = 
getDirectories().getWriteableLocation(estimatedWriteSize); > if (d == null) > throw new RuntimeException(String.format("Not enough disk space > to store %s", > > FBUtilities.prettyPrintMemory(estimatedWriteSize))); > return d; > } > {code} > However, the thrown exception does not trigger the failure policy. > CASSANDRA-11448 fixed a similar problem. The buggy code is: > {code} > protected Directories.DataDirectory getWriteDirectory(long writeSize) > { > Directories.DataDirectory directory = > getDirectories().getWriteableLocation(writeSize); > if (directory == null) > throw new RuntimeException("Insufficient disk space to write " + > writeSize + " bytes"); > return directory; > } > {code} > The fixed code is: > {code} > protected Directories.DataDirectory getWriteDirectory(long writeSize) > { > Directories.DataDirectory directory = > getDirectories().getWriteableLocation(writeSize); > if (directory == null) > throw new FSWriteError(new IOException("Insufficient disk space > to write " + writeSize + " bytes"), ""); > return directory; > } > {code} > The fixed code throws FSWE and triggers the failure policy. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions
[ https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dimitar Dimitrov updated CASSANDRA-13692: - Attachment: c13692-testall-results-updated.PNG > CompactionAwareWriter_getWriteDirectory throws incompatible exceptions > -- > > Key: CASSANDRA-13692 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13692 > Project: Cassandra > Issue Type: Bug > Components: Compaction >Reporter: Hao Zhong >Assignee: Dimitar Dimitrov > Labels: lhf > Attachments: c13692-2.2-dtest-results.PNG, > c13692-2.2-testall-results.PNG, c13692-3.0-dtest-results.PNG, > c13692-3.0-dtest-results-updated.PNG, c13692-3.0-testall-results.PNG, > c13692-3.11-dtest-results.PNG, c13692-3.11-dtest-results-updated.PNG, > c13692-3.11-testall-results.PNG, c13692-dtest-results.PNG, > c13692-dtest-results-updated.PNG, c13692-testall-results.PNG, > c13692-testall-results-updated.PNG > > > The CompactionAwareWriter_getWriteDirectory throws RuntimeException: > {code} > public Directories.DataDirectory getWriteDirectory(Iterable > sstables, long estimatedWriteSize) > { > File directory = null; > for (SSTableReader sstable : sstables) > { > if (directory == null) > directory = sstable.descriptor.directory; > if (!directory.equals(sstable.descriptor.directory)) > { > logger.trace("All sstables not from the same disk - putting > results in {}", directory); > break; > } > } > Directories.DataDirectory d = > getDirectories().getDataDirectoryForFile(directory); > if (d != null) > { > long availableSpace = d.getAvailableSpace(); > if (availableSpace < estimatedWriteSize) > throw new RuntimeException(String.format("Not enough space to > write %s to %s (%s available)", > > FBUtilities.prettyPrintMemory(estimatedWriteSize), > d.location, > > FBUtilities.prettyPrintMemory(availableSpace))); > logger.trace("putting compaction results in {}", directory); > return d; > } > d = getDirectories().getWriteableLocation(estimatedWriteSize); > if (d == null) > 
throw new RuntimeException(String.format("Not enough disk space > to store %s", > > FBUtilities.prettyPrintMemory(estimatedWriteSize))); > return d; > } > {code} > However, the thrown exception does not trigger the failure policy. > CASSANDRA-11448 fixed a similar problem. The buggy code is: > {code} > protected Directories.DataDirectory getWriteDirectory(long writeSize) > { > Directories.DataDirectory directory = > getDirectories().getWriteableLocation(writeSize); > if (directory == null) > throw new RuntimeException("Insufficient disk space to write " + > writeSize + " bytes"); > return directory; > } > {code} > The fixed code is: > {code} > protected Directories.DataDirectory getWriteDirectory(long writeSize) > { > Directories.DataDirectory directory = > getDirectories().getWriteableLocation(writeSize); > if (directory == null) > throw new FSWriteError(new IOException("Insufficient disk space > to write " + writeSize + " bytes"), ""); > return directory; > } > {code} > The fixed code throws FSWE and triggers the failure policy. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions
[ https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dimitar Dimitrov updated CASSANDRA-13692:
-----------------------------------------
    Status: Patch Available (was: In Progress)

Marking this as having a submitted patch ready for review.
[jira] [Comment Edited] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions
[ https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16150212#comment-16150212 ]

Dimitar Dimitrov edited comment on CASSANDRA-13692 at 9/14/17 11:29 AM:
------------------------------------------------------------------------

Okay, here are the branches with the proposed changes:

| [2.2|https://github.com/apache/cassandra/compare/cassandra-2.2...dimitarndimitrov:c13692-2.2] | [testall|^c13692-2.2-testall-results.PNG] | [dtest|^c13692-2.2-dtest-results.PNG] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-2.2_dtest/lastCompletedBuild/testReport/]) |
| [3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...dimitarndimitrov:c13692-3.0] | [testall|^c13692-3.0-testall-results.PNG] | [dtest|^c13692-3.0-dtest-results.PNG] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.0_dtest/lastCompletedBuild/testReport/]) |
| [3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...dimitarndimitrov:c13692-3.11] | [testall|^c13692-3.11-testall-results.PNG] ([testall-baseline|https://cassci.datastax.com/job/cassandra-3.11_testall/lastCompletedBuild/testReport/]) | [dtest|^c13692-3.11-dtest-results.PNG] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.11_dtest/lastCompletedBuild/testReport/]) |
| [trunk|https://github.com/apache/cassandra/compare/trunk...dimitarndimitrov:c13692] | [testall|^c13692-testall-results.PNG] | [dtest|^c13692-dtest-results.PNG] ([dtest-baseline|https://cassci.datastax.com/job/trunk_dtest/lastCompletedBuild/testReport/]) |

{{testall}} results look good for all branches, but there's a common theme of {{consistency_test.TestConsistency.test_13747}} dtests failing, in addition to the common, expected-to-be-unrelated dtest failures. My assumption is that this is related to CASSANDRA-13747 (the comments there seem to corroborate that). [~iamaleksey], do you have an idea if that could be the case?
[jira] [Comment Edited] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions
[ https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16150355#comment-16150355 ]

Dimitar Dimitrov edited comment on CASSANDRA-13692 at 9/14/17 11:29 AM:
------------------------------------------------------------------------

Some additional observations, after taking yet another look at the test results:
* Although very similar, the 3.11 {{testall}} failures are not exactly the same as the ones in the baseline.
* The trunk {{dtest}} failures seem to diverge from the pattern of "common-expected-to-be-unrelated failures plus test_13747 failures". I'll try to see whether this can be attributed to flakiness (looking closer at the results, re-running the CI run on the same branch, running another CI run on a clean branch copy of the trunk, etc.)
[jira] [Comment Edited] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions
[ https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16154343#comment-16154343 ]

Dimitar Dimitrov edited comment on CASSANDRA-13692 at 9/14/17 11:28 AM:
------------------------------------------------------------------------

I ended up rebasing the 3.0, 3.11, and trunk versions of the change, and the results are as follows:
* No more {{consistency_test.TestConsistency.test_13747}} failures for the 3.0 and 3.11 changes. There are small differences between the actual and expected dtest failures, which seem to be down to flakiness.
* The trunk changes are still hitting what seems to be an existing problem plaguing cassci.datastax.com trunk dtest jobs for the last 10 days or so (see https://cassci.datastax.com/job/trunk_dtest/). The problem is currently being investigated.

Now it looks like the 2.2, 3.0, and 3.11 changes can be accepted as passing the testall and dtest criteria; only the trunk changes need to be verified, after the problem affecting trunk gets resolved. I'll post an update once this happens, but in the meantime, it's possible to mark this as ready for review.
[jira] [Comment Edited] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions
[ https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16152425#comment-16152425 ]

Dimitar Dimitrov edited comment on CASSANDRA-13692 at 9/14/17 11:27 AM:
------------------------------------------------------------------------

All observed dtest failures, including the {{consistency_test.TestConsistency.test_13747}} failures, reproduce exactly the same on brand new copies of the cassandra-3.11 and trunk branches of my apache/cassandra fork. My fork is by now tens of commits behind the origin, so I'll update the fork and re-run the CI jobs for the 3.0, 3.11, and trunk branches. I'm expecting to see a much more consistent picture this time.
[jira] [Updated] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions
[ https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dimitar Dimitrov updated CASSANDRA-13692:
-----------------------------------------
    Attachment: c13692-dtest-results-updated.PNG
[jira] [Commented] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions
[ https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16166103#comment-16166103 ]

Dimitar Dimitrov commented on CASSANDRA-13692:
----------------------------------------------

Okay, it looks like the {{trunk}} dtest abort problems have been fixed - and here are the new test results that confirm that. Unfortunately, the runs are still not green. Nevertheless, the failures seen in the baseline and my run are almost identical, with 2 failures from cassci not showing up in my run, and 1 failure from my run not showing up in cassci.
* To better compare the failures, sort the cassci results by test name, as the results from my run have also been sorted this way.

| [2.2|https://github.com/apache/cassandra/compare/cassandra-2.2...dimitarndimitrov:c13692-2.2] | [testall|^c13692-2.2-testall-results.PNG] | [dtest|^c13692-2.2-dtest-results.PNG] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-2.2_dtest/lastCompletedBuild/testReport/]) |
| [3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...dimitarndimitrov:c13692-3.0] | [testall|^c13692-3.0-testall-results.PNG] | [dtest|^c13692-3.0-dtest-results-updated.PNG] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.0_dtest/989/testReport/]) |
| [3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...dimitarndimitrov:c13692-3.11] | [testall|^c13692-3.11-testall-results.PNG] ([testall-baseline|https://cassci.datastax.com/job/cassandra-3.11_testall/lastCompletedBuild/testReport/]) | [dtest|^c13692-3.11-dtest-results-updated.PNG] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.11_dtest/165/testReport/]) |
| [trunk|https://github.com/apache/cassandra/compare/trunk...dimitarndimitrov:c13692] | [testall|^c13692-testall-results.PNG] | [*dtest*|^c13692-dtest-results-updated.PNG] ([*dtest-baseline*|https://cassci.datastax.com/job/trunk_dtest/1654/testReport/]) |
[jira] [Updated] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions
[ https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dimitar Dimitrov updated CASSANDRA-13692:
-----------------------------------------
    Attachment: c13692-3.0-dtest-results-updated.PNG
                c13692-3.11-dtest-results-updated.PNG

Adding updated screenshots from CI dtest results.
[jira] [Commented] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions
[ https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16155061#comment-16155061 ] Dimitar Dimitrov commented on CASSANDRA-13692: -- Here's the table with the updated test results (in bold, trunk dtest stability issues notwithstanding): | [2.2|https://github.com/apache/cassandra/compare/cassandra-2.2...dimitarndimitrov:c13692-2.2] | [testall|^c13692-2.2-testall-results.PNG] | [dtest|^c13692-2.2-dtest-results.PNG] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-2.2_dtest/lastCompletedBuild/testReport/]) | | [3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...dimitarndimitrov:c13692-3.0] | [testall|^c13692-3.0-testall-results.PNG] | [*dtest*|^c13692-3.0-dtest-results-updated.PNG] ([*dtest-baseline*|https://cassci.datastax.com/job/cassandra-3.0_dtest/989/testReport/]) | | [3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...dimitarndimitrov:c13692-3.11] | [testall|^c13692-3.11-testall-results.PNG] ([testall-baseline|https://cassci.datastax.com/job/cassandra-3.11_testall/lastCompletedBuild/testReport/]) | [*dtest*|^c13692-3.11-dtest-results-updated.PNG] ([*dtest-baseline*|https://cassci.datastax.com/job/cassandra-3.11_dtest/165/testReport/]) | | [trunk|https://github.com/apache/cassandra/compare/trunk...dimitarndimitrov:c13692] | [testall|^c13692-testall-results.PNG] | [dtest|^c13692-dtest-results.PNG] ([dtest-baseline|https://cassci.datastax.com/job/trunk_dtest/lastCompletedBuild/testReport/]) | > CompactionAwareWriter_getWriteDirectory throws incompatible exceptions > -- > > Key: CASSANDRA-13692 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13692 > Project: Cassandra > Issue Type: Bug > Components: Compaction >Reporter: Hao Zhong >Assignee: Dimitar Dimitrov > Labels: lhf > Attachments: c13692-2.2-dtest-results.PNG, > c13692-2.2-testall-results.PNG, c13692-3.0-dtest-results.PNG, > c13692-3.0-testall-results.PNG, c13692-3.11-dtest-results.PNG, > 
c13692-3.11-testall-results.PNG, c13692-dtest-results.PNG, > c13692-testall-results.PNG > > > CompactionAwareWriter.getWriteDirectory throws RuntimeException: > {code} > public Directories.DataDirectory getWriteDirectory(Iterable<SSTableReader> sstables, long estimatedWriteSize) > { > File directory = null; > for (SSTableReader sstable : sstables) > { > if (directory == null) > directory = sstable.descriptor.directory; > if (!directory.equals(sstable.descriptor.directory)) > { > logger.trace("All sstables not from the same disk - putting > results in {}", directory); > break; > } > } > Directories.DataDirectory d = > getDirectories().getDataDirectoryForFile(directory); > if (d != null) > { > long availableSpace = d.getAvailableSpace(); > if (availableSpace < estimatedWriteSize) > throw new RuntimeException(String.format("Not enough space to > write %s to %s (%s available)", > > FBUtilities.prettyPrintMemory(estimatedWriteSize), > d.location, > > FBUtilities.prettyPrintMemory(availableSpace))); > logger.trace("putting compaction results in {}", directory); > return d; > } > d = getDirectories().getWriteableLocation(estimatedWriteSize); > if (d == null) > throw new RuntimeException(String.format("Not enough disk space > to store %s", > > FBUtilities.prettyPrintMemory(estimatedWriteSize))); > return d; > } > {code} > However, the thrown exception does not trigger the failure policy. > CASSANDRA-11448 fixed a similar problem. 
The buggy code is: > {code} > protected Directories.DataDirectory getWriteDirectory(long writeSize) > { > Directories.DataDirectory directory = > getDirectories().getWriteableLocation(writeSize); > if (directory == null) > throw new RuntimeException("Insufficient disk space to write " + > writeSize + " bytes"); > return directory; > } > {code} > The fixed code is: > {code} > protected Directories.DataDirectory getWriteDirectory(long writeSize) > { > Directories.DataDirectory directory = > getDirectories().getWriteableLocation(writeSize); > if (directory == null) > throw new FSWriteError(new IOException("Insufficient disk space > to write " + writeSize + " bytes"), ""); > return directory; > } > {code} > The fixed code throws FSWE and triggers the failure policy.
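The difference between the two snippets quoted above can be sketched outside of Cassandra. The following is a hypothetical stand-in (the `FSWriteError` class and the policy check here are simplified mock-ups, not Cassandra's real implementations): the disk failure policy only reacts to file-system-level errors, so a plain RuntimeException about disk space slips past it, while an FSWriteError wrapping an IOException is recognized and handled.

```java
import java.io.IOException;

// Hypothetical stand-ins (NOT Cassandra's real classes) sketching why the fix
// matters: the disk failure policy reacts to storage-level errors, so a plain
// RuntimeException about disk space is ignored, while an FSWriteError wrapping
// an IOException is recognized and handled.
public class FailurePolicySketch
{
    // Stand-in for org.apache.cassandra.io.FSWriteError
    static class FSWriteError extends RuntimeException
    {
        FSWriteError(Throwable cause, String path)
        {
            super("FSWriteError in " + path, cause);
        }
    }

    // Stand-in for the inspection done before applying the disk failure policy:
    // walk the cause chain looking for a storage-level error.
    static boolean triggersDiskFailurePolicy(Throwable t)
    {
        for (Throwable c = t; c != null; c = c.getCause())
            if (c instanceof FSWriteError || c instanceof IOException)
                return true;
        return false;
    }

    // The shape of the exception before the fix (the complaint in this ticket)...
    static RuntimeException beforeFix(long writeSize)
    {
        return new RuntimeException("Insufficient disk space to write " + writeSize + " bytes");
    }

    // ...and after the fix, following the pattern from CASSANDRA-11448.
    static RuntimeException afterFix(long writeSize)
    {
        return new FSWriteError(new IOException("Insufficient disk space to write " + writeSize + " bytes"), "");
    }

    public static void main(String[] args)
    {
        System.out.println(triggersDiskFailurePolicy(beforeFix(1024)));  // false
        System.out.println(triggersDiskFailurePolicy(afterFix(1024)));   // true
    }
}
```

The wrapping in `afterFix` is why the second snippet above "triggers the failure policy": the interesting information travels in the cause chain, not in the exception's own type name.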
[jira] [Comment Edited] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions
[ https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16154343#comment-16154343 ] Dimitar Dimitrov edited comment on CASSANDRA-13692 at 9/5/17 9:50 PM: -- I ended up rebasing the 3.0, 3.11, and trunk versions of the change, and the results are as follows: * No more consistency_test.TestConsistency.test_13747 failures for the 3.0 and 3.11 changes. Small differences between actual and expected dtest failures, which seem to be flaking out. * The trunk changes are still hitting what seems to be an existing problem plaguing cassci.datastax.com trunk dtest jobs for the last 10 days or so (see https://cassci.datastax.com/job/trunk_dtest/). ** The problem is currently being investigated. Now it looks like the 2.2, 3.0, and 3.11 changes can be accepted as passing the testall and dtest criteria, only the trunk changes need to be verified, after the problem affecting trunk gets resolved. I'll post an update once this happens, but in the meantime, it's possible to mark this as ready for review. was (Author: dimitarndimitrov): I ended up rebasing the 3.0, 3.11, and trunk versions of the change, and the results are as follows: * No more consistency_test.TestConsistency.test_13747 failures for the 3.0 and 3.11 changes. Small differences between actual and expected dtest failures, which seem to be flaking out. * The trunk changes are still hitting what seems to be an existing problem plaguing cassci.datastax.com trunk dtest jobs for the last 10 days or so (see https://cassci.datastax.com/job/trunk_dtest/). ** The problem is currently being investigated. Now it looks like the 2.2, 3.0, and 3.11 changes can be accepted as passing the testall and dtest criteria, only the trunk changes need to be verified, after the problem affecting trunk gets resolved. I'll post an update once this happens, but in the meantime, I'd assume it's safe to mark this as "Awaiting Feedback". 
[jira] [Commented] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions
[ https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16154343#comment-16154343 ] Dimitar Dimitrov commented on CASSANDRA-13692: -- I ended up rebasing the 3.0, 3.11, and trunk versions of the change, and the results are as follows: * No more consistency_test.TestConsistency.test_13747 failures for the 3.0 and 3.11 changes. Small differences between actual and expected dtest failures, which seem to be flaking out. * The trunk changes are still hitting what seems to be an existing problem plaguing cassci.datastax.com trunk dtest jobs for the last 10 days or so (see https://cassci.datastax.com/job/trunk_dtest/). ** The problem is currently being investigated. Now it looks like the 2.2, 3.0, and 3.11 changes can be accepted as passing the testall and dtest criteria, only the trunk changes need to be verified, after the problem affecting trunk gets resolved. I'll post an update once this happens, but in the meantime, I'd assume it's safe to mark this as "Awaiting Feedback". 
-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-13703) Using min_compress_ratio <= 1 causes corruption
[ https://issues.apache.org/jira/browse/CASSANDRA-13703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16153675#comment-16153675 ] Dimitar Dimitrov edited comment on CASSANDRA-13703 at 9/5/17 1:56 PM: -- +1 - from my still largely layman perspective, the change looks good. A couple of small nits: * {{CompressedChunkReader.maybeCheckCrc()}} (renamed in this patch to {{shouldCheckCrc()}}) may be removed or at least its visibility could be reduced. It seems to be used solely in CompressedChunkReader.java, lines 158 and 204 in this patch. * The no-argument / single-argument factory methods for Snappy and LZ4 compressions in CompressionParams.java seem to differ in the values that they default to for min compression ratio and max compressed length. ** For Snappy, if nothing is specified, or only chunk length is specified, a default min compression ratio of 1.1 is used, and therefore max compressed length ends up somewhere roughly around 90% of chunk length. ** For LZ4, if nothing is specified, or only chunk length is specified, a default max compressed length of chunk length is used, and therefore min compression ratio ends up at 1.0 (I'm not sure if a precision error is possible there). Edit: Of course, take this review with the appropriate rock-sized grain of salt. was (Author: dimitarndimitrov): +1 - from my still largely layman perspective, the change looks good. A couple of small nits: * {{CompressedChunkReader.maybeCheckCrc()}} (renamed in this patch to {{shouldCheckCrc()}}) may be removed or at least its visibility could be reduced. It seems to be used solely in CompressedChunkReader.java, lines 158 and 204 in this patch. * The no-argument / single-argument factory methods for Snappy and LZ4 compressions in CompressionParams.java seem to differ in the values that they default to for min compression ratio and max compressed length. 
** For Snappy, if nothing is specified, or only chunk length is specified, a default min compression ratio of 1.1 is used, and therefore max compressed length ends up somewhere roughly around 90% of chunk length. ** For LZ4, if nothing is specified, or only chunk length is specified, a default max compressed length of chunk length is used, and therefore min compression ratio ends up at 1.0 (I'm not sure if a precision error is possible there). > Using min_compress_ratio <= 1 causes corruption > --- > > Key: CASSANDRA-13703 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13703 > Project: Cassandra > Issue Type: Bug >Reporter: Branimir Lambov >Assignee: Branimir Lambov >Priority: Blocker > Fix For: 4.x > > Attachments: patch > > > This is because chunks written uncompressed end up below the compressed size > threshold. Demonstrated by applying the attached patch meant to improve the > testing of the 10520 changes, and running > {{CompressedSequentialWriterTest.testLZ4Writer}}. > The default {{min_compress_ratio: 0}} is not affected as it never writes > uncompressed. 
[jira] [Commented] (CASSANDRA-13703) Using min_compress_ratio <= 1 causes corruption
[ https://issues.apache.org/jira/browse/CASSANDRA-13703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16153675#comment-16153675 ] Dimitar Dimitrov commented on CASSANDRA-13703: -- +1 - from my still largely layman perspective, the change looks good. A couple of small nits: * {{CompressedChunkReader.maybeCheckCrc()}} (renamed in this patch to {{shouldCheckCrc()}}) may be removed or at least its visibility could be reduced. It seems to be used solely in CompressedChunkReader.java, lines 158 and 204 in this patch. * The no-argument / single-argument factory methods for Snappy and LZ4 compressions in CompressionParams.java seem to differ in the values that they default to for min compression ratio and max compressed length. ** For Snappy, if nothing is specified, or only chunk length is specified, a default min compression ratio of 1.1 is used, and therefore max compressed length ends up somewhere roughly around 90% of chunk length. ** For LZ4, if nothing is specified, or only chunk length is specified, a default max compressed length of chunk length is used, and therefore min compression ratio ends up at 1.0 (I'm not sure if a precision error is possible there). 
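The Snappy vs. LZ4 defaults described in the review nits above come down to simple arithmetic relating min compression ratio and max compressed length. This is an illustrative sketch only; the formula and the rounding are assumed from the comment, not taken from Cassandra's actual CompressionParams code:

```java
// Illustrative sketch of the relationship described in the review comment
// above; the exact rounding in CompressionParams may differ. A chunk is
// written compressed only if its compressed size falls below the threshold.
public class CompressionDefaultsSketch
{
    // Assumed derivation: threshold = chunkLength / minCompressRatio.
    static int maxCompressedLength(int chunkLength, double minCompressRatio)
    {
        return (int) Math.ceil(chunkLength / minCompressRatio);
    }

    public static void main(String[] args)
    {
        int chunk = 64 * 1024; // a common default chunk length, 64 KiB

        // Snappy-style default: min ratio 1.1 => threshold roughly 90% of the chunk.
        System.out.println(maxCompressedLength(chunk, 1.1)); // 59579 (~90.9% of 65536)

        // LZ4-style default: threshold == chunk length => effective min ratio 1.0,
        // which is where the "possible precision error" concern above comes from.
        System.out.println(maxCompressedLength(chunk, 1.0)); // 65536
    }
}
```

Run either direction of the arithmetic and the asymmetry the nit points out is visible: one codec family fixes the ratio and derives the threshold, the other fixes the threshold and implies the ratio.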
[jira] [Commented] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions
[ https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16152425#comment-16152425 ] Dimitar Dimitrov commented on CASSANDRA-13692: -- All observed {{dtest}} failures, including the consistency_test.TestConsistency.test_13747 failures, reproduce exactly the same on brand new copies of the cassandra-3.11 and trunk branches of my apache/cassandra fork. My fork is by now tens of commits behind the origin, so I'll update the fork, and re-run the CI jobs for the 3.0, 3.11, and trunk branches. I'm expecting to see a much more consistent picture this time. 
[jira] [Comment Edited] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions
[ https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16150355#comment-16150355 ] Dimitar Dimitrov edited comment on CASSANDRA-13692 at 9/1/17 12:03 PM: --- Some additional observations, after taking yet another look at the test results: * Although very similar, the 3.11 {{testall}} failures are not exactly the same as the ones in the baseline. * The trunk {{dtest}} failures seem to diverge from the pattern of "common-expected-to-be-unrelated failures plus test_13747 failures". I'll try to see whether this can be attributed to flakiness (looking closer at the results, re-running the CI run on the same branch, running another CI run on a clean branch copy of the trunk, etc.) was (Author: dimitarndimitrov): Some additional observations, after taking yet another look at the test results: * Although very similar, the 3.11 {{testall}} failures are not exactly the same as the ones in the baseline. * The trunk {{dtest}} failures seem to diverge from the pattern of "common-expected-to-be-unrelated failures plus test_13747 failures". I'll try to see whether this can be attributed to flakiness (looking closer to the results, re-running the CI run on the same branch, running another CI run on a clean branch copy of the trunk, etc.) 
[jira] [Commented] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions
[ https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16150351#comment-16150351 ] Dimitar Dimitrov commented on CASSANDRA-13692: -- Ah, sorry, I should have at least attached the failure logs - I didn't attach the build artifacts, as I wasn't sure if they were sanitized with regard to non-public data. I'll sync with a more knowledgeable colleague, and get back to you with the necessary info. P.S. Like you've probably noticed, I'm still new to one of the more visible presences here, and many of the steps in the process are a bit hazy to me - I'll make sure to improve quickly on that though :) 
[jira] [Commented] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions
[ https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16150355#comment-16150355 ] Dimitar Dimitrov commented on CASSANDRA-13692: -- Some additional observations, after taking yet another look at the test results: * Although very similar, the 3.11 {{testall}} failures are not exactly the same as the ones in the baseline. * The trunk {{dtest}} failures seem to diverge from the pattern of "common-expected-to-be-unrelated failures plus test_13747 failures". I'll try to see whether this can be attributed to flakiness (looking closer to the results, re-running the CI run on the same branch, running another CI run on a clean branch copy of the trunk, etc.) 
[jira] [Comment Edited] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions
[ https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16150212#comment-16150212 ] Dimitar Dimitrov edited comment on CASSANDRA-13692 at 9/1/17 8:51 AM: -- Okay, here are the branches with the proposed changes: | [2.2|https://github.com/apache/cassandra/compare/cassandra-2.2...dimitarndimitrov:c13692-2.2] | [testall|^c13692-2.2-testall-results.PNG] | [dtest|^c13692-2.2-dtest-results.PNG] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-2.2_dtest/lastCompletedBuild/testReport/]) | | [3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...dimitarndimitrov:c13692-3.0] | [testall|^c13692-3.0-testall-results.PNG] | [dtest|^c13692-3.0-dtest-results.PNG] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.0_dtest/lastCompletedBuild/testReport/]) | | [3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...dimitarndimitrov:c13692-3.11] | [testall|^c13692-3.11-testall-results.PNG] ([testall-baseline|https://cassci.datastax.com/job/cassandra-3.11_testall/lastCompletedBuild/testReport/]) | [dtest|^c13692-3.11-dtest-results.PNG] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.11_dtest/lastCompletedBuild/testReport/]) | | [trunk|https://github.com/apache/cassandra/compare/trunk...dimitarndimitrov:c13692] | [testall|^c13692-testall-results.PNG] | [dtest|^c13692-dtest-results.PNG] ([dtest-baseline|https://cassci.datastax.com/job/trunk_dtest/lastCompletedBuild/testReport/]) | {{testall}} results look good for all branches, but there's a common theme of consistency_test.TestConsistency.test_13747 dtests failing, in addition to the common-expected-to-be-unrelated {{dtest}} failures. My assumption is that this is related to CASSANDRA-13747 (the comments there seem to corroborate that). [~iamaleksey], do you have an idea if that could be the case? 
> CompactionAwareWriter_getWriteDirectory throws incompatible exceptions
> --
>
> Key: CASSANDRA-13692
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13692
> Project: Cassandra
> Issue Type: Bug
> Components: Compaction
> Reporter: Hao Zhong
> Assignee: Dimitar Dimitrov
> Labels: lhf
> Attachments: c13692-2.2-dtest-results.PNG, c13692-2.2-testall-results.PNG, c13692-3.0-dtest-results.PNG, c13692-3.0-testall-results.PNG, c13692-3.11-dtest-results.PNG, c13692-3.11-testall-results.PNG, c13692-dtest-results.PNG, c13692-testall-results.PNG
>
> CompactionAwareWriter.getWriteDirectory throws RuntimeException:
> {code}
> public Directories.DataDirectory getWriteDirectory(Iterable<SSTableReader> sstables, long estimatedWriteSize)
> {
>     File directory = null;
>     for (SSTableReader sstable : sstables)
>     {
>         if (directory == null)
>             directory = sstable.descriptor.directory;
>         if (!directory.equals(sstable.descriptor.directory))
>         {
>             logger.trace("All sstables not from the same disk - putting results in {}",
[jira] [Updated] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions
[ https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dimitar Dimitrov updated CASSANDRA-13692:
-
Attachment: c13692-2.2-dtest-results.PNG
            c13692-2.2-testall-results.PNG
            c13692-3.0-dtest-results.PNG
            c13692-3.0-testall-results.PNG
            c13692-3.11-dtest-results.PNG
            c13692-3.11-testall-results.PNG
            c13692-dtest-results.PNG
            c13692-testall-results.PNG
Adding screenshots from CI results.
> CompactionAwareWriter_getWriteDirectory throws incompatible exceptions
> --
>
> Key: CASSANDRA-13692
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13692
> Project: Cassandra
> Issue Type: Bug
> Components: Compaction
> Reporter: Hao Zhong
> Assignee: Dimitar Dimitrov
> Labels: lhf
> Attachments: c13692-2.2-dtest-results.PNG, c13692-2.2-testall-results.PNG, c13692-3.0-dtest-results.PNG, c13692-3.0-testall-results.PNG, c13692-3.11-dtest-results.PNG, c13692-3.11-testall-results.PNG, c13692-dtest-results.PNG, c13692-testall-results.PNG
>
> CompactionAwareWriter.getWriteDirectory throws RuntimeException:
> {code}
> public Directories.DataDirectory getWriteDirectory(Iterable<SSTableReader> sstables, long estimatedWriteSize)
> {
>     File directory = null;
>     for (SSTableReader sstable : sstables)
>     {
>         if (directory == null)
>             directory = sstable.descriptor.directory;
>         if (!directory.equals(sstable.descriptor.directory))
>         {
>             logger.trace("All sstables not from the same disk - putting results in {}", directory);
>             break;
>         }
>     }
>     Directories.DataDirectory d = getDirectories().getDataDirectoryForFile(directory);
>     if (d != null)
>     {
>         long availableSpace = d.getAvailableSpace();
>         if (availableSpace < estimatedWriteSize)
>             throw new RuntimeException(String.format("Not enough space to write %s to %s (%s available)",
>                                                      FBUtilities.prettyPrintMemory(estimatedWriteSize),
>                                                      d.location,
>                                                      FBUtilities.prettyPrintMemory(availableSpace)));
>         logger.trace("putting compaction results in {}", directory);
>         return d;
>     }
>     d = getDirectories().getWriteableLocation(estimatedWriteSize);
>     if (d == null)
>         throw new RuntimeException(String.format("Not enough disk space to store %s",
>                                                  FBUtilities.prettyPrintMemory(estimatedWriteSize)));
>     return d;
> }
> {code}
> However, the thrown exception does not trigger the failure policy. CASSANDRA-11448 fixed a similar problem. The buggy code is:
> {code}
> protected Directories.DataDirectory getWriteDirectory(long writeSize)
> {
>     Directories.DataDirectory directory = getDirectories().getWriteableLocation(writeSize);
>     if (directory == null)
>         throw new RuntimeException("Insufficient disk space to write " + writeSize + " bytes");
>     return directory;
> }
> {code}
> The fixed code is:
> {code}
> protected Directories.DataDirectory getWriteDirectory(long writeSize)
> {
>     Directories.DataDirectory directory = getDirectories().getWriteableLocation(writeSize);
>     if (directory == null)
>         throw new FSWriteError(new IOException("Insufficient disk space to write " + writeSize + " bytes"), "");
>     return directory;
> }
> {code}
> The fixed code throws FSWriteError and triggers the failure policy.
was (Author: dimitarndimitrov): Okay, here are the branches with the proposed changes: | [2.2|https://github.com/apache/cassandra/compare/cassandra-2.2...dimitarndimitrov:c13692-2.2] | [testall|^c13692-2.2-testall-results.png] | [dtest|^c13692-2.2-dtest-results.png] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-2.2_dtest/lastCompletedBuild/testReport/]) | | [3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...dimitarndimitrov:c13692-3.0] | [testall|^c13692-3.0-testall-results.png] | [dtest|^c13692-3.0-dtest-results.png] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.0_dtest/lastCompletedBuild/testReport/]) | [3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...dimitarndimitrov:c13692-3.11] | [testall|^c13692-3.11-testall-results.png] ([testall-baseline|https://cassci.datastax.com/job/cassandra-3.11_testall/lastCompletedBuild/testReport/]) | [dtest|^c13692-3.11-dtest-results.png] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.11_dtest/lastCompletedBuild/testReport/]) | | [trunk|https://github.com/apache/cassandra/compare/trunk...dimitarndimitrov:c13692] | [testall|^c13692-testall-results.png] | [dtest|^c13692-dtest-results.png] ([dtest-baseline|https://cassci.datastax.com/job/trunk_dtest/lastCompletedBuild/testReport/]) | {{testall}} looks good for all branches, but there's a common theme of consistency_test.TestConsistency.test_13747 {{dtest}}s failing, in addition to the common-expected-to-be-unrelated {{dtest}} failures. My assumption is that this is related to CASSANDRA-13747 (the comments there seem to corroborate that). [~iamaleksey] , do you have an idea if that could be the case? 
[jira] [Commented] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions
[ https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16150212#comment-16150212 ] Dimitar Dimitrov commented on CASSANDRA-13692: -- Okay, here are the branches with the proposed changes: | [2.2|https://github.com/apache/cassandra/compare/cassandra-2.2...dimitarndimitrov:c13692-2.2] | [testall|^c13692-2.2-testall-results.png] | [dtest|^c13692-2.2-dtest-results.png] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-2.2_dtest/lastCompletedBuild/testReport/]) | | [3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...dimitarndimitrov:c13692-3.0] | [testall|^c13692-3.0-testall-results.png] | [dtest|^c13692-3.0-dtest-results.png] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.0_dtest/lastCompletedBuild/testReport/]) | | [3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...dimitarndimitrov:c13692-3.11] | [testall|^c13692-3.11-testall-results.png] ([testall-baseline|https://cassci.datastax.com/job/cassandra-3.11_testall/lastCompletedBuild/testReport/]) | [dtest|^c13692-3.11-dtest-results.png] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.11_dtest/lastCompletedBuild/testReport/]) | | [trunk|https://github.com/apache/cassandra/compare/trunk...dimitarndimitrov:c13692] | [testall|^c13692-testall-results.png] | [dtest|^c13692-dtest-results.png] ([dtest-baseline|https://cassci.datastax.com/job/trunk_dtest/lastCompletedBuild/testReport/]) | {{testall}} looks good for all branches, but there's a common theme of consistency_test.TestConsistency.test_13747 {{dtest}}s failing, in addition to the common-expected-to-be-unrelated {{dtest}} failures. My assumption is that this is related to CASSANDRA-13747 (the comments there seem to corroborate that). [~iamaleksey] , do you have an idea if that could be the case? 
> CompactionAwareWriter_getWriteDirectory throws incompatible exceptions
> ----------------------------------------------------------------------
>
>                 Key: CASSANDRA-13692
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13692
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Compaction
>            Reporter: Hao Zhong
>            Assignee: Dimitar Dimitrov
>              Labels: lhf
>
> The CompactionAwareWriter_getWriteDirectory throws RuntimeException:
> {code}
> public Directories.DataDirectory getWriteDirectory(Iterable<SSTableReader> sstables, long estimatedWriteSize)
> {
>     File directory = null;
>     for (SSTableReader sstable : sstables)
>     {
>         if (directory == null)
>             directory = sstable.descriptor.directory;
>         if (!directory.equals(sstable.descriptor.directory))
>         {
>             logger.trace("All sstables not from the same disk - putting results in {}", directory);
>             break;
>         }
>     }
>     Directories.DataDirectory d = getDirectories().getDataDirectoryForFile(directory);
>     if (d != null)
>     {
>         long availableSpace = d.getAvailableSpace();
>         if (availableSpace < estimatedWriteSize)
>             throw new RuntimeException(String.format("Not enough space to write %s to %s (%s available)",
>                                                      FBUtilities.prettyPrintMemory(estimatedWriteSize),
>                                                      d.location,
>                                                      FBUtilities.prettyPrintMemory(availableSpace)));
>         logger.trace("putting compaction results in {}", directory);
>         return d;
>     }
>     d = getDirectories().getWriteableLocation(estimatedWriteSize);
>     if (d == null)
>         throw new RuntimeException(String.format("Not enough disk space to store %s",
>                                                  FBUtilities.prettyPrintMemory(estimatedWriteSize)));
>     return d;
> }
> {code}
> However, the thrown exception does not trigger the failure policy. CASSANDRA-11448 fixed a similar problem.
> The buggy code is:
> {code}
> protected Directories.DataDirectory getWriteDirectory(long writeSize)
> {
>     Directories.DataDirectory directory = getDirectories().getWriteableLocation(writeSize);
>     if (directory == null)
>         throw new RuntimeException("Insufficient disk space to write " + writeSize + " bytes");
>     return directory;
> }
> {code}
> The fixed code is:
> {code}
> protected Directories.DataDirectory getWriteDirectory(long writeSize)
> {
>     Directories.DataDirectory directory = getDirectories().getWriteableLocation(writeSize);
>     if (directory == null)
>         throw new FSWriteError(new IOException("Insufficient disk space to write " + writeSize + " bytes"), "");
>     return directory;
> }
> {code}
> The fixed code throws FSWE and triggers the failure policy.
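The distinction the ticket turns on can be sketched in a few lines: a central error inspector can only apply the configured disk failure policy to exceptions it can recognize, so a typed FSWriteError (carrying the failed path and the I/O cause) activates it, while a bare RuntimeException sails past. The classes below are simplified stand-ins, not Cassandra's real FSWriteError or JVMStabilityInspector.

```java
import java.io.IOException;

// Minimal sketch (hypothetical stand-in classes) of why the exception type
// matters for the disk failure policy.
public class FailurePolicySketch {
    // Stand-in for org.apache.cassandra.io.FSWriteError: wraps the I/O cause
    // and remembers the path that could not be written.
    static class FSWriteError extends RuntimeException {
        final String path;
        FSWriteError(IOException cause, String path) {
            super(path + ": " + cause.getMessage(), cause);
            this.path = path;
        }
    }

    // Stand-in for a central error inspector: only a recognized file-system
    // error type can trigger the policy; a bare RuntimeException cannot.
    static boolean triggersFailurePolicy(Throwable t) {
        return t instanceof FSWriteError;
    }

    public static void main(String[] args) {
        RuntimeException bare =
            new RuntimeException("Insufficient disk space to write 1024 bytes");
        FSWriteError typed =
            new FSWriteError(new IOException("Insufficient disk space"), "/var/lib/cassandra/data");

        System.out.println(triggersFailurePolicy(bare));  // false: policy bypassed
        System.out.println(triggersFailurePolicy(typed)); // true: policy can act on the disk
    }
}
```

This is also why wrapping the message in an IOException in the fixed code is not incidental: the cause chain is what lets the handler treat it as a file-system failure rather than a generic crash.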
[jira] [Comment Edited] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions
[ https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16146000#comment-16146000 ] Dimitar Dimitrov edited comment on CASSANDRA-13692 at 8/30/17 8:37 AM: --- Sorry for the late promised update. I'm currently finalizing the test, which is a bit of a mix between {{OutOfSpaceTest}}, {{CompactionAwareWriterTest}}, and {{CompactionsBytemanTest}}. I reckon that (1) the {{OutOfSpaceTest}} approach translates well enough for checking whether failure policies are correctly triggered in this case; (2) the trickiest part (for me) would be to figure out whether the way {{CompactionAwareWriter#getWriteDirectory(Iterable, long)}} gets reached and executed, is correct and relevant. was (Author: dimitarndimitrov): Sorry for the late promised update. I'm currently finalizing the test, which is a bit of a mix between {{OutOfSpaceTest}}, {{CompactionAwareWriterTest}}, and {{CompactionsBytemanTest}}. I reckon that (1) the {{OutOfSpaceTest}}s approach translates well enough for checking whether failure policies are correctly triggered in this case; (2) the trickiest part (for me) would be to figure out whether the way {{CompactionAwareWriter#getWriteDirectory(Iterable, long)}} gets reached and executed, is correct and relevant. 
-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions
[ https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16146000#comment-16146000 ] Dimitar Dimitrov commented on CASSANDRA-13692: -- Sorry for the late promised update. I'm currently finalizing the test, which is a bit of a mix between {{OutOfSpaceTest}}, {{CompactionAwareWriterTest}}, and {{CompactionsBytemanTest}}. I reckon that (1) the {{OutOfSpaceTest}}s approach translates well enough for checking whether failure policies are correctly triggered in this case; (2) the trickiest part (for me) would be to figure out whether the way {{CompactionAwareWriter#getWriteDirectory(Iterable, long)}} gets reached and executed, is correct and relevant.
[jira] [Comment Edited] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions
[ https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16143538#comment-16143538 ] Dimitar Dimitrov edited comment on CASSANDRA-13692 at 8/28/17 9:12 AM: --- It looks like a good approach here would be to write a test reproducing the problem (with one test for each branch throwing a {{RuntimeException}}), then apply what looks like the obvious solution (replacing the {{RuntimeExceptions}} with {{FSWriteErrors}}) and see if the failure policy is triggered. {{org.apache.cassandra.cql3.OutOfSpaceTest}} may be a good starting point, but I'll also need to understand better under what conditions exactly is {{CompactionAwareWriter#getWriteDirectory(Iterable, long)}} triggered. I'll post another update later today. was (Author: dimitarndimitrov): It looks like a good approach here would be to write a test reproducing the problem (with one test for each branch throwing a {{RuntimeException}}), then apply what looks like the obvious solution (replacing the {{RuntimeExceptions }}with {{FSWriteErrors}}) and see if the failure policy is triggered. {{org.apache.cassandra.cql3.OutOfSpaceTest}} may be a good starting point, but I'll also need to understand better under what conditions exactly is {{CompactionAwareWriter#getWriteDirectory(Iterable, long)}} triggered. I'll post another update later today. 
[jira] [Commented] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions
[ https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16143538#comment-16143538 ] Dimitar Dimitrov commented on CASSANDRA-13692: -- It looks like a good approach here would be to write a test reproducing the problem (with one test for each branch throwing a {{RuntimeException}}), then apply what looks like the obvious solution (replacing the {{RuntimeExceptions }}with {{FSWriteErrors}}) and see if the failure policy is triggered. {{org.apache.cassandra.cql3.OutOfSpaceTest}} may be a good starting point, but I'll also need to understand better under what conditions exactly is {{CompactionAwareWriter#getWriteDirectory(Iterable, long)}} triggered. I'll post another update later today.
[jira] [Assigned] (CASSANDRA-13801) CompactionManager sometimes wrongly determines that a background compaction is running for a particular table
[ https://issues.apache.org/jira/browse/CASSANDRA-13801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dimitar Dimitrov reassigned CASSANDRA-13801: Assignee: Dimitar Dimitrov > CompactionManager sometimes wrongly determines that a background compaction > is running for a particular table > - > > Key: CASSANDRA-13801 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13801 > Project: Cassandra > Issue Type: Bug > Components: Compaction >Reporter: Dimitar Dimitrov >Assignee: Dimitar Dimitrov > > Sometimes after writing different rows to a table, then doing a blocking > flush, if you alter the compaction strategy, then run background compaction > and wait for it to finish, {{CompactionManager}} may decide that there's an > ongoing compaction for that same table. > This may happen even though logs don't indicate that to be the case > (compaction may still be running for system_schema tables). -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-13801) CompactionManager sometimes wrongly determines that a background compaction is running for a particular table
Dimitar Dimitrov created CASSANDRA-13801: Summary: CompactionManager sometimes wrongly determines that a background compaction is running for a particular table Key: CASSANDRA-13801 URL: https://issues.apache.org/jira/browse/CASSANDRA-13801 Project: Cassandra Issue Type: Bug Components: Compaction Reporter: Dimitar Dimitrov Sometimes after writing different rows to a table, then doing a blocking flush, if you alter the compaction strategy, then run background compaction and wait for it to finish, {{CompactionManager}} may decide that there's an ongoing compaction for that same table. This may happen even though logs don't indicate that to be the case (compaction may still be running for system_schema tables). -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org