[jira] [Commented] (CASSANDRA-15368) Failing to flush Memtable without terminating process results in permanent data loss

2019-11-12 Thread Dimitar Dimitrov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16972639#comment-16972639
 ] 

Dimitar Dimitrov commented on CASSANDRA-15368:
--

Thanks for chasing this down, [~benedict]!

I'm glad it turned out that, as initially suspected, you're pretty good at this 
stuff, and the issue was not lurking from before, but more or less necessitated 
by the fix for CASSANDRA-15367. Then I guess it makes the most sense if you 
continue and take care of this.

> Failing to flush Memtable without terminating process results in permanent 
> data loss
> 
>
> Key: CASSANDRA-15368
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15368
> Project: Cassandra
> Issue Type: Bug
> Components: Local/Commit Log, Local/Memtable
> Reporter: Benedict Elliott Smith
> Priority: Normal
> Fix For: 4.0, 2.2.x, 3.0.x, 3.11.x
>
>
> A {{Memtable}} does not contain records that cover a precise contiguous range 
> of {{ReplayPosition}}, since there are only weak ordering constraints when 
> rolling over to a new {{Memtable}} - the last operations for the old 
> {{Memtable}} may obtain their {{ReplayPosition}} after the first operations 
> for the new {{Memtable}}.
> Unfortunately, we treat the {{Memtable}} range as contiguous, and invalidate 
> the entire range on flush.  Ordinarily we only invalidate records when all 
> prior {{Memtable}}s have also successfully flushed.  However, in the event of 
> a flush failure that does not terminate the process (either because of the 
> disk failure policy, or because it is a software error), the later flush is 
> able to invalidate the region of the commit log that includes records that 
> should have been flushed in the prior {{Memtable}}.
> More problematically, this can also occur on restart without any associated 
> flush failure, as we use commit log boundaries written to our flushed 
> sstables to filter {{ReplayPosition}} on recovery, which is meant to 
> replicate our {{Memtable}} flush behaviour above.  However, we do not know 
> that earlier flushes have completed, and they may complete successfully 
> out-of-order.  So any flush that completes before the process terminates, but 
> began after another flush that _doesn’t_ complete before the process 
> terminates, has the potential to cause permanent data loss.
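
To make the failure mode in the description concrete, here is a toy model. It is only a sketch under loud assumptions: commit log positions are plain integers, a memtable is just the set of positions it holds, and the names {{ContiguousRangeLossSketch}}, {{replayable}} and {{lost}} are made up for illustration. None of this is Cassandra code.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: positions are ints and "invalidate" drops a closed range.
final class ContiguousRangeLossSketch {
    /** Positions that replay would still deliver after invalidating [lo, hi]. */
    static Set<Integer> replayable(Set<Integer> log, int lo, int hi) {
        Set<Integer> kept = new TreeSet<>();
        for (int p : log)
            if (p < lo || p > hi)
                kept.add(p);
        return kept;
    }

    /** Records that are neither in a flushed sstable nor replayable from the log. */
    static Set<Integer> lost(Set<Integer> log, Set<Integer> flushed, int lo, int hi) {
        Set<Integer> missing = new TreeSet<>(log);
        missing.removeAll(flushed);
        missing.removeAll(replayable(log, lo, hi));
        return missing;
    }

    public static void main(String[] args) {
        Set<Integer> log = new TreeSet<>(List.of(1, 2, 3, 4, 5, 6));
        // Weak ordering at the switch: the old memtable holds {1, 2, 4}, because its
        // last write (4) got its position after the new memtable's first write (3).
        // The old memtable's flush fails, but the process keeps running.
        Set<Integer> newMemtable = Set.of(3, 5, 6); // this flush succeeds
        // The successful flush invalidates its range as if it were contiguous: [3, 6].
        System.out.println(lost(log, newMemtable, 3, 6)); // prints [4]
    }
}
```

Record 4 belongs to the memtable whose flush failed, yet it sits inside the range that the later, successful flush invalidated, so it can no longer be recovered from the commit log.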



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15368) Failing to flush Memtable without terminating process results in permanent data loss

2019-11-06 Thread Dimitar Dimitrov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16968613#comment-16968613
 ] 

Dimitar Dimitrov commented on CASSANDRA-15368:
--

Thanks for the super-quick reply, [~benedict]!
I'll definitely check out your patch for CASSANDRA-15367.
As for this problem, if my potentially taking (much) longer to fix it isn't a 
problem for you, I can certainly take a stab at it.

Also, here's the analysis that I mentioned in my previous reply - all comments 
appreciated.

h4. Defining the problem

Let's assume we have a single table (no indexes, no MVs) that's being 
continuously written to when a single flush for it is requested.
We want to examine whether the old memtable can accept a write with a higher 
commit log (CL) position while the new memtable accepts a write with a lower 
CL position - the latter also implies the old memtable rejecting that write.
 * Below we'll be calling the write with the higher CL position *HW*, its 
assigned {{OpOrder.Group}} (and the action for assigning it) *HW group*, and 
its assigned CL position (and the action for assigning it) *HW position*.
 * Similarly for the write with the lower CL position - *LW*, *LW group*, and 
*LW position*.

So to get the (un)desired ordering, we need the following specific results from 
3 executions of {{Memtable.accepts(OpOrder.Group, CommitLogPosition)}}:
 - {{oldMemtable.accepts(HW group, HW position)}} (called *HW accept?* below), 
which should return true
 - {{oldMemtable.accepts(LW group, LW position)}} (called *LW accept?* below), 
which should return false
 - {{newMemtable.accepts(LW group, LW position)}}, which should return true 
(not necessary for the analysis below)
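
For readers who don't have the method in front of them, the three checks that the analysis relies on can be modelled roughly as below. This is a single-threaded simplification written for this comment, not the actual {{Memtable.accepts}} source: groups and positions are plain longs, {{setDiscarding}} and {{setUpperBound}} are reduced to setters, and the running-maximum bookkeeping done before the bound is final is left out. {{MemtableAcceptSketch}} is a hypothetical name.

```java
// Simplified, single-threaded model of the accept decision (an assumption-laden
// sketch, not Cassandra's implementation).
final class MemtableAcceptSketch {
    private Long barrier;    // null until a flush issues the write barrier
    private Long upperBound; // UB: final upper commit log bound, null until set

    /** Models the flush setting the barrier: groups numbered below barrierAt started before it. */
    void setDiscarding(long barrierAt) { barrier = barrierAt; }

    /** Models the flush fixing the final upper commit log bound. */
    void setUpperBound(long ub) { upperBound = ub; }

    boolean accepts(long group, long position) {
        if (barrier == null)
            return true;                // no barrier yet: still taking all writes
        if (group >= barrier)
            return false;               // group started after the barrier: next memtable
        if (upperBound == null)
            return true;                // bound not final yet
        return position <= upperBound;  // final bound decides
    }

    public static void main(String[] args) {
        MemtableAcceptSketch old = new MemtableAcceptSketch();
        System.out.println(old.accepts(1, 10)); // true: barrier not set
        old.setDiscarding(5);
        old.setUpperBound(20);
        System.out.println(old.accepts(3, 15)); // true: before barrier, within UB
        System.out.println(old.accepts(3, 25)); // false: before barrier, beyond UB
        System.out.println(old.accepts(7, 15)); // false: group after barrier
    }
}
```

In this model, once the barrier and UB are both visible and both groups precede the barrier, the answer is monotone in the position: a write at a higher position can only be accepted if every lower position would be accepted too.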

h4. Some constraints

 A. For each of the writes, the {{OpOrder.Group}} assignment happens-before the 
CL position allocation for the corresponding write, which happens-before the 
{{Memtable.accepts(OpOrder.Group, CommitLogPosition)}} call for the 
corresponding write.
 * HW group --hb-> HW position --hb-> HW accept?
 * LW group --hb-> LW position --hb-> LW accept?

B. The CL position allocations are totally (and numerically) ordered by 
happens-before, due to the way {{CommitLogSegment}}-s are advanced and the way 
their internal {{allocatePosition}} markers are CAS-ed.
 * LW position --hb-> HW position
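
As a rough illustration of constraint B, a CAS-advanced allocation marker can be sketched as follows. The class name, the {{capacity}} parameter and the -1 "segment full" convention are assumptions made for the example; the only point is that each successful {{compareAndSet}} hands out a strictly higher offset than every earlier one, which is what gives the total order.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model of a segment whose allocation marker is advanced by CAS.
final class AllocatePositionSketch {
    private final AtomicInteger allocatePosition = new AtomicInteger();
    private final int capacity;

    AllocatePositionSketch(int capacity) {
        this.capacity = capacity;
    }

    /** Reserves size bytes; returns the start offset, or -1 if the segment is full. */
    int allocate(int size) {
        while (true) {
            int prev = allocatePosition.get();
            int next = prev + size;
            if (next > capacity)
                return -1; // a real log would roll over to a new segment here
            if (allocatePosition.compareAndSet(prev, next))
                return prev; // we won the race: [prev, next) is ours
            // another writer advanced the marker first: retry from its value
        }
    }

    public static void main(String[] args) {
        AllocatePositionSketch segment = new AllocatePositionSketch(16);
        System.out.println(segment.allocate(10)); // 0
        System.out.println(segment.allocate(5));  // 10
        System.out.println(segment.allocate(5));  // -1, only 1 byte left
    }
}
```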

C. If {{writeBarrier.issue()}} in the {{Flush}} ctor happens-before HW group, 
then the final upper CL bound for the old memtable (called *UB* below) has been 
set, and is guaranteed to be less than HW position, but then HW accept? is 
guaranteed to return false (because it will see {{writeBarrier}} as not 
{{null}}, and HW position would be guaranteed to be more than UB) => 
contradiction
 * If {{writeBarrier.issue()}} --hb-> HW group => UB --hb-> HW group => UB 
--hb-> HW position => contradiction
 * Therefore HW group --hb-> {{writeBarrier.issue()}}
 * Note that this was not true before the fix for CASSANDRA-8383.

D. If {{writeBarrier.issue()}} happens-before LW group, then UB has been set, 
and is guaranteed to be less than LW position, and therefore less than HW 
position. Also {{writeBarrier.issue()}} would happen-before HW position, which 
would happen-before HW accept?. That means that HW accept? will see 
{{writeBarrier}} as not {{null}}, and UB as set and less than HW position, so 
is guaranteed to return false => contradiction
 * If {{writeBarrier.issue()}} --hb-> LW group => UB --hb-> LW position --hb-> 
HW position && {{writeBarrier.issue()}} --hb-> HW accept? => contradiction
 * Therefore LW group --hb-> {{writeBarrier.issue()}}

E. As a corollary of C. and D., LW group and HW group should both be before the 
barrier issued by the flush, and therefore *the placements of LW and HW will 
both be determined by LW position, HW position, and UB*.

h4. The case work

In order for HW accept? to return true:
# ...it could be seeing {{writeBarrier}} as {{null}}, which means that it 
started before the {{writeBarrier}} is set in {{oldMemtable.setDiscarding}}.
## This implies that LW accept? started after HW accept? started - otherwise 
LW accept? would also have seen {{writeBarrier}} as {{null}} and returned true 
already => contradiction
## So LW accept? started after HW accept? started, and needs to return false 
because of LW position (see E. for why it cannot be due to LW group).
 This could happen only if UB has been set and is less than LW position. But as 
setting UB happens after {{oldMemtable.setDiscarding}}, and HW accept? had 
started before the {{writeBarrier}} is set in {{oldMemtable.setDiscarding}}, UB 
should be at least HW position, which is more than LW position => contradiction
#* If HW accept? start --hb-> {{writeBarrier}} set in 
{{oldMemtable.setDiscarding}} => HW position --hb-> {{writeBarrier}} set in 
{{oldMemtable.setDiscarding}} --hb-> UB => LW position --hb-> UB => 
contradiction
#* Therefore {{writeBarrier}} set in {{oldMemtable.setDiscarding}} --hb-> HW 
accept? start
 # ...it could have been 

[jira] [Comment Edited] (CASSANDRA-15368) Failing to flush Memtable without terminating process results in permanent data loss

2019-11-06 Thread Dimitar Dimitrov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16968261#comment-16968261
 ] 

Dimitar Dimitrov edited comment on CASSANDRA-15368 at 11/6/19 11:50 AM:


[~benedict], I assume this is something that you're planning to take up 
yourself, but let me know if you can use a volunteer in any way. 

Also can you please help me understand some of the details around the 
pre-conditions for this problem?

I'm probably missing something, but I still can't understand:
 * how _*the last operations for the old Memtable may obtain their 
ReplayPosition after the first operations for the new Memtable*_ can hold true 
after CASSANDRA-8383.
 * how _*Unfortunately, we treat the Memtable range as contiguous, and 
invalidate the entire range on flush*_ can hold true after CASSANDRA-11828 
(with some interaction with CASSANDRA-9669).

I'm also wondering, is _*More problematically, this can also occur on restart 
without any associated flush failure, as we use commit log boundaries written 
to our flushed sstables to filter ReplayPosition on recovery*_ related to 
{{CommitLogReplayer#firstNotCovered(Collection>)}}
 and its caveats?

P.S. Specifically for the upper bound of the old memtable being above the lower 
bound of the new memtable, I've tried to explicitly write down the possible 
orderings, and I can't see how that could happen - I'll format and post my 
notes in a separate comment a bit later.


was (Author: dimitarndimitrov):
[~benedict], I assume this is something that you're planning to take up 
yourself, but let me know if you can use a volunteer in any way. 

Also can you please help me understand some of the details around the 
pre-conditions for this problem?

I'm probably mising something, but I still can't understand:
 * how _*the last operations for the old Memtable may obtain their 
ReplayPosition after the first operations for the new Memtable*_ can hold true 
after CASSANDRA-8383.
 * how _*Unfortunately, we treat the Memtable range as contiguous, and 
invalidate the entire range on flush*_ can hold true after CASSANDRA-11828 
(with some interaction with CASSANDRA-9669).

I'm also wondering, is _*More problematically, this can also occur on restart 
without any associated flush failure, as we use commit log boundaries written 
to our flushed sstables to filter ReplayPosition on recovery*_ related to 
{{CommitLogReplayer#firstNotCovered(Collection>)}}
 and its caveats?

P.S. Specifically for the upper bound of the old memtable being above the lower 
bound of the new memtable, I've tried to explicitly write down the possible 
orderings, and I can't see how that could happen - I'll format and post my 
notes in a separate comment a bit later.


[jira] [Commented] (CASSANDRA-15368) Failing to flush Memtable without terminating process results in permanent data loss

2019-11-06 Thread Dimitar Dimitrov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16968261#comment-16968261
 ] 

Dimitar Dimitrov commented on CASSANDRA-15368:
--

[~benedict], I assume this is something that you're planning to take up 
yourself, but let me know if you can use a volunteer in any way. 

Also can you please help me understand some of the details around the 
pre-conditions for this problem?

I'm probably missing something, but I still can't understand:
 * how _*the last operations for the old Memtable may obtain their 
ReplayPosition after the first operations for the new Memtable*_ can hold true 
after CASSANDRA-8383.
 * how _*Unfortunately, we treat the Memtable range as contiguous, and 
invalidate the entire range on flush*_ can hold true after CASSANDRA-11828 
(with some interaction with CASSANDRA-9669).

I'm also wondering, is _*More problematically, this can also occur on restart 
without any associated flush failure, as we use commit log boundaries written 
to our flushed sstables to filter ReplayPosition on recovery*_ related to 
{{CommitLogReplayer#firstNotCovered(Collection>)}}
 and its caveats?

P.S. Specifically for the upper bound of the old memtable being above the lower 
bound of the new memtable, I've tried to explicitly write down the possible 
orderings, and I can't see how that could happen - I'll format and post my 
notes in a separate comment a bit later.




[jira] [Commented] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions

2019-01-04 Thread Dimitar Dimitrov (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16734130#comment-16734130
 ] 

Dimitar Dimitrov commented on CASSANDRA-13692:
--

I have reworked the changes as discussed, and am currently testing.

Initial unit test results are relatively good - 3.0 and 3.11 seem OK (no 
failures), and trunk seems to have a bunch of unrelated failures (e.g. 
{{SingleSSTableLCSTaskTest}} failing with an OOME, 
{{org.apache.cassandra.dht.tokenallocator}} tests failing with an NPE in 
{{DatabaseDescriptor.diagnosticEventsEnabled}}). 2.2 has some failures in 
{{ScrubTest}} and {{SSTableRewriterTest}} that I'd like to take a closer look 
at.

I still don't have initial results for dtests, due to these being harder to 
land on a CI VM, and longer to run. I'll update when I have something to report 
though.

Here are the draft changes, in case anyone is interested:
| [2.2|https://github.com/apache/cassandra/compare/cassandra-2.2...dimitarndimitrov:c13692-2.2] |
| [3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...dimitarndimitrov:c13692-3.0] |
| [3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...dimitarndimitrov:c13692-3.11] |
| [trunk|https://github.com/apache/cassandra/compare/trunk...dimitarndimitrov:c13692] |

> CompactionAwareWriter_getWriteDirectory throws incompatible exceptions
> --
>
> Key: CASSANDRA-13692
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13692
> Project: Cassandra
> Issue Type: Bug
> Components: Local/Compaction
> Reporter: Hao Zhong
> Assignee: Dimitar Dimitrov
> Priority: Major
> Labels: lhf
> Attachments: c13692-2.2-dtest-results.PNG, 
> c13692-2.2-testall-results.PNG, c13692-3.0-dtest-results-updated.PNG, 
> c13692-3.0-dtest-results.PNG, c13692-3.0-testall-results.PNG, 
> c13692-3.11-dtest-results-updated.PNG, c13692-3.11-dtest-results.PNG, 
> c13692-3.11-testall-results.PNG, c13692-dtest-results-updated.PNG, 
> c13692-dtest-results.PNG, c13692-testall-results-updated.PNG, 
> c13692-testall-results.PNG
>
>
> The CompactionAwareWriter_getWriteDirectory throws RuntimeException:
> {code}
> public Directories.DataDirectory getWriteDirectory(Iterable<SSTableReader> sstables, long estimatedWriteSize)
> {
>     File directory = null;
>     for (SSTableReader sstable : sstables)
>     {
>         if (directory == null)
>             directory = sstable.descriptor.directory;
>         if (!directory.equals(sstable.descriptor.directory))
>         {
>             logger.trace("All sstables not from the same disk - putting results in {}", directory);
>             break;
>         }
>     }
>     Directories.DataDirectory d = getDirectories().getDataDirectoryForFile(directory);
>     if (d != null)
>     {
>         long availableSpace = d.getAvailableSpace();
>         if (availableSpace < estimatedWriteSize)
>             throw new RuntimeException(String.format("Not enough space to write %s to %s (%s available)",
>                                                      FBUtilities.prettyPrintMemory(estimatedWriteSize),
>                                                      d.location,
>                                                      FBUtilities.prettyPrintMemory(availableSpace)));
>         logger.trace("putting compaction results in {}", directory);
>         return d;
>     }
>     d = getDirectories().getWriteableLocation(estimatedWriteSize);
>     if (d == null)
>         throw new RuntimeException(String.format("Not enough disk space to store %s",
>                                                  FBUtilities.prettyPrintMemory(estimatedWriteSize)));
>     return d;
> }
> {code}
> However, the thrown exception does not trigger the failure policy. 
> CASSANDRA-11448 fixed a similar problem. The buggy code is:
> {code}
> protected Directories.DataDirectory getWriteDirectory(long writeSize)
> {
>     Directories.DataDirectory directory = getDirectories().getWriteableLocation(writeSize);
>     if (directory == null)
>         throw new RuntimeException("Insufficient disk space to write " + writeSize + " bytes");
>     return directory;
> }
> {code}
> The fixed code is:
> {code}
> protected Directories.DataDirectory getWriteDirectory(long writeSize)
> {
>     Directories.DataDirectory directory = getDirectories().getWriteableLocation(writeSize);
>     if (directory == null)
>         throw new FSWriteError(new IOException("Insufficient disk space to write " + writeSize + " bytes"), "");
>     return directory;
> }
> {code}
> The fixed code throws FSWE and triggers the failure policy.
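
The distinction the description draws, between a generic exception and a filesystem-specific error that a handler can recognise, can be sketched like this. Everything here is a stand-in: {{DiskWriteError}} is a hypothetical local class (not Cassandra's {{FSWriteError}}), and {{triggersFailurePolicy}} is an assumed, much simplified version of whatever inspects the thrown error.

```java
import java.io.IOError;
import java.io.IOException;

// Hypothetical sketch: only a recognisable error type reaches the failure policy.
final class FailurePolicySketch {
    /** Stand-in for a filesystem-specific error type; not Cassandra's class. */
    static final class DiskWriteError extends IOError {
        DiskWriteError(Throwable cause) {
            super(cause);
        }
    }

    /** True if the throwable, or anything in its cause chain, is a DiskWriteError. */
    static boolean triggersFailurePolicy(Throwable t) {
        for (Throwable cur = t; cur != null; cur = cur.getCause())
            if (cur instanceof DiskWriteError)
                return true;
        return false;
    }

    public static void main(String[] args) {
        Throwable generic = new RuntimeException("Not enough disk space to store 10MiB");
        Throwable typed = new DiskWriteError(new IOException("Insufficient disk space to write 10 bytes"));
        System.out.println(triggersFailurePolicy(generic)); // false
        System.out.println(triggersFailurePolicy(typed));   // true
    }
}
```

With the plain {{RuntimeException}} the handler has nothing to key on, so the configured disk failure policy never gets a chance to act.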

[jira] [Commented] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions

2018-12-15 Thread Dimitar Dimitrov (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16722172#comment-16722172
 ] 

Dimitar Dimitrov commented on CASSANDRA-13692:
--

First, apologies for making this look like a 19th century trans-continental 
chess correspondence game...

I think you're absolutely right - after 
[{{849a438690aa97a361227781108cc90355dcbcd9}}|https://github.com/apache/cassandra/commit/849a438690aa97a361227781108cc90355dcbcd9],
 we return solely candidates from some subset of 
{{Directories.dataDirectories}}, all of which are initialized in the 
{{clinit}}, so currently it doesn't look like there's a case in which the error 
handling could be hit.

I agree with your suggestion (although I'm a tiny bit sad that the 
oh-so-laborious test would go as well).
I'll dust off the ancient branches that I had for this and update here with the 
corresponding patches soon.




[jira] [Updated] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions

2018-12-15 Thread Dimitar Dimitrov (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dimitar Dimitrov updated CASSANDRA-13692:
-
Status: In Progress  (was: Awaiting Feedback)




[jira] [Comment Edited] (CASSANDRA-13938) Default repair is broken, crashes other nodes participating in repair (in trunk)

2018-09-10 Thread Dimitar Dimitrov (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542961#comment-16542961
 ] 

Dimitar Dimitrov edited comment on CASSANDRA-13938 at 9/11/18 5:52 AM:
---

{quote}The problem is that when {{CompressedInputStream#position()}} is called, 
the new position might be in the middle of a buffer. We need to remember that 
offset, and subtract that value when updating {{current}} in 
{{#reBuffer(boolean)}}. The reason is that those offset bytes get double 
counted on the first call to {{#reBuffer()}} after {{#position()}} as we add 
the {{buffer.position()}} to {{current}}. {{current}} already accounts for 
those offset bytes when {{#position()}} was called.
{quote}
[~jasobrown], isn't that equivalent (although a bit more complex) to just 
setting {{current}} to the last reached/read position in the stream when 
rebuffering? (i.e. {{current = streamOffset + buffer.position()}}).

I might be missing something, but the role of {{currentBufferOffset}} seems to 
be solely to "align" {{current}} and {{streamOffset}} the first time after a 
new section is started. Then {{current += buffer.position() - 
currentBufferOffset}} expands to {{current = -current- + buffer.position() + 
streamOffset - -current- }} which is the same as {{current = streamOffset + 
buffer.position()}}. After that first time, {{current}} naturally follows 
{{streamOffset}} without the need of any adjustment, but it seems more natural 
to express this as {{streamOffset + buffer.position()}} instead of the new 
expression or the old {{current + buffer.position()}}. To me, it's also a bit 
more intuitive and easier to understand (hopefully it's also right in addition 
to intuitive :)).

The equivalence above would hold true if {{current}} and {{streamOffset}} don't 
change their value in the meantime, but I think this is ensured by the 
well-ordered sequential fashion in which the decompressing and the offset 
bookkeeping functionality of {{CompressedInputStream}} happen in the thread 
running the corresponding {{StreamDeserializingTask}}.
 * The aforementioned well-ordered sequential fashion seems to be POSITION 
followed by 0-N times REBUFFER + DECOMPRESS, where the first REBUFFER might not 
update {{current}} with the above calculation in case {{current}} is already 
too far ahead (i.e. the new section is not starting within the current buffer).
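
The algebra above can be checked mechanically with a few sample values. This is a sketch only: it assumes, as the expansion does, that the remembered {{currentBufferOffset}} equals {{current - streamOffset}} at the first rebuffer after {{#position()}}, and the class and method names are invented for the example.

```java
// Sketch comparing the patched update with the proposed simplification.
final class RebufferOffsetSketch {
    /** The patched update: current += buffer.position() - currentBufferOffset. */
    static long patched(long current, long streamOffset, long bufferPosition) {
        // Assumption taken from the expansion above: the remembered offset
        // is the gap between current and streamOffset.
        long currentBufferOffset = current - streamOffset;
        return current + bufferPosition - currentBufferOffset;
    }

    /** The proposed form: current = streamOffset + buffer.position(). */
    static long simplified(long streamOffset, long bufferPosition) {
        return streamOffset + bufferPosition;
    }

    public static void main(String[] args) {
        // Each row is {current, streamOffset, buffer.position()}.
        long[][] samples = { {100, 64, 36}, {4096, 4096, 0}, {70_000, 65_536, 4_464} };
        for (long[] s : samples) {
            long a = patched(s[0], s[1], s[2]);
            long b = simplified(s[1], s[2]);
            System.out.println(a + " == " + b + " : " + (a == b));
        }
    }
}
```

Under that assumption {{current}} cancels out of the patched expression, so the two forms agree for any input, not just these samples.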


was (Author: dimitarndimitrov):
{quote}The problem is that when {{CompressedInputStream#position()}} is called, 
the new position might be in the middle of a buffer. We need to remember that 
offset, and subtract that value when updating {{current}} in 
{{#reBuffer(boolean)}}. The reason is that those offset bytes get double 
counted on the first call to {{#reBuffer()}} after {{#position()}} as we add 
the {{buffer.position()}} to {{current}}. {{current}} already accounts for 
those offset bytes when {{#position()}} was called.
{quote}
[~jasobrown], isn't that equivalent (although a bit more complex) to just 
setting {{current}} to the last reached/read position in the stream when 
rebuffering? (i.e. {{current = streamOffset + buffer.position()}}).

I might be missing something, but the role of {{currentBufferOffset}} seems to 
be solely to "align" {{current}} and {{streamOffset}} the first time after a 
new section is started. Then {{current += buffer.position() - 
currentBufferOffse expands to }}{{current = -current- + buffer.position() + 
streamOffset - -current- }}which is the same as {{current = streamOffset + 
buffer.position()}}. After that first time, {{current}} naturally follows 
{{streamOffset}} without the need of any adjustment, but it seems more natural 
to express this as {{streamOffset + buffer.position()}} instead of the new 
expression or the old {{current + buffer.position()}}. To me, it's also a bit 
more intuitive and easier to understand (hopefully it's also right in addition 
to intuitive :)).

The equivalence above would hold true if {{current}} and {{streamOffset}} don't 
change their value in the meantime, but I think this is ensured by the 
well-ordered sequential fashion in which the decompressing and the offset 
bookkeeping functionality of {{CompressedInputStream}} happen in the thread 
running the corresponding {{StreamDeserializingTask}}.
 * The aforementioned well-ordered sequential fashion seems to be POSITION 
followed by 0-N times REBUFFER + DECOMPRESS, where the first REBUFFER might not 
update {{current}} with the above calculation in case {{current}} is already 
too far ahead (i.e. the new section is not starting within the current buffer).

> Default repair is broken, crashes other nodes participating in repair (in 
> trunk)
> 
>
> Key: CASSANDRA-13938
> URL: 

[jira] [Commented] (CASSANDRA-13938) Default repair is broken, crashes other nodes participating in repair (in trunk)

2018-07-13 Thread Dimitar Dimitrov (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542961#comment-16542961
 ] 

Dimitar Dimitrov commented on CASSANDRA-13938:
--

{quote}The problem is that when {{CompressedInputStream#position()}} is called, 
the new position might be in the middle of a buffer. We need to remember that 
offset, and subtract that value when updating {{current}} in 
{{#reBuffer(boolean)}}. The reason why is that those offset bytes get double 
counted on the first call to {{#reBuffer()}} after {{#position()}} as we add 
the {{buffer.position()}} to {{current}}. {{current}} already accounts for 
those offset bytes when {{#position()}} was called.
{quote}
[~jasobrown], isn't that equivalent (although a bit more complex) to just 
setting {{current}} to the last reached/read position in the stream when 
rebuffering? (i.e. {{current = streamOffset + buffer.position()}}).

I might be missing something, but the role of {{currentBufferOffset}} seems to 
be solely to "align" {{current}} and {{streamOffset}} the first time after a 
new section is started. Then {{current += buffer.position() - 
currentBufferOffset}} expands to {{current = -current- + buffer.position() + 
streamOffset - -current-}}, which is the same as {{current = streamOffset + 
buffer.position()}}. After that first time, {{current}} naturally follows 
{{streamOffset}} without the need of any adjustment, but it seems more natural 
to express this as {{streamOffset + buffer.position()}} instead of the new 
expression or the old {{current + buffer.position()}}. To me, it's also a bit 
more intuitive and easier to understand (hopefully it's also right in addition 
to intuitive :)).

The equivalence above would hold true if {{current}} and {{streamOffset}} don't 
change their value in the meantime, but I think this is ensured by the 
well-ordered sequential fashion in which the decompressing and the offset 
bookkeeping functionality of {{CompressedInputStream}} happen in the thread 
running the corresponding {{StreamDeserializingTask}}.
 * The aforementioned well-ordered sequential fashion seems to be POSITION 
followed by 0-N times REBUFFER + DECOMPRESS, where the first REBUFFER might not 
update {{current}} with the above calculation in case {{current}} is already 
too far ahead (i.e. the new section is not starting within the current buffer).
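
To make the arithmetic above concrete, here is a toy model of the three update rules being compared. It is only a sketch, not the actual {{CompressedInputStream}} code: the class and method names are made up for illustration, and it assumes that {{currentBufferOffset}} equals {{current - streamOffset}} on the first rebuffer after {{#position()}}, which is what the expansion in the comment implies.

```java
public class RebufferOffsetSketch {

    // Old rule: add the buffer position on top of current.
    // If current already includes an in-buffer offset, that offset is counted twice.
    static long oldUpdate(long current, int bufferPosition) {
        return current + bufferPosition;
    }

    // Patch-style rule: remember the in-buffer offset and subtract it again.
    // Assumption: currentBufferOffset == current - streamOffset at this point.
    static long patchedUpdate(long current, long streamOffset, int bufferPosition) {
        long currentBufferOffset = current - streamOffset;
        return current + bufferPosition - currentBufferOffset;
    }

    // Suggested rule: derive current directly from the stream offset.
    static long directUpdate(long streamOffset, int bufferPosition) {
        return streamOffset + bufferPosition;
    }

    public static void main(String[] args) {
        long streamOffset = 65536;   // stream position where the current buffer starts
        long current = 65636;        // position() landed 100 bytes into that buffer
        int bufferPosition = 65536;  // the whole buffer has been consumed

        System.out.println("old:     " + oldUpdate(current, bufferPosition));
        System.out.println("patched: " + patchedUpdate(current, streamOffset, bufferPosition));
        System.out.println("direct:  " + directUpdate(streamOffset, bufferPosition));
    }
}
```

With these made-up numbers the old rule overshoots by exactly the 100 offset bytes, while the patched and the direct rule agree, which is the equivalence argued for above.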

> Default repair is broken, crashes other nodes participating in repair (in 
> trunk)
> 
>
> Key: CASSANDRA-13938
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13938
> Project: Cassandra
>  Issue Type: Bug
>  Components: Repair
>Reporter: Nate McCall
>Assignee: Jason Brown
>Priority: Critical
> Fix For: 4.x
>
> Attachments: 13938.yaml, test.sh
>
>
> Running through a simple scenario to test some of the new repair features, I 
> was not able to make a repair command work. Further, the exception seemed to 
> trigger a nasty failure state that basically shuts down the netty connections 
> for messaging *and* CQL on the nodes transferring back data to the node being 
> repaired. The following steps reproduce this issue consistently.
> Cassandra stress profile (probably not necessary, but this one provides a 
> really simple schema and consistent data shape):
> {noformat}
> keyspace: standard_long
> keyspace_definition: |
>   CREATE KEYSPACE standard_long WITH replication = {'class':'SimpleStrategy', 'replication_factor':3};
> table: test_data
> table_definition: |
>   CREATE TABLE test_data (
>     key text,
>     ts bigint,
>     val text,
>     PRIMARY KEY (key, ts)
>   ) WITH COMPACT STORAGE AND
>     CLUSTERING ORDER BY (ts DESC) AND
>     bloom_filter_fp_chance=0.01 AND
>     caching={'keys':'ALL', 'rows_per_partition':'NONE'} AND
>     comment='' AND
>     dclocal_read_repair_chance=0.00 AND
>     gc_grace_seconds=864000 AND
>     read_repair_chance=0.00 AND
>     compaction={'class': 'SizeTieredCompactionStrategy'} AND
>     compression={'sstable_compression': 'LZ4Compressor'};
> columnspec:
>   - name: key
>     population: uniform(1..5000) # 50 million records available
>   - name: ts
>     cluster: gaussian(1..50) # Up to 50 inserts per record
>   - name: val
>     population: gaussian(128..1024) # varying size of value data
> insert:
>   partitions: fixed(1) # only one insert per batch for individual partitions
>   select: fixed(1)/1 # each insert comes in one at a time
>   batchtype: UNLOGGED
> queries:
>   single:
>     cql: select * from test_data where key = ? and ts = ? limit 1;
>   series:
>     cql: select key,ts,val from test_data where key = ? limit 10;
> {noformat}
> The commands to build and run:
> {noformat}
> ccm create 4_0_test -v git:trunk -n 3 -s
> ccm stress user 

[jira] [Commented] (CASSANDRA-14092) Max ttl of 20 years will overflow localDeletionTime

2018-01-29 Thread Dimitar Dimitrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16343378#comment-16343378
 ] 

Dimitar Dimitrov commented on CASSANDRA-14092:
--

Some review comments on the dtest and trunk changes:
 * On the dtest change:
 ## Shouldn't the dtest docstring 
[here|https://github.com/apache/cassandra-dtest/commit/83c73ef0a3cbe50232d3a9eea4fd26c877ea58db#diff-a8f4dac4af77196a8c7881abd067a5b9R345]
 say something related to the TTL problem?
 ## The start time 
[here|https://github.com/apache/cassandra-dtest/commit/83c73ef0a3cbe50232d3a9eea4fd26c877ea58db#diff-a8f4dac4af77196a8c7881abd067a5b9R348]
 seems redundant
 ## It may be good to extract the max TTL value in a variable - we may decide 
to keep a version of this test after we patch by just reducing that value, but 
before we fix it nicely
 * On the trunk change:
 ## Maybe it's my English, but [this 
wording|https://github.com/apache/cassandra/compare/trunk...pauloricardomg:trunk-14092-v2#diff-5414c0e96996be355c3aff1184ec859aR48]
 sounds a bit confusing to me, using "maximum supported date" and "limit date" 
for the same thing. Thoughts? If you're also hesitant, what do you think about 
"Rows that should expire after that date would still expire on that date."?
 ## You can quickly mention the relevant JIRA ticket 
[here|https://github.com/apache/cassandra/compare/trunk...pauloricardomg:trunk-14092-v2#diff-b7ca4b9c415e93b6cbfb31daf90cc598R185]
 ## Qualify the static access to {{Cell.sanitizeLocalDeletionTime}} 
[here|https://github.com/apache/cassandra/compare/trunk...pauloricardomg:trunk-14092-v2#diff-b7ca4b9c415e93b6cbfb31daf90cc598R53]
 ## Could you please add some comments/Javadoc for 
[{{Cell.sanitizeLocalDeletionTime}}|https://github.com/apache/cassandra/compare/trunk...pauloricardomg:trunk-14092-v2#diff-3e9f1fc67f99d27e92a3eb32201d8ca6R311]?
 I would assume that {{NO_TTL}} and {{NO_DELETION_TIME}} are needed to 
determine whether the cell is an expiring one, an expired one, or a tombstone, 
but I'm not too sure
 ## There are missing spaces between the boolean arguments of the delegation 
call for some of the unit tests (e.g. 
[here|https://github.com/apache/cassandra/compare/trunk...pauloricardomg:trunk-14092-v2#diff-0d8cf6ca6ed99c947903359c1beaf386R74])

> Max ttl of 20 years will overflow localDeletionTime
> ---
>
> Key: CASSANDRA-14092
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14092
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Paulo Motta
>Assignee: Paulo Motta
>Priority: Blocker
> Fix For: 2.1.20, 2.2.12, 3.0.16, 3.11.2
>
>
> CASSANDRA-4771 added a max value of 20 years for ttl to protect against [year 
> 2038 overflow bug|https://en.wikipedia.org/wiki/Year_2038_problem] for 
> {{localDeletionTime}}.
> It turns out that next year the {{localDeletionTime}} will start overflowing 
> with the maximum ttl of 20 years ({{System.currentTimeMillis() + ttl(20 
> years) > Integer.MAX_VALUE}}), so we should remove this limitation.
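
The overflow described above can be checked with a little arithmetic. The following is only a sketch with hypothetical names: it assumes the deletion time is kept as an {{int}} count of seconds since the epoch (as the Year 2038 reference suggests) and that "20 years" means 20 * 365 days.

```java
import java.time.Instant;

public class TtlOverflowSketch {

    // Assumption: the 20-year cap expressed in seconds, using 365-day years.
    static final int MAX_TTL_SECONDS = 20 * 365 * 24 * 60 * 60;

    // True when "now + ttl" no longer fits into a signed 32-bit number of seconds.
    static boolean overflows(long nowSeconds, int ttlSeconds) {
        return nowSeconds + ttlSeconds > Integer.MAX_VALUE;
    }

    // First second at which a write with the maximum TTL overflows.
    static Instant firstOverflowingWriteTime() {
        return Instant.ofEpochSecond((long) Integer.MAX_VALUE - MAX_TTL_SECONDS + 1);
    }

    public static void main(String[] args) {
        System.out.println("max ttl (s): " + MAX_TTL_SECONDS);
        System.out.println("first overflowing write: " + firstOverflowingWriteTime());

        long now = Instant.now().getEpochSecond();
        System.out.println("overflows today: " + overflows(now, MAX_TTL_SECONDS));
    }
}
```

Doing the sum in {{long}} keeps the comparison itself from wrapping around; adding the two values as {{int}} would silently produce a negative deletion time instead.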



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-13801) CompactionManager sometimes wrongly determines that a background compaction is running for a particular table

2017-12-06 Thread Dimitar Dimitrov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dimitar Dimitrov updated CASSANDRA-13801:
-
Status: Patch Available  (was: Open)

> CompactionManager sometimes wrongly determines that a background compaction 
> is running for a particular table
> -
>
> Key: CASSANDRA-13801
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13801
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Dimitar Dimitrov
>Assignee: Dimitar Dimitrov
>Priority: Minor
> Attachments: c13801-2.2-testall.png, c13801-3.0-testall.png, 
> c13801-3.11-testall.png, c13801-trunk-testall.png
>
>
> Sometimes after writing different rows to a table, then doing a blocking 
> flush, if you alter the compaction strategy, then run background compaction 
> and wait for it to finish, {{CompactionManager}} may decide that there's an 
> ongoing compaction for that same table.
> This may happen even though logs don't indicate that to be the case 
> (compaction may still be running for system_schema tables).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-13801) CompactionManager sometimes wrongly determines that a background compaction is running for a particular table

2017-12-06 Thread Dimitar Dimitrov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dimitar Dimitrov updated CASSANDRA-13801:
-
Attachment: c13801-2.2-testall.png
c13801-3.0-testall.png
c13801-3.11-testall.png
c13801-trunk-testall.png

> CompactionManager sometimes wrongly determines that a background compaction 
> is running for a particular table
> -
>
> Key: CASSANDRA-13801
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13801
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Dimitar Dimitrov
>Assignee: Dimitar Dimitrov
>Priority: Minor
> Attachments: c13801-2.2-testall.png, c13801-3.0-testall.png, 
> c13801-3.11-testall.png, c13801-trunk-testall.png
>
>
> Sometimes after writing different rows to a table, then doing a blocking 
> flush, if you alter the compaction strategy, then run background compaction 
> and wait for it to finish, {{CompactionManager}} may decide that there's an 
> ongoing compaction for that same table.
> This may happen even though logs don't indicate that to be the case 
> (compaction may still be running for system_schema tables).






[jira] [Comment Edited] (CASSANDRA-13801) CompactionManager sometimes wrongly determines that a background compaction is running for a particular table

2017-12-06 Thread Dimitar Dimitrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16280752#comment-16280752
 ] 

Dimitar Dimitrov edited comment on CASSANDRA-13801 at 12/6/17 7:33 PM:
---

It turns out that the problem does not necessarily require altering the 
compaction strategy.
It seems to be rooted in a potential problem with counting the CF compaction 
requests, that can eventually lead to a skipped background compaction.

The wrong counting can happen if the counting multiset increment 
[here|https://github.com/apache/cassandra/blob/95b43b195e4074533100f863344c182a118a8b6c/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L197]
 gets delayed and happens after the corresponding counting multiset decrement 
already happened 
[here|https://github.com/apache/cassandra/blob/95b43b195e4074533100f863344c182a118a8b6c/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L284].
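
As a rough illustration of why the ordering matters, here is a single-threaded replay of the two interleavings. It is a sketch only: the class is a hand-rolled stand-in for a counting multiset (not the {{CompactionManager}} code), and it assumes that removing an element that is not present is a no-op, as is usual for multisets.

```java
import java.util.HashMap;
import java.util.Map;

public class CompactingCounterSketch {

    // Hypothetical stand-in for the per-table counting multiset.
    private final Map<String, Integer> counts = new HashMap<>();

    void add(String table) {
        counts.merge(table, 1, Integer::sum);
    }

    // Removing an absent element does nothing; removing the last one drops the entry.
    void remove(String table) {
        counts.computeIfPresent(table, (k, v) -> v > 1 ? v - 1 : null);
    }

    int count(String table) {
        return counts.getOrDefault(table, 0);
    }

    // Replays increment and decrement in the given order and returns the final count.
    static int runInOrder(boolean incrementFirst) {
        CompactingCounterSketch counter = new CompactingCounterSketch();
        if (incrementFirst) {
            counter.add("t");
            counter.remove("t");
        } else {
            counter.remove("t"); // decrement arrives first and is lost
            counter.add("t");    // delayed increment is never balanced
        }
        return counter.count("t");
    }

    public static void main(String[] args) {
        System.out.println("increment then decrement: " + runInOrder(true));
        System.out.println("decrement then increment: " + runInOrder(false));
    }
}
```

In the second ordering the count stays at 1 although nothing is running, so anything that consults the count would conclude that a compaction for the table is still in progress.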

Here are the branches with the proposed changes, as well as a Byteman test that 
can be used to demonstrate the issue.
testall results look good (3.0 and trunk each have 1 seemingly unrelated, flaky 
test failing).
dtest results will be added soon.

| [2.2|https://github.com/apache/cassandra/compare/cassandra-2.2...dimitarndimitrov:c13801-2.2] | [testall|^c13801-2.2-testall.png] |
| [3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...dimitarndimitrov:c13801-3.0] | [testall|^c13801-3.0-testall.png] |
| [3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...dimitarndimitrov:c13801-3.11] | [testall|^c13801-3.11-testall.png] |
| [trunk|https://github.com/apache/cassandra/compare/trunk...dimitarndimitrov:c13801-trunk] | [testall|^c13801-trunk-testall.png] |



was (Author: dimitarndimitrov):
It turns out that the problem does not necessarily require altering the 
compaction strategy.
It seems to be rooted in a potential problem with counting the CF compaction 
requests, that can eventually lead to a skipped background compaction.

The wrong counting can happen if the counting multiset increment 
[here|https://github.com/apache/cassandra/blob/95b43b195e4074533100f863344c182a118a8b6c/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L197]
 gets delayed and happens after the corresponding counting multiset decrement 
already happened 
[here|https://github.com/apache/cassandra/blob/95b43b195e4074533100f863344c182a118a8b6c/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L284].

Here are the branches with the proposed changes, as well as a Byteman test that 
can be used to demonstrate the issue.
testall results look good (3.0 and trunk each have 1 seemingly unrelated, flaky 
test failing).
dtest results will be added soon.

| [2.2|https://github.com/apache/cassandra/compare/cassandra-2.2...dimitarndimitrov:c13801-2.2] | [testall|^c13801-2.2-testall.png] |
| [3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...dimitarndimitrov:c13801-3.0] | [testall|^c13801-3.0-testall.png] |
| [3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...dimitarndimitrov:c13801-3.11] | [testall|^c13801-3.11-testall.png] |
| [trunk|https://github.com/apache/cassandra/compare/trunk...dimitarndimitrov:c13801-trunk] | [testall|^c13801-2.2-testall.png] |


> CompactionManager sometimes wrongly determines that a background compaction 
> is running for a particular table
> -
>
> Key: CASSANDRA-13801
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13801
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Dimitar Dimitrov
>Assignee: Dimitar Dimitrov
>Priority: Minor
> Attachments: c13801-2.2-testall.png, c13801-3.0-testall.png, 
> c13801-3.11-testall.png, c13801-trunk-testall.png
>
>
> Sometimes after writing different rows to a table, then doing a blocking 
> flush, if you alter the compaction strategy, then run background compaction 
> and wait for it to finish, {{CompactionManager}} may decide that there's an 
> ongoing compaction for that same table.
> This may happen even though logs don't indicate that to be the case 
> (compaction may still be running for system_schema tables).






[jira] [Commented] (CASSANDRA-13801) CompactionManager sometimes wrongly determines that a background compaction is running for a particular table

2017-12-06 Thread Dimitar Dimitrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16280752#comment-16280752
 ] 

Dimitar Dimitrov commented on CASSANDRA-13801:
--

It turns out that the problem does not necessarily require altering the 
compaction strategy.
It seems to be rooted in a potential problem with counting the CF compaction 
requests, that can eventually lead to a skipped background compaction.

The wrong counting can happen if the counting multiset increment 
[here|https://github.com/apache/cassandra/blob/95b43b195e4074533100f863344c182a118a8b6c/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L197]
 gets delayed and happens after the corresponding counting multiset decrement 
already happened 
[here|https://github.com/apache/cassandra/blob/95b43b195e4074533100f863344c182a118a8b6c/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L284].

Here are the branches with the proposed changes, as well as a Byteman test that 
can be used to demonstrate the issue.
testall results look good (3.0 and trunk each have 1 seemingly unrelated, flaky 
test failing).
dtest results will be added soon.

| [2.2|https://github.com/apache/cassandra/compare/cassandra-2.2...dimitarndimitrov:c13801-2.2] | [testall|^c13801-2.2-testall.png] |
| [3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...dimitarndimitrov:c13801-3.0] | [testall|^c13801-3.0-testall.png] |
| [3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...dimitarndimitrov:c13801-3.11] | [testall|^c13801-3.11-testall.png] |
| [trunk|https://github.com/apache/cassandra/compare/trunk...dimitarndimitrov:c13801-trunk] | [testall|^c13801-2.2-testall.png] |


> CompactionManager sometimes wrongly determines that a background compaction 
> is running for a particular table
> -
>
> Key: CASSANDRA-13801
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13801
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Dimitar Dimitrov
>Assignee: Dimitar Dimitrov
>Priority: Minor
>
> Sometimes after writing different rows to a table, then doing a blocking 
> flush, if you alter the compaction strategy, then run background compaction 
> and wait for it to finish, {{CompactionManager}} may decide that there's an 
> ongoing compaction for that same table.
> This may happen even though logs don't indicate that to be the case 
> (compaction may still be running for system_schema tables).






[jira] [Comment Edited] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions

2017-09-14 Thread Dimitar Dimitrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16166103#comment-16166103
 ] 

Dimitar Dimitrov edited comment on CASSANDRA-13692 at 9/14/17 12:44 PM:


Okay, it looks like the {{trunk}} dtest abort problems have been fixed - and 
here are the new test results that confirm that.
Unfortunately the runs are still not green.
Nevertheless the failures seen in the baseline and my run are almost identical, 
with 2 failures from cassci not showing up in my run, and 1 failure from my run 
not showing up in cassci.
* To better compare the failures, sort the cassci results by test name, as the 
results from my run have also been sorted this way.

| [2.2|https://github.com/apache/cassandra/compare/cassandra-2.2...dimitarndimitrov:c13692-2.2] | [testall|^c13692-2.2-testall-results.PNG] | [dtest|^c13692-2.2-dtest-results.PNG] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-2.2_dtest/lastCompletedBuild/testReport/]) |
| [3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...dimitarndimitrov:c13692-3.0] | [testall|^c13692-3.0-testall-results.PNG] | [dtest|^c13692-3.0-dtest-results-updated.PNG] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.0_dtest/989/testReport/]) |
| [3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...dimitarndimitrov:c13692-3.11] | [testall|^c13692-3.11-testall-results.PNG] ([testall-baseline|https://cassci.datastax.com/job/cassandra-3.11_testall/lastCompletedBuild/testReport/]) | [dtest|^c13692-3.11-dtest-results-updated.PNG] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.11_dtest/165/testReport/]) |
| [trunk|https://github.com/apache/cassandra/compare/trunk...dimitarndimitrov:c13692] | [*testall*|^c13692-testall-results-updated.PNG] | [*dtest*|^c13692-dtest-results-updated.PNG] ([*dtest-baseline*|https://cassci.datastax.com/job/trunk_dtest/1654/testReport/]) |


was (Author: dimitarndimitrov):
Okay, it looks like the {{trunk}} dtest abort problems have been fixed - and 
here are the new test results that confirm that.
Unfortunately the runs are still not green.
Nevertheless the failures seen in the baseline and my run are almost identical, 
with 2 failures from cassci not showing up in my run, and 1 failure from my run 
not showing up in cassci.
* To better compare the failures, sort the cassci results by test name, as the 
results from my run have also been sorted this way.

| [2.2|https://github.com/apache/cassandra/compare/cassandra-2.2...dimitarndimitrov:c13692-2.2] | [testall|^c13692-2.2-testall-results.PNG] | [dtest|^c13692-2.2-dtest-results.PNG] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-2.2_dtest/lastCompletedBuild/testReport/]) |
| [3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...dimitarndimitrov:c13692-3.0] | [testall|^c13692-3.0-testall-results.PNG] | [dtest|^c13692-3.0-dtest-results-updated.PNG] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.0_dtest/989/testReport/]) |
| [3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...dimitarndimitrov:c13692-3.11] | [testall|^c13692-3.11-testall-results.PNG] ([testall-baseline|https://cassci.datastax.com/job/cassandra-3.11_testall/lastCompletedBuild/testReport/]) | [dtest|^c13692-3.11-dtest-results-updated.PNG] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.11_dtest/165/testReport/]) |
| [trunk|https://github.com/apache/cassandra/compare/trunk...dimitarndimitrov:c13692] | [testall|^c13692-testall-results.PNG] | [*dtest*|^c13692-dtest-results-updated.PNG] ([*dtest-baseline*|https://cassci.datastax.com/job/trunk_dtest/1654/testReport/]) |

> CompactionAwareWriter_getWriteDirectory throws incompatible exceptions
> --
>
> Key: CASSANDRA-13692
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13692
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Hao Zhong
>Assignee: Dimitar Dimitrov
>  Labels: lhf
> Attachments: c13692-2.2-dtest-results.PNG, 
> c13692-2.2-testall-results.PNG, c13692-3.0-dtest-results.PNG, 
> c13692-3.0-dtest-results-updated.PNG, c13692-3.0-testall-results.PNG, 
> c13692-3.11-dtest-results.PNG, c13692-3.11-dtest-results-updated.PNG, 
> c13692-3.11-testall-results.PNG, c13692-dtest-results.PNG, 
> c13692-dtest-results-updated.PNG, c13692-testall-results.PNG, 
> c13692-testall-results-updated.PNG
>
>
> The CompactionAwareWriter_getWriteDirectory throws RuntimeException:
> {code}
> public Directories.DataDirectory getWriteDirectory(Iterable<SSTableReader> sstables, long estimatedWriteSize)
> {
>     File directory = null;
>     for (SSTableReader sstable : sstables)
>     {
>         if (directory == 

[jira] [Issue Comment Deleted] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions

2017-09-14 Thread Dimitar Dimitrov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dimitar Dimitrov updated CASSANDRA-13692:
-
Comment: was deleted

(was: Adding screenshots from CI results.)

> CompactionAwareWriter_getWriteDirectory throws incompatible exceptions
> --
>
> Key: CASSANDRA-13692
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13692
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Hao Zhong
>Assignee: Dimitar Dimitrov
>  Labels: lhf
> Attachments: c13692-2.2-dtest-results.PNG, 
> c13692-2.2-testall-results.PNG, c13692-3.0-dtest-results.PNG, 
> c13692-3.0-dtest-results-updated.PNG, c13692-3.0-testall-results.PNG, 
> c13692-3.11-dtest-results.PNG, c13692-3.11-dtest-results-updated.PNG, 
> c13692-3.11-testall-results.PNG, c13692-dtest-results.PNG, 
> c13692-dtest-results-updated.PNG, c13692-testall-results.PNG, 
> c13692-testall-results-updated.PNG
>
>
> The CompactionAwareWriter_getWriteDirectory throws RuntimeException:
> {code}
> public Directories.DataDirectory getWriteDirectory(Iterable<SSTableReader> sstables, long estimatedWriteSize)
> {
>     File directory = null;
>     for (SSTableReader sstable : sstables)
>     {
>         if (directory == null)
>             directory = sstable.descriptor.directory;
>         if (!directory.equals(sstable.descriptor.directory))
>         {
>             logger.trace("All sstables not from the same disk - putting results in {}", directory);
>             break;
>         }
>     }
>     Directories.DataDirectory d = getDirectories().getDataDirectoryForFile(directory);
>     if (d != null)
>     {
>         long availableSpace = d.getAvailableSpace();
>         if (availableSpace < estimatedWriteSize)
>             throw new RuntimeException(String.format("Not enough space to write %s to %s (%s available)",
>                                                      FBUtilities.prettyPrintMemory(estimatedWriteSize),
>                                                      d.location,
>                                                      FBUtilities.prettyPrintMemory(availableSpace)));
>         logger.trace("putting compaction results in {}", directory);
>         return d;
>     }
>     d = getDirectories().getWriteableLocation(estimatedWriteSize);
>     if (d == null)
>         throw new RuntimeException(String.format("Not enough disk space to store %s",
>                                                  FBUtilities.prettyPrintMemory(estimatedWriteSize)));
>     return d;
> }
> {code}
> However, the thrown exception does not trigger the failure policy. 
> CASSANDRA-11448 fixed a similar problem. The buggy code is:
> {code}
> protected Directories.DataDirectory getWriteDirectory(long writeSize)
> {
>     Directories.DataDirectory directory = getDirectories().getWriteableLocation(writeSize);
>     if (directory == null)
>         throw new RuntimeException("Insufficient disk space to write " + writeSize + " bytes");
>     return directory;
> }
> {code}
> The fixed code is:
> {code}
> protected Directories.DataDirectory getWriteDirectory(long writeSize)
> {
>     Directories.DataDirectory directory = getDirectories().getWriteableLocation(writeSize);
>     if (directory == null)
>         throw new FSWriteError(new IOException("Insufficient disk space to write " + writeSize + " bytes"), "");
>     return directory;
> }
> {code}
> The fixed code throws FSWE and triggers the failure policy.
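
The difference between the two exception types can be illustrated with a small stand-alone model. Everything in it is hypothetical: {{FsWriteErrorStandIn}} and {{inspect}} are simplified placeholders for the real error type and for whatever code routes file system errors to the disk failure policy, and only the type check matters for the illustration.

```java
import java.io.IOException;

public class FailurePolicySketch {

    // Hypothetical stand-in for a file-system write error type.
    static class FsWriteErrorStandIn extends Error {
        FsWriteErrorStandIn(Throwable cause) {
            super(cause);
        }
    }

    static int policyInvocations = 0;

    // Hypothetical inspector: only file-system errors reach the failure policy.
    static void inspect(Throwable t) {
        if (t instanceof FsWriteErrorStandIn)
            policyInvocations++;
    }

    // Throws either a plain RuntimeException or the file-system error stand-in
    // when there is not enough space for the write.
    static void checkSpace(long available, long needed, boolean useFsError) {
        if (available >= needed)
            return;
        String message = "Insufficient disk space to write " + needed + " bytes";
        if (useFsError)
            throw new FsWriteErrorStandIn(new IOException(message));
        throw new RuntimeException(message);
    }

    // Returns true if the thrown exception ended up invoking the policy.
    static boolean triggersPolicy(boolean useFsError) {
        int before = policyInvocations;
        try {
            checkSpace(10, 100, useFsError);
        } catch (Throwable t) {
            inspect(t);
        }
        return policyInvocations > before;
    }

    public static void main(String[] args) {
        System.out.println("RuntimeException triggers policy: " + triggersPolicy(false));
        System.out.println("FS error stand-in triggers policy: " + triggersPolicy(true));
    }
}
```

Under these assumptions the plain {{RuntimeException}} passes the inspector unnoticed, which matches the behaviour the description complains about.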






[jira] [Issue Comment Deleted] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions

2017-09-14 Thread Dimitar Dimitrov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dimitar Dimitrov updated CASSANDRA-13692:
-
Comment: was deleted

(was: Adding updated screenshots from CI dtest results.)

> CompactionAwareWriter_getWriteDirectory throws incompatible exceptions
> --
>
> Key: CASSANDRA-13692
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13692
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Hao Zhong
>Assignee: Dimitar Dimitrov
>  Labels: lhf
> Attachments: c13692-2.2-dtest-results.PNG, 
> c13692-2.2-testall-results.PNG, c13692-3.0-dtest-results.PNG, 
> c13692-3.0-dtest-results-updated.PNG, c13692-3.0-testall-results.PNG, 
> c13692-3.11-dtest-results.PNG, c13692-3.11-dtest-results-updated.PNG, 
> c13692-3.11-testall-results.PNG, c13692-dtest-results.PNG, 
> c13692-dtest-results-updated.PNG, c13692-testall-results.PNG, 
> c13692-testall-results-updated.PNG
>
>
> The CompactionAwareWriter_getWriteDirectory throws RuntimeException:
> {code}
> public Directories.DataDirectory getWriteDirectory(Iterable<SSTableReader> sstables, long estimatedWriteSize)
> {
>     File directory = null;
>     for (SSTableReader sstable : sstables)
>     {
>         if (directory == null)
>             directory = sstable.descriptor.directory;
>         if (!directory.equals(sstable.descriptor.directory))
>         {
>             logger.trace("All sstables not from the same disk - putting results in {}", directory);
>             break;
>         }
>     }
>     Directories.DataDirectory d = getDirectories().getDataDirectoryForFile(directory);
>     if (d != null)
>     {
>         long availableSpace = d.getAvailableSpace();
>         if (availableSpace < estimatedWriteSize)
>             throw new RuntimeException(String.format("Not enough space to write %s to %s (%s available)",
>                                                      FBUtilities.prettyPrintMemory(estimatedWriteSize),
>                                                      d.location,
>                                                      FBUtilities.prettyPrintMemory(availableSpace)));
>         logger.trace("putting compaction results in {}", directory);
>         return d;
>     }
>     d = getDirectories().getWriteableLocation(estimatedWriteSize);
>     if (d == null)
>         throw new RuntimeException(String.format("Not enough disk space to store %s",
>                                                  FBUtilities.prettyPrintMemory(estimatedWriteSize)));
>     return d;
> }
> {code}
> However, the thrown exception does not trigger the failure policy. 
> CASSANDRA-11448 fixed a similar problem. The buggy code is:
> {code}
> protected Directories.DataDirectory getWriteDirectory(long writeSize)
> {
>     Directories.DataDirectory directory = getDirectories().getWriteableLocation(writeSize);
>     if (directory == null)
>         throw new RuntimeException("Insufficient disk space to write " + writeSize + " bytes");
>     return directory;
> }
> {code}
> The fixed code is:
> {code}
> protected Directories.DataDirectory getWriteDirectory(long writeSize)
> {
>     Directories.DataDirectory directory = getDirectories().getWriteableLocation(writeSize);
>     if (directory == null)
>         throw new FSWriteError(new IOException("Insufficient disk space to write " + writeSize + " bytes"), "");
>     return directory;
> }
> {code}
> The fixed code throws FSWE and triggers the failure policy.






[jira] [Updated] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions

2017-09-14 Thread Dimitar Dimitrov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dimitar Dimitrov updated CASSANDRA-13692:
-
Attachment: c13692-testall-results-updated.PNG

> CompactionAwareWriter_getWriteDirectory throws incompatible exceptions
> --
>
> Key: CASSANDRA-13692
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13692
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Hao Zhong
>Assignee: Dimitar Dimitrov
>  Labels: lhf
> Attachments: c13692-2.2-dtest-results.PNG, 
> c13692-2.2-testall-results.PNG, c13692-3.0-dtest-results.PNG, 
> c13692-3.0-dtest-results-updated.PNG, c13692-3.0-testall-results.PNG, 
> c13692-3.11-dtest-results.PNG, c13692-3.11-dtest-results-updated.PNG, 
> c13692-3.11-testall-results.PNG, c13692-dtest-results.PNG, 
> c13692-dtest-results-updated.PNG, c13692-testall-results.PNG, 
> c13692-testall-results-updated.PNG
>
>
> The CompactionAwareWriter_getWriteDirectory method throws a RuntimeException:
> {code}
> public Directories.DataDirectory getWriteDirectory(Iterable<SSTableReader> sstables, long estimatedWriteSize)
> {
>     File directory = null;
>     for (SSTableReader sstable : sstables)
>     {
>         if (directory == null)
>             directory = sstable.descriptor.directory;
>         if (!directory.equals(sstable.descriptor.directory))
>         {
>             logger.trace("All sstables not from the same disk - putting results in {}", directory);
>             break;
>         }
>     }
>     Directories.DataDirectory d = getDirectories().getDataDirectoryForFile(directory);
>     if (d != null)
>     {
>         long availableSpace = d.getAvailableSpace();
>         if (availableSpace < estimatedWriteSize)
>             throw new RuntimeException(String.format("Not enough space to write %s to %s (%s available)",
>                                                      FBUtilities.prettyPrintMemory(estimatedWriteSize),
>                                                      d.location,
>                                                      FBUtilities.prettyPrintMemory(availableSpace)));
>         logger.trace("putting compaction results in {}", directory);
>         return d;
>     }
>     d = getDirectories().getWriteableLocation(estimatedWriteSize);
>     if (d == null)
>         throw new RuntimeException(String.format("Not enough disk space to store %s",
>                                                  FBUtilities.prettyPrintMemory(estimatedWriteSize)));
>     return d;
> }
> {code}
> However, the thrown exception does not trigger the failure policy.
> CASSANDRA-11448 fixed a similar problem. The buggy code there was:
> {code}
> protected Directories.DataDirectory getWriteDirectory(long writeSize)
> {
>     Directories.DataDirectory directory = getDirectories().getWriteableLocation(writeSize);
>     if (directory == null)
>         throw new RuntimeException("Insufficient disk space to write " + writeSize + " bytes");
>     return directory;
> }
> {code}
> The fixed code is:
> {code}
> protected Directories.DataDirectory getWriteDirectory(long writeSize)
> {
>     Directories.DataDirectory directory = getDirectories().getWriteableLocation(writeSize);
>     if (directory == null)
>         throw new FSWriteError(new IOException("Insufficient disk space to write " + writeSize + " bytes"), "");
>     return directory;
> }
> {code}
> The fixed code throws FSWriteError (FSWE), which triggers the failure policy.
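
The distinction the description draws can be illustrated with a minimal, self-contained sketch. Everything in it is an assumption made for illustration: {{SketchFSWriteError}}, {{pickDirectoryOld}}, {{pickDirectoryNew}}, and {{triggersFailurePolicy}} are hypothetical stand-ins, not Cassandra classes or methods, and the real failure-policy handling may well be structured differently. The only point shown is that a handler which dispatches on exception type cannot react to a plain RuntimeException.

```java
import java.io.IOException;

public class FailurePolicySketch
{
    /** Hypothetical stand-in for a file-system write error; not the real Cassandra class. */
    static class SketchFSWriteError extends Error
    {
        SketchFSWriteError(Throwable cause) { super(cause); }
    }

    /** Old behaviour: a generic exception, indistinguishable from any other runtime failure. */
    static String pickDirectoryOld(long available, long needed)
    {
        if (available < needed)
            throw new RuntimeException("Not enough space to write " + needed + " bytes");
        return "/data";
    }

    /** Proposed behaviour: a file-system-specific error wrapping an IOException. */
    static String pickDirectoryNew(long available, long needed)
    {
        if (available < needed)
            throw new SketchFSWriteError(new IOException("Not enough space to write " + needed + " bytes"));
        return "/data";
    }

    /** Hypothetical handler: only file-system errors are routed to the failure policy. */
    static boolean triggersFailurePolicy(Throwable t)
    {
        return t instanceof SketchFSWriteError;
    }

    /** Runs the action and reports whether a thrown error would have reached the policy. */
    static boolean policyTriggered(Runnable action)
    {
        try
        {
            action.run();
            return false; // nothing was thrown
        }
        catch (Throwable t)
        {
            return triggersFailurePolicy(t);
        }
    }

    public static void main(String[] args)
    {
        System.out.println(policyTriggered(() -> pickDirectoryOld(10, 100))); // prints "false"
        System.out.println(policyTriggered(() -> pickDirectoryNew(10, 100))); // prints "true"
    }
}
```

In this sketch the old variant fails just as loudly as the new one, but only the new variant is recognisable to the type-based handler, which is the incompatibility the issue title refers to.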






[jira] [Updated] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions

2017-09-14 Thread Dimitar Dimitrov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dimitar Dimitrov updated CASSANDRA-13692:
-
Status: Patch Available  (was: In Progress)

Marking this as having a submitted patch ready for a review.




[jira] [Comment Edited] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions

2017-09-14 Thread Dimitar Dimitrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16150212#comment-16150212
 ] 

Dimitar Dimitrov edited comment on CASSANDRA-13692 at 9/14/17 11:29 AM:


Okay, here are the branches with the proposed changes:

| [2.2|https://github.com/apache/cassandra/compare/cassandra-2.2...dimitarndimitrov:c13692-2.2] | [testall|^c13692-2.2-testall-results.PNG] | [dtest|^c13692-2.2-dtest-results.PNG] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-2.2_dtest/lastCompletedBuild/testReport/]) |
| [3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...dimitarndimitrov:c13692-3.0] | [testall|^c13692-3.0-testall-results.PNG] | [dtest|^c13692-3.0-dtest-results.PNG] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.0_dtest/lastCompletedBuild/testReport/]) |
| [3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...dimitarndimitrov:c13692-3.11] | [testall|^c13692-3.11-testall-results.PNG] ([testall-baseline|https://cassci.datastax.com/job/cassandra-3.11_testall/lastCompletedBuild/testReport/]) | [dtest|^c13692-3.11-dtest-results.PNG] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.11_dtest/lastCompletedBuild/testReport/]) |
| [trunk|https://github.com/apache/cassandra/compare/trunk...dimitarndimitrov:c13692] | [testall|^c13692-testall-results.PNG] | [dtest|^c13692-dtest-results.PNG] ([dtest-baseline|https://cassci.datastax.com/job/trunk_dtest/lastCompletedBuild/testReport/]) |

testall results look good for all branches, but there's a common theme of 
{{consistency_test.TestConsistency.test_13747}} dtests failing, in addition to 
the common-expected-to-be-unrelated dtest failures.
My assumption is that this is related to CASSANDRA-13747 (the comments there 
seem to corroborate that).

[~iamaleksey], do you have an idea if that could be the case?


was (Author: dimitarndimitrov):
Okay, here are the branches with the proposed changes:

| [2.2|https://github.com/apache/cassandra/compare/cassandra-2.2...dimitarndimitrov:c13692-2.2] | [testall|^c13692-2.2-testall-results.PNG] | [dtest|^c13692-2.2-dtest-results.PNG] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-2.2_dtest/lastCompletedBuild/testReport/]) |
| [3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...dimitarndimitrov:c13692-3.0] | [testall|^c13692-3.0-testall-results.PNG] | [dtest|^c13692-3.0-dtest-results.PNG] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.0_dtest/lastCompletedBuild/testReport/]) |
| [3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...dimitarndimitrov:c13692-3.11] | [testall|^c13692-3.11-testall-results.PNG] ([testall-baseline|https://cassci.datastax.com/job/cassandra-3.11_testall/lastCompletedBuild/testReport/]) | [dtest|^c13692-3.11-dtest-results.PNG] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.11_dtest/lastCompletedBuild/testReport/]) |
| [trunk|https://github.com/apache/cassandra/compare/trunk...dimitarndimitrov:c13692] | [testall|^c13692-testall-results.PNG] | [dtest|^c13692-dtest-results.PNG] ([dtest-baseline|https://cassci.datastax.com/job/trunk_dtest/lastCompletedBuild/testReport/]) |

{{testall}} results look good for all branches, but there's a common theme of 
consistency_test.TestConsistency.test_13747 dtests failing, in addition to the 
common-expected-to-be-unrelated {{dtest}} failures.
My assumption is that this is related to CASSANDRA-13747 (the comments there 
seem to corroborate that).

[~iamaleksey], do you have an idea if that could be the case?


[jira] [Comment Edited] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions

2017-09-14 Thread Dimitar Dimitrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16150355#comment-16150355
 ] 

Dimitar Dimitrov edited comment on CASSANDRA-13692 at 9/14/17 11:29 AM:


Some additional observations, after taking yet another look at the test results:
* Although very similar, the 3.11 testall failures are not exactly the same as 
the ones in the baseline.
* The trunk dtest failures seem to diverge from the pattern of 
"common-expected-to-be-unrelated failures plus test_13747 failures". I'll try 
to see whether this can be attributed to flakiness (looking closer at the 
results, re-running the CI run on the same branch, running another CI run on a 
clean branch copy of the trunk, etc.)


was (Author: dimitarndimitrov):
Some additional observations, after taking yet another look at the test results:
* Although very similar, the 3.11 {{testall}} failures are not exactly the same 
as the ones in the baseline.
* The trunk {{dtest}} failures seem to diverge from the pattern of 
"common-expected-to-be-unrelated failures plus test_13747 failures". I'll try 
to see whether this can be attributed to flakiness (looking closer at the 
results, re-running the CI run on the same branch, running another CI run on a 
clean branch copy of the trunk, etc.)


[jira] [Comment Edited] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions

2017-09-14 Thread Dimitar Dimitrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16154343#comment-16154343
 ] 

Dimitar Dimitrov edited comment on CASSANDRA-13692 at 9/14/17 11:28 AM:


I ended up rebasing the 3.0, 3.11, and trunk versions of the change, and the 
results are as follows:
* No more {{consistency_test.TestConsistency.test_13747}} failures for the 3.0 
and 3.11 changes. Small differences between actual and expected dtest failures, 
which seem to be flaking out.
* The trunk changes are still hitting what seems to be an existing problem 
plaguing cassci.datastax.com trunk dtest jobs for the last 10 days or so (see 
https://cassci.datastax.com/job/trunk_dtest/).
** The problem is currently being investigated.

Now it looks like the 2.2, 3.0, and 3.11 changes can be accepted as passing the 
testall and dtest criteria; only the trunk changes still need to be verified, 
once the problem affecting trunk is resolved. I'll post an update once this 
happens, but in the meantime, it's possible to mark this as ready for review.


was (Author: dimitarndimitrov):
I ended up rebasing the 3.0, 3.11, and trunk versions of the change, and the 
results are as follows:
* No more consistency_test.TestConsistency.test_13747 failures for the 3.0 and 
3.11 changes. Small differences between actual and expected dtest failures, 
which seem to be flaking out.
* The trunk changes are still hitting what seems to be an existing problem 
plaguing cassci.datastax.com trunk dtest jobs for the last 10 days or so (see 
https://cassci.datastax.com/job/trunk_dtest/).
** The problem is currently being investigated.

Now it looks like the 2.2, 3.0, and 3.11 changes can be accepted as passing the 
testall and dtest criteria, only the trunk changes need to be verified, after 
the problem affecting trunk gets resolved. I'll post an update once this 
happens, but in the meantime, it's possible to mark this as ready for review.


[jira] [Comment Edited] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions

2017-09-14 Thread Dimitar Dimitrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16152425#comment-16152425
 ] 

Dimitar Dimitrov edited comment on CASSANDRA-13692 at 9/14/17 11:27 AM:


All observed dtest failures, including the 
{{consistency_test.TestConsistency.test_13747}} failures, reproduce identically 
on brand-new copies of the cassandra-3.11 and trunk branches of my 
apache/cassandra fork.
My fork is by now tens of commits behind the origin, so I'll update the fork 
and re-run the CI jobs for the 3.0, 3.11, and trunk branches.
I'm expecting to see a much more consistent picture this time.


was (Author: dimitarndimitrov):
All observed {{dtest}} failures, including the 
consistency_test.TestConsistency.test_13747 failures, reproduce exactly the 
same on brand new copies of the cassandra-3.11 and trunk branches of my 
apache/cassandra fork.
My fork is by now tens of commits behind the origin, so I'll update the fork, 
and re-run the CI jobs for the 3.0, 3.11, and trunk branches.
I'm expecting to see a much more consistent picture this time.


[jira] [Updated] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions

2017-09-14 Thread Dimitar Dimitrov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dimitar Dimitrov updated CASSANDRA-13692:
-
Attachment: c13692-dtest-results-updated.PNG




[jira] [Commented] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions

2017-09-14 Thread Dimitar Dimitrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16166103#comment-16166103
 ] 

Dimitar Dimitrov commented on CASSANDRA-13692:
--

Okay, it looks like the {{trunk}} dtest abort problems have been fixed - and 
here are the new test results that confirm that.
Unfortunately the runs are still not green.
Nevertheless the failures seen in the baseline and my run are almost identical, 
with 2 failures from cassci not showing up in my run, and 1 failure from my run 
not showing up in cassci.
* To better compare the failures, sort the cassci results by test name, as the 
results from my run have also been sorted this way.

| [2.2|https://github.com/apache/cassandra/compare/cassandra-2.2...dimitarndimitrov:c13692-2.2] | [testall|^c13692-2.2-testall-results.PNG] | [dtest|^c13692-2.2-dtest-results.PNG] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-2.2_dtest/lastCompletedBuild/testReport/]) |
| [3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...dimitarndimitrov:c13692-3.0] | [testall|^c13692-3.0-testall-results.PNG] | [dtest|^c13692-3.0-dtest-results-updated.PNG] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.0_dtest/989/testReport/]) |
| [3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...dimitarndimitrov:c13692-3.11] | [testall|^c13692-3.11-testall-results.PNG] ([testall-baseline|https://cassci.datastax.com/job/cassandra-3.11_testall/lastCompletedBuild/testReport/]) | [dtest|^c13692-3.11-dtest-results-updated.PNG] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.11_dtest/165/testReport/]) |
| [trunk|https://github.com/apache/cassandra/compare/trunk...dimitarndimitrov:c13692] | [testall|^c13692-testall-results.PNG] | [*dtest*|^c13692-dtest-results-updated.PNG] ([*dtest-baseline*|https://cassci.datastax.com/job/trunk_dtest/1654/testReport/]) |


[jira] [Updated] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions

2017-09-06 Thread Dimitar Dimitrov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dimitar Dimitrov updated CASSANDRA-13692:
-
Attachment: c13692-3.0-dtest-results-updated.PNG
c13692-3.11-dtest-results-updated.PNG

Adding updated screenshots from CI dtest results.

> CompactionAwareWriter_getWriteDirectory throws incompatible exceptions
> --
>
> Key: CASSANDRA-13692
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13692
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Hao Zhong
>Assignee: Dimitar Dimitrov
>  Labels: lhf
> Attachments: c13692-2.2-dtest-results.PNG, 
> c13692-2.2-testall-results.PNG, c13692-3.0-dtest-results.PNG, 
> c13692-3.0-dtest-results-updated.PNG, c13692-3.0-testall-results.PNG, 
> c13692-3.11-dtest-results.PNG, c13692-3.11-dtest-results-updated.PNG, 
> c13692-3.11-testall-results.PNG, c13692-dtest-results.PNG, 
> c13692-testall-results.PNG
>
>
> The CompactionAwareWriter_getWriteDirectory throws RuntimeException:
> {code}
> public Directories.DataDirectory getWriteDirectory(Iterable<SSTableReader> sstables, long estimatedWriteSize)
> {
>     File directory = null;
>     for (SSTableReader sstable : sstables)
>     {
>         if (directory == null)
>             directory = sstable.descriptor.directory;
>         if (!directory.equals(sstable.descriptor.directory))
>         {
>             logger.trace("All sstables not from the same disk - putting results in {}", directory);
>             break;
>         }
>     }
>     Directories.DataDirectory d = getDirectories().getDataDirectoryForFile(directory);
>     if (d != null)
>     {
>         long availableSpace = d.getAvailableSpace();
>         if (availableSpace < estimatedWriteSize)
>             throw new RuntimeException(String.format("Not enough space to write %s to %s (%s available)",
>                                                      FBUtilities.prettyPrintMemory(estimatedWriteSize),
>                                                      d.location,
>                                                      FBUtilities.prettyPrintMemory(availableSpace)));
>         logger.trace("putting compaction results in {}", directory);
>         return d;
>     }
>     d = getDirectories().getWriteableLocation(estimatedWriteSize);
>     if (d == null)
>         throw new RuntimeException(String.format("Not enough disk space to store %s",
>                                                  FBUtilities.prettyPrintMemory(estimatedWriteSize)));
>     return d;
> }
> {code}
> However, the thrown exception does not trigger the failure policy. 
> CASSANDRA-11448 fixed a similar problem. The buggy code is:
> {code}
> protected Directories.DataDirectory getWriteDirectory(long writeSize)
> {
>     Directories.DataDirectory directory = getDirectories().getWriteableLocation(writeSize);
>     if (directory == null)
>         throw new RuntimeException("Insufficient disk space to write " + writeSize + " bytes");
>     return directory;
> }
> {code}
> The fixed code is:
> {code}
> protected Directories.DataDirectory getWriteDirectory(long writeSize)
> {
>     Directories.DataDirectory directory = getDirectories().getWriteableLocation(writeSize);
>     if (directory == null)
>         throw new FSWriteError(new IOException("Insufficient disk space to write " + writeSize + " bytes"), "");
>     return directory;
> }
> {code}
> The fixed code throws FSWE and triggers the failure policy.
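
The difference the description points at can be shown with a minimal, self-contained sketch. Everything in it is a stand-in rather than Cassandra code: the nested {{FSWriteError}} class only mirrors the (cause, path) constructor shape used in the quoted fix, and {{triggersFailurePolicy}}, {{checkSpaceBefore}}, {{checkSpaceAfter}} and {{policyTriggered}} are hypothetical names made up for illustration. It assumes the failure policy is keyed on the error type, which is what the description implies.

```java
import java.io.IOException;

final class WriteDirectorySketch
{
    /** Hypothetical stand-in for FSWriteError; only the (cause, path) shape comes from the quoted fix. */
    static final class FSWriteError extends Error
    {
        final String path;

        FSWriteError(Throwable cause, String path)
        {
            super(cause);
            this.path = path;
        }
    }

    /** Hypothetical stand-in for the code that decides whether the failure policy applies. */
    static boolean triggersFailurePolicy(Throwable t)
    {
        return t instanceof FSWriteError;
    }

    /** Shape of the space check as quoted: a plain RuntimeException. */
    static void checkSpaceBefore(long availableSpace, long estimatedWriteSize)
    {
        if (availableSpace < estimatedWriteSize)
            throw new RuntimeException(String.format("Not enough space to write %d bytes (%d available)",
                                                     estimatedWriteSize, availableSpace));
    }

    /** Same check following the CASSANDRA-11448 pattern: wrap an IOException in the FS error type. */
    static void checkSpaceAfter(long availableSpace, long estimatedWriteSize)
    {
        if (availableSpace < estimatedWriteSize)
            throw new FSWriteError(new IOException(String.format("Not enough space to write %d bytes (%d available)",
                                                                 estimatedWriteSize, availableSpace)), "");
    }

    /** Runs a check and reports whether a resulting failure would reach the failure policy. */
    static boolean policyTriggered(Runnable check)
    {
        try
        {
            check.run();
            return false;
        }
        catch (Throwable t)
        {
            return triggersFailurePolicy(t);
        }
    }

    public static void main(String[] args)
    {
        System.out.println(policyTriggered(() -> checkSpaceBefore(10, 100))); // prints "false"
        System.out.println(policyTriggered(() -> checkSpaceAfter(10, 100)));  // prints "true"
    }
}
```

In this sketch the message text is unchanged between the two variants; only the thrown type differs, and that is what decides whether the policy hook sees the failure.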



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions

2017-09-06 Thread Dimitar Dimitrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16155061#comment-16155061
 ] 

Dimitar Dimitrov commented on CASSANDRA-13692:
--

Here's the table with the updated test results (in bold, trunk dtest stability 
issues notwithstanding):

| [2.2|https://github.com/apache/cassandra/compare/cassandra-2.2...dimitarndimitrov:c13692-2.2] | [testall|^c13692-2.2-testall-results.PNG] | [dtest|^c13692-2.2-dtest-results.PNG] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-2.2_dtest/lastCompletedBuild/testReport/]) |
| [3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...dimitarndimitrov:c13692-3.0] | [testall|^c13692-3.0-testall-results.PNG] | [*dtest*|^c13692-3.0-dtest-results-updated.PNG] ([*dtest-baseline*|https://cassci.datastax.com/job/cassandra-3.0_dtest/989/testReport/]) |
| [3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...dimitarndimitrov:c13692-3.11] | [testall|^c13692-3.11-testall-results.PNG] ([testall-baseline|https://cassci.datastax.com/job/cassandra-3.11_testall/lastCompletedBuild/testReport/]) | [*dtest*|^c13692-3.11-dtest-results-updated.PNG] ([*dtest-baseline*|https://cassci.datastax.com/job/cassandra-3.11_dtest/165/testReport/]) |
| [trunk|https://github.com/apache/cassandra/compare/trunk...dimitarndimitrov:c13692] | [testall|^c13692-testall-results.PNG] | [dtest|^c13692-dtest-results.PNG] ([dtest-baseline|https://cassci.datastax.com/job/trunk_dtest/lastCompletedBuild/testReport/]) |


[jira] [Comment Edited] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions

2017-09-05 Thread Dimitar Dimitrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16154343#comment-16154343
 ] 

Dimitar Dimitrov edited comment on CASSANDRA-13692 at 9/5/17 9:50 PM:
--

I ended up rebasing the 3.0, 3.11, and trunk versions of the change, and the 
results are as follows:
* No more consistency_test.TestConsistency.test_13747 failures for the 3.0 and 
3.11 changes. There are small differences between the actual and expected dtest 
failures, which seem to be caused by flaky tests.
* The trunk changes are still hitting what seems to be an existing problem 
plaguing cassci.datastax.com trunk dtest jobs for the last 10 days or so (see 
https://cassci.datastax.com/job/trunk_dtest/).
** The problem is currently being investigated.

Now it looks like the 2.2, 3.0, and 3.11 changes can be accepted as passing the 
testall and dtest criteria; only the trunk changes still need to be verified, 
once the problem affecting trunk is resolved. I'll post an update once this 
happens, but in the meantime, it's possible to mark this as ready for review.


was (Author: dimitarndimitrov):
I ended up rebasing the 3.0, 3.11, and trunk versions of the change, and the 
results are as follows:
* No more consistency_test.TestConsistency.test_13747 failures for the 3.0 and 
3.11 changes. There are small differences between the actual and expected dtest 
failures, which seem to be caused by flaky tests.
* The trunk changes are still hitting what seems to be an existing problem 
plaguing cassci.datastax.com trunk dtest jobs for the last 10 days or so (see 
https://cassci.datastax.com/job/trunk_dtest/).
** The problem is currently being investigated.

Now it looks like the 2.2, 3.0, and 3.11 changes can be accepted as passing the 
testall and dtest criteria; only the trunk changes still need to be verified, 
once the problem affecting trunk is resolved. I'll post an update once this 
happens, but in the meantime, I'd assume it's safe to mark this as "Awaiting 
Feedback".


[jira] [Commented] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions

2017-09-05 Thread Dimitar Dimitrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16154343#comment-16154343
 ] 

Dimitar Dimitrov commented on CASSANDRA-13692:
--

I ended up rebasing the 3.0, 3.11, and trunk versions of the change, and the 
results are as follows:
* No more consistency_test.TestConsistency.test_13747 failures for the 3.0 and 
3.11 changes. There are small differences between the actual and expected dtest 
failures, which seem to be caused by flaky tests.
* The trunk changes are still hitting what seems to be an existing problem 
plaguing cassci.datastax.com trunk dtest jobs for the last 10 days or so (see 
https://cassci.datastax.com/job/trunk_dtest/).
** The problem is currently being investigated.

Now it looks like the 2.2, 3.0, and 3.11 changes can be accepted as passing the 
testall and dtest criteria; only the trunk changes still need to be verified, 
once the problem affecting trunk is resolved. I'll post an update once this 
happens, but in the meantime, I'd assume it's safe to mark this as "Awaiting 
Feedback".




[jira] [Comment Edited] (CASSANDRA-13703) Using min_compress_ratio <= 1 causes corruption

2017-09-05 Thread Dimitar Dimitrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16153675#comment-16153675
 ] 

Dimitar Dimitrov edited comment on CASSANDRA-13703 at 9/5/17 1:56 PM:
--

+1 - from my still largely layman perspective, the change looks good.

A couple of small nits:
* {{CompressedChunkReader.maybeCheckCrc()}} (renamed in this patch to 
{{shouldCheckCrc()}}) may be removed or at least its visibility could be 
reduced. It seems to be used solely in CompressedChunkReader.java, lines 158 
and 204 in this patch.
* The no-argument / single-argument factory methods for Snappy and LZ4 
compressions in CompressionParams.java seem to differ in the values that they 
default to for min compression ratio and max compressed length.
** For Snappy, if nothing is specified, or only chunk length is specified, a 
default min compression ratio of 1.1 is used, and therefore max compressed 
length ends up somewhere roughly around 90% of chunk length.
** For LZ4, if nothing is specified, or only chunk length is specified, a 
default max compressed length of chunk length is used, and therefore min 
compression ratio ends up at 1.0 (I'm not sure if a precision error is possible 
there).

Edit: Of course, take this review with the appropriate rock-sized grain of salt.


was (Author: dimitarndimitrov):
+1 - from my still largely layman perspective, the change looks good.

A couple of small nits:
* {{CompressedChunkReader.maybeCheckCrc()}} (renamed in this patch to 
{{shouldCheckCrc()}}) may be removed or at least its visibility could be 
reduced. It seems to be used solely in CompressedChunkReader.java, lines 158 
and 204 in this patch.
* The no-argument / single-argument factory methods for Snappy and LZ4 
compressions in CompressionParams.java seem to differ in the values that they 
default to for min compression ratio and max compressed length.
** For Snappy, if nothing is specified, or only chunk length is specified, a 
default min compression ratio of 1.1 is used, and therefore max compressed 
length ends up somewhere roughly around 90% of chunk length.
** For LZ4, if nothing is specified, or only chunk length is specified, a 
default max compressed length of chunk length is used, and therefore min 
compression ratio ends up at 1.0 (I'm not sure if a precision error is possible 
there).

> Using min_compress_ratio <= 1 causes corruption
> ---
>
> Key: CASSANDRA-13703
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13703
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Blocker
> Fix For: 4.x
>
> Attachments: patch
>
>
> This is because chunks written uncompressed end up below the compressed size 
> threshold. Demonstrated by applying the attached patch meant to improve the 
> testing of the 10520 changes, and running 
> {{CompressedSequentialWriterTest.testLZ4Writer}}.
> The default {{min_compress_ratio: 0}} is not affected as it never writes 
> uncompressed.






[jira] [Commented] (CASSANDRA-13703) Using min_compress_ratio <= 1 causes corruption

2017-09-05 Thread Dimitar Dimitrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16153675#comment-16153675
 ] 

Dimitar Dimitrov commented on CASSANDRA-13703:
--

+1 - from my still largely layman perspective, the change looks good.

A couple of small nits:
* {{CompressedChunkReader.maybeCheckCrc()}} (renamed in this patch to 
{{shouldCheckCrc()}}) may be removed or at least its visibility could be 
reduced. It seems to be used solely in CompressedChunkReader.java, lines 158 
and 204 in this patch.
* The no-argument / single-argument factory methods for Snappy and LZ4 
compressions in CompressionParams.java seem to differ in the values that they 
default to for min compression ratio and max compressed length.
** For Snappy, if nothing is specified, or only chunk length is specified, a 
default min compression ratio of 1.1 is used, and therefore max compressed 
length ends up somewhere roughly around 90% of chunk length.
** For LZ4, if nothing is specified, or only chunk length is specified, a 
default max compressed length of chunk length is used, and therefore min 
compression ratio ends up at 1.0 (I'm not sure if a precision error is possible 
there).
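
The arithmetic behind the two defaults can be sketched as follows. The two conversion formulas are an assumption about how min compression ratio and max compressed length relate (max = ceil(chunk / ratio), ratio = chunk / max), not a quote of CompressionParams.java, and the 64 KiB chunk length is only an illustrative value; the class and method names are hypothetical.

```java
final class CompressionDefaultsSketch
{
    /** Assumed conversion: the largest compressed size that still satisfies the minimum ratio. */
    static int maxCompressedLength(int chunkLength, double minCompressRatio)
    {
        return (int) Math.ceil(Math.min(chunkLength / minCompressRatio, Integer.MAX_VALUE));
    }

    /** Assumed inverse conversion: the ratio implied by a maximum compressed size. */
    static double minCompressRatio(int chunkLength, int maxCompressedLength)
    {
        return chunkLength * 1.0 / maxCompressedLength;
    }

    public static void main(String[] args)
    {
        int chunkLength = 64 * 1024; // illustrative chunk length, not taken from the patch

        // Snappy-style default: start from a ratio of 1.1 and derive the length.
        int snappyMax = maxCompressedLength(chunkLength, 1.1);
        System.out.println(snappyMax);                       // prints "59579"
        System.out.println(100.0 * snappyMax / chunkLength); // roughly 90.9 (percent of chunk length)

        // LZ4-style default: start from max length == chunk length and derive the ratio.
        System.out.println(minCompressRatio(chunkLength, chunkLength) == 1.0); // prints "true"
    }
}
```

Under these assumed formulas the LZ4 case has no precision problem: an int converted to double and divided by the same value yields exactly 1.0 in IEEE 754 arithmetic.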




[jira] [Commented] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions

2017-09-04 Thread Dimitar Dimitrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16152425#comment-16152425
 ] 

Dimitar Dimitrov commented on CASSANDRA-13692:
--

All observed {{dtest}} failures, including the 
consistency_test.TestConsistency.test_13747 failures, reproduce identically on 
brand new copies of the cassandra-3.11 and trunk branches of my 
apache/cassandra fork.
My fork is by now tens of commits behind the origin, so I'll update the fork 
and re-run the CI jobs for the 3.0, 3.11, and trunk branches.
I expect to see a much more consistent picture this time.




[jira] [Comment Edited] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions

2017-09-01 Thread Dimitar Dimitrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16150355#comment-16150355
 ] 

Dimitar Dimitrov edited comment on CASSANDRA-13692 at 9/1/17 12:03 PM:
---

Some additional observations, after taking yet another look at the test results:
* Although very similar, the 3.11 {{testall}} failures are not exactly the same 
as the ones in the baseline.
* The trunk {{dtest}} failures seem to diverge from the pattern of 
"common-expected-to-be-unrelated failures plus test_13747 failures". I'll try 
to see whether this can be attributed to flakiness (looking closer at the 
results, re-running the CI run on the same branch, running another CI run on a 
clean branch copy of the trunk, etc.)
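
The comparison against the baseline amounts to a set difference between the branch failures and the baseline failures. A minimal sketch, assuming failures are identified by fully qualified test name; apart from test_13747, the test names below are hypothetical placeholders, and the class and method names are made up for illustration:

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

final class FailureDiffSketch
{
    /** Failures seen on the branch that the baseline run does not have. */
    static Set<String> newFailures(Set<String> branch, Set<String> baseline)
    {
        Set<String> result = new TreeSet<>(branch);
        result.removeAll(baseline);
        return result;
    }

    public static void main(String[] args)
    {
        // Hypothetical baseline failures (placeholder names).
        Set<String> baseline = new TreeSet<>(Arrays.asList(
            "hypothetical_test.TestA.test_one",
            "hypothetical_test.TestB.test_two"));

        // Branch failures: the same placeholders plus the test named in this thread.
        Set<String> branch = new TreeSet<>(Arrays.asList(
            "hypothetical_test.TestA.test_one",
            "hypothetical_test.TestB.test_two",
            "consistency_test.TestConsistency.test_13747"));

        // prints "[consistency_test.TestConsistency.test_13747]"
        System.out.println(newFailures(branch, baseline));
    }
}
```

Swapping the arguments gives the baseline failures that did not reproduce on the branch, which is the other half of checking for flakiness.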


was (Author: dimitarndimitrov):
Some additional observations, after taking yet another look at the test results:
* Although very similar, the 3.11 {{testall}} failures are not exactly the same 
as the ones in the baseline.
* The trunk {{dtest}} failures seem to diverge from the pattern of 
"common-expected-to-be-unrelated failures plus test_13747 failures". I'll try 
to see whether this can be attributed to flakiness (looking closer to the 
results, re-running the CI run on the same branch, running another CI run on a 
clean branch copy of the trunk, etc.)


[jira] [Commented] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions

2017-09-01 Thread Dimitar Dimitrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16150351#comment-16150351
 ] 

Dimitar Dimitrov commented on CASSANDRA-13692:
--

Ah, sorry, I should have at least attached the failure logs - I didn't attach 
the build artifacts, as I wasn't sure if they were sanitized with regard to 
non-public data.
I'll sync with a more knowledgeable colleague, and get back to you with the 
necessary info.

P.S. As you've probably noticed, I'm still new to being one of the more visible 
presences here, and many of the steps in the process are a bit hazy to me - 
I'll make sure to improve quickly on that though :)

> CompactionAwareWriter_getWriteDirectory throws incompatible exceptions
> --
>
> Key: CASSANDRA-13692
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13692
> Project: Cassandra
> Issue Type: Bug
> Components: Compaction
> Reporter: Hao Zhong
> Assignee: Dimitar Dimitrov
> Labels: lhf
> Attachments: c13692-2.2-dtest-results.PNG, 
> c13692-2.2-testall-results.PNG, c13692-3.0-dtest-results.PNG, 
> c13692-3.0-testall-results.PNG, c13692-3.11-dtest-results.PNG, 
> c13692-3.11-testall-results.PNG, c13692-dtest-results.PNG, 
> c13692-testall-results.PNG
>
>
> The CompactionAwareWriter_getWriteDirectory throws RuntimeException:
> {code}
> public Directories.DataDirectory getWriteDirectory(Iterable<SSTableReader> sstables, long estimatedWriteSize)
> {
>     File directory = null;
>     for (SSTableReader sstable : sstables)
>     {
>         if (directory == null)
>             directory = sstable.descriptor.directory;
>         if (!directory.equals(sstable.descriptor.directory))
>         {
>             logger.trace("All sstables not from the same disk - putting results in {}", directory);
>             break;
>         }
>     }
>     Directories.DataDirectory d = getDirectories().getDataDirectoryForFile(directory);
>     if (d != null)
>     {
>         long availableSpace = d.getAvailableSpace();
>         if (availableSpace < estimatedWriteSize)
>             throw new RuntimeException(String.format("Not enough space to write %s to %s (%s available)",
>                                                      FBUtilities.prettyPrintMemory(estimatedWriteSize),
>                                                      d.location,
>                                                      FBUtilities.prettyPrintMemory(availableSpace)));
>         logger.trace("putting compaction results in {}", directory);
>         return d;
>     }
>     d = getDirectories().getWriteableLocation(estimatedWriteSize);
>     if (d == null)
>         throw new RuntimeException(String.format("Not enough disk space to store %s",
>                                                  FBUtilities.prettyPrintMemory(estimatedWriteSize)));
>     return d;
> }
> {code}
> However, the thrown exception does not trigger the failure policy. 
> CASSANDRA-11448 fixed a similar problem. The buggy code is:
> {code}
> protected Directories.DataDirectory getWriteDirectory(long writeSize)
> {
>     Directories.DataDirectory directory = getDirectories().getWriteableLocation(writeSize);
>     if (directory == null)
>         throw new RuntimeException("Insufficient disk space to write " + writeSize + " bytes");
>     return directory;
> }
> {code}
> The fixed code is:
> {code}
> protected Directories.DataDirectory getWriteDirectory(long writeSize)
> {
>     Directories.DataDirectory directory = getDirectories().getWriteableLocation(writeSize);
>     if (directory == null)
>         throw new FSWriteError(new IOException("Insufficient disk space to write " + writeSize + " bytes"), "");
>     return directory;
> }
> {code}
> The fixed code throws FSWE and triggers the failure policy.
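
To illustrate why the exception type matters in the description above, here is 
a minimal, self-contained Java sketch. It is not Cassandra code: 
FsWriteErrorSketch, handle() and the invocation counter are hypothetical 
stand-ins, and the only assumption is the one the description implies, namely 
that the failure policy is consulted for file-system error types and not for a 
plain RuntimeException.

```java
import java.io.IOException;

class FailurePolicySketch {
    /** Hypothetical stand-in for an FSWriteError-like type; NOT the real Cassandra class. */
    static class FsWriteErrorSketch extends Error {
        FsWriteErrorSketch(Throwable cause, String path) {
            super("write failed at '" + path + "'", cause);
        }
    }

    /** Counts how often the (hypothetical) disk failure policy was consulted. */
    static int policyInvocations = 0;

    /** Hypothetical handler: only file-system errors reach the failure policy. */
    static void handle(Throwable t) {
        if (t instanceof FsWriteErrorSketch)
            policyInvocations++;
        // Anything else (e.g. a plain RuntimeException) bypasses the policy.
    }

    /** Mirrors the shape of the "buggy" snippet: a generic RuntimeException. */
    static void buggy(long writeSize) {
        throw new RuntimeException("Insufficient disk space to write " + writeSize + " bytes");
    }

    /** Mirrors the shape of the "fixed" snippet: an IOException wrapped in a typed error. */
    static void fixed(long writeSize) {
        throw new FsWriteErrorSketch(
            new IOException("Insufficient disk space to write " + writeSize + " bytes"), "");
    }

    /** Runs one variant and returns how many times the policy was consulted. */
    static int run(boolean useFixed) {
        policyInvocations = 0;
        try {
            if (useFixed) fixed(1024); else buggy(1024);
        } catch (Throwable t) {
            handle(t);
        }
        return policyInvocations;
    }

    public static void main(String[] args) {
        System.out.println("buggy path consulted policy: " + (run(false) == 1));
        System.out.println("fixed path consulted policy: " + (run(true) == 1));
    }
}
```

In this sketch only the typed error is visible to the handler as a disk 
problem, which is the behavioural difference the issue asks to carry over to 
getWriteDirectory(Iterable, long).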



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions

2017-09-01 Thread Dimitar Dimitrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16150355#comment-16150355
 ] 

Dimitar Dimitrov commented on CASSANDRA-13692:
--

Some additional observations, after taking yet another look at the test results:
* Although very similar, the 3.11 {{testall}} failures are not exactly the same 
as the ones in the baseline.
* The trunk {{dtest}} failures seem to diverge from the pattern of 
"common-expected-to-be-unrelated failures plus test_13747 failures". I'll try 
to see whether this can be attributed to flakiness (looking more closely at the 
results, re-running the CI run on the same branch, running another CI run on a 
clean branch copy of the trunk, etc.).




[jira] [Comment Edited] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions

2017-09-01 Thread Dimitar Dimitrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16150212#comment-16150212
 ] 

Dimitar Dimitrov edited comment on CASSANDRA-13692 at 9/1/17 8:51 AM:
--

Okay, here are the branches with the proposed changes:

| [2.2|https://github.com/apache/cassandra/compare/cassandra-2.2...dimitarndimitrov:c13692-2.2] | [testall|^c13692-2.2-testall-results.PNG] | [dtest|^c13692-2.2-dtest-results.PNG] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-2.2_dtest/lastCompletedBuild/testReport/]) |
| [3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...dimitarndimitrov:c13692-3.0] | [testall|^c13692-3.0-testall-results.PNG] | [dtest|^c13692-3.0-dtest-results.PNG] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.0_dtest/lastCompletedBuild/testReport/]) |
| [3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...dimitarndimitrov:c13692-3.11] | [testall|^c13692-3.11-testall-results.PNG] ([testall-baseline|https://cassci.datastax.com/job/cassandra-3.11_testall/lastCompletedBuild/testReport/]) | [dtest|^c13692-3.11-dtest-results.PNG] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.11_dtest/lastCompletedBuild/testReport/]) |
| [trunk|https://github.com/apache/cassandra/compare/trunk...dimitarndimitrov:c13692] | [testall|^c13692-testall-results.PNG] | [dtest|^c13692-dtest-results.PNG] ([dtest-baseline|https://cassci.datastax.com/job/trunk_dtest/lastCompletedBuild/testReport/]) |

{{testall}} results look good for all branches, but there's a common theme of 
consistency_test.TestConsistency.test_13747 dtests failing, in addition to the 
common-expected-to-be-unrelated {{dtest}} failures.
My assumption is that this is related to CASSANDRA-13747 (the comments there 
seem to corroborate that).

[~iamaleksey], do you have an idea if that could be the case?




[jira] [Comment Edited] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions

2017-09-01 Thread Dimitar Dimitrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16150212#comment-16150212
 ] 

Dimitar Dimitrov edited comment on CASSANDRA-13692 at 9/1/17 8:50 AM:
--

Okay, here are the branches with the proposed changes:

| [2.2|https://github.com/apache/cassandra/compare/cassandra-2.2...dimitarndimitrov:c13692-2.2] | [testall|^c13692-2.2-testall-results.PNG] | [dtest|^c13692-2.2-dtest-results.PNG] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-2.2_dtest/lastCompletedBuild/testReport/]) |
| [3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...dimitarndimitrov:c13692-3.0] | [testall|^c13692-3.0-testall-results.PNG] | [dtest|^c13692-3.0-dtest-results.PNG] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.0_dtest/lastCompletedBuild/testReport/]) |
| [3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...dimitarndimitrov:c13692-3.11] | [testall|^c13692-3.11-testall-results.PNG] ([testall-baseline|https://cassci.datastax.com/job/cassandra-3.11_testall/lastCompletedBuild/testReport/]) | [dtest|^c13692-3.11-dtest-results.PNG] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.11_dtest/lastCompletedBuild/testReport/]) |
| [trunk|https://github.com/apache/cassandra/compare/trunk...dimitarndimitrov:c13692] | [testall|^c13692-testall-results.PNG] | [dtest|^c13692-dtest-results.PNG] ([dtest-baseline|https://cassci.datastax.com/job/trunk_dtest/lastCompletedBuild/testReport/]) |

{{testall}} looks good for all branches, but there's a common theme of 
consistency_test.TestConsistency.test_13747 dtests failing, in addition to the 
common-expected-to-be-unrelated {{dtest}} failures.
My assumption is that this is related to CASSANDRA-13747 (the comments there 
seem to corroborate that). [~iamaleksey] , do you have an idea if that could be 
the case?




[jira] [Comment Edited] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions

2017-09-01 Thread Dimitar Dimitrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16150212#comment-16150212
 ] 

Dimitar Dimitrov edited comment on CASSANDRA-13692 at 9/1/17 8:49 AM:
--

Okay, here are the branches with the proposed changes:

| [2.2|https://github.com/apache/cassandra/compare/cassandra-2.2...dimitarndimitrov:c13692-2.2] | [testall|^c13692-2.2-testall-results.png] | [dtest|^c13692-2.2-dtest-results.png] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-2.2_dtest/lastCompletedBuild/testReport/]) |
| [3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...dimitarndimitrov:c13692-3.0] | [testall|^c13692-3.0-testall-results.png] | [dtest|^c13692-3.0-dtest-results.png] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.0_dtest/lastCompletedBuild/testReport/]) |
| [3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...dimitarndimitrov:c13692-3.11] | [testall|^c13692-3.11-testall-results.png] ([testall-baseline|https://cassci.datastax.com/job/cassandra-3.11_testall/lastCompletedBuild/testReport/]) | [dtest|^c13692-3.11-dtest-results.png] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.11_dtest/lastCompletedBuild/testReport/]) |
| [trunk|https://github.com/apache/cassandra/compare/trunk...dimitarndimitrov:c13692] | [testall|^c13692-testall-results.png] | [dtest|^c13692-dtest-results.png] ([dtest-baseline|https://cassci.datastax.com/job/trunk_dtest/lastCompletedBuild/testReport/]) |

{{testall}} looks good for all branches, but there's a common theme of 
consistency_test.TestConsistency.test_13747 dtests failing, in addition to the 
common-expected-to-be-unrelated {{dtest}} failures.
My assumption is that this is related to CASSANDRA-13747 (the comments there 
seem to corroborate that). [~iamaleksey] , do you have an idea if that could be 
the case?




[jira] [Updated] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions

2017-09-01 Thread Dimitar Dimitrov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dimitar Dimitrov updated CASSANDRA-13692:
-
Attachment: c13692-2.2-dtest-results.PNG
c13692-2.2-testall-results.PNG
c13692-3.0-dtest-results.PNG
c13692-3.0-testall-results.PNG
c13692-3.11-dtest-results.PNG
c13692-3.11-testall-results.PNG
c13692-dtest-results.PNG
c13692-testall-results.PNG

Adding screenshots from CI results.




[jira] [Comment Edited] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions

2017-09-01 Thread Dimitar Dimitrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16150212#comment-16150212
 ] 

Dimitar Dimitrov edited comment on CASSANDRA-13692 at 9/1/17 8:46 AM:
--

Okay, here are the branches with the proposed changes:

| [2.2|https://github.com/apache/cassandra/compare/cassandra-2.2...dimitarndimitrov:c13692-2.2] | [testall|^c13692-2.2-testall-results.png] | [dtest|^c13692-2.2-dtest-results.png] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-2.2_dtest/lastCompletedBuild/testReport/]) |
| [3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...dimitarndimitrov:c13692-3.0] | [testall|^c13692-3.0-testall-results.png] | [dtest|^c13692-3.0-dtest-results.png] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.0_dtest/lastCompletedBuild/testReport/]) |
| [3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...dimitarndimitrov:c13692-3.11] | [testall|^c13692-3.11-testall-results.png] ([testall-baseline|https://cassci.datastax.com/job/cassandra-3.11_testall/lastCompletedBuild/testReport/]) | [dtest|^c13692-3.11-dtest-results.png] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.11_dtest/lastCompletedBuild/testReport/]) |
| [trunk|https://github.com/apache/cassandra/compare/trunk...dimitarndimitrov:c13692] | [testall|^c13692-testall-results.png] | [dtest|^c13692-dtest-results.png] ([dtest-baseline|https://cassci.datastax.com/job/trunk_dtest/lastCompletedBuild/testReport/]) |

{{testall}} looks good for all branches, but there's a common theme of 
consistency_test.TestConsistency.test_13747 {{dtest}}s failing, in addition to 
the common-expected-to-be-unrelated {{dtest}} failures.
My assumption is that this is related to CASSANDRA-13747 (the comments there 
seem to corroborate that). [~iamaleksey] , do you have an idea if that could be 
the case?




[jira] [Comment Edited] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions

2017-09-01 Thread Dimitar Dimitrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16150212#comment-16150212
 ] 

Dimitar Dimitrov edited comment on CASSANDRA-13692 at 9/1/17 8:46 AM:
--

Okay, here are the branches with the proposed changes:

| [2.2|https://github.com/apache/cassandra/compare/cassandra-2.2...dimitarndimitrov:c13692-2.2] | [testall|^c13692-2.2-testall-results.png] | [dtest|^c13692-2.2-dtest-results.png] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-2.2_dtest/lastCompletedBuild/testReport/]) |
| [3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...dimitarndimitrov:c13692-3.0] | [testall|^c13692-3.0-testall-results.png] | [dtest|^c13692-3.0-dtest-results.png] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.0_dtest/lastCompletedBuild/testReport/]) |
| [3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...dimitarndimitrov:c13692-3.11] | [testall|^c13692-3.11-testall-results.png] ([testall-baseline|https://cassci.datastax.com/job/cassandra-3.11_testall/lastCompletedBuild/testReport/]) | [dtest|^c13692-3.11-dtest-results.png] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.11_dtest/lastCompletedBuild/testReport/]) |
| [trunk|https://github.com/apache/cassandra/compare/trunk...dimitarndimitrov:c13692] | [testall|^c13692-testall-results.png] |
 [dtest|^c13692-dtest-results.png] ([dtest-baseline|https://cassci.datastax.com/job/trunk_dtest/lastCompletedBuild/testReport/]) |

{{testall}} looks good for all branches, but there's a common theme of 
consistency_test.TestConsistency.test_13747 {{dtest}}s failing, in addition to 
the common-expected-to-be-unrelated {{dtest}} failures.
My assumption is that this is related to CASSANDRA-13747 (the comments there 
seem to corroborate that). [~iamaleksey] , do you have an idea if that could be 
the case?


was (Author: dimitarndimitrov):
Okay, here are the branches with the proposed changes:

| [2.2|https://github.com/apache/cassandra/compare/cassandra-2.2...dimitarndimitrov:c13692-2.2] | [testall|^c13692-2.2-testall-results.png] | [dtest|^c13692-2.2-dtest-results.png] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-2.2_dtest/lastCompletedBuild/testReport/]) |
| [3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...dimitarndimitrov:c13692-3.0] | [testall|^c13692-3.0-testall-results.png] | [dtest|^c13692-3.0-dtest-results.png] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.0_dtest/lastCompletedBuild/testReport/]) |
| [3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...dimitarndimitrov:c13692-3.11] | [testall|^c13692-3.11-testall-results.png] ([testall-baseline|https://cassci.datastax.com/job/cassandra-3.11_testall/lastCompletedBuild/testReport/]) | [dtest|^c13692-3.11-dtest-results.png] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.11_dtest/lastCompletedBuild/testReport/]) |

| [trunk|https://github.com/apache/cassandra/compare/trunk...dimitarndimitrov:c13692] | [testall|^c13692-testall-results.png] |
 [dtest|^c13692-dtest-results.png] ([dtest-baseline|https://cassci.datastax.com/job/trunk_dtest/lastCompletedBuild/testReport/]) |

{{testall}} looks good for all branches, but there's a common theme of 
consistency_test.TestConsistency.test_13747 {{dtest}}s failing, in addition to 
the common-expected-to-be-unrelated {{dtest}} failures.
My assumption is that this is related to CASSANDRA-13747 (the comments there 
seem to corroborate that). [~iamaleksey] , do you have an idea if that could be 
the case?


[jira] [Comment Edited] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions

2017-09-01 Thread Dimitar Dimitrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16150212#comment-16150212
 ] 

Dimitar Dimitrov edited comment on CASSANDRA-13692 at 9/1/17 8:45 AM:
--

Okay, here are the branches with the proposed changes:

| [2.2|https://github.com/apache/cassandra/compare/cassandra-2.2...dimitarndimitrov:c13692-2.2] | [testall|^c13692-2.2-testall-results.png] | [dtest|^c13692-2.2-dtest-results.png] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-2.2_dtest/lastCompletedBuild/testReport/]) |
| [3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...dimitarndimitrov:c13692-3.0] | [testall|^c13692-3.0-testall-results.png] | [dtest|^c13692-3.0-dtest-results.png] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.0_dtest/lastCompletedBuild/testReport/]) |
| [3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...dimitarndimitrov:c13692-3.11] | [testall|^c13692-3.11-testall-results.png] ([testall-baseline|https://cassci.datastax.com/job/cassandra-3.11_testall/lastCompletedBuild/testReport/]) | [dtest|^c13692-3.11-dtest-results.png] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.11_dtest/lastCompletedBuild/testReport/]) |

| [trunk|https://github.com/apache/cassandra/compare/trunk...dimitarndimitrov:c13692] | [testall|^c13692-testall-results.png] | [dtest|^c13692-dtest-results.png] ([dtest-baseline|https://cassci.datastax.com/job/trunk_dtest/lastCompletedBuild/testReport/]) |

{{testall}} looks good for all branches, but the 
consistency_test.TestConsistency.test_13747 {{dtest}} fails repeatedly across 
branches, in addition to the usual {{dtest}} failures that are expected to be 
unrelated.
My assumption is that this is related to CASSANDRA-13747 (the comments there 
seem to corroborate that). [~iamaleksey], do you have an idea if that could be 
the case?


was (Author: dimitarndimitrov):
Okay, here are the branches with the proposed changes:

| 
[2.2|https://github.com/apache/cassandra/compare/cassandra-2.2...dimitarndimitrov:c13692-2.2]
 | [testall|^c13692-2.2-testall-results.png] | 
[dtest|^c13692-2.2-dtest-results.png] 
([dtest-baseline|https://cassci.datastax.com/job/cassandra-2.2_dtest/lastCompletedBuild/testReport/])
 |
| 
[3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...dimitarndimitrov:c13692-3.0]
 | [testall|^c13692-3.0-testall-results.png] | 
[dtest|^c13692-3.0-dtest-results.png] 
([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.0_dtest/lastCompletedBuild/testReport/])
| 
[3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...dimitarndimitrov:c13692-3.11]
 | [testall|^c13692-3.11-testall-results.png] 
([testall-baseline|https://cassci.datastax.com/job/cassandra-3.11_testall/lastCompletedBuild/testReport/])
 |
 [dtest|^c13692-3.11-dtest-results.png] 
([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.11_dtest/lastCompletedBuild/testReport/])
| 
[trunk|https://github.com/apache/cassandra/compare/trunk...dimitarndimitrov:c13692]
 | [testall|^c13692-testall-results.png] |
 [dtest|^c13692-dtest-results.png] 
([dtest-baseline|https://cassci.datastax.com/job/trunk_dtest/lastCompletedBuild/testReport/])
 |

{{testall}} looks good for all branches, but there's a common theme of 
consistency_test.TestConsistency.test_13747 {{dtest}}s failing, in addition to 
the common-expected-to-be-unrelated {{dtest}} failures.
My assumption is that this is related to CASSANDRA-13747 (the comments there 
seem to corroborate that). [~iamaleksey] , do you have an idea if that could be 
the case?


[jira] [Comment Edited] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions

2017-09-01 Thread Dimitar Dimitrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16150212#comment-16150212
 ] 

Dimitar Dimitrov edited comment on CASSANDRA-13692 at 9/1/17 8:44 AM:
--

Okay, here are the branches with the proposed changes:

| 
[2.2|https://github.com/apache/cassandra/compare/cassandra-2.2...dimitarndimitrov:c13692-2.2]
 | [testall|^c13692-2.2-testall-results.png] | 
[dtest|^c13692-2.2-dtest-results.png] 
([dtest-baseline|https://cassci.datastax.com/job/cassandra-2.2_dtest/lastCompletedBuild/testReport/])
 |
| 
[3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...dimitarndimitrov:c13692-3.0]
 | [testall|^c13692-3.0-testall-results.png] | 
[dtest|^c13692-3.0-dtest-results.png] 
([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.0_dtest/lastCompletedBuild/testReport/])
| 
[3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...dimitarndimitrov:c13692-3.11]
 | [testall|^c13692-3.11-testall-results.png] 
([testall-baseline|https://cassci.datastax.com/job/cassandra-3.11_testall/lastCompletedBuild/testReport/])
 |
 [dtest|^c13692-3.11-dtest-results.png] 
([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.11_dtest/lastCompletedBuild/testReport/])
 |
| 
[trunk|https://github.com/apache/cassandra/compare/trunk...dimitarndimitrov:c13692]
 | [testall|^c13692-testall-results.png] |
 [dtest|^c13692-dtest-results.png] 
([dtest-baseline|https://cassci.datastax.com/job/trunk_dtest/lastCompletedBuild/testReport/])
 |

{{testall}} looks good for all branches, but the 
consistency_test.TestConsistency.test_13747 {{dtest}} fails repeatedly across 
branches, in addition to the usual {{dtest}} failures that are expected to be 
unrelated.
My assumption is that this is related to CASSANDRA-13747 (the comments there 
seem to corroborate that). [~iamaleksey], do you have an idea if that could be 
the case?


was (Author: dimitarndimitrov):
Okay, here are the branches with the proposed changes:

| [2.2|https://github.com/apache/cassandra/compare/cassandra-2.2...dimitarndimitrov:c13692-2.2] | [testall|^c13692-2.2-testall-results.png] | [dtest|^c13692-2.2-dtest-results.png] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-2.2_dtest/lastCompletedBuild/testReport/]) |
| [3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...dimitarndimitrov:c13692-3.0] | [testall|^c13692-3.0-testall-results.png] | [dtest|^c13692-3.0-dtest-results.png] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.0_dtest/lastCompletedBuild/testReport/]) |
| [3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...dimitarndimitrov:c13692-3.11] | [testall|^c13692-3.11-testall-results.png] ([testall-baseline|https://cassci.datastax.com/job/cassandra-3.11_testall/lastCompletedBuild/testReport/]) | [dtest|^c13692-3.11-dtest-results.png] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.11_dtest/lastCompletedBuild/testReport/]) |
| [trunk|https://github.com/apache/cassandra/compare/trunk...dimitarndimitrov:c13692] | [testall|^c13692-testall-results.png] | [dtest|^c13692-dtest-results.png] ([dtest-baseline|https://cassci.datastax.com/job/trunk_dtest/lastCompletedBuild/testReport/]) |

{{testall}} looks good for all branches, but there's a common theme of 
consistency_test.TestConsistency.test_13747 {{dtest}}s failing, in addition to 
the common-expected-to-be-unrelated {{dtest}} failures.
My assumption is that this is related to CASSANDRA-13747 (the comments there 
seem to corroborate that). [~iamaleksey] , do you have an idea if that could be 
the case?


[jira] [Comment Edited] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions

2017-09-01 Thread Dimitar Dimitrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16150212#comment-16150212
 ] 

Dimitar Dimitrov edited comment on CASSANDRA-13692 at 9/1/17 8:44 AM:
--

Okay, here are the branches with the proposed changes:

| 
[2.2|https://github.com/apache/cassandra/compare/cassandra-2.2...dimitarndimitrov:c13692-2.2]
 | [testall|^c13692-2.2-testall-results.png] | 
[dtest|^c13692-2.2-dtest-results.png] 
([dtest-baseline|https://cassci.datastax.com/job/cassandra-2.2_dtest/lastCompletedBuild/testReport/])
 |
| 
[3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...dimitarndimitrov:c13692-3.0]
 | [testall|^c13692-3.0-testall-results.png] | 
[dtest|^c13692-3.0-dtest-results.png] 
([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.0_dtest/lastCompletedBuild/testReport/])
| 
[3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...dimitarndimitrov:c13692-3.11]
 | [testall|^c13692-3.11-testall-results.png] 
([testall-baseline|https://cassci.datastax.com/job/cassandra-3.11_testall/lastCompletedBuild/testReport/])
 |
 [dtest|^c13692-3.11-dtest-results.png] 
([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.11_dtest/lastCompletedBuild/testReport/])
| 
[trunk|https://github.com/apache/cassandra/compare/trunk...dimitarndimitrov:c13692]
 | [testall|^c13692-testall-results.png] |
 [dtest|^c13692-dtest-results.png] 
([dtest-baseline|https://cassci.datastax.com/job/trunk_dtest/lastCompletedBuild/testReport/])
 |

{{testall}} looks good for all branches, but the 
consistency_test.TestConsistency.test_13747 {{dtest}} fails repeatedly across 
branches, in addition to the usual {{dtest}} failures that are expected to be 
unrelated.
My assumption is that this is related to CASSANDRA-13747 (the comments there 
seem to corroborate that). [~iamaleksey], do you have an idea if that could be 
the case?


was (Author: dimitarndimitrov):
Okay, here are the branches with the proposed changes:

| 
[2.2|https://github.com/apache/cassandra/compare/cassandra-2.2...dimitarndimitrov:c13692-2.2]
 | [testall|^c13692-2.2-testall-results.png] | 
[dtest|^c13692-2.2-dtest-results.png] 
([dtest-baseline|https://cassci.datastax.com/job/cassandra-2.2_dtest/lastCompletedBuild/testReport/])
 |
| 
[3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...dimitarndimitrov:c13692-3.0]
 | [testall|^c13692-3.0-testall-results.png] | 
[dtest|^c13692-3.0-dtest-results.png] 
([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.0_dtest/lastCompletedBuild/testReport/])
| 
[3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...dimitarndimitrov:c13692-3.11]
 | [testall|^c13692-3.11-testall-results.png] 
([testall-baseline|https://cassci.datastax.com/job/cassandra-3.11_testall/lastCompletedBuild/testReport/])
 |
 [dtest|^c13692-3.11-dtest-results.png] 
([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.11_dtest/lastCompletedBuild/testReport/])
 |
| 
[trunk|https://github.com/apache/cassandra/compare/trunk...dimitarndimitrov:c13692]
 | [testall|^c13692-testall-results.png] |
 [dtest|^c13692-dtest-results.png] 
([dtest-baseline|https://cassci.datastax.com/job/trunk_dtest/lastCompletedBuild/testReport/])
 |

{{testall}} looks good for all branches, but there's a common theme of 
consistency_test.TestConsistency.test_13747 {{dtest}}s failing, in addition to 
the common-expected-to-be-unrelated {{dtest}} failures.
My assumption is that this is related to CASSANDRA-13747 (the comments there 
seem to corroborate that). [~iamaleksey] , do you have an idea if that could be 
the case?


[jira] [Commented] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions

2017-09-01 Thread Dimitar Dimitrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16150212#comment-16150212
 ] 

Dimitar Dimitrov commented on CASSANDRA-13692:
--

Okay, here are the branches with the proposed changes:

| [2.2|https://github.com/apache/cassandra/compare/cassandra-2.2...dimitarndimitrov:c13692-2.2] | [testall|^c13692-2.2-testall-results.png] | [dtest|^c13692-2.2-dtest-results.png] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-2.2_dtest/lastCompletedBuild/testReport/]) |
| [3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...dimitarndimitrov:c13692-3.0] | [testall|^c13692-3.0-testall-results.png] | [dtest|^c13692-3.0-dtest-results.png] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.0_dtest/lastCompletedBuild/testReport/]) |
| [3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...dimitarndimitrov:c13692-3.11] | [testall|^c13692-3.11-testall-results.png] ([testall-baseline|https://cassci.datastax.com/job/cassandra-3.11_testall/lastCompletedBuild/testReport/]) | [dtest|^c13692-3.11-dtest-results.png] ([dtest-baseline|https://cassci.datastax.com/job/cassandra-3.11_dtest/lastCompletedBuild/testReport/]) |
| [trunk|https://github.com/apache/cassandra/compare/trunk...dimitarndimitrov:c13692] | [testall|^c13692-testall-results.png] | [dtest|^c13692-dtest-results.png] ([dtest-baseline|https://cassci.datastax.com/job/trunk_dtest/lastCompletedBuild/testReport/]) |

{{testall}} looks good for all branches, but the 
consistency_test.TestConsistency.test_13747 {{dtest}} fails repeatedly across 
branches, in addition to the usual {{dtest}} failures that are expected to be 
unrelated.
My assumption is that this is related to CASSANDRA-13747 (the comments there 
seem to corroborate that). [~iamaleksey], do you have an idea if that could be 
the case?


[jira] [Comment Edited] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions

2017-08-30 Thread Dimitar Dimitrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16146000#comment-16146000
 ] 

Dimitar Dimitrov edited comment on CASSANDRA-13692 at 8/30/17 8:37 AM:
---

Sorry for the delay with the promised update.
I'm currently finalizing the test, which is a bit of a mix between 
{{OutOfSpaceTest}}, {{CompactionAwareWriterTest}}, and 
{{CompactionsBytemanTest}}.

I reckon that (1) the {{OutOfSpaceTest}} approach translates well enough for 
checking whether failure policies are correctly triggered in this case; (2) the 
trickiest part (for me) would be to figure out whether the way 
{{CompactionAwareWriter#getWriteDirectory(Iterable<SSTableReader>, long)}} gets 
reached and executed is correct and relevant.


was (Author: dimitarndimitrov):
Sorry for the late promised update.
I'm currently finalizing the test, which is a bit of a mix between 
{{OutOfSpaceTest}}, {{CompactionAwareWriterTest}}, and 
{{CompactionsBytemanTest}}.

I reckon that (1) the {{OutOfSpaceTest}}s approach translates well enough for 
checking whether failure policies are correctly triggered in this case; (2) the 
trickiest part (for me) would be to figure out whether the way 
{{CompactionAwareWriter#getWriteDirectory(Iterable, long)}} gets 
reached and executed, is correct and relevant.

> CompactionAwareWriter_getWriteDirectory throws incompatible exceptions
> --
>
>                 Key: CASSANDRA-13692
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13692
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Compaction
>            Reporter: Hao Zhong
>            Assignee: Dimitar Dimitrov
>              Labels: lhf
>
> The CompactionAwareWriter_getWriteDirectory throws RuntimeException:
> {code}
> public Directories.DataDirectory getWriteDirectory(Iterable<SSTableReader> sstables, long estimatedWriteSize)
> {
>     File directory = null;
>     for (SSTableReader sstable : sstables)
>     {
>         if (directory == null)
>             directory = sstable.descriptor.directory;
>         if (!directory.equals(sstable.descriptor.directory))
>         {
>             logger.trace("All sstables not from the same disk - putting results in {}", directory);
>             break;
>         }
>     }
>     Directories.DataDirectory d = getDirectories().getDataDirectoryForFile(directory);
>     if (d != null)
>     {
>         long availableSpace = d.getAvailableSpace();
>         if (availableSpace < estimatedWriteSize)
>             throw new RuntimeException(String.format("Not enough space to write %s to %s (%s available)",
>                                                      FBUtilities.prettyPrintMemory(estimatedWriteSize),
>                                                      d.location,
>                                                      FBUtilities.prettyPrintMemory(availableSpace)));
>         logger.trace("putting compaction results in {}", directory);
>         return d;
>     }
>     d = getDirectories().getWriteableLocation(estimatedWriteSize);
>     if (d == null)
>         throw new RuntimeException(String.format("Not enough disk space to store %s",
>                                                  FBUtilities.prettyPrintMemory(estimatedWriteSize)));
>     return d;
> }
> {code}
> However, the thrown exception does not trigger the failure policy. 
> CASSANDRA-11448 fixed a similar problem. The buggy code is:
> {code}
> protected Directories.DataDirectory getWriteDirectory(long writeSize)
> {
>     Directories.DataDirectory directory = getDirectories().getWriteableLocation(writeSize);
>     if (directory == null)
>         throw new RuntimeException("Insufficient disk space to write " + writeSize + " bytes");
>     return directory;
> }
> {code}
> The fixed code is:
> {code}
> protected Directories.DataDirectory getWriteDirectory(long writeSize)
> {
>     Directories.DataDirectory directory = getDirectories().getWriteableLocation(writeSize);
>     if (directory == null)
>         throw new FSWriteError(new IOException("Insufficient disk space to write " + writeSize + " bytes"), "");
>     return directory;
> }
> {code}
> The fixed code throws {{FSWriteError}} and triggers the failure policy.
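
To illustrate the description above, here is a minimal, self-contained Java sketch of why the exception type matters. It assumes, as the description implies, that the failure policy is only reached for file-system error types. {{SketchFsWriteError}}, {{handle}} and {{policyInvocations}} are hypothetical stand-ins invented for this example; they are not Cassandra classes and do not reproduce Cassandra's actual handling code.

```java
import java.io.IOError;
import java.io.IOException;

final class FailurePolicySketch {
    /** Hypothetical stand-in for FSWriteError; NOT Cassandra's class. */
    static final class SketchFsWriteError extends IOError {
        final String path;
        SketchFsWriteError(Throwable cause, String path) {
            super(cause);
            this.path = path;
        }
    }

    /** Counts how often the (hypothetical) disk failure policy was applied. */
    static int policyInvocations = 0;

    /** Hypothetical top-level handler: only file-system errors reach the policy. */
    static void handle(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof SketchFsWriteError) {
                policyInvocations++;
                return;
            }
        }
        // Any other throwable would merely be logged; the policy is not consulted.
    }

    /** Shape of the code before the fix: a generic RuntimeException. */
    static void before(long writeSize) {
        throw new RuntimeException("Insufficient disk space to write " + writeSize + " bytes");
    }

    /** Shape of the code after the fix: a file-system specific error. */
    static void after(long writeSize) {
        throw new SketchFsWriteError(
            new IOException("Insufficient disk space to write " + writeSize + " bytes"), "");
    }

    /** Runs one variant and reports how many times the policy was applied. */
    static int run(boolean fixed) {
        policyInvocations = 0;
        try {
            if (fixed) after(1024); else before(1024);
        } catch (Throwable t) {
            handle(t);
        }
        return policyInvocations;
    }
}
```

Under these assumptions, {{FailurePolicySketch.run(false)}} leaves the policy untouched, while {{FailurePolicySketch.run(true)}} applies it once.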



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions

2017-08-29 Thread Dimitar Dimitrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16146000#comment-16146000
 ] 

Dimitar Dimitrov commented on CASSANDRA-13692:
--

Sorry for the delay with the promised update.
I'm currently finalizing the test, which is a bit of a mix between 
{{OutOfSpaceTest}}, {{CompactionAwareWriterTest}}, and 
{{CompactionsBytemanTest}}.

I reckon that (1) the {{OutOfSpaceTest}} approach translates well enough for 
checking whether failure policies are correctly triggered in this case; (2) the 
trickiest part (for me) would be to figure out whether the way 
{{CompactionAwareWriter#getWriteDirectory(Iterable<SSTableReader>, long)}} gets 
reached and executed is correct and relevant.


[jira] [Comment Edited] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions

2017-08-28 Thread Dimitar Dimitrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16143538#comment-16143538
 ] 

Dimitar Dimitrov edited comment on CASSANDRA-13692 at 8/28/17 9:12 AM:
---

It looks like a good approach here would be to write a test reproducing the 
problem (with one test for each branch throwing a {{RuntimeException}}), then 
apply what looks like the obvious solution (replacing the {{RuntimeException}}s 
with {{FSWriteError}}s) and see if the failure policy is triggered.
{{org.apache.cassandra.cql3.OutOfSpaceTest}} may be a good starting point, but 
I'll also need to understand better exactly under what conditions 
{{CompactionAwareWriter#getWriteDirectory(Iterable<SSTableReader>, long)}} is 
triggered.

I'll post another update later today.
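
The "one test for each branch" idea above can be sketched roughly as follows. This is a simplified model, not the actual Cassandra test: {{pick}}, its parameters and {{SketchWriteError}} are hypothetical names made up for illustration, and the free-space figures are passed in directly instead of being read from real data directories.

```java
import java.io.IOError;
import java.io.IOException;

final class WriteDirectoryBranchesSketch {
    /** Hypothetical stand-in for FSWriteError; NOT Cassandra's class. */
    static final class SketchWriteError extends IOError {
        SketchWriteError(Throwable cause) {
            super(cause);
        }
    }

    /**
     * Simplified model of the two throwing branches in the description.
     * sameDiskFree:  free bytes on the disk holding the input sstables,
     *                or -1 if that disk could not be resolved.
     * otherDiskFree: free bytes on the best alternative location.
     */
    static String pick(long sameDiskFree, long otherDiskFree, long estimatedWriteSize) {
        if (sameDiskFree >= 0) {
            // Branch 1: the sstables' own disk is known but too full.
            if (sameDiskFree < estimatedWriteSize)
                throw new SketchWriteError(new IOException("Not enough space on the sstables' disk"));
            return "same";
        }
        // Branch 2: fall back to any writeable location; none is large enough.
        if (otherDiskFree < estimatedWriteSize)
            throw new SketchWriteError(new IOException("Not enough disk space to store the result"));
        return "other";
    }

    /** Test helper: true if the call fails with the file-system error type. */
    static boolean throwsWriteError(long sameDiskFree, long otherDiskFree, long estimatedWriteSize) {
        try {
            pick(sameDiskFree, otherDiskFree, estimatedWriteSize);
            return false;
        } catch (SketchWriteError e) {
            return true;
        }
    }
}
```

A test per branch would then check {{throwsWriteError(10, 100, 50)}} for the first branch and {{throwsWriteError(-1, 10, 50)}} for the second, alongside a passing case such as {{pick(100, 0, 50)}} returning "same".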


was (Author: dimitarndimitrov):
It looks like a good approach here would be to write a test reproducing the 
problem (with one test for each branch throwing a {{RuntimeException}}), then 
apply what looks like the obvious solution (replacing the {{RuntimeExceptions 
}}with {{FSWriteErrors}}) and see if the  failure policy is triggered.
{{org.apache.cassandra.cql3.OutOfSpaceTest}} may be a good starting point, but 
I'll also need to understand better under what conditions exactly is  
{{CompactionAwareWriter#getWriteDirectory(Iterable, long)}} 
triggered.

I'll post another update later today.


[jira] [Commented] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions

2017-08-28 Thread Dimitar Dimitrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16143538#comment-16143538
 ] 

Dimitar Dimitrov commented on CASSANDRA-13692:
--

It looks like a good approach here would be to write a test reproducing the 
problem (with one test for each branch throwing a {{RuntimeException}}), then 
apply what looks like the obvious solution (replacing the {{RuntimeException}}s 
with {{FSWriteError}}s) and see if the failure policy is triggered.
{{org.apache.cassandra.cql3.OutOfSpaceTest}} may be a good starting point, but 
I'll also need to understand better exactly under what conditions 
{{CompactionAwareWriter#getWriteDirectory(Iterable<SSTableReader>, long)}} is 
triggered.

I'll post another update later today.
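
The approach described in the comment above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual patch: {{FSWriteError}} and {{DataDirectory}} here are simplified stand-ins for the Cassandra classes, the class name {{WriteDirectorySketch}} and the helper {{throwsFSWriteError}} are hypothetical, and the lookup of the sstables' current directory and of a fallback writeable location is reduced to two nullable parameters. It shows both failing branches throwing an {{FSWriteError}} wrapping an {{IOException}}, mirroring the CASSANDRA-11448 fix quoted in the issue, plus one check per branch.

```java
import java.io.IOError;
import java.io.IOException;

public class WriteDirectorySketch
{
    // Simplified stand-in for Cassandra's FSWriteError (assumption: the real
    // class carries more state; only the (cause, path) shape is taken from the issue).
    static class FSWriteError extends IOError
    {
        final String path;

        FSWriteError(Throwable cause, String path)
        {
            super(cause);
            this.path = path;
        }
    }

    // Simplified stand-in for Directories.DataDirectory.
    static class DataDirectory
    {
        final String location;
        final long availableSpace;

        DataDirectory(String location, long availableSpace)
        {
            this.location = location;
            this.availableSpace = availableSpace;
        }
    }

    // 'current' models the directory of the sstables being compacted (may be null);
    // 'fallback' models the result of a writeable-location lookup (may be null).
    static DataDirectory getWriteDirectory(DataDirectory current, DataDirectory fallback, long estimatedWriteSize)
    {
        if (current != null)
        {
            // Branch 1: the sstables' own directory is too small.
            if (current.availableSpace < estimatedWriteSize)
                throw new FSWriteError(new IOException(String.format("Not enough space to write %d bytes to %s (%d available)",
                                                                     estimatedWriteSize,
                                                                     current.location,
                                                                     current.availableSpace)),
                                       current.location);
            return current;
        }
        // Branch 2: no writeable location at all.
        if (fallback == null)
            throw new FSWriteError(new IOException(String.format("Not enough disk space to store %d bytes",
                                                                 estimatedWriteSize)),
                                   "");
        return fallback;
    }

    // Test helper: reports whether the call fails with FSWriteError.
    static boolean throwsFSWriteError(DataDirectory current, DataDirectory fallback, long estimatedWriteSize)
    {
        try
        {
            getWriteDirectory(current, fallback, estimatedWriteSize);
            return false;
        }
        catch (FSWriteError e)
        {
            return true;
        }
    }

    public static void main(String[] args)
    {
        DataDirectory small = new DataDirectory("/data1", 10L);
        System.out.println(throwsFSWriteError(small, null, 100L)); // branch 1: true
        System.out.println(throwsFSWriteError(null, null, 100L));  // branch 2: true
        System.out.println(throwsFSWriteError(small, null, 5L));   // enough space: false
    }
}
```

Whether throwing {{FSWriteError}} from these branches actually triggers the configured failure policy in Cassandra still has to be confirmed by a test against the real code path; the sketch only demonstrates the exception type changing in each branch.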

> CompactionAwareWriter_getWriteDirectory throws incompatible exceptions
> --
>
> Key: CASSANDRA-13692
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13692
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Hao Zhong
>Assignee: Dimitar Dimitrov
>  Labels: lhf
>
> The CompactionAwareWriter_getWriteDirectory throws RuntimeException:
> {code}
> public Directories.DataDirectory getWriteDirectory(Iterable<SSTableReader> sstables, long estimatedWriteSize)
> {
>     File directory = null;
>     for (SSTableReader sstable : sstables)
>     {
>         if (directory == null)
>             directory = sstable.descriptor.directory;
>         if (!directory.equals(sstable.descriptor.directory))
>         {
>             logger.trace("All sstables not from the same disk - putting results in {}", directory);
>             break;
>         }
>     }
>     Directories.DataDirectory d = getDirectories().getDataDirectoryForFile(directory);
>     if (d != null)
>     {
>         long availableSpace = d.getAvailableSpace();
>         if (availableSpace < estimatedWriteSize)
>             throw new RuntimeException(String.format("Not enough space to write %s to %s (%s available)",
>                                                      FBUtilities.prettyPrintMemory(estimatedWriteSize),
>                                                      d.location,
>                                                      FBUtilities.prettyPrintMemory(availableSpace)));
>         logger.trace("putting compaction results in {}", directory);
>         return d;
>     }
>     d = getDirectories().getWriteableLocation(estimatedWriteSize);
>     if (d == null)
>         throw new RuntimeException(String.format("Not enough disk space to store %s",
>                                                  FBUtilities.prettyPrintMemory(estimatedWriteSize)));
>     return d;
> }
> {code}
> However, the thrown exception does not trigger the failure policy.
> CASSANDRA-11448 fixed a similar problem. The buggy code is:
> {code}
> protected Directories.DataDirectory getWriteDirectory(long writeSize)
> {
>     Directories.DataDirectory directory = getDirectories().getWriteableLocation(writeSize);
>     if (directory == null)
>         throw new RuntimeException("Insufficient disk space to write " + writeSize + " bytes");
>     return directory;
> }
> {code}
> The fixed code is:
> {code}
> protected Directories.DataDirectory getWriteDirectory(long writeSize)
> {
>     Directories.DataDirectory directory = getDirectories().getWriteableLocation(writeSize);
>     if (directory == null)
>         throw new FSWriteError(new IOException("Insufficient disk space to write " + writeSize + " bytes"), "");
>     return directory;
> }
> {code}
> The fixed code throws an FSWriteError (FSWE), which triggers the failure policy.






[jira] [Assigned] (CASSANDRA-13801) CompactionManager sometimes wrongly determines that a background compaction is running for a particular table

2017-08-25 Thread Dimitar Dimitrov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dimitar Dimitrov reassigned CASSANDRA-13801:


Assignee: Dimitar Dimitrov

> CompactionManager sometimes wrongly determines that a background compaction 
> is running for a particular table
> -
>
> Key: CASSANDRA-13801
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13801
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Dimitar Dimitrov
>Assignee: Dimitar Dimitrov
>
> Sometimes, after writing different rows to a table, doing a blocking flush, 
> altering the compaction strategy, and then running background compaction and 
> waiting for it to finish, {{CompactionManager}} may decide that there is an 
> ongoing compaction for that same table.
> This may happen even though the logs don't indicate that to be the case 
> (compaction may still be running for system_schema tables).






[jira] [Created] (CASSANDRA-13801) CompactionManager sometimes wrongly determines that a background compaction is running for a particular table

2017-08-25 Thread Dimitar Dimitrov (JIRA)
Dimitar Dimitrov created CASSANDRA-13801:


 Summary: CompactionManager sometimes wrongly determines that a 
background compaction is running for a particular table
 Key: CASSANDRA-13801
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13801
 Project: Cassandra
  Issue Type: Bug
  Components: Compaction
Reporter: Dimitar Dimitrov


Sometimes, after writing different rows to a table, doing a blocking flush, 
altering the compaction strategy, and then running background compaction and 
waiting for it to finish, {{CompactionManager}} may decide that there is an 
ongoing compaction for that same table.
This may happen even though the logs don't indicate that to be the case 
(compaction may still be running for system_schema tables).


