[jira] [Commented] (CASSANDRA-10971) Compressed commit log has no backpressure and can OOM

2016-03-19 Thread Benjamin Lerer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15197690#comment-15197690
 ] 

Benjamin Lerer commented on CASSANDRA-10971:


I ran the tests on CI for 3.5 and they were flapping.
I had a look at the tests and found 2 problems:
* Some tests were using {{new CommitLog()}} with the same directory as 
{{CommitLog.INSTANCE}}, which caused 2 commit log instances to run at the 
same time with 2 different configurations. As a result, some commit log 
files were not deleted for the {{replay_StandardMmapped}} test.
* As the unit tests are run in random order, the configuration changes made by 
the compression and encryption tests were affecting other tests when 
{{resetUnsafe}} was used.

I fixed the problems by using only {{CommitLog.INSTANCE}} in all the tests and 
restoring the initial configuration parameters after each test that 
modifies them.

Ran the tests on CI and it looks like we are good to go. \o/  
Thanks for all the work [~aweisberg]
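
The second fix in the comment above (restoring the configuration after each test that changes it) could look roughly like the sketch below. It is only an illustration: {{GlobalConfig}}, {{ConfigRestoringTest}} and {{withCompressor}} are hypothetical names, not Cassandra's actual configuration or test API.

```java
import java.util.function.Supplier;

/** Hypothetical test helper: change a global setting for one test, then put it back. */
final class ConfigRestoringTest {
    /** Runs the body with a temporary compressor setting and always restores the previous value. */
    static <T> T withCompressor(String compressor, Supplier<T> body) {
        String previous = GlobalConfig.getCommitLogCompressor();
        GlobalConfig.setCommitLogCompressor(compressor);
        try {
            return body.get();
        } finally {
            // Restore even if the test body throws, so later tests see the initial state.
            GlobalConfig.setCommitLogCompressor(previous);
        }
    }

    public static void main(String[] args) {
        String seen = withCompressor("LZ4Compressor", GlobalConfig::getCommitLogCompressor);
        System.out.println("inside test: " + seen);
        System.out.println("after test: " + GlobalConfig.getCommitLogCompressor());
    }
}

/** Hypothetical stand-in for a process-wide configuration holder. */
final class GlobalConfig {
    private static String commitLogCompressor = null; // null means "no compression"

    static String getCommitLogCompressor() { return commitLogCompressor; }
    static void setCommitLogCompressor(String value) { commitLogCompressor = value; }
}
```

Because the restore happens in a {{finally}} block, the order in which tests run no longer matters for this setting.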

> Compressed commit log has no backpressure and can OOM
> -
>
> Key: CASSANDRA-10971
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10971
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local Write-Read Paths
>Reporter: Ariel Weisberg
>Assignee: Ariel Weisberg
> Fix For: 3.0.x, 3.x
>
>
> I validated this via a unit test that slowed the ability of the log to drain 
> to the filesystem. The compressed commit log will keep allocating buffers 
> pending compression until it OOMs.
> I have a fix that I am not very happy with, because the whole "signal a thread 
> to allocate a segment that depends on a resource that may not be available" 
> approach results in some obtuse usage of {{CompletableFuture}} to rendezvous 
> available buffers with the {{CommitLogSegmentManager}} thread waiting to finish 
> constructing a new segment. The {{CLSM}} thread is in turn signaled by the 
> thread(s) that actually want to write to the next segment but aren't able 
> to do it themselves.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10971) Compressed commit log has no backpressure and can OOM

2016-03-15 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195790#comment-15195790
 ] 

Ariel Weisberg commented on CASSANDRA-10971:


I'll take a look at it. It's not passing on OS X on trunk for me at all. It 
does pass on Linux.



[jira] [Commented] (CASSANDRA-10971) Compressed commit log has no backpressure and can OOM

2016-03-15 Thread Benjamin Lerer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195040#comment-15195040
 ] 

Benjamin Lerer commented on CASSANDRA-10971:


[~aweisberg] Sorry, I missed the last ticket updates.

I am +1 on the patch. My only issue is with 
{{org.apache.cassandra.db.commitlog.CommitLogTest.replay_Encrypted}}: it always 
times out on CI and fails on my machine. I do not think that the patch is the 
reason for the problem, but I would be more confident if the test were passing. 
Does it work on your machine? 



[jira] [Commented] (CASSANDRA-10971) Compressed commit log has no backpressure and can OOM

2016-02-29 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15172141#comment-15172141
 ] 

Ariel Weisberg commented on CASSANDRA-10971:


Ooops. I guess you don't need to since the decrement is associated with a 
wakeup anyways.



[jira] [Commented] (CASSANDRA-10971) Compressed commit log has no backpressure and can OOM

2016-02-29 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15172127#comment-15172127
 ] 

Ariel Weisberg commented on CASSANDRA-10971:


That would work. You just need to add a poke to the CLSM thread when 
decrementing the counter, otherwise it won't know it can create the segment now. 
I'll get that done.
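
A minimal sketch of the counter-plus-poke idea, assuming a single manager thread. {{BufferLimiter}}, {{acquire}} and {{release}} are hypothetical names chosen for illustration, and {{LockSupport}} stands in for whatever wakeup mechanism the CLSM thread actually uses.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.LockSupport;

/** Hypothetical limiter: one manager thread acquires buffers, any thread releases them. */
final class BufferLimiter {
    private final int limit;
    private final AtomicInteger inUse = new AtomicInteger();
    private volatile Thread manager; // last thread that called acquire()

    BufferLimiter(int limit) { this.limit = limit; }

    /** Called by the manager thread only: waits until a buffer slot is free, then takes it. */
    void acquire() {
        manager = Thread.currentThread();
        while (true) {
            int current = inUse.get();
            if (current < limit) {
                if (inUse.compareAndSet(current, current + 1))
                    return;
            } else {
                // park() may return spuriously, so the loop re-checks the counter.
                LockSupport.park(this);
            }
        }
    }

    /** Called when a buffer is returned: decrement the counter, then poke the manager. */
    void release() {
        inUse.decrementAndGet();
        LockSupport.unpark(manager); // has no effect if manager is still null
    }

    int inUse() { return inUse.get(); }

    public static void main(String[] args) throws InterruptedException {
        BufferLimiter limiter = new BufferLimiter(1);
        limiter.acquire();                             // takes the only slot
        Thread writer = new Thread(limiter::release);  // returns the buffer from another thread
        writer.start();
        limiter.acquire();                             // proceeds once release() has run
        writer.join();
        System.out.println("buffers in use: " + limiter.inUse());
    }
}
```

Without the {{unpark}} call in {{release}}, a manager that parked at the limit would never learn that a slot had become free, which is the point of the poke.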



[jira] [Commented] (CASSANDRA-10971) Compressed commit log has no backpressure and can OOM

2016-02-23 Thread Benjamin Lerer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158576#comment-15158576
 ] 

Benjamin Lerer commented on CASSANDRA-10971:


Could we not just keep your latest design but track the number of buffers in 
use and use it as a limit? Something like 
[this|https://github.com/apache/cassandra/compare/trunk...blerer:10971-trunk].


{quote}As a nit, I think it might be safer to use an instanceof 
FileDirectSegment for enforceSegmentLimit in trunk, in case somebody decides to 
add a new sub-class to FileDirectSegment.{quote}

Forget about that, I did not read the code properly.



[jira] [Commented] (CASSANDRA-10971) Compressed commit log has no backpressure and can OOM

2016-02-22 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157763#comment-15157763
 ] 

Ariel Weisberg commented on CASSANDRA-10971:


I think that brings us back to where we started with the original design that 
asynchronously supplies the buffer when it becomes available. I think I can do 
it without all the {{Future}}s nonsense by poking the CLSM thread when the 
buffer becomes available.

I need to update that version anyways because of the changes that occurred 
since this ticket was started.



[jira] [Commented] (CASSANDRA-10971) Compressed commit log has no backpressure and can OOM

2016-02-22 Thread Benjamin Lerer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157651#comment-15157651
 ] 

Benjamin Lerer commented on CASSANDRA-10971:


If I am not mistaken, a {{CompressedSegment}} or {{FileDirectSegment}} will 
release its buffer once it has been fully written to the disk whereas segments 
will stay active until they are recycled. Consequently, it might be better to 
use the number of not yet fully written segments as the limit, rather than the 
number of active ones. 
It seems that it could be done by counting the number of available segments 
which have a non-null buffer. 
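
As a rough illustration of counting only the segments that still hold a buffer, here is a small sketch. {{UnwrittenSegmentCounter}}, its nested {{Segment}} class and {{mayAllocate}} are hypothetical and only model the idea; they are not the real {{CompressedSegment}} or {{FileDirectSegment}} classes.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.List;

final class UnwrittenSegmentCounter {
    /** Hypothetical segment: holds a buffer until its contents are fully on disk. */
    static final class Segment {
        private volatile ByteBuffer buffer = ByteBuffer.allocate(16);

        void markFullyWritten() { buffer = null; } // the buffer is released here
        boolean holdsBuffer() { return buffer != null; }
    }

    /** Counts segments that still pin a buffer, whether or not they are still "active". */
    static long countHoldingBuffers(List<Segment> segments) {
        return segments.stream().filter(Segment::holdsBuffer).count();
    }

    /** A new segment may be created only while fewer than maxBuffers buffers are pinned. */
    static boolean mayAllocate(List<Segment> segments, int maxBuffers) {
        return countHoldingBuffers(segments) < maxBuffers;
    }

    public static void main(String[] args) {
        List<Segment> active = Arrays.asList(new Segment(), new Segment(), new Segment());
        active.get(0).markFullyWritten(); // still active, but no longer holds a buffer
        System.out.println(countHoldingBuffers(active));
        System.out.println(mayAllocate(active, 3));
    }
}
```

With three active segments of which one is fully written, a limit of three buffers still permits another allocation, whereas a limit on active segments would not.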

As a nit, I think it might be safer to use an {{instanceof FileDirectSegment}} 
for {{enforceSegmentLimit}} in trunk, in case somebody decides to add a new 
sub-class to {{FileDirectSegment}}.



[jira] [Commented] (CASSANDRA-10971) Compressed commit log has no backpressure and can OOM

2016-02-18 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15153110#comment-15153110
 ] 

Ariel Weisberg commented on CASSANDRA-10971:


Pushed an updated and much more succinct version. Tests are running now.

> Compressed commit log has no backpressure and can OOM
> -
>
> Key: CASSANDRA-10971
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10971
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local Write-Read Paths
>Reporter: Ariel Weisberg
>Assignee: Ariel Weisberg
> Fix For: 3.x


[jira] [Commented] (CASSANDRA-10971) Compressed commit log has no backpressure and can OOM

2016-02-18 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15152981#comment-15152981
 ] 

Ariel Weisberg commented on CASSANDRA-10971:


The memory mapped implementation doesn't need/want to bound the number of 
buffers in flight. Backpressure comes from the operating system which will 
block writer threads when there isn't enough free memory to buffer writes.

You are right that this would be simpler if the {{CLSM}} maintained the bound. 
It's already being woken up every time a segment is discarded. I'll rewrite it 
that way. I'll only have it bound if there is compression.



[jira] [Commented] (CASSANDRA-10971) Compressed commit log has no backpressure and can OOM

2016-02-17 Thread Benjamin Lerer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15150187#comment-15150187
 ] 

Benjamin Lerer commented on CASSANDRA-10971:


My understanding of the problem is that in the case where the commit log cannot 
flush to the disk fast enough, due to the compression overhead, the 
{{CommitLogSegmentManager}} will keep on creating new {{CompressedSegment}}s. 
As each of those segments will use a new buffer (the ones of the pool being all 
in use), Cassandra can run out of memory. 

Would it not be simpler to add backpressure by limiting the number of active 
segments?

What I mean is, if the {{CommitLogSegmentManager}} stops allocating new 
segments once a certain number of active segments has been reached, it will 
make the {{CommitLog.add}} method block until some segments have been 
reclaimed. 

It seems to me that, even in the case of {{MemoryMappedSegment}}, we should be 
able to apply back pressure, if the disk cannot handle the load. Am I wrong on 
that? 

As I am not a CommitLog expert I might have missed something.
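
The proposal above can be sketched with a counting semaphore. This is an assumption-laden illustration, not the patch: {{SegmentBudget}}, {{tryAllocate}} and {{reclaim}} are hypothetical names, and a timeout is used only so that the demo terminates instead of blocking forever the way a real {{CommitLog.add}} caller would.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Hypothetical budget of active segments; writers are pushed back once it is exhausted. */
final class SegmentBudget {
    private final Semaphore permits;

    SegmentBudget(int maxActiveSegments) {
        this.permits = new Semaphore(maxActiveSegments);
    }

    /** Reserves a slot for a new segment, waiting up to timeoutMillis; false means "still full". */
    boolean tryAllocate(long timeoutMillis) {
        try {
            return permits.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }

    /** Called when a segment has been reclaimed, which frees one slot for waiting writers. */
    void reclaim() {
        permits.release();
    }

    public static void main(String[] args) {
        SegmentBudget budget = new SegmentBudget(2);
        System.out.println(budget.tryAllocate(10)); // true
        System.out.println(budget.tryAllocate(10)); // true
        System.out.println(budget.tryAllocate(10)); // false: limit of 2 reached
        budget.reclaim();
        System.out.println(budget.tryAllocate(10)); // true again after a reclaim
    }
}
```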



[jira] [Commented] (CASSANDRA-10971) Compressed commit log has no backpressure and can OOM

2016-01-06 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086097#comment-15086097
 ] 

Ariel Weisberg commented on CASSANDRA-10971:


|[trunk code|https://github.com/apache/cassandra/compare/trunk...aweisberg:CASSANDRA-10971-trunk?expand=1]|[utest|http://cassci.datastax.com/view/Dev/view/aweisberg/job/aweisberg-CASSANDRA-10971-trunk-testall/]|[dtest|http://cassci.datastax.com/view/Dev/view/aweisberg/job/aweisberg-CASSANDRA-10971-trunk-dtest/]|
|[3.0 code|https://github.com/apache/cassandra/compare/cassandra-3.0...aweisberg:CASSANDRA-10971-3.0?expand=1]|[utest|http://cassci.datastax.com/view/Dev/view/aweisberg/job/aweisberg-CASSANDRA-10971-3.0-testall/]|[dtest|http://cassci.datastax.com/view/Dev/view/aweisberg/job/aweisberg-CASSANDRA-10971-3.0-dtest/]|
|[2.2 code|https://github.com/apache/cassandra/compare/cassandra-2.2...aweisberg:CASSANDRA-10971-2.2?expand=1]|[utest|http://cassci.datastax.com/view/Dev/view/aweisberg/job/aweisberg-CASSANDRA-10971-2.2-testall/]|[dtest|http://cassci.datastax.com/view/Dev/view/aweisberg/job/aweisberg-CASSANDRA-10971-2.2-dtest/]|

> Compressed commit log has no backpressure and can OOM
> -
>
> Key: CASSANDRA-10971
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10971
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local Write-Read Paths
>Reporter: Ariel Weisberg
>Assignee: Ariel Weisberg
> Fix For: 2.2.x, 3.0.x, 3.x