[jira] [Commented] (CASSANDRA-13299) Potential OOMs and lock contention in write path streams

2017-09-28 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16184270#comment-16184270
 ] 

ZhaoYang commented on CASSANDRA-13299:
--

Thanks for reviewing.

> Potential OOMs and lock contention in write path streams
>
> Key: CASSANDRA-13299
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13299
> Project: Cassandra
> Issue Type: Improvement
> Components: Materialized Views
> Reporter: Benjamin Roth
> Assignee: ZhaoYang
> Fix For: 4.0
>
> I see a potential OOM when a stream (e.g. repair) goes through the write 
> path, as it does with MVs.
> StreamReceiveTask gets a bunch of SSTableReaders. These produce row iterators, 
> which in turn produce mutations. So every partition creates a single 
> mutation, which in the case of (very) big partitions can result in (very) big 
> mutations. Those are created on heap and stay there until they have finished 
> processing.
> I don't think it is necessary to create a single mutation for each partition. 
> Why don't we implement a PartitionUpdateGeneratorIterator that takes an 
> UnfilteredRowIterator and a max size and spits out PartitionUpdates to be 
> used to create and apply mutations?
> The max size should be something like min(reasonable_absolute_max_size, 
> max_mutation_size, commitlog_segment_size / 2). reasonable_absolute_max_size 
> could be something like 16 MB.
> A mutation shouldn't be too large, as it also affects MV partition locking. 
> The longer an MV partition is locked during a stream, the higher the chances 
> that WTEs (write timeout exceptions) occur during streams.
> I could also imagine that a max number of updates per mutation, regardless of 
> size in bytes, could make sense to avoid lock contention.
> I'd love to get feedback and suggestions, incl. naming suggestions.
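
A minimal sketch of the size cap proposed in the description, assuming all three inputs are available in bytes. The class and method names are hypothetical, and the 16 MiB ceiling is only the figure floated above, not a value taken from the patch.

```java
// Hypothetical helper, not part of Cassandra: computes the proposed
// min(reasonable_absolute_max_size, max_mutation_size, commitlog_segment_size / 2).
final class MutationSizeLimit {
    // Assumed ceiling, taken from the "16M" suggestion in the description.
    static final long REASONABLE_ABSOLUTE_MAX_BYTES = 16L * 1024 * 1024;

    static long maxBatchBytes(long maxMutationSizeBytes, long commitlogSegmentSizeBytes) {
        long halfSegment = commitlogSegmentSizeBytes / 2;
        // The smallest of the three limits wins.
        return Math.min(REASONABLE_ABSOLUTE_MAX_BYTES,
                        Math.min(maxMutationSizeBytes, halfSegment));
    }
}
```

For example, with a 16 MiB max_mutation_size and an 8 MiB commitlog segment, the cap would be the half segment, 4 MiB.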



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13299) Potential OOMs and lock contention in write path streams

2017-09-28 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16183862#comment-16183862
 ] 

ZhaoYang commented on CASSANDRA-13299:
--

[Utest|http://jenkins-cassandra.datastax.lan/view/Dev/view/jasonstack/job/jasonstack-CASSANDRA-13299-trunk-testall/lastCompletedBuild/testReport/]:
 3 failed, but they pass locally.
[Dtest|http://jenkins-cassandra.datastax.lan/view/Dev/view/jasonstack/job/jasonstack-CASSANDRA-13299-trunk-dtest/lastCompletedBuild/testReport/]:
 the failures either pass locally or also fail on trunk.





[jira] [Commented] (CASSANDRA-13299) Potential OOMs and lock contention in write path streams

2017-09-27 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16183689#comment-16183689
 ] 

ZhaoYang commented on CASSANDRA-13299:
--

Thanks for the fix.

bq. Could you also modify complexThrottleWithTombstoneTest to test range 
deletions?

Added.

bq. I think that instead of throwing an AssertionError when the returned 
iterator is not exhausted, we could simply exhaust it
+1

bq. Right now we're verifying the results with all the nodes UP, but it's 
possible that another node responds the query even though one of the 
inconsistent nodes did not stream correctly. I think we should check the 
results on each node individually (with the others down) to ensure they 
streamed data correctly from other nodes.
bq. Add range deletions since that's when the range tombstones special cases 
will be properly exercised.
Added.




[jira] [Commented] (CASSANDRA-13299) Potential OOMs and lock contention in write path streams

2017-09-26 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180923#comment-16180923
 ] 

Paulo Motta commented on CASSANDRA-13299:
-

Thanks for the updates, this is looking good! I managed to reproduce an OOM 
when repairing a wide partition with 100K rows and verified that this patch 
avoids the OOM by splitting the partition into multiple batches (found 
CASSANDRA-13899 on the way). Awesome job!

While splitting in the happy case is working nicely, we need to ensure the 
range tombstone handling (especially range deletions) is working correctly and 
is well tested before committing this.

I noticed that the previous {{ThrottledUnfilteredIterator}} implementation 
could [potentially 
return|https://github.com/jasonstack/cassandra/blob/b8cb49035d3cf77198b31df7c51a174fffe3edaf/src/java/org/apache/cassandra/db/rows/ThrottledUnfilteredIterator.java#L166]
 {{throttle+2}} unfiltereds, unlike the documentation, which states 
that the maximum number of unfiltereds per batch is {{throttle+1}}. I also 
noticed that the case where a row sits between two markers was not covered 
by the existing tests, since we need a range deletion to reproduce this 
scenario. I fixed this and added more tests in [this 
commit|https://github.com/pauloricardomg/cassandra/commit/47d8ca3592cb6382bb4c308720646395306a0a69].
 Could you also modify 
[complexThrottleWithTombstoneTest|https://github.com/jasonstack/cassandra/commit/b8cb49035d3cf77198b31df7c51a174fffe3edaf#diff-5162644c24391628b339b88c3619427cR66]
 to test range deletions?

The previous change 
[requires|https://github.com/pauloricardomg/cassandra/commit/47d8ca3592cb6382bb4c308720646395306a0a69#diff-2acee8fea5cd82a51fda4af6e38faf13R60]
 the minimum throttle size to be 2, otherwise it would not be possible to make 
progress on the iterator in the presence of open and close markers.
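
To make the {{throttle+1}} bound and the minimum throttle of 2 concrete, here is a simplified model, not the code from the patch. It assumes unfiltereds can be reduced to three kinds (row, open marker, close marker), ignores boundary markers, and uses hypothetical names throughout. A batch that would end inside an open range gets a synthetic close marker, and the next batch starts by re-opening the range.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Simplified, hypothetical model of marker-aware throttling.
final class MarkerAwareThrottle {
    enum Kind { ROW, OPEN, CLOSE }

    static List<List<Kind>> split(List<Kind> unfiltereds, int throttle) {
        // With throttle == 1 a re-opened marker would fill the whole batch,
        // so nothing would ever be consumed from the source.
        if (throttle < 2)
            throw new IllegalArgumentException("throttle must be at least 2 to make progress");

        List<List<Kind>> batches = new ArrayList<>();
        Iterator<Kind> source = unfiltereds.iterator();
        boolean rangeOpen = false;
        while (source.hasNext()) {
            List<Kind> batch = new ArrayList<>();
            if (rangeOpen)
                batch.add(Kind.OPEN); // re-open the range closed at the end of the previous batch
            while (source.hasNext() && batch.size() < throttle) {
                Kind next = source.next();
                if (next == Kind.OPEN) rangeOpen = true;
                else if (next == Kind.CLOSE) rangeOpen = false;
                batch.add(next);
            }
            if (rangeOpen)
                batch.add(Kind.CLOSE); // never end a batch inside an open range
            batches.add(batch);
        }
        return batches;
    }
}
```

In this model a batch holds at most {{throttle}} unfiltereds plus one synthetic close marker, which is where the {{throttle+1}} bound comes from. With a throttle of 2, the input ROW, OPEN, ROW, ROW, CLOSE, ROW comes out as five batches, none longer than three unfiltereds.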

I think that instead of throwing an {{AssertionError}} when the returned 
iterator is not exhausted, we could simply exhaust it, effectively skipping 
entries, since this might be a legitimate usage of 
{{ThrottledUnfilteredIterator}}. I did this in [this 
commit|https://github.com/pauloricardomg/cassandra/commit/04ed5ecb5183195601950fc9efd2ca9123596487].

I also added a utility method 
{{ThrottledUnfilteredIterator.throttle(UnfilteredPartitionIterator 
partitionIterator, int maxBatchSize)}} to allow throttling an 
{{UnfilteredPartitionIterator}} transparently and used that in 
{{StreamReceiveTask}} [in this 
commit|https://github.com/pauloricardomg/cassandra/commit/4f8c3b8faa2644133d301ac7bf7b748f7ec265ee].

I had another look at the {{throttled_partition_update_test}} 
[dtest|https://github.com/riptano/cassandra-dtest/commit/f3307adef349f232ec0ae64e902164684f32cca0]
 and think we can make the following improvements:
* Right now we're [verifying the 
results|https://github.com/riptano/cassandra-dtest/commit/f3307adef349f232ec0ae64e902164684f32cca0#diff-62ba429edee6a4681782f078246c9893R1410]
 with all the nodes UP, but it's possible that another node responds to the query 
even though one of the inconsistent nodes did not stream correctly. I think we 
should check the results on each node individually (with the others down) to 
ensure they streamed data correctly from other nodes.
* Add [range deletions|https://issues.apache.org/jira/browse/CASSANDRA-6237], 
since that's when the range tombstone special cases will be properly exercised.

Please let me know what you think about these suggestions.


[jira] [Commented] (CASSANDRA-13299) Potential OOMs and lock contention in write path streams

2017-09-18 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16169634#comment-16169634
 ] 

ZhaoYang commented on CASSANDRA-13299:
--

Thanks for the feedback. Rebased onto the latest trunk; dtest is unstable due to 
netty.

bq. Make ThrottledUnfilteredIterator an Iterator instead 
of using hasNextGroup and resetLimit which is analogous to hasNext and next.

Extended {{AbstractIterator}} and implemented 
{{computeNext()}}.
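
As a rough illustration of the shape this change gives the class, the sketch below wraps any iterator and hands it out in bounded batches through the plain {{Iterator}} contract. It is not the patch: {{BatchingIterator}} is a hypothetical name, it uses only {{java.util}} rather than {{AbstractIterator}}, and it materializes each batch in a list, whereas the real class yields row iterators.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in: yields the wrapped iterator in batches of at most maxBatchSize.
final class BatchingIterator<T> implements Iterator<List<T>> {
    private final Iterator<T> source;
    private final int maxBatchSize;

    BatchingIterator(Iterator<T> source, int maxBatchSize) {
        if (maxBatchSize < 1)
            throw new IllegalArgumentException("maxBatchSize must be positive");
        this.source = source;
        this.maxBatchSize = maxBatchSize;
    }

    @Override
    public boolean hasNext() {
        // There is another batch as long as the source has anything left.
        return source.hasNext();
    }

    @Override
    public List<T> next() {
        if (!source.hasNext())
            throw new NoSuchElementException();
        List<T> batch = new ArrayList<>();
        // Only one bounded batch is held in memory at a time.
        while (source.hasNext() && batch.size() < maxBatchSize)
            batch.add(source.next());
        return batch;
    }
}
```

Callers then loop with {{hasNext()}} and {{next()}} as with any other iterator, instead of driving a pair of custom methods such as {{hasNextGroup}} and {{resetLimit}}.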

{quote}
Move to org.apache.cassandra.db.rows package
Add simple javadoc explaining what it does
Move cassandra.mv.mutation.row.count out of ThrottledUnfilteredIterator, and 
maybe rename it to cassandra.repair.mutation_repair_rows_per_batch (or similar, 
since it's also used for CDC).
{quote}
+1 fixed
bq. Add unit test to ThrottledUnfilteredIterator to make sure it's generating 
range tombstones correctly
Added {{ThrottledUnfilteredIteratorTest}}




[jira] [Commented] (CASSANDRA-13299) Potential OOMs and lock contention in write path streams

2017-09-14 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16166039#comment-16166039
 ] 

Paulo Motta commented on CASSANDRA-13299:
-

Thanks, the patch looks good from an initial look, great job! Some minor 
comments:
* Generalize {{ThrottledUnfilteredIterator}}, since it can also be useful 
outside of the streaming package:
** Make {{ThrottledUnfilteredIterator}} an {{Iterator}} 
instead of using {{hasNextGroup}} and {{resetLimit}}, which are analogous to 
{{hasNext}} and {{next}}.
** Move it to the {{org.apache.cassandra.db.rows}} package.
** Add a simple javadoc explaining what it does.
** Move {{cassandra.mv.mutation.row.count}} out of 
{{ThrottledUnfilteredIterator}}, and maybe rename it to 
{{cassandra.repair.mutation_repair_rows_per_batch}} (or similar, since it's 
also used for CDC).
* Add a unit test for {{ThrottledUnfilteredIterator}} to make sure it's generating 
range tombstones correctly.




[jira] [Commented] (CASSANDRA-13299) Potential OOMs and lock contention in write path streams

2017-08-28 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16143793#comment-16143793
 ] 

ZhaoYang commented on CASSANDRA-13299:
--

[~brstgt] could you give some feedback?




[jira] [Commented] (CASSANDRA-13299) Potential OOMs and lock contention in write path streams

2017-08-22 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16137759#comment-16137759
 ] 

ZhaoYang commented on CASSANDRA-13299:
--

[~brstgt] thanks :)

Found one more issue related to RangeTombstoneMarker in MVs while writing the dtest.




[jira] [Commented] (CASSANDRA-13299) Potential OOMs and lock contention in write path streams

2017-08-21 Thread Benjamin Roth (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16134749#comment-16134749
 ] 

Benjamin Roth commented on CASSANDRA-13299:
---

Sorry for the late response, I was on vacation. No, I am not working on that 
ticket. But thanks a lot for your efforts (not only) on that ticket!




[jira] [Commented] (CASSANDRA-13299) Potential OOMs and lock contention in write path streams

2017-08-20 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16134733#comment-16134733
 ] 

ZhaoYang commented on CASSANDRA-13299:
--

[trunk|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13299-trunk]
[dtest|https://github.com/riptano/cassandra-dtest/commits/CASSANDRA-13299]

Changes:

1. Throttle by the number of base unfiltereds; the default is 100. 
2. A pair of open/close range tombstone markers could have any number of 
unshadowed rows in between. The patch simply caches the range tombstones to avoid 
exceeding the limit, and applies the cached range tombstones in the next batch.
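
A small sketch of how such a default could be read from a system property. The property name is the one suggested later in review and is an assumption here, as are the class and method names; the lower bound of 2 follows the review note about open and close markers.

```java
// Hypothetical configuration helper, not taken from the patch.
final class RowsPerBatchConfig {
    // Assumed property name (the one suggested in review, "or similar").
    static final String PROPERTY = "cassandra.repair.mutation_repair_rows_per_batch";
    static final int DEFAULT_ROWS_PER_BATCH = 100;

    static int rowsPerBatch() {
        // Integer.getInteger returns the default when the property is unset or not a number.
        int configured = Integer.getInteger(PROPERTY, DEFAULT_ROWS_PER_BATCH);
        // Keep at least 2 so a batch can hold a re-opened marker and still make progress.
        return Math.max(2, configured);
    }
}
```

Under this sketch, starting the JVM with -Dcassandra.repair.mutation_repair_rows_per_batch=250 would raise the batch size to 250 unfiltereds.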

Note:
One partition deletion or range deletion could cause a huge number of view rows 
to be removed, so the view mutation may fail to apply due to a WTE or 
max_mutation_size, but that can be resolved separately in CASSANDRA-12783. 
Here, I only address the issue of holding the entire partition in memory when 
repairing a base table with MVs.




[jira] [Commented] (CASSANDRA-13299) Potential OOMs and lock contention in write path streams

2017-08-11 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16122903#comment-16122903
 ] 

ZhaoYang commented on CASSANDRA-13299:
--

[~brstgt] Hi Benjamin, are you working on this ticket?

I think there isn't a perfect base mutation size or number of base rows in a 
mutation that fits all data models. Your suggested min(16MB, 
max_mutation_size) should be good enough.

The first target is to reduce memory pressure when repairing huge partitions with MVs. 




[jira] [Commented] (CASSANDRA-13299) Potential OOMs and lock contention in write path streams

2017-03-05 Thread Benjamin Roth (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15896460#comment-15896460
 ] 

Benjamin Roth commented on CASSANDRA-13299:
---

Related to CASSANDRA-11670: this would also allow writing all streamed 
mutations to the commitlog without problems.
I also propose doing so for small streams (see CASSANDRA-13290). Writing small 
streams (e.g. < 100KB) to the commitlog does not require a flush at the end of 
stream receive. This avoids tons of flushes if tons of tiny streams are sent 
during a repair session.

These may be apples and oranges, but fixing all of these ends makes the whole 
process less error-prone and probably faster.
