[jira] [Updated] (CASSANDRA-13299) Potential OOMs and lock contention in write path streams

2017-09-25 Thread Jeff Jirsa (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-13299:
---
Fix Version/s: (was: 4.0)
   4.x

> Potential OOMs and lock contention in write path streams
> 
>
> Key: CASSANDRA-13299
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13299
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Materialized Views
>Reporter: Benjamin Roth
>Assignee: ZhaoYang
> Fix For: 4.x
>
>
> I see a potential OOM, when a stream (e.g. repair) goes through the write 
> path as it is with MVs.
> StreamReceiveTask gets a bunch of SSTableReaders. These produce rowiterators 
> and they again produce mutations. So every partition creates a single 
> mutation, which in case of (very) big partitions can result in (very) big 
> mutations. Those are created on heap and stay there until they finished 
> processing.
> I don't think it is necessary to create a single mutation for each partition. 
> Why don't we implement a PartitionUpdateGeneratorIterator that takes a 
> UnfilteredRowIterator and a max size and spits out PartitionUpdates to be 
> used to create and apply mutations?
> The max size should be something like min(reasonable_absolute_max_size, 
> max_mutation_size, commitlog_segment_size / 2). reasonable_absolute_max_size 
> could be like 16M or sth.
> A mutation shouldn't be too large as it also affects MV partition locking. 
> The longer a MV partition is locked during a stream, the higher chances are 
> that WTE's occur during streams.
> I could also imagine that a max number of updates per mutation regardless of 
> size in bytes could make sense to avoid lock contention.
> Love to get feedback and suggestions, incl. naming suggestions.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-13299) Potential OOMs and lock contention in write path streams

2017-09-25 Thread Nate McCall (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nate McCall updated CASSANDRA-13299:

Fix Version/s: 4.0

> Potential OOMs and lock contention in write path streams
> 
>
> Key: CASSANDRA-13299
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13299
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Materialized Views
>Reporter: Benjamin Roth
>Assignee: ZhaoYang
> Fix For: 4.0
>
>
> I see a potential OOM, when a stream (e.g. repair) goes through the write 
> path as it is with MVs.
> StreamReceiveTask gets a bunch of SSTableReaders. These produce rowiterators 
> and they again produce mutations. So every partition creates a single 
> mutation, which in case of (very) big partitions can result in (very) big 
> mutations. Those are created on heap and stay there until they finished 
> processing.
> I don't think it is necessary to create a single mutation for each partition. 
> Why don't we implement a PartitionUpdateGeneratorIterator that takes a 
> UnfilteredRowIterator and a max size and spits out PartitionUpdates to be 
> used to create and apply mutations?
> The max size should be something like min(reasonable_absolute_max_size, 
> max_mutation_size, commitlog_segment_size / 2). reasonable_absolute_max_size 
> could be like 16M or sth.
> A mutation shouldn't be too large as it also affects MV partition locking. 
> The longer a MV partition is locked during a stream, the higher chances are 
> that WTE's occur during streams.
> I could also imagine that a max number of updates per mutation regardless of 
> size in bytes could make sense to avoid lock contention.
> Love to get feedback and suggestions, incl. naming suggestions.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-13299) Potential OOMs and lock contention in write path streams

2017-09-25 Thread Nate McCall (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nate McCall updated CASSANDRA-13299:

Component/s: Materialized Views

> Potential OOMs and lock contention in write path streams
> 
>
> Key: CASSANDRA-13299
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13299
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Materialized Views
>Reporter: Benjamin Roth
>Assignee: ZhaoYang
>
> I see a potential OOM, when a stream (e.g. repair) goes through the write 
> path as it is with MVs.
> StreamReceiveTask gets a bunch of SSTableReaders. These produce rowiterators 
> and they again produce mutations. So every partition creates a single 
> mutation, which in case of (very) big partitions can result in (very) big 
> mutations. Those are created on heap and stay there until they finished 
> processing.
> I don't think it is necessary to create a single mutation for each partition. 
> Why don't we implement a PartitionUpdateGeneratorIterator that takes a 
> UnfilteredRowIterator and a max size and spits out PartitionUpdates to be 
> used to create and apply mutations?
> The max size should be something like min(reasonable_absolute_max_size, 
> max_mutation_size, commitlog_segment_size / 2). reasonable_absolute_max_size 
> could be like 16M or sth.
> A mutation shouldn't be too large as it also affects MV partition locking. 
> The longer a MV partition is locked during a stream, the higher chances are 
> that WTE's occur during streams.
> I could also imagine that a max number of updates per mutation regardless of 
> size in bytes could make sense to avoid lock contention.
> Love to get feedback and suggestions, incl. naming suggestions.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-13299) Potential OOMs and lock contention in write path streams

2017-09-14 Thread Paulo Motta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta updated CASSANDRA-13299:

Status: Open  (was: Patch Available)

> Potential OOMs and lock contention in write path streams
> 
>
> Key: CASSANDRA-13299
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13299
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Benjamin Roth
>Assignee: ZhaoYang
>
> I see a potential OOM, when a stream (e.g. repair) goes through the write 
> path as it is with MVs.
> StreamReceiveTask gets a bunch of SSTableReaders. These produce rowiterators 
> and they again produce mutations. So every partition creates a single 
> mutation, which in case of (very) big partitions can result in (very) big 
> mutations. Those are created on heap and stay there until they finished 
> processing.
> I don't think it is necessary to create a single mutation for each partition. 
> Why don't we implement a PartitionUpdateGeneratorIterator that takes a 
> UnfilteredRowIterator and a max size and spits out PartitionUpdates to be 
> used to create and apply mutations?
> The max size should be something like min(reasonable_absolute_max_size, 
> max_mutation_size, commitlog_segment_size / 2). reasonable_absolute_max_size 
> could be like 16M or sth.
> A mutation shouldn't be too large as it also affects MV partition locking. 
> The longer a MV partition is locked during a stream, the higher chances are 
> that WTE's occur during streams.
> I could also imagine that a max number of updates per mutation regardless of 
> size in bytes could make sense to avoid lock contention.
> Love to get feedback and suggestions, incl. naming suggestions.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-13299) Potential OOMs and lock contention in write path streams

2017-09-07 Thread ZhaoYang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-13299:
-
Reviewer: Paulo Motta

> Potential OOMs and lock contention in write path streams
> 
>
> Key: CASSANDRA-13299
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13299
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Benjamin Roth
>Assignee: ZhaoYang
>
> I see a potential OOM, when a stream (e.g. repair) goes through the write 
> path as it is with MVs.
> StreamReceiveTask gets a bunch of SSTableReaders. These produce rowiterators 
> and they again produce mutations. So every partition creates a single 
> mutation, which in case of (very) big partitions can result in (very) big 
> mutations. Those are created on heap and stay there until they finished 
> processing.
> I don't think it is necessary to create a single mutation for each partition. 
> Why don't we implement a PartitionUpdateGeneratorIterator that takes a 
> UnfilteredRowIterator and a max size and spits out PartitionUpdates to be 
> used to create and apply mutations?
> The max size should be something like min(reasonable_absolute_max_size, 
> max_mutation_size, commitlog_segment_size / 2). reasonable_absolute_max_size 
> could be like 16M or sth.
> A mutation shouldn't be too large as it also affects MV partition locking. 
> The longer a MV partition is locked during a stream, the higher chances are 
> that WTE's occur during streams.
> I could also imagine that a max number of updates per mutation regardless of 
> size in bytes could make sense to avoid lock contention.
> Love to get feedback and suggestions, incl. naming suggestions.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-13299) Potential OOMs and lock contention in write path streams

2017-09-07 Thread ZhaoYang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-13299:
-
Status: Patch Available  (was: Awaiting Feedback)

> Potential OOMs and lock contention in write path streams
> 
>
> Key: CASSANDRA-13299
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13299
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Benjamin Roth
>Assignee: ZhaoYang
>
> I see a potential OOM, when a stream (e.g. repair) goes through the write 
> path as it is with MVs.
> StreamReceiveTask gets a bunch of SSTableReaders. These produce rowiterators 
> and they again produce mutations. So every partition creates a single 
> mutation, which in case of (very) big partitions can result in (very) big 
> mutations. Those are created on heap and stay there until they finished 
> processing.
> I don't think it is necessary to create a single mutation for each partition. 
> Why don't we implement a PartitionUpdateGeneratorIterator that takes a 
> UnfilteredRowIterator and a max size and spits out PartitionUpdates to be 
> used to create and apply mutations?
> The max size should be something like min(reasonable_absolute_max_size, 
> max_mutation_size, commitlog_segment_size / 2). reasonable_absolute_max_size 
> could be like 16M or sth.
> A mutation shouldn't be too large as it also affects MV partition locking. 
> The longer a MV partition is locked during a stream, the higher chances are 
> that WTE's occur during streams.
> I could also imagine that a max number of updates per mutation regardless of 
> size in bytes could make sense to avoid lock contention.
> Love to get feedback and suggestions, incl. naming suggestions.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-13299) Potential OOMs and lock contention in write path streams

2017-08-23 Thread ZhaoYang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-13299:
-
Status: Awaiting Feedback  (was: Open)

> Potential OOMs and lock contention in write path streams
> 
>
> Key: CASSANDRA-13299
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13299
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Benjamin Roth
>Assignee: ZhaoYang
>
> I see a potential OOM, when a stream (e.g. repair) goes through the write 
> path as it is with MVs.
> StreamReceiveTask gets a bunch of SSTableReaders. These produce rowiterators 
> and they again produce mutations. So every partition creates a single 
> mutation, which in case of (very) big partitions can result in (very) big 
> mutations. Those are created on heap and stay there until they finished 
> processing.
> I don't think it is necessary to create a single mutation for each partition. 
> Why don't we implement a PartitionUpdateGeneratorIterator that takes a 
> UnfilteredRowIterator and a max size and spits out PartitionUpdates to be 
> used to create and apply mutations?
> The max size should be something like min(reasonable_absolute_max_size, 
> max_mutation_size, commitlog_segment_size / 2). reasonable_absolute_max_size 
> could be like 16M or sth.
> A mutation shouldn't be too large as it also affects MV partition locking. 
> The longer a MV partition is locked during a stream, the higher chances are 
> that WTE's occur during streams.
> I could also imagine that a max number of updates per mutation regardless of 
> size in bytes could make sense to avoid lock contention.
> Love to get feedback and suggestions, incl. naming suggestions.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-13299) Potential OOMs and lock contention in write path streams

2017-03-05 Thread Benjamin Roth (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Roth updated CASSANDRA-13299:
--
Description: 
I see a potential OOM, when a stream (e.g. repair) goes through the write path 
as it is with MVs.

StreamReceiveTask gets a bunch of SSTableReaders. These produce rowiterators 
and they again produce mutations. So every partition creates a single mutation, 
which in case of (very) big partitions can result in (very) big mutations. 
Those are created on heap and stay there until they finished processing.

I don't think it is necessary to create a single mutation for each partition. 
Why don't we implement a PartitionUpdateGeneratorIterator that takes a 
UnfilteredRowIterator and a max size and spits out PartitionUpdates to be used 
to create and apply mutations?
The max size should be something like min(reasonable_absolute_max_size, 
max_mutation_size, commitlog_segment_size / 2). reasonable_absolute_max_size 
could be like 16M or sth.
A mutation shouldn't be too large as it also affects MV partition locking. The 
longer a MV partition is locked during a stream, the higher chances are that 
WTE's occur during streams.
I could also imagine that a max number of updates per mutation regardless of 
size in bytes could make sense to avoid lock contention.

Love to get feedback and suggestions, incl. naming suggestions.


  was:
I see a potential OOM, when a stream (e.g. repair) goes through the write path 
as it is with MVs.

StreamReceiveTask gets a bunch of SSTableReaders. These produce rowiterators 
and they again produce mutations. So every partition creates a single mutation, 
which in case of (very) big partitions can result in (very) big mutations. 
Those are created on heap and stay there until they are processed.

I don't think it is necessary to create a single mutation for each partition. 
Why don't we implement a PartitionUpdateGeneratorIterator that takes a 
UnfilteredRowIterator and a max size and spits out PartitionUpdates to be used 
to create and apply mutations?
The max size should be something like min(reasonable_absolute_max_size, 
max_mutation_size, commitlog_segment_size / 2). reasonable_absolute_max_size 
could be like 16M or sth.
A mutation shouldn't be too large as it also affects MV partition locking. As 
longer a MV partition is locked during a stream, the higher chances are that 
WTE's occur during streams.
I could also imagine that a max number of updates per mutation regardless of 
size in bytes could make sense to avoid lock contention.

Love to get feedback and suggestions, incl. naming suggestions.



> Potential OOMs and lock contention in write path streams
> 
>
> Key: CASSANDRA-13299
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13299
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Benjamin Roth
>
> I see a potential OOM, when a stream (e.g. repair) goes through the write 
> path as it is with MVs.
> StreamReceiveTask gets a bunch of SSTableReaders. These produce rowiterators 
> and they again produce mutations. So every partition creates a single 
> mutation, which in case of (very) big partitions can result in (very) big 
> mutations. Those are created on heap and stay there until they finished 
> processing.
> I don't think it is necessary to create a single mutation for each partition. 
> Why don't we implement a PartitionUpdateGeneratorIterator that takes a 
> UnfilteredRowIterator and a max size and spits out PartitionUpdates to be 
> used to create and apply mutations?
> The max size should be something like min(reasonable_absolute_max_size, 
> max_mutation_size, commitlog_segment_size / 2). reasonable_absolute_max_size 
> could be like 16M or sth.
> A mutation shouldn't be too large as it also affects MV partition locking. 
> The longer a MV partition is locked during a stream, the higher chances are 
> that WTE's occur during streams.
> I could also imagine that a max number of updates per mutation regardless of 
> size in bytes could make sense to avoid lock contention.
> Love to get feedback and suggestions, incl. naming suggestions.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)