[jira] [Updated] (CASSANDRA-13299) Potential OOMs and lock contention in write path streams
[ https://issues.apache.org/jira/browse/CASSANDRA-13299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Jirsa updated CASSANDRA-13299:
-----------------------------------
    Fix Version/s: 4.x (was: 4.0)

> Potential OOMs and lock contention in write path streams
> --------------------------------------------------------
>
>                 Key: CASSANDRA-13299
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13299
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Materialized Views
>            Reporter: Benjamin Roth
>            Assignee: ZhaoYang
>             Fix For: 4.x
>
> I see a potential OOM when a stream (e.g. repair) goes through the write
> path, as it does with MVs.
>
> StreamReceiveTask gets a bunch of SSTableReaders. These produce row
> iterators, which in turn produce mutations. So every partition creates a
> single mutation, which in the case of (very) big partitions can result in
> (very) big mutations. Those are created on heap and stay there until they
> have finished processing.
>
> I don't think it is necessary to create a single mutation for each
> partition. Why don't we implement a PartitionUpdateGeneratorIterator that
> takes an UnfilteredRowIterator and a max size and spits out
> PartitionUpdates to be used to create and apply mutations?
>
> The max size should be something like min(reasonable_absolute_max_size,
> max_mutation_size, commitlog_segment_size / 2).
> reasonable_absolute_max_size could be around 16 MB.
>
> A mutation shouldn't be too large, as it also affects MV partition locking.
> The longer an MV partition is locked during a stream, the higher the
> chances are that write timeout exceptions (WTEs) occur during streams.
>
> I could also imagine that a max number of updates per mutation, regardless
> of size in bytes, could make sense to avoid lock contention.
>
> Love to get feedback and suggestions, incl. naming suggestions.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
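A minimal sketch of the batching idea proposed in the description, not the actual patch. It uses no Cassandra classes: `BoundedBatchIterator` is a hypothetical stand-in for the proposed PartitionUpdateGeneratorIterator, the generic item type stands in for rows read from an UnfilteredRowIterator, each emitted `List` stands in for one PartitionUpdate, and `sizeOf` is an assumed caller-supplied size estimate. The `maxItems` parameter illustrates the suggested cap on updates per mutation.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.ToLongFunction;

/** Hypothetical illustration: splits one large stream of rows into bounded batches. */
final class BoundedBatchIterator<T> implements Iterator<List<T>> {
    private final Iterator<T> source;
    private final ToLongFunction<T> sizeOf; // assumed size estimate per item, in bytes
    private final long maxBytes;
    private final int maxItems;
    private T pending; // item already read from source but not yet emitted (items must be non-null)

    BoundedBatchIterator(Iterator<T> source, ToLongFunction<T> sizeOf, long maxBytes, int maxItems) {
        if (maxBytes <= 0 || maxItems <= 0) {
            throw new IllegalArgumentException("limits must be positive");
        }
        this.source = source;
        this.sizeOf = sizeOf;
        this.maxBytes = maxBytes;
        this.maxItems = maxItems;
    }

    /** The limit from the description: min(absolute max, max mutation size, commitlog segment size / 2). */
    static long maxBatchBytes(long reasonableAbsoluteMax, long maxMutationSize, long commitlogSegmentSize) {
        return Math.min(reasonableAbsoluteMax, Math.min(maxMutationSize, commitlogSegmentSize / 2));
    }

    @Override
    public boolean hasNext() {
        return pending != null || source.hasNext();
    }

    @Override
    public List<T> next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        List<T> batch = new ArrayList<>();
        long bytes = 0;
        while (batch.size() < maxItems) {
            if (pending == null) {
                if (!source.hasNext()) {
                    break;
                }
                pending = source.next();
            }
            long size = sizeOf.applyAsLong(pending);
            // Always accept the first item, so a single oversized row still makes progress.
            if (!batch.isEmpty() && bytes + size > maxBytes) {
                break; // keep the item in 'pending' for the next batch
            }
            batch.add(pending);
            bytes += size;
            pending = null;
        }
        return batch;
    }
}
```

Only one batch is materialized at a time, so heap usage is bounded by roughly `maxBytes` plus one row rather than by the partition size. A row larger than the limit is emitted alone instead of being rejected; whether that is acceptable for the real write path is an open design question this sketch does not answer.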
[jira] [Updated] (CASSANDRA-13299) Potential OOMs and lock contention in write path streams
[ https://issues.apache.org/jira/browse/CASSANDRA-13299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nate McCall updated CASSANDRA-13299:
------------------------------------
    Fix Version/s: 4.0
[jira] [Updated] (CASSANDRA-13299) Potential OOMs and lock contention in write path streams
[ https://issues.apache.org/jira/browse/CASSANDRA-13299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nate McCall updated CASSANDRA-13299:
------------------------------------
    Component/s: Materialized Views
[jira] [Updated] (CASSANDRA-13299) Potential OOMs and lock contention in write path streams
[ https://issues.apache.org/jira/browse/CASSANDRA-13299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paulo Motta updated CASSANDRA-13299:
------------------------------------
    Status: Open (was: Patch Available)
[jira] [Updated] (CASSANDRA-13299) Potential OOMs and lock contention in write path streams
[ https://issues.apache.org/jira/browse/CASSANDRA-13299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ZhaoYang updated CASSANDRA-13299:
---------------------------------
    Reviewer: Paulo Motta
[jira] [Updated] (CASSANDRA-13299) Potential OOMs and lock contention in write path streams
[ https://issues.apache.org/jira/browse/CASSANDRA-13299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ZhaoYang updated CASSANDRA-13299:
---------------------------------
    Status: Patch Available (was: Awaiting Feedback)
[jira] [Updated] (CASSANDRA-13299) Potential OOMs and lock contention in write path streams
[ https://issues.apache.org/jira/browse/CASSANDRA-13299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ZhaoYang updated CASSANDRA-13299:
---------------------------------
    Status: Awaiting Feedback (was: Open)
[jira] [Updated] (CASSANDRA-13299) Potential OOMs and lock contention in write path streams
[ https://issues.apache.org/jira/browse/CASSANDRA-13299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Roth updated CASSANDRA-13299:
--------------------------------------
    Description:
I see a potential OOM, when a stream (e.g. repair) goes through the write path as it is with MVs.
StreamReceiveTask gets a bunch of SSTableReaders. These produce rowiterators and they again produce mutations. So every partition creates a single mutation, which in case of (very) big partitions can result in (very) big mutations. Those are created on heap and stay there until they finished processing.
I don't think it is necessary to create a single mutation for each partition. Why don't we implement a PartitionUpdateGeneratorIterator that takes a UnfilteredRowIterator and a max size and spits out PartitionUpdates to be used to create and apply mutations?
The max size should be something like min(reasonable_absolute_max_size, max_mutation_size, commitlog_segment_size / 2). reasonable_absolute_max_size could be like 16M or sth.
A mutation shouldn't be too large as it also affects MV partition locking. The longer a MV partition is locked during a stream, the higher chances are that WTE's occur during streams.
I could also imagine that a max number of updates per mutation regardless of size in bytes could make sense to avoid lock contention.
Love to get feedback and suggestions, incl. naming suggestions.

  was:
I see a potential OOM, when a stream (e.g. repair) goes through the write path as it is with MVs.
StreamReceiveTask gets a bunch of SSTableReaders. These produce rowiterators and they again produce mutations. So every partition creates a single mutation, which in case of (very) big partitions can result in (very) big mutations. Those are created on heap and stay there until they are processed.
I don't think it is necessary to create a single mutation for each partition. Why don't we implement a PartitionUpdateGeneratorIterator that takes a UnfilteredRowIterator and a max size and spits out PartitionUpdates to be used to create and apply mutations?
The max size should be something like min(reasonable_absolute_max_size, max_mutation_size, commitlog_segment_size / 2). reasonable_absolute_max_size could be like 16M or sth.
A mutation shouldn't be too large as it also affects MV partition locking. As longer a MV partition is locked during a stream, the higher chances are that WTE's occur during streams.
I could also imagine that a max number of updates per mutation regardless of size in bytes could make sense to avoid lock contention.
Love to get feedback and suggestions, incl. naming suggestions.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)