[ https://issues.apache.org/jira/browse/CASSANDRA-13569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035393#comment-16035393 ]
Matt Byrd commented on CASSANDRA-13569: ---------------------------------------

Sure, n.p. [~spo...@gmail.com]

Yes, adding jitter to MIGRATION_DELAY_IN_MS could help once we're past:
{code:java}
runtimeMXBean.getUptime() < MIGRATION_DELAY_IN_MS
{code}
However, it doesn't help on startup. Initially, in trying to solve CASSANDRA-11748, I did also think about adding a random delay even for this branch (where we've only been up a short amount of time). That just didn't seem straightforward to do while also guaranteeing we wouldn't hit the problem described in CASSANDRA-11748: how do you know how much random delay is enough? What if you actually delay getting the schema legitimately?

I suppose the concerns in this ticket are similar to, but not exactly the same as, CASSANDRA-11748. That said, rate limiting the number of schema pulls per endpoint to one at a time seems sensible and might help a bit with CASSANDRA-11748 too. In CASSANDRA-11748 the schema is being pulled repeatedly from the same instances, and I'm not sure rate limiting alone, as described above, will definitely solve it. Perhaps it makes an OOM less likely, but we would still have many incoming serialised schemas from many nodes, and we're still left with a rough scalability limit of "number of nodes * size of serialised schema" (albeit perhaps with a different threshold). Upcoming changes in CASSANDRA-10699 and related tickets may make the CASSANDRA-11748 problem even less likely, since part of the problem is that we send the entire serialised schema inside a mutation, which can end up being quite large if you have lots of tables, or lots of columns in lots of tables.
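For what it's worth, the per-endpoint rate limiting discussed above could be sketched roughly as below. This is only an illustration of the idea, not Cassandra's actual MigrationManager code; the class and method names (SchemaPullGate, tryAcquire, release) are hypothetical:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical sketch: track endpoints that already have a schema pull
 * (MigrationTask) scheduled, and refuse to schedule another one for the
 * same endpoint until the first completes or fails.
 */
public class SchemaPullGate {
    // endpoints (keyed by address string) with a pull currently pending
    private final Map<String, Boolean> pending = new ConcurrentHashMap<>();

    /** Returns true only for the first caller; later callers for the
     *  same endpoint are told not to schedule another task. */
    public boolean tryAcquire(String endpoint) {
        return pending.putIfAbsent(endpoint, Boolean.TRUE) == null;
    }

    /** Called when the scheduled task completes (or fails), so a future
     *  schema mismatch can trigger a fresh pull. */
    public void release(String endpoint) {
        pending.remove(endpoint);
    }
}
```

The gossip-driven maybeScheduleSchemaPull path would then call tryAcquire before scheduling and release from the task's completion callback, so repeated mismatch notifications during the delay window no longer pile up duplicate tasks.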
Also, for reference, I believe the migration delay was added in the following ticket, in order to give a schema alteration sufficient time to propagate from the node where it changed, rather than have a migration task race with the change and pull the whole schema instead of receiving the delta: https://issues.apache.org/jira/browse/CASSANDRA-5025

> Schedule schema pulls just once per endpoint
> --------------------------------------------
>
> Key: CASSANDRA-13569
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13569
> Project: Cassandra
> Issue Type: Improvement
> Components: Distributed Metadata
> Reporter: Stefan Podkowinski
> Assignee: Stefan Podkowinski
> Fix For: 3.0.x, 3.11.x, 4.x
>
> Schema mismatches detected through gossip will get resolved by calling
> {{MigrationManager.maybeScheduleSchemaPull}}. This method may decide to
> schedule execution of a {{MigrationTask}}, but only after a
> {{MIGRATION_DELAY_IN_MS = 60000}} delay (for reasons unclear to me).
> Meanwhile, as long as the migration task hasn't been executed, gossip will
> continue to report schema mismatches, triggering further
> {{maybeScheduleSchemaPull}} calls, which schedule further tasks with the
> mentioned delay. Some local testing shows that dozens of tasks for the same
> endpoint will eventually be executed, causing the same stormy behavior
> for this very endpoint.
> My proposal would be to simply not schedule new tasks for an endpoint
> while we still have tasks for it pending execution after
> {{MIGRATION_DELAY_IN_MS}}.

-- This message was sent by Atlassian JIRA (v6.3.15#6346)