[jira] [Commented] (SOLR-9503) NPE in Replica Placement Rules when using Overseer Role with other rules
[ https://issues.apache.org/jira/browse/SOLR-9503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15804750#comment-15804750 ]

ASF subversion and git services commented on SOLR-9503:
---

Commit 2b66d0cb127b5e3e92a0f988aa7ba10690227ac3 in lucene-solr's branch refs/heads/branch_6x from [~noble.paul]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=2b66d0c ]

SOLR-9503: NPE in Replica Placement Rules when using Overseer Role with other rules

> NPE in Replica Placement Rules when using Overseer Role with other rules
> ------------------------------------------------------------------------
>
>                 Key: SOLR-9503
>                 URL: https://issues.apache.org/jira/browse/SOLR-9503
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public)
>          Components: Rules, SolrCloud
>    Affects Versions: 6.2, master (7.0)
>            Reporter: Tim Owen
>            Assignee: Noble Paul
>         Attachments: SOLR-9503.patch, SOLR-9503.patch
>
>
> The overseer role introduced in SOLR-9251 works well if there's only a single
> Rule for replica placement, e.g. {code}rule=role:!overseer{code} but when
> combined with another rule, e.g.
> {code}rule=role:!overseer&rule=host:*,shard:*,replica:<2{code} it can result
> in a NullPointerException (in Rule.tryAssignNodeToShard).
>
> This happens because the code builds up a nodeVsTags map, but it only has
> entries for nodes that have values for *all* tags used among the rules. This
> means not enough information is available to other rules when they are being
> checked during replica assignment. In the example rules above, if we have a
> cluster of 12 nodes and only 3 are given the Overseer role, the others do not
> have any entry in the nodeVsTags map because they only have the host tag
> value and not the role tag value.
>
> Looking at the code in ReplicaAssigner.getTagsForNodes, it is explicitly only
> keeping entries that fulfil the constraint of having values for all tags used
> in the rules. Possibly this constraint was suitable when rules were
> originally introduced, but the Role tag (used for Overseers) is unlikely to
> be present for all nodes in the cluster, and similarly for sysprop tags which
> may or may not be set for a node.
>
> My patch removes this constraint, so the nodeVsTags map contains everything
> known about all nodes, even if they have no value for a given tag. This
> allows the rule combination above to work, and doesn't appear to cause any
> problems with the code paths that use the nodeVsTags map. They handle null
> values quite well, and the tests pass.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
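The issue description above says the nodeVsTags map kept only nodes that had a value for every tag used by the rules, and that the patch keeps every node instead. The following is a minimal, self-contained sketch of that difference using the 12-node example from the description. It is not the Solr code: the class NodeTagsSketch and the methods strictTags, lenientTags and sampleCluster are hypothetical names invented for illustration, and the real ReplicaAssigner.getTagsForNodes may be structured quite differently.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Solr.
class NodeTagsSketch {

    /** Behaviour described as the bug: keep a node only if it reports every tag. */
    static Map<String, Map<String, Object>> strictTags(
            Map<String, Map<String, Object>> reported, List<String> tagNames) {
        Map<String, Map<String, Object>> result = new LinkedHashMap<>();
        for (Map.Entry<String, Map<String, Object>> e : reported.entrySet()) {
            if (e.getValue().keySet().containsAll(tagNames)) {
                result.put(e.getKey(), e.getValue());
            }
        }
        return result;
    }

    /** Behaviour described for the patch: keep every node, even with missing tags. */
    static Map<String, Map<String, Object>> lenientTags(
            Map<String, Map<String, Object>> reported) {
        return new LinkedHashMap<>(reported);
    }

    /** 12 nodes, all with a host tag, only the first 3 with role=overseer. */
    static Map<String, Map<String, Object>> sampleCluster() {
        Map<String, Map<String, Object>> reported = new LinkedHashMap<>();
        for (int i = 1; i <= 12; i++) {
            Map<String, Object> tags = new HashMap<>();
            tags.put("host", "host" + i);
            if (i <= 3) {
                tags.put("role", "overseer");
            }
            reported.put("node" + i, tags);
        }
        return reported;
    }

    public static void main(String[] args) {
        Map<String, Map<String, Object>> reported = sampleCluster();
        List<String> tagNames = Arrays.asList("role", "host");

        Map<String, Map<String, Object>> strict = strictTags(reported, tagNames);
        Map<String, Map<String, Object>> lenient = lenientTags(reported);

        System.out.println("strict entries:  " + strict.size());
        System.out.println("lenient entries: " + lenient.size());

        // With the strict map, node4 has no entry at all, so a later
        // strict.get("node4").get("host") would throw a NullPointerException.
        System.out.println("strict has node4: " + strict.containsKey("node4"));

        // With the lenient map the entry exists; only the missing tag is null,
        // which callers can check for.
        System.out.println("lenient node4 role: " + lenient.get("node4").get("role"));
    }
}
```

In this sketch the strict map ends up with entries for only the three overseer nodes, so a host-based rule evaluated against any other node has nothing to look up. The lenient map moves the null from the whole entry to the single missing tag value, which matches the description's remark that the consuming code paths handle null values.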
[jira] [Commented] (SOLR-9503) NPE in Replica Placement Rules when using Overseer Role with other rules
[ https://issues.apache.org/jira/browse/SOLR-9503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15804740#comment-15804740 ]

ASF subversion and git services commented on SOLR-9503:
---

Commit cd4f908d5ba223e615920be73285b7c5cc57704a in lucene-solr's branch refs/heads/master from [~noble.paul]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=cd4f908 ]

SOLR-9503: NPE in Replica Placement Rules when using Overseer Role with other rules
[jira] [Commented] (SOLR-9503) NPE in Replica Placement Rules when using Overseer Role with other rules
[ https://issues.apache.org/jira/browse/SOLR-9503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15800758#comment-15800758 ]

Noble Paul commented on SOLR-9503:
---

[~TimOwen] The fix looks fine. I could commit it if we could add a JUnit
[jira] [Commented] (SOLR-9503) NPE in Replica Placement Rules when using Overseer Role with other rules
[ https://issues.apache.org/jira/browse/SOLR-9503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15798416#comment-15798416 ]

Tim Owen commented on SOLR-9503:
---

Is anyone able to take a look at this fix - maybe [~noble.paul]? I hope the assumptions I've made in the diff are correct. We've been using it in production for a few months, in our custom build of Solr. Would be nice to roll it in upstream.
[jira] [Commented] (SOLR-9503) NPE in Replica Placement Rules when using Overseer Role with other rules
[ https://issues.apache.org/jira/browse/SOLR-9503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15798437#comment-15798437 ]

Noble Paul commented on SOLR-9503:
---

I missed it, Tim. I shall take a look at it tomorrow.
[jira] [Commented] (SOLR-9503) NPE in Replica Placement Rules when using Overseer Role with other rules
[ https://issues.apache.org/jira/browse/SOLR-9503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15484252#comment-15484252 ]

Tim Owen commented on SOLR-9503:
---

As an aside, I noticed that `Rule.Operand.GREATER_THAN` seems to be missing an override for `public int compare(Object n1Val, Object n2Val)` .. but compare only appears to be used when sorting the live nodes, so maybe it's not a big deal?
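To make the aside about `Rule.Operand.GREATER_THAN` concrete, here is a small sketch of what a missing per-constant override can look like in a Java enum. Everything in it is an assumption made for illustration: the enum Op, its default compare returning 0, and the toLong helper are hypothetical and are not copied from Solr's Rule.Operand, whose actual default and overrides may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; this is not Solr's Rule.Operand.
class OperandSketch {

    enum Op {
        EQUAL,
        LESS_THAN {
            @Override
            int compare(Object a, Object b) {
                // Smaller values sort first.
                return Long.compare(toLong(a), toLong(b));
            }
        },
        // No constant-specific body: inherits the default compare below.
        GREATER_THAN;

        /** Assumed default: treat every pair as equal. */
        int compare(Object a, Object b) {
            return 0;
        }

        static long toLong(Object o) {
            return o == null ? 0L : ((Number) o).longValue();
        }
    }

    public static void main(String[] args) {
        List<Long> withOverride = new ArrayList<>(Arrays.asList(3L, 1L, 2L));
        withOverride.sort(Op.LESS_THAN::compare);
        System.out.println("LESS_THAN:    " + withOverride);

        // The inherited compare reports all pairs as equal, and List.sort is
        // stable, so the input order is left unchanged.
        List<Long> withoutOverride = new ArrayList<>(Arrays.asList(3L, 1L, 2L));
        withoutOverride.sort(Op.GREATER_THAN::compare);
        System.out.println("GREATER_THAN: " + withoutOverride);
    }
}
```

Under these assumptions a missing override does not fail; it only means that sorting with that operand has no effect on ordering. That fits the comment's reading that the gap matters little if compare is used only to order live nodes.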