[jira] [Commented] (SOLR-12050) UTILIZENODE does not enforce policy rules

2018-03-01 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-12050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16383222#comment-16383222
 ] 

ASF subversion and git services commented on SOLR-12050:


Commit 23aee00213a2c48bd578bcf01a5ed435b0bdc881 in lucene-solr's branch 
refs/heads/master from noble
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=23aee00 ]

SOLR-12031: Refactor Policy framework to make simulated changes affect more 
than a single node
SOLR-12050: UTILIZENODE does not enforce policy rules


> UTILIZENODE does not enforce policy rules
> -----------------------------------------
>
> Key: SOLR-12050
> URL: https://issues.apache.org/jira/browse/SOLR-12050
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Reporter: Hoss Man
> Priority: Major
> Attachments: SOLR-12050.log.txt
>
>
> I've been poking around TestUtilizeNode and some of its recent Jenkins 
> failures -- AFAICT the {{UTILIZENODE}} command is not behaving correctly per 
> its current documentation...
> bq. It tries to fix any policy violations first and then it tries to move 
> some load off of the most loaded nodes according to the preferences
> ...based on my testing w/a slightly modified testcase that does additional 
> logging/asserts, it will frequently choose a "random" replica to move, 
> even when there are existing replicas that violate the policy.
> I will be committing my current improvements to the test while citing this 
> issue, and marking the test \@AwaitsFix.  Then I'll attach some logs/comments 
> showing what I mean.
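
To make the documented ordering concrete, here is a minimal sketch of what the quoted sentence seems to promise: a replica on a policy-violating node is chosen first, and only when there is none does the choice fall to the most loaded node. This is not Solr's implementation. The function name, the `(replica, node)` tuples, and the use of replica count as "load" are all assumptions made for illustration; real preferences can sort on other metrics.

```python
from collections import Counter


def pick_replica_to_move(replicas, violates_policy):
    """Pick the replica that should be moved onto a newly utilized node.

    replicas: list of (replica_name, node_name) tuples (hypothetical shape).
    violates_policy: callable taking a node name and returning True when
        hosting a replica on that node breaks the cluster policy.
    """
    # Step 1: any replica sitting on a violating node wins outright.
    violating = [r for r in replicas if violates_policy(r[1])]
    if violating:
        return violating[0]

    # Step 2: no violations, so take a replica from the most loaded node.
    # "Load" is simply the replica count here (an assumption).
    load = Counter(node for _, node in replicas)
    if not load:
        return None
    # Sorting the node names first makes tie-breaking deterministic.
    busiest = max(sorted(load), key=lambda n: load[n])
    return next(r for r in replicas if r[1] == busiest)
```

Under this reading, the behaviour reported in this issue corresponds to step 1 being skipped: a replica is picked from some other node even though `violates_policy` is true for a node that still hosts one.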



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12050) UTILIZENODE does not enforce policy rules

2018-03-01 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-12050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16383217#comment-16383217
 ] 

ASF subversion and git services commented on SOLR-12050:


Commit 888c6260f122d03beec03615469dbed444ab62e7 in lucene-solr's branch 
refs/heads/branch_7x from noble
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=888c626 ]

SOLR-12031: Refactor Policy framework to make simulated changes affect more 
than a single node
SOLR-12050: UTILIZENODE does not enforce policy rules





[jira] [Commented] (SOLR-12050) UTILIZENODE does not enforce policy rules

2018-03-01 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-12050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16382930#comment-16382930
 ] 

Hoss Man commented on SOLR-12050:
---------------------------------


I've attached a sample log file from running this test after my assert/logging 
updates. If you look for the new logging messages, it's pretty easy to see that 
while the 2nd UTILIZE command is causing a replica to be moved onto the new 
node (jettyY), it seems to be completely ignoring the fact that there is a core 
hosted on a "blacklisted" (per the policy) port (jettyX) that should be the 
first candidate for being moved...
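
The check described above can also be done mechanically against the log excerpt that follows. The sketch below pulls the source and target node out of each MoveReplicaCmd message and flags a move onto the new node that did not come from the blacklisted one. The regular expression assumes the exact message wording seen in this log, and the function names are hypothetical helpers, not part of Solr or its test framework.

```python
import json
import re

# Matches "Replica will be moved to node <target>: core_nodeN:{...json...}",
# tolerating the line wrapping introduced by the mail archive.
MOVE_RE = re.compile(
    r"Replica will be moved\s+to node\s+(?P<target>\S+):\s+"
    r"core_node\d+:(?P<props>\{.*?\})",
    re.DOTALL,
)


def moved_replicas(log_text):
    """Return (source_node, target_node, core) for every logged move."""
    moves = []
    for match in MOVE_RE.finditer(log_text):
        # The replica properties are logged as a flat JSON object.
        props = json.loads(match.group("props"))
        moves.append((props["node_name"], match.group("target"), props["core"]))
    return moves


def ignores_blacklisted_node(log_text, target_node, blacklisted_node):
    """True if some move onto target_node came from a non-blacklisted node."""
    return any(
        dst == target_node and src != blacklisted_node
        for src, dst, _core in moved_replicas(log_text)
    )
```

Feeding it the text of the second UTILIZE command's log lines, with jettyY as `target_node` and jettyX as `blacklisted_node`, should return `True` for the run shown here.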

{noformat}
  // in this particular run, the first UTILIZENODE command works,
  // it moves a replica off a random node to jettyX/3
  //
  // (although see TODO in test -- based on how the docs are worded,
  // it's not clear if there's any requirement that it do so)
  
9201 INFO  (TEST-TestUtilizeNode.test-seed#[78A4DE08FC5237FE]) [] 
o.a.s.c.TestUtilizeNode Sending UTILIZE command for jettyX 
(127.0.0.1:3_solr)
9204 INFO  (qtp1498399719-45) [n:127.0.0.1:33567_solr] 
o.a.s.h.a.CollectionsHandler Invoked Collection Action :utilizenode with params 
node=127.0.0.1:3_solr&action=UTILIZENODE&wt=javabin&version=2 and 
sendToOCPQueue=true
  ...
9355 INFO  
(OverseerThreadFactory-20-thread-3-processing-n:127.0.0.1:46180_solr) 
[n:127.0.0.1:46180_solr] o.a.s.c.a.c.MoveReplicaCmd Replica will be moved 
to node 127.0.0.1:3_solr: 
core_node8:{"core":"utilizenodecoll_shard2_replica_n7","base_url":"http://127.0.0.1:33567/solr","node_name":"127.0.0.1:33567_solr","state":"active","type":"NRT"}
9361 INFO  
(OverseerThreadFactory-20-thread-3-processing-n:127.0.0.1:46180_solr) 
[n:127.0.0.1:46180_solr] o.a.s.c.a.c.AddReplicaCmd Node Identified 
127.0.0.1:3_solr for creating new replica
  ...
10078 INFO  (qtp1498399719-45) [n:127.0.0.1:33567_solr] 
o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/collections 
params={node=127.0.0.1:3_solr&action=UTILIZENODE&wt=javabin&version=2} 
status=0 QTime=874
  
  // next up, sanity check which replicas jettyX/3 now has,
  // then set a new policy saying that port 3 should have 0 replicas...

10079 INFO  (TEST-TestUtilizeNode.test-seed#[78A4DE08FC5237FE]) [] 
o.a.s.c.TestUtilizeNode jettyX replicas prior to being blacklisted: 
[core_node10:{"core":"utilizenodecoll_shard2_replica_n9","base_url":"http://127.0.0.1:3/solr","node_name":"127.0.0.1:3_solr","state":"recovering","type":"NRT"}]
10079 INFO  (TEST-TestUtilizeNode.test-seed#[78A4DE08FC5237FE]) [] 
o.a.s.c.TestUtilizeNode Setting new policy to blacklist jettyX 
(127.0.0.1:3_solr) port=3
  ...
10143 INFO  (qtp1498399719-27) [n:127.0.0.1:33567_solr] 
o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/autoscaling 
params={wt=javabin&version=2} status=0 QTime=59

  // now spin up another new node: jettyY/55619, 
  // redundantly sanity check the replicas on jettyX again,

10144 INFO  (TEST-TestUtilizeNode.test-seed#[78A4DE08FC5237FE]) [] 
o.a.s.c.TestUtilizeNode Spinning up additional jettyY...
  ...
10361 INFO  (zkConnectionManagerCallback-78-thread-1) [] 
o.a.s.c.c.ConnectionManager zkClient has connected
10365 INFO  (TEST-TestUtilizeNode.test-seed#[78A4DE08FC5237FE]) [] 
o.a.s.c.TestUtilizeNode jettyX replicas prior to utilizing jettyY: 
[core_node10:{"core":"utilizenodecoll_shard2_replica_n9","base_url":"http://127.0.0.1:3/solr","node_name":"127.0.0.1:3_solr","state":"recovering","type":"NRT"}]

  // Now send a UTILIZENODE command for jettyY/55619,
  // this *should* move the replica from jettyX->jettyY
  // (in order to resolve the existing policy violation)

10365 INFO  (TEST-TestUtilizeNode.test-seed#[78A4DE08FC5237FE]) [] 
o.a.s.c.TestUtilizeNode Sending UTILIZE command for jettyY 
(127.0.0.1:55619_solr)
10366 INFO  (qtp1498399719-45) [n:127.0.0.1:33567_solr] 
o.a.s.h.a.CollectionsHandler Invoked Collection Action :utilizenode with params 
node=127.0.0.1:55619_solr&action=UTILIZENODE&wt=javabin&version=2 and 
sendToOCPQueue=true
  ...
10448 INFO  
(OverseerThreadFactory-20-thread-4-processing-n:127.0.0.1:46180_solr) 
[n:127.0.0.1:46180_solr] o.a.s.c.a.c.MoveReplicaCmd Replica will be moved 
to node 127.0.0.1:55619_solr: 
core_node6:{"core":"utilizenodecoll_shard2_replica_n5","base_url":"http://127.0.0.1:46180/solr","node_name":"127.0.0.1:46180_solr","state":"active","type":"NRT","leader":"true"}
10450 INFO  
(OverseerThreadFactory-20-thread-4-processing-n:127.0.0.1:46180_solr) 
[n:127.0.0.1:46180_solr] o.a.s.c.a.c.AddReplicaCmd Node Identified 
127.0.0.1:55619_solr for creating new replica
  ...
12710 INFO  (qtp1498399719-45) [n:127.0.0.1:33567_solr] 
o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/collections 
params={node=127.0.0.1:55619_solr&action=UTILIZENODE&wt=javabin&version=2} 
status=0 QTime=2343

  // but as you can see above, the replica that's added to jettyY/55619
  // comes from a completely different node on port 

[jira] [Commented] (SOLR-12050) UTILIZENODE does not enforce policy rules

2018-03-01 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-12050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16382922#comment-16382922
 ] 

ASF subversion and git services commented on SOLR-12050:


Commit e2b3a97587a4387ab138252354d819ce253b625f in lucene-solr's branch 
refs/heads/branch_7x from Chris Hostetter
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=e2b3a97 ]

SOLR-12050: mark TestUtilizeNode as AwaitsFix as well as adding additional 
logging/assertions to help see what the bug is

(cherry picked from commit 0424d9c06ba52037024ce5f0f678b2aca8c34fb7)





[jira] [Commented] (SOLR-12050) UTILIZENODE does not enforce policy rules

2018-03-01 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-12050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16382923#comment-16382923
 ] 

ASF subversion and git services commented on SOLR-12050:


Commit 0424d9c06ba52037024ce5f0f678b2aca8c34fb7 in lucene-solr's branch 
refs/heads/master from Chris Hostetter
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=0424d9c ]

SOLR-12050: mark TestUtilizeNode as AwaitsFix as well as adding additional 
logging/assertions to help see what the bug is

