[jira] [Commented] (SOLR-13050) SystemLogListener can "lose" record of nodeLost event when node lost is/was .system collection leader

2019-01-03 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16733453#comment-16733453
 ] 

ASF subversion and git services commented on SOLR-13050:


Commit aee7acdf71444ae7d863dcb2b86a41f604c6a434 in lucene-solr's branch 
refs/heads/branch_7x from Cassandra Targett
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=aee7acd ]

SOLR-13050: make italicized note into a real NOTE block


> SystemLogListener can "lose" record of nodeLost event when node lost is/was 
> .system collection leader
> -
>
> Key: SOLR-13050
> URL: https://issues.apache.org/jira/browse/SOLR-13050
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Reporter: Hoss Man
> Assignee: Andrzej Bialecki
> Priority: Major
> Attachments: SOLR-13050.test-workaround.patch, 
> jenkins.sarowe__Lucene-Solr-tests-7.x__7104.log.txt
>
>
> A chicken/egg issue with the way the autoscaling SystemLogListener uses the 
> {{.system}} collection to record event history is that in the case of a 
> {{nodeLost}} event for the {{.system}} collection's leader, there is a window 
> of time during leader election where trying to add the "Document" 
> representing that {{nodeLost}} event to the {{.system}} collection can fail.
> This isn't a silent failure: the SystemLogListener, acting in the role of a 
> Solr client, is informed that the "add" failed, but it doesn't/can't do much 
> to deal with this situation other than to "log" (to the slf4j Logger) that it 
> wasn't able to add the doc.
> 
> I'm not sure how much of a "real world" impact this has on users, but I 
> noticed the issue while diagnosing a jenkins test failure and wanted to track 
> it.
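
To make the failure window described above concrete, here is a minimal sketch of one way a client could ride out a short leader election: retry the add a bounded number of times, then fall back to logging. This is not what SystemLogListener does (per the description it only logs the failure), and every name here (EventRecorderSketch, DocumentSink, recordWithRetry) is hypothetical rather than a Solr API.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; none of these names are Solr APIs. */
class EventRecorderSketch {

    /** Stand-in for whatever client call indexes the event document. */
    interface DocumentSink {
        void add(String eventDoc) throws Exception;
    }

    /**
     * Tries to add the document up to maxAttempts times, pausing between
     * attempts. Returns true if an attempt succeeded; otherwise appends a
     * message to fallbackLog (standing in for a logger) and returns false.
     */
    static boolean recordWithRetry(DocumentSink sink, String eventDoc,
                                   int maxAttempts, long pauseMillis,
                                   List<String> fallbackLog) {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                sink.add(eventDoc);
                return true;
            } catch (Exception e) {
                last = e;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(pauseMillis);
                } catch (InterruptedException ie) {
                    // Preserve the interrupt and stop retrying.
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        String reason = (last == null) ? "no attempts made" : last.getMessage();
        fallbackLog.add("could not record " + eventDoc + ": " + reason);
        return false;
    }

    public static void main(String[] args) {
        // Simulate a collection with no leader for the first two attempts.
        int[] calls = {0};
        DocumentSink flaky = doc -> {
            if (++calls[0] <= 2) {
                throw new IllegalStateException("no leader");
            }
        };
        List<String> log = new ArrayList<>();
        boolean recorded = recordWithRetry(flaky, "nodeLost", 5, 10L, log);
        System.out.println("recorded=" + recorded + ", attempts=" + calls[0]);
    }
}
```

A bounded retry only narrows the window; if the election outlasts the retries, the event is still missing from the collection and only the log entry remains.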



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-13050) SystemLogListener can "lose" record of nodeLost event when node lost is/was .system collection leader

2019-01-03 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16733452#comment-16733452
 ] 

ASF subversion and git services commented on SOLR-13050:


Commit ec43d100d1dd429829758a4f672a37536e447ed0 in lucene-solr's branch 
refs/heads/master from Cassandra Targett
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=ec43d10 ]

SOLR-13050: make italicized note into a real NOTE block





[jira] [Commented] (SOLR-13050) SystemLogListener can "lose" record of nodeLost event when node lost is/was .system collection leader

2019-01-03 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732908#comment-16732908
 ] 

ASF subversion and git services commented on SOLR-13050:


Commit 490f560377b849bb21862302ca13195e23778f0f in lucene-solr's branch 
refs/heads/branch_7x from Andrzej Bialecki
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=490f560 ]

SOLR-13050: Add a note to the ref guide.





[jira] [Commented] (SOLR-13050) SystemLogListener can "lose" record of nodeLost event when node lost is/was .system collection leader

2019-01-03 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732906#comment-16732906
 ] 

ASF subversion and git services commented on SOLR-13050:


Commit dbcb24506227b973d086675448c3b262a142b502 in lucene-solr's branch 
refs/heads/master from Andrzej Bialecki
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=dbcb245 ]

SOLR-13050: Add a note to the ref guide.





[jira] [Commented] (SOLR-13050) SystemLogListener can "lose" record of nodeLost event when node lost is/was .system collection leader

2019-01-02 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732434#comment-16732434
 ] 

ASF subversion and git services commented on SOLR-13050:


Commit 1e735a1128256c4da7ebccd639a9d6579077aebe in lucene-solr's branch 
refs/heads/branch_7x from Andrzej Bialecki
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=1e735a1 ]

SOLR-13050: Fix the test so that .system events are collected again.





[jira] [Commented] (SOLR-13050) SystemLogListener can "lose" record of nodeLost event when node lost is/was .system collection leader

2019-01-02 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732429#comment-16732429
 ] 

ASF subversion and git services commented on SOLR-13050:


Commit e5fda5d6f1cc6010e6b3565aa4bd70be31621692 in lucene-solr's branch 
refs/heads/master from Andrzej Bialecki
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=e5fda5d ]

SOLR-13050: Fix the test so that .system events are collected again.





[jira] [Commented] (SOLR-13050) SystemLogListener can "lose" record of nodeLost event when node lost is/was .system collection leader

2019-01-02 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732199#comment-16732199
 ] 

ASF subversion and git services commented on SOLR-13050:


Commit 0f0b80ff63f6d1f4e746c6463f442a48f26100f3 in lucene-solr's branch 
refs/heads/branch_7x from Andrzej Bialecki
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=0f0b80f ]

SOLR-13050: Fix another test that could accidentally kill the .system leader 
node.
Improve fallback in SystemLogListener when target collection is not present.





[jira] [Commented] (SOLR-13050) SystemLogListener can "lose" record of nodeLost event when node lost is/was .system collection leader

2019-01-02 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732176#comment-16732176
 ] 

ASF subversion and git services commented on SOLR-13050:


Commit b9457b78d573c133e1beeb50cdf1fc786ae0c0f4 in lucene-solr's branch 
refs/heads/master from Andrzej Bialecki
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=b9457b7 ]

SOLR-13050: Fix another test that could accidentally kill the .system leader 
node.
Improve fallback in SystemLogListener when target collection is not present.





[jira] [Commented] (SOLR-13050) SystemLogListener can "lose" record of nodeLost event when node lost is/was .system collection leader

2019-01-02 Thread Andrzej Bialecki (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16731976#comment-16731976
 ] 

Andrzej Bialecki  commented on SOLR-13050:
--

No components depend on the events being stored in the {{.system}} collection 
because this listener is optional (it's added by default to all new triggers 
but can be removed), so this failure should have no impact on proper 
functioning of autoscaling.

However, there may be other tests that depend on this functionality. Several 
tests verify that certain events are present, but I'm not sure whether they all 
avoid killing the {{.system}} leader, so I'm going to check this.




[jira] [Commented] (SOLR-13050) SystemLogListener can "lose" record of nodeLost event when node lost is/was .system collection leader

2018-12-11 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16716647#comment-16716647
 ] 

ASF subversion and git services commented on SOLR-13050:


Commit a2199c72d40c8aaf55dd9ca20816c2aa1ee805ea in lucene-solr's branch 
refs/heads/jira/http2 from Chris Hostetter
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=a2199c7 ]

SOLR-13050: add workaround for issue to SystemLogListenerTest

make sure the node we kill isn't the .system collection leader
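
The idea behind this workaround can be sketched as follows: given the live nodes and the node currently hosting the .system leader, pick any other node to kill. This is only an illustration under stated assumptions; the actual change to SystemLogListenerTest is not shown in this thread, and the names NodeSelectionSketch and pickNodeToKill, as well as the plain string node names, are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

/** Hypothetical sketch; the real test works against Solr's cluster state. */
class NodeSelectionSketch {

    /**
     * Returns the first live node that is not the node hosting the
     * .system collection leader, or empty if no such node exists.
     */
    static Optional<String> pickNodeToKill(List<String> liveNodes,
                                           String systemLeaderNode) {
        return liveNodes.stream()
                .filter(node -> !node.equals(systemLeaderNode))
                .findFirst();
    }

    public static void main(String[] args) {
        List<String> live = Arrays.asList("node1", "node2", "node3");
        // node1 hosts the .system leader in this made-up cluster.
        System.out.println(pickNodeToKill(live, "node1").orElse("none"));
    }
}
```

If every live node hosts the leader (a one-node cluster), the result is empty, and a test would need to add a node or skip the kill step.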





[jira] [Commented] (SOLR-13050) SystemLogListener can "lose" record of nodeLost event when node lost is/was .system collection leader

2018-12-10 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16716030#comment-16716030
 ] 

ASF subversion and git services commented on SOLR-13050:


Commit a2199c72d40c8aaf55dd9ca20816c2aa1ee805ea in lucene-solr's branch 
refs/heads/master from Chris Hostetter
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=a2199c7 ]

SOLR-13050: add workaround for issue to SystemLogListenerTest

make sure the node we kill isn't the .system collection leader





[jira] [Commented] (SOLR-13050) SystemLogListener can "lose" record of nodeLost event when node lost is/was .system collection leader

2018-12-10 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16716031#comment-16716031
 ] 

ASF subversion and git services commented on SOLR-13050:


Commit 1dbce5ed44b9610b702380eb58722e41532caa45 in lucene-solr's branch 
refs/heads/branch_7x from Chris Hostetter
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=1dbce5e ]

SOLR-13050: add workaround for issue to SystemLogListenerTest

make sure the node we kill isn't the .system collection leader

(cherry picked from commit a2199c72d40c8aaf55dd9ca20816c2aa1ee805ea)

