[ 
https://issues.apache.org/jira/browse/SOLR-13050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16731976#comment-16731976
 ] 

Andrzej Bialecki  commented on SOLR-13050:
------------------------------------------

No component depends on the events being stored in the {{.system}} collection, 
because this listener is optional (it's added by default to all new triggers 
but can be removed). This failure should therefore have no impact on the 
proper functioning of autoscaling.

However, other tests may depend on this functionality: several tests verify 
that certain events are present, but I'm not sure they all avoid killing the 
{{.system}} leader, so I'm going to check this.

> SystemLogListener can "lose" record of nodeLost event when node lost is/was 
> .system collection leader
> -----------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-13050
>                 URL: https://issues.apache.org/jira/browse/SOLR-13050
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Hoss Man
>            Assignee: Andrzej Bialecki 
>            Priority: Major
>         Attachments: SOLR-13050.test-workaround.patch, 
> jenkins.sarowe__Lucene-Solr-tests-7.x__7104.log.txt
>
>
> A chicken/egg issue with the way the autoscaling SystemLogListener uses the 
> {{.system}} collection to record event history is that in the case of a 
> {{nodeLost}} event for the {{.system}} collection's leader, there is a window 
> of time during leader election where trying to add the "Document" 
> representing that {{nodeLost}} event to the {{.system}} collection can fail.
> This isn't a silent failure: the SystemLogListener, acting in the role of a 
> Solr client, is informed that the "add" failed, but it doesn't/can't do much 
> to deal with this situation other than to "log" (to the slf4j Logger) that it 
> wasn't able to add the doc.
> ----
> I'm not sure how much of a "real world" impact this has on users, but I 
> noticed the issue while diagnosing a jenkins test failure and wanted to track 
> it.
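
For illustration only, the "log and move on" behaviour described in the issue 
can be sketched as below. This is not Solr's SystemLogListener code: the class 
BestEffortEventLog, the EventStore interface and the record method are 
hypothetical stand-ins, and a plain list stands in for the slf4j Logger.

```java
import java.util.ArrayList;
import java.util.List;

public class BestEffortEventLog {

    /** Hypothetical stand-in for a client that adds documents to the .system collection. */
    public interface EventStore {
        void add(String eventDoc) throws Exception;
    }

    /**
     * Tries to store an event document. On failure the error is only logged
     * and false is returned, so the caller (the trigger) is never affected.
     */
    public static boolean record(EventStore store, String eventDoc, List<String> log) {
        try {
            store.add(eventDoc);
            return true;
        } catch (Exception e) {
            // Assumption: mirrors the "log that it wasn't able to add the doc" behaviour.
            log.add("Could not store event " + eventDoc + ": " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        // Simulates the leader-election window: every add is rejected.
        EventStore noLeader = doc -> { throw new Exception("no leader for .system"); };
        // Simulates a healthy collection: the add succeeds.
        EventStore healthy = doc -> { };

        System.out.println(record(noLeader, "nodeLost", log));
        System.out.println(record(healthy, "nodeAdded", log));
        System.out.println(log);
    }
}
```

In this sketch the nodeLost record is dropped while later events are stored 
normally, which is the gap in event history that the issue title calls a 
"lost" record.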




