[jira] [Commented] (SOLR-7408) Race condition can cause a config directory listener to be removed

2015-04-18 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14501152#comment-14501152
 ] 

Shai Erera commented on SOLR-7408:
--

There are two ctors which aren't called by Solr code. Is SolrCore considered 
public API, and so we should keep these ctors, or can I remove them?

 Race condition can cause a config directory listener to be removed
 --

 Key: SOLR-7408
 URL: https://issues.apache.org/jira/browse/SOLR-7408
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Shai Erera
Assignee: Shai Erera
 Attachments: SOLR-7408.patch, SOLR-7408.patch


 This has been reported here: http://markmail.org/message/ynkm2axkdprppgef, 
 and I was able to reproduce it in a test, although I am only able to 
 reproduce if I put break points and manually simulate the problematic context 
 switches.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-7408) Race condition can cause a config directory listener to be removed

2015-04-17 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14499217#comment-14499217
 ] 

Shai Erera commented on SOLR-7408:
--

bq. Though I think I understand what you're saying here, can you elaborate more 
on this?

If we wanted to change the code such that we put a listener in the map on a 
SolrCore creation, and remove it from the map on a SolrCore close, I believe we 
wouldn't be running into such concurrency issues. In a sense, this is what is 
done when all is *good*: a SolrCore puts a listener in its ctor, and removes it 
in its close().

But if something goes *wrong*, we may leave dangling listeners, of SolrCore 
instances that no longer exist. This is what I believe 
({{CoreAdminHandler.handleCreateAction}} attempts to do -- if a core creation 
failed, it attempts to unregister all listeners of a configDir from the map, 
and lets {{unregister}} decide if the entry itself can be removed or not. This 
ensures that we won't be left w/ dangling listeners that will never be released 
- what I referred to as leaking listeners.

The code in {{unregister}} relies on the same logic that introduces the bug -- 
if there is core in SolrCores which references this configDir, remove all 
listeners. The problem is that a core registers a listener, before it is put in 
SolrCores, and hence the race condition.

I would personally prefer that we stop removing all listeners, and let a core 
take care of itself, but I don't know how safe is Solr code in that regard. 
I.e. are all places that create a SolrCore clean up after it in the event of a 
failure? Clearly {{CoreAdminHandler.handleCreateAction}} doesn't, which got me 
thinking what other places don't do that as well.

But, if we want to change the logic like that, we can certainly look at all the 
places that do {{new SolrCore(...)}} and make sure they call 
{{SolrCore.close()}} in the event of any failure.

 Race condition can cause a config directory listener to be removed
 --

 Key: SOLR-7408
 URL: https://issues.apache.org/jira/browse/SOLR-7408
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Shai Erera
Assignee: Shai Erera
 Attachments: SOLR-7408.patch, SOLR-7408.patch


 This has been reported here: http://markmail.org/message/ynkm2axkdprppgef, 
 and I was able to reproduce it in a test, although I am only able to 
 reproduce if I put break points and manually simulate the problematic context 
 switches.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-7408) Race condition can cause a config directory listener to be removed

2015-04-17 Thread Anshum Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14499394#comment-14499394
 ] 

Anshum Gupta commented on SOLR-7408:


I like the idea of calling {{SolrCore.close()}} and letting that handle the 
responsibility of unregistering (already happens).
Does it make more sense to have this in a different JIRA or at least change the 
title/summary of this one to highlight the new/updated intention?


 Race condition can cause a config directory listener to be removed
 --

 Key: SOLR-7408
 URL: https://issues.apache.org/jira/browse/SOLR-7408
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Shai Erera
Assignee: Shai Erera
 Attachments: SOLR-7408.patch, SOLR-7408.patch


 This has been reported here: http://markmail.org/message/ynkm2axkdprppgef, 
 and I was able to reproduce it in a test, although I am only able to 
 reproduce if I put break points and manually simulate the problematic context 
 switches.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-7408) Race condition can cause a config directory listener to be removed

2015-04-17 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14499354#comment-14499354
 ] 

Noble Paul commented on SOLR-7408:
--

looks good to me

 Race condition can cause a config directory listener to be removed
 --

 Key: SOLR-7408
 URL: https://issues.apache.org/jira/browse/SOLR-7408
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Shai Erera
Assignee: Shai Erera
 Attachments: SOLR-7408.patch, SOLR-7408.patch


 This has been reported here: http://markmail.org/message/ynkm2axkdprppgef, 
 and I was able to reproduce it in a test, although I am only able to 
 reproduce if I put break points and manually simulate the problematic context 
 switches.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-7408) Race condition can cause a config directory listener to be removed

2015-04-17 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14499636#comment-14499636
 ] 

Shai Erera commented on SOLR-7408:
--

I will update the title of this JIRA and handle it here. I like this better 
than doing what I consider more of a hack to the code and later change it. 
SolrCore is initialized in two places, so shouldn't be complicated to ensure it 
is closed in case of errors.

While I'm at it, I'll try to simplify the ctor by breaking it out to some 
auxiliary methods, instead of having a 250 lines method!

 Race condition can cause a config directory listener to be removed
 --

 Key: SOLR-7408
 URL: https://issues.apache.org/jira/browse/SOLR-7408
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Shai Erera
Assignee: Shai Erera
 Attachments: SOLR-7408.patch, SOLR-7408.patch


 This has been reported here: http://markmail.org/message/ynkm2axkdprppgef, 
 and I was able to reproduce it in a test, although I am only able to 
 reproduce if I put break points and manually simulate the problematic context 
 switches.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-7408) Race condition can cause a config directory listener to be removed

2015-04-17 Thread Anshum Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14499956#comment-14499956
 ] 

Anshum Gupta commented on SOLR-7408:


+1 to both of those!

 Race condition can cause a config directory listener to be removed
 --

 Key: SOLR-7408
 URL: https://issues.apache.org/jira/browse/SOLR-7408
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Shai Erera
Assignee: Shai Erera
 Attachments: SOLR-7408.patch, SOLR-7408.patch


 This has been reported here: http://markmail.org/message/ynkm2axkdprppgef, 
 and I was able to reproduce it in a test, although I am only able to 
 reproduce if I put break points and manually simulate the problematic context 
 switches.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-7408) Race condition can cause a config directory listener to be removed

2015-04-16 Thread Anshum Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14498933#comment-14498933
 ] 

Anshum Gupta commented on SOLR-7408:


Thanks Shai. The patch looks good to me in what it does. 

Though I think I understand what you're saying here, can you elaborate more on 
this:
bq. However, it seems like we call unregister even in the event of errors 
(CoreAdminHandler.handleCreateAction), forcing the removal of the configDir 
listener, and it's not clear if we also call SolrCore.close(). So we may start 
leaking listeners that will never be removed ..

 Race condition can cause a config directory listener to be removed
 --

 Key: SOLR-7408
 URL: https://issues.apache.org/jira/browse/SOLR-7408
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Shai Erera
Assignee: Shai Erera
 Attachments: SOLR-7408.patch, SOLR-7408.patch


 This has been reported here: http://markmail.org/message/ynkm2axkdprppgef, 
 and I was able to reproduce it in a test, although I am only able to 
 reproduce if I put break points and manually simulate the problematic context 
 switches.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org