[ 
https://issues.apache.org/jira/browse/SOLR-5043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14701720#comment-14701720
 ] 

Hoss Man commented on SOLR-5043:
--------------------------------

Ram: the approach you're suggesting was explicitly discussed in the past and 
deemed a bad idea because it means that if DNS changes there is no way for a 
solr admin to trigger a "refresh" of the "cached" hostname -- which is why the 
existing code explicitly has the comment "not static, so core reload will 
refresh" -- that way users could at least trigger a core reload when doing 
things like server swaps (or disaster recovery fail over of whole colos using 
subnet aliases, etc...)

bq. Having this on a per-instance basis like it does currently would also mean 
that you have one more thread running per core, even if temporarily, that might 
cause issues if you have lots of cores starting up at the same time in a JVM.

which is why the suggestion of using a single threaded CompletionService, 
overwriting a single static variable each time there is a core reload, is a 
better one (that unfortunately no one has ever gotten around to implementing)
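
For illustration only, a minimal sketch of that idea. Nothing here is existing 
Solr code: the class name {{HostnameCache}} and its methods are hypothetical, 
and it uses a plain single-thread ExecutorService rather than a 
CompletionService, since nothing in the sketch needs to poll for finished 
tasks. Every core (re)load queues a lookup on the one shared thread and 
returns immediately; readers see {{null}} until a lookup has succeeded.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Supplier;

// Hypothetical helper, not part of Solr: all names are illustrative.
final class HostnameCache {
  // One daemon thread shared by every core in the JVM, so many cores
  // starting at once queue their lookups instead of each spawning a thread.
  private static final ExecutorService LOOKUP =
      Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "hostname-lookup");
        t.setDaemon(true);
        return t;
      });

  // Single static value, overwritten by each refresh; null until the first
  // lookup finishes or whenever the most recent lookup failed.
  private static volatile String hostname;

  private HostnameCache() {}

  /** Call on every core load/reload; returns without waiting for DNS. */
  static Future<?> refresh() {
    return refresh(HostnameCache::resolveLocal);
  }

  /** Same as refresh(), with the resolver passed in (handy for testing). */
  static Future<?> refresh(Supplier<String> resolver) {
    return LOOKUP.submit(() -> {
      hostname = resolver.get();
    });
  }

  /** Most recently resolved hostname, or null if none is available. */
  static String get() {
    return hostname;
  }

  // The potentially slow call; only ever runs on the lookup thread.
  private static String resolveLocal() {
    try {
      return InetAddress.getLocalHost().getCanonicalHostName();
    } catch (UnknownHostException e) {
      return null;
    }
  }
}
```

Because the variable is refreshed on each reload rather than resolved once per 
JVM, an admin can still force a re-resolve after a DNS change by reloading a 
core.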

bq. ...on the flip side, with this patch, you might sometimes not get the 
hostname when you expect it (so technically it's a functional difference)

It's always been the case that if there is a DNS problem you'll get {{null}} 
instead of the canonical hostname, and at a later time you might start getting 
a new/different/correct hostname (currently if/when DNS is fixed and the core 
reloads) so improvements that return null if/when a hostname is unresolvable 
shouldn't be ruled out just for that reason.

----

The more we talk about the various trade offs involved with this type of 
problem, the more and more I ultimately feel like we really shouldn't invest 
too much complexity in the code just to account for people with bad/broken DNS 
configurations.

my current feeling is:
* we should "fix" the current SystemInfoHandler init logic to log errors when 
there are DNS problems so they show up in the logs, but otherwise leave 
things alone.
* For situations like SOLR-7884 I think an advanced, seriously expert, system 
property that says "hey solr, under no circumstances should you try to do any 
DNS lookups because i know my DNS is busted" would be a good idea, and should 
be implemented by a generic helper method for use both here and in the various 
parts of the distributed update/search logic in the cloud code. (with forbidden 
API checks to prevent any future code other than this helper method from 
calling DNS related methods)
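
As a rough sketch of what such a helper could look like (purely hypothetical: 
the class {{DnsLookups}} and the property name {{example.dns.lookups.disabled}} 
are made up for illustration, and java.util.logging stands in for whatever 
logging Solr actually uses so the example is self-contained):

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper; neither the class nor the property exists in Solr.
final class DnsLookups {
  // Assumed property name, chosen only for this example.
  static final String DISABLE_PROP = "example.dns.lookups.disabled";

  private static final Logger LOG =
      Logger.getLogger(DnsLookups.class.getName());

  private DnsLookups() {}

  /** True when the expert-level "never touch DNS" property is set to true. */
  static boolean lookupsDisabled() {
    // Read on every call so a changed property takes effect immediately.
    return Boolean.getBoolean(DISABLE_PROP);
  }

  /**
   * Canonical hostname of the local machine, or null when lookups are
   * disabled or the name cannot be resolved. Failures are logged so that
   * DNS problems show up in the logs.
   */
  static String canonicalLocalHostname() {
    if (lookupsDisabled()) {
      return null;
    }
    try {
      return InetAddress.getLocalHost().getCanonicalHostName();
    } catch (UnknownHostException e) {
      LOG.log(Level.WARNING,
          "Could not resolve local hostname; check DNS configuration", e);
      return null;
    }
  }
}
```

With something like this in place, the forbidden API checks would only need to 
whitelist this one class for direct {{InetAddress}} calls.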



> hostname lookup in SystemInfoHandler should be refactored to not block core 
> (re)load
> ------------------------------------------------------------------------------------
>
>                 Key: SOLR-5043
>                 URL: https://issues.apache.org/jira/browse/SOLR-5043
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Hoss Man
>         Attachments: SOLR-5043-lazy.patch, SOLR-5043.patch
>
>
> SystemInfoHandler currently looks up the hostname of the machine on its init, 
> and caches it for its lifecycle -- there is a comment to the effect that the 
> reason for this is because on some machines (notably ones with wacky DNS 
> settings) looking up the hostname can take a long ass time in some JVMs...
> {noformat}
>   // on some platforms, resolving canonical hostname can cause the thread
>   // to block for several seconds if nameservices aren't available
>   // so resolve this once per handler instance 
>   //(ie: not static, so core reload will refresh)
> {noformat}
> But as we move forward with a lot more multi-core, solr-cloud, dynamically 
> updated instances, even paying this cost per core-reload is expensive.
> we should refactor this so that SystemInfoHandler instances init 
> immediately, with some kind of lazy loading of the hostname info in a 
> background thread, (especially since the only real point of having that info 
> here is for UI use so you can keep track of what machine you are looking at)



