[ 
https://issues.apache.org/jira/browse/SOLR-13457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gus Heck updated SOLR-13457:
----------------------------
    Description: 
Presently, Solr has a variety of timeouts for various connections or 
operations. These timeouts have been added, tweaked, refined, and in some 
cases made configurable in an ad-hoc manner by the contributors of individual 
features throughout the history of the project. This is all well and good 
until one experiences a timeout during an otherwise valid use case and needs to 
adjust it.

This has also made managing timeouts in unit tests "interesting" as noted in 
SOLR-13389.

Probably nobody has the spare time to do a tour de force through the code and 
coordinate every single timeout, so in this ticket I'd like to establish a 
framework for categorizing timeouts, a standard for how we make each category 
configurable, and then add sub-tickets to address individual timeouts.

The intention is that eventually, there will be no "magic number" timeout 
values in code, and one can predict where to find the configuration for a 
timeout by determining its category.

Initial strawman categories (feel free to knock down or suggest alternatives):
 # *Feature-Instance Timeout*: Timeouts that relate to a particular 
instantiation of a feature, for example a database connection timeout for a 
connection to a particular database by DIH. These should be set in the 
configuration of that instance.
 # *Optional Feature Timeout*: A timeout that only has meaning in the context 
of a particular feature that is not required for Solr to function, i.e. 
something that can be turned on or off. For example, a timeout for 
communication with an external LDAP server for authentication purposes. These 
should be configured in the same configuration that enables this feature.
 # *Global System Timeout*: A timeout that will always be an active part of 
Solr. These should be configured in a new <timeouts> section of solr.xml. 
Examples are the Jetty thread idle timeout and the default timeout for HTTP 
calls between nodes.
 # *Node Specific Timeout*: A timeout which may differ on different nodes. I 
don't know of any of these, but I'll grant the possibility. These (and only 
these) should be set by setting system properties. If we don't have any of 
these, that's just fine :).
 # *Client Timeout*: These are timeouts in SolrJ code that are active in code 
running outside the server. They should be configurable via the Java API, and 
via a config file of some sort from a single location defined in a sysprop or 
sourced from the classpath (in that order). When run on the server, the SolrJ 
code should look for a *Global System Timeout* setting before consulting 
sysprops or the classpath.
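
As a rough illustration of the lookup order described for *Client Timeout*, 
here is a minimal sketch. Everything in it is hypothetical: the class name, 
the sysprop name (solrj.timeouts.file), the classpath resource name 
(solrj-timeouts.properties) and the use of a properties file are assumptions 
made for the example, not existing SolrJ API. It also assumes that a value set 
explicitly through the Java API wins over every other source, which the list 
above does not spell out.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Optional;
import java.util.Properties;

// Hypothetical sketch: none of these names exist in SolrJ.
final class ClientTimeouts {
  // Assumed sysprop that names the single config file location.
  static final String FILE_SYSPROP = "solrj.timeouts.file";
  // Assumed classpath resource consulted when the sysprop is not set.
  static final String CLASSPATH_RESOURCE = "solrj-timeouts.properties";

  /**
   * Returns the first timeout (in ms) found for the key.
   * apiValue is the value set via the Java API (null if unset);
   * globalSystem holds Global System Timeouts when running on the server
   * (null when running in a plain client).
   */
  static Optional<Integer> resolve(String key, Integer apiValue, Properties globalSystem) {
    if (apiValue != null) {
      return Optional.of(apiValue);                      // 1. explicit Java API value (assumed to win)
    }
    Optional<Integer> value = lookup(globalSystem, key); // 2. server side: Global System Timeout
    if (value.isPresent()) {
      return value;
    }
    value = lookup(loadFromSyspropFile(), key);          // 3. file named by the sysprop
    if (value.isPresent()) {
      return value;
    }
    return lookup(loadFromClasspath(), key);             // 4. classpath resource
  }

  static Optional<Integer> lookup(Properties props, String key) {
    if (props == null) {
      return Optional.empty();
    }
    String raw = props.getProperty(key);
    return raw == null ? Optional.empty() : Optional.of(Integer.parseInt(raw.trim()));
  }

  static Properties loadFromSyspropFile() {
    String path = System.getProperty(FILE_SYSPROP);
    if (path == null) {
      return null;
    }
    try {
      return load(Files.newInputStream(Paths.get(path)), path);
    } catch (IOException e) {
      throw new UncheckedIOException("Cannot open " + path, e);
    }
  }

  static Properties loadFromClasspath() {
    InputStream res = ClientTimeouts.class.getResourceAsStream("/" + CLASSPATH_RESOURCE);
    return res == null ? null : load(res, CLASSPATH_RESOURCE);
  }

  private static Properties load(InputStream stream, String origin) {
    try (InputStream in = stream) {
      Properties props = new Properties();
      props.load(in);
      return props;
    } catch (IOException e) {
      throw new UncheckedIOException("Cannot read " + origin, e);
    }
  }
}
```

A caller would ask for something like ClientTimeouts.resolve("socketTimeout", 
null, null) and fall back to a documented default when the result is empty; 
the key name here is likewise only an example.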

*Note that in no case is a hard-coded value the correct solution.*

If we get a consensus on categories and their locations, then the next step is 
to begin adding sub-tickets to bring specific timeouts into compliance. Every 
such ticket should include an update to the section of the ref guide 
documenting the configuration to which the timeout has been added (e.g. docs 
for solr.xml for Global System Timeouts) describing what exactly is affected by 
the timeout, the maximum allowed value, and how zero and negative numbers are 
handled.
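
To make the documentation requirement concrete, here is a small sketch of the 
kind of check each configurable timeout could go through. The policy it 
encodes is only an assumption chosen for illustration (zero means "no 
timeout", negative values are rejected, values above the maximum are 
rejected); this ticket does not prescribe any of it, and the class and method 
names are hypothetical.

```java
import java.time.Duration;
import java.util.Optional;

// Hypothetical helper; the zero/negative/maximum policy is an assumption.
final class TimeoutSetting {

  /**
   * Validates a configured timeout in milliseconds.
   * Returns an empty Optional when the value is zero (treated as "no timeout").
   */
  static Optional<Duration> validate(String name, long millis, long maxMillis) {
    if (millis < 0) {
      throw new IllegalArgumentException(name + " must not be negative: " + millis);
    }
    if (millis > maxMillis) {
      throw new IllegalArgumentException(
          name + " exceeds the maximum of " + maxMillis + " ms: " + millis);
    }
    return millis == 0 ? Optional.empty() : Optional.of(Duration.ofMillis(millis));
  }
}
```

Whatever policy is actually chosen for a given timeout, the point is that it 
lives in one place and matches what the ref guide says.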

It is of course true that some of these values will have the potential to 
destroy system performance or integrity, and that should be mentioned in the 
update to documentation.

  was:
Presently, Solr has a variety of timeouts for various connections or 
operations. These timeouts have been added, tweaked, refined, and in some 
cases made configurable in an ad-hoc manner by the contributors of individual 
features throughout the history of the project. This is all well and good 
until one experiences a timeout during an otherwise valid use case and needs to 
adjust it.

This has also made managing timeouts in unit tests "interesting" as noted in 
SOLR-13389.

Probably nobody has the spare time to do a tour de force through the code and 
coordinate every single timeout, so in this ticket I'd like to establish a 
framework for categorizing timeouts, a standard for how we make each category 
configurable, and then add sub-tickets to address individual timeouts.

The intention is that eventually, there will be no "magic number" timeout 
values in code, and one can predict where to find the configuration for a 
timeout by determining its category.

Initial strawman categories (feel free to knock down or suggest alternatives):
 # *Feature-Instance Timeout*: Timeouts that relate to a particular 
instantiation of a feature, for example a database connection timeout for a 
connection to a particular database by DIH. These should be set in the 
configuration of that instance.
 # *Optional Feature Timeout*: A timeout that only has meaning in the context 
of a particular feature that is not required for Solr to function, i.e. 
something that can be turned on or off. For example, a timeout for 
communication with an external LDAP server for authentication purposes. These 
should be configured in the same configuration that enables this feature.
 # *Global System Timeout*: A timeout that will always be an active part of 
Solr. These should be configured in a new <timeouts> section of solr.xml. 
Examples are the Jetty thread idle timeout and the default timeout for HTTP 
calls between nodes.
 # *Node Specific Timeout*: A timeout which may differ on different nodes. I 
don't know of any of these, but I'll grant the possibility. These (and only 
these) should be set by setting system properties. If we don't have any of 
these, that's just fine :).

*Note that in no case is a hard-coded value the correct solution.*

If we get a consensus on categories and their locations, then the next step is 
to begin adding sub-tickets to bring specific timeouts into compliance. Every 
such ticket should include an update to the section of the ref guide 
documenting the configuration to which the timeout has been added (e.g. docs 
for solr.xml for Global System Timeouts) describing what exactly is affected by 
the timeout, the maximum allowed value, and how zero and negative numbers are 
handled.

It is of course true that some of these values will have the potential to 
destroy system performance or integrity, and that should be mentioned in the 
update to documentation.


> Managing Timeout values in Solr
> -------------------------------
>
>                 Key: SOLR-13457
>                 URL: https://issues.apache.org/jira/browse/SOLR-13457
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>    Affects Versions: master (9.0)
>            Reporter: Gus Heck
>            Priority: Major


